PDF objects in MuPDF

Object types

null

represents the null PDF object.

bool

either true or false.

number
int

holds a C int.

int_offset

holds a fz_off_t.

real

holds a C float.

name

holds a null-terminated string.

string

holds a buffer and its length.

array

holds a list of other objects.

dict

holds a list of key-value pairs of other objects.

indirect

holds a pointer to an indirect object (object and generation numbers).

Common properties and methods

The opaque pdf_obj * type is used to represent all types of objects.

Each object has a reference counter. The counter is incremented with

 pdf_obj *pdf_keep_obj(fz_context *ctx, pdf_obj *obj);

and decremented with

 void pdf_drop_obj(fz_context *ctx, pdf_obj *obj);

Once the counter reaches 0, the object is deallocated.
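
The keep/drop pattern can be illustrated with a small stand-alone model. This is only a sketch and not MuPDF code: toy_obj, toy_new, toy_keep, toy_drop and toy_live are hypothetical names, and the model assumes that a freshly created object starts with a count of 1. It takes no fz_context and does no error handling beyond a NULL check.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted object; not MuPDF's pdf_obj. */
typedef struct toy_obj {
	int refs;   /* number of owners currently holding the object */
	int value;  /* payload, only here to make the object non-empty */
} toy_obj;

/* Count of live toy objects, so that deallocation can be observed. */
int toy_live = 0;

/* Assumption of this model: a new object starts with one reference. */
toy_obj *toy_new(int value)
{
	toy_obj *obj = malloc(sizeof *obj);
	if (!obj)
		return NULL;
	obj->refs = 1;
	obj->value = value;
	toy_live++;
	return obj;
}

/* Take an additional reference; returns obj so the call can be chained. */
toy_obj *toy_keep(toy_obj *obj)
{
	if (obj)
		obj->refs++;
	return obj;
}

/* Give up one reference; the object is freed when the last one is dropped. */
void toy_drop(toy_obj *obj)
{
	if (!obj)
		return;
	assert(obj->refs > 0);
	if (--obj->refs == 0) {
		free(obj);
		toy_live--;
	}
}
```

In this model every toy_new or toy_keep has to be balanced by exactly one toy_drop; the object goes away on the drop that brings the count to 0.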

TODO: flags marked, dirty, ...

Type queries

 int pdf_is_null(fz_context *ctx, pdf_obj *obj);
 int pdf_is_bool(fz_context *ctx, pdf_obj *obj);
 int pdf_is_int(fz_context *ctx, pdf_obj *obj);
 int pdf_is_real(fz_context *ctx, pdf_obj *obj);
 int pdf_is_number(fz_context *ctx, pdf_obj *obj);
 int pdf_is_name(fz_context *ctx, pdf_obj *obj);
 int pdf_is_string(fz_context *ctx, pdf_obj *obj);
 int pdf_is_array(fz_context *ctx, pdf_obj *obj);
 int pdf_is_dict(fz_context *ctx, pdf_obj *obj);
 int pdf_is_indirect(fz_context *ctx, pdf_obj *obj);
 int pdf_obj_num_is_stream(fz_context *ctx, pdf_document *doc, int num);
 int pdf_is_stream(fz_context *ctx, pdf_obj *obj);

null objects

These are constructed with

 pdf_obj *pdf_new_null(fz_context *ctx, pdf_document *doc);

bool objects

These are constructed with

 pdf_obj *pdf_new_bool(fz_context *ctx, pdf_document *doc, int b);

The value is read with

 int pdf_to_bool(fz_context *ctx, pdf_obj *obj);

number objects

These are constructed with

 pdf_obj *pdf_new_int(fz_context *ctx, pdf_document *doc, int i);
 pdf_obj *pdf_new_int_offset(fz_context *ctx, pdf_document *doc, fz_off_t off);
 pdf_obj *pdf_new_real(fz_context *ctx, pdf_document *doc, float f);

The values are read with

 int pdf_to_int(fz_context *ctx, pdf_obj *obj);
 fz_off_t pdf_to_offset(fz_context *ctx, pdf_obj *obj);
 float pdf_to_real(fz_context *ctx, pdf_obj *obj);

name objects

These are created with

 pdf_obj *pdf_new_name(fz_context *ctx, pdf_document *doc, const char *str);

The string in a name object is obtained with

 char *pdf_to_name(fz_context *ctx, pdf_obj *obj);

The returned string is valid for the lifetime of obj. If obj is not a name object, the empty string is returned.

To compare name objects, use

 static inline int pdf_name_eq(fz_context *ctx, pdf_obj *a, pdf_obj *b);

string objects

These are created with

 pdf_obj *pdf_new_string(fz_context *ctx, pdf_document *doc, const char *str, size_t len);

where str has to be encoded in either UTF-16BE with BOM (FEFF) or PDFDocEncoding.

A string object's contents are obtained with the functions

 char *pdf_to_str_buf(fz_context *ctx, pdf_obj *obj);
 int pdf_to_str_len(fz_context *ctx, pdf_obj *obj);

The returned buffer is valid for the lifetime of obj. If obj is not a string object, these functions return the empty string and 0, respectively.
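
Why a string object carries an explicit length next to its buffer can be shown without MuPDF at all. The sketch below encodes the text "Hi" as UTF-16BE with BOM and compares the real byte count with what strlen() reports. The names utf16_hi, utf16_hi_len and c_string_len are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The text "Hi" as UTF-16BE with BOM: FE FF, then 00 48 ('H'), 00 69 ('i').
 * The final 0 byte is not part of the text; it is appended only so that
 * strlen() below always finds a terminator inside the array. */
const char utf16_hi[] = {
	(char)0xFE, (char)0xFF, 0x00, 0x48, 0x00, 0x69, 0x00
};

/* Real payload length in bytes: everything except the appended 0 byte. */
const size_t utf16_hi_len = sizeof utf16_hi - 1;

/* Length as a NUL-terminated C string reports it: stops at the first 0 byte. */
size_t c_string_len(const char *buf)
{
	return strlen(buf);
}
```

Here utf16_hi_len is 6, while c_string_len(utf16_hi) stops right after the BOM. A caller that treats such a buffer as a C string would lose the whole text, so buffer and length should be used together.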

dict objects

These are created with

 pdf_obj *pdf_new_dict(fz_context *ctx, pdf_document *doc, int initialcap);

Query dicts

The number of key-value pairs is queried with

 int pdf_dict_len(fz_context *ctx, pdf_obj *dict);

To get the i-th key-value pair, use

 pdf_obj *pdf_dict_get_key(fz_context *ctx, pdf_obj *dict, int idx);
 pdf_obj *pdf_dict_get_val(fz_context *ctx, pdf_obj *dict, int idx);
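
The index-based access pattern (ask for the length, then fetch key and value for each index) can be sketched with a stand-alone model. This is not MuPDF code: toy_pair, toy_dict and the toy_dict_* functions are hypothetical, keys are plain C strings and values plain ints instead of pdf_obj pointers, and the out-of-range behaviour shown is a convention of this model only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key-value pair; a real dict holds objects on both sides. */
typedef struct toy_pair {
	const char *key;
	int val;
} toy_pair;

/* Hypothetical dict: a flat list of pairs plus the number of pairs. */
typedef struct toy_dict {
	const toy_pair *pairs;
	int len;
} toy_dict;

/* Number of key-value pairs. */
int toy_dict_len(const toy_dict *d)
{
	return d ? d->len : 0;
}

/* Key of the idx-th pair, or NULL if idx is out of range (model convention). */
const char *toy_dict_get_key(const toy_dict *d, int idx)
{
	if (!d || idx < 0 || idx >= d->len)
		return NULL;
	return d->pairs[idx].key;
}

/* Value of the idx-th pair, or 0 if idx is out of range (model convention). */
int toy_dict_get_val(const toy_dict *d, int idx)
{
	if (!d || idx < 0 || idx >= d->len)
		return 0;
	return d->pairs[idx].val;
}

/* Walk all pairs by index and add up the values. */
int toy_dict_sum(const toy_dict *d)
{
	int i, n = toy_dict_len(d), sum = 0;
	for (i = 0; i < n; i++)
		sum += toy_dict_get_val(d, i);
	return sum;
}
```

The loop in toy_dict_sum is the part that carries over: query the length once, then read each pair by its index.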

There are several functions to get the value of a provided key: