ucw-json/json.h

JSON library context

The context structure remembers the whole state of the JSON library. All JSON values are allocated from a memory pool associated with the context. By default, their lifetime is the same as that of the context.

Alternatively, you can mark the current state of the context with json_push() and return to the marked state later using json_pop(). All JSON values created between these two operations are released afterwards. See json_push() for details.


struct json_context {
  // Memory management
  struct mempool *pool;
  struct mempool_state init_state;

  // Parser context
  struct fastbuf *in_fb;
  uint in_line;                         // [*] Current line number
  uint in_column;                       // [*] Current column number
  bool in_eof;                          // End of file was encountered
  struct json_node *next_token;
  struct json_node *trivial_token;
  int next_char;

  // Formatter context
  struct fastbuf *out_fb;
  uint out_indent;
  uint format_options;                  // [*] Formatting options (a combination of JSON_FORMAT_xxx)
};

The context is represented by a pointer to this structure. The fields marked with [*] are publicly accessible; the rest are private.


struct json_context *json_new(void);

Creates a new JSON context.


void json_delete(struct json_context *js);

Deletes a JSON context, deallocating all memory associated with it.


void json_reset(struct json_context *js);

Recycles a JSON context. All state is reset and all allocated objects are freed. This is equivalent to json_delete() followed by json_new(), but it is faster and the address of the context is preserved.


void json_push(struct json_context *js);

Pushes the current state of the context onto the state stack.

Between json_push() and the associated json_pop(), only newly created JSON values can be modified. Older values can only be inspected, never modified. In particular, new values cannot be inserted into old arrays or objects.

If you are using json_peek_token(), the saved token cannot be carried across a push/pop boundary.


static inline const char *json_strdup(struct json_context *js, const char *str);

Create a copy of a string in JSON memory.

For example, this is useful when you want to use a string of unknown lifetime as a key in json_object_set().


void json_pop(struct json_context *js);

Pops the state of the context off the state stack. All JSON values created since the state was saved by json_push() are released.
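
The mark-and-release behaviour of json_push() and json_pop() can be illustrated with a small standalone model. This is only a sketch of the semantics described above, not LibUCW code: struct toy_context and all toy_* functions are hypothetical names, and values are modelled as plain ints in a fixed array rather than as nodes in a memory pool.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_VALUES 64
#define TOY_MAX_MARKS 8

// Hypothetical stand-in for a context: values are plain ints in a fixed array.
struct toy_context {
  int values[TOY_MAX_VALUES];
  size_t used;                          // Number of live values
  size_t marks[TOY_MAX_MARKS];          // Saved values of `used`
  size_t depth;                         // Number of saved marks
};

static void toy_init(struct toy_context *c)
{
  c->used = 0;
  c->depth = 0;
}

// Allocates one value; returns NULL when the toy array is full.
static int *toy_new_value(struct toy_context *c, int v)
{
  if (c->used == TOY_MAX_VALUES)
    return NULL;
  c->values[c->used] = v;
  return &c->values[c->used++];
}

// Remembers how many values exist right now.
static void toy_push(struct toy_context *c)
{
  assert(c->depth < TOY_MAX_MARKS);
  c->marks[c->depth++] = c->used;
}

// Releases every value created since the matching toy_push().
static void toy_pop(struct toy_context *c)
{
  assert(c->depth > 0);
  c->used = c->marks[--c->depth];
}
```

In this model, a value created before toy_push() survives the matching toy_pop(), while everything created in between is discarded in one step. That is the reason pointers to values created inside a push/pop pair must not be kept after the pop.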

JSON values

Each JSON value is represented by struct json_node, which is either an elementary value (null, boolean, number, string), or a container (array, object) pointing to other values.

A value can belong to multiple containers simultaneously, so in general, the relationships between values need not form a tree, but a directed acyclic graph.

You are allowed to read the contents of nodes directly, but construction and modification of nodes must always be performed using the appropriate library functions.


enum json_node_type {
  JSON_INVALID,
  JSON_NULL,
  JSON_BOOLEAN,
  JSON_NUMBER,
  JSON_STRING,
  JSON_ARRAY,
  JSON_OBJECT,
  // These are not real nodes, but raw tokens.
  // They are not present in the tree of values, but you may see them
  // if you call json_next_token() and friends.
  JSON_BEGIN_ARRAY,
  JSON_END_ARRAY,
  JSON_BEGIN_OBJECT,
  JSON_END_OBJECT,
  JSON_NAME_SEP,
  JSON_VALUE_SEP,
  JSON_EOF,
};

Node types
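
The six raw token types carry the names of the structural characters of the JSON grammar. The sketch below spells out that correspondence. It is a standalone illustration, not part of the library: the enum is copied from the listing above so that the block compiles on its own, the toy_* helpers are hypothetical, and the assumption that the parser reports exactly these characters under these types is inferred from the names.

```c
#include <assert.h>
#include <stdbool.h>

// Copied from the listing above so that this sketch is self-contained.
enum json_node_type {
  JSON_INVALID,
  JSON_NULL,
  JSON_BOOLEAN,
  JSON_NUMBER,
  JSON_STRING,
  JSON_ARRAY,
  JSON_OBJECT,
  JSON_BEGIN_ARRAY,
  JSON_END_ARRAY,
  JSON_BEGIN_OBJECT,
  JSON_END_OBJECT,
  JSON_NAME_SEP,
  JSON_VALUE_SEP,
  JSON_EOF,
};

// Hypothetical helper: maps a structural character to its raw token type.
// Any other character yields JSON_INVALID.
static enum json_node_type toy_structural_token(int c)
{
  switch (c) {
    case '[': return JSON_BEGIN_ARRAY;
    case ']': return JSON_END_ARRAY;
    case '{': return JSON_BEGIN_OBJECT;
    case '}': return JSON_END_OBJECT;
    case ':': return JSON_NAME_SEP;
    case ',': return JSON_VALUE_SEP;
    default:  return JSON_INVALID;
  }
}

// Hypothetical helper: true for types that are raw tokens, not real nodes.
static bool toy_is_raw_token(enum json_node_type t)
{
  assert(t <= JSON_EOF);
  return t >= JSON_BEGIN_ARRAY;
}
```

A check like toy_is_raw_token() is useful in code that handles both the value tree and the token stream, since only the first seven types can occur inside a tree.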


struct json_node {
  enum json_node_type type;
  union {                               // Data specific to individual value types
    bool boolean;
    double number;
    const char *string;
    struct json_node **elements;        // Arrays: Growing array of values
    struct json_pair *pairs;            // Objects: Growing array of pairs
  };
};

Each value is represented by a single node.


struct json_pair {
  const char *key;
  struct json_node *value;
  // FIXME: Hash table
};

Attributes of objects are stored as (key, value) pairs of this format.


static inline struct json_node *json_new_null(struct json_context *js UNUSED);

Creates a new null value.


static inline struct json_node *json_new_bool(struct json_context *js UNUSED, bool value);

Creates a new boolean value.


struct json_node *json_new_number(struct json_context *js, double value);

Creates a new numeric value. The value must be a finite number.


bool json_number_to_int(struct json_node *num, int *dest);

Convert a numeric value to an int. Returns false if the value is not numeric or if it is too large for an int.


bool json_number_to_uint(struct json_node *num, uint *dest);

Same as above, but for uint.


bool json_number_to_s64(struct json_node *num, s64 *dest);

Same as above, but for s64.


bool json_number_to_u64(struct json_node *num, u64 *dest);

Same as above, but for u64.


static inline struct json_node *json_new_string_ref(struct json_context *js, const char *value);

Creates a new string value. The value is kept only as a reference.

String values can contain an arbitrary UTF-8 string with no null characters. However, it is not recommended to use UTF-8 values outside the range of Unicode codepoints (0 to 0x10ffff).


static inline struct json_node *json_new_string(struct json_context *js, const char *value);

Creates a new string value, making a private copy of value.


struct json_node *json_new_array(struct json_context *js);

Creates a new array value with no elements.


void json_array_append(struct json_node *array, struct json_node *elt);

Appends a new element to the given array.


struct json_node *json_new_object(struct json_context *js);

Creates a new object value with no attributes.


void json_object_set(struct json_node *n, const char *key, struct json_node *value);

Adds a new (key, value) pair to the given object. If key is already present, the pair is replaced. If value is NULL, no new pair is created and a pre-existing pair is deleted.

The key is referenced by the object, so you must not free it during the lifetime of the object. When in doubt, use json_strdup().


struct json_node *json_object_get(struct json_node *n, const char *key);

Returns the value associated with key, or NULL if no such value exists.
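
The set, replace, delete and lookup rules described for json_object_set() and json_object_get() can be modelled in a few lines. The following is a standalone sketch under stated assumptions, not the library's implementation: struct toy_object and the toy_* functions are hypothetical, values are modelled as pointers to int, and the linear search and fixed capacity are choices made only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_PAIRS 16

// Hypothetical stand-ins for an object and its (key, value) pairs.
struct toy_pair {
  const char *key;                      // Referenced, not copied
  const int *value;
};

struct toy_object {
  struct toy_pair pairs[TOY_MAX_PAIRS];
  size_t count;
};

// Returns the index of key, or o->count if the key is absent.
static size_t toy_find(const struct toy_object *o, const char *key)
{
  for (size_t i = 0; i < o->count; i++)
    if (!strcmp(o->pairs[i].key, key))
      return i;
  return o->count;
}

static void toy_object_set(struct toy_object *o, const char *key, const int *value)
{
  size_t i = toy_find(o, key);
  if (!value) {
    // NULL value: delete an existing pair, never create one.
    // This model fills the gap with the last pair, so order is not kept.
    if (i < o->count)
      o->pairs[i] = o->pairs[--o->count];
    return;
  }
  if (i == o->count) {
    // Key not present yet: append a new pair.
    assert(o->count < TOY_MAX_PAIRS);
    o->count++;
  }
  // New or existing slot: store the whole pair, keeping only the key pointer.
  o->pairs[i].key = key;
  o->pairs[i].value = value;
}

static const int *toy_object_get(const struct toy_object *o, const char *key)
{
  size_t i = toy_find(o, key);
  return i < o->count ? o->pairs[i].value : NULL;
}
```

Because the model stores only the key pointer, it also shows why the key must outlive the object: if the caller's buffer were reused, the stored key would change under the object. With the real library, json_strdup() is the documented way to avoid this.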

Parser

The simplest way to parse a complete JSON file is to call json_parse(), which returns a value tree representing the contents of the file.

Alternatively, you can read the input token by token: call json_set_input() and then repeat json_next_token(). If you are parsing huge JSON files, you probably want to do json_push() first, then scan and process some tokens, and then json_pop().

All parsing functions throw LibUCW exceptions of class ucw.json.parse upon errors. If you want to catch them, call the parser inside a transaction.


struct json_node *json_parse(struct json_context *js, struct fastbuf *fb);

Parses a JSON file from the given fastbuf stream.


void json_set_input(struct json_context *js, struct fastbuf *in);

Selects the given fastbuf stream as parser input.


struct json_node *json_next_token(struct json_context *js);

Reads the next token from the input.


struct json_node *json_peek_token(struct json_context *js);

Reads the next token, but keeps it in the input.


struct json_node *json_next_value(struct json_context *js);

Reads the next JSON value, including nested values.

Writer

JSON files can be produced by simply calling json_write().

If you want to generate the output on the fly (for example if it is huge), call json_set_output() and then iterate json_write_value().

By default, we produce a single-line compact representation, but you can choose differently by setting the appropriate format_options in the json_context.


void json_write(struct json_context *js, struct fastbuf *fb, struct json_node *n);

Writes a JSON file to the given fastbuf stream, containing the JSON value n.


void json_set_output(struct json_context *js, struct fastbuf *fb);

Selects the given fastbuf stream as output.


void json_write_value(struct json_context *js, struct json_node *n);

Writes a single JSON value to the output stream.


enum json_format_option {
  JSON_FORMAT_ESCAPE_NONASCII = 1,      // Produce pure ASCII output by escaping all Unicode characters in strings
  JSON_FORMAT_INDENT = 2,               // Produce pretty indented output
};

Formatting options. The format_options field in the context is a bitwise OR of these flags.
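
Since the flags are distinct bits, they can be combined and tested with the usual bitwise operators. The sketch below shows this on a plain unsigned int. It is a standalone illustration: the enum is copied from the listing above so that the block compiles by itself, and the toy_* helpers are hypothetical names, not library functions.

```c
#include <assert.h>
#include <stdbool.h>

// Copied from the listing above so that this sketch is self-contained.
enum json_format_option {
  JSON_FORMAT_ESCAPE_NONASCII = 1,
  JSON_FORMAT_INDENT = 2,
};

// Hypothetical helper: returns opts with one more flag switched on.
static unsigned int toy_with_option(unsigned int opts, enum json_format_option flag)
{
  assert(flag == JSON_FORMAT_ESCAPE_NONASCII || flag == JSON_FORMAT_INDENT);
  return opts | (unsigned int) flag;
}

// Hypothetical helper: returns opts with one flag switched off.
static unsigned int toy_without_option(unsigned int opts, enum json_format_option flag)
{
  return opts & ~(unsigned int) flag;
}

// Hypothetical helper: tests whether a flag is set.
static bool toy_has_option(unsigned int opts, enum json_format_option flag)
{
  return (opts & (unsigned int) flag) != 0;
}
```

With a real context, the same combination would be stored in the public format_options field before writing, for example js->format_options = JSON_FORMAT_INDENT | JSON_FORMAT_ESCAPE_NONASCII.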