JSON vs XML: when did these data formats originate, and how are they used in RPG (and development in general)? Both have a long history behind them, and each is derived from, inspired by, or parent to a variety of other ways to hold data. This article refers back to our previous blog posts on REST and SOAP APIs (and the comparison between them). It is the third in a series of three: the first article details XML’s history, the second looks into JSON, and this one pits JSON against XML and compares the benefits of each. Before getting too deep into the weeds, let’s define a few important terms used in this series.
The realm of client-server communication is loaded with acronyms and terms to keep track of. This section elaborates upon a few key terms:
JSON vs. XML: the two data formats are everywhere. Each one has unique strengths and weaknesses. JSON is the preferred format for most data interchange across REST APIs, but XML certainly still has its place.
No. | JSON | XML |
---|---|---|
1 | Released in 2001 | Released in 1996 |
2 | More concise | More verbose |
3 | Easier to read | More complex |
4 | More lightweight | More text-heavy |
5 | For data interchange | Markup language / document-oriented |
6 | Used with REST | Used with SOAP & REST |
7 | Includes strict data types | Lacks strict data types |
8 | Explicit Array / Object support | No such support |
9 | No built-in schema support for strict validation | Supports schemas for strict validation |
10 | Doesn't require extensibility | More extensible |
11 | Supports only UTF-8, UTF-16, and UTF-32 | Supports multiple encoding standards |
12 | No comment support | Supports comments |
1. JSON dates from 2001, and it took about 10-15 years to surpass XML in popularity. XML itself dates from 1996, when its first working draft was published; XML 1.0 became a W3C Recommendation in 1998.
2. JSON is much more concise (that is, sparing in its use of excess characters) than XML. This is clear when the same data is shown in both formats. For example, an XML document like this…
<Parent>
  <Child1>Alex</Child1>
  <Child2>Belle</Child2>
  <Child3>Charles</Child3>
</Parent>
… can be expressed analogously through JSON like this…
{
"Parent": {
"Child1": "Alex",
"Child2": "Belle",
"Child3": "Charles"
}
}
Excluding whitespace and indentation (i.e. “minifying” the code), the XML totals 84 characters, while the JSON expresses the same data in 64 characters.
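The character counts can be checked with a short script. This is a minimal sketch in Python using only the standard library; it assumes the XML element names mirror the JSON keys (Parent, Child1, Child2, Child3).

```python
import json
import xml.etree.ElementTree as ET

# Sample data from the JSON example above.
data = {"Parent": {"Child1": "Alex", "Child2": "Belle", "Child3": "Charles"}}

# Minified JSON: no whitespace after the ',' and ':' separators.
json_text = json.dumps(data, separators=(",", ":"))

# Equivalent XML, assuming one element per JSON key.
parent = ET.Element("Parent")
for key, value in data["Parent"].items():
    ET.SubElement(parent, key).text = value
xml_text = ET.tostring(parent, encoding="unicode")

print(len(xml_text), xml_text)    # XML length and minified text
print(len(json_text), json_text)  # JSON length and minified text
```

Most of the difference comes from the closing tags, which repeat each element name a second time.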
3. Following from the previous point, JSON’s brevity also makes it easier to read. Every key in JSON appears only once for a given value, object, or array, while XML’s markup format means every element name appears twice (in the opening and closing tags).
4. A third beneficial aspect of JSON’s condensed nature is that it makes transported data lightweight. Fewer characters in calls translate to fewer bytes of information passed around.
5. JSON and XML have different design philosophies, and these shape how each one interchanges data. JSON is much more of a data-centered format than XML, which is more document-oriented. Document-oriented formats tend to rely on metadata (data about the data). For example, HTML relies on metadata contained in attributes and tags to determine how to transform and display the web page based on a linked stylesheet. XML does not contain display information within the document (generally this is found within an XSLT document instead), but it does make use of metadata in the form of attributes, tag names, namespaces, and even the hierarchical structure of the overall document to give additional context to the data contained within. For the most part JSON does not use comparable metadata, although nested objects and arrays do give it a hierarchical structure. There are no attributes in JSON; where one might have used attributes on an element in XML, one might instead create a child object in JSON that holds those values as separate key/value pairs, still linked together under a single parent.
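To illustrate the attribute-to-child-object idea, here is a minimal Python sketch. The `Price` element, its attributes, the `element_to_dict` helper, and the `"value"` key are all hypothetical names chosen for this example; there is no single standard mapping from XML attributes to JSON.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical XML fragment: "id" and "currency" are attributes (metadata).
xml_text = '<Price id="42" currency="USD">19.99</Price>'

def element_to_dict(element: ET.Element) -> dict:
    """Fold an element's attributes and text into one JSON-style object."""
    node = dict(element.attrib)  # attributes become ordinary key/value pairs
    if element.text and element.text.strip():
        node["value"] = element.text.strip()  # "value" is an assumed key name
    return {element.tag: node}

price = element_to_dict(ET.fromstring(xml_text))
print(json.dumps(price, indent=2))
```

In the resulting JSON, the former attributes and the element text sit side by side as keys of one parent object, so the distinction between data and metadata is no longer visible in the format itself.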
6. It’s no surprise that different data formats are more tightly associated with different styles of HTTP interchange. Both JSON and XML can be used with REST APIs, but only XML can be used with SOAP. REST is an architectural style rather than a protocol, and it is data-format-independent, which makes it very flexible. SOAP as a standard was first described when XML was the most widely used data format, and so XML is the only accepted data format for SOAP. SOAP relies on strict formatting for header elements and tags within the document, and so the use of XSDs and WSDLs, which are available only with XML, is common; JSON has no built-in equivalent for this kind of strict validation. Additionally, SOAP requests sent over HTTP are almost always POST requests, whereas REST uses the full range of HTTP methods (GET, POST, PUT, DELETE, and so on). REST is stateless by design: each request carries everything the server needs to process it. SOAP is also stateless by default, though it can be extended to support stateful operations.
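As a rough illustration of the difference in payload shape, the sketch below builds the same request twice: once wrapped in a SOAP 1.1 envelope and once as a bare JSON body. The `GetCustomer` operation, the `CustomerId` and `customerId` field names, and both helper functions are hypothetical; a real SOAP service would define its operations in a WSDL.

```python
import json
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

def build_soap_body(customer_id: str) -> str:
    """Wrap a hypothetical GetCustomer call in a SOAP 1.1 envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, "GetCustomer")  # hypothetical operation name
    ET.SubElement(operation, "CustomerId").text = customer_id
    return ET.tostring(envelope, encoding="unicode")

def build_rest_body(customer_id: str) -> str:
    """The same information as a bare JSON body for a REST request."""
    return json.dumps({"customerId": customer_id})

soap_body = build_soap_body("1001")
rest_body = build_rest_body("1001")
print(soap_body)
print(rest_body)
```

The SOAP version must always carry the envelope and body wrappers, while the JSON version contains only the data.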
7. One major difference when comparing JSON vs. XML is that JSON values carry built-in data types (string, number, boolean, and null, alongside objects and arrays). XML does not natively define data types: element content is simply text, although XSDs allow for some definition of what data is expected in a given field.
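The difference shows up as soon as the data is parsed. In this minimal Python sketch (the `person` document and its fields are made up for illustration), the JSON parser returns typed values, while every XML element's content comes back as a string that the application must convert itself.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample documents holding the same fields.
json_text = '{"name": "Alex", "age": 30, "active": true, "nickname": null}'
xml_text = "<person><name>Alex</name><age>30</age><active>true</active></person>"

person_json = json.loads(json_text)
person_xml = {child.tag: child.text for child in ET.fromstring(xml_text)}

# Record the Python type each parser produced for every field.
json_types = {key: type(value).__name__ for key, value in person_json.items()}
xml_types = {key: type(value).__name__ for key, value in person_xml.items()}

print(json_types)  # number, boolean, and null each map to a distinct type
print(xml_types)   # every field is plain text
```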
8. JSON has explicit data types to represent an object (a set of key/value pairs, comparable to a parent element that contains one or more child tags) or an array (an ordered collection of values). Both have specific notation to demarcate the beginning and end of the object or array within the document: curly braces for objects and square brackets for arrays. In XML, the equivalent of a JSON object is just an element that, instead of containing data, contains only other elements. The XML equivalent of an array is accomplished by repeating the desired element (generally within a containing parent element). The parent elements of the “object” or “array” are notated the same as any other element in the document; there is no specific demarcation. If your XML document is not displayed with indentation, it can be difficult to determine when you have reached the end of a parent element.
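A short Python sketch makes the contrast concrete. The `children` and `child` names are hypothetical. With JSON the parser hands back a list directly; with XML the application has to know that the repeated element is meant to be read as a collection.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical documents: the same three names as a JSON array and as repeated XML elements.
json_text = '{"children": ["Alex", "Belle", "Charles"]}'
xml_text = "<children><child>Alex</child><child>Belle</child><child>Charles</child></children>"

# JSON: the square brackets mark an array, so the parser returns a list.
json_children = json.loads(json_text)["children"]

# XML: nothing marks <children> as an array; we choose to gather every <child>.
xml_children = [child.text for child in ET.fromstring(xml_text).findall("child")]

print(json_children)
print(xml_children)
```

Note that if the XML held only one `<child>`, the document alone could not say whether it was a single value or a one-item list, whereas JSON distinguishes `"Alex"` from `["Alex"]`.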
9. Schema documents, known as XSDs, are used to provide element and attribute definitions and validation for XML documents. They can be used to define custom field types that can be referenced using namespaces within XML documents. JSON itself does not have a built-in equivalent to XSDs or namespaces. It instead relies on its native data types, and the community generally uses external tools (like JSON Schema or OpenAPI) to define, validate, and document their APIs.
10. Extensibility is the means by which a given software system or format can be enhanced beyond its base capabilities without major rewriting of code or alteration of its basic architecture. XML achieves extensibility through user-defined tags, expanded further by XSDs for additional definition of field types and validation. JSON is not usually described as extensible because, by design, it does not need to be: it is not restricted by a markup-language format, and new keys can be added to an object freely.
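11. JSON simplifies encoding by allowing only UTF-8, UTF-16, and UTF-32, and the current specification (RFC 8259) requires UTF-8 for JSON exchanged between systems that are not part of a closed ecosystem. XML can accept a variety of encodings, which increases its versatility but adds complexity. An XML document names its encoding in the declaration on its first line, for example <?xml version="1.0" encoding="UTF-8"?> or <?xml version="1.0" encoding="ISO-8859-1"?>.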
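The sketch below shows how this plays out in practice, using Python's standard library. The sample value and the three encodings chosen for each format are assumptions for illustration. The XML parser reads the encoding from the declaration, while Python's `json.loads` accepts bytes and detects UTF-8, UTF-16, or UTF-32 on its own.

```python
import json
import xml.etree.ElementTree as ET

name = "Zo\u00eb"  # sample value containing a non-ASCII character

# XML: the declaration names the encoding and the parser follows it.
xml_results = {}
for encoding in ("UTF-8", "UTF-16", "ISO-8859-1"):
    document = f'<?xml version="1.0" encoding="{encoding}"?><name>{name}</name>'
    xml_results[encoding] = ET.fromstring(document.encode(encoding)).text

# JSON: no declaration exists; the parser infers the Unicode encoding from the bytes.
json_results = {}
for encoding in ("UTF-8", "UTF-16", "UTF-32"):
    payload = json.dumps({"name": name}, ensure_ascii=False).encode(encoding)
    json_results[encoding] = json.loads(payload)["name"]

print(xml_results)
print(json_results)
```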
12. As it is designed to be a lightweight and flexible data-interchange format, JSON does not allow for commenting. As stated by Douglas Crockford (who wrote JSON’s specifications), “I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability.” XML documents allow for comments using the form <!-- comment text -->.
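A strict parser makes the difference visible. In this minimal Python sketch (the `config` and `retries` names are hypothetical), the XML parser skips over the comment, while the standard JSON parser rejects a document that contains one.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical documents carrying the same setting and the same comment.
xml_text = "<config><!-- retry limit for the batch job --><retries>3</retries></config>"
json_text = '{"retries": 3 /* retry limit for the batch job */}'

# XML: the comment is legal and the data is still readable.
retries = ET.fromstring(xml_text).findtext("retries")

# JSON: the comment is a syntax error for a standards-compliant parser.
try:
    json.loads(json_text)
    json_accepts_comments = True
except json.JSONDecodeError:
    json_accepts_comments = False

print(retries, json_accepts_comments)
```

Some JSON-like formats and lenient parsers do tolerate comments, but such documents are no longer plain JSON.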
Modern development is trending towards REST and JSON due to their flexibility and overall approachability, but XML has not lost its place. While the concise nature of JSON might make for a smaller payload, some applications still require the strict validation that only an XSD and an XML document can offer, and so they stick with a SOAP service. As developers, we also know that sometimes it comes down to how someone 20 years ago wanted to store the data, and momentum from that decision is driving your work today. Regardless, we’re here to help.