JSON vs XML: when did these data formats originate, and how are they used in RPG (and in development in general)? Both have a long history behind them, and each is either derived from or parent to a variety of other data formats. This article occasionally refers to our previous blog posts on REST vs SOAP APIs (and a comparison of the two). It is the first in a series of three: this article explores XML, including its origins, general structure, and typical uses in RPG; the next explores JSON in a similar fashion; and the third draws comparisons between the two. Before getting too deep into the weeds, let’s define a few important terms used in this series.
The realm of client-server communication is loaded with acronyms and terms to keep track of. To begin, here are several key terms and what they mean.
Let’s dig into XML and find out what each part of its name means. As mentioned in the “Definitions” section, XML stands for Extensible Markup Language.
Oracle’s documentation describes extensibility as “an ability of the software system to allow and accept the significant extension of its capabilities without major rewriting of code or changes in its basic architecture. Extensible systems provide technology, tools, languages that are designed so that developers can expand or add to its capabilities.”
For example, when compared to HTML, XML is extensible. While HTML requires you to use a specific set of predefined tags (such as <body>, <head>, <div>, etc.), XML accommodates user-defined tags (such as <OrderNumber>, <Quantity>, <Location>, <Status>, etc.).
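As a minimal sketch of what user-defined tags look like in practice, the snippet below builds a small document with Python's standard library. The tag names come from the examples above; the enclosing <Order> element and all of the values are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# <Order> and the values below are hypothetical; only the child tag
# names (OrderNumber, Quantity, Location, Status) come from the article.
order = ET.Element("Order")
ET.SubElement(order, "OrderNumber").text = "10045"
ET.SubElement(order, "Quantity").text = "3"
ET.SubElement(order, "Location").text = "Warehouse A"
ET.SubElement(order, "Status").text = "Shipped"

# Serialize the tree to a text string.
xml_text = ET.tostring(order, encoding="unicode")
print(xml_text)
```

Nothing had to be registered or predefined before these tags could be used, which is the practical meaning of "extensible" here.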
Techtarget.com says that “Markup refers to the sequence of characters or other symbols that you insert at certain places in a text or word processing file to indicate how the file should look when it is printed or displayed or to describe the document’s logical structure. The markup indicators are often called tags.”
Techtarget.com also explains that “A markup language is a computer language that uses tags to define elements within a document. It is human-readable, meaning markup files contain standard words, rather than typical programming syntax.”
While most of the languages programmers use exist to build programs (think RPG, COBOL, or CL), a markup language such as XML can be considered a metalanguage (i.e., a language used to describe another language).
To get a broad understanding of XML, as well as some trivia, let’s look into XML’s predecessors, and where they came from. Furthermore, we’ll explore the offshoots of XML and similar data formats built parallel to it.
The earliest roots of XML lie in GML (Generalized Markup Language), developed by Charles Goldfarb, Edward Mosher, and Raymond Lorie at IBM in the late 1960s. Created long before the advent of HTTP, GML was designed not for data interchange across the web, but for document representation. GML documents were built using markup tags that described what each part of a document was (a heading, a paragraph, a list), rather than dictating exactly how it would look when rendered.
Extending from GML came SGML, which was formalized as an ISO standard in 1986, and defined as having a scope which includes “Standardization in the field of document structures, languages and related facilities for the description and processing of compound and hypermedia documents.”
Darla Ferrara at ThoughtCo describes SGML’s use well:
SGML, HTML, and XML are all markup languages… In this family of markup languages, Standard Generalized Markup Language (SGML) is the parent. SGML provides a way to define markup languages and sets the standard for their form. In other words, SGML states what some languages can or cannot do, what elements must be included, such as tags, and the basic structure of the language. As a parent passes on genetic traits to a child, SGML passes structure and format rules to markup languages.
HTML, one of the most familiar markup languages, is an application of SGML and was released in 1993. It has evolved greatly over the years and is the primary language used to display web pages.
First drafted in 1996 and published as a W3C Recommendation in 1998, XML is a simplified subset of SGML and is one of the most widely used data formats today. While HTML worked well for building websites, its fixed set of tags made it unsuitable for storing data efficiently. XML was created in response to that. As described by Chris Collins in ‘A Brief History of XML,’
When it comes to data storage and interchange, HTML is a bad fit, as it was originally intended as a presentation technology, while SGML is considered too complex for general use. XML bridges this gap by being both human and machine readable, while being flexible enough to support platform and architecture independent data interchange.
XHTML is a reformulation of HTML as an XML application, and its initial release was in 2000. As stated by w3schools.com, “XHTML was developed to make HTML more extensible and flexible to work with other data formats (such as XML). In addition, browsers ignore errors in HTML pages, and try to display the website even if it has some errors in the markup. So XHTML comes with a much stricter error handling.”
XML is built using a tree structure, and begins with a single root element, which expands into a series of child, parent, and sibling elements. An XML document might be structured like this:
Alex
30
Taylor
32
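To make the tree structure concrete, here is a small sketch that parses a document and prints it as an indented outline. The tag names (family, person, name, age) are assumptions chosen for illustration; only the values Alex, 30, Taylor, and 32 appear in this article.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the tag names are illustrative assumptions.
DOC = """<?xml version="1.0" encoding="UTF-8"?>
<family>
  <person>
    <name>Alex</name>
    <age>30</age>
  </person>
  <person>
    <name>Taylor</name>
    <age>32</age>
  </person>
</family>"""

def walk(element, depth=0):
    """Return one outline line per element, indented by tree depth."""
    text = (element.text or "").strip()
    label = f"{element.tag}: {text}" if text else element.tag
    lines = ["  " * depth + label]
    for child in element:  # iterating an element yields its direct children
        lines.extend(walk(child, depth + 1))
    return lines

root = ET.fromstring(DOC.encode("utf-8"))
outline = walk(root)
print("\n".join(outline))
```

Here family is the root, each person is a child of the root and a sibling of the other person, and each person is in turn the parent of a name and an age element.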
XML is composed of four principal items: an opening prolog, elements, their respective tags, and optional attributes.
An XML element consists of its start tag, its end tag, and everything between them. In the example above, everything contained within…
...
… is an element, and everything contained within…
Alex
is an element too. A well-formed XML document must contain exactly one root element, which contains all of the document’s other elements.
XML tags form the structure of an XML document and define the scope of elements. Tags consist of a starting tag like…
… and an end tag, like…
End tags must have a solidus, or forward slash (“/”), before the element name.
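The tag and root-element rules are enforced by any conforming parser. The sketch below uses Python's standard library to show that idea; the <name> tag is a hypothetical example, and is_well_formed is a helper written for this illustration only.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

good = "<name>Alex</name>"                          # start tag, content, end tag
missing_slash = "<name>Alex<name>"                  # "end tag" lacks the solidus
two_roots = "<name>Alex</name><name>Taylor</name>"  # more than one root element

for sample in (good, missing_slash, two_roots):
    print(sample, "->", is_well_formed(sample))
```

Without the slash, the parser reads the second <name> as the start of a new nested element, so neither element is ever closed and the document is rejected.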
A prolog (also called the XML declaration) is a specially formatted line, not an element, that declares an XML document’s version as well as its character encoding (encoding is the process of converting characters into a binary, machine-readable format; UTF-8 is the default). An XML prolog is optional, but if it’s included, it must be the very first thing in the XML document, for example: <?xml version="1.0" encoding="UTF-8"?>
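As a brief sketch of both points (the prolog carries the version and encoding, and it must come first), the snippet below serializes a hypothetical <family> element with a prolog and then shows a parser rejecting a prolog that is preceded by a blank line. It assumes Python 3.8 or later, where ET.tostring accepts xml_declaration.

```python
import xml.etree.ElementTree as ET

root = ET.Element("family")  # hypothetical element name

# Ask the serializer to emit the prolog ahead of the root element.
data = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
print(data.decode("utf-8"))

# A prolog that is not at the very start of the document is an error.
misplaced = b"\n<?xml version='1.0' encoding='UTF-8'?><family />"
try:
    ET.fromstring(misplaced)
    misplaced_accepted = True
except ET.ParseError:
    misplaced_accepted = False
print("misplaced prolog accepted:", misplaced_accepted)
```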
An attribute is used to specify a single property of an element. In the above example…
… contains the attribute ‘category’ with an attribute value of ‘daughter’.
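A short sketch of reading that attribute follows. The attribute name category and the value daughter come from the text above; the enclosing <person> and <name> tags are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# <person> and <name> are hypothetical; category="daughter" is from the article.
person = ET.fromstring('<person category="daughter"><name>Alex</name></person>')

category = person.get("category")       # value of a single attribute
all_attributes = dict(person.attrib)    # every attribute on the element
missing = person.get("nickname", "n/a") # default when the attribute is absent

print(category, all_attributes, missing)
```

Attributes describe the element itself, while child elements such as name hold its content.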
XML is one of the most firmly entrenched data formats available, and one can find companies everywhere using it. XML presents structured information in an easy-to-read text format and is used in nearly every industry. XML’s flexibility allows it to represent documents, data, transaction information, configurations, order forms, and receipts.
Many of our customers and business partners continue to use XML in its various forms. Many XML-based APIs (most of them REST) provide publicly accessible endpoints for consuming their business data. For example, programmableweb.com provides a large catalog of public APIs and documentation pages. Looking up “XML API” in their search function yields 121 pages of results.
Compared to its predecessors, XML is a more tightly defined, though still flexible means of storing and transmitting data. XML comes with a variety of benefits:
When building XML documents or when sending XML data over HTTP, having a suite of tools helps drastically reduce your development time. When connecting to a new API, assisting new customers with a proof of concept, or prototyping a new web service, our development team uses a variety of tools to accomplish programming goals. We recommend resources like these in order to help your own development.
In RPG programs, XML shows up at several points in an API workflow. As mentioned above, XML is frequently used when APIs interchange data. When sending XML to a web service, your program first needs to construct the XML. A variety of options are available, such as native tools on your IBM i or open source development tools. One of the fastest ways to develop XML of any complexity or depth is with RPG API Express’s compiled templates and composition subprocedures. As mentioned on our documentation site,
To generate XML, RPG API Express relies on what is referred to as the “template engine” or “composition engine”. This engine uses a specially marked up pseudo-XML file divided into sections and with variable fields embedded in it as well as a set of RPG API Express subprocedures. The combination of the template file and the subprocedures allow you to build XML of any complexity or depth needed to meet your business requirements.
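The general idea of composing XML from a sectioned template with variable fields can be sketched in Python. This illustrates the concept only: it does not reproduce RPG API Express's template syntax or subprocedures, and the section names, field names, and compose_order helper are all hypothetical.

```python
import xml.etree.ElementTree as ET
from string import Template
from xml.sax.saxutils import escape

# Three hypothetical template "sections", each with $variable fields.
HEADER = Template(
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<Order>\n  <OrderNumber>$order_number</OrderNumber>\n"
)
ITEM = Template("  <Item>\n    <Sku>$sku</Sku>\n    <Quantity>$quantity</Quantity>\n  </Item>\n")
FOOTER = "</Order>\n"

def compose_order(order_number, items):
    """Fill each section's fields and join the sections in document order."""
    parts = [HEADER.substitute(order_number=escape(str(order_number)))]
    for sku, quantity in items:
        # escape() turns characters such as & and < into XML entities.
        parts.append(ITEM.substitute(sku=escape(sku), quantity=escape(str(quantity))))
    parts.append(FOOTER)
    return "".join(parts)

xml_text = compose_order(10045, [("A&B-1", 2), ("C-7", 1)])
print(xml_text)

# Sanity check: the composed text parses as well-formed XML.
parsed = ET.fromstring(xml_text.encode("utf-8"))
```

Writing the repeating item section once and emitting it in a loop is what lets a template approach scale to deeply nested or repeated structures.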
After constructing XML, your program will often send it to an API in a GET or POST request over HTTP. This step is somewhat implied; you need to send out an XML request in order to get a response, after all! Upon receiving your XML request, the server your program is connecting with sends back a response (almost always containing XML, if XML is what you sent). Your RPG programs can transmit XML by using the RXS_Transmit() subprocedure, which requires only a few lines of RPG code for full configuration with an API.
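For readers who want to see the shape of such a request outside RPG, here is a minimal Python sketch that prepares an HTTP POST carrying an XML body. It is not RXS_Transmit(); the URL, the payload, and the build_xml_post helper are placeholders, and the request is only constructed, not sent.

```python
import urllib.request

def build_xml_post(url, xml_text):
    """Prepare (but do not send) a POST request with an XML body."""
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "application/xml; charset=utf-8"},
        method="POST",
    )

# Placeholder endpoint and payload, for illustration only.
request = build_xml_post(
    "https://api.example.com/orders",
    "<Order><OrderNumber>10045</OrderNumber></Order>",
)
print(request.get_method(), request.full_url)

# Sending it would look like this (commented out so the sketch runs offline):
# with urllib.request.urlopen(request, timeout=30) as response:
#     response_xml = response.read()
```

The Content-Type header tells the server to interpret the body as XML, which is typically also how it decides to answer in XML.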
After receiving XML, your program will usually parse the information. In an API workflow, parsing means extracting the contents of a semi-structured document (e.g., XML or JSON) for further processing. When parsing XML in RPG, programs will typically store the parsed data in Db2 tables or in variables declared within the program itself. Another option is to write the XML to a file within the IFS (for instance, by using the RXS_PutStmf() subprocedure).
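The same two options, parsing into program variables or saving the raw XML to a file, can be sketched in Python. The response document, the Order record, and the file name are assumptions for illustration; a temporary directory stands in for an IFS path.

```python
import tempfile
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from pathlib import Path

# Hypothetical response body from an API.
RESPONSE = (
    b"<Order><OrderNumber>10045</OrderNumber>"
    b"<Quantity>3</Quantity><Status>Shipped</Status></Order>"
)

@dataclass
class Order:
    order_number: str
    quantity: int
    status: str

def parse_order(xml_bytes):
    """Extract element text into typed program variables."""
    root = ET.fromstring(xml_bytes)
    return Order(
        order_number=root.findtext("OrderNumber", default=""),
        quantity=int(root.findtext("Quantity", default="0")),
        status=root.findtext("Status", default=""),
    )

order = parse_order(RESPONSE)
print(order)

# Alternative: keep the raw XML on disk for later processing.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "response.xml"
    path.write_bytes(RESPONSE)
    saved = path.read_bytes()
```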
Companies everywhere are exchanging XML. For example, thousands of companies use Oracle’s XML Publisher, which is an XML-based template system for EDI data exchanges. Note that it is vastly different from RPG API Express’s template system, which employs command line tools for generating XML.
When going the hybridized “RPG + open source” route, there are a few free options available, although some of these require large, complex data structures, especially when building XML with child elements nested many layers deep. Options include IBM’s XML Toolkit and DATA-GEN (a tool which helps developers create XML documents), as well as older open source options like CGIDEV2 (which normally works with HTML; XML is similar enough that its templates can be adapted). Finally, SQL offers a broad range of capabilities: within an SQLRPGLE program, XMLTABLE with a column layout can map XML data to rows and columns.
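The idea behind pairing a row expression with a column layout can be sketched in Python as an analogy. This is not Db2's XMLTABLE syntax; the xml_table helper, the document, and the column names are hypothetical.

```python
import xml.etree.ElementTree as ET

def xml_table(xml_text, row_path, columns):
    """Return one dict per element matching row_path.

    columns maps an output column name to a path relative to each row.
    """
    root = ET.fromstring(xml_text)
    return [
        {name: row.findtext(path) for name, path in columns.items()}
        for row in root.findall(row_path)
    ]

# Hypothetical document with two repeating <Order> elements.
DOC = (
    "<Orders>"
    "<Order><OrderNumber>10045</OrderNumber><Status>Shipped</Status></Order>"
    "<Order><OrderNumber>10046</OrderNumber><Status>Open</Status></Order>"
    "</Orders>"
)

rows = xml_table(DOC, "Order", {"order_number": "OrderNumber", "status": "Status"})
for row in rows:
    print(row)
```

Each repeating element becomes a row and each mapped path becomes a column, which is the shape a relational query expects.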
While open source is helpful (especially for learning new skills in non-production shops), it comes at the expense of not having a dedicated support team who can answer questions related to XML or RPG web services in general.