JSON vs XML, Part 1: XML in RPG

What's the difference?

JSON vs XML: when did these data formats originate, and how are they used in RPG (and in development in general)? Both have a long history, and each is either derived from or the parent of a variety of other data formats. This article refers in a few places to our previous blog posts on REST vs SOAP APIs (and a comparison of the two). This first article in a series of three explores XML: its origins, general structure, and typical uses in RPG. The next article explores JSON in a similar fashion, while the third compares the two. Before getting too deep into the weeds, let’s define a few important terms used in this series.

Definitions

The realm of client-server communication is loaded with acronyms and terms to keep track of. To begin, let’s define several key terms.

  1. XML – Extensible Markup Language – XML is a widely used data format, similarly structured to HTML. XML can be used in both SOAP and REST API calls.
  2. JSON – JavaScript Object Notation – JSON is arguably the fastest-growing data format used for communicating over HTTP today, thanks to its compact, lightweight syntax.
  3. WSDL – Web Services Description Language – WSDL is an XML-based notation used for describing the functionality of a SOAP-based web service. WSDL is used at development time to describe a web service program with procedure names, input/output parameters, the URL of the web service, and the enveloping mechanisms and transport to be used (e.g., SOAP over HTTP).
  4. XSD – XML Schema Definition – An XSD is a file type that describes the structure of an XML document.
  5. REST – Representational State Transfer – REST is an architectural style, rather than a protocol (protocols define how diverse modules interact, while styles define how sets of protocols are organized). REST utilizes existing web protocols and tools to communicate with APIs.
  6. SOAP – Simple Object Access Protocol – SOAP is an XML-based messaging protocol. SOAP web services are typically described by WSDL files, which clients use to communicate with the server.
  7. RPG – RPG is the primary language used when developing business applications on the IBM i OS. Also known as RPG Free, RPGLE, and RPG IV.
  8. IBM i – IBM i is the operating system used on IBM’s POWER servers. It was formerly named i5/OS and, before that, OS/400. The hardware platform it runs on began as the AS/400 and was later renamed iSeries, then System i, and finally IBM Power Systems.

XML - Extensible Markup Language

An expanded definition

Let’s dig into XML and find out what each part of its name means. As mentioned in the “Definitions” section, XML stands for Extensible Markup Language.

Extensible

Oracle’s documentation describes extensibility as “an ability of the software system to allow and accept the significant extension of its capabilities without major rewriting of code or changes in its basic architecture. Extensible systems provide technology, tools, languages that are designed so that developers can expand or add to its capabilities.”

For example, when compared to HTML, XML is extensible. While HTML requires you to use a specific set of predefined tags (such as <body>, <head>, <div>, etc.), XML accommodates user-defined tags (such as <OrderNumber>, <Quantity>, <Location>, <Status>, etc.).

Markup

Techtarget.com says that “Markup refers to the sequence of characters or other symbols that you insert at certain places in a text or word processing file to indicate how the file should look when it is printed or displayed or to describe the document’s logical structure. The markup indicators are often called tags.”

Language

Techtarget.com also explains that “A markup language is a computer language that uses tags to define elements within a document. It is human-readable, meaning markup files contain standard words, rather than typical programming syntax.”

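While most languages programmers use are tied closely to the development of a program (think RPG, COBOL, or CL), a markup language describes a document instead. XML goes a step further and can be considered a metalanguage (i.e., a language used to define other languages), because its user-defined tags let you create new markup vocabularies.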

What is XML’s history?

To get a broad understanding of XML, as well as some trivia, let’s look into XML’s predecessors, and where they came from. Furthermore, we’ll explore the offshoots of XML and similar data formats built parallel to it.

GML - Generalized Markup Language

The earliest roots of XML come from GML, developed by Charles Goldfarb, Edward Mosher, and Raymond Lorie at IBM in the 1960s. Made long before the advent of HTTP, GML was designed not for data interchange across the web, but for document representation. GML documents were built using a series of markup tags that described a document’s structure (its headings, paragraphs, lists, and so on) rather than spelling out exactly how each part should be formatted.

SGML - Standard Generalized Markup Language

Extending from GML came SGML, which was formalized as an ISO standard in 1986, and defined as having a scope which includes “Standardization in the field of document structures, languages and related facilities for the description and processing of compound and hypermedia documents.”

Darla Ferrara at ThoughtCo describes SGML’s use well:

SGML, HTML, and XML are all markup languages… In this family of markup languages, Standard Generalized Markup Language (SGML) is the parent. SGML provides a way to define markup languages and sets the standard for their form. In other words, SGML states what some languages can or cannot do, what elements must be included, such as tags, and the basic structure of the language. As a parent passes on genetic traits to a child, SGML passes structure and format rules to markup languages.

HTML - HyperText Markup Language

HTML, one of the most familiar coding languages, is an application of SGML and was released in 1993. It has evolved greatly over the years, and is the primary language used for displaying web pages.

XML - Extensible Markup Language

First drafted in 1996 and published as a W3C Recommendation in 1998, XML is a simplified subset of SGML, and is one of the most widely used data formats available today. While HTML worked well for building websites, its limited extensibility made it unsuitable for storing data in an efficient manner. XML was created in response to that. As described by Chris Collins in ‘A Brief History of XML,’

When it comes to data storage and interchange, HTML is a bad fit, as it was originally intended as a presentation technology, while SGML is considered too complex for general use. XML bridges this gap by being both human and machine readable, while being flexible enough to support platform and architecture independent data interchange.

XHTML - Extensible HyperText Markup Language

XHTML is an extension of both HTML and XML, and its initial release was in 2000. As stated by w3schools.com, “XHTML was developed to make HTML more extensible and flexible to work with other data formats (such as XML). In addition, browsers ignore errors in HTML pages, and try to display the website even if it has some errors in the markup. So XHTML comes with a much stricter error handling.”

Figure: Flowchart of GML, SGML, HTML, XML, and XHTML history.
Note: the one or two lines of text under each circular markup symbol point out general document requirements (or the lack thereof) from GML through XHTML.

How is XML structured?

XML is built using a tree structure, and begins with a single root element, which expands into a series of child, parent, and sibling elements. An XML document might be structured like this:

				
<?xml version="1.0" encoding="UTF-8"?>
<Root_Family>
	<ParentA>
		<Child_Sibling1 category="daughter">
			<Name>Alex</Name>
			<Age>30</Age>
		</Child_Sibling1>
		<Child_Sibling2 category="son">
			<Name>Taylor</Name>
			<Age>32</Age>
		</Child_Sibling2>
	</ParentA>
</Root_Family>
				
			

XML is composed of four principal items: an optional opening prolog, elements, the tags that delimit them, and optional attributes.

Element

An XML element consists of everything from its start tag to its end tag, including the tags themselves. In the example above, everything contained within…

<Root_Family>
	...
</Root_Family>

… is an element, and everything contained within…

<Name>Alex</Name>

… is an element too. A well-formed XML document must contain exactly one root element, which contains all of the document’s other elements.
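
The nesting of elements can be made visible by walking the tree with a parser. The following is a minimal sketch in Python (used here only for illustration; it is not RPG and not part of RPG API Express) that uses the standard library’s xml.etree.ElementTree module. The helper name element_paths is hypothetical.

```python
import xml.etree.ElementTree as ET

# The family document from this article, with matching end tags.
DOC = """<?xml version="1.0" encoding="UTF-8"?>
<Root_Family>
    <ParentA>
        <Child_Sibling1 category="daughter">
            <Name>Alex</Name>
            <Age>30</Age>
        </Child_Sibling1>
        <Child_Sibling2 category="son">
            <Name>Taylor</Name>
            <Age>32</Age>
        </Child_Sibling2>
    </ParentA>
</Root_Family>
"""


def element_paths(element, prefix=""):
    """Return the path of every element, parents before their children."""
    path = f"{prefix}/{element.tag}"
    paths = [path]
    for child in element:  # iterating an element yields its direct children
        paths.extend(element_paths(child, path))
    return paths


root = ET.fromstring(DOC.encode("utf-8"))
paths = element_paths(root)
for path in paths:
    print(path)
```

Every printed path begins with /Root_Family, which reflects the rule that a single root element contains everything else.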

Tag

XML tags form the structure of an XML document and define the scope of elements. Each element is delimited by a start tag, like…

<Age>

… and an end tag, like…

</Age>

End tags must have a solidus, or forward slash (“/”), before the element name, and that name must match the start tag exactly (XML is case-sensitive).
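
A parser will reject a document whose start and end tags do not pair up. As a rough illustration, the Python sketch below (standard library only; the function name is_well_formed is hypothetical) reports whether a fragment parses.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


matching = is_well_formed("<Age>30</Age>")      # start and end tags agree
mismatched = is_well_formed("<Age>30</Name>")   # end tag names a different element
unclosed = is_well_formed("<Age>30")            # end tag is missing entirely
wrong_case = is_well_formed("<Age>30</age>")    # tag names are case-sensitive

print(matching, mismatched, unclosed, wrong_case)
```

Only the first fragment is accepted; the other three are rejected by the parser.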

Prolog

A prolog is a specially formatted declaration (not an element) that states an XML document’s version, as well as which character encoding it uses (encoding is the process of converting Unicode characters into a binary, machine-readable format; UTF-8 is the default). An XML prolog is optional, but if it’s included, it must be the very first thing in the XML document, as shown above:

<?xml version="1.0" encoding="UTF-8"?>
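
To see what the prolog does in practice, the Python sketch below (standard library only, shown purely as an illustration) parses a document whose prolog declares ISO-8859-1 instead of UTF-8, then checks what happens when a blank line precedes the prolog. The helper name parses is hypothetical.

```python
import xml.etree.ElementTree as ET

# "Jos\u00e9" is "José"; the bytes below are ISO-8859-1, as the prolog declares.
latin1_doc = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><Name>Jos\u00e9</Name>'
).encode("iso-8859-1")

# The parser reads the declared encoding and decodes the bytes accordingly.
name = ET.fromstring(latin1_doc).text


def parses(data):
    """Return True if the parser accepts data, False if it raises ParseError."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


prolog_first = parses(b'<?xml version="1.0" encoding="UTF-8"?><Name>Alex</Name>')
prolog_after_blank_line = parses(b'\n<?xml version="1.0" encoding="UTF-8"?><Name>Alex</Name>')
no_prolog = parses(b"<Name>Alex</Name>")  # the prolog is optional

print(name, prolog_first, prolog_after_blank_line, no_prolog)
```

The document with a blank line ahead of the prolog is rejected, which is why stray whitespace at the top of a generated XML file is worth checking for.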
				
			

Attribute

An attribute is used to specify a single property of an element. In the above example…

<Child_Sibling1 category="daughter">

… contains the attribute ‘category’ with an attribute value of ‘daughter’. Attribute values must be enclosed in straight single or double quotes.
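
Attributes are read separately from an element’s text content. A minimal Python sketch (standard library only, for illustration; the ‘nickname’ attribute is a made-up name used to show a missing attribute):

```python
import xml.etree.ElementTree as ET

child = ET.fromstring(
    '<Child_Sibling1 category="daughter"><Name>Alex</Name></Child_Sibling1>'
)

category = child.get("category")          # value of an attribute that exists
nickname = child.get("nickname", "n/a")   # fallback for an attribute that does not
all_attributes = dict(child.attrib)       # every attribute as a name/value mapping
name_text = child.findtext("Name")        # element content is separate from attributes

# Curly "smart" quotes (U+201D), often introduced by word processors,
# are not valid attribute delimiters, so this fragment fails to parse.
try:
    ET.fromstring("<Child_Sibling1 category=\u201ddaughter\u201d/>")
    curly_quotes_accepted = True
except ET.ParseError:
    curly_quotes_accepted = False

print(category, nickname, all_attributes, name_text, curly_quotes_accepted)
```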

Who uses XML?

XML is one of the most firmly entrenched data formats available, and one can find companies everywhere using it. XML presents structured information in an easy-to-read text format and is used in nearly every industry. XML’s flexibility allows it to represent documents, data, transaction information, configurations, order forms, and receipts.

Many of our customers and business partners continue to use XML in its various forms. Many XML-based APIs (most of them REST) provide publicly accessible endpoints for consuming their business data. For example, programmableweb.com provides a large catalog of public APIs and documentation pages. Looking up “XML API” in their search function yields 121 pages of results.

What are its benefits?

Compared to its predecessors, XML is a more tightly defined, though still flexible, means of storing and transmitting data. XML comes with a variety of benefits:

  • XML is extensible: developers can add new functionality through custom tags.
  • XML is human-readable, making it easier for people to understand. Those already familiar with HTML will find XML very similar.
  • XML is more flexible than HTML for describing data, as it is not limited to a fixed library of predefined tags.
  • XML’s straightforward structure makes it easy to create and parse.
  • XML document prologs declare the document’s encoding in a way that is easy to understand.
  • XML is language-independent, making it available to all major programming languages.

Tools & resources for manipulating XML

When building XML documents or when sending XML data over HTTP, having a suite of tools helps drastically reduce your development time. When connecting to a new API, assisting new customers with a proof of concept, or prototyping a new web service, our development team uses a variety of tools to accomplish programming goals. We recommend resources like these in order to help your own development.

  1. Visual Studio Code – Our team often uses VS Code when creating XML documents in their initial form. Its interface is easy to navigate and use, and supports a variety of languages.
  2. Using XSD Validation within RPG API Express – This article goes over some basic rules to follow when constructing XML in RPG, as well as the benefits of using an XSD file to validate the structure of your XML documents. See this documentation page for information on the related subprocedure RXS_Validate().
  3. XML Validator – XSD (XML Schema) – If you’re not using RXS_Validate(), there are plenty of free tools available online, such as this validator.
  4. Handling Reserved Characters in Your XML Data – This article on XML reserved characters in RPG (such as the “&” and “<” characters) provides an alternative to using XML entities by leveraging CDATA (or Character Data).
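
The reserved-character issue mentioned in the last item can be sketched briefly. The Python example below (standard library only, for illustration; the customer name is made-up sample data) shows the same text carried first with XML entities and then inside a CDATA section, and confirms that a parser recovers the original text either way.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "Smith & Sons <East>"  # contains the reserved characters & and <

# Option 1: replace reserved characters with entities (&amp;, &lt;, &gt;).
with_entities = f"<Customer>{escape(raw)}</Customer>"

# Option 2: wrap the text in CDATA so the parser treats it as plain characters.
# Caveat: the text itself must not contain the sequence "]]>".
with_cdata = f"<Customer><![CDATA[{raw}]]></Customer>"

text_from_entities = ET.fromstring(with_entities).text
text_from_cdata = ET.fromstring(with_cdata).text

# Inserting the raw text without either technique produces invalid XML.
try:
    ET.fromstring(f"<Customer>{raw}</Customer>")
    unescaped_accepted = True
except ET.ParseError:
    unescaped_accepted = False

print(with_entities)
print(with_cdata)
print(text_from_entities == raw, text_from_cdata == raw, unescaped_accepted)
```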

When do I use XML in RPG programs?

XML typically appears at several points in an RPG program’s workflow. As mentioned above, XML is frequently used when APIs interchange data. When sending XML to a web service, your program first needs to construct the XML. A variety of options are available, such as native tools on your IBM i or open source development tools. One of the fastest ways to develop XML of any complexity or depth is with RPG API Express’s compiled templates and composition subprocedures. As mentioned on our documentation site,

To generate XML, RPG API Express relies on what is referred to as the “template engine” or “composition engine”. This engine uses a specially marked up pseudo-XML file divided into sections and with variable fields embedded in it as well as a set of RPG API Express subprocedures. The combination of the template file and the subprocedures allow you to build XML of any complexity or depth needed to meet your business requirements. 
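
The general idea of template-based composition (sections containing variable fields, filled in by code) can be sketched in a few lines. The Python example below is only an analogy built on the standard library’s string.Template. It is not RPG API Express’s template syntax or API, and the section names, field names, and sample order data are all hypothetical.

```python
import xml.etree.ElementTree as ET
from string import Template
from xml.sax.saxutils import escape

# Hypothetical template sections with $variable fields (not RPG API Express syntax).
ORDER_SECTION = Template(
    "<Order>\n"
    "  <OrderNumber>$order_number</OrderNumber>\n"
    "$items"
    "</Order>"
)
ITEM_SECTION = Template(
    "  <Item>\n"
    "    <Location>$location</Location>\n"
    "    <Quantity>$quantity</Quantity>\n"
    "  </Item>\n"
)


def compose_order(order_number, items):
    """Fill the item section once per item, then fill the order section."""
    item_xml = "".join(
        ITEM_SECTION.substitute(
            location=escape(str(item["location"])),  # escape reserved characters
            quantity=escape(str(item["quantity"])),
        )
        for item in items
    )
    return ORDER_SECTION.substitute(
        order_number=escape(str(order_number)), items=item_xml
    )


order_xml = compose_order(
    "A-1001",
    [{"location": "Dock 4", "quantity": 12}, {"location": "R&D Lab", "quantity": 3}],
)
parsed = ET.fromstring(order_xml)  # confirms the composed text is well-formed
print(order_xml)
```

Repeating a section once per item is what lets a small template produce XML of arbitrary length.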

After constructing XML, your program will often send it to an API in a GET or POST request over HTTP. This step is somewhat implied; you need to send out an XML request in order to get a response, after all! Upon receiving your XML request, the server your program is connecting to sends back a response (almost always containing XML, if XML is what you sent). Your RPG programs can transmit XML by using the RXS_Transmit() subprocedure, which requires only a few lines of RPG code to configure for an API.
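
For readers who want to see what such a request consists of, here is a minimal sketch in Python using the standard library’s urllib. It is not how RXS_Transmit() works internally; the URL and the order document are hypothetical, and the request is only built here, not sent.

```python
import urllib.request

API_URL = "https://example.com/api/orders"  # hypothetical endpoint

# XML request body as bytes, starting with the prolog.
body = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b"<Order><OrderNumber>A-1001</OrderNumber></Order>"
)

# An HTTP POST carrying XML: the Content-Type header tells the server
# how to interpret the body.
request = urllib.request.Request(
    API_URL,
    data=body,
    headers={"Content-Type": "application/xml; charset=utf-8"},
    method="POST",
)

print(request.get_method(), request.full_url)

# Sending is left commented out because the endpoint above does not exist:
# with urllib.request.urlopen(request) as response:
#     reply_xml = response.read()
```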

After receiving XML, your program will usually parse the information. In an API workflow, parsing is the process of extracting the contents of a semi-structured document (e.g., XML or JSON) for further processing. When parsing XML in RPG, programs will typically store the parsed data in DB2 tables declared within the program itself. Other options include parsing directly into program variables, or writing the XML to a file within the IFS (for instance, by using the RXS_PutStmf() subprocedure).
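
Parsing into program variables can be pictured as follows. This Python sketch (standard library only, for illustration; it does not show RPG API Express’s parsing subprocedures) reads the family document used earlier in this article into typed records. The Person class and parse_people function are hypothetical names.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Person:
    category: str
    name: str
    age: int


# Stand-in for an XML response received from an API.
RESPONSE = b"""<?xml version="1.0" encoding="UTF-8"?>
<Root_Family>
    <ParentA>
        <Child_Sibling1 category="daughter">
            <Name>Alex</Name>
            <Age>30</Age>
        </Child_Sibling1>
        <Child_Sibling2 category="son">
            <Name>Taylor</Name>
            <Age>32</Age>
        </Child_Sibling2>
    </ParentA>
</Root_Family>
"""


def parse_people(xml_bytes):
    """Extract one Person per child element of ParentA."""
    root = ET.fromstring(xml_bytes)
    people = []
    for child in root.find("ParentA"):
        people.append(
            Person(
                category=child.get("category", ""),
                name=child.findtext("Name", default=""),
                age=int(child.findtext("Age", default="0")),  # text to number
            )
        )
    return people


people = parse_people(RESPONSE)
for person in people:
    print(person)
```

From records like these, a program could go on to write rows to a database table or pass the values to other business logic.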

How can I create XML in RPG?

Companies everywhere are exchanging XML. For example, thousands of companies use Oracle’s XML Publisher, which is an XML-based template system for EDI data exchanges. Note, it is vastly different from RPG API Express’s template system, which employs command line tools for generating XML.

When going for a hybridized “RPG + open source” route, there are a few free options available, although some of these come at the cost of requiring colossal, complex data structures, especially when building multiple XML child elements nested many layers deep. Some of the options available include IBM’s XML Toolkit and RPG’s DATA-GEN operation (which helps developers generate XML documents), as well as older open source options like CGIDEV2 (which normally works with HTML; XML is similar enough that CGIDEV2 can be tweaked to produce it). Finally, SQL is known for its broad range of capabilities, and embedded SQL in an SQLRPGLE program can work with XML as well, for example by leveraging XMLTABLE with a column layout.
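
The relationship between a nested data structure and the XML generated from it can be sketched generically. The Python example below (standard library only) is an illustration of the idea, not a model of DATA-GEN, the XML Toolkit, or CGIDEV2. The to_element function and the sample order data are hypothetical.

```python
import xml.etree.ElementTree as ET


def to_element(tag, value):
    """Recursively convert nested dicts and lists into an XML element."""
    element = ET.Element(tag)
    if isinstance(value, dict):
        for child_tag, child_value in value.items():
            if isinstance(child_value, list):
                # A list becomes repeated sibling elements with the same tag.
                for item in child_value:
                    element.append(to_element(child_tag, item))
            else:
                element.append(to_element(child_tag, child_value))
    else:
        element.text = str(value)  # leaf value becomes element text
    return element


order = {
    "OrderNumber": "A-1001",
    "Status": "Open",
    "Item": [
        {"Location": "Dock 4", "Quantity": 12},
        {"Location": "Dock 7", "Quantity": 3},
    ],
}

order_element = to_element("Order", order)
xml_text = ET.tostring(order_element, encoding="unicode")
print(xml_text)
```

Each additional level of nesting in the XML requires a matching level in the data structure, which is why deeply nested documents lead to large structures.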

While open source is helpful (especially for learning new skills in non-production shops), it comes at the expense of not having a dedicated support team who can answer questions related to XML or RPG web services in general.
