
Converting XML to JSON is still a common task in modern software development, especially when working with legacy systems, enterprise integrations, SOAP APIs, RSS feeds, configuration files, or large data exports. While JSON dominates web APIs and frontend development, XML remains deeply embedded in many infrastructures. Converting between the two is rarely a simple mechanical operation, because XML and JSON represent data in fundamentally different ways.

Understanding the Core Differences Between XML and JSON

Before converting anything, it is essential to understand that XML and JSON are built on different data models. XML represents a document-oriented tree with ordered elements, attributes, text nodes, namespaces, and optional mixed content. JSON represents a data-oriented structure composed of objects, arrays, and primitive values such as strings, numbers, booleans, and null.

This mismatch raises unavoidable questions during conversion. Should XML attributes become JSON properties? How should repeated elements be detected and converted into arrays? What should happen to text nodes? How should namespaces be represented? Must element order be preserved? The answers depend on the goal of the conversion.

Defining the Goal of the Conversion

The rules you choose for conversion should be driven by intent. If the JSON is meant for JavaScript consumption or API responses, a pragmatic and possibly lossy conversion may be acceptable. If the JSON is used for data interchange, long-term storage, or indexing, consistency and predictability matter more. If round-trip conversion back to XML is required, information loss must be minimized.

Throughout this article, the examples will follow an explicit and documented ruleset to make the conversion predictable and understandable.

Mapping Rules Used in the Examples

The following rules are applied consistently in all examples below:

  • Elements are mapped to JSON objects.
  • Repeated sibling elements with the same name become JSON arrays.
  • Attributes are stored using an @ prefix.
  • Text content is stored using a #text key.
  • Namespaces are preserved using fully qualified keys in the form {namespaceURI}localName.
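
The rules above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it assumes the standard library's xml.etree.ElementTree parser, the function names element_to_dict and xml_to_json are hypothetical, whitespace-only text is dropped, and text that follows a child element (mixed content) is ignored.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert one element to a dict following the mapping rules listed above."""
    node = {}
    # Attributes are stored with an "@" prefix.
    for name, value in elem.attrib.items():
        node["@" + name] = value
    # Text content goes under "#text"; whitespace-only text is dropped (assumption).
    text = (elem.text or "").strip()
    if text:
        node["#text"] = text
    # Repeated siblings with the same name are collected into a list.
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            existing = node[child.tag]
            if isinstance(existing, list):
                existing.append(value)
            else:
                node[child.tag] = [existing, value]
        else:
            node[child.tag] = value
    return node


def xml_to_json(xml_text):
    root = ET.fromstring(xml_text)
    # ElementTree reports namespaced tags as "{namespaceURI}localName",
    # which matches the namespace rule without extra work.
    return json.dumps({root.tag: element_to_dict(root)}, indent=2)
```

Calling xml_to_json on each XML snippet in Examples 1 to 3 should produce the JSON shown there, apart from key order and indentation.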

Example 1: Simple Elements and Arrays

This example demonstrates how repeated XML elements naturally map to JSON arrays.

<users>
  <user>
    <name>Alice</name>
    <age>30</age>
  </user>
  <user>
    <name>Bob</name>
    <age>25</age>
  </user>
</users>

Converted JSON:

{
  "users": {
    "user": [
      {
        "name": { "#text": "Alice" },
        "age": { "#text": "30" }
      },
      {
        "name": { "#text": "Bob" },
        "age": { "#text": "25" }
      }
    ]
  }
}

Here, the converter sees that user appears more than once and emits a JSON array. Under this rule alone, a document containing a single user would produce an object instead, so consumers would have to handle both shapes. Declaring explicitly which elements are always arrays removes that ambiguity.
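
One way to make the array decision explicit is to pass the converter a set of element names that must always become arrays. The sketch below assumes Python's xml.etree.ElementTree; the function to_dict and its array_names parameter are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET


def to_dict(elem, array_names=frozenset()):
    """Convert an element; names in array_names always become lists."""
    node = {"@" + k: v for k, v in elem.attrib.items()}
    text = (elem.text or "").strip()
    if text:
        node["#text"] = text
    for child in elem:
        value = to_dict(child, array_names)
        if child.tag in array_names or child.tag in node:
            # Either declared as an array or seen before: store as a list.
            existing = node.get(child.tag, [])
            if not isinstance(existing, list):
                existing = [existing]
            existing.append(value)
            node[child.tag] = existing
        else:
            node[child.tag] = value
    return node


single = ET.fromstring("<users><user><name>Alice</name></user></users>")
stable = to_dict(single, array_names={"user"})  # "user" is a one-item list
```

With array_names={"user"}, a document containing one user and a document containing ten produce the same JSON shape, which is usually what API consumers expect.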

Example 2: Attributes and Text Nodes

This example shows how attributes and text content coexist in XML and how they can be represented in JSON without collisions.

<product id="A123" currency="USD">Laptop</product>

Converted JSON:

{
  "product": {
    "@id": "A123",
    "@currency": "USD",
    "#text": "Laptop"
  }
}

Using prefixed keys for attributes avoids name clashes with child elements and makes it clear which values originated as attributes. Text content is preserved explicitly instead of being merged or discarded.

Example 3: Namespaces and Collision Avoidance

Namespaces are one of the most common sources of errors in XML to JSON conversion. This example demonstrates how two elements with the same local name but different namespaces can be preserved safely.

<record xmlns:bk="http://example.com/book" xmlns:hr="http://example.com/hr">
  <bk:title>XML Fundamentals</bk:title>
  <hr:title>Senior Engineer</hr:title>
</record>

Converted JSON:

{
  "record": {
    "{http://example.com/book}title": {
      "#text": "XML Fundamentals"
    },
    "{http://example.com/hr}title": {
      "#text": "Senior Engineer"
    }
  }
}

By embedding the namespace URI into the JSON key, the conversion guarantees uniqueness and avoids silent overwrites. Although verbose, this strategy is stable and safe for data interchange and indexing.
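
The overwrite risk is easy to reproduce. The following sketch, assuming Python's xml.etree.ElementTree (which exposes tags in the same {namespaceURI}localName form), compares qualified keys with keys that have had the namespace stripped. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

XML = """<record xmlns:bk="http://example.com/book" xmlns:hr="http://example.com/hr">
  <bk:title>XML Fundamentals</bk:title>
  <hr:title>Senior Engineer</hr:title>
</record>"""

root = ET.fromstring(XML)

# Fully qualified keys: both titles survive.
qualified = {child.tag: child.text for child in root}

# Local names only: both children map to "title", so the second
# value silently replaces the first.
stripped = {child.tag.split("}", 1)[-1]: child.text for child in root}
```

Here qualified holds two entries while stripped holds only one, and nothing in the stripped output signals that a value was lost.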

Example 4: Mixed Content (The Hard Case)

Mixed content occurs when text and child elements are interleaved, which is common in document-centric XML but awkward to represent in JSON.

<p>
  This is <em>very</em> important.
</p>

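Converted JSON using an ordered token strategy (the content key is an extra convention on top of the ruleset above, and the whitespace at the start and end of p has been trimmed):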

{
  "p": {
    "content": [
      { "#text": "This is " },
      { "em": { "#text": "very" } },
      { "#text": " important." }
    ]
  }
}

This approach preserves order and meaning but results in a structure that is difficult to consume. For many applications, mixed content is a strong signal that XML should not be converted to JSON without carefully reconsidering the use case.
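
An ordered token list can be built from the text and tail values that Python's xml.etree.ElementTree attaches to each element. The sketch below is one possible approach under a few assumptions: the function name to_tokens is hypothetical, attributes are ignored for brevity, and whitespace is kept exactly as it appears in the input.

```python
import xml.etree.ElementTree as ET


def to_tokens(elem):
    """Return the element's text and children as an ordered list of tokens."""
    tokens = []
    if elem.text:
        tokens.append({"#text": elem.text})
    for child in elem:
        if len(child):
            # The child has child elements of its own: recurse.
            inner = {"content": to_tokens(child)}
        else:
            inner = {"#text": child.text or ""}
        tokens.append({child.tag: inner})
        # Text that follows the child belongs to the parent's content.
        if child.tail:
            tokens.append({"#text": child.tail})
    return tokens


p = ET.fromstring("<p>This is <em>very</em> important.</p>")
result = {"p": {"content": to_tokens(p)}}
```

Because whitespace is preserved verbatim, an indented version of the same paragraph would yield tokens that include the line breaks and indentation, so a real converter needs an explicit whitespace policy.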

Tools and Methods for XML to JSON Conversion

There is no universal best tool for XML to JSON conversion. JavaScript environments often rely on configurable parsers that support attributes, text nodes, and streaming. Python is well suited for batch conversion and ETL workflows using standard XML libraries combined with custom mapping logic. Java and .NET ecosystems typically use schema-driven object models followed by JSON serialization.

Streaming parsers are essential for very large XML files, but they complicate array detection and structural decisions, because the converter cannot know whether an element repeats until its parent has been fully read. Schema-informed conversion using XSD produces the most consistent JSON but requires reliable schemas and valid XML.
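
For record-oriented files, a streaming conversion can be sketched with iterparse from Python's xml.etree.ElementTree. The function name stream_records is hypothetical, and the sketch assumes flat records whose children are simple text elements with unique names, which sidesteps the array detection problem rather than solving it.

```python
import io
import xml.etree.ElementTree as ET


def stream_records(source, record_tag):
    """Yield one dict per record_tag element as soon as it has been parsed."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            yield {
                child.tag: {"#text": (child.text or "").strip()}
                for child in elem
            }
            # Drop the record's children and text so memory use stays low.
            elem.clear()


data = io.BytesIO(
    b"<users>"
    b"<user><name>Alice</name><age>30</age></user>"
    b"<user><name>Bob</name><age>25</age></user>"
    b"</users>"
)
records = list(stream_records(data, "user"))
```

Each record can be serialized and written out immediately, for example as one JSON object per line, instead of building the whole output in memory.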

Common Pitfalls to Watch For

The most frequent issues include ambiguous arrays, where the same element becomes an object in one document and an array in another; silent loss of namespaces; attributes colliding with child elements of the same name; numbers and booleans turning into strings; and order-dependent XML structures losing meaning in JSON. Security concerns such as entity expansion attacks should also be considered when parsing untrusted XML.
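
The type problem exists because XML text carries no type information, so any conversion to numbers or booleans is a guess unless a schema says otherwise. A conservative sketch in Python might look like the following. The function name coerce_scalar and the exact patterns are assumptions, chosen so that values such as zero-padded identifiers stay strings.

```python
import re

# Strict patterns: no leading zeros, no exponents, no surrounding whitespace.
_INT = re.compile(r"-?(0|[1-9]\d*)")
_FLOAT = re.compile(r"-?(0|[1-9]\d*)\.\d+")


def coerce_scalar(text):
    """Convert text to bool, int, or float only when the form is unambiguous."""
    if text in ("true", "false"):
        return text == "true"
    if _INT.fullmatch(text):
        return int(text)
    if _FLOAT.fullmatch(text):
        return float(text)
    # Anything else, including "007" or "1e5", is left as a string.
    return text
```

Whether coercion is applied at all should be part of the documented mapping, since a field like a postal code or product ID can look numeric while being a string by intent.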

Best Practices for Reliable Conversion

Always define and document a mapping specification before converting XML to JSON. Use explicit conventions for attributes and text nodes, apply consistent array rules, and include automated tests that compare expected JSON outputs against known XML inputs. If round-trip conversion matters, design the JSON structure with reversibility in mind rather than convenience.
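
Reversibility can be checked by writing the inverse mapping early. The sketch below, assuming Python's xml.etree.ElementTree and the @ and #text conventions used in this article, rebuilds an element from converted JSON. The function name dict_to_element is hypothetical.

```python
import xml.etree.ElementTree as ET


def dict_to_element(tag, node):
    """Rebuild an XML element from a dict that uses the @ and #text conventions."""
    elem = ET.Element(tag)
    for key, value in node.items():
        if key.startswith("@"):
            elem.set(key[1:], value)        # "@id" becomes the attribute id
        elif key == "#text":
            elem.text = value
        else:
            # A list means repeated siblings; a dict means a single child.
            items = value if isinstance(value, list) else [value]
            for item in items:
                elem.append(dict_to_element(key, item))
    return elem


product = dict_to_element(
    "product", {"@id": "A123", "@currency": "USD", "#text": "Laptop"}
)
xml_text = ET.tostring(product, encoding="unicode")
```

The inverse also shows what the mapping cannot recover: once siblings with the same name are grouped into an array, their original position relative to differently named siblings is gone, so documents that interleave element types will not round-trip exactly.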

When Not to Convert XML to JSON

XML should not be converted to JSON when the document is markup-heavy, when exact ordering and whitespace are significant, or when namespace precision is critical but downstream consumers will not respect it. In such cases, exposing XML through a JSON-friendly API layer is often a better solution than full conversion.

Conclusion

Converting XML to JSON is not just a technical operation but a design decision. The quality of the result depends on clear goals, explicit rules, and an understanding of what information can be safely transformed or must be preserved. With a well-defined mapping strategy and appropriate tooling, XML to JSON conversion can be predictable, reliable, and maintainable.