How to Convert XML to JSON in Python

Converting XML to JSON using Python

XML (Extensible Markup Language) is a widely used markup language that represents data in an organized, human-readable way. It was designed to store and transport data. It is case-sensitive and lets developers define their own elements, effectively creating a customized markup language.

XML has no predefined tags. XML documents are more verbose and more time-consuming to write than their JSON equivalents. In this tutorial, you will learn how to convert an XML structure to JSON.

What Is JSON?

JSON (JavaScript Object Notation) is an open-standard format used for data interchange. It stores data as human-readable text and represents data objects as name-value pairs and arrays. Developers often use JSON in place of XML because it is widely supported, lightweight, easy to read, and simple to design. Both JSON and XML serve the same purpose, transferring data between client and server, but they go about it in different ways.
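As a small illustration of name-value pairs and arrays, the sketch below serializes a Python dictionary to JSON text and parses it back using the standard json module. The record and its field names are made up for this example.

```python
import json

# A JSON object is a set of name-value pairs; a JSON array is an ordered list.
# The field names below are illustrative only.
employee = {
    "id": "TA01",
    "name": "Karlos Ray",
    "rating": 4.6,
    "skills": ["Network Security", "Malware analysis"],
}

# Serialize the Python dictionary to JSON text.
json_text = json.dumps(employee, indent=2)
print(json_text)

# Parse the JSON text back into Python objects.
restored = json.loads(json_text)
print(restored["skills"][0])
```

Dictionaries map to JSON objects and lists map to JSON arrays, which is why every conversion method in this tutorial goes through a Python dictionary first.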

Various methods to Convert XML to JSON

1. Using xmltodict Module:

xmltodict is a popular Python module that converts an XML structure into a Python dictionary, which can then be serialized to JSON. It makes working with XML feel like working with JSON. It is not part of the standard library, so you need to install it first with pip install xmltodict.

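import json

import xmltodict

# Tag names are assumed for illustration; only the values come from the original listing.
xml_data = """
<employees>
  <employee>
    <id>TA01</id>
    <name>Karlos Ray</name>
    <rating>4.6</rating>
    <department>Intelligence Analyst Dept</department>
    <available>Yes</available>
  </employee>
  <employee>
    <id>TA102</id>
    <name>Sue</name>
    <rating>4.0</rating>
    <skills>
      <skill>Network Security</skill>
      <skill>Cloud systems</skill>
      <skill>Malware analysis</skill>
    </skills>
    <available>False</available>
  </employee>
</employees>
"""

data = xmltodict.parse(xml_data)

# using json.dumps to convert dictionary to JSON
json_data = json.dumps(data, indent=3)
print(json_data)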

Explanation:


Here, you first import the xmltodict and json modules. Next, create an XML structure and put all of it inside a triple-quoted string. Assign the result of the xmltodict.parse() method, which takes the XML string as a parameter, to a new object. Then use json.dumps() to serialize the parsed data, with an indentation value of 3. Finally, print the JSON data to see the JSON structure.

2. Using the xmljson Module:

xmljson is another library, although its contributors no longer maintain it actively. It serves as an alternative to xmltodict and untangle. It helps developers convert XML to JSON by following specific XML-to-JSON conventions, such as BadgerFish and Parker. It converts XML into Python objects (mainly dictionary structures) that map directly to JSON, and vice versa.

Since it is not part of the standard library, you have to install the xmljson module first with pip install xmljson.

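from json import dumps
from xml.etree.ElementTree import fromstring

from xmljson import badgerfish as bf
from xmljson import parker

# Tag names are assumed for illustration; only the values come from the original listing.
print(dumps(bf.data(fromstring('<id>E101</id>'))))
print(dumps(bf.data(fromstring('<employee><name>Karlos</name><salary>64,000</salary></employee>'))))
print(dumps(bf.data(fromstring('<employee><name>Deeza</name><salary>47,500</salary></employee>'))))
print(dumps(parker.data(fromstring('<codes><a>101</a><b>203</b></codes>'), preserve_root=True)))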

Explanation:

Here, you first import badgerfish from the xmljson module, aliased as bf, along with parker. Also import dumps from json and fromstring from xml.etree.ElementTree to help with the conversion. Next, inside each print() call, fromstring() parses an XML string into an element, bf.data() converts that element according to the BadgerFish convention, and dumps() serializes the result. These are nested function calls, one inside the other. The last line does the same with the Parker convention, passing preserve_root=True to keep the root element. Repeat this for as many XML strings as you need.
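To make the idea of a conversion convention more concrete, here is a rough standard-library sketch of how the two conventions differ. The helpers badgerfish_like() and parker_like() are hypothetical names written for this illustration; they approximate the conventions and are not the xmljson implementation (for instance, they do not handle repeated sibling tags or convert numeric text to numbers).

```python
import json
from xml.etree.ElementTree import fromstring


def badgerfish_like(element):
    """Approximate BadgerFish: attributes get an '@' prefix, text goes under '$'."""
    node = {"@" + name: value for name, value in element.attrib.items()}
    if element.text and element.text.strip():
        node["$"] = element.text.strip()
    for child in element:
        # Repeated sibling tags are not handled in this simplified sketch.
        node.update(badgerfish_like(child))
    return {element.tag: node}


def parker_like(element):
    """Approximate Parker: attributes are dropped and the root tag is absorbed."""
    children = list(element)
    if not children:
        return element.text
    return {child.tag: parker_like(child) for child in children}


# The sample element and attribute names are made up for this example.
root = fromstring('<employee id="E101"><name>Karlos</name></employee>')
print(json.dumps(badgerfish_like(root)))
print(json.dumps(parker_like(root)))
```

The BadgerFish-style result keeps the attribute and the root tag, while the Parker-style result is more compact but loses both, which is why the choice of convention matters when the JSON has to be converted back to XML.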

3. Using the Regular Expression (re) Module:

Regular expressions are a powerful feature that most modern programming languages provide. A regular expression (RegEx) is a sequence of characters that forms a search pattern. Developers use regular expressions to check whether a string contains a specific pattern. Python has a built-in module called re for working with regular expressions. The re module works on Unicode strings (str) as well as 8-bit strings (bytes). We will use it to convert our XML code to a JSON structure.
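Before the full converter, a minimal sketch of what re.findall() does with a tag-matching pattern may help. The pattern and the sample string are assumptions chosen for this example; the pattern only handles flat elements without attributes.

```python
import re

# Sample input made up for this illustration.
xml_text = "<name>Karlos</name><salary>64,000</salary>"

# Capture the tag name, then any text up to the matching closing tag.
# \1 refers back to whatever the first group matched.
pattern = r"<(\w+)>([^<]*)</\1>"

pairs = re.findall(pattern, xml_text)
print(pairs)

# The same pattern works on bytes when both pattern and input are bytes.
byte_pairs = re.findall(pattern.encode("ascii"), xml_text.encode("ascii"))
print(byte_pairs)
```

Each match comes back as a tuple of the captured groups, so a list of (tag, text) pairs can be turned into a dictionary with dict(pairs).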

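import json
import re

# Group names and the "@attributes" / "$values" keys are assumed; findall() returns groups by position.
TAG_RE = r"<(?P<tag>\S*)(?P<attrs>[^/>]*)(?:(?:>(?P<value>.*?)</(?P=tag)>)|(?:/>))"
ATTR_RE = r"(?P<name>\S+?)(?:(?:=(?P<quote>['\"])(?P<val>.*?)(?P=quote))|(?:=(?P<bare>.*?)(?:\s|$))|(?P<space>[\s]+)|$)"


def getdict(fileArg):
    # re.DOTALL lets element content span several lines
    res = re.findall(TAG_RE, fileArg, re.DOTALL)
    if len(res) >= 1:
        nodes = [
            {
                tag: [
                    {"@attributes": [{j[0]: (j[2] or j[3] or j[4])} for j in re.findall(ATTR_RE, attrs.strip())]},
                    {"$values": getdict(value)},
                ]
            }
            for tag, attrs, value in res
        ]
        # several sibling elements become a list; a single element stays a dictionary
        return nodes if len(nodes) > 1 else nodes[0]
    else:
        # no markup left: return the plain text
        return fileArg


with open("C:\\Users\\GauravKarlos\\PycharmProjects\\XmlJson\\data.xml", "r") as fil:
    print(json.dumps(getdict(fil.read())))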

Explanation:

First, import two modules, json and re. Next, create a user-defined function getdict() with a parameter fileArg. It uses the findall() method of the re module to match XML elements: an opening tag, optional attributes, and either the enclosed content followed by the matching closing tag or a self-closing marker. Each match is converted to a dictionary holding the element's attributes and its values. When more than one element matches, the function returns a list of these dictionaries. Lastly, open the XML file you want to use as input in read mode, pass its contents to getdict(), and pass the result to json.dumps().
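Regular expressions can be fragile on real-world XML (comments, CDATA, or nested elements with the same name). As a point of comparison, the following sketch does the same job with the standard library's xml.etree.ElementTree parser instead of re. The helper name element_to_dict() is hypothetical, and the "@" prefix for attributes and the "#text" key are conventions borrowed for this illustration.

```python
import json
import xml.etree.ElementTree as ET
from collections import defaultdict


def element_to_dict(element):
    """Recursively convert an Element into nested dicts, lists, and strings."""
    children = list(element)
    text = (element.text or "").strip()
    if not children and not element.attrib:
        return text  # leaf element: just its text

    node = {"@" + key: value for key, value in element.attrib.items()}
    grouped = defaultdict(list)
    for child in children:
        grouped[child.tag].append(element_to_dict(child))
    for tag, items in grouped.items():
        # A tag that occurs once stays a single value; repeats become a list.
        node[tag] = items[0] if len(items) == 1 else items
    if text:
        node["#text"] = text
    return node


# Sample document made up for this example.
xml_text = """
<team>
  <member id="TA01"><name>Karlos Ray</name></member>
  <member id="TA102"><name>Sue</name></member>
</team>
"""
root = ET.fromstring(xml_text)
result = {root.tag: element_to_dict(root)}
print(json.dumps(result, indent=2))
```

Because a real parser tracks nesting for you, the conversion code only has to decide how attributes, text, and repeated tags should be represented in the dictionary.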

4. Using lxml2json Module:

lxml2json is a Python package that converts an XML structure to its JSON equivalent and vice versa. It offers various options for converting the data to the desired format. To use it, install the package and import it in your program.

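from pprint import pprint as pp

from lxml2json import convert

# Tag names are assumed for illustration; only the values come from the original listing.
xml_data = """
<employees>
  <employee>
    <id>TA01</id>
    <name>Karlos Ray</name>
    <rating>4.6</rating>
    <department>Intelligence Analyst Dept</department>
    <available>Yes</available>
  </employee>
  <employee>
    <id>TA102</id>
    <name>Sue</name>
    <rating>4.0</rating>
    <skills>
      <skill>Network Security</skill>
      <skill>Cloud systems</skill>
      <skill>Malware analysis</skill>
    </skills>
    <available>False</available>
  </employee>
</employees>
"""

d = convert(xml_data)
pp(d)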

Explanation:

Here, you first import the convert function from the lxml2json module, and pprint from the pprint module. Next, create an XML structure and put all of it inside a triple-quoted string. Use the convert() function to convert the xml_data string to a JSON-like structure. Finally, print the returned object (d) to see the result.

Conclusion:

Among these, the regular-expression approach is the quickest to set up because the modules it needs, re and json, ship with Python. xmltodict is the most popular of them all; it is slightly slower but offers more functionality. The xmljson module is not recommended, because it depends on other modules working in conjunction, which can make your program slower. lxml2json is another package you can use as an alternative to convert XML to a JSON structure.


Python XML to JSON, XML to Dict


Today we will learn how to convert XML to JSON and XML to Dict in Python. We can use the Python xmltodict module to read an XML file and convert it to Dict or JSON data. We can also stream over large XML files and convert them to dictionaries. Before stepping into the coding part, let's first understand why XML conversion is necessary.

Converting XML to Dict/JSON

XML has slowly fallen out of favor, but there are pretty large systems on the web that still use this format. XML is heavier than JSON, so most developers prefer the latter in their applications. When an application needs to understand XML provided by some source, converting it to JSON can be a tedious task. The xmltodict module in Python makes this task easy and straightforward.

Getting started with xmltodict

Before we can get started with the xmltodict module, we need to install it. We will mainly use pip to perform the installation.

Install xmltodict module

pip install xmltodict

This completes quickly, as xmltodict is a very lightweight module. The best thing about this installation is that the module does not depend on any other external module, so it stays lightweight and avoids version conflicts. On Debian-based systems, the module can also be installed using the apt tool:

sudo apt install python3-xmltodict

Python XML to JSON

The best place to start with this module is the operation it was primarily made for: XML to JSON conversion. Let's look at a code snippet showing how this can be done:

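import json
import pprint

import xmltodict

# Element and attribute names follow the keys used in the XML-to-dict example; the attribute value is assumed.
my_xml = """
    <audience>
        <id what="attribute">123</id>
        <name>Shubham</name>
    </audience>
"""

pp = pprint.PrettyPrinter(indent=4)
pp.pprint(json.dumps(xmltodict.parse(my_xml)))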

Here, we simply use the parse() function to convert the XML data to a Python dictionary, serialize it with the json module, and print the result with pprint.

Converting XML File to JSON

Keeping XML data in the code itself is neither always possible nor realistic. Usually, we keep our data in a database or in files. We can read files directly and convert them to JSON as well. Let's look at a code snippet showing how to perform the conversion with an XML file:

import json
import pprint

import xmltodict

with open('person.xml') as fd:
    doc = xmltodict.parse(fd.read())

pp = pprint.PrettyPrinter(indent=4)
pp.pprint(json.dumps(doc))

Here, we again used the pprint module to print the output. Apart from that, using the open() function was straightforward: we used it to get a file object and then parsed the file's contents into a dictionary that json.dumps() serializes to JSON.
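The introduction mentions streaming over large XML files. Reading the whole file with fd.read() keeps everything in memory, so for large inputs an incremental approach helps. The sketch below uses the standard library's xml.etree.ElementTree.iterparse() rather than xmltodict; the helper stream_records() and the person record layout are assumptions made for this example.

```python
import io
import json
import xml.etree.ElementTree as ET


def stream_records(source, record_tag):
    """Yield one dict per <record_tag> element without loading the whole file."""
    for _event, element in ET.iterparse(source, events=("end",)):
        if element.tag == record_tag:
            yield {child.tag: child.text for child in element}
            element.clear()  # release the children we have already consumed


# io.BytesIO stands in for a large file opened with open(path, "rb").
sample = io.BytesIO(
    b"<people>"
    b"<person><id>1</id><name>Shubham</name></person>"
    b"<person><id>2</id><name>Sue</name></person>"
    b"</people>"
)

for record in stream_records(sample, "person"):
    print(json.dumps(record))
```

Each record is converted and printed as soon as its closing tag is read, so memory use depends on the size of one record rather than the size of the file.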

Python XML to Dict

As the module name itself suggests, xmltodict actually converts the XML data we provide to a simple Python dictionary. So, we can access the data with dictionary keys as well. Here is a sample program:

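import xmltodict

# Element and attribute names follow the keys accessed below; the attribute value is assumed.
my_xml = """
    <audience>
        <id what="attribute">123</id>
        <name>Shubham</name>
    </audience>
"""

my_dict = xmltodict.parse(my_xml)
print(my_dict['audience']['id'])
print(my_dict['audience']['id']['@what'])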

So, the tags can be used as keys, along with the attribute keys. The attribute keys just need to be prefixed with the @ symbol.

Supporting Namespaces in XML

In XML data, we usually have a set of namespaces that define the scope of the data provided by the XML file. When converting to JSON, these namespaces need to persist in the output as well. Assuming person.xml declares namespaces, passing process_namespaces=True to parse() expands each prefix into its namespace URI in the resulting keys:

import json
import pprint

import xmltodict

with open('person.xml') as fd:
    doc = xmltodict.parse(fd.read(), process_namespaces=True)

pp = pprint.PrettyPrinter(indent=4)
pp.pprint(json.dumps(doc))
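To show what "persisting" a namespace means, here is a standard-library sketch that parses a small namespaced document and builds dictionary keys that include the namespace URI. This is an illustration with xml.etree.ElementTree, not xmltodict's own output; the example.com URIs, the sample document, and the helper qualified_key() are all assumptions.

```python
import json
import xml.etree.ElementTree as ET

# Sample document and namespace URIs made up for this example.
xml_text = """
<root xmlns="http://example.com/default" xmlns:a="http://example.com/a">
  <x>1</x>
  <a:y>2</a:y>
</root>
"""

root = ET.fromstring(xml_text)


def qualified_key(tag):
    """Turn ElementTree's '{uri}name' form into a 'uri:name' style key."""
    if tag.startswith("{"):
        uri, name = tag[1:].split("}", 1)
        return f"{uri}:{name}"
    return tag


doc = {qualified_key(root.tag): {qualified_key(c.tag): c.text for c in root}}
print(json.dumps(doc, indent=2))
```

Keeping the URI in each key means two elements with the same local name but different namespaces do not collide in the resulting JSON.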


JSON to XML conversion

Although converting from XML to JSON is the prime objective of this module, xmltodict also supports the reverse operation, converting JSON to XML. We will provide the JSON data in the program itself. Here is a sample program:

import xmltodict

student = {
    "data": {
        "name": "Shubham",
        "marks": {
            "math": 92,
            "english": 99
        },
        "id": "s387hs3"
    }
}

print(xmltodict.unparse(student, pretty=True))

Please note that a single top-level JSON key is necessary for this to work correctly. Suppose we modify our program to contain multiple keys at the very first level of data:

import xmltodict

student = {
    "name": "Shubham",
    "marks": {
        "math": 92,
        "english": 99
    },
    "id": "s387hs3"
}

print(xmltodict.unparse(student, pretty=True))

In this case, we have three keys at the root level. If we try to unparse this form of JSON, xmltodict raises a ValueError stating that the document must have exactly one root. This happens because xmltodict needs to construct the XML with the very first key as the root XML tag. This means that there should only be a single JSON key at the root level of data.
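The single-root rule comes from XML itself: a well-formed document has exactly one root element. The following standard-library sketch mirrors that rule with xml.etree.ElementTree; dict_to_xml() and _fill() are hypothetical helpers written for this illustration, and the wrapper key "student" is an assumption.

```python
import xml.etree.ElementTree as ET


def _fill(parent, content):
    """Add nested dictionaries as child elements and other values as text."""
    if isinstance(content, dict):
        for tag, value in content.items():
            _fill(ET.SubElement(parent, tag), value)
    else:
        parent.text = str(content)


def dict_to_xml(data):
    """Serialize a dictionary to XML, requiring a single top-level key as the root."""
    if len(data) != 1:
        raise ValueError("expected exactly one top-level key to act as the root")
    (root_tag, content), = data.items()
    root = ET.Element(root_tag)
    _fill(root, content)
    return ET.tostring(root, encoding="unicode")


student = {"name": "Shubham", "marks": {"math": 92, "english": 99}, "id": "s387hs3"}

# Wrapping the three top-level keys under one key gives the document a root.
print(dict_to_xml({"student": student}))
```

The same fix applies to the xmltodict example: wrapping the dictionary under one key, as the first program does with "data", gives unparse() a root tag to work with.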

Conclusion

In this lesson, we studied an excellent Python module that can parse and convert XML to JSON and vice versa. We also learned how to convert XML to Dict using the xmltodict module. Reference: API Doc
