Python json and unicode

Python JSON

In this tutorial, we’ll see how we can create, manipulate, and parse JSON in Python using the standard a json module. The built-in Python json module provides us with methods and classes that are used to parse and manipulate JSON in Python.

What is JSON

JSON (an acronym for JavaScript Object Notation) is a data-interchange format and is most commonly used for client-server communication. Refer Official JSON documentation

A JSON is an unordered collection of key and value pairs, resembling Python’s native dictionary.

  • Keys are unique Strings that cannot be null.
  • Values can be anything from a String, Boolean, Number, list, or even null.
  • A JSONO can be represented by a String enclosed within curly braces with keys and values separated by a colon, and pairs separated by a comma
Читайте также:  Input elements in php

Whenever the client needs information, it calls the server using a URI, and the server returns data to the client in the form of JSON. Later we can use this data in our application as per our requirement. Also, when the client application wants to store the data on the server. It can POST that data in the form of JSON.

JSON is most commonly used for client-server communication because:

  • It is human readable.
  • It’s both easy to read/write and
  • JSON is language-independent.

Python json Module

Python comes with a built-in module called json for working with JSON data. You only need to add import json at the start of your file and you are ready to use it.

Python JSON Tutorial

This Python JSON tutorial contains the following articles that cover the sub-topic and frequently asked questions in detail.

Python JSON Encoding

When we encode Python Objects into JSON we call it a Serialization. In this section, we will cover the following.

  • The mapping between JSON and Python entities while encoding
  • json.dumps to encode JSON Data into native Python String.
  • json.dump to encode and write JSON into a file
  • Understand the various use of json.dump() and json.dumps() method in detail
  • Write Indented and pretty printed JSON data into the file
  • Compact JSON encoding
  • Encode Unicode data as-is into JSON

Python JSON Parsing

When we convert JSON encoded/formatted data into Python Types we call it a JSON deserialization or parsing.

In this section, we will cover the following: —

  • The Mapping between JSON and Python entities while decoding
  • How to read JSON data from a file using json.load() and convert it into Python dict so we can access JSON data in our system.
  • Convert JSON response or JSON string to Python dictionary using json.loads()
  • Understand the various use of load and loads() method in detail
  • Learn how to parse and Retrieve nested JSON array key-values
  • Implement a custom JSON decoder

Validate JSON Data using Python

There are multiple scenarios where we need various types of JSON validations. In this section, we will cover the following.

  • Check if a JSON document is valid JSON as per the JSON specification
  • Validate JSON Object from the command line
  • Validate JSON File
  • Validate JSON Schema by checking all necessary fields present in JSON and also validate data types of those fields

PrettyPrint JSON Data using Python

PrettyPrint means JSON data should be correctly indented and easy-to-read format. In this section, we will cover the following.

  • Write Indented and Pretty-printed JSON data into a file
  • Read and PrettyPrint JSON file using Python
  • Pretty-print JSON from the command line
  • Sort JSON keys alphabetically
  • Change JSON key and value separator

Make a Python Class JSON serializable

The Python built-in json module can only handle Python primitives types that have a direct JSON equivalent (e.g., dictionary, lists, strings, Numbers, None, etc.). So when we try to serialize custom class instance into JSON, we receive a type error. The Class is not JSON serializable. In this section, I will show you how to convert any arbitrary Python objects to JSON.

  • Learn the different ways to write custom JSON Encoder to convert custom Class Object into JSON
  • Learn how to override the default behavior or Python JSON encode

Convert JSON Data Into a Custom Python Object (Custom JSON Decoder)

  • Learn how to convert JSON data into a custom Python object instead of a dictionary.
  • Understandearn the different ways to write custom JSON Decoder to convert custom JSON Decoder.
  • Learn how to override the default behavior or Python JSON decoder.

Python JSON Handle Unicode Data

  • JSON serialize Unicode or non-ASCII data as-is. instead of u escape sequence (Example, Store Unicode string ø as-is instead of u00f8 in JSON)
  • JSON serialize all non-ASCII characters escaped (Example, Store Unicode string ø as u00f8 in JSON)

Parse a JSON response using the Python requests library

In this section, we will learn how to send a RESTful GET call to a server, and Parse a JSON response using the requests library.

Post JSON using the requests library

Learn how to post a JSON from a client to a server using a requests library

Serialize Python DateTime into JSON

The Python built-in json module can only handle Python primitives types that have a direct JSON equivalent (e.g., dictionary, lists, strings, Numbers, None, etc.). So when we try to serialize Python Object which contains DateTime instance into JSON, we receive a type error. The DateTime is not JSON serializable. In this section, we will see serialize and deserialize DateTime instance to and from JSON.

Serialize Python Set into JSON

In this section, we will see how to JSON serialize Python set. To solve TypeError: Object of type set is not JSON serializable we need to build a custom encoder to make set JSON serializable.

Serialize NumPy array into JSON

In this section, we will see how to JSON serialize NumPy ndarray

Python Check if a key exists in JSON

In this section, instead of iterating entire JSON, we will see:

  • How to Check if the key exists or not in JSON. Check if there is a value for a key in JSON
  • Return default value if the key is missing
  • Find if the nested key exists in JSON and Access nested key directly

Parse multiple JSON objects from file

In this section, we will see how to solve a json.decoder.JSONDecodeError: Extra dat error so that we can load and parse a JSON file with multiple JSON objects.

Parse JSON serialized list which contains a nested dictionary

Python JSON Exercise

This Python JSON exercise is to help Python developer to learn and practice JSON creation, manipulation, Parsing. Each question includes a specific JSON topic you need to learn. When you complete each question, you get more familiar with JSON encoding and decoding in Python.

All Python JSON Tutorials: —

Python JSON Exercise

Updated on: December 8, 2021 | 6 Comments

Источник

Python Encode Unicode and non-ASCII characters as-is into JSON

In this article, we will address the following frequently asked questions about working with Unicode JSON data in Python.

  • How to serialize Unicode or non-ASCII data into JSON as-is strings instead of \u escape sequence (Example, Store Unicode string ø as-is instead of \u00f8 in JSON)
  • Encode Unicode data in utf-8 format.
  • How to serialize all incoming non-ASCII characters escaped (Example, Store Unicode string ø as \u00f8 in JSON)

Further Reading:

The Python RFC 7159 requires that JSON be represented using either UTF-8, UTF-16, or UTF-32, with UTF-8 being the recommended default for maximum interoperability.

The ensure_ascii parameter

Use Python’s built-in module json provides the json.dump() and json.dumps() method to encode Python objects into JSON data.

The json.dump() and json.dumps() has a ensure_ascii parameter. The ensure_ascii is by-default true so the output is guaranteed to have all incoming non-ASCII characters escaped. If ensure_ascii=False , these characters will be output as-is.

The json module always produces str objects. You get a string back, not a Unicode string. Because the escaping is allowed by JSON.

  • using a ensure_ascii=True , we can present a safe way of representing Unicode characters. By setting it to true we make sure the resulting JSON is valid ASCII characters (even if they have Unicode inside).
  • Using a ensure_ascii=False , we make sure resulting JSON store Unicode characters as-is instead of \u escape sequence.

Save non-ASCII or Unicode data as-is not as \u escape sequence in JSON

In this example, we will try to encode the Unicode Data into JSON. This solution is useful when you want to dump Unicode characters as characters instead of escape sequences.

Set ensure_ascii=False in json.dumps() to encode Unicode as-is into JSON

import json unicodeData= < "string1": "明彦", "string2": u"\u00f8" >print("unicode Data is ", unicodeData) encodedUnicode = json.dumps(unicodeData, ensure_ascii=False) # use dump() method to write it in file print("JSON character encoding by setting ensure_ascii=False", encodedUnicode) print("Decoding JSON", json.loads(encodedUnicode))

unicode Data is JSON character encoding by setting ensure_ascii=False Decoding JSON

Note: This example is useful to store the Unicode string as-is in JSON.

JSON Serialize Unicode Data and Write it into a file.

In the above example, we saw how to Save non-ASCII or Unicode data as-is not as \u escape sequence in JSON. Now, Let’s see how to write JSON serialized Unicode data as-is into a file.

import json sampleDict= < "string1": "明彦", "string2": u"\u00f8" >with open("unicodeFile.json", "w", encoding='utf-8') as write_file: json.dump(sampleDict, write_file, ensure_ascii=False) print("Done writing JSON serialized Unicode Data as-is into file") with open("unicodeFile.json", "r", encoding='utf-8') as read_file: print("Reading JSON serialized Unicode data from file") sampleData = json.load(read_file) print("Decoded JSON serialized Unicode data") print(sampleData["string1"], sampleData["string1"])
Done writing JSON serialized Unicode Data as-is into file Reading JSON serialized Unicode data from file Decoded JSON serialized Unicode data 明彦 明彦

JSON file after writing Unicode data as-is

Serialize Unicode objects into UTF-8 JSON strings instead of \u escape sequence

You can also set JSON encoding to UTF-8. UTF-8 is the recommended default for maximum interoperability. set ensure_ascii=False to and encode Unicode data into JSON using ‘UTF-8‘.

import json # encoding in UTF-8 unicodeData= < "string1": "明彦", "string2": u"\u00f8" >print("unicode Data is ", unicodeData) print("Unicode JSON Data encoding using utf-8") encodedUnicode = json.dumps(unicodeData, ensure_ascii=False).encode('utf-8') print("JSON character encoding by setting ensure_ascii=False", encodedUnicode) print("Decoding JSON", json.loads(encodedUnicode))

unicode Data is Unicode JSON Data encoding using utf-8 JSON character encoding by setting ensure_ascii=False b» Decoding JSON

Encode both Unicode and ASCII (Mix Data) into JSON using Python

In this example, we will see how to encode Python dictionary into JSON which contains both Unicode and ASCII data.

import json sampleDict = print("unicode Data is ", sampleDict) # set ensure_ascii=True jsonDict = json.dumps(sampleDict, ensure_ascii=True) print("JSON character encoding by setting ensure_ascii=True") print(jsonDict) print("Decoding JSON", json.loads(jsonDict)) # set ensure_ascii=False jsonDict = json.dumps(sampleDict, ensure_ascii=False) print("JSON character encoding by setting ensure_ascii=False") print(jsonDict) print("Decoding JSON", json.loads(jsonDict)) # set ensure_ascii=False and encode using utf-8 jsonDict = json.dumps(sampleDict, ensure_ascii=False).encode('utf-8') print("JSON character encoding by setting ensure_ascii=False and UTF-8") print(jsonDict) print("Decoding JSON", json.loads(jsonDict))

unicode Data is JSON character encoding by setting ensure_ascii=True Decoding JSON JSON character encoding by setting ensure_ascii=False Decoding JSON JSON character encoding by setting ensure_ascii=False and UTF-8 b» Decoding JSON

Python Escape non-ASCII characters while encoding it into JSON

Let’ see how store all incoming non-ASCII characters escaped in JSON. It is a safe way of representing Unicode characters. By setting ensure_ascii=True we make sure resulting JSON is valid ASCII characters (even if they have Unicode inside).

import json unicodeData= < "string1": "明彦", "string2": u"\u00f8" >print("unicode Data is ", unicodeData) # set ensure_ascii=True encodedUnicode = json.dumps(unicodeData, ensure_ascii=True) print("JSON character encoding by setting ensure_ascii=True") print(encodedUnicode) print("Decoding JSON") print(json.loads(encodedUnicode))

unicode Data is JSON character encoding by setting ensure_ascii=True Decoding JSON

Did you find this page helpful? Let others know about it. Sharing helps me continue to create free Python resources.

About Vishal

I’m Vishal Hule, Founder of PYnative.com. I am a Python developer, and I love to write articles to help students, developers, and learners. Follow me on Twitter

Python Exercises and Quizzes

Free coding exercises and quizzes cover Python basics, data structure, data analytics, and more.

  • 15+ Topic-specific Exercises and Quizzes
  • Each Exercise contains 10 questions
  • Each Quiz contains 12-15 MCQ

Источник

Оцените статью