Read schema in Java

Read an XML File using DOM Parser in Java

In this Java XML parser tutorial, we learn to read XML using the DOM parser. A DOM parser is intended for working with XML as an object graph (a tree-like structure) in memory, the so-called “Document Object Model (DOM)”.

At first, the parser traverses the input XML file and creates DOM objects corresponding to the nodes in the XML file. These DOM objects are linked together in a tree-like structure. Once the parser is done with the parsing process, we get this tree-like DOM object structure back from it. Now we can traverse the DOM structure back and forth as we want – to get/update/delete data from it.
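The parse-then-traverse idea described above can be sketched in a few lines. This is a minimal illustration, not the tutorial's own code: the class name `DomTreeSketch` and its helper methods are hypothetical, and the XML is parsed from a String rather than a file so that the snippet is self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, written only for this illustration.
class DomTreeSketch {

    // Parses XML held in a String into an in-memory DOM tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Walks the tree through first-child / next-sibling links and counts element nodes.
    static int countElements(Node node) {
        int count = node.getNodeType() == Node.ELEMENT_NODE ? 1 : 0;
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countElements(child);
        }
        return count;
    }

    // Deletes the first element with the given tag name, together with its subtree.
    static void removeFirst(Document document, String tagName) {
        Node match = document.getElementsByTagName(tagName).item(0);
        if (match != null) {
            match.getParentNode().removeChild(match);
        }
    }

    public static void main(String[] args) throws Exception {
        Document document = parse("<a><b/><c><d/></c></a>");
        System.out.println(countElements(document)); // a, b, c and d
        removeFirst(document, "c");                  // removes c and its child d
        System.out.println(countElements(document)); // a and b remain
    }
}
```

Because the whole tree lives in memory, the same `Document` can be read, modified and walked again as often as needed.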

Other ways to read an XML file include the SAX parser and the StAX parser.
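For comparison, here is a rough sketch of the StAX (pull parser) style, using the `javax.xml.stream` API that ships with the JDK. The class name `StaxSketch` and the choice to collect `firstName` values are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, written only for this illustration.
class StaxSketch {

    // Pulls events from the stream and collects the text of every <firstName> element.
    static List<String> firstNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "firstName".equals(reader.getLocalName())) {
                    // getElementText() reads up to the matching end tag
                    names.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }
}
```

Unlike DOM, no tree is built here: the application asks for the next event and keeps only what it needs.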

For demo purposes, we will be parsing the below XML file in all code examples.

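<employees>
    <employee id="111">
        <firstName>Lokesh</firstName>
        <lastName>Gupta</lastName>
        <location>India</location>
    </employee>
    <employee id="222">
        <firstName>Alex</firstName>
        <lastName>Gussin</lastName>
        <location>Russia</location>
    </employee>
    <employee id="333">
        <firstName>David</firstName>
        <lastName>Feezor</lastName>
        <location>USA</location>
    </employee>
</employees>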

Let’s note down some broad steps to create and use a DOM parser to parse an XML file in Java.

1.1. Import DOM Parser Packages

We need to import the DOM parser packages into our application first.

import org.w3c.dom.*;
import javax.xml.parsers.*;
import java.io.*;

1.2. Create DocumentBuilder

The next step is to create the DocumentBuilder object.

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();

1.3. Create Document object from XML file

Read the XML file to Document object.

Document document = builder.parse(new File( file ));

1.4. Validate Document Structure

XML validation is optional, but it is good practice to validate the document before processing it.

Schema schema = null;
try {
    // Load the XSD file into a Schema object
    String language = XMLConstants.W3C_XML_SCHEMA_NS_URI;
    SchemaFactory factory = SchemaFactory.newInstance(language);
    schema = factory.newSchema(new File(name));
} catch (Exception e) {
    e.printStackTrace();
}

// Validate only if the schema was loaded successfully
if (schema != null) {
    Validator validator = schema.newValidator();
    validator.validate(new DOMSource(document));
}
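A self-contained variant of this step is sketched below. The XSD is a hypothetical schema written for this illustration, matching the employees document used in this tutorial, and the class name `DomValidationSketch` is likewise an assumption. One detail worth noting: when a DOM tree is validated through a `DOMSource`, the document generally has to be parsed with a namespace-aware factory, otherwise the validator cannot match elements to their declarations.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, written only for this illustration.
class DomValidationSketch {

    // Hypothetical schema for the employees document (not taken from the tutorial).
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='employees'><xs:complexType><xs:sequence>"
        + "<xs:element name='employee' maxOccurs='unbounded'><xs:complexType>"
        + "<xs:sequence>"
        + "<xs:element name='firstName' type='xs:string'/>"
        + "<xs:element name='lastName' type='xs:string'/>"
        + "<xs:element name='location' type='xs:string'/>"
        + "</xs:sequence>"
        + "<xs:attribute name='id' type='xs:int' use='required'/>"
        + "</xs:complexType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Returns true if the XML conforms to the schema, false if validation fails.
    static boolean isValid(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // needed so the validator can resolve element names
        Document document = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new DOMSource(document));
            return true;
        } catch (SAXException e) {
            // Without a custom ErrorHandler, the first validation error is thrown
            return false;
        }
    }
}
```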

1.5. Extract the Root Element

We can get the root element from the XML document using the below code.

Element root = document.getDocumentElement();

We can examine the XML element attributes using the below methods.

element.getAttribute("attributeName"); // returns the value of the named attribute ("" if it is absent)
element.getAttributes();               // returns a NamedNodeMap of all attributes (names/values)
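The two calls behave slightly differently from what a plain `Map` would suggest, so a small sketch may help. `getAttribute` returns an empty string rather than `null` for a missing attribute, and `getAttributes` returns a `NamedNodeMap` that is iterated by index. The class `AttributeSketch` and its methods are hypothetical names used only here.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, written only for this illustration.
class AttributeSketch {

    // Parses the XML and returns its root element.
    static Element root(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    // Returns the attribute value, or the fallback when the attribute is absent.
    static String attributeOr(Element element, String name, String fallback) {
        return element.hasAttribute(name) ? element.getAttribute(name) : fallback;
    }

    // Copies the NamedNodeMap into a sorted Map of attribute names to values.
    static Map<String, String> attributesOf(Element element) {
        Map<String, String> result = new TreeMap<>();
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            result.put(attribute.getNodeName(), attribute.getNodeValue());
        }
        return result;
    }
}
```

Checking `hasAttribute` first is one way to tell a missing attribute apart from one that is present but empty.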

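Child elements of a given node can be retrieved as shown below. Note that getElementsByTagName() is defined on Element and Document (not on Node) and returns matching descendants at any depth.

element.getElementsByTagName("subElementName"); // returns a NodeList of all descendant elements with that name
node.getChildNodes();                           // returns a NodeList of all direct child nodes (text nodes included)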
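The difference between the two lookups is easy to miss, so here is a small sketch under the assumption of an indented input document. `getElementsByTagName` searches the whole subtree, while `getChildNodes` returns only direct children and also includes the whitespace text nodes between tags. The class `ChildLookupSketch` is a hypothetical name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, written only for this illustration.
class ChildLookupSketch {

    // Parses the XML and returns its root element.
    static Element root(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    // Keeps only the direct children that are elements, skipping text and comment nodes.
    static List<Element> childElements(Node parent) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) children.item(i));
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        Element a = root("<a>\n  <b><b/></b>\n</a>");
        System.out.println(a.getElementsByTagName("b").getLength()); // both <b> elements, at any depth
        System.out.println(a.getChildNodes().getLength());           // whitespace text, <b>, whitespace text
        System.out.println(childElements(a).size());                 // only the outer <b>
    }
}
```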

2. Read XML File with DOM parser

The example code below assumes that the structure of the employees.xml file (its nodes and attributes) is already known. So the example directly fetches the information and prints it to the console. In a real-life application, we would use this information for some real purpose rather than just printing it.

public static Document readXMLDocumentFromFile(String fileNameWithPath) throws Exception {

    // Get the DocumentBuilder
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();

    // Build the Document
    Document document = builder.parse(new File(fileNameWithPath));

    // Normalize the XML structure: merges adjacent text nodes and removes empty ones
    document.getDocumentElement().normalize();
    return document;
}

Now we can use this method to parse the XML file and verify the content.

public static void main(String[] args) throws Exception {
    Document document = readXMLDocumentFromFile("c:/temp/employees.xml");

    // Verify the XML content

    // Here comes the root node
    Element root = document.getDocumentElement();
    System.out.println(root.getNodeName());

    // Get all employees
    NodeList nList = document.getElementsByTagName("employee");
    System.out.println("============================");

    for (int temp = 0; temp < nList.getLength(); temp++) {
        Node node = nList.item(temp);
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            // Print each employee's details
            Element eElement = (Element) node;
            System.out.println("\nEmployee id : " + eElement.getAttribute("id"));
            System.out.println("First Name : " + eElement.getElementsByTagName("firstName").item(0).getTextContent());
            System.out.println("Last Name : " + eElement.getElementsByTagName("lastName").item(0).getTextContent());
            System.out.println("Location : " + eElement.getElementsByTagName("location").item(0).getTextContent());
        }
    }
}

Program output:

employees
============================

Employee id : 111
First Name : Lokesh
Last Name : Gupta
Location : India

Employee id : 222
First Name : Alex
Last Name : Gussin
Location : Russia

Employee id : 333
First Name : David
Last Name : Feezor
Location : USA
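The normalize() call in readXMLDocumentFromFile merges adjacent text nodes into one and drops empty ones. On a freshly parsed document it often changes nothing, so the effect is easier to see on a tree built by hand. The sketch below assumes a hypothetical class `NormalizeSketch` and a made-up `note` element.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, written only for this illustration.
class NormalizeSketch {

    // Returns the number of child nodes before and after normalize().
    static int[] countsBeforeAndAfter() throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = document.createElement("note");
        document.appendChild(root);

        // Two separate, adjacent text nodes
        root.appendChild(document.createTextNode("Hello, "));
        root.appendChild(document.createTextNode("DOM"));
        int before = root.getChildNodes().getLength();

        // normalize() merges them into a single text node
        root.normalize();
        int after = root.getChildNodes().getLength();

        return new int[] { before, after };
    }
}
```

After the call, `getTextContent()` on a leaf element reliably corresponds to one text node, which keeps code that inspects child nodes simpler.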

Another real-life application’s requirement might be populating the DTO objects with information fetched in the above example code. I wrote a simple program to help us understand how it can be done easily.

Let’s say we have to populate Employee objects which are defined as below.
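A minimal Employee class that is consistent with the setters called in the parsing code could look like the following sketch. The field names, the getters and the toString format are assumptions; only the four setters are implied by the tutorial's code.

```java
// Sketch of the DTO; only the setters are implied by the parsing code.
class Employee {

    private Integer id;
    private String firstName;
    private String lastName;
    private String location;

    public Integer getId() { return id; }
    public void setId(Integer id) { this.id = id; }

    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }

    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }

    public String getLocation() { return location; }
    public void setLocation(String location) { this.location = location; }

    @Override
    public String toString() {
        return "Employee [id=" + id + ", firstName=" + firstName
                + ", lastName=" + lastName + ", location=" + location + "]";
    }
}
```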

Now, look at the example code that populates the list of Employee objects. It only takes a few extra lines: the values are copied into DTOs instead of being printed to the console.

public static List<Employee> parseXmlToPOJO(String fileName) throws Exception {
    List<Employee> employees = new ArrayList<>();

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document document = builder.parse(new File(fileName));
    document.getDocumentElement().normalize();

    NodeList nList = document.getElementsByTagName("employee");
    for (int temp = 0; temp < nList.getLength(); temp++) {
        Node node = nList.item(temp);
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element eElement = (Element) node;

            // Copy the values into a new Employee object
            Employee employee = new Employee();
            employee.setId(Integer.parseInt(eElement.getAttribute("id")));
            employee.setFirstName(eElement.getElementsByTagName("firstName").item(0).getTextContent());
            employee.setLastName(eElement.getElementsByTagName("lastName").item(0).getTextContent());
            employee.setLocation(eElement.getElementsByTagName("location").item(0).getTextContent());

            // Add the Employee to the list
            employees.add(employee);
        }
    }
    return employees;
}

4. Parse “unknown” XML using NamedNodeMap

The previous examples iterate over a parsed XML document whose structure is known, at least roughly, at the time of writing the code. In some cases, we have to write the code so that it keeps working even when the actual XML structure differs from the one assumed while coding.

Here we iterate over all elements present in the XML document tree. We can add our knowledge of the document and modify the code so that, as soon as we reach the required information while traversing the tree, we use it.

private static void visitChildNodes(NodeList nList) {
    for (int temp = 0; temp < nList.getLength(); temp++) {
        Node node = nList.item(temp);
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            System.out.println("Node Name = " + node.getNodeName() + "; Value = " + node.getTextContent());

            // Check all attributes
            if (node.hasAttributes()) {
                // Get attribute names and values
                NamedNodeMap nodeMap = node.getAttributes();
                for (int i = 0; i < nodeMap.getLength(); i++) {
                    Node tempNode = nodeMap.item(i);
                    System.out.println("Attr name : " + tempNode.getNodeName() + "; Value = " + tempNode.getNodeValue());
                }
            }

            // Visit the children as well; this recursion produces the nested
            // firstName / lastName / location lines in the output
            if (node.hasChildNodes()) {
                visitChildNodes(node.getChildNodes());
            }
        }
    }
}

Program output:

employees
============================
Node Name = employee; Value = Lokesh Gupta India
Attr name : id; Value = 111
Node Name = firstName; Value = Lokesh
Node Name = lastName; Value = Gupta
Node Name = location; Value = India
Node Name = employee; Value = Alex Gussin Russia
Attr name : id; Value = 222
Node Name = firstName; Value = Alex
Node Name = lastName; Value = Gussin
Node Name = location; Value = Russia
Node Name = employee; Value = David Feezor USA
Attr name : id; Value = 333
Node Name = firstName; Value = David
Node Name = lastName; Value = Feezor
Node Name = location; Value = USA

That’s all for this good-to-know concept around Java XML DOM Parser. Drop me a comment if something is not clear OR needs more explanation.


Java SAX

Java SAX tutorial shows how to use Java SAX API to read and validate XML documents.

SAX

SAX (Simple API for XML) is an event-driven algorithm for parsing XML documents. SAX is an alternative to the Document Object Model (DOM). Where the DOM reads the whole document to operate on XML, SAX parsers read XML node by node, issuing parsing events while stepping through the input stream. SAX processes documents state-independently (the handling of an element does not depend on the elements that came before). SAX parsers are read-only.
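To make the event-driven model concrete, the following sketch records the events a SAX parser reports for a small document. The class `SaxEventTraceSketch` and the `start:` / `text:` / `end:` labels are assumptions made for this illustration; whitespace-only text is skipped to keep the trace short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, written only for this illustration.
class SaxEventTraceSketch {

    // Parses the XML and returns the sequence of events the parser reported.
    static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text:" + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }
}
```

For an input such as `<user><firstname>Peter</firstname></user>`, the trace begins with `start:user` and ends with `end:user`; nothing of the document is kept in memory apart from what the handler chooses to store.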

SAX parsers are faster and require less memory. On the other hand, DOM is easier to use and there are tasks, such as sorting elements, rearranging elements or looking up elements, that are faster with DOM.

A SAX parser comes with the JDK, so there is no need to download a dependency.

Java SAX parsing example

In the following example, we read an XML file with a SAX parser.

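<users>
    <!-- the id values are illustrative -->
    <user id="1">
        <firstname>Peter</firstname>
        <lastname>Brown</lastname>
        <occupation>programmer</occupation>
    </user>
    <user id="2">
        <firstname>Martin</firstname>
        <lastname>Smith</lastname>
        <occupation>accountant</occupation>
    </user>
    <user id="3">
        <firstname>Lucy</firstname>
        <lastname>Gordon</lastname>
        <occupation>teacher</occupation>
    </user>
</users>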

We are going to read this XML file.

package com.zetcode.model;

public class User {

    private int id;
    private String firstName;
    private String lastName;
    private String occupation;

    public User() {
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public String getLastName() {
        return lastName;
    }

    public void setLastName(String lastName) {
        this.lastName = lastName;
    }

    public String getOccupation() {
        return occupation;
    }

    public void setOccupation(String occupation) {
        this.occupation = occupation;
    }

    @Override
    public String toString() {
        StringBuilder builder = new StringBuilder();
        builder.append("User{").append("id=").append(id)
                .append(", firstName=").append(firstName)
                .append(", lastName=").append(lastName)
                .append(", occupation=").append(occupation).append("}");
        return builder.toString();
    }
}

This is the user bean; it will hold data from XML nodes.

package com.zetcode;

import com.zetcode.model.User;
import org.xml.sax.SAXException;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import java.io.File;
import java.io.IOException;
import java.nio.file.Paths;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class MyRunner {

    private SAXParser saxParser = null;

    private SAXParser createSaxParser() {

        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            // Reject documents that contain a DOCTYPE declaration
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            saxParser = factory.newSAXParser();

            return saxParser;
        } catch (ParserConfigurationException | SAXException ex) {

            Logger lgr = Logger.getLogger(MyRunner.class.getName());
            lgr.log(Level.SEVERE, ex.getMessage(), ex);

            return saxParser;
        }
    }

    public List<User> parseUsers() {

        var handler = new MyHandler();

        String fileName = "src/main/resources/users.xml";
        File xmlDocument = Paths.get(fileName).toFile();

        try {
            SAXParser parser = createSaxParser();
            parser.parse(xmlDocument, handler);

        } catch (SAXException | IOException ex) {

            Logger lgr = Logger.getLogger(MyRunner.class.getName());
            lgr.log(Level.SEVERE, ex.getMessage(), ex);
        }

        return handler.getUsers();
    }
}

MyRunner creates a SAX parser and launches parsing. The parseUsers method returns the parsed data in a list of User objects.

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
saxParser = factory.newSAXParser();

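From the SAXParserFactory, we get the SAXParser. The disallow-doctype-decl feature makes the parser reject documents that contain a DOCTYPE declaration, which guards against XML external entity (XXE) attacks.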
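The effect of the disallow-doctype-decl feature set on the factory can be checked with a short sketch. The class `DoctypeGuardSketch` and its `parses` method are hypothetical; the feature URI is the one used in the code above, which the JDK's built-in parser recognizes.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, written only for this illustration.
class DoctypeGuardSketch {

    // Returns true if the document parses, false if the parser rejects it.
    static boolean parses(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        try {
            factory.newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // A DOCTYPE declaration is reported as a fatal error
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parses("<users/>"));
        System.out.println(parses("<!DOCTYPE users [<!ENTITY x \"y\">]><users/>"));
    }
}
```

Since entity definitions live in the DOCTYPE, refusing it outright is a simple way to rule out entity-based attacks when the input format does not need a DTD.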

SAXParser parser = createSaxParser();
parser.parse(xmlDocument, handler);

We parse the document with the parse method. The second parameter of the method is the handler object, which contains the event handlers.

package com.zetcode;

import com.zetcode.model.User;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import java.util.ArrayList;
import java.util.List;

public class MyHandler extends DefaultHandler {

    private List<User> users = new ArrayList<>();
    private User user;

    // Flags that remember which element the parser is currently inside
    private boolean bfn = false;
    private boolean bln = false;
    private boolean boc = false;

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attributes) {

        if ("user".equals(qName)) {

            user = new User();

            // Read the id attribute of the <user> element
            int id = Integer.parseInt(attributes.getValue("id"));
            user.setId(id);
        }

        switch (qName) {

            case "firstname" -> bfn = true;
            case "lastname" -> bln = true;
            case "occupation" -> boc = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {

        if (bfn) {
            user.setFirstName(new String(ch, start, length));
            bfn = false;
        }

        if (bln) {
            user.setLastName(new String(ch, start, length));
            bln = false;
        }

        if (boc) {
            user.setOccupation(new String(ch, start, length));
            boc = false;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {

        if ("user".equals(qName)) {
            users.add(user);
        }
    }

    public List<User> getUsers() {

        return users;
    }
}

In the MyHandler class, we have the implementations of the event handlers.

public class MyHandler extends DefaultHandler {

The handler class must extend DefaultHandler, which provides the event methods.

@Override
public void startElement(String uri, String localName,
                         String qName, Attributes attributes) {

    if ("user".equals(qName)) {

        user = new User();

        // Read the id attribute of the <user> element
        int id = Integer.parseInt(attributes.getValue("id"));
        user.setId(id);
    }

    switch (qName) {

        case "firstname" -> bfn = true;
        case "lastname" -> bln = true;
        case "occupation" -> boc = true;
    }
}

The startElement method is called when the parser starts parsing a new element. We create a new user if the element is <user> and read its id attribute. For the other elements, we set boolean flags.

@Override
public void characters(char[] ch, int start, int length) {

    if (bfn) {
        user.setFirstName(new String(ch, start, length));
        bfn = false;
    }

    if (bln) {
        user.setLastName(new String(ch, start, length));
        bln = false;
    }

    if (boc) {
        user.setOccupation(new String(ch, start, length));
        boc = false;
    }
}

The characters method is called when the parser encounters text inside elements. Depending on the boolean variable, we set the user attributes.

@Override
public void endElement(String uri, String localName, String qName) {

    if ("user".equals(qName)) {
        users.add(user);
    }
}

At the end of the <user> element, we add the user object to the list of users.
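One caveat about the characters method: the SAX contract allows a parser to deliver the text of a single element in several chunks, in which case a flag that is reset after the first call would keep only the first chunk. A common alternative, sketched here and not part of the original example, accumulates text in a StringBuilder and reads it in endElement. The class `BufferingHandlerSketch` is a hypothetical name, and it collects only first names to stay short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, written only for this illustration.
class BufferingHandlerSketch extends DefaultHandler {

    private final List<String> firstNames = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called more than once per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("firstname".equals(qName)) {
            firstNames.add(text.toString().trim());
        }
    }

    List<String> getFirstNames() {
        return firstNames;
    }

    // Convenience method: parses the XML and returns the collected first names.
    static List<String> parse(String xml) throws Exception {
        BufferingHandlerSketch handler = new BufferingHandlerSketch();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getFirstNames();
    }
}
```

For short values such as the names in users.xml the flag-based handler usually works as well; the buffering variant mainly matters for long text or text containing entities.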

package com.zetcode;

import com.zetcode.model.User;

import java.util.List;

public class JavaReadXmlSaxEx {

    public static void main(String[] args) {

        var runner = new MyRunner();
        List<User> lines = runner.parseUsers();

        lines.forEach(System.out::println);
    }
}

JavaReadXmlSaxEx starts the application. It delegates the parsing tasks to MyRunner. In the end, the retrieved data is printed to the console.

Java SAX validation example

The following example uses the XSD language to validate an XML file. XSD (XML Schema Definition) is the current standard schema language for all XML documents and data. (There are other alternative schema languages such as DTD and RELAX NG.) XSD is a set of rules to which an XML document must conform in order to be considered valid according to the schema.

The XSD file for validating users declares, for instance, that the <user> element must be within the <users> element, or that the id attribute of <user> must be an integer and is mandatory.

package com.zetcode;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

import javax.xml.XMLConstants;
import javax.xml.transform.sax.SAXSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import java.io.File;
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.logging.Level;
import java.util.logging.Logger;

public class JavaSaxValidation {

    public static void main(String[] args) {

        var xsdFile = new File("src/main/resources/users.xsd");
        Path xmlPath = Paths.get("src/main/resources/users.xml");

        // try-with-resources closes the reader when validation is done
        try (Reader reader = Files.newBufferedReader(xmlPath)) {

            String schemaLang = XMLConstants.W3C_XML_SCHEMA_NS_URI;
            SchemaFactory factory = SchemaFactory.newInstance(schemaLang);
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Schema schema = factory.newSchema(xsdFile);

            Validator validator = schema.newValidator();

            var source = new SAXSource(new InputSource(reader));
            validator.validate(source);

            System.out.println("The document was validated OK");

        } catch (SAXException ex) {

            Logger lgr = Logger.getLogger(JavaSaxValidation.class.getName());
            lgr.log(Level.SEVERE, "The document failed to validate");
            lgr.log(Level.SEVERE, ex.getMessage(), ex);
        } catch (IOException ex) {

            Logger lgr = Logger.getLogger(JavaSaxValidation.class.getName());
            lgr.log(Level.SEVERE, ex.getMessage(), ex);
        }
    }
}

The example uses the users.xsd schema to validate the users.xml file.

String schemaLang = XMLConstants.W3C_XML_SCHEMA_NS_URI;
SchemaFactory factory = SchemaFactory.newInstance(schemaLang);
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
Schema schema = factory.newSchema(xsdFile);

With the SchemaFactory we choose the W3C XML schema for our schema definition. In other words, our custom schema definition must also adhere to certain rules.

Validator validator = schema.newValidator();

A new validator is generated from the schema.

var source = new SAXSource(new InputSource(reader));
validator.validate(source);

We validate the XML document against the provided schema.

By default, if the document is not valid, a SAXException is thrown.
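If stopping at the first problem is not desired, a custom ErrorHandler can be set on the Validator so that recoverable validation errors are collected instead of thrown. The sketch below is one possible arrangement, not part of the original example: the class `CollectingValidationSketch`, its `validate` method and the message prefixes are assumptions, and both the schema and the document are passed in as Strings to keep the snippet self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, written only for this illustration.
class CollectingValidationSketch {

    // Validates the XML against the XSD and returns all reported problems.
    static List<String> validate(String xsd, String xml) throws Exception {
        final List<String> problems = new ArrayList<>();

        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
        Validator validator = schema.newValidator();

        validator.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                problems.add("warning: " + e.getMessage());
            }

            @Override
            public void error(SAXParseException e) {
                // Record the validation error and let the validator continue
                problems.add("error: " + e.getMessage());
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                // The document is not well-formed; continuing makes no sense
                throw e;
            }
        });

        validator.validate(new StreamSource(new StringReader(xml)));
        return problems;
    }
}
```

An empty list then means that the document is valid, while a non-empty list gives every violation found in one pass.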

In this article we have read and validated an XML document with Java SAX.

Author

My name is Jan Bodnar and I am a passionate programmer with many years of programming experience. I have been writing programming articles since 2007. So far, I have written over 1400 articles and 8 e-books. I have over eight years of experience in teaching programming.
