An XSD Example
This chapter will demonstrate how to write an XML Schema. You will also learn that a schema can be written in different ways.
An XML Document
Let’s have a look at this XML document called "shiporder.xml":
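A minimal sketch of such a document, consistent with the description that follows. The child elements of "shipto" and all of the values are illustrative assumptions, not something the text fixes:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<shiporder orderid="889923"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="shiporder.xsd">
  <orderperson>John Smith</orderperson>
  <shipto>
    <!-- the four children below are assumed for illustration -->
    <name>Ola Nordmann</name>
    <address>Langgt 23</address>
    <city>4000 Stavanger</city>
    <country>Norway</country>
  </shipto>
  <item>
    <title>Empire Burlesque</title>
    <note>Special Edition</note>
    <quantity>1</quantity>
    <price>10.90</price>
  </item>
  <item>
    <!-- no note here: the element is optional -->
    <title>Hide your heart</title>
    <quantity>1</quantity>
    <price>9.90</price>
  </item>
</shiporder>
```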
The XML document above consists of a root element, "shiporder", that has a required attribute called "orderid". The "shiporder" element contains three different child elements: "orderperson", "shipto" and "item". The "item" element appears twice, and it contains a "title", an optional "note" element, a "quantity", and a "price" element.
The attribute xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" on the root element declares the XML Schema instance namespace, which tells a schema-aware XML parser that this document should be validated against a schema. The attribute xsi:noNamespaceSchemaLocation="shiporder.xsd" specifies where the schema resides (here it is in the same folder as "shiporder.xml").
Create an XML Schema
Now we want to create a schema for the XML document above.
We start by opening a new file that we will call "shiporder.xsd". To create the schema we could simply follow the structure in the XML document and define each element as we find it. We will start with the standard XML declaration followed by the xs:schema element that defines a schema:
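A sketch of that starting point; the comment marks where the declarations described in the rest of this section would go:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- element and attribute declarations go here -->
</xs:schema>
```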
In the schema above we use the conventional prefix xs for the XML Schema namespace. The URI associated with this namespace identifies the Schema language definition and has the standard value http://www.w3.org/2001/XMLSchema.
Next, we have to define the "shiporder" element. This element has an attribute and it contains other elements, so we treat it as a complex type. The child elements of the "shiporder" element are surrounded by an xs:sequence element that defines an ordered sequence of sub-elements:
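A sketch of that outer structure, with the child declarations left as a placeholder comment. The fragment is assumed to sit inside the xs:schema element, where the xs prefix is declared:

```xml
<xs:element name="shiporder">
  <xs:complexType>
    <xs:sequence>
      <!-- orderperson, shipto and item are declared here, in this order -->
    </xs:sequence>
  </xs:complexType>
</xs:element>
```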
Then we have to define the "orderperson" element as a simple type (because it does not contain any attributes or other elements). The type (xs:string) is prefixed with the namespace prefix associated with XML Schema, which indicates a predefined schema data type:
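As a fragment inside the xs:sequence of "shiporder", that declaration could be written as:

```xml
<!-- simple element: text content only, no attributes or children -->
<xs:element name="orderperson" type="xs:string"/>
```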
Next, we have to define two elements of complex type: "shipto" and "item". We start by defining the "shipto" element:
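The text does not list the children of "shipto", so the sketch below assumes four string-valued children (name, address, city, country) purely for illustration:

```xml
<xs:element name="shipto">
  <xs:complexType>
    <xs:sequence>
      <!-- child names are assumptions, not taken from the text -->
      <xs:element name="name" type="xs:string"/>
      <xs:element name="address" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
      <xs:element name="country" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```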
With schemas we can define the number of possible occurrences for an element with the maxOccurs and minOccurs attributes. maxOccurs specifies the maximum number of occurrences for an element and minOccurs specifies the minimum number of occurrences for an element. The default value for both maxOccurs and minOccurs is 1.
Now we can define the "item" element. This element can appear multiple times inside a "shiporder" element. This is specified by setting the maxOccurs attribute of the "item" element to "unbounded", which means that there can be as many occurrences of the "item" element as the author wishes. Notice that the "note" element is optional. We have specified this by setting the minOccurs attribute to zero:
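A sketch of the "item" declaration. The occurrence constraints come from the text; the data types chosen for "quantity" and "price" (xs:positiveInteger and xs:decimal) are assumptions:

```xml
<!-- minOccurs defaults to 1, so at least one item is required -->
<xs:element name="item" maxOccurs="unbounded">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <!-- minOccurs="0" makes note optional -->
      <xs:element name="note" type="xs:string" minOccurs="0"/>
      <!-- assumed types for quantity and price -->
      <xs:element name="quantity" type="xs:positiveInteger"/>
      <xs:element name="price" type="xs:decimal"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```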
We can now declare the attribute of the "shiporder" element. Since this is a required attribute we specify use="required".
Note: Attribute declarations must come after the content model (here, the xs:sequence) inside the xs:complexType:
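A sketch of the placement, assuming the attribute is typed as xs:string:

```xml
<xs:element name="shiporder">
  <xs:complexType>
    <xs:sequence>
      <!-- child element declarations -->
    </xs:sequence>
    <!-- attributes follow the sequence; the xs:string type is an assumption -->
    <xs:attribute name="orderid" type="xs:string" use="required"/>
  </xs:complexType>
</xs:element>
```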
Here is the complete listing of the schema file called "shiporder.xsd":
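Putting the steps together gives a schema along the following lines. This is a sketch: the "shipto" children and the data types of "quantity", "price" and "orderid" are assumptions rather than details stated in the text:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="orderperson" type="xs:string"/>
        <xs:element name="shipto">
          <xs:complexType>
            <xs:sequence>
              <!-- assumed children of shipto -->
              <xs:element name="name" type="xs:string"/>
              <xs:element name="address" type="xs:string"/>
              <xs:element name="city" type="xs:string"/>
              <xs:element name="country" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="item" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string"/>
              <xs:element name="note" type="xs:string" minOccurs="0"/>
              <xs:element name="quantity" type="xs:positiveInteger"/>
              <xs:element name="price" type="xs:decimal"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <!-- attribute declared after the sequence -->
      <xs:attribute name="orderid" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```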
Divide the Schema
The previous design method is very simple, but can be difficult to read and maintain when documents are complex.
The next design method is based on defining all elements and attributes first, and then referring to them using the ref attribute.
Here is the new design of the schema file ("shiporder.xsd"):
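A sketch of this ref-based layout. Every element and attribute is declared once at the top level and then referenced; the "shipto" children and the data types are the same assumptions as before:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- simple elements -->
  <xs:element name="orderperson" type="xs:string"/>
  <xs:element name="name" type="xs:string"/>
  <xs:element name="address" type="xs:string"/>
  <xs:element name="city" type="xs:string"/>
  <xs:element name="country" type="xs:string"/>
  <xs:element name="title" type="xs:string"/>
  <xs:element name="note" type="xs:string"/>
  <xs:element name="quantity" type="xs:positiveInteger"/>
  <xs:element name="price" type="xs:decimal"/>

  <!-- attributes -->
  <xs:attribute name="orderid" type="xs:string"/>

  <!-- complex elements, built from references -->
  <xs:element name="shipto">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element ref="address"/>
        <xs:element ref="city"/>
        <xs:element ref="country"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="item">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="title"/>
        <xs:element ref="note" minOccurs="0"/>
        <xs:element ref="quantity"/>
        <xs:element ref="price"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="orderperson"/>
        <xs:element ref="shipto"/>
        <xs:element ref="item" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute ref="orderid" use="required"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Note that occurrence constraints (minOccurs, maxOccurs) and use="required" are placed on the references, not on the top-level declarations.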
Using Named Types
The third design method defines named types, which enable us to reuse element definitions. This is done by naming the simpleType and complexType elements, and then pointing to them through the type attribute of an element.
Here is the third design of the schema file ("shiporder.xsd"):
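A sketch of the named-type layout. All type names (stringtype, inttype, dectype, orderidtype, shiptotype, itemtype, shipordertype) are hypothetical names chosen for illustration, as are the "shipto" children:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- named simple types derived from built-in types -->
  <xs:simpleType name="stringtype">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>

  <xs:simpleType name="inttype">
    <xs:restriction base="xs:positiveInteger"/>
  </xs:simpleType>

  <xs:simpleType name="dectype">
    <xs:restriction base="xs:decimal"/>
  </xs:simpleType>

  <!-- a string of exactly six digits -->
  <xs:simpleType name="orderidtype">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{6}"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- named complex types -->
  <xs:complexType name="shiptotype">
    <xs:sequence>
      <xs:element name="name" type="stringtype"/>
      <xs:element name="address" type="stringtype"/>
      <xs:element name="city" type="stringtype"/>
      <xs:element name="country" type="stringtype"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="itemtype">
    <xs:sequence>
      <xs:element name="title" type="stringtype"/>
      <xs:element name="note" type="stringtype" minOccurs="0"/>
      <xs:element name="quantity" type="inttype"/>
      <xs:element name="price" type="dectype"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="shipordertype">
    <xs:sequence>
      <xs:element name="orderperson" type="stringtype"/>
      <xs:element name="shipto" type="shiptotype"/>
      <xs:element name="item" type="itemtype" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="orderid" type="orderidtype" use="required"/>
  </xs:complexType>

  <!-- the only global element; everything else is reached through types -->
  <xs:element name="shiporder" type="shipordertype"/>

</xs:schema>
```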
The restriction element indicates that the datatype is derived from a W3C XML Schema namespace datatype. So, the following fragment means that the value of the element or attribute must be a string value:
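Such a fragment might look like this, shown inside a simple type whose name (stringtype) is hypothetical:

```xml
<xs:simpleType name="stringtype">
  <!-- derived from the built-in xs:string with no further constraints -->
  <xs:restriction base="xs:string"/>
</xs:simpleType>
```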
The restriction element is more often used to apply restrictions to elements. Look at the following lines from the schema above:
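The lines in question would be along these lines, assuming the type is named orderidtype (a hypothetical name):

```xml
<xs:simpleType name="orderidtype">
  <xs:restriction base="xs:string">
    <!-- pattern facet: exactly six characters, each a digit 0-9 -->
    <xs:pattern value="[0-9]{6}"/>
  </xs:restriction>
</xs:simpleType>
```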
This indicates that the value of the element or attribute must be a string of exactly six characters, and that each character must be a digit from 0 to 9.
XML Schema Tutorial
An XML Schema describes the structure of an XML document.
The XML Schema language is also referred to as XML Schema Definition (XSD).
XSD Example
The purpose of an XML Schema is to define the legal building blocks of an XML document:
- the elements and attributes that can appear in a document
- the number of (and order of) child elements
- data types for elements and attributes
- default and fixed values for elements and attributes
Why Learn XML Schema?
In the XML world, hundreds of standardized XML formats are in daily use.
Many of these XML standards are defined by XML Schemas.
XML Schema is an XML-based (and more powerful) alternative to DTD.
XML Schemas Support Data Types
One of the greatest strengths of XML Schemas is their support for data types:
- It is easier to describe allowable document content
- It is easier to validate the correctness of data
- It is easier to define data facets (restrictions on data)
- It is easier to define data patterns (data formats)
- It is easier to convert data between different data types
XML Schemas use XML Syntax
Another great strength of XML Schemas is that they are written in XML:
- You don’t have to learn a new language
- You can use your XML editor to edit your Schema files
- You can use your XML parser to parse your Schema files
- You can manipulate your Schema with the XML DOM
- You can transform your Schema with XSLT
XML Schemas are extensible, because they are written in XML.
With an extensible Schema definition you can:
- Reuse your Schema in other Schemas
- Create your own data types derived from the standard types
- Reference multiple schemas in the same document
XML Schemas Secure Data Communication
When sending data from a sender to a receiver, it is essential that both parties have the same "expectations" about the content.
With XML Schemas, the sender can describe the data in a way that the receiver will understand.
A date like "03-11-2004" will, in some countries, be interpreted as 3 November and in other countries as 11 March.
However, an XML element declared with the data type xs:date ensures a mutual understanding of the content, because the XML data type "date" requires the format "YYYY-MM-DD".
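A small sketch of such a declaration. The element name orderdate is hypothetical, and the comments show one value that would pass validation and one that would not:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- xs:date expects the form YYYY-MM-DD (optionally followed by a time zone) -->
  <xs:element name="orderdate" type="xs:date"/>
  <!-- valid instance:   <orderdate>2004-03-11</orderdate>  (11 March 2004) -->
  <!-- invalid instance: <orderdate>03-11-2004</orderdate> -->
</xs:schema>
```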
Well-Formed is Not Enough
A well-formed XML document is a document that conforms to the XML syntax rules, such as:
- if it has an XML declaration, the declaration must come first
- it must have one unique root element
- start-tags must have matching end-tags
- element names are case sensitive
- all elements must be closed
- all elements must be properly nested
- all attribute values must be quoted
- special characters such as < and & must be written as entities
Even if documents are well-formed they can still contain errors, and those errors can have serious consequences.
Think of the following situation: you order 5 gross of laser printers, instead of 5 laser printers. With XML Schemas, most of these errors can be caught by your validating software.