Chapter 2. Retrieving Data From An XML Datasource
This chapter shows how to retrieve XML data from a standard data source. Such a source can be a file, an HTTP object or a text string. The method described in this chapter is the simplest way to retrieve XML data. More advanced ways are described in the following chapters.
This section describes a very simple XML application. It parses XML data from a stream and dumps it "pretty-printed" to the standard output. While its use is very limited, it shows how to set up a parser and parse an XML document.
You can easily traverse the logical tree generated by the parser. If you need to create your own object tree, you can create your custom builder, which is described in chapter 3.
The default XML builder, StdXMLBuilder, generates a tree of IXMLElement objects. Every such object has a name and can have attributes, #PCDATA content and child objects.
The following XML data:
<FOO attr1="fred" attr2="barney">
is parsed to the following objects:
You can retrieve the name of an element using the method getFullName, thus:
FOO.getFullName() ==> "FOO"
You can enumerate the attribute keys using the method enumerateAttributeNames:
Enumeration names = FOO.enumerateAttributeNames();
You can retrieve the value of an attribute using getAttribute:
FOO.getAttribute("attr1", null) ==> "fred"
The child elements can be enumerated using the method enumerateChildren:
Enumeration children = FOO.enumerateChildren();
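Both enumeration methods above return a java.util.Enumeration, which is walked with hasMoreElements and nextElement. The sketch below shows that loop using only JDK classes. The list of names is a stand-in for the result of FOO.enumerateAttributeNames(), and the class and method names (EnumerationWalk, drain) are hypothetical helpers, not part of NanoXML.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

class EnumerationWalk {
    // Copies every item of an Enumeration into a list, keeping iteration order.
    static List<String> drain(Enumeration<?> items) {
        List<String> out = new ArrayList<>();
        while (items.hasMoreElements()) {
            out.add(String.valueOf(items.nextElement()));
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for FOO.enumerateAttributeNames(); no XML is parsed here.
        Enumeration<String> names =
            Collections.enumeration(Arrays.asList("attr1", "attr2"));
        System.out.println(drain(names)); // prints [attr1, attr2]
    }
}
```

Note that `enum` became a reserved word in Java 5, so a variable holding the enumeration needs a different name on current compilers.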
If the element contains parsed character data (#PCDATA) as its only child, you can retrieve that data using getContent:
BAR.getContent() ==> "Some data."
If an element contains both #PCDATA and XML elements as its children, the character data segments will be put in untitled XML elements (whose name is null).
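To make the mixed-content layout concrete, the sketch below models it with a small stand-in class. Node is a hypothetical type written for this illustration and is not NanoXML's IXMLElement; it only mirrors the convention described above, where a character data segment becomes a child whose name is null. The sample tree and the helper directText are likewise assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class MixedContentSketch {
    // Hypothetical stand-in for an element; NOT NanoXML's IXMLElement.
    static final class Node {
        final String name;     // null marks an untitled #PCDATA element
        final String content;  // character data, or null if there is none
        final List<Node> children = new ArrayList<>();

        Node(String name, String content) {
            this.name = name;
            this.content = content;
        }
    }

    // Joins the character data held by the untitled direct children of parent.
    static String directText(Node parent) {
        StringBuilder text = new StringBuilder();
        for (Node child : parent.children) {
            if (child.name == null) {
                text.append(child.content);
            }
        }
        return text.toString();
    }

    // Models <P>Hello <B>world</B>!</P> as three children of P.
    static Node sample() {
        Node p = new Node("P", null);
        p.children.add(new Node(null, "Hello "));
        p.children.add(new Node("B", "world"));
        p.children.add(new Node(null, "!"));
        return p;
    }

    public static void main(String[] args) {
        System.out.println(directText(sample())); // prints Hello !
    }
}
```

The element B holds character data as its only child, so in this model its text lives in its own content field rather than in an untitled child.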
IXMLElement contains many convenience methods for retrieving data and traversing the XML tree.
You can very easily create a tree of XML elements or modify an existing one.
To create a new tree, just create an IXMLElement object:
IXMLElement elt = new XMLElement("ElementName");
You can add an attribute to the element by calling setAttribute.
You can add a child element to an element by calling addChild:
IXMLElement child = elt.createElement("Child");
Note that the child element is created by calling the method createElement. This ensures that the child instance is compatible with its new parent.
If an element has no children, you can add #PCDATA content to it using setContent:
If the element does have children, you can add #PCDATA content to it by adding an untitled element, which you create by calling createPCDataElement:
IXMLElement pcdata = elt.createPCDataElement();
When you have created or edited the XML element tree, you can write it out to an output stream or writer using an XMLWriter:
java.io.Writer output = ...;
As of version 2.1, NanoXML has support for namespaces. Namespaces allow you to attach a URI to the name of an element or an attribute. This URI allows you to distinguish between similarly named entities coming from different sources. More information about namespaces can be found in the XML Namespaces recommendation.
Please note that a DTD has no support for namespaces. It is important to understand that an XML document can have only one DTD. Though the namespace URI is often presented as a URL, that URL is not a system ID for a DTD. The only function of a namespace URI is to provide a globally unique name.
As an example, consider the following XML data:
The doc:book top-level element uses the namespace "http://nanoxml.n3.net/book". The prefix doc is an alias for that namespace and is declared by the attribute xmlns:doc. The prefix is in scope for the doc:book element and its child elements.
The chapter element uses the namespace "http://nanoxml.n3.net/chapter". Because the namespace URI has been defined as the value of the xmlns attribute, the namespace is the default namespace for the chapter element. Default namespaces are inherited by the child elements, but only for their names. Attributes never have a default namespace.
The chapter element has an attribute doc:id, which is defined in the same namespace as doc:book because of the doc prefix.
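These scoping rules come from the XML Namespaces recommendation, so they can be checked with any namespace-aware parser. The sketch below uses the JDK's built-in DOM parser rather than NanoXML, so none of the calls shown are NanoXML API. The sample document is an assumption reconstructed from the description above; the attribute values "Introduction" and "chapter1" are placeholders, and the class name is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class NamespaceScopeSketch {
    // Assumed sample document; the attribute values are placeholders.
    static final String XML =
        "<doc:book xmlns:doc=\"http://nanoxml.n3.net/book\">"
      + "<chapter xmlns=\"http://nanoxml.n3.net/chapter\""
      + " title=\"Introduction\" doc:id=\"chapter1\"/>"
      + "</doc:book>";

    // Parses XML with namespace processing enabled and returns the root element.
    static Element parseRoot() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(XML)))
                          .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("sample document failed to parse", e);
        }
    }

    public static void main(String[] args) {
        Element book = parseRoot();
        // The sample has no whitespace between tags, so the first child is chapter.
        Element chapter = (Element) book.getFirstChild();

        // Element names: the prefix and the default namespace both apply.
        System.out.println(book.getNamespaceURI());
        System.out.println(chapter.getNamespaceURI());
        // Attributes: only a prefixed attribute carries a namespace.
        System.out.println(chapter.getAttributeNode("doc:id").getNamespaceURI());
        System.out.println(chapter.getAttributeNode("title").getNamespaceURI());
    }
}
```

The last call returns null: the unprefixed title attribute does not pick up the default namespace of the chapter element.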
NanoXML 2.1 offers some variants on the standard retrieval methods to allow the application to access the namespace information.
In the following examples, we assume the variable book to contain the doc:book element and the variable chapter to contain the chapter element.
To get the full name of the element, which includes the namespace prefix, use getFullName:
book.getFullName() ==> "doc:book"
To get the short name of the element, which excludes the namespace prefix, use getName:
book.getName() ==> "book"
For elements that have no associated namespace, getName and getFullName are equivalent.
To get the namespace URI associated with the name of the element, use getNamespace:
book.getNamespace() ==> "http://nanoxml.n3.net/book"
If no namespace is associated with the name of the element, this method returns null.
You can get an attribute of an element using either its full name (which includes its prefix) or its short name together with its namespace URI, so the following two instructions are equivalent:
Note that the title attribute of chapter has no namespace, even though the chapter element name has a default namespace.
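The same two ways of addressing an attribute exist in the JDK's DOM API, which makes a convenient way to try the idea without NanoXML on the classpath. The sketch below is that analogue, not NanoXML code: getAttribute takes the full (prefixed) name, and getAttributeNS takes the namespace URI plus the short name. The sample document and its attribute values are assumptions, and the class name is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class AttributeLookupSketch {
    static final String BOOK_NS = "http://nanoxml.n3.net/book";
    static final String CHAPTER_NS = "http://nanoxml.n3.net/chapter";

    // Assumed sample document; the attribute values are placeholders.
    static final String XML =
        "<doc:book xmlns:doc=\"" + BOOK_NS + "\">"
      + "<chapter xmlns=\"" + CHAPTER_NS + "\""
      + " title=\"Introduction\" doc:id=\"chapter1\"/>"
      + "</doc:book>";

    // Parses the sample with namespace processing enabled and returns chapter.
    static Element parseChapter() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element book = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(XML)))
                                  .getDocumentElement();
            return (Element) book.getFirstChild();
        } catch (Exception e) {
            throw new IllegalStateException("sample document failed to parse", e);
        }
    }

    public static void main(String[] args) {
        Element chapter = parseChapter();

        // Two equivalent lookups: full name, or namespace URI plus short name.
        System.out.println(chapter.getAttribute("doc:id"));
        System.out.println(chapter.getAttributeNS(BOOK_NS, "id"));

        // title exists, but not inside the chapter element's default namespace.
        System.out.println(chapter.getAttribute("title"));
        System.out.println(chapter.hasAttributeNS(CHAPTER_NS, "title"));
    }
}
```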
You can create a new element which uses a namespace this way:
book = new XMLElement("doc:book", "http://nanoxml.n3.net/book");
You can add an attribute which uses a namespace this way:
chapter.setAttribute("doc:id", "http://nanoxml.n3.net/book", chapterId);