About XML

XML, or eXtensible Markup Language, is a markup language. Markup languages clarify the content in a document by tagging the elements of the document. A well-known markup language is HTML, the standard language for writing webpages. The benefit of working with XML is that it is an open standard. The structure and rules for working with XML documents are well documented by the World Wide Web Consortium (http://www.w3c.org). XML is also quite simple and understandable: you can view an XML file in any text editing application and even edit its content. Because XML is an open standard, anyone with sufficient understanding can process an XML document into other formats, such as plain text, HTML, or even other XML formats.

Tags and Elements

Tagging content gives the content of a document structure and specific meaning. Each tag defines an element of the document. For example, compare the following excerpts from a text file before and after tags have been added.

Original text file:

Coffee house wide shot
17
300
Good

Tagged document:

<clip>
    <name>Coffee house wide shot</name>
    <reel>17</reel>
    <duration>300</duration>
    <good>TRUE</good>
</clip>

In the original text file, you have to make assumptions about the meaning of the numbers 17 and 300. In the tagged document, the tags clarify that 17 is actually the reel name of a clip and 300 is the clip duration (in frames).

In XML, elements can contain other elements. In the example above, the <clip> element encompasses all of the other elements.
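As a minimal sketch of why tagging helps, the example below reads the tagged document with Python's standard xml.etree.ElementTree module and looks up each value by its tag name. The variable names are illustrative, and treating the duration as an integer frame count is an assumption based on the description above.

```python
import xml.etree.ElementTree as ET

# The tagged document from the example above, held in a string.
CLIP_XML = """
<clip>
    <name>Coffee house wide shot</name>
    <reel>17</reel>
    <duration>300</duration>
    <good>TRUE</good>
</clip>
"""

clip = ET.fromstring(CLIP_XML)

# Values are looked up by tag name, so there is no guessing about
# which number is the reel and which is the duration.
name = clip.findtext("name")
reel = clip.findtext("reel")                # reel name, kept as text
duration = int(clip.findtext("duration"))   # assumed to be a frame count

# The <clip> element contains the other elements as its children.
child_tags = [child.tag for child in clip]

print(name, reel, duration, child_tags)
```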

Most markup languages have a limited set of tags and rules about how the elements can be ordered hierarchically. For example, an HTML document can have a <p> element (this is a paragraph element) but if you added a <sentence> element, it would not be recognized by HTML-aware applications unless the entire HTML standard were altered.

XML was designed to be extensible—you can define any tags and hierarchical rules that fit the data you are working with. For example, an XML file that contains store inventory data might have elements such as <product>, <manufacturer>, <cost>, and <size>. An XML file that contains video editing information would have very different elements, such as <clip>, <name>, <duration>, <logginginfo>, and so on.
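The following sketch builds a small document from the store inventory tags mentioned above, using Python's standard xml.etree.ElementTree module. Nesting the other elements inside <product> is an assumption, and the values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Tag names come from the inventory example above; the nesting and
# the values are hypothetical.
product = ET.Element("product")
ET.SubElement(product, "manufacturer").text = "Example Roasters"
ET.SubElement(product, "cost").text = "12.50"
ET.SubElement(product, "size").text = "340 g"

# Serialize the element tree to a string of XML text.
inventory_xml = ET.tostring(product, encoding="unicode")
print(inventory_xml)
```

No tag names had to be registered anywhere before use; the format is whatever the author of the document decides it is.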

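XML is a strict markup language, which means all tags must be closed. For example, if your XML document contains a <clip> tag, there must be a corresponding </clip> tag to close the element. An element with no content can instead be written in the self-closing form <clip/>. A document with unclosed tags is not well formed, and an XML parser reports an error instead of processing it.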
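A short sketch of how a parser reacts to an unclosed tag, using Python's standard xml.etree.ElementTree module. The helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text, False if it reports an error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

closed = "<clip><name>Coffee house wide shot</name></clip>"
unclosed = "<clip><name>Coffee house wide shot</name>"   # </clip> is missing

print(is_well_formed(closed))
print(is_well_formed(unclosed))
```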

Attributes of XML Elements

Some elements contain identifying information called attributes. In XML, an element’s attribute looks like this:

<font color="red">
...
</font>

In the example above, the font element has an attribute called color, which is set to “red.” Alternatively, you could choose to structure your XML format without attributes:

<font>
    <color>red</color>
    ...
</font>
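The two structures carry the same information but are read differently. A minimal sketch with Python's standard xml.etree.ElementTree module follows; the elided content ("...") is left out of the sample strings, and attribute values are written with straight quotation marks, which XML requires.

```python
import xml.etree.ElementTree as ET

# Same information, once as an attribute and once as a child element.
with_attribute = ET.fromstring('<font color="red"></font>')
with_child = ET.fromstring("<font><color>red</color></font>")

color_from_attribute = with_attribute.get("color")   # attribute lookup
color_from_child = with_child.findtext("color")      # child element lookup

print(color_from_attribute, color_from_child)
```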

Just as XML tags are extensible, so are attributes. When you define the rules of your XML file, you can allow elements to have any attributes you want. For example, in the Final Cut Pro XML Interchange Format, every clip can have an “id” attribute so each clip can be uniquely identified and referenced:

<clip id="coffee house 1">
...
</clip>
<clip id="coffee house 2">
...
</clip>
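As a sketch of how an "id" attribute lets a program pick out one clip, the example below indexes clips by id with Python's standard xml.etree.ElementTree module. The <clips> wrapper element and the <name> contents are hypothetical, added only so that the sample is a complete document with a single root element.

```python
import xml.etree.ElementTree as ET

# <clips> and the <name> values are illustrative, not part of any real format.
DOC = """
<clips>
    <clip id="coffee house 1"><name>Wide shot</name></clip>
    <clip id="coffee house 2"><name>Close-up</name></clip>
</clips>
"""

root = ET.fromstring(DOC)

# Build a lookup table from id to element.
clips_by_id = {clip.get("id"): clip for clip in root.iter("clip")}
second_name = clips_by_id["coffee house 2"].findtext("name")

# The same lookup using ElementTree's limited XPath support.
found = root.find("./clip[@id='coffee house 1']")
first_name = found.findtext("name")

print(first_name, second_name)
```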

Whitespace

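Whitespace in a document includes spaces, tab characters, carriage returns, newline characters, and so on. An XML parser reads and processes the tags in a document, and whitespace that is used only to lay out those tags (for example, line breaks and indentation between elements) does not change the document's structure. Applications that read element-based formats generally disregard it. Structurally, there is no difference between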

<clip><name>Coffee house wide shot</name><reel>17</reel></clip>

and

<clip>
    <name>Coffee house wide shot</name>
    <reel>17</reel>
</clip>

Whitespace is permitted so you can make your XML file more readable without affecting the fundamental structure or meaning.
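The sketch below shows how one parser, Python's standard xml.etree.ElementTree, treats the two forms: the element structure is identical, and the line breaks survive only as whitespace-only text that an application can disregard. The helper name structure is hypothetical.

```python
import xml.etree.ElementTree as ET

compact = "<clip><name>Coffee house wide shot</name><reel>17</reel></clip>"
spaced = """<clip>
    <name>Coffee house wide shot</name>
    <reel>17</reel>
</clip>"""

def structure(element):
    """Return (tag, text, children), trimming surrounding whitespace from text.

    Trimming is appropriate here because no element mixes text with
    child elements; a format with significant whitespace would need more care.
    """
    text = (element.text or "").strip()
    return (element.tag, text, [structure(child) for child in element])

same_structure = structure(ET.fromstring(compact)) == structure(ET.fromstring(spaced))

# The layout whitespace is still visible to the application as text
# between <clip> and its first child.
layout_text = ET.fromstring(spaced).text

print(same_structure, repr(layout_text))
```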

Document Type Definitions

The rules of an XML format specify which elements (tags) can exist, which elements contain other elements, which elements are optional or required, what attributes each element has, and so on. You can define these rules in a Document Type Definition, or DTD. A DTD lets a validating parser verify the structure of documents. Without a DTD (or another kind of schema), a parser can still check that a document is well formed, meaning that its tags are properly closed and nested, but it cannot validate the document against the rules of a particular format. XML itself does not require every document to have a DTD, but a format that is exchanged between applications benefits from one because it records the structure that everyone has agreed on.

If you are working with a predefined language, such as HTML or the Final Cut Pro XML Interchange Format, the DTD has already been created for you. All you need to do is follow the rules of the DTD to create valid Final Cut Pro XML.
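To give a sense of what such rules look like, the sketch below embeds a small DTD in a document. The DTD is hypothetical, written for the clip example in this section, and is not the Final Cut Pro DTD. Python's standard xml.etree.ElementTree parser reads past the DTD but does not validate against it, so the hand-written missing_children check (also hypothetical) stands in for one of the rules a validating parser would enforce.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD: a <clip> must contain <name>, <reel>, and <duration>,
# in that order, and may carry an optional "id" attribute.
DOC = """<!DOCTYPE clip [
    <!ELEMENT clip (name, reel, duration)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT reel (#PCDATA)>
    <!ELEMENT duration (#PCDATA)>
    <!ATTLIST clip id CDATA #IMPLIED>
]>
<clip id="coffee house 1">
    <name>Coffee house wide shot</name>
    <reel>17</reel>
</clip>"""

REQUIRED_CHILDREN = ["name", "reel", "duration"]

def missing_children(element, required):
    """List the required child tags that the element does not contain."""
    present = {child.tag for child in element}
    return [tag for tag in required if tag not in present]

# The document is well formed, so this non-validating parser accepts it
# even though it breaks the DTD by omitting <duration>.
clip = ET.fromstring(DOC)
missing = missing_children(clip, REQUIRED_CHILDREN)

print(missing)
```

A validating parser, available in third-party libraries, would reject this document on its own; the manual check only illustrates the idea.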

Working with XML Created in Different Applications

XML documents can be used to represent almost any kind of information. Unlike languages such as HTML, XML has no predefined elements. XML is not one format; rather, XML is used to create specific XML-based markup languages. Just because an application supports XML does not mean that it can recognize any kind of XML document. For example, a database application may use an XML format with elements like <row>, <column>, and <subtotal>, while a graphics application might store information in elements such as <layer>, <shape>, and <color>. Even though both documents are XML, they are incompatible because their Document Type Definitions are completely different.
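A minimal sketch of this point: code written for one XML format can parse a document in another format, but finds none of the elements it expects. The function read_clip_name and both sample documents are hypothetical, and Python's standard xml.etree.ElementTree module does the parsing.

```python
import xml.etree.ElementTree as ET

def read_clip_name(xml_text: str) -> str:
    """Return the clip name, or raise ValueError for a non-clip document."""
    root = ET.fromstring(xml_text)
    if root.tag != "clip":
        raise ValueError(f"expected a <clip> document, got <{root.tag}>")
    return root.findtext("name", default="")

video_doc = "<clip><name>Coffee house wide shot</name></clip>"
database_doc = "<row><column>42</column><subtotal>42</subtotal></row>"

clip_name = read_clip_name(video_doc)

# Both documents are well-formed XML, but only one is in the expected format.
try:
    read_clip_name(database_doc)
    rejected = False
except ValueError:
    rejected = True

print(clip_name, rejected)
```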