How to create a Document Type Definition (DTD) for XML files

A DTD describes precisely the allowed syntax and grammar of your XML file. It gives your XML file a fixed structure. If the file uses elements other than those defined in the DTD, it is no longer valid against that DTD (although it can still be well-formed XML). In short, you choose which elements, attributes, and so on users may write.

Structure of a DTD

A DTD can have these items:

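  • ELEMENT - declares an XML element and its allowed content
  • ATTLIST - declares the attributes of an element
  • ENTITY - declares a named, reusable piece of content (a data container)
  • #PCDATA - parsed character data (the text content of an element)
  • CDATA - character data, used as the type of an attribute value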
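
As a rough illustration of how these items look in practice, the fragment below declares one element, one attribute and one entity. The names note, lang and author are made up for this sketch and are not part of the projects example.

```xml
<!-- ELEMENT: declares the element "note"; it contains text (#PCDATA) -->
<!ELEMENT note (#PCDATA)>
<!-- ATTLIST: declares an optional attribute "lang" of type CDATA on "note" -->
<!ATTLIST note lang CDATA #IMPLIED>
<!-- ENTITY: declares replacement text that a document can use as &author; -->
<!ENTITY author "Jane Doe">
```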

Creating the DTD file

We want to create a DTD that fixes the structure of our XML projects file, so that we can exchange (import/export) files with other users without trouble.

The XML file

This is the structure of the XML file that we want to create a DTD for.

<?xml version="1.0" encoding="UTF-8"?>
<projects>
  <project ID="1">
    <project_order />
    <name />
    <desc />
    <photo ID="1">
      <photo_name />
    </photo>
  </project>
  <project ID="2">
    <project_order />
    <name />
    <desc />
    <photo ID="2">
      <photo_name />
    </photo>
  </project>
</projects>

The DTD file

DTD files have a .dtd extension. Our file, projects.dtd, contains only the declarations:

<!ELEMENT projects (project*)>
<!ELEMENT project (project_order, name, desc?, photo*)>
<!ELEMENT photo (photo_name)>
<!ELEMENT project_order (#PCDATA)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT desc (#PCDATA)>
<!ELEMENT photo_name (#PCDATA)>
<!ATTLIST project ID CDATA #REQUIRED>
<!ATTLIST photo ID CDATA #REQUIRED>

An element may be declared only once, so everything a project can contain goes into a single declaration. The <!DOCTYPE projects [ ... ]> wrapper is only needed when the declarations are embedded in the XML document itself; it does not belong in a separate .dtd file.

Taking this line by line, it says:

  1. projects is a valid element name, and an instance of such an element contains any number of project elements.
    The * denotes there can be 0 or more project elements within the projects element.
  2. project is a valid element name, and an instance of such an element contains one element named project_order, followed by one named name, followed by an optional element named desc, followed by any number of photo elements. The ? indicates that an element is optional, and the * that it can occur 0 or more times. The references to project_order and name have neither, so a project element must contain exactly one of each.
  3. photo is a valid element name, and an instance of such an element contains one element named photo_name.
  4. project_order is a valid element name, and an instance of such an element contains "parsed character data" (#PCDATA).
  5. name is a valid element name, and an instance of such an element contains "parsed character data" (#PCDATA).
  6. desc is a valid element name, and an instance of such an element contains "parsed character data" (#PCDATA).
  7. photo_name is a valid element name, and an instance of such an element contains "parsed character data" (#PCDATA).
  8. The element project has a required attribute called ID whose value is character data (CDATA).
  9. The element photo has a required attribute called ID whose value is character data (CDATA).
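
Declarations like these can also be embedded directly in the XML document, inside the doctype, instead of living in a separate file. The following is only a minimal sketch of that form, using a shortened, hypothetical version of the projects structure (the project name is made-up sample data):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Internal DTD: the declarations sit between [ and ] of the doctype -->
<!DOCTYPE projects [
  <!ELEMENT projects (project*)>
  <!ELEMENT project (name)>
  <!ELEMENT name (#PCDATA)>
  <!ATTLIST project ID CDATA #REQUIRED>
]>
<projects>
  <project ID="1">
    <name>Harbour bridge</name>
  </project>
</projects>
```

An internal DTD keeps everything in one file, which is handy for small tests. A separate .dtd file is easier to share between many XML files.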

Linking the DTD to the XML file

You can link your .dtd file to your XML file by using a doctype declaration.

Example of a doctype:

<!DOCTYPE name SYSTEM "dtdurl">

name is the name of the root element, and dtdurl is the URL of the DTD, which can be absolute or relative to the XML file. In our example the doctype should look something like this:

<!DOCTYPE projects SYSTEM "http://www.mywebsite.com/files/projects.dtd">
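
The URL does not have to be absolute. As a sketch, assuming projects.dtd is stored in the same directory as the XML file (that location is an assumption made for this example), the doctype could be written as:

```xml
<!-- Relative system identifier: resolved against the location of the XML file -->
<!DOCTYPE projects SYSTEM "projects.dtd">
```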

With the doctype added, our XML file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE projects SYSTEM "http://www.mywebsite.com/files/projects.dtd">
<projects>
  <project ID="1">
    <project_order />
    <name />
    <desc />
    <photo ID="1">
      <photo_name />
    </photo>
  </project>
  <project ID="2">
    <project_order />
    <name />
    <desc />
    <photo ID="2">
      <photo_name />
    </photo>
  </project>
</projects>
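
To illustrate what the DTD enforces, here is a hypothetical document that is still well-formed XML, but that a validating parser should reject when it checks the document against projects.dtd. The budget element and the text values are invented for this sketch.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE projects SYSTEM "http://www.mywebsite.com/files/projects.dtd">
<projects>
  <!-- invalid: the required ID attribute is missing -->
  <project>
    <project_order>1</project_order>
    <!-- invalid: the required name element is missing -->
    <desc>A short description</desc>
  </project>
  <project ID="2">
    <project_order>2</project_order>
    <name>Second project</name>
    <!-- invalid: budget is not declared in the DTD -->
    <budget>1000</budget>
  </project>
</projects>
```

Note that a parser which does not validate will read this file without complaint. The DTD rules are only checked when validation is switched on.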

This is only a basic example. DTDs can express much more.

Example: Transitional XHTML DTD

This example shows a part of a DTD for Transitional XHTML files:

<!ENTITY % URI "CDATA">
<!ENTITY % Text "CDATA">
<!ENTITY % Length "CDATA">
<!ENTITY % ImgAlign "(top|middle|bottom|left|right)">
<!ENTITY % Pixels "CDATA">
<!ELEMENT img EMPTY>
<!ATTLIST img
	%attrs;
	src %URI; #REQUIRED
	alt %Text; #REQUIRED
	name NMTOKEN #IMPLIED
	longdesc %URI; #IMPLIED
	height %Length; #IMPLIED
	width %Length; #IMPLIED
	usemap %URI; #IMPLIED
	align %ImgAlign; #IMPLIED
	border %Length; #IMPLIED
	hspace %Pixels; #IMPLIED
	vspace %Pixels; #IMPLIED
>
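
Parameter entities such as %URI; are text substitutions made inside the DTD itself. As a rough sketch, after expansion three of the attribute declarations above are equivalent to the following. The %attrs; reference is left out here because its definition is not part of the excerpt.

```xml
<!-- %URI; and %Text; expand to CDATA, %ImgAlign; to the list of allowed values -->
<!ATTLIST img
	src CDATA #REQUIRED
	alt CDATA #REQUIRED
	align (top|middle|bottom|left|right) #IMPLIED
>
```

An element such as <img src="logo.png" alt="Logo" align="left" /> (file name made up for illustration) would satisfy these three declarations, while align="center" would not, because center is not in the list of allowed values.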