XML tutorial: XML DTD (DOCUMENT TYPE DEFINITION)

What is DTD?

· DTD stands for Document Type Definition.

· A DTD is usually stored as a text file with the .dtd extension.

· If an XML file holds data, its corresponding DTD holds metadata (data about that data).

A DTD specifies the legal building blocks of an XML document, i.e. the XML vocabulary is specified in a DTD.
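
To see how a DTD and its data fit together, here is a minimal sketch of one student record. The element names (stud, name, branch) and the values are assumptions chosen only for illustration. The declarations are written inside the document itself so that the example is self-contained:

```xml
<?xml version="1.0"?>
<!-- DTD part: declares which elements a stud document may contain -->
<!DOCTYPE stud [
  <!ELEMENT stud (name, branch)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT branch (#PCDATA)>
]>
<!-- Data part: must follow the rules declared above -->
<stud>
  <name>Ravi</name>
  <branch>CSE</branch>
</stud>
```

The same three declarations could instead be saved in a separate file, for example a hypothetical stud.dtd, and referenced from the XML file with <!DOCTYPE stud SYSTEM "stud.dtd">.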

What are the constituents of a DTD file?

· From the DTD point of view, all XML documents are made up of the following building blocks:

1. Elements

2. Attributes

3. Entities

4. PCDATA

5. CDATA

1. Elements: elements are the most important building block; they are what we use to create tags. An element can contain text, it can contain other elements, or it can be empty.

Ex: <stud> is the start tag of an element named stud.
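
The three kinds of element content described above can be sketched as follows. The names stud, name and photo are hypothetical and used only for illustration:

```xml
<!-- Hypothetical element names, for illustration only -->
<stud>                   <!-- an element that contains other elements -->
  <name>Ravi</name>      <!-- an element that contains text -->
  <photo/>               <!-- an empty element -->
</stud>
```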

2. Attributes: attributes provide additional information about an XML element.

· Attributes must be specified in the start tag.

· Attributes always come in name-value pairs.

· Attribute values must be enclosed in either single quotes or double quotes.
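
A short sketch of these rules, assuming hypothetical attribute names rollno and branch:

```xml
<!-- Both attributes sit in the start tag as name="value" pairs -->
<!-- Single and double quotes are both accepted -->
<stud rollno="101" branch='CSE'>
  <name>Ravi</name>
</stud>
```

In a DTD, an attribute like this would be declared with an attribute-list declaration, for example <!ATTLIST stud rollno CDATA #REQUIRED>.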

ENTITIES:

Some characters have a special meaning in XML, like the less-than sign (<), which defines the start of an XML tag. Most of you know the HTML entity “&nbsp;”. This “non-breaking space” entity is used in HTML to insert an extra space in a document. Entities are expanded when a document is parsed by an XML parser.

The following entities are predefined in XML:

ENTITY REFERENCE	CHARACTER
&lt;	<
&gt;	>
&amp;	&
&quot;	"
&apos;	'
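
As a small illustration of the predefined entities in use, the sketch below writes a line of text that would otherwise contain illegal characters. The message element and its text are made up for the example:

```xml
<!-- Each entity reference starts with & and ends with ; -->
<message>if a &lt; b &amp;&amp; b &gt; 0 then say &quot;ok&quot;</message>
<!-- After parsing, the element text is: if a < b && b > 0 then say "ok" -->
```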

PCDATA:

XML parsers normally parse all the text in an XML document. When an XML element is parsed, the text between the XML tags is also parsed:

Ex: <message>This text is also parsed</message>

The parser does this because XML elements can contain other elements, as in this example, where the <name> element contains two other elements (first and last):

<name><first>Bill</first><last>Gates</last></name>

And the parser will break it up into sub-elements like this:

<name>
  <first>Bill</first>
  <last>Gates</last>
</name>

Parsed Character Data (PCDATA) is the term for text data that will be parsed by the XML parser.
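
In a DTD, the keyword #PCDATA marks an element whose content is parsed character data. A minimal sketch for the name example above could look like this (the declarations are an illustration, not part of the original example):

```xml
<!DOCTYPE name [
  <!-- name contains exactly two child elements, in this order -->
  <!ELEMENT name (first, last)>
  <!-- first and last contain only parsed character data -->
  <!ELEMENT first (#PCDATA)>
  <!ELEMENT last (#PCDATA)>
]>
<name><first>Bill</first><last>Gates</last></name>
```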

CDATA: (Unparsed) Character Data

· The term CDATA is used for text data that should not be parsed by the XML parser.

· Characters like “<” and “&” are illegal as literal text inside XML elements.

· “<” will generate an error because the parser interprets it as the start of a new element.

· “&” will generate an error because the parser interprets it as the start of an entity reference.

· Some text, like JavaScript code, contains a lot of “<” or “&” characters. To avoid errors, script code can be defined as CDATA.

· Everything inside a CDATA section is treated as plain text; the parser does not interpret it as markup.

A CDATA section starts with “<![CDATA[” and ends with “]]>”:

<script>
<![CDATA[
function matchwo(a,b)
{
  // return 1 when a is less than b and a is negative, otherwise 0
  if (a < b && a < 0)
  {
    return 1;
  }
  else
  {
    return 0;
  }
}
]]>
</script>

· In the example above, everything inside the CDATA section is treated as plain character data; the parser does not interpret it as markup.

Note: A CDATA section cannot contain the string “]]>”, so nested CDATA sections are not allowed.

The “]]>” that marks the end of the CDATA section cannot contain spaces or line breaks.
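
If the text itself has to contain the end marker, one common workaround is to split it across two CDATA sections so that the marker never appears whole inside either one. A minimal sketch, using a hypothetical code element:

```xml
<!-- The text to store is: a]]>b -->
<!-- The first section holds "a]]" and the second holds ">b" -->
<code><![CDATA[a]]]]><![CDATA[>b]]></code>
```

After parsing, the two sections are joined, so the element text reads a]]>b.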

· In a DTD, XML elements are declared with an element declaration. An element declaration has the following syntax:

<!ELEMENT element-name category>

or

<!ELEMENT element-name (element-content)>
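
A minimal sketch of element declarations for a hypothetical note document is shown below. The element names (note, to, body, br) are assumptions for illustration; EMPTY, ANY and #PCDATA are standard DTD keywords:

```xml
<!DOCTYPE note [
  <!ELEMENT note (to, body, br?)>   <!-- child elements in order, br optional -->
  <!ELEMENT to (#PCDATA)>           <!-- parsed character data only -->
  <!ELEMENT body ANY>               <!-- any mix of text and declared elements -->
  <!ELEMENT br EMPTY>               <!-- no content at all -->
]>
<note><to>Ravi</to><body>Hello</body><br/></note>
```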

3. Entities: these building blocks represent special characters.

4. PCDATA: (Parsed Character Data): this data will be parsed by the parser, and any entities in it are expanded.

5. CDATA: (Character Data): this data will not be parsed by the parser, and entities in it are not expanded.

Tuesday, 13 August 2013
