Creating Well-Formed XML Documents

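1. The XML declaration (<?xml ... ?>) and the root element
➢ The declaration, if present, is the first thing in the document
➢ It specifies the XML version
➢ There is exactly one root element, which contains all other elements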
2. Opening and closing tags
➢ Every start tag needs a matching end tag: <name> </name>
3. Empty element
➢ <bookname></bookname> or <bookname/>
4. Entities
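1. Character entities - &entity; (&gt;, &lt;, &apos;, &quot;, &amp;)
2. Binary entities - <!ENTITY city SYSTEM "delhi.html" NDATA html>
1. !ENTITY - keyword that starts the declaration
2. city - entity name
3. SYSTEM - marks the entity as external, located by a system identifier
4. "delhi.html" - system identifier (location of the actual contents)
5. NDATA - marks the contents as non-XML (unparsed, e.g. binary) data
6. html - notation name that tells the application how to interpret the data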
3. Text entities - <!ENTITY country "india">
(name: country) (content: "india"); writing &country; in the document expands to india
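The entity kinds above can be tried out with a small Python sketch. It assumes the standard-library xml.etree.ElementTree parser, which these notes do not mention; the sample document and variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one internal text entity plus predefined character entities.
DOC = """<!DOCTYPE address [
  <!ENTITY country "india">
]>
<address note="R&amp;D &quot;HQ&quot;">delhi, &country; &lt;north&gt;</address>"""

root = ET.fromstring(DOC)

# The parser replaces &country; with the declared content, and the
# predefined references with the characters they stand for.
text = root.text
note = root.get("note")
print(text)
print(note)
```

After parsing, the application sees only the replacement text; the entity references themselves are gone.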
XML Declaration
● <?xml
version="version_number"
encoding="encoding_declaration"
standalone="standalone_status"
?>
● Must be at the very start of the document, if present
● Version: required inside the declaration; 1.0 is assumed when the declaration is omitted
● Encoding attribute: names the character encoding used by the document
● Standalone attribute: the value must be "no" if the document relies on external markup declarations of:
– Attributes with default values
– Entities (other than &amp;, &lt;, &gt;, &apos; and &quot;)
– Element types with element content, if white space occurs directly within them
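As a rough illustration of the declaration fields, the sketch below reads them back with Python's xml.parsers.expat module. The helper name read_declaration and the sample documents are assumptions made for this example.

```python
from xml.parsers import expat

def read_declaration(data: bytes) -> dict:
    """Return the version, encoding and standalone values of the XML declaration."""
    found = {}

    def on_decl(version, encoding, standalone):
        # expat reports standalone as 1 ("yes"), 0 ("no") or -1 (not given).
        found.update(version=version, encoding=encoding, standalone=standalone)

    parser = expat.ParserCreate()
    parser.XmlDeclHandler = on_decl
    parser.Parse(data, True)
    return found

info = read_declaration(
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><root/>'
)
print(info)

# A declaration that is not at the very start of the document is rejected.
try:
    read_declaration(b' <?xml version="1.0"?><root/>')
    misplaced_accepted = True
except expat.ExpatError:
    misplaced_accepted = False
print(misplaced_accepted)
```

A single leading space before <?xml is enough to make the document fail, which is why the declaration has to come first.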
XML naming rules
● Element names should be consistent
● Begin with a letter or _, followed by any number of letters, digits, hyphens, periods, colons and _
● Case sensitive
● Cannot contain spaces
● Names must not begin with the letters xml in any case combination (reserved by the W3C), e.g. xmlname cannot be used
● A name may begin with 'x'; only the three-letter prefix xml is reserved
● Prefer descriptive tag names over overly brief ones: <queue> rather than <q>
● Follow one naming and capitalization convention throughout (e.g. BookStore)
XML naming rules
● The colon (:) is used to separate a namespace prefix from the actual name, so avoid it for any other purpose.
● Avoid consecutive punctuation characters in names.
● XML 1.0 does not specify a character limit for XML names, but many parsers have problems with names longer than 256 characters.
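The naming rules can be approximated with a short Python check. This is only a sketch: the pattern is limited to ASCII (real XML names also allow many non-ASCII letters), and the function name is_acceptable_name is invented for this example.

```python
import re

# ASCII-only approximation: a letter or underscore first, then letters,
# digits, underscores, periods, colons or hyphens.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.:-]*")

def is_acceptable_name(name: str) -> bool:
    """Check a candidate element name against the rules listed above."""
    if not NAME_PATTERN.fullmatch(name):
        return False
    # Names starting with "xml" in any case combination are reserved.
    return not name.lower().startswith("xml")

for candidate in ["BookStore", "_id", "ns:book", "1book", "book name", "XmlName"]:
    print(candidate, is_acceptable_name(candidate))
```

The first three candidates pass; the last three fail because of a leading digit, a space, and the reserved xml prefix.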
Element content
● It is handled in one of two ways
– Parsed Character Data (PCDATA): it is examined by the XML parser to discover XML content embedded within it.
– Character Data (CDATA): it is delimited by the special syntax <![CDATA[...]]> and is not processed by the parser.
PCDATA-Parsed Character Data
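● All textual data is parsed by default.
● Special characters need to be handled carefully so the parser does not mistake them for markup.
● XML parsers normally parse all the text in an XML document.
– Ampersand: & is written as &amp;
– Single quote: ' is written as &apos;
– Greater than: > is written as &gt;
– Less than: < is written as &lt;
– Double quote: " is written as &quot;
● The leading & and the trailing ; mark an entity reference, so the parser can tell it apart from ordinary text characters.
● &apos; and &quot; stand for the single quote (apostrophe) and the double quote; they are needed mainly inside attribute values.
● When a document contains many such characters, a CDATA section can be used instead.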
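Escaping by hand is error-prone, so a helper is often used. The sketch below assumes Python's xml.sax.saxutils.escape and an illustrative input string; it is one possible way to produce the entity references listed above.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

raw = 'Tom & Jerry say "1 < 2"'

# escape() handles &, < and > by default; quotes are added via the extra mapping.
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})
print(escaped)

# Round trip: parsing turns the entity references back into the original characters.
element = ET.fromstring(f"<line>{escaped}</line>")
print(element.text == raw)
```

The ampersand is replaced first, so the & characters introduced by the other references are not escaped a second time.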
CDATA – Character Data
● <![CDATA[ ... ]]>
● CDATA is not parsed and is treated as is.
● CDATA sections cannot be nested.
● It retains white space.
● Useful for embedding other languages within XML (& can be written as is, not as &amp;):
– HTML documents
– XML documents
– JavaScript or any other language
CDATA Example
● The following markup shows an example of CDATA. Here, each character written inside the CDATA section is passed through as text and is not interpreted as markup by the parser.
<script>
<![CDATA[
<message> Welcome to TutorialsPoint </message>
]]>
</script>
● In the above example, everything between <![CDATA[ and ]]>, including the <message> and </message> tags, is treated as character data and not as markup.
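To confirm that the parser really sees the CDATA section as text, the example can be fed to a parser. This sketch assumes Python's xml.etree.ElementTree; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

DOC = """<script>
<![CDATA[
<message> Welcome to TutorialsPoint </message>
]]>
</script>"""

root = ET.fromstring(DOC)

child_count = len(root)        # child elements found inside <script>
content = root.text.strip()    # character data found inside <script>
print(child_count)
print(content)
```

The script element ends up with no child elements: the <message> tags arrive as plain characters in its text.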
Element tag- Rules
● The end tag is identified by /, e.g. <color> </color>
● Elements may contain attributes within the start tag
– E.g. <book isbn="324"></book>
● Empty elements contain no child elements or text.
● E.g. <record key="111"></record> can be written as <record key="111" /> or <record key="111"/>
– A space before / is optional
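A quick way to see that the empty-element spellings are interchangeable is to parse each one and compare the results. The sketch assumes Python's xml.etree.ElementTree and uses a record element as sample input.

```python
import xml.etree.ElementTree as ET

forms = [
    '<record key="111"></record>',   # start tag plus end tag
    '<record key="111" />',          # empty-element tag, space before /
    '<record key="111"/>',           # empty-element tag, no space
]

parsed = [ET.fromstring(form) for form in forms]

# Summarize each element as (tag, attributes, text, number of children).
summaries = [(el.tag, dict(el.attrib), el.text, len(el)) for el in parsed]
all_same = all(summary == summaries[0] for summary in summaries)
print(summaries[0])
print(all_same)
```

All three forms give the same element with no text and no children.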
Element attributes-Rules
● Attributes provide extra information about an element (key info)
● Used to attach information to elements
● Consist of name="value"; the name is a legal XML name and the value is quoted with single or double quotes
● Placed in the start tag
● An element may have several attributes, each with a distinct name
● <title type="sec1" number="1">XML intro</title>
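The attribute rules can be checked against a parser. The sketch below assumes Python's xml.etree.ElementTree; the helper name parses and the two faulty samples are invented for this example.

```python
import xml.etree.ElementTree as ET

title = ET.fromstring('<title type="sec1" number="1">XML intro</title>')
attrs = dict(title.attrib)   # attributes of the start tag as a dictionary
print(attrs)

def parses(markup: str) -> bool:
    """Return True if the markup parses without a well-formedness error."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True

unquoted_ok = parses("<title type=sec1>XML intro</title>")           # value not quoted
duplicate_ok = parses('<title type="a" type="b">XML intro</title>')  # name repeated
print(unquoted_ok, duplicate_ok)
```

Both faulty samples are rejected: attribute values must be quoted, and an attribute name may appear only once per start tag.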
Comments
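● Used for clarification; they may appear between elements but not inside a tag
● Comments cannot be nested, and -- must not appear inside a comment.
● Use them where needed, but sparingly.
● Syntax: <!-- comment text -->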
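The comment rules can be tested the same way. This sketch assumes Python's xml.etree.ElementTree, whose default parser discards comments; the helper name parses and the sample documents are illustrative.

```python
import xml.etree.ElementTree as ET

def parses(markup: str) -> bool:
    """Return True if the markup parses without a well-formedness error."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True

plain_ok = parses("<book><!-- a short note --><title>XML</title></book>")
nested_ok = parses("<book><!-- outer <!-- inner --> still outer --></book>")
inside_tag_ok = parses("<book <!-- note --> ></book>")
print(plain_ok, nested_ok, inside_tag_ok)

# The comment is not part of the element tree built by the default parser.
book = ET.fromstring("<book><!-- a short note --><title>XML</title></book>")
child_tags = [child.tag for child in book]
print(child_tags)
```

Only the first document is accepted; the nested comment fails because its inner <!-- contains the forbidden -- sequence.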
Well-formed versus valid
● Well-formed = follows all XML syntax rules
● Valid = well-formed and also adheres to the rules of its DTD or XML Schema
● Well-formed XML:
– Consists of XML elements that are properly nested within one another
– Has a unique root element
– Follows naming conventions
– Follows rules for quoting attributes
– Has all special characters properly escaped.
● A valid XML document has an associated vocabulary defined by a DTD or XML Schema
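The well-formedness checklist can be exercised with a small Python sketch. It assumes the standard-library xml.etree.ElementTree parser, which checks well-formedness only and does not validate against a DTD or XML Schema; the function name is_well_formed and the samples are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses without a well-formedness error."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True

samples = {
    "good": '<store><book isbn="324">XML &amp; DTD</book></store>',
    "bad nesting": "<store><book></store></book>",
    "two roots": "<store/><store/>",
    "unquoted attribute": "<book isbn=324/>",
    "unescaped ampersand": "<book>XML & DTD</book>",
}

results = {label: is_well_formed(markup) for label, markup in samples.items()}
for label, ok in results.items():
    print(f"{label}: {ok}")
```

Only the first sample passes. Checking validity would need a validating parser in addition to this check.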