Wiley Beginning XML Databases

Contents

1. [Figure 1-7 screenshot: populationInThousands.xml displayed in a browser. Only continent and country names appear, starting with Africa (Burundi, Comoros, Djibouti, Eritrea, Ethiopia, Kenya, ...) and continuing with Asia (China, Hong Kong, North Korea, Japan, ...).]
   Figure 1-7: Using XML element attributes to change the display of an XML document
   Figure 1-8 shows the HTML display of the preceding HTML-coded page and the XML document displayed in Figure 1-7 (the previous example).
2. Once again, the now-familiar population example:
      <populationInThousands>
        <world>
          <continents>
            <continent name="Africa" year1998="748,927" year2025="1,298,311" year2050="1,766,082">
              <countries>
                <country name="Burundi">
                  <year1998>6457</year1998>
                  <year2025>11569</year2025>
                  <year2050>15571</year2050>
                </country>
                <country name="Comoros">
                  <year1998>658</year1998>
                  <year2025>1176</year2025>
                  <year2050>1577</year2050>
                </country>
                ...
   Attributes can also be contained within an element as child elements. The example you just saw can be altered as in the next script, which removes all element attributes. The following script looks busier and is perhaps a little harder for the naked eye to decipher. The more important point is that the physical size of the XML document is larger because additional closing tags are introduced. In very large XML documents this can be a significant performance factor.
      <populationInThousands>
        <world>
          <continents>
            <continent>
              <name>Africa</name>
              <year1998>748,927</year1998>
              <year2025>1,298,311</year2025>
              <year2050>1,766,082</year2050>
              <countries>
                <country>
                  <name>Burundi</name>
                  <year1998>6457</year1998>
                  <year2025>
3. What Is XML?
   This chapter provides a brief summary of what XML is. The abbreviation XML stands for eXtensible Markup Language, which means that XML is extensible, or changeable. HTML (Hypertext Markup Language), by contrast, is a non-extensible language; it is the default language behind many of the web pages in your web browser, along with numerous other languages. HTML does not allow changes to web pages: HTML pages are effectively frozen in time when they are built and cannot be changed when viewed in a browser. (Internet Explorer and Netscape are browsers used for viewing websites on the Internet.) XML, on the other hand, allows web pages to be generated on the fly. XML allows changeable data to be stored in web pages, and that data can be altered at any time, not just at runtime. XML pages can also be tailored in look, feel, and content to any specific user looking at a web page at any point in time.
   In this chapter you learn:
   - What XML is
   - What XSL is
   - The differences between XML and HTML
   - Basic XML syntax
   - The basics of the XML DOM
   - Details about different browsers and XML
   - The basics of the DTD (Document Type Definition)
   - How to construct an XML document
   - Reserved characters in XML
   - How to ignore the XML parser
   - What XML namespaces are
   - How to handle XML for multiple languages
   Let's begin by comparing XML with HTML, the Hypertext Markup Langu
4. not exist at all. From a database performance perspective, avoiding attributes in favor of contained, unreferenced collections (which is what a multitude of same-named elements is) is suicidal for your applications if your database grows to even a reasonable size. It will simply be too slow.
   - Attributes are not expansion friendly: It is more difficult to change metadata than it is to change data, and it should be. If you have to change metadata, then there might be structural data design issues anyway. In a purely database environment (not using XML), changing the database model is the equivalent of changing metadata. In commercial environments, metadata is usually not altered because doing so is too difficult and too expensive. All application code depends on the database structure not changing, so changing database metadata requires application changes as well. That is why it can get expensive. From the perspective of XML, and XML in databases, you do not want to change attributes because attributes represent metadata, and that is a database modeling design issue, not a programming issue. Changing the data is much, much easier.
   Try It Out: Using XML Syntax
   The following data represents four regions containing six countries, as in the previous Try It Out sections in this chapter. In this example, currencies are now added:
      Africa       Zambia     Kwacha
      Africa       Zimbabwe   Zimbabwe Dollars
      Asia         Burma
      Australasia  Australia  Dollars
      Caribbean    Ba
5. 1.0"?>
      <WeatherForecast date="2/1/2004">
        <city>
          <name>Frankfurt</name>
          <temperature><min>43</min><max>52</max></temperature>
        </city>
        <city>
          <name>London</name>
          <temperature><min>31</min><max>45</max></temperature>
        </city>
        <city>
          <name>Paris</name>
          <temperature><min>20</min><max>74</max></temperature>
        </city>
      </WeatherForecast>
   A parser is a program that analyzes and verifies the syntax of the coding of a programming language. An XML-capable browser parses XML code to ensure that it is syntactically correct. As already mentioned, one parser function is to ensure that all starting and ending tags exist and that there is no interlocking of XML tags within the document. Avoiding interlocking means that a new tag of the same type, such as <city>, cannot be started until the ending tag of the previous city (</city>) has been found. In a browser, the XML document looks as shown in Figure 1-3. The callouts in Figure 1-3 show that, in addition to being flexible for a web page programmer, XML is even flexible for the end user. End users are unlikely to see an XML document in this raw state, but Figure 1-3 helps to demonstrate the flexibility of XML.
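The parser check described in this excerpt can be tried outside a browser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module stands in for the browser's parser; the helper name first_error and the shortened sample documents are illustrative and not from the book. It reports where the parser gives up when a <city> tag is opened before the previous one has been closed.

```python
import xml.etree.ElementTree as ET

# Well-formed: each <city> is closed before the next one starts.
GOOD = """<?xml version="1.0"?>
<WeatherForecast date="2/1/2004">
  <city><name>Frankfurt</name><temperature><min>43</min><max>52</max></temperature></city>
  <city><name>London</name><temperature><min>31</min><max>45</max></temperature></city>
</WeatherForecast>"""

# Not well-formed: the Frankfurt <city> is never closed, so the tags interlock.
BAD = """<WeatherForecast>
  <city><name>Frankfurt</name>
  <city><name>London</name></city>
</WeatherForecast>"""


def first_error(text):
    """Return the (line, column) where parsing failed, or None if well-formed."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        return err.position


print(first_error(GOOD))  # None: the document parses cleanly
print(first_error(BAD))   # a (line, column) tuple pointing at the problem
```

A parser in a browser performs the same kind of check before it displays anything, which is why a single missing closing tag prevents the whole document from being shown.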
6. TD><DIV DATAFLD="$text"></DIV></TD></TR></TABLE></BODY></HTML>
   Both of these examples look as the screen does in Figure 1-5.
   [Figure 1-5 screenshot: parts2.html in Internet Explorer, listing the parts X12334-125 (Oil Filter, 24.99), X44562-001 (Brake Hose, 22.45), and Y00023-12A (Transmission, 8000.00).]
   Figure 1-5: Using the XML tag to embed XML data islands into an HTML page
   There are always different ways to do things.
   Try It Out: XML Data Islands
   The XML document that follows represents the four regions and six countries created in the Try It Out exercise presented earlier in this chapter:
      <?xml version="1.0"?>
      <regions>
        <region>Africa</region>
        <country>Zambia</country>
        <country>Zimbabwe</country>
        <region>Asia</region>
        <country>Burma</country>
        <region>Australasia</region>
        <country>Australia</country>
        <region>Caribbean</region>
        <country>Bahamas</country>
        <country>Barbados</country>
      </regions>
   Here we will create a simple HTML page containing the preceding XML document as a d
7. - The content of elements: XML elements can have simple content (text only), attributes for the element concerned, and other child elements. Node <branch_2> in the preceding example has an attribute called name with a value of "branch two". The node <leaf_1> contains nothing. The node <leaf_2_3> contains the text string "This is a leaf".
   - Extensible elements: XML documents can be altered without necessarily altering what is delivered by an application. Examine Figure 1-7. The following is the XSL code used to apply the reduced template to get the result shown in Figure 1-7:
      <xsl:template>
        <xsl:apply-templates select="...">
          <xsl:if test="@name='Africa'"><HR/></xsl:if>
          <xsl:if test="@name='Asia'"><HR/></xsl:if>
          <xsl:if test="@name='Europe'"><HR/></xsl:if>
          <xsl:if test="@name='Latin America and the Caribbean'"><HR/></xsl:if>
          <xsl:if test="@name='North America'"><HR/></xsl:if>
          <xsl:if test="@name='Oceania'"><HR/></xsl:if>
          <xsl:value-of select="@name"/>
        </xsl:apply-templates>
      </xsl:template>
      </xsl:stylesheet>
   Looking at the preceding XSL script (yes, we have not yet covered anything about the eXtensible Stylesheet Language, XSL), the point to note is that the boldface text in the preceding code finds only the name attribute values from all elements, ignoring everyt
8. XML is a universal standard. In short, XML does not do as much processing as HTML does. XML is structure applied to data. Effectively, XML complements HTML rather than replaces it. XML was built to store and exchange data; HTML is designed to display data. XSL, on the other hand, is designed to format data.
   Try It Out: Creating a Simple XML Document
   The data shown below represents four regions containing six countries:
      Africa       Zambia
      Africa       Zimbabwe
      Asia         Burma
      Australasia  Australia
      Caribbean    Bahamas
      Caribbean    Barbados
   Here you are going to create a single-hierarchy XML document. The example shown in Figure 1-3, and its preceding matching XML data, gives you an example to base this task on. Create the XML document as follows:
   1. Use an appropriate editor to create the XML document text file (Notepad in Windows).
   2. Create the XML tag:
         <?xml version="1.0"?>
   3. Create the root tag first. The data is divided up as countries listed within continents (regions). Countries are contained within regions. There are multiple regions, so there has to be a tag that is the parent of the multiple regions. If there were a single region, there could be a single <region> tag as the root node. So create a root node such as <regions>, indicating multiple regions. The XML document now looks something like this:
         <?xml version="1.0"?>
         <regions>
         </regions>
   4. Now add each region in as a child of the <re
9. [Figure 1-4 screenshot, continued: the browser shows the rest of regions.xml, from the Australasia region through the closing </regions> tag.]
   Figure 1-4: Creating a simple XML document
   How It Works
   You opened a text editor and created an XML document file. The XML document begins with the XML tag, identifying the version of XML in use. Next, you added the root node, called <regions>. All XML documents must have a single root node. Next, you added four <region> nodes, representing four regions, into the root node. Next, you added countries into the four different regions. Last, you viewed the XML document in your browser.
   Embedding XML in HTML Pages (Data Islands)
   XML documents can also be displayed in a browser using an XML data island. An XML data island is an XML document with its data directly or indirectly embedded inside an HTML page. An XML document can be embedded inline inside an HTML page using the HTML <XML> tag. It can also be referenced with an HTML SRC attribute.
   This first example uses the XML tag to embed XML document data within an HTML page:
      <HTML>
      <BODY>
      <XML ID="xmlParts">
      <?xml version="1.0"?>
      <parts>
        <part>
          <partnumber>X12334-125</partnumber>
          <description>Oil Filter</description>
          <quantity>24.99<
10. tag is used for paragraphs. Unlike HTML, XML is extensible and thus can be extended or modified by changing or adding features. XML can have tags of its own created (customized) that are unique to every XML document. An XML document, when embedded into an HTML page, needs the predefined tags that an HTML page does, such as <HTML> and <P>, but XML can also make up its own tags as it goes along. An important restriction on the construction of XML documents, one that is not strictly applied in HTML code, is that all tags must be contained within other tags. The root node tag is the only exception. Examine the previous HTML coding example and you will see that in the source code the first paragraph does not have a terminating </P> tag (using the / or forward slash character). HTML does not care about this. XML does.
   [Figure 1-1 screenshot: shakespeare.html in Internet Explorer, titled "This is a simple HMTL page"; only the paragraph text is displayed, with no tags.]
11. [Figure 1-3 screenshot: weatherForecast.xml displayed in Internet Explorer as a collapsible tree. Callouts mark the Frankfurt <city> with all trees opened, and the London <temperature> and Paris <city> nodes with the tree closed.]
   Figure 1-3: A simple sample XML page
   The primary purpose of HTML is the display of data. XML is intended to describe data. XML is the data and thus describes itself. When HTML pages contain data, they must be explicitly generated. For every web page weather report written in HTML, a new HTML page must be created. This includes both the weather report data and all HTML tags. When regenerating an XML-based weather report, only the data is regenerated. Any templates, using something like XSL, remain the same, and those templates are probably downloaded only once. The result is that XML occupies less network bandwidth and involves less processing power. XML is also a very capable medium for bulk data transfers that are platform and database independent. This is because
12. 11569</year2025>
                  <year2050>15571</year2050>
                </country>
                <country>
                  <name>Comoros</name>
                  <year1998>658</year1998>
                  <year2025>1176</year2025>
                  <year2050>1577</year2050>
                </country>
              </countries>
            </continent>
          </continents>
        </world>
      </populationInThousands>
   From a purely programming perspective, it could be stated that attributes should not be used, for the following reasons:
   - Elements help to define structure and attributes do not.
   - Attributes are not allowed to have multiple values, whereas elements can.
   - Programming is more complex using attributes.
   - Attributes are more difficult to alter in XML documents at a later stage.
   As already stated, the preceding reasons are all sensible from a purely programming perspective. From a database perspective, and for XML in databases, the preceding points need some refinement and perhaps even some contradiction:
   - Elements define structure and attributes do not: I prefer not to put too much structure into data, particularly in a database environment, because the overall architecture of the data can become too complex to manage and maintain, both for administrators and for the database software engine. Performance can become completely disastrous if a database gets large because there is simply too much structure to deal with.
   - Attributes are not allowed mu
13. [Figure 1-8 screenshot: populationInThousands.html in Internet Explorer, rendered as a table with the columns Continent, Country, 1998, 2025, and 2050, showing Africa and the first of its countries.]
   Figure 1-8: HTML embeds the code and is less flexible than XML
   - Comments: Both XML and HTML use the same character strings to indicate commented-out code:
      <!-- This is a comment and will not be processed by the HTML or XML parser -->
   Elements
   As you have already seen in the previous section, an XML element is the equivalent of an HTML tag. A few rules apply explicitly to elements:
   - Element naming rules: The names of elements (XML tags) can contain all alphanumeric characters, as long as the name of the element does not begin with a number or a punctuation character. Also, names cannot contain any spaces. XML uses a space character to separate element names from attributes. Do not begin an element name with any combina
14. L> tag allows direct access from the HTML page to XML tags as stored in the countries.xml file. In other words, the countries.xml file is referenced from the HTML page as a referenced data island:
      <HTML>
      <HEAD><TITLE>Regions and Countries</TITLE></HEAD>
      <BODY>
      <XML ID="xmlCountries" SRC="countries.xml"></XML>
      <TABLE DATASRC="#xmlCountries">
      <TR>
        <TD><DIV DATAFLD="region"></DIV></TD>
        <TD><DIV DATAFLD="$text"></DIV></TD>
      </TR>
      </TABLE>
      </BODY>
      </HTML>
   The result will look as shown in Figure 1-6 when executed in a browser.
   [Figure 1-6 screenshot: regions.html in Internet Explorer, titled "Regions and Countries", listing Africa (Zambia, Zimbabwe), Asia (Burma), Australasia (Australia), and Caribbean (Bahamas, Barbados).]
   Figure 1-6: Creating a simple HTML page containing an XML data island
   How It Works
   You created an HTML page that references an XML document as a data island. The data island is referenced from the HTML page to the XML document using the XML tag, as defined in the HTML page. Data is scrolled through in the HTML page using an HT
15. ML table field using the DATASRC attribute of the HTML <TABLE> tag.
   Introducing the XML Document Object Model
   Another factor when using XML is that the browser used to display XML data has, built into it, a structure behind the XML data set. Look again at Figure 1-3 and you should see that everything is very neatly structured into a hierarchy. This entire structure can be accessed programmatically using something called the Document Object Model, or XML DOM. Using the XML DOM, a programmer can find, read, and even change anything within an XML document. Those changes can also be made in two fundamental ways:
   - Explicit data access: A program can access an XML document explicitly. For example, one can find a particular city by using the <city> tag and the name of the city.
   - Dynamic or generic access: A program can access an XML document regardless of its data content by using the structure of the document. In other words, a program can scroll through all the tags and the data, no matter what it is.
   That is what the XML DOM allows. An XML page can be a list of cities, weather reports, or even part numbers for an automobile manufacturer. The data set is somewhat irrelevant because the XML DOM allows direct access to the program within the browser, which displays the XML data on the screen, as shown in Figure 1-3. In other words, a program can find all the tags by passing up and down the tree of the XML DOM.
   A browser use
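The two styles of DOM access described in this excerpt can be sketched in a few lines. This is only an illustration, assuming Python's standard xml.dom.minidom module rather than a browser's DOM; the function names find_city and walk are hypothetical, and the weather document is simplified so that <min> sits directly under <city>.

```python
from xml.dom import minidom

DOC = minidom.parseString(
    "<WeatherForecast>"
    "<city><name>Frankfurt</name><min>43</min></city>"
    "<city><name>London</name><min>31</min></city>"
    "</WeatherForecast>"
)


def find_city(doc, wanted):
    """Explicit access: use the <city> tag and the city's name."""
    for city in doc.getElementsByTagName("city"):
        name = city.getElementsByTagName("name")[0].firstChild.data
        if name == wanted:
            return city
    return None


def walk(node, depth=0, out=None):
    """Generic access: visit every element without knowing any tag names."""
    if out is None:
        out = []
    if node.nodeType == node.ELEMENT_NODE:
        out.append((depth, node.tagName))
    for child in node.childNodes:
        walk(child, depth + 1, out)
    return out


# Find, read, and change a value through the DOM.
london = find_city(DOC, "London")
london.getElementsByTagName("min")[0].firstChild.data = "30"

# List the whole tree, indented by depth, whatever the data happens to be.
for depth, tag in walk(DOC.documentElement):
    print("  " * depth + tag)
```

The walk function would work unchanged on a list of part numbers or regions, which is the point of generic access: it depends on the tree structure, not on the data.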
16. age.
   Comparing HTML and XML
   XML can in some respects be considered an extensible form of HTML. This is because HTML is restrictive in terms of the tags it is allowed to use. In the following sample HTML document, all tags, such as <HTML>, are predefined:
      <HTML>
      <HEAD><TITLE>This is a simple HMTL page</TITLE></HEAD>
      <BODY>
      <P>Once more unto the breach, dear friends, once more; or close the wall up with our English dead. In peace there's nothing so becomes a man as modest stillness and humility: but when the blast of war blows in our ears, then imitate the action of the tiger; stiffen the sinews, summon up the blood, disguise fair nature with hard-favour'd rage; then lend the eye a terrible aspect.
      <P>Cry 'Havoc,' and let slip the dogs of war; that this foul deed shall smell above the earth with carrion men, groaning for burial.</P>
      </BODY>
      </HTML>
   Figure 1-1 shows the execution of this script in a browser. You can see in the figure that none of the tags appear in the browser, only the text between the tags. In the preceding sample HTML page code, the tags are all predefined and enclosed within angle brackets (< >). An HTML document always begins with the tag <HTML> and ends with the corresponding closing tag </HTML>. Other tags shown in the above script are <HEAD>, <TITLE>, <BODY>, and <P>. The <P>
17. ame>Germany</name>
          <temperature><min>22</min><max>45</max></temperature>
        </country>
        <country>
          <name>England</name>
          <temperature><min>24</min><max>39</max></temperature>
        </country>
        <country>
          <name>France</name>
          <temperature><min>22</min><max>85</max></temperature>
        </country>
      </WeatherForecast>
   Namespaces can be used to resolve this type of conflict by assigning a separate prefix to each XML document and adding the prefix to the tags in each document, as follows for the XML document containing cities:
      <?xml version="1.0"?>
      <i:WeatherForecast xmlns:i="http://www.mywebsite.com/nsforcities" date="2/1/2004">
        <i:city>
          <i:name>Frankfurt</i:name>
          <i:temperature><i:min>43</i:min><i:max>52</i:max></i:temperature>
        </i:city>
        <i:city>
          <i:name>London</i:name>
          <i:temperature><i:min>31</i:min><i:max>45</i:max></i:temperature>
        </i:city>
        <i:city>
          <i:name>Paris</i:name>
          <i:temperature><i:min>20</i:min><i:max>74</i:max></i:temperature>
        </i:city>
      </i:WeatherForecast>
   And for the XML document containing countries, you use a different prefix:
      <?xml version="1.0"
18. ample XML document does include a style sheet, making the XML document display with only the names of continents and countries. And here is an HTML equivalent of the XML document for the previous example, as shown in Figure 1-7. Notice how much more raw code there is for each population region and country:
      <HTML>
      <BODY>
      <TABLE CELLPADDING="2" CELLSPACING="0" BORDER="1">
      <TR>
        <TH BGCOLOR="silver">Continent
        <TH BGCOLOR="silver">Country
        <TH BGCOLOR="silver">1998
        <TH BGCOLOR="silver">2025
        <TH BGCOLOR="silver">2050
      </TR>
      <TR ALIGN="right">
        <TD BGCOLOR="#D0FFFF" ALIGN="left">Africa</TD>
        <TD BGCOLOR="#FFFFD0">&nbsp;</TD>
        <TD>748,927</TD>
        <TD>1,298,311</TD>
        <TD>1,766,082</TD>
      </TR>
      <TR ALIGN="right">
        <TD BGCOLOR="#D0FFFF">&nbsp;</TD>
        <TD BGCOLOR="#FFFFD0" ALIGN="left">Burundi</TD>
        <TD>6,457</TD>
        <TD>11,569</TD>
        <TD>15,571</TD>
      </TR>
      <TR ALIGN="right">
        <TD BGCOLOR="#D0FFFF">&nbsp;</TD>
        <TD BGCOLOR="#FFFFD0" ALIGN="left">Comoros</TD>
        <TD>658</TD>
        <TD>1,176</TD>
        <TD>1,577</TD>
      </TR>
      </TABLE>
      </BODY>
      </HTML>
19. ata island. Assume that the XML document is called countries.xml. Don't worry about a full path name. The example shown in Figure 1-5, and its preceding matching XML data island HTML pages, gives you an example to base this task on. Create the HTML page as follows:
   1. Use an appropriate editor to create a text file.
   2. Begin by creating the <HTML> tags for the start and end of the HTML page:
         <HTML>
         </HTML>
   3. You could add a <HEAD> tag, allowing inclusion of a title into the browser:
         <HTML>
         <HEAD><TITLE>Regions and Countries</TITLE></HEAD>
         </HTML>
   4. Add the body section for the HTML page by enclosing it between the <BODY> tags:
         <HTML>
         <HEAD><TITLE>Regions and Countries</TITLE></HEAD>
         <BODY>
         </BODY>
         </HTML>
   5. Now add the <XML> tag into the body of the HTML page; it references the externally stored XML document:
         <HTML>
         <HEAD><TITLE>Regions and Countries</TITLE></HEAD>
         <BODY>
         <XML ID="xmlCountries" SRC="countries.xml"></XML>
         </BODY>
         </HTML>
   6. Add a table field (<TABLE> tag) to the HTML page. The table field references the <XML> tag by the ID attribute, as shown in the code that follows. The SRC in the <XM
20. [Figure 1-1 screenshot, continued: the browser displays the remainder of the two paragraphs of text, with no tags visible.]
   Figure 1-1: A simple sample HTML page
   What Is XML Capable Of?
   So XML is not limited to a predefined set of tags, as HTML is, but allows the creation of customized tags. The advantages of using XML could loosely be stated as follows:
   - Flexibility with data: Any information can be placed into an XML page. The XML page becomes the data rather than the definitional container for data, as shown in Figure 1-1.
   - Web page integration: This becomes easier because building those web pages becomes more generic. Web pages are data driven (based on the content in the web page) rather than programming language driven (relying on the definition of the tags and where the tags are placed).
   - Open standards: XML is completely flexible. No single software company can control and define what tags are created, what each tag means, and where in a document tags should appear. XML is a little like a completely generic programming language.
   - Enhanced scalability and compression: When sending web pages over the Internet, XML pages can contain just data. All the coded programming tags required for HTML are not needed.
   - Irrelevant order of data: The order in which data appears in an XML page is unimpor
21. gions> tag. It should look something like this:
         <?xml version="1.0"?>
         <regions>
           <region>Africa</region>
           <region>Asia</region>
           <region>Australasia</region>
           <region>Caribbean</region>
         </regions>
   5. Next, you can add the individual countries into their respective regions by creating individual <country> tags:
         <?xml version="1.0"?>
         <regions>
           <region>Africa</region>
           <country>Zambia</country>
           <country>Zimbabwe</country>
           <region>Asia</region>
           <country>Burma</country>
           <region>Australasia</region>
           <country>Australia</country>
           <region>Caribbean</region>
           <country>Bahamas</country>
           <country>Barbados</country>
         </regions>
   6. When executed in a browser, the result will look as shown in Figure 1-4.
   [Figure 1-4 screenshot: regions.xml displayed in Internet Explorer, showing the same document as a tree.]
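The same document can also be produced by a program instead of a text editor. The sketch below is one way to do it, assuming Python's standard xml.etree.ElementTree module; the names DATA and build_regions are illustrative, and the output mirrors the flat layout shown in the listing, with each <country> following its <region> as a sibling.

```python
import xml.etree.ElementTree as ET

# The region and country data from the Try It Out exercise.
DATA = [
    ("Africa", ["Zambia", "Zimbabwe"]),
    ("Asia", ["Burma"]),
    ("Australasia", ["Australia"]),
    ("Caribbean", ["Bahamas", "Barbados"]),
]


def build_regions(data):
    """Create the single root node, then add each region and its countries."""
    root = ET.Element("regions")
    for region, countries in data:
        ET.SubElement(root, "region").text = region
        for country in countries:
            ET.SubElement(root, "country").text = country
    return root


root = build_regions(DATA)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Because every element is created through the library, the closing tags are always generated, so the result is well formed by construction.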
22. ?>
      <o:WeatherForecast xmlns:o="http://www.mywebsite.com/nsforcities" date="2/1/2004">
        <o:city>
          <o:name>Frankfurt</o:name>
          <o:temperature><o:min>43</o:min><o:max>52</o:max></o:temperature>
        </o:city>
        <o:city>
          <o:name>London</o:name>
          <o:temperature><o:min>31</o:min><o:max>45</o:max></o:temperature>
        </o:city>
        <o:city>
          <o:name>Paris</o:name>
          <o:temperature><o:min>20</o:min><o:max>74</o:max></o:temperature>
        </o:city>
      </o:WeatherForecast>
   Creating the preceding XML documents using prefixes has actually created separate elements in separate documents. This is done by using an attribute and a URL. Also, when using a namespace, you don't have to assign the prefix to every child element, only the parent node concerned. So with the first XML document previously listed, you can do this:
      <?xml version="1.0"?>
      <WeatherForecast xmlns:i="http://www.mywebsite.com/nsforcities" date="2/1/2004">
        <city>
          <name>Frankfurt</name>
          <temperature><min>43</min><max>52</max></temperature>
        </city>
        <city>
          <name>London</name>
          <temperature><min>31</min><max>45</max></temperature>
        </city>
        <city>
          <name>Paris</name>
          <temperature><min>20</min
23. ><max>74</max></temperature>
        </city>
      </WeatherForecast>
   You could also use a namespace for the weather forecast for the countries.
   XML in Many Languages
   Storing XML documents in a language other than English requires some characters not used in the English language. These characters are encoded if not stored in Unicode. Notepad allows you to store text files (in this case, XML documents) in Unicode. In Notepad on Win2K, select the Encoding option under the Save As menu option. When reloading the XML document in a browser, you simply have to alter the XML tag at the beginning of the script to indicate that an encoding other than the default is used. Win2K (SP3) Notepad allows storage as ANSI (the default), Unicode, Unicode big endian, and UTF-8. To allow the XML parser in a browser to interpret the contents of an XML document stored as UTF-8, change the XML tag as follows:
      <?xml version="1.0" encoding="UTF-8"?>
   Summary
   In this chapter you learned that:
   - HTML is the Hypertext Markup Language, and its set of tags is predetermined.
   - XML is the eXtensible Markup Language.
   - XML is extensible because its metadata (set of tags) is completely dynamic and can be extended.
   - XSL stands for eXtensible Stylesheet Language.
   - XSL allows consistent formatting to be applied to repeated groups stored in XML documents.
   - XML namespaces allow for the making of distinctions bet
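It can help to check how a namespace-aware parser actually names the elements in the variants discussed above. The following sketch assumes Python's standard xml.etree.ElementTree module, which reports each tag as {namespace}name; the helper name tags and the shortened documents are illustrative. The third variant, which uses a default namespace (xmlns without a prefix), does not appear in the excerpt and is included only for comparison.

```python
import xml.etree.ElementTree as ET

NS = "http://www.mywebsite.com/nsforcities"

# Every element carries the i: prefix.
PREFIXED = (
    f'<i:WeatherForecast xmlns:i="{NS}">'
    "<i:city><i:name>Paris</i:name></i:city>"
    "</i:WeatherForecast>"
)

# The prefix is declared on the parent but not used on any element.
DECLARED_ONLY = (
    f'<WeatherForecast xmlns:i="{NS}">'
    "<city><name>Paris</name></city>"
    "</WeatherForecast>"
)

# A default namespace (no prefix) declared on the parent.
DEFAULT = (
    f'<WeatherForecast xmlns="{NS}">'
    "<city><name>Paris</name></city>"
    "</WeatherForecast>"
)


def tags(text):
    """Return the fully qualified tag of every element, in document order."""
    return [element.tag for element in ET.fromstring(text).iter()]


for label, text in [("prefixed", PREFIXED),
                    ("declared only", DECLARED_ONLY),
                    ("default", DEFAULT)]:
    print(label, tags(text))
```

Under this parser, a prefix that is declared but never used leaves the elements outside the namespace, whereas a default namespace qualifies the parent and all of its unprefixed descendants. If the goal is to avoid repeating a prefix on every child element, the default form is the one that keeps the children in the namespace.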
24. hamas     Dollars
      Caribbean    Barbados   Dollars
   In this example, you use what you have learned about the difference between XML document elements and attributes. The following script is the XML document created in the first Try It Out section in this chapter:
      <?xml version="1.0"?>
      <regions>
        <region>Africa</region>
        <country>Zambia</country>
        <country>Zimbabwe</country>
        <region>Asia</region>
        <country>Burma</country>
        <region>Australasia</region>
        <country>Australia</country>
        <region>Caribbean</region>
        <country>Bahamas</country>
        <country>Barbados</country>
      </regions>
   You will use the preceding XML document and add the currencies for each country. Do not create any new elements in this XML document. Change the XML document as follows:
   1. Open the XML document. You can copy the existing XML text into a new text file if you want.
   2. All you do is add an attribute name-value pair to each opening <country> tag:
         <country currency="Kwacha">Zambia</country>
   3. The final XML document looks something like this:
         <?xml version="1.0"?>
         <regions>
           <region>Africa</region>
           <country currency="Kwacha">Zambia</country>
           <country currency="Zimbabwe Dollars">Zimbabwe</country>
           <region>Asia</region>
           <country>Burma</c
25. hing else. Therefore, all population numbers are discarded and only the names of continents and countries are returned. It is almost as if the XML document might as well look like that shown next, with all population numbers removed. The result in Figure 1-7 will still be exactly the same:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="791202 fig0107.xsl"?>
<populationInThousands>
   <world>
      <continents>
         <continent name="Africa">
            <countries>
               <country name="Burundi"></country>
               <country name="Comoros"></country>
            </countries>
         </continent>
      </continents>
   </world>
</populationInThousands>

Attributes

Elements can have attributes. An element is allowed to have zero or more attributes that describe it. Attributes are often used when the attribute is not part of the textual data set of an XML document, or when not using attributes is simply awkward. Store data as individual elements and metadata as attributes.

Metadata is the data about the data. In a database environment, the data is the names of your customers and the invoices you send them. The metadata is the tables you define, which are used to store records in customer and invoice tables. In the case of XML and HTML, metadata is the tags or elements (<...></...>) contained within a web page. The values between the tags are the actual data.
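To make the distinction between attributes and child elements concrete, the following is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. It assumes a trimmed version of the population document (only the 1998 figures are kept), and the names SAMPLE and describe_countries are hypothetical, chosen for illustration only. The country name is read from an attribute, whereas the population is read from the text of a child element.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative version of the population document (an assumption,
# not the full file used in the chapter).
SAMPLE = """<?xml version="1.0"?>
<populationInThousands>
  <world>
    <continents>
      <continent name="Africa">
        <countries>
          <country name="Burundi"><year1998>6457</year1998></country>
          <country name="Comoros"><year1998>658</year1998></country>
        </countries>
      </continent>
    </continents>
  </world>
</populationInThousands>
"""


def describe_countries(xml_text):
    """Return (name, population in thousands for 1998) for every country."""
    root = ET.fromstring(xml_text)
    rows = []
    for country in root.iter("country"):
        name = country.get("name")                  # value stored as an attribute
        population = country.findtext("year1998")   # value stored as element text
        rows.append((name, int(population)))
    return rows


rows = describe_countries(SAMPLE)
for name, population in rows:
    print(name, population)
```

Attribute values are fetched by name from the element itself, while element data requires locating the child first, which is one practical consequence of choosing one form of storage over the other.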
26. l, meaning that a closing tag can be included or not.

Case sensitive. XML elements are case sensitive. HTML tags are not case sensitive. The XML element <root> in the previous example is completely different than the XML element <Root> in the next example. The following example is completely different than the XML document shown in the previous point. Even though all the elements are the same, their case is different for the <Root> and <BRANCH_1> elements:

<Root>
   <BRANCH_1>
      <leaf_1></leaf_1>
   </BRANCH_1>
   <branch_2></branch_2>
</Root>

HTML does not require proper nesting of elements, such as in this example:

<FONT COLOR="red"><B><I>This is bold italic text in red</FONT></B></I>

XML, on the other hand, produces an error using the preceding code. For example, in XML the following code is invalid because </tag2> should appear before </tag1>:

<tag1><tag2>some tags</tag1></tag2>

Element attributes. Like HTML tags, XML elements can have attributes. An element attribute refines the aspects of an element. Attributes and their values are called name-value pairs. An XML element can have one or more name-value pairs, and the value must always be quoted. HTML attribute values do not always have to be quoted, although it is advisable. In the following XML document sample (the complete document is
27. l contain the root node of the XML document tree structure. The root node contains all other nodes in the XML document, either directly or indirectly through child nodes:

<root>

A single root node. An XML document must have a single root tag, such that all other tags are contained within that root tag. All subsequent elements must be contained within the root tag, each nested within its parent tag. An XML tag is usually called an element.

The ending root tag. The last line will contain the ending element for the root element. All ending elements have exactly the same name as their corresponding starting elements, except that the name of the node is preceded by a forward slash:

</root>

Opening and closing elements. All XML elements must have a closing element. Omitting a closing element will cause an error. The exceptions to this rule are the XML declaration at the beginning of the document, which declares the version of XML in use, and an optional style sheet reference:

<root>
   <branch_1>
      <leaf_1></leaf_1>
   </branch_1>
   <branch_2></branch_2>
</root>

HTML tags do not always require a closing tag. Examine the first HTML code example in this chapter, in the section "Comparing HTML and XML." The first paragraph does not have a </P> paragraph end tag. The second paragraph does have a </P> paragraph end tag. Some closing tags in HTML are optiona
28. l, the latest versions of Internet Explorer or Netscape will do nicely. Using an older version of a software tool can sometimes be asking for trouble. Using a non-mainstream browser might also be limited in scope, but this is unlikely if you use the latest version. There are, however, some very specific technologies used by specific vendors. Microsoft's Internet Explorer falls into this category. Then again, Internet Explorer is probably now the most widely used browser. So, for browser-based examples, I've used Microsoft technology. Database technology being used in this book will primarily be Oracle Database from Oracle Corporation and SQL Server Database from Microsoft. Once again, bear in mind that the focus of this book is on using XML as a database, or in other databases.

The Document Type Definition

The Document Type Definition (DTD) is a method of defining consistent structure across all XML documents within a company, an installation, and so on. In other words, it allows validation of XML documents, ensuring that standards are adhered to, even for XML data where the source of the XML data is external to the company.

From an XML-in-databases perspective, DTD could provide a method of structural validation, which is of course very important to any kind of database structure. However, it could also be superfluous and simply get in the way. It may depend on how XML documents are created or generated as being sources
29. ltiple values. If attributes need to have multiple values, then those attributes should probably become child elements anyway. This book is, after all, about XML databases and XML in databases. Therefore, it makes sense to say that an attribute with multiple values is effectively a one-to-many relationship. You send many invoices to your customers. There is a one-to-many relationship between each customer and all of their respective invoices. A one-to-many relationship is also known as a master-detail relationship. In this case, the customer is the master and the invoices are the detail structural element. The many side of this relationship is also known as a collection, or even an array in object methodology parlance.

Attributes make programming more complex. Programming is more complex when accessing attributes because code has to select specific values. Converting attributes to multiple contained elements allows programming to scan through array or collection structures. Once again, performance should always be considered as a factor. Scrolling through a multitude of elements contained within an array or collection is much less efficient than searching for exact attributes, which are within exact elements. It is much faster to find a single piece of data than to search through lots of elements when you do not even know if the element exists or not.

An XML document can contain an element, which can be empty, or the element can simply
30. not shown here), populations for continents, including the name of the continent, are contained as attributes of the <continent> element. In other words, the continent of Africa had a population of 748,927,000 people in 1998 (748 million people), where the population in thousands is the total divided by 1,000, or 748,927.

It follows that projected populations for the African continent are 1.3 billion (1,298,311) for the year 2025 and 1.8 billion (1,766,082) for the year 2050. Also in this example, the name of the country is stored in the XML document as an attribute of the <country> element:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="791202 fig0105.xsl"?>
<populationInThousands>
   <world>
      <continents>
         <continent name="Africa" year1998="748,927" year2025="1,298,311" year2050="1,766,082">
            <countries>
               <country name="Burundi">
                  <year1998>6457</year1998>
                  <year2025>11569</year2025>
                  <year2050>15571</year2050>
               </country>
               <country name="Comoros">
                  <year1998>658</year1998>
                  <year2025>1176</year2025>
                  <year2050>1577</year2050>
               </country>
               ...

XML element and attribute names cannot contain space characters (attribute values can), which is why the attributes of the <continent> element in the preceding sample XML document use names such as year1998. As shown in Figure 1-7, the previous s
31. of both metadata and data. If XML documents are manually created, then something like DTD could be very useful. Of course, once data is created, it is possible that only one round of validation is required for at least static data. Static data in a database is data that does not change very often, if at all. In a database containing customers and invoices, your customers are relatively static (their names don't change, at least not very often). Transactional or dynamic data, such as invoices, is likely to change frequently. However, it is extremely likely that any creation of XML documents would be automatically generated by application programs. Why validate with the DTD when applications generating data (XML documents) will do that validation for you?

The DTD will be covered in a later chapter in detail, where you will deal with schemas and XML Schemas. XML Schemas are a more advanced form of the DTD. XML Schemas can be used to define what and how everything is to be created in an XML document.

XML Syntax

The basic syntax rules of XML are simple but also very strict. This section goes through those basic syntax rules, one by one.

The XML tag. The first line in an XML document declares the XML version in use:

<?xml version="1.0"?>

Including style sheets. The optional second line contains a style sheet reference, if a style sheet is in use:

<?xml-stylesheet type="text/xsl" href="cities.xsl"?>

The root node. The next line wil
32. ountry>
   <region>Australasia</region>
   <country currency="Dollars">Australia</country>
   <region>Caribbean</region>
   <country currency="Dollars">Bahamas</country>
   <country currency="Dollars">Barbados</country>
</regions>

4. Figure 1-9 shows the result when executed in a browser.

Figure 1-9: Adding attributes to elements in an XML document

How It Works

All you did was to edit an XML document containing the XML tag, a single root node, and various regions of the world that contained
33. quantity>
      </part>
      <part>
         <partnumber>X44562 001</partnumber>
         <description>Brake Hose</description>
         <quantity>22.45</quantity>
      </part>
      <part>
         <partnumber>Y00023 12A</partnumber>
         <description>Transmission</description>
         <quantity>8000.00</quantity>
      </part>
   </parts>
</XML>
<TABLE DATASRC="#xmlParts">
   <TR>
      <TD><DIV DATAFLD="partnumber"></DIV></TD>
      <TD><DIV DATAFLD="text"></DIV></TD>
   </TR>
</TABLE>
</BODY>
</HTML>

HTML and XML tags can have attributes, or descriptive values. In the HTML code <IMG SRC="image.jpg" BORDER="1">, the tag is an <IMG> or image tag for referencing an image. The SRC attribute tells the HTML <IMG> tag where to find the image, and the BORDER attribute tells HTML to put a 1-pixel-wide border around the image.

The second example allows a reference to a separate XML file using the SRC attribute of the XML tag. The XML source file is stored externally to the HTML page. In this case, the parts.xml file is stored in the operating system and not stored within the HTML file, as in the previous example:

<HTML>
<BODY>
<XML ID="xmlParts" SRC="parts.xml"></XML>
<TABLE DATASRC="#xmlParts">
   <TR>
      <TD><DIV DATAFLD="partnumber"></DIV></TD>
      <
34. s the XML DOM to build a picture of an XML document, as shown in Figure 1-3. The browser contains a parser that does not care what the data is, but rather how data is constructed. In other words, the DOM contains a multiple-dimensional hierarchical array structure. That array structure allows access to all tags and all data without the programmer having to know the contents of the tags within it, or even the names of the tags. An XML document is just data, and so any data can be contained within it. When creating weather reports for people in different parts of the world, the underlying templates that make the web pages look nice are all exactly the same; only the data is different.

This is where this book comes into being. Data stored in databases as traditional relational tables can be used to create XML documents that can also be stored in a database. The XML DOM allows programmatic access into XML documents stored in a database. In other words, you can create XML documents, stuff them in a database, and then use database software to access the documents, either as a whole or in part, using the XML DOM. That is really what this book is about. It is, however, necessary to explain certain facets of XML before we get to the meat of databases and XML. You need to have a basic picture of things such as XML and XSL first.

XML Browsers and Different Internet Browsers

There are varying degrees of support for XML in different Internet browsers. In genera
35. some of their respective countries. You then proceeded to add currency attributes into some of the countries.

Reserved Characters in XML

Escape characters are characters preventing execution in a programming language or parser. Thus, the < and > characters must be escaped, using an escape sequence, if they are used in an XML document anywhere other than delimiting tags (elements). In XML, an escape sequence is a sequence of characters known to the XML parser to represent special characters. This escape sequence is exactly the same as that used by HTML. The following XML code is invalid:

<country name="Germany">West < East</country>

The preceding code can be resolved into XML by replacing the < character with the escape sequence string &lt; as follows:

<country name="Germany">West &lt; East</country>

The < and & characters are illegal as literal text in XML content because the parser interprets them as markup, and > is best escaped as well. Quotation characters of all forms are best avoided, and best replaced with an escape sequence.

Ignoring the XML Parser with CDATA

There is a special section in an XML document called the CDATA section. The XML parser ignores anything within the CDATA section, so no errors or syntax checking will be performed in the CDATA section. The CDATA section can be used to include scripts written in other languages, such as JavaScript. The CDATA section is the equivalent of a <SCRIPT></SCRIPT>
36. tag-enclosed section in an HTML page. The CDATA section begins and ends with the strings <![CDATA[ and ]]>, as in the following script example:

<SCRIPT>
<![CDATA[
function F_To_C ... return ...
]]>
</SCRIPT>

What Are XML Namespaces?

Two different XML documents containing elements with the same name, where those names have different meanings, could cause conflict. This XML document contains weather forecasts for three different cities. The <name> element represents the name of each city:

<?xml version="1.0"?>
<WeatherForecast date="2/1/2004">
   <city>
      <name>Frankfurt</name>
      <temperature><min>43</min><max>52</max></temperature>
   </city>
   <city>
      <name>London</name>
      <temperature><min>31</min><max>45</max></temperature>
   </city>
   <city>
      <name>Paris</name>
      <temperature><min>20</min><max>74</max></temperature>
   </city>
</WeatherForecast>

This next XML document also contains <name> elements, but those names are of countries and not of cities. Adding these two XML documents together could cause a semantic (meaning) conflict between the <name> elements in the two separate XML documents:

<?xml version="1.0"?>
<WeatherForecast date="2/1/2004">
   <country>
      <n
37. t</street>
         <town>Smithtown</town>
         <state>NY</state>
         <zip>11723</zip>
      </address>
      <phone>631 445 2231</phone>
   </customer>
   </customer>
</customers>

3. What kind of a web page is this?

<HTML>
<BODY>
<XML ID="xmlParts" SRC="cookies.xml"></XML>
<TABLE DATASRC="#xmlCookies">
   <TR>
      <TD><DIV DATAFLD="cookietype"></DIV></TD>
      <TD><DIV DATAFLD="cookiedescription"></DIV></TD>
   </TR>
</TABLE>
</BODY>
</HTML>

4. What does XSL do for XML?

a. Allows changes to data in XML pages at run time
b. Allows changes to metadata in XML pages at run time
c. Allows regeneration of entire XML pages at run time
d. All of the above
e. None of the above
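The syntax rules covered in this chapter (a single root element, properly nested and closed elements, quoted attribute values, and escaped reserved characters) can be checked mechanically by any XML parser. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree parser and xml.sax.saxutils.escape; the function name is_well_formed and the sample strings are hypothetical and are not taken from the exercises.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A reserved character used as literal text is rejected by the parser.
raw = '<country name="Germany">West < East</country>'

# escape() replaces &, < and > with their escape sequences.
fixed = '<country name="Germany">' + escape("West < East") + "</country>"

# Improper nesting, two root elements, and an unquoted attribute value.
bad_nesting = "<tag1><tag2>some tags</tag1></tag2>"
two_roots = "<root></root><root></root>"
unquoted = "<country currency=Kwacha>Zambia</country>"

for label, text in [("raw", raw), ("fixed", fixed), ("bad_nesting", bad_nesting),
                    ("two_roots", two_roots), ("unquoted", unquoted)]:
    print(label, is_well_formed(text))
```

A check like this only establishes well-formedness. It says nothing about whether a document follows a particular structure, which is the job of a DTD or an XML Schema.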
38. tant because it is data. Data can have things applied to it at the client side, in a browser, to change it, if you use something like the eXtensible Stylesheet Language (XSL).

What Is XSL?

XSL is a formatting language that applies templating to consistent data repetitions inside XML documents. For example, an XML page containing a listing of clients and their addresses could be formatted into a nice-looking table using an XSL style sheet that showed each different client on a single row in the table, as shown in Figure 1-2.

Figure 1-2: XSL can be used to apply templates to XML documents

The HTML equivalent of XSL is cascading style sheets (CSS).

Creating and Displaying a Simple XML Document

Following is a sample XML document. The only required predefined tag is on the first line, which describes that the version of the XML parser used is version 1.0:

<?xml version
39. tion of the letters XML, in any combination of uppercase or lowercase characters. In other words, XML_1, xml_1, xML_1, and so on are all not allowed. A hyphen (which doubles as the subtraction operator) is legal in a name, but its use is inadvisable; other operative characters, such as the addition sign (+), are not legal in a name at all. Elements least likely to cause any problems are those containing only letters and numbers. Stay away from odd characters.

Relationships between elements. The root node has only children. All other nodes have one parent node, as well as zero or more child nodes. Nodes can have elements that are related on the same hierarchical level. In the code example that follows, the following apply:

The root node element is called <root>.
The root node has two child node elements: <branch_1> and <branch_2>.
The node <branch_1> has one child element called <leaf_1_1>.
The node <branch_2> has three child elements called <leaf_2_1>, <leaf_2_2>, and <leaf_2_3>.
The nodes <leaf_2_1>, <leaf_2_2>, and <leaf_2_3> are all siblings, having the same parent node element in common (node <branch_2>).

<root>
   <branch_1>
      <leaf_1_1></leaf_1_1>
   </branch_1>
   <branch_2 name="branch two">
      <leaf_2_1></leaf_2_1>
      <leaf_2_2></leaf_2_2>
      <leaf_2_3>This is a leaf</leaf_2_3>
   </branch_2>
</root>
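The parent, child, and sibling relationships just listed can be explored programmatically. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree module; the helper names children and siblings are hypothetical. ElementTree does not store a parent pointer on each element, so the sketch builds a child-to-parent map first.

```python
import xml.etree.ElementTree as ET

DOC = """<root>
  <branch_1>
    <leaf_1_1></leaf_1_1>
  </branch_1>
  <branch_2 name="branch two">
    <leaf_2_1></leaf_2_1>
    <leaf_2_2></leaf_2_2>
    <leaf_2_3>This is a leaf</leaf_2_3>
  </branch_2>
</root>"""

root = ET.fromstring(DOC)

# Map every element to its parent; the root never appears as a key
# because it has no parent.
parent_of = {child: parent for parent in root.iter() for child in parent}


def children(tag):
    """Return the tag names of the direct children of the named element."""
    node = root if root.tag == tag else root.find(f".//{tag}")
    return [child.tag for child in node]


def siblings(tag):
    """Return the tag names of elements sharing a parent with the named element."""
    node = root.find(f".//{tag}")
    return [other.tag for other in parent_of[node] if other is not node]


print(children("root"))       # the two branches
print(children("branch_2"))   # the three leaves
print(siblings("leaf_2_1"))   # the other two leaves
```

The same traversal works regardless of what the elements are called, which reflects the point that a parser cares about how a document is constructed rather than what it contains.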
40. ween different XML documents that have the same elements.
XML can utilize character sets of different languages by using Unicode character sets.
The XML DOM (Document Object Model) allows run-time (dynamic) access to XML web pages.
Different browsers and browser versions will behave differently with XML. For examples in this book, I've used Microsoft Internet Explorer version 6.0, running in Win2K (Windows 2000).
The DTD (Document Type Definition) allows enforcement of structure across XML documents.

This chapter has given you a brief picture of what XML is, including a comparison with HTML and a brief summary of XSL. HTML creates web pages with fixed data and metadata. XML allows creation of web pages with adaptable data and metadata content. The next chapter examines the XML DOM, or the Document Object Model for XML. The XML DOM, like the HTML DOM, allows dynamic (run-time) access to both the data and metadata in a web page.

Exercises

1. Which line in this HTML script contains an error?

<HTML> <HEAD><TITLE>Title</TITLE></HEAD> <BODY> <P>This is a paragraph <P>This another paragraph</P> </BODY> </HTML>

a. 1
b. 3
c. 864
d. 5
e. None of the above

2. How many errors are present in this XML script?

<xml>
<customers>
   <customer>
      <name>Zachary Smith</name>
      <address>
         <street>1 Smith Stree
