Thursday, 30 August 2012

XML Database Mapping Using XSD (XML Schema)

It is not uncommon for developers working on software or Web applications to change data storage and modelling approaches. An example of this would be deciding to switch from XML data to a relational database. The decision to shift data modelling technologies may be a result of changes in the aims or other technologies being used within a project.

Changing from XML to a relational database system is not typically a difficult task. The main activity involved is mapping the data structures from one system to the other. The data structure includes the data items and the relationships between them. Ultimately the database will be defined using SQL but the initial stage involves creating a design, an abstract model of the database determining the structures its data content will be held in.

If an XML data source is accompanied by an XML Schema Definition (XSD) determining its structure, this can be used as the primary source for mapping the data to a relational system. XML Schemas define the structures and elements an XML data source can contain, including the names and datatypes of elements, their attributes and the relationships between them.

Techniques For Mapping

There is no set procedure for mapping XML to a relational structure, although there are some common approaches. The process should be accompanied by an informed sense of the system for which the data is being modelled, with the overall purpose of this system in mind when mapping decisions are being made.

Element To Table

Often the most logical step will be to translate XML elements into relational database tables. This is not always the best approach, it does depend on the data in question. If an XML element is mapped into a database table, its columns may become the element attributes, optionally the element content or children.

Element To Column

In some cases XML elements may become columns in other database tables. This is most likely the case with simple elements, which simply contain a text string, easily translatable into a column in a system such as MySQL. If an element contains further elements or attributes, the possibility of mapping only into a column will be less likely.

Attribute To Column

Since attributes are single, simple pieces of data, it will generally make most sense to translate them into columns in the database tables corresponding to their given elements. An alternative may be in cases where there are only a set, finite number of possible attribute values, in which case this may be reflected in different tables for the elements having each attribute type.

For example, if an element called "thing" has an attribute "type" and this attribute can only be "red" or "green" it may be reflected by having two tables, one for "red things" and one for "green things". However, in most cases it will be more sensible to have a "thing" table with a "type" column in it.

Relationships

Representing relationships in a data set is one of the more challenging aspects of mapping from XML to relational database using an XSD. Relationships in XML are implemented by structure, for example with an element having child elements. In relational database systems such relationships are reflected using "keys" linking different tables.

For example, consider the following XML structure:


Mary
Jim
Tony


The corresponding XSD excerpt:








This could be reflected in a relational database using Foreign Keys. The relationship is "one to many" in that one family may be associated with multiple members, but each member will typically only be part of a single family. In this case, a relational database could contain a "family" table with a "name" column and a unique "ID" column representing the table's Primary Key. An additional "member" table could include a "name" column (containing "Mary", "Jim" or "Tony" in this case). The "member" table could use a Foreign Key by including a column called "familyID" in which the "ID" for the member's family is recorded, linking the two tables in a way that accurately represents their relationship.

Finally

Ultimately, there is no one correct way to translate XML indicated by an XSD into a relational database design. An understanding of both technologies, together with information about the project the data is being used for, will make a successful outcome far more likely.

See also:

No comments:

Post a Comment