Native SGML/XML Systems

Native SGML/XML systems are designed specifically for the management of SGML/XML data. Such systems should include capabilities to define, create, store, validate, manipulate, publish, and retrieve SGML/XML documents and their parts. Some of the native systems, such as Astoria and Information Manager, are comprehensive document management systems with front-ends for users to work with documents. Others, such as SIM and Tamino, are software packages intended for building applications for the management of SGML/XML data. A few systems, especially those that support semi-structured data, such as Lore, XYZFind, and dbXML, provide native support for tree-structured data but are limited in their support of rich XML documents because they do not rely extensively on DTDs or other document type definitions.

The data model: There is no single well-defined data model for XML data. The lack of a well-defined universal conceptual model causes problems in the native systems: for example, the underlying model for XML data is not explicitly defined in Astoria or Tamino, and system-specific notions and models have been invented in SIM. Many of the systems consist of packages of tools that do not share a common data model and may be limited in the kinds of XML documents they are able to store and manipulate. Unfortunately, because the systems do not highlight the details of the data model, such inconsistencies and constraints are often difficult to detect.
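
The effect of tools not sharing a data model can be seen even with general-purpose parsers. The sketch below is only an illustration using two Python standard-library parsers, not any of the systems named above; the `SOURCE` document is made up for the example. The same text yields different trees because one model keeps comment nodes and the other, by default, does not.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Made-up sample document containing a comment.
SOURCE = "<memo><!-- reviewed --><to>Staff</to></memo>"

# ElementTree's default parser builds a tree of elements only;
# comments are not represented in the result.
et_root = ET.fromstring(SOURCE)
et_children = [child.tag for child in et_root]

# The DOM model represents comments as nodes alongside elements.
dom_root = minidom.parseString(SOURCE).documentElement
dom_children = [node.nodeName for node in dom_root.childNodes]

print(et_children)
print(dom_children)
```

A document stored through the first model and retrieved through the second would not round-trip exactly, and nothing in either interface signals the difference.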

Data definition: The capability to define document types is an important characteristic of XML, and we consider the document type definition capability an essential feature in systems of this category. This requirement severely reduces the utility of semi-structured approaches for managing persistent XML resources. The systems originally developed for SGML are able to use DTDs directly as the document type definition, with no translation to some other form of schema. Additional definitions may be needed, however, to support flexible manipulation and efficient implementation. In Astoria an important extension is provided by components, which form the data unit for many operations. For example, access rights are granted at the component level, components can have variants and versions, and simultaneous update to a document by several users is controlled at the component level.
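
The component idea can be sketched as a small data structure. This is a hypothetical model written for illustration, not Astoria's actual API: the `Component` class, its fields, and the `check_out`/`check_in` methods are all assumptions. It shows access rights, version history, and update control each attached to a component rather than to the whole document.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Component:
    """Hypothetical unit of storage: one addressable part of a document."""
    name: str
    versions: list = field(default_factory=list)  # content history, oldest first
    writers: set = field(default_factory=set)     # users allowed to update
    locked_by: Optional[str] = None               # current check-out holder

    def check_out(self, user):
        # Access rights are checked per component.
        if user not in self.writers:
            raise PermissionError(f"{user} may not edit {self.name}")
        # Only one user at a time may hold a given component.
        if self.locked_by not in (None, user):
            raise RuntimeError(f"{self.name} is checked out by {self.locked_by}")
        self.locked_by = user

    def check_in(self, user, content):
        if self.locked_by != user:
            raise RuntimeError(f"{user} has not checked out {self.name}")
        self.versions.append(content)  # each check-in adds a new version
        self.locked_by = None


# Two components of the same document, edited by different users at once.
intro = Component("intro", versions=["<intro>Draft</intro>"], writers={"ann"})
body = Component("body", versions=["<body>Draft</body>"], writers={"ann", "bob"})

intro.check_out("ann")
body.check_out("bob")
body.check_in("bob", "<body>Revised</body>")
```

Because the lock lives on the component, two users can work on different parts of one document concurrently, while a second check-out of the same component is refused.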

Data manipulation: The lack of a standardized XML query language has led to various system-specific query languages. In addition, the simplified data models restrict query capabilities. For example, since Tamino does not store information about attribute types, queries utilizing IDs and IDREFs are impossible. The response to a Tamino query is an XML document containing the query result as tagged text, plus metadata related to the query (e.g., date and time). Thus the query language cannot be applied directly to query results unless a Tamino schema defines them as part of the database. In content management systems such as Astoria and Information Manager, parts of documents can be updated by structure editors integrated with the systems. In both systems, style sheets can be associated with documents in the integrated editors, and transformations can be defined by means of style sheets. Both systems also offer some capabilities for document assembly. In Tamino, database update is applied at the document level. The data storage mechanism for XML data (called X-Machine) has an associated programming language that includes commands for inserting and deleting documents. XSL is used to transform XML documents to HTML for Web publishing, but there is no additional support for defining transformations.
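
Why missing attribute types rule out ID/IDREF queries can be shown with a minimal sketch. It does not reproduce Tamino's behavior or query language; the sample document, the `ID_ATTRS` and `IDREF_ATTRS` sets (standing in for what a DTD's ATTLIST declarations would provide), and the helper functions are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample document with one cross-reference.
DOC = """
<report>
  <section id="s1"><title>Scope</title></section>
  <section id="s2"><title>Method</title><xref target="s1"/></section>
</report>
"""

# Attribute typing that a DTD would normally declare (ID / IDREF).
# Each entry is an (element name, attribute name) pair; assumed for this example.
ID_ATTRS = {("section", "id")}
IDREF_ATTRS = {("xref", "target")}


def build_id_index(root, id_attrs):
    """Map each ID value to its element, using the declared ID attributes."""
    index = {}
    for elem in root.iter():
        for name, value in elem.attrib.items():
            if (elem.tag, name) in id_attrs:
                index[value] = elem
    return index


def resolve_refs(root, id_attrs, idref_attrs):
    """Return (referring element, target element) pairs for every IDREF."""
    index = build_id_index(root, id_attrs)
    pairs = []
    for elem in root.iter():
        for name, value in elem.attrib.items():
            if (elem.tag, name) in idref_attrs and value in index:
                pairs.append((elem, index[value]))
    return pairs


root = ET.fromstring(DOC)
typed = resolve_refs(root, ID_ATTRS, IDREF_ATTRS)  # type information available
untyped = resolve_refs(root, set(), set())         # type information discarded

print(len(typed), len(untyped))
```

With the type information, the `xref` element can be followed to the section it points at. Without it, `id` and `target` are ordinary strings, and the system has no basis for treating one as a reference to the other.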
