Example 192/199 Proposal
Major: Computer Science and Engineering Major
Course: ECS 199, 3 units, Spring 2001
Instructor: Michael Gertz
MAINTAINING INTEGRITY CONSTRAINTS ON XML DATA IN ORACLE 8i
The objective of this project is to determine how database integrity constraints can be defined and maintained on XML data stored in Oracle 8i.
XML is fast becoming an important method for exchanging data across the Internet. As XML becomes more popular, however, there will need to be a way to ensure that the data being exchanged is valid and consistent with a defined schema. Part of ensuring consistent data is maintaining and implementing integrity constraints such as keys, cardinality ratios and other important business rules. There is currently no specification as to how such integrity constraints are enforced in a database management system, such as Oracle 8i, that stores such XML data.
In doing this research project, we hope to achieve the following goals:
1. To learn how XML DTDs and XML data can be stored in Oracle 8i.
2. To use this information to determine how integrity constraints on this data, such as primary keys, foreign keys and cardinality ratios, can be specified and stored.
3. To determine a method for enforcing these integrity constraints, possibly through the use of triggers.
4. To develop a simplified syntax for defining integrity constraints in an XML document.
5. To develop an easy and efficient method for automating the translation from integrity constraints described in an XML document to enforceable integrity constraints within the database management system.
We will begin the project by studying how XML data is stored in Oracle 8i. Our first step will be to consult online documentation from Oracle regarding its treatment of XML data, and from the World Wide Web Consortium (W3C) regarding the XML specifications. We will also use "Building Oracle XML Applications" by Steve Muench as a resource on Oracle and XML.
We will then study possible methods for storing integrity constraints on that data, similar in nature to the way integrity constraints are stored for ordinary relational data. We expect that Oracle may already offer some simple, but not extensive, method for enforcing integrity constraints. This method, however, may not treat XML data any differently from other data stored in the database. For instance, storing an entire XML document as a text file in the database and parsing the file to confirm its well-formedness is not enough to count as treating the content as XML data. Therefore, we will look for more thorough methods that treat XML data as a distinct type of data within the database. One method in particular that we will focus on is mapping the XML document structure directly into a table, or a set of tables, in the database. Another important aspect we will consider is critical operations. If we are to enforce integrity constraints, we will need to assess the actions that could cause the XML data to become inconsistent. Almost certainly, updating an XML document will be one such critical operation that we will need to handle.
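As a rough illustration of the table-mapping approach mentioned above, the following sketch shreds a small XML document into two relational tables. It is only a sketch under stated assumptions: it uses Python with SQLite as a stand-in for Oracle 8i, and the sample document, the element and attribute names, and the `shred` function are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative.
SAMPLE_XML = """
<library>
  <author id="a1"><name>Ada</name></author>
  <author id="a2"><name>Grace</name></author>
  <book isbn="111" author="a1"><title>Notes</title></book>
  <book isbn="222" author="a2"><title>Compilers</title></book>
</library>
"""

def shred(xml_text, conn):
    """Map each <author> and <book> element to one row in a matching table."""
    # SQLite only checks foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE author (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE book ("
        " isbn TEXT PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " author_id TEXT NOT NULL REFERENCES author(id))"
    )
    root = ET.fromstring(xml_text.strip())
    for a in root.findall("author"):
        conn.execute(
            "INSERT INTO author VALUES (?, ?)",
            (a.get("id"), a.findtext("name")),
        )
    for b in root.findall("book"):
        conn.execute(
            "INSERT INTO book VALUES (?, ?, ?)",
            (b.get("isbn"), b.findtext("title"), b.get("author")),
        )
    conn.commit()

conn = sqlite3.connect(":memory:")
shred(SAMPLE_XML, conn)
```

Once the document is stored this way, the key constraints are ordinary relational constraints: a later attempt to insert a book whose author attribute does not match any author row is rejected by the database itself rather than by application code.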
We will study which, if any, of the methods we discover can be reasonably implemented and find the most efficient one. This will involve examining the theory behind these methods and considering the costs, in terms of speed, reliability, and ease of use, associated with each method. We will also create some sample XML documents and test whether the methods are acceptable in practice.
We will also study ways in which integrity constraints may be enforced on XML data stored in the database. Our initial assumption is that database triggers can be written for all integrity constraints and that they can be activated any time that the XML data is accessed. We will, however, also consider other features already built into Oracle, and possibly the creation of a new feature that Oracle does not currently support.
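To make the trigger idea concrete, here is a minimal sketch of a cardinality constraint enforced by a trigger. It again assumes SQLite driven from Python rather than Oracle 8i, whose PL/SQL trigger syntax differs. The table, the limit of two books per author, and the `try_insert` helper are hypothetical choices made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book (isbn TEXT PRIMARY KEY, author_id TEXT NOT NULL);

-- Cardinality rule (assumed for illustration): at most 2 books per author.
CREATE TRIGGER book_max_per_author
BEFORE INSERT ON book
WHEN (SELECT COUNT(*) FROM book WHERE author_id = NEW.author_id) >= 2
BEGIN
    SELECT RAISE(ABORT, 'cardinality violated: author already has 2 books');
END;
""")

def try_insert(isbn, author_id):
    """Return True if the row was accepted, False if the database rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO book VALUES (?, ?)", (isbn, author_id))
        return True
    except sqlite3.DatabaseError:
        return False
```

With this trigger in place, the first two calls to `try_insert` for the same author succeed and a third is rejected. A complete solution would need a matching trigger for updates that change `author_id`, since updates are among the critical operations that can make the data inconsistent.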
Finally, based on the way integrity constraints are stored and enforced, we will study ways in which such integrity constraints may be specified in an XML document. This will involve looking at various XML schema languages, such as XML Schema and Schematron, to find an efficient, simple method for defining primary key, foreign key and cardinality constraints in the XML Document Type Definition. This research may also have long-term consequences. Most notably, we hope to begin a process by which more expressive schema languages are developed and XML data is stored and maintained more efficiently within a database management system. We also hope that this research will ultimately lead to a way of transmitting data across the world efficiently, easily, and safely.
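Returning to the specification step, the sketch below shows one way a constraint description written in XML might be translated automatically into enforceable table definitions. The `<constraints>` vocabulary is invented for this example and is not XML Schema, Schematron, or any other standard. The `to_ddl` function is likewise hypothetical, and SQLite with Python stands in for Oracle 8i.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical constraint vocabulary; not part of any XML standard.
CONSTRAINTS_XML = """
<constraints>
  <table name="author">
    <column name="id"/>
    <column name="name"/>
    <key column="id"/>
  </table>
  <table name="book">
    <column name="isbn"/>
    <column name="author_id"/>
    <key column="isbn"/>
    <keyref column="author_id" refers="author" refcolumn="id"/>
  </table>
</constraints>
"""

def to_ddl(xml_text):
    """Translate each <table> description into one CREATE TABLE statement."""
    statements = []
    for table in ET.fromstring(xml_text.strip()).findall("table"):
        # Every column is typed TEXT to keep the sketch small.
        parts = [f'{c.get("name")} TEXT' for c in table.findall("column")]
        for k in table.findall("key"):
            parts.append(f'PRIMARY KEY ({k.get("column")})')
        for r in table.findall("keyref"):
            parts.append(
                f'FOREIGN KEY ({r.get("column")}) '
                f'REFERENCES {r.get("refers")} ({r.get("refcolumn")})'
            )
        # Names are interpolated unquoted, so the input is assumed trusted.
        statements.append(
            f'CREATE TABLE {table.get("name")} ({", ".join(parts)})'
        )
    return statements

ddl_conn = sqlite3.connect(":memory:")
ddl_conn.execute("PRAGMA foreign_keys = ON")
for statement in to_ddl(CONSTRAINTS_XML):
    ddl_conn.execute(statement)
```

The design choice here is to keep the XML description declarative and let the translator decide how each constraint is enforced. Keys map directly to built-in constraints, while a cardinality rule would more likely be translated into a trigger.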
As this project focuses mainly on database issues, I will draw a great deal of knowledge from my work in course 165A. This project will give me a chance to take the basic principles of relational databases and apply them to a brand new type of data. I hope that this application will give me a broader understanding of these principles, in that I will learn new ways to use similar ideas. Less directly, I will use the skills I have learned in course 140A to help me learn and understand the syntax and semantics of XML, which is a new computer language to me.