Introduction
A common problem with many existing XML databases is that they ignore the semantics of the terms
they store when answering queries. First, they cannot capture inter-term
lexical relationships. Second, they use no notion of similarity when answering queries.
As a result, although an ordinary XML query engine answers queries with 100% precision, its recall is
relatively low.
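To make the precision and recall claim concrete, the following minimal sketch runs an exact-match lookup over a handful of hypothetical author records. The records, the `precision_recall` helper, and the choice of which records count as relevant are all assumptions made for illustration; none of them come from the TOSS system or its experiments.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a set of returned vs. relevant record ids."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 1.0
    recall = len(hits) / len(relevant) if relevant else 1.0
    return precision, recall


# Hypothetical records: the same author spelled three different ways.
records = {
    1: "Jeffrey D. Ullman",
    2: "J. D. Ullman",
    3: "Jeff Ullman",
    4: "Jennifer Widom",
}
# Assumed ground truth: records 1-3 all refer to the queried author.
relevant = {1, 2, 3}

# Exact string matching, as an engine with no notion of similarity would do.
exact = {rid for rid, name in records.items() if name == "Jeffrey D. Ullman"}

precision, recall = precision_recall(exact, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Every record the exact match returns is correct, so precision is perfect, but two of the three relevant records are missed because their spellings differ.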
We introduce the novel concept of a similarity enhanced ontology (SEO for short) and extend the TAX algebra
to operate with respect to an SEO. We call the resulting algebra TOSS. Our approach has four parts:
- Use an ontology to capture inter-term lexical relationships. An ontology extended semistructured instance (OES instance)
consists of a semistructured instance and an associated ontology.
- Merge the ontologies associated with the instances when answering queries that span multiple OES instances.
- Apply a similarity enhancement operation to the merged ontology to capture the notion of similarity. The result is
called a similarity enhanced ontology.
- Develop the TOSS algebra to answer queries with respect to an SEO.
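The steps above can be sketched as a toy pipeline. This is an illustration under strong simplifying assumptions, not the TOSS definitions: an ontology is reduced to a dictionary mapping each term to a set of related terms, merging is a plain union, and similarity is measured with Python's `difflib.SequenceMatcher` against an arbitrary threshold of 0.8. The function names `merge_ontologies`, `enhance`, and `expand`, and the sample terms, are hypothetical.

```python
from difflib import SequenceMatcher


def merge_ontologies(a, b):
    """Union two ontologies given as {term: set of related terms}."""
    merged = {}
    for ontology in (a, b):
        for term, related in ontology.items():
            merged.setdefault(term, set()).update(related)
            for other in related:
                merged.setdefault(other, set())  # every term gets an entry
    return merged


def similarity(s, t):
    """Case-insensitive string similarity in [0, 1] (assumed measure)."""
    return SequenceMatcher(None, s.lower(), t.lower()).ratio()


def enhance(ontology, threshold=0.8):
    """Link every pair of terms whose similarity reaches the threshold."""
    terms = sorted(ontology)
    enhanced = {t: set(ontology[t]) for t in terms}
    for i, s in enumerate(terms):
        for t in terms[i + 1:]:
            if similarity(s, t) >= threshold:
                enhanced[s].add(t)
                enhanced[t].add(s)
    return enhanced


def expand(term, seo):
    """Terms a query on `term` should also match under the enhanced ontology."""
    return {term} | seo.get(term, set())


# Hypothetical ontologies attached to two bibliographic instances.
dblp = {"conference": {"SIGMOD Conference", "VLDB"}}
sigmod = {"conference": {"ACM SIGMOD Conference"}}

seo = enhance(merge_ontologies(dblp, sigmod))
print(expand("SIGMOD Conference", seo))
```

In this sketch a query on "SIGMOD Conference" is widened to include "ACM SIGMOD Conference", which an exact-match engine would miss, while "VLDB" stays excluded because it falls below the similarity threshold.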
We have built a prototype of TOSS on top of the Apache Xindice XML database system.
The system architecture can be found here. We experimentally evaluate the
TOSS system on the DBLP and SIGMOD bibliographic databases and demonstrate that TOSS provides higher
quality answers than ordinary XML query engines.
People
Edward Hung, Yu Deng, V.S. Subrahmanian
Publications
- Edward Hung, Yu Deng and V.S. Subrahmanian. TOSS: An Extension of TAX with
Ontologies and Similarity Queries. In Proceedings of the 23rd ACM
SIGMOD International Conference on Management of Data, Paris, France,
June 13-18, 2004.
Presentations
- Edward Hung, "TOSS: An Extension of TAX with
Ontologies and Similarity Queries," the 23rd ACM
SIGMOD International Conference on Management of Data, June 2004. (powerpoint)