Sung-Hyuk Kim
ksh@sookmyung.ac.kr
Library & Info. Sci. Dept
Sookmyung W. Univ.
Seoul, Korea
Sang-Wan Han
swhan@yonsei.ac.kr
Library & Info. Sci. Dept
Yonsei Univ
Seoul, Korea
In addition, we have implemented a way of combining browsing and querying documents.
Our approach is directly motivated by the querying mechanism of PESTO [CHMW96], which in turn originates from QBE [Zlo77]. Extending our previous work [YK96], we propose "three-phase multimedia document retrieval" (3PR for short), which combines database techniques with intelligent multimedia retrieval. In the first phase, queries are constituted with metadata about a document. As in QBE, a query is posed by filling frame slots with the values to select. Each slot corresponds to a metadata item; by metadata we mean external information about document instances. Examples of metadata are document catalog information such as publisher, document type, subject, call number, and publication year. The user fills in only those first-phase slots that are needed for the query.
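The first-phase slot filling described above can be sketched as follows. This is a minimal illustration, not the system's implementation: the function name `build_metadata_query`, the slot names, and the table alias `d` are assumptions made for the example; only the idea of turning filled slots into selection conditions comes from the text.

```python
from typing import Dict, Optional


def build_metadata_query(slots: Dict[str, Optional[object]],
                         table: str = "article") -> str:
    """Turn the filled slots of a (hypothetical) metadata frame into an
    SQL-like query string. Slots left as None impose no condition."""
    conditions = []
    for name, value in slots.items():
        if value is None:
            continue  # the user did not fill this slot
        if isinstance(value, str):
            escaped = value.replace("'", "''")  # escape embedded quotes
            conditions.append(f"d.{name} = '{escaped}'")
        else:
            conditions.append(f"d.{name} = {value}")
    query = f"SELECT * FROM {table} d"
    if conditions:
        query += " WHERE " + " AND ".join(conditions)
    return query


# Hypothetical frame: slot names mirror the catalog metadata listed above.
frame = {
    "publisher": None,
    "document_type": "journal",
    "subject": None,
    "publication_year": 1997,
}
print(build_metadata_query(frame))
```

Only the two filled slots contribute conditions; an entirely empty frame yields an unrestricted selection.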
Then, in the second phase, queries can be constituted with any combination of the ELEMENTs defined in the corresponding document type definition (DTD). Part of the DTD for Korean technical journals (or articles) is provided in a later section. Document instances are populated in the database according to the corresponding DTD, and the ELEMENTs in the DTD are used to form the slots of the second phase. The ELEMENTs in the Korean technical journals DTD include the abstract, chapters, sections, table and figure captions, references, etc.
Finally, a semantic approach lets sophisticated users ask queries. In this third phase, queries take the user's subjective interests or meanings into account. Subjective meanings include the user's annotations or heuristics, and can be represented in IF-THEN rule format.
The contribution of this paper is a framework for a multimedia document retrieval facility that goes well beyond current text-based document retrieval technology. The framework not only includes current technologies but also extends them toward metadata-based, content-based, and semantic-based queries. 3PR makes it possible for users to constitute queries over a large number of documents efficiently.
In this paper, we assume that documents, whether text-based or multimedia, are structurally tagged in SGML or, more generally, HyTime, and that they are defined in an object-relational database (ORDB). Queries are constituted to retrieve SGML/HyTime instances, which may be in text, image, video, or audio form.
The remainder of this paper is organized as follows. Section 2 describes related work, especially ongoing research funded in Korea. Section 3 models a document database in the object-relational model, and Section 4 describes the 3PR query model for multimedia document databases. Finally, Section 5 presents conclusions.
User-friendly queries have been prototyped in research database systems. Zloof developed a query language in which a query is specified by an example in a form [Zlo77]. Stonebraker and Kalash developed a browser for relational databases [SK82]. Rowe and Shoens developed a form-based query system [RS82]. Motro et al. developed a browser for relational databases [MDT88]. Carey et al. developed a form-based language combining queries and browsers in object-oriented databases [CHMW96].
This paper is closest in its browsing style to PESTO; we were heavily influenced by PESTO's browsing facilities. However, unlike PESTO, where frame slots are explicitly indicated by the system, we develop DTD-based frame slots that depend on the documents available in the databases.
In digital library research, Clifton et al. developed a document query model that exploits a filtering mechanism [CGMB95]. Schatz et al. developed a document thesaurus for use in query processing [SJC96, SMC+96]. Our work is similar to theirs in that a document thesaurus is used. Unlike theirs, we also develop a database thesaurus of database operators for similarity matching. Kobsa et al. [KNF97] and Vassileva [Vas97] describe work on adaptive hypertext and hypermedia systems that are tailored to a user's knowledge, expertise, interests, and abilities.
Although many digital libraries are being constructed, only a few are in service, and only a couple of them are capable of querying structurally designed and marked-up documents. The MIRAGE system allows users to retrieve and browse multimedia information [Mya96]. Kim and Yoon have prototyped a new information retrieval technique at Sookmyung W. University [YK96].
Each document instance contains text or multimedia data marked up according to a DTD. Several data models exist for multimedia-enriched databases, ranging from the relational data model to the object-oriented data model. Within this spectrum, we propose an object-relational data model, implemented in the Illustra ODBMS. Each datum in a multimedia document database is assumed to be a pair <data_type, data>. The data type of a datum can be integer, text, audio, image, video, etc. Each tuple can represent a composite data type, whose value is another tuple, and a complex data type, whose value consists of more than one datum of the same type.
For example, an article can be defined as in Figure 2. The section attribute of an article refers to one or more sections, each of which in turn contains a title and bodies. Likewise, the body attribute of a section refers to one or more figures and/or paragraphs. The section attribute is of a composite object type, in that its value is another tuple or a set of tuples, and it is also of a complex data type, in that it is set-valued.
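The nesting described above can be approximated in ordinary code. The sketch below is an assumption-laden illustration, not a transcription of Figure 2: the class and field names (`Datum`, `Article`, `Section`, `Figure`, `Paragraph`, `caption`, `content`) are chosen for the example. It shows the <data_type, data> pair, a composite type (a tuple whose value is another tuple), and a complex type (a set-valued attribute).

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Datum:
    """The <data_type, data> pair assumed for every stored value."""
    data_type: str   # e.g. "integer", "text", "audio", "image", "video"
    data: object


@dataclass
class Paragraph:
    text: Datum


@dataclass
class Figure:
    caption: Datum
    content: Datum


@dataclass
class Section:
    title: Datum
    # complex (set-valued) attribute: one or more figures and/or paragraphs
    body: List[Union[Figure, Paragraph]] = field(default_factory=list)


@dataclass
class Article:
    title: Datum
    # composite and complex: each value is itself a tuple, and there may be many
    section: List[Section] = field(default_factory=list)


def figure_types(article: Article) -> List[str]:
    """Collect the data types of all figure contents in an article."""
    return [item.content.data_type
            for sec in article.section
            for item in sec.body
            if isinstance(item, Figure)]


article = Article(
    title=Datum("text", "Autumn mountains"),
    section=[Section(
        title=Datum("text", "Introduction"),
        body=[Paragraph(Datum("text", "Mountains change color in fall.")),
              Figure(Datum("text", "A ridge in fall"), Datum("image", b"\x00"))],
    )],
)
print(figure_types(article))
```

Walking `article.section` and then each `body` corresponds to following the dot notation through the nested tuples.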
Figure 1: A HyTime DTD for Article Documents
Figure 2: An Object Relational Data Model Example
Multimedia data can be represented by meta-attributes, logical-attributes, and semantic-attributes. Meta-attributes carry information that is represented externally, without reference to any internal contents. Logical-attributes are typical database attributes that represent the internal contents of multimedia data. Semantic-attributes are user annotations about the multimedia data and are therefore neither meta- nor logical-attributes. For example, for the book "History of Movies," the author or publisher attribute is a meta-attribute, the Oscar-awarded-movie name attribute is a logical-attribute, and the annotation attribute, a rule stating that Oscar-awarded movies are well marketed, is a semantic-attribute.
Since a multimedia datum may consist of one or more multimedia component objects, the three attributes defined above can be classified into six attribute types: Multimedia Meta-attributes, Object Meta-attributes, Multimedia Logical-attributes, Object Logical-attributes, Multimedia Semantic-attributes, and Object Semantic-attributes. The prefix indicates whether the attribute describes the multimedia datum as a whole or one of its component objects. In Figure 2, the multimedia type is defined over one or more objects that are in turn multimedia types.
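The six attribute types are the cross product of two scopes and three kinds, which a short sketch makes explicit. The enum and function names here are illustrative assumptions; only the resulting six labels come from the text.

```python
from enum import Enum
from itertools import product


class Scope(Enum):
    MULTIMEDIA = "Multimedia"   # the multimedia datum as a whole
    OBJECT = "Object"           # one of its component objects


class Kind(Enum):
    META = "Meta"
    LOGICAL = "Logical"
    SEMANTIC = "Semantic"


def attribute_type_name(scope: Scope, kind: Kind) -> str:
    """Build a label such as 'Multimedia Meta-attributes'."""
    return f"{scope.value} {kind.value}-attributes"


# Iterate kinds in the outer position so the order matches the list above.
ATTRIBUTE_TYPES = [attribute_type_name(scope, kind)
                   for kind, scope in product(Kind, Scope)]

for name in ATTRIBUTE_TYPES:
    print(name)
```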
For example, logical-attributes can be defined in an object-oriented data model as in Figure 3. Semantic-attributes will be defined as triggers or rules available in object-oriented data models. Figure 4 specifies that "if dominant color of mountains is red, the season is fall."
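A semantic-attribute rule of this kind can be sketched as a condition paired with a derived attribute. Everything in the block is an assumption made for illustration: the `dominant_color` heuristic (largest summed RGB channel), the dictionary layout of an image object, and the rule table are not taken from Figure 4, which expresses the rule in the database's own trigger or rule language.

```python
def dominant_color(pixels):
    """Hypothetical heuristic: name the RGB channel with the largest total."""
    totals = [sum(pixel[i] for pixel in pixels) for i in range(3)]
    return ("red", "green", "blue")[totals.index(max(totals))]


# Each rule: (IF condition on the object, THEN attribute, THEN value).
RULES = [
    (lambda obj: obj["naming"] == "mountain"
                 and dominant_color(obj["pixels"]) == "red",
     "season", "fall"),
]


def apply_rules(obj, rules=RULES):
    """Return a copy of obj extended with every attribute whose rule fires."""
    derived = dict(obj)
    for condition, attribute, value in rules:
        if condition(obj):
            derived[attribute] = value
    return derived


red_mountain = {"naming": "mountain", "pixels": [(200, 30, 20), (180, 60, 40)]}
green_mountain = {"naming": "mountain", "pixels": [(20, 200, 30), (40, 180, 60)]}

print(apply_rules(red_mountain).get("season"))
print(apply_rules(green_mountain).get("season"))
```

The derived attribute is never stored with the object; it exists only as the conclusion of the rule, which is why a query mentioning it has to be resolved through the rule.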
In order to retrieve multimedia data efficiently (or for QOS [VKvBG95]) from multimedia document databases, we extend SQL query answering techniques toward multimedia document retrieval. The new technique is based on the multimedia data attributes. Since meta-attributes, logical-attributes, and semantic-attributes are used to represent multimedia data, queries are constituted with those attributes. The queries are accordingly classified as metadata-based queries, content-based queries, and semantic-based queries. These queries are used in sequence to constitute an interactive user query; we call such a sequence a 3PR query. Each of these queries is discussed in the following sections.
The condition specified in the where clause is defined as a formula. A formula is recursively defined as follows: 1) an atom is a formula; 2) if $p$ is a formula, then so are $\neg p$ and $(p)$; 3) if $p_1$ and $p_2$ are formulae, then so are $p_1 \vee p_2$, $p_1 \wedge p_2$, and $p_1 \Rightarrow p_2$. An atom is $(\mathit{attribute}\ \Theta\ \mathit{attribute})$ or $(\mathit{attribute}\ \Theta\ c)$, where $c$ is a constant and $\Theta$ is a comparison operator, e.g., $=$, $\neq$, $>$, $<$, $\geq$, $\leq$, and attribute can be any of the six attributes defined in the previous section. An attribute in object-relational databases can be specified in dot notation, through which joins with other attributes become possible. For example, the expression "(article.section.body.film.type='AVD')" tests whether the file type shown in the body of an article section is AVD.
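The recursive definition above maps naturally onto a small evaluator. The following is a sketch under stated assumptions: formulas are encoded as nested tuples, documents as nested dictionaries and lists, and a set-valued attribute satisfies an atom if any of its members does. That existential reading, the encoding, and the names `resolve` and `holds` are choices made for the example; only atoms of the form (attribute Θ constant) are covered.

```python
import operator

# Comparison operators allowed in an atom.
OPS = {"=": operator.eq, "!=": operator.ne, "<": operator.lt,
       ">": operator.gt, "<=": operator.le, ">=": operator.ge}


def resolve(doc, path):
    """Follow a dot-notation path through nested dicts; list-valued
    (set-valued) attributes fan out to all of their members."""
    values = [doc]
    for name in path.split("."):
        next_values = []
        for value in values:
            items = value if isinstance(value, list) else [value]
            for item in items:
                if isinstance(item, dict) and name in item:
                    next_values.append(item[name])
        values = next_values
    flat = []
    for value in values:
        flat.extend(value if isinstance(value, list) else [value])
    return flat


def holds(formula, doc):
    """Evaluate a formula built from atoms, not, or, and, implies."""
    tag = formula[0]
    if tag == "atom":
        _, path, op, constant = formula
        return any(OPS[op](value, constant) for value in resolve(doc, path))
    if tag == "not":
        return not holds(formula[1], doc)
    if tag == "or":
        return holds(formula[1], doc) or holds(formula[2], doc)
    if tag == "and":
        return holds(formula[1], doc) and holds(formula[2], doc)
    if tag == "implies":
        return (not holds(formula[1], doc)) or holds(formula[2], doc)
    raise ValueError(f"unknown formula tag: {tag}")


# Hypothetical article instance; the path omits the leading "article".
doc = {"year": 1997,
       "section": [{"title": "Intro", "body": [{"film": {"type": "AVD"}}]},
                   {"title": "Data", "body": [{"film": {"type": "Image"}}]}]}

is_avd = ("atom", "section.body.film.type", "=", "AVD")
print(holds(("and", is_avd, ("atom", "year", "=", 1997)), doc))
```

Resolving the path before comparing is what plays the role of the implicit joins that the dot notation provides.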
For example, consider the request "Find all multimedia documents which are published in 1997 and contain audio files of over 10 minutes." Then an SQL-like query will be
Figure 3: An Object Relational Data Model for Meta-attributes
Figure 4: An Object Relational Data Model for Semantic-attributes
Consider the request "Find all multimedia documents which are written about 'Mountain' in a section title and contain those mountain image files." Then an SQL-like query will be
Consider the request "Find all multimedia documents which are written about 'mountain' in a section title and contain those fall mountain image files." Then an SQL-like query will be
In this case, note that the attribute "season" has not been defined. However, since a semantic-attribute rule is defined over the attribute "season," the rule is activated to substitute the predicate in the above query. This procedure is so-called semantic query optimization [YK93b]. Using this optimization technique, the given query is rewritten as follows:
SELECT *
FROM article d
WHERE d.section.body.film.type = 'Image'
  AND d.section.body.film.object.naming = 'mountain'
  AND d.section.body.title LIKE '%mountain%'
  AND GetImgMaxColor(d.section.body.title.object.file) = '#00FFFF'
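The substitution step of this rewriting can be sketched as a lookup from a derived predicate to a predicate over stored attributes. The block is a hypothetical illustration, not the optimizer of [YK93b]: the conjunct encoding, the rule table, and the `rewrite` function are assumptions, `GetImgMaxColor` is only emitted as text and never called, and the color constant is copied from the rewritten query above.

```python
# Hypothetical rule base: (derived attribute, operator, value) -> template
# for the replacement predicate; {obj} is the path of the owning object.
SEMANTIC_RULES = {
    ("season", "=", "fall"): "GetImgMaxColor({obj}.file) = '#00FFFF'",
}


def rewrite(conjuncts, rules=SEMANTIC_RULES):
    """Replace each conjunct over a derived attribute by its rule body and
    return the resulting WHERE condition as text."""
    rewritten = []
    for path, op, value in conjuncts:
        owner, _, attribute = path.rpartition(".")
        template = rules.get((attribute, op, value))
        if template is None:
            rewritten.append(f"{path} {op} '{value}'")   # stored attribute
        else:
            rewritten.append(template.format(obj=owner))  # derived attribute
    return " AND ".join(rewritten)


# Hypothetical input condition mentioning the undefined attribute "season".
conjuncts = [
    ("d.section.body.film.type", "=", "Image"),
    ("d.section.body.film.object.season", "=", "fall"),
]
print(rewrite(conjuncts))
```

A conjunct with no matching rule passes through unchanged, so a query that uses only stored attributes is left as the user wrote it.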
| queries      | objective attribute | constraint         | similarity     | semantic attribute |
|--------------|---------------------|--------------------|----------------|--------------------|
| Structured   | formatted data      | integrity          | values         | meaning            |
| Unstructured | full-text data      | tagging constraint |                |                    |
| Image        | image, graphic data | spatial constraint | shapes, colors | features           |
| Video        | video data          | video constraint   | movement       |                    |
| Audio        | speech, audio data  | audio constraint   | pitch, tone    |                    |
The contribution of this paper is a framework for a multimedia document retrieval facility that includes not only current technology but also metadata-based, content-based, and semantic-based query constituents. 3PR makes it possible for users to constitute queries over a large number of documents efficiently. 3PR queries can be constituted in the form of frames whose slots users fill with appropriate search values.
This work will be used in the data mining and knowledge discovery (KDD) process over multimedia documents. Querying documents conveys a user's intention to the KDD process [YK93a, Yoo96]. We believe that the various aspects of constituting users' queries can be used to extract useful knowledge from multimedia document databases.
[CHMW96] M. Carey, L. Haas, V. Maganty, and J. Williams. PESTO: An integrated query/browser for object databases. In Proc. Intl. Conf. on Very Large Data Bases, 1996.
[KNF97] A. Kobsa, A. Nill, and J. Fink. Hypertext and hypermedia clients of the user modeling system BGP-MS. In M. Maybury, editor, Intelligent Multimedia Information Retrieval, pages 339-356. MIT Press, 1997.
[MDT88] A. Motro, A. D'Atri, and L. Tarantino. The design of KIVIEW: An object-oriented browser. In 2nd Int'l Conf. on Expert Database Systems, pages 17-32, Fairfax, 1988.
[Mya96] Sung-Hyun Myaeng. MIRAGE: A prototype for a multimedia information retrieval and gathering environment. In Proc. of the Int'l Conf. on Digital Libraries and Information Services for the 21st Century, pages 115-125, Seoul, Korea, 1996.
[O86] ISO. Information Processing - Text and Office Systems - Standard Generalized Markup Language (SGML). International Organization for Standardization, ISO 8879-1986, 1986.
[O94a] ISO. Information and Documentation - Electronic Manuscript Preparation and Markup. International Organization for Standardization, Switzerland, 1994.
[O94b] ISO. Information Technology - Hypermedia/Time-based Structuring Language (HyTime). International Organization for Standardization, ISO/IEC 10744-1992, 1994.
[PB97] P. Pistor and H. Blanken. The SQL3 server interface. In Multimedia Databases In Perspective, pages 101-116. 1997.
[RS82] L. Rowe and K. Shoens. FADS - a forms application development system. In Proc. ACM SIGMOD Intl. Conf. on Management of Data, 1982.
[SJC96] B. Schatz, E. Johnson, and P. Cochrane. Interactive term suggestion for users of digital libraries: Using subject thesauri and co-occurrence lists for information retrieval. In 1st ACM Int'l Conf. on Digital Libraries, pages 126-133, 1996.
[SK82] M. Stonebraker and J. Kalash. TIMBER - a sophisticated relational browser. In Proc. Intl. Conf. on Very Large Data Bases, 1982.
[SMC+96] B. Schatz, W. Mischo, T. Cole, J. Hardin, and A. Bishop. Federating diverse collections of scientific literature. In IEEE Computer, May, pages 28-36. 1996.
[Vas97] J. Vassileva. Ensuring a task-based individualized interface for hypermedia information retrieval through user modeling. In M. Maybury, editor, Intelligent Multimedia Information Retrieval, pages 357-380. MIT Press, 1997.
[VKvBG95] A. Vogel, B. Kerherve, G. von Bochmann, and J. Gecsei. Distributed multimedia and QOS: A survey. IEEE Multimedia, 2(2):10-19, 1995.
[YK93a] Jong P. Yoon and Larry Kerschberg. A framework for knowledge discovery and evolution in databases. IEEE Transactions on Knowledge and Data Engineering, 5(6):973-978, December 1993.
[YK93b] Jong P. Yoon and Larry Kerschberg. Semantic query optimization in deductive object-oriented databases. In Proc. of the Third International Conference on Deductive and Object-Oriented Databases, pages 169-182, Phoenix, Arizona, 1993.
[YK96] Jongpil Yoon and Sung-Hyuk Kim. Multimedia query processing in digital libraries. In Proc. of the Int'l Conf. on Digital Libraries and Information Services for the 21st Century, pages 88-106, Seoul, Korea, 1996.
[Yoo96] Jongpil Yoon. Extracting database knowledge from query trees. Journal of Electrical Engineering and Information Science, 1(2):145-156, 1996.
[Zlo77] M. Zloof. Query by example. IBM Systems J., 16, 1977.