VisualMOQL: The DISIMA Visual Query Language
Vincent Oria, M. Tamer Özsu, Bing Xu, L. Irene Cheng and Paul J. Iglinski
Department of Computing Science
University of Alberta
Edmonton, Alberta, Canada T6G 2H1
{oria, ozsu, bing, lin, iglinski}@cs.ualberta.ca
Abstract
Multimedia data are now available to a variety of users
ranging from naive to sophisticated. To make querying easy,
visual query languages have been proposed. Most of these
languages have a low expressive power and have their own
query processors. Efforts have been made to design query
languages with proper semantics to facilitate query opti-
mization and processing in existing database systems. The
majority of multimedia database systems are built on top of
object or object-relational database systems with the under-
lying query facilities inherited. The DISIMA system is being
built on top of a commercial OODBMS and we have chosen
to extend the standard object-oriented query language OQL
with some multimedia functionalities. The resulting lan-
guage is called MOQL. This paper presents VisualMOQL, a
visual query language implementing the image component
of MOQL.
1 Introduction
In this paper we present the visual query interface, Vi-
sualMOQL, of the DISIMA distributed image database
management system under development at the University of
Alberta. The topics under investigation include (i) the de-
velopment of an object-oriented DBMS kernel that provides
flexibility for user-defined classification of images, provides
support for feature-based and spatial querying over image
content (by means of salient objects), and enables reasoning
over spatial relationships for query optimization; (ii) the de-
velopment of query languages and primitives for querying
image databases; and (iii) the provision of scalability and
open access to image repositories. The DISIMA prototype
is being implemented on top of the ObjectStore system [9].
VisualMOQL is based on a textual query language we
developed, Multimedia OQL (MOQL) [10]. MOQL ex-
tends the standard object query language OQL [4] by adding
spatial, temporal, and presentation properties for content-
based image and video data retrieval, as well as for queries
on structured documents. VisualMOQL implements only
the image part of MOQL for the DISIMA project. A query
specified using VisualMOQL is translated into MOQL to
make use of the MOQL parser and query processor.

[This research is supported by a strategic grant from the Natural
Science and Engineering Research Council (NSERC) of Canada.
Current address: IBM Toronto Lab, Ontario, Canada,
xubing@ca.ibm.com.]
The complex structure and semantics of multimedia data
make their access by a classical query language non-trivial.
Since the media are inherently visual, it makes sense to pro-
vide visual querying capability. The approach whereby user
requests are visually represented is known as visual lan-
guage, iconic language, or graphical language [3]. Graphi-
cal language [2] refers to visual languages based on seman-
tic models which make use of graphs, flow-charts or block-
diagrams to represent objects and relationships defined
among them. In iconic languages [7], queries are expressed
by selecting and combining icons (visual metaphors) to pro-
duce new ones. In general, the expressive power of visual
query languages is low since they are directed at naive users
and are often not based on a textual query language.
The capabilities of visual languages can be enhanced
if they are based on powerful multimedia query lan-
guages, which themselves may be extensions of ob-
ject or object-relational query languages. This pro-
vides a visual query language that enables easy query-
ing of multimedia databases, while benefiting from the
query facilities provided by the database management
system (DBMS). This is the approach we have cho-
sen. In this paper, we present VisualMOQL, a visual
language for the image component of MOQL. A demo
of the system is available at
http://www.cs.ualberta.ca/database/DISIMA/Interface.html. A description of the
demo is provided in [14].
The remainder of this paper is organized as follows: Sec-
tion 2 gives an overview of the DISIMA project, Section 3
presents VisualMOQL, Section 4 explains the semantics of
VisualMOQL queries, Section 5 discusses the implementa-
tion of the query processor, and Section 6 discusses related
work.

2 The DISIMA System Overview
This section gives an overview of the DISIMA model,
the MOQL query language and the image annotation pro-
cess. Details on the DISIMA model can be found in [12, 13]
and MOQL is fully defined in [10].
2.1 The DISIMA Model
The DISIMA model [12, 13] is composed of two main
blocks: the image block and the salient object block. We
define a block as a group of semantically related entities.
The image block is made up of two layers: the image layer
and the image representation layer. We distinguish an image
from its representations to maintain an independence be-
tween them (representation independence). At the image
layer, the user defines an image type classification which
allows categorization of images.
DISIMA views the content of an image as a set of salient
objects (i.e., interesting entities in the image) with certain
spatial relationships to each other. The salient object block
is designed to handle salient object organization. For a
given application, salient objects can be defined by the user
and identified in images by means of an annotation process.
The definition of salient objects can lead to a type lattice.
DISIMA distinguishes two kinds of salient objects: physi-
cal and logical. A logical salient object (LSO) is an abstrac-
tion of a salient object that is relevant to some application; a
physical salient object (PSO) is a syntactic object in a par-
ticular image with its semantics given by a logical salient
object. Figure 1 shows examples of both a salient object
and an image class hierarchy.
[Figure 1: (a) a salient object hierarchy rooted at Salient_object,
with subclasses including Person (specialized into Athlete,
Politician, and Other_person) and Human_body (specialized into
Head, Torso, Limb, and Other); (b) an image hierarchy rooted at
Image, with subclasses including PersonImage, MedicalImage,
NewsImage, EnvironmentalImage, Catalog, and Misc.]
Figure 1. An Example of Image and Logical
Salient Object Hierarchies.
The DISIMA model addresses both image and spatial
databases and allows for the independence of image repre-
sentations and applications. Moreover, it distinguishes the
existence and identity of logical salient objects from their
appearance in an image (physical salient objects).
2.2 Image Annotation
The representation of salient objects and their spatial re-
lationships assumes the detection of these objects. We have
tackled this issue within the DISIMA project by focusing on
face detection. The reason for this choice is that the driving
application of a news image database contains pictures with
many persons in them. The image processing software first
detects the faces contained in the image, marking them with
a minimum bounding rectangle (useful for spatial relation-
ships), and then provides color and texture values. Next, a
human annotator assigns a logical salient object to the face.
In addition, an image has some descriptive properties (i.e.,
meta-data), such as date and photographer, that have to be
provided. For this paper, we assume that the information at
the two levels of salient objects is provided.
2.3 MOQL: A Multimedia Extension of OQL
An OQL query is a function which returns an object
whose type may be inferred from the operators contributing
to the query expression. As an embedded language, OQL
allows applications to query objects that are supported by
the native programming language. The basic statement of
OQL is:
select [distinct] projection_attributes
from query [[as] identifier] {, query [[as] identifier]}
[where query]
[group by partition_attributes]
[having query]
[order by sort_criterion {, sort_criterion}]
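For concreteness, the basic select statement above might be instantiated as follows. The class and attribute names here are illustrative only and are not part of the DISIMA schema:

```
select distinct p.name
from People p
where p.age > 30
order by p.name
```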
Most extensions introduced to OQL by MOQL are
in the where clause, in the form of four new predi-
cate expressions: spatial expression, temporal expression,
contains predicate, and similarity expression. The spa-
tial expression is a spatial extension which includes spa-
tial objects, spatial functions, and spatial predicates. The
temporal expression deals with temporal objects, functions,
and predicates for videos. The contains predicate is defined
as: contains_predicate ::= media_object contains salientObject,
where media_object represents an instance of a par-
ticular medium type, e.g., an image or video object, while
salientObject is an object within the media object that is
deemed interesting (salient) to the application (e.g., a per-
son, a car or a house in an image). The contains predicate
checks whether or not a salient object is in a particular me-
dia object. The similarity predicate checks if two media ob-
jects are similar with respect to some metric. VisualMOQL
uses the DISIMA model to implement the image facilities
of MOQL. The query "Find images with 2 people next to
each other without any building, or images with buildings
without people, or images with animals" can be expressed
in MOQL as follows:
SELECT m
FROM image m, animal a, building b1, person p1, person p2
WHERE m contains a
   OR (m contains b1 AND m NOT IN
       (SELECT m1 FROM image m1, person p3
        WHERE m1 contains p3))
   OR (m contains p1 AND m contains p2
       AND p1.MBB west p2.MBB AND m NOT IN
       (SELECT m2 FROM image m2, building b2
        WHERE m2 contains b2))
This example points out the need for a visual query interface.
Although the user may have a clear idea of the kind of
images he/she is interested in, the expression of the query is

not straightforward. VisualMOQL proposes an easier way
to express queries, and then translates them into MOQL.
3 VisualMOQL
VisualMOQL [15] implements the image part of MOQL
and allows users to query images by their semantics. Image
semantics are based on the DISIMA model that views im-
ages as composed of salient objects with some properties.
The user can query the database by specifying the salient
objects in the image. The query can be refined by defining
the color, shape, and other attribute values of these salient
objects. Furthermore, the user can specify the spatial rela-
tionships among salient objects in the image, which include
both topological and directional ones. The user can also
specify properties of the image meta-data, i.e., data members
defined in the image class, such as the name of the photog-
rapher and the date.
VisualMOQL has these particular features:

- It is a declarative visual query language with a step-by-step
construction of queries, close to the way people think in natural
languages.

- It has a clearly defined semantics based on object calculus. This
feature can be used to conduct a theoretical study of the language,
involving concepts such as expressive power and complexity, which we
consider out of the scope of this paper.

- It combines several querying approaches: semantic-based (query image
semantics using salient objects), attribute-based (specify and compare
attribute values), and cognitive-based (query by example). A user can
start a query using the semantic and/or attribute-based approach and
then choose an image for a cognitive-based query.
Although cognitive-based querying is defined in MOQL
and VisualMOQL, this feature is not yet implemented in
the DISIMA system. The DISIMA model is rich enough to
combine general image properties, including colors and tex-
ture, together with salient objects having semantics, colors,
texture, shape, and spatial relationships. This leads to the
definitions of several possible global image similarity func-
tions. Basically, the user should be able to say "I want the
similarity to be done on global image color features, with or
without texture, with or without salient objects". The salient
object semantic and syntactic features can be used to refine
the similarity measurement. We are working on defining a
flexible index able to handle all the possible similarity mea-
sures before making this feature available in DISIMA.
3.1 Query Interface
The VisualMOQL window (Figure 3) consists of a num-
ber of components used to design a query. The user specifies a
query by choosing the image class he/she wants to query
and the salient objects he/she wants to see in the images.
Several levels of refinement are offered, depending on the
type of query and also on the level of precision the user
wants the result of the query to have. The startup window
consists of:
- A chooser to select the image classes. Images stored in the
database are categorized into user-defined classes. Thus, the system
allows the user to select a subset of the database to search over. The
root image class is set as the default.

- A salient object class browser which allows the user to choose the
objects that he/she wants. All salient objects and their associated
attribute values are identified during database population. They are
organized into a salient object hierarchy and the root salient object
class is set as the default.

- A horizontal slider to specify the maximum number of images that
will be returned as the result of the query. This is a
quality-of-service parameter used by the query result presentation
interface.

- A horizontal slider to specify the similarity threshold between the
query image and the target images stored in the database. It is also
used for color comparison. This is also a quality-of-service parameter
for the presentation interface.

- A working canvas where the user constructs queries step by step.

- A query canvas where the user can construct compound queries based
on simple queries (sub-queries) defined in the working canvas using
AND, OR, and NOT operators.
3.2 Working Canvas
The working canvas is where the user constructs or modi-
fies query blocks. The user first selects an image class, then
selects a salient object class in the class browser. He/she in-
serts the selected salient object in the canvas by pressing
the "Insert" button. The object appears as a rectangle in
the working canvas. This rectangle is also used for deter-
mining the spatial relationships between objects. It can
later be resized and moved. The user can also define the
color, shape, texture, and other attribute values of any ob-
ject on the canvas by using the dialog box shown in Figure 2.
VisualMOQL allows the user to compare textual attributes.
The default comparison predicate is '=' but can be changed
to other comparison operators. Since the variables used to refer to
objects in the MOQL translation are shown on the object
icons, they can be used to express join operators. For exam-
ple, the query "find images with 2 persons of the same name" can be
expressed by inserting two salient objects of type person in
the working canvas. Assume VisualMOQL refers to them
as P01 and P02. Then the user can edit one of the salient
objects (e.g., P01) and type “P02.name” as the value for

the attribute name (Figure 2). The query can involve image
global properties like name of the photographer or time the
image was taken, as well as global colors and textures. A
dialog box obtained by clicking on the button “Image Prop-
erty”, is provided to let the user enter such information.
Figure 2. Dialog box for editing object attributes.
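The join expressed in this example could plausibly translate to MOQL along the following lines. This is a hand-written sketch of the translation, not the interface's exact output:

```
SELECT m
FROM image m, person P01, person P02
WHERE m contains P01 AND m contains P02
  AND P01.name = P02.name
```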
Topological relationships will be added automatically for
any intersected objects. Directional relationships must be
defined explicitly through a dialog box. The user specifies
which axes (x-axis and/or y-axis) matter. The centroid of
the rectangles representing salient objects is used to calcu-
late the directional relationships. When both axes matter,
we can express complex spatial relationships such as north-
west, southeast, overlap, etc. When the user specifies that
only one axis matters, the spatial relationships are north,
south, east, or west.
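The centroid-based computation of directional relationships can be sketched as follows. This is a simplified model of the rule described above, not DISIMA's actual implementation; the function and parameter names are ours:

```python
def centroid(rect):
    # rect = (x1, y1, x2, y2): an axis-aligned bounding rectangle
    x1, y1, x2, y2 = rect
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def directional(a, b, use_x=True, use_y=True):
    """Directional relationship of salient object a relative to b,
    computed from the centroids of their bounding rectangles.
    y grows downward, as in screen coordinates; use_x/use_y model
    the user's choice of which axes matter."""
    (ax, ay), (bx, by) = centroid(a), centroid(b)
    parts = []
    if use_y:
        if ay < by:
            parts.append("north")
        elif ay > by:
            parts.append("south")
    if use_x:
        if ax < bx:
            parts.append("west")
        elif ax > bx:
            parts.append("east")
    return "".join(parts) or "overlap"
```

With both axes enabled the function yields compound relationships such as "northwest"; with a single axis it yields "north", "south", "east", or "west", matching the behavior described above.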
We will use the term sub-query to refer to query blocks
obtained from the working canvas. By clicking on the "Vali-
date" button, the user ends the sub-query specification. It is
then moved into the query canvas where it can be combined
with other sub-queries to form the final query.
3.3 Query Canvas
The query canvas is the space for the user to construct
compound queries. Each sub-query is represented by a
square box on the query canvas named Query n (n is an
integer). Compound queries are constructed by combining
sub-queries or smaller compound queries using AND, OR,
and NOT operators. A sub-query in the query canvas can
be modified and revalidated at any stage by using the "Edit"
button. This moves the sub-query to the working canvas.
Finally, the user presses the query button to submit the
query. Before translating this visual query, the system will
check the query canvas to make sure there are no dangling
queries. That is, all the sub-queries have to be linked using
the AND, OR, or NOT operators. It will then translate the
VisualMOQL query into MOQL and display the resulting
string before submitting it to the query processor.
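The dangling-query check can be modeled as follows, assuming the compound query is a small expression tree over sub-query leaves. This is a sketch of the idea; the class and function names are ours, not DISIMA's:

```python
class Node:
    """A node of the compound-query tree: either an operator
    (AND, OR, NOT) over children, or a leaf naming a sub-query."""
    def __init__(self, op=None, children=(), name=None):
        self.op = op              # "AND", "OR", "NOT", or None for a leaf
        self.children = list(children)
        self.name = name          # e.g., "Query1" for a leaf sub-query

def leaves(node):
    # Collect the names of all sub-queries reachable from node.
    if node.op is None:
        return {node.name}
    found = set()
    for child in node.children:
        found |= leaves(child)
    return found

def no_dangling(root, subqueries):
    """True iff every sub-query on the canvas is reachable from the
    final compound query, i.e., linked by AND, OR, or NOT operators."""
    return leaves(root) == set(subqueries)
```

A query canvas holding Query1 and Query2 combined as (Query1 AND NOT Query2) passes the check; if a third sub-query sat unlinked on the canvas, the check would fail and the query would not be translated.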
3.4 An Example of a VisualMOQL Query
Let us express the query "find images with 2 peo-
ple next to each other without any building, or images with
buildings without people, or images with animals" in Vi-
sualMOQL. This query is a combination of three queries:
Q1 (images with 2 people next to each other without any
building), Q2 (images with buildings without people), and
Q3 (images with animals). The final expression of the query
is given in Figure 3. Q1 is expressed in MOQL by (Query1
AND NOT Query2), where Query1 is a sub-query expressing
images with two people, one to the west of the other (see the
working canvas of Figure 3), and Query2 expresses images
with buildings. Q2 is expressed in MOQL by (Query3 AND
NOT Query4), where Query3 expresses images with build-
ings and Query4 expresses images with people. Q3 is ex-
pressed by the sub-query Query5, which expresses images with
animals. The final expression is obtained by combining the
sub-queries using the OR connective. The VisualMOQL ex-
pression is translated into MOQL (Figure 4) before being
submitted to the query processor.
Figure 3. Example of Query.
4 VisualMOQL Query Semantics and Translation
VisualMOQL has a well-defined semantics based on ob-
ject calculus. The working canvas represents an empty im-
age; by inserting salient objects into the working canvas, the
user expresses the kind of salient objects he/she wants to
see in query results (existence condition). The user can
attach additional conditions on salient objects, including
spatial relationships among salient objects. A query at
the working canvas level is viewed as a sub-query in the
query canvas. If we view the image and salient object
classes as complex value relations [1], a sub-query
from the query canvas is, in fact, a formula without any
free variable. The general framework of such a formula
is:

    ∃x ∈ C, ∃o1 ∈ O1, ..., ∃on ∈ On
        (x contains o1 ∧ ... ∧ x contains on ∧ Φ(o1, ..., on) ∧ Ψ(x))

where C is an image class, O1, ..., On are salient object classes,
Φ expresses boolean conditions on salient
objects, and Ψ expresses conditions on the image.
There is only one image class in a sub-query. The default
image class is Image, the root of the image class hierarchy,
but this can be changed. As a formula without any free vari-
able, a sub-query is evaluated to true or false.

Figure 4. Query Translation.
At the query canvas level, sub-queries are combined us-
ing boolean operators and turned into queries. If Q1 and Q2
are sub-queries then Q1 ∧ Q2, Q1 ∨ Q2, and ¬Q1 are valid sub-
queries. We ensure there is no dangling sub-query in the
case of more than one sub-query. The type of operators used
in the query canvas raises some safety problems common in
calculus languages. In the case of a query with a negation,
what is the range of the query result? For example, "images
without buildings" will give different results depending on
the universe it is applied to, normally a specification the user
does not control. The problem is usually solved by range-
restricting quantified or free variables. By construction, the
variables in a sub-query are range-restricted. A sub-query
has one free variable ranging over an image class and some
salient object variables ranging over salient object classes.
4.1 Simple Queries
A sub-query Q, in which x is the variable over an image
class C, is turned into a query as follows: {x | Q(x)}. The
free variable also has to be range-restricted if Q is a
negative formula: {x | x ∈ C ∧ Q(x)}.
The example "images without buildings" makes sense if it
is expressed as "images from the news image class without
buildings". The problem in this case is where to get the
images from. We set some simple rules to ensure the safety
of VisualMOQL queries. A sub-query is always applied to
an image class (the root image class is set by default). If Q
is an atomic sub-query with or without negation, the free
variable x can be restricted to the image class C in Q as
follows: {x | x ∈ C ∧ Q(x)}. The user will be
able to express queries like "images without buildings" only
within the context of a range of existing image classes.
4.2 Composed Queries
When the sub-query is a composed formula (Q1 op Q2 with
op ∈ {∧, ∨}), with two image variables that range over two
different image classes, combining result objects of differ-
ent types is problematic. We first determine the image class
in which the query can be expressed and understood inde-
pendently from its semantics. This is the least common an-
cestor of the two image classes in the image type system,
i.e., a class where images from both image classes have the
same type. Since the DISIMA image type system is rooted,
the common ancestor will be the image root class in the
worst case. Then we check the consistency of the query.
Assume we have an image variable x1 ranging over C1
in Q1 and a variable x2 ranging over C2 in Q2. This can
also be seen as combining the results of two queries in an
algebraic language: ({x1 | Q1} op {x2 | Q2}). In this case,
the images must be compatible:

- op = ∧: there are two cases for which this query makes
sense: (1) C1 = C2 or (2) C1 is an ancestor of C2
(or the reverse) in the type system hierarchy. Other-
wise an error is detected.

- op = ∨: the type of the query result is set to the common
ancestor C of C1 and C2. The query is understood
as {x | x ∈ C ∧ (Q1 ∨ Q2)} and the result is a set of C
objects.
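The compatibility and result-type rules above can be sketched over a small class hierarchy modeled on Figure 1(b). The hierarchy and function names here are illustrative, not DISIMA's code:

```python
# Maps each image class to its superclass; the root (Image) has no entry.
# This hierarchy is modeled on Figure 1(b) for illustration.
PARENT = {
    "PersonImage": "Image",
    "MedicalImage": "Image",
    "NewsImage": "Image",
    "EnvironmentalImage": "Image",
}

def ancestors(cls):
    # The class itself first, then its superclasses up to the root.
    chain = [cls]
    while cls in PARENT:
        cls = PARENT[cls]
        chain.append(cls)
    return chain

def and_compatible(c1, c2):
    """An AND of two sub-queries makes sense only if the image classes
    are equal or one is an ancestor of the other."""
    return c1 == c2 or c1 in ancestors(c2) or c2 in ancestors(c1)

def least_common_ancestor(c1, c2):
    """Result type of (Q1 OR Q2): the closest class from which both
    image classes inherit; the root class Image in the worst case."""
    anc2 = set(ancestors(c2))
    for a in ancestors(c1):
        if a in anc2:
            return a
    return None
```

For example, combining a NewsImage sub-query with a PersonImage sub-query by OR yields a result typed as Image, while combining them by AND is rejected because neither class is an ancestor of the other.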
The final query is then transformed into a safe-range nor-
mal form so that the child of each negation is an existen-
tially quantified formula (sub-query). The normalized for-
mula is obtained by:

- variable substitution: the same image variable cannot
range over two distinct image classes.

- push negation: replace ¬(Q1 ∧ Q2) by ¬Q1 ∨ ¬Q2,
¬(Q1 ∨ Q2) by ¬Q1 ∧ ¬Q2, and ¬¬Q by Q.

For example, a query like "images with people and without
buildings" can logically be expressed as "not(images with
buildings or images without people)". The normalization
facilitates the translations, as a negative existentially quanti-
fied formula expresses a set difference and can be translated
using a nested MOQL query.
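The push-negation step is the standard De Morgan rewriting, and can be sketched over a tiny formula representation. The encoding below (nested tuples over string leaves naming sub-queries) is our own illustration, not the paper's data structure:

```python
# A formula is a string leaf naming a sub-query, or one of
# ("not", f), ("and", f, g), ("or", f, g).

def push_negation(f, negated=False):
    """Rewrite a formula so that each negation sits directly on a
    sub-query leaf: not(A and B) -> not A or not B,
    not(A or B) -> not A and not B, not not A -> A."""
    if isinstance(f, str):
        return ("not", f) if negated else f
    if f[0] == "not":
        # Absorb the negation into the flag; double negation cancels.
        return push_negation(f[1], not negated)
    op = f[0]
    if negated:
        op = "or" if op == "and" else "and"  # De Morgan's laws
    return (op, push_negation(f[1], negated), push_negation(f[2], negated))
```

Applied to the example above, pushing negation through "not(B or not P)" yields "(not B) and P", i.e., images with people and without buildings, where each remaining negation wraps a single sub-query and can be translated as a nested MOQL query.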
5 Query Processing
Although ObjectStore provides some querying facilities
over collections, it does not have a built-in declarative query
language. Therefore, we have fully implemented a parser
and query processor for MOQL queries. Details on
the parser and the query processor can be found in [5]. The
result of the parser is an internal query tree structure which
is later transformed into an execution plan.

References
- Foundations of Databases (book).
- The Object Database Standard: ODMG 2.0 (book).
- QBD*: A Graphical Query Language with Recursion (journal article).
- John Z. Li et al. MOQL: A Multimedia Object Query Language.