VisualMOQL: The DISIMA Visual Query Language
Vincent Oria, M. Tamer Özsu, Bing Xu, L. Irene Cheng and Paul J. Iglinski
Department of Computing Science
University of Alberta
Edmonton, Alberta, Canada T6G 2H1
{oria, ozsu, bing, lin, iglinski}@cs.ualberta.ca
Abstract
Multimedia data are now available to a variety of users
ranging from naive to sophisticated. To make querying easy,
visual query languages have been proposed. Most of these
languages have a low expressive power and have their own
query processors. Efforts have been made to design query
languages with proper semantics to facilitate query opti-
mization and processing in existing database systems. The
majority of multimedia database systems are built on top of
object or object-relational database systems with the under-
lying query facilities inherited. The DISIMA system is being
built on top of a commercial OODBMS and we have chosen
to extend the standard object-oriented query language OQL
with some multimedia functionalities. The resulting lan-
guage is called MOQL. This paper presents VisualMOQL, a
visual query language implementing the image component
of MOQL.
1 Introduction
In this paper we present the visual query interface, Vi-
sualMOQL, of the DISIMA distributed image database
management system under development at the University of
Alberta. The topics under investigation include (i) the de-
velopment of an object-oriented DBMS kernel that provides
flexibility for user-defined classification of images, provides
support for feature-based and spatial querying over image
content (by means of salient objects), and enables reasoning
over spatial relationships for query optimization; (ii) the de-
velopment of query languages and primitives for querying
image databases; and (iii) the provision of scalability and
open access to image repositories. The DISIMA prototype
is being implemented on top of the ObjectStore system [9].
VisualMOQL is based on a textual query language we
developed, Multimedia OQL (MOQL) [10]. MOQL ex-
tends the standard object query language OQL [4] by adding
spatial, temporal, and presentation properties for content-
based image and video data retrieval, as well as for queries
on structured documents. VisualMOQL implements only
the image part of MOQL for the DISIMA project. A query
specified using VisualMOQL is translated into MOQL to
make use of the MOQL parser and query processor.

[This research is supported by a strategic grant from the Natural
Science and Engineering Research Council (NSERC) of Canada.
Current address: IBM Toronto Lab, Ontario, Canada,
xubing@ca.ibm.com.]
The complex structure and semantics of multimedia data
make their access by a classical query language non-trivial.
Since the media are inherently visual, it makes sense to pro-
vide visual querying capability. The approach whereby user
requests are visually represented is known as visual lan-
guage, iconic language, or graphical language [3]. Graphi-
cal language [2] refers to visual languages based on seman-
tic models which make use of graphs, flow-charts or block-
diagrams to represent objects and relationships defined
among them. In iconic languages [7], queries are expressed
by selecting and combining icons (visual metaphors) to pro-
duce new ones. In general, the expressive power of visual
query languages is low since they are directed at naive users
and are often not based on a textual query language.
The capabilities of visual languages can be enhanced
if they are based on powerful multimedia query lan-
guages, which themselves may be extensions of ob-
ject or object-relational query languages. This pro-
vides a visual query language that enables easy query-
ing of multimedia databases, while benefiting from the
query facilities provided by the database management
system (DBMS). This is the approach we have cho-
sen. In this paper, we present VisualMOQL, a visual
language for the image component of MOQL. A demo
of the system is available at
http://www.cs.ualberta.ca/database/DISIMA/Interface.html. A description of the
demo is provided in [14].
The remainder of this paper is organized as follows: Sec-
tion 2 gives an overview of the DISIMA project, Section 3
presents VisualMOQL, Section 4 explains the semantics of
VisualMOQL queries, Section 5 discusses the implementa-
tion of the query processor, and Section 6 discusses related
work.

2 The DISIMA System Overview
This section gives an overview of the DISIMA model,
the MOQL query language and the image annotation pro-
cess. Details on the DISIMA model can be found in [12, 13]
and MOQL is fully defined in [10].
2.1 The DISIMA Model
The DISIMA model [12, 13] is composed of two main
blocks: the image block and the salient object block. We
define a block as a group of semantically related entities.
The image block is made up of two layers: the image layer
and the image representation layer. We distinguish an image
from its representations to maintain an independence be-
tween them (representation independence). At the image
layer, the user defines an image type classification which
allows categorization of images.
DISIMA views the content of an image as a set of salient
objects (i.e., interesting entities in the image) with certain
spatial relationships to each other. The salient object block
is designed to handle salient object organization. For a
given application, salient objects can be defined by the user
and identified in images by means of an annotation process.
The definition of salient objects can lead to a type lattice.
DISIMA distinguishes two kinds of salient objects: physi-
cal and logical. A logical salient object (LSO) is an abstrac-
tion of a salient object that is relevant to some application; a
physical salient object (PSO) is a syntactic object in a par-
ticular image with its semantics given by a logical salient
object. Figure 1 shows examples of both a salient object
and an image class hierarchy.
[Figure 1: (a) a salient object hierarchy rooted at Salient_object,
with subclasses including Person (specialized into Athlete,
Politician, and Other_person) and Human_body (specialized into
Head, Torso, Limb, and Other); (b) an image hierarchy rooted at
Image, with subclasses including PersonImage, MedicalImage,
NewsImage, EnvironmentalImage, Catalog, and Misc.]
Figure 1. An Example of Image and Logical
Salient Object Hierarchies.
The DISIMA model addresses both image and spatial
databases and allows for the independence of image repre-
sentations and applications. Moreover, it distinguishes the
existence and identity of logical salient objects from their
appearance in an image (physical salient objects).
2.2 Image Annotation
The representation of salient objects and their spatial re-
lationships assumes the detection of these objects. We have
tackled this issue within the DISIMA project by focusing on
face detection. The reason for this choice is that the driving
application of a news image database contains pictures with
many persons in them. The image processing software first
detects the faces contained in the image, marking them with
a minimum bounding rectangle (useful for spatial relation-
ships), and then provides color and texture values. Next, a
human annotator assigns a logical salient object to the face.
In addition, an image has some descriptive properties (i.e.,
meta-data), such as date and photographer, that have to be
provided. For this paper, we assume that the information at
the two levels of salient objects is provided.
2.3 MOQL: A Multimedia Extension of OQL
An OQL query is a function which returns an object
whose type may be inferred from the operators contributing
to the query expression. As an embedded language, OQL
allows applications to query objects that are supported by
the native programming language. The basic statement of
OQL is:
select [distinct] projection_attributes
from query [[as] identifier] {, query [[as] identifier]}
[where query]
[group by partition_attributes]
[having query]
[order by sort_criterion {, sort_criterion}]
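For concreteness, the basic select statement above might be instantiated as follows. The class and attribute names here are illustrative only and are not part of the DISIMA schema:

```
select distinct p.name
from People p
where p.age > 30
order by p.name
```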
Most extensions introduced to OQL by MOQL are
in the where clause, in the form of four new predi-
cate expressions: spatial expression, temporal expression,
contains predicate, and similarity expression. The spa-
tial expression is a spatial extension which includes spa-
tial objects, spatial functions, and spatial predicates. The
temporal expression deals with temporal objects, functions,
and predicates for videos. The contains predicate is defined
as: contains_predicate ::= media_object contains salientObject,
where media_object represents an instance of a par-
ticular medium type, e.g., an image or video object, while
salientObject is an object within the media object that is
deemed interesting (salient) to the application (e.g., a per-
son, a car or a house in an image). The contains predicate
checks whether or not a salient object is in a particular me-
dia object. The similarity predicate checks if two media ob-
jects are similar with respect to some metric. VisualMOQL
uses the DISIMA model to implement the image facilities
of MOQL. The query "Find images with 2 people next to
each other without any building, or images with buildings
without people, or images with animals" can be expressed
in MOQL as follows:
SELECT m
FROM image m, animal a, building b1, person p1, person p2
WHERE m contains a
   OR (m contains b1 AND m NOT IN
       (SELECT m1 FROM image m1, person p3
        WHERE m1 contains p3))
   OR (m contains p1 AND m contains p2
       AND p1.MBB west p2.MBB AND m NOT IN
       (SELECT m2 FROM image m2, building b2
        WHERE m2 contains b2))
This example points out the need for a visual query interface.
Although the user may have a clear idea of the kind of
images he/she is interested in, the expression of the query is

not straightforward. VisualMOQL proposes an easier way
to express queries, and then translates them into MOQL.
3 VisualMOQL
VisualMOQL [15] implements the image part of MOQL
and allows users to query images by their semantics. Image
semantics are based on the DISIMA model that views im-
ages as composed of salient objects with some properties.
The user can query the database by specifying the salient
objects in the image. The query can be refined by defining
the color, shape, and other attribute values of these salient
objects. Furthermore, the user can specify the spatial rela-
tionships among salient objects in the image, which include
both topological and directional ones. The user can also
specify properties of the image meta-data, i.e., data members
defined in the image class, such as the name of the photog-
rapher and the date.
VisualMOQL has these particular features:

- It is a declarative visual query language with a step-by-step
construction of queries, close to the way people think in natural
languages.

- It has a clearly defined semantics based on object calculus. This
feature can be used to conduct a theoretical study of the language,
involving concepts such as expressive power and complexity, which we
consider out of the scope of this paper.

- It combines several querying approaches: semantic-based (query image
semantics using salient objects), attribute-based (specify and compare
attribute values), and cognitive-based (query by example). A user can
start a query using the semantic and/or attribute-based approach and
then choose an image for a cognitive-based query.
Although cognitive-based querying is defined in MOQL
and VisualMOQL, this feature is not yet implemented in
the DISIMA system. The DISIMA model is rich enough to
combine general image properties, including colors and tex-
ture, together with salient objects having semantics, colors,
texture, shape, and spatial relationships. This leads to the
definitions of several possible global image similarity func-
tions. Basically, the user should be able to say "I want the
similarity to be done on global image color features, with or
without texture, with or without salient objects". The salient
object semantic and syntactic features can be used to refine
the similarity measurement. We are working on defining a
flexible index able to handle all the possible similarity mea-
sures before making this feature available in DISIMA.
3.1 Query Interface
The VisualMOQL window (Figure 3) consists of a num-
ber of components used to design a query. The user specifies a
query by choosing the image class he/she wants to query
and the salient objects he/she wants to see in the images.
Several levels of refinement are offered, depending on the
type of query and also on the level of precision the user
wants the result of the query to have. The startup window
consists of:
- A chooser to select the image classes. Images stored in the
database are categorized into user-defined classes. Thus, the system
allows the user to select a subset of the database to search over. The
root image class is set as the default.

- A salient object class browser which allows the user to choose the
objects that he/she wants. All salient objects and their associated
attribute values are identified during database population. They are
organized into a salient object hierarchy and the root salient object
class is set as the default.

- A horizontal slider to specify the maximum number of images that
will be returned as the result of the query. This is a
quality-of-service parameter used by the query result presentation
interface.

- A horizontal slider to specify the similarity threshold between the
query image and the target images stored in the database. It is also
used for color comparison. This is also a quality-of-service parameter
for the presentation interface.

- A working canvas where the user constructs queries step by step.

- A query canvas where the user can construct compound queries based
on simple queries (sub-queries) defined in the working canvas using
AND, OR, and NOT operators.
3.2 Working Canvas
The working canvas is where the user constructs or modi-
fies query blocks. The user first selects an image class, then
selects a salient object class in the class browser. He/she in-
serts the selected salient object in the canvas by pressing
the "Insert" button. The object appears as a rectangle in
the working canvas. This rectangle is also used for deter-
mining the spatial relationships between objects. It can
later be resized and moved. The user can also define the
color, shape, texture, and other attribute values of any ob-
ject on the canvas by using the dialog box shown in Figure 2.
VisualMOQL allows the user to compare textual attributes.
The default comparison predicate is '=' but can be changed
to other comparison operators. Since the variables used to refer to
objects in the MOQL translation are shown on the object
icons, they can be used to express join operators. For exam-
ple, the query "find images with 2 persons of the same name" can be
expressed by inserting two salient objects of type person in
the working canvas. Assume VisualMOQL refers to them
as P01 and P02. Then the user can edit one of the salient
objects (e.g., P01) and type “P02.name” as the value for

the attribute name (Figure 2). The query can involve image
global properties like name of the photographer or time the
image was taken, as well as global colors and textures. A
dialog box obtained by clicking on the button “Image Prop-
erty”, is provided to let the user enter such information.
Figure 2. Dialog box for editing object attributes.
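The join expressed in this example could plausibly translate to MOQL along the following lines. This is a hand-written sketch of the translation, not the interface's exact output:

```
SELECT m
FROM image m, person P01, person P02
WHERE m contains P01 AND m contains P02
  AND P01.name = P02.name
```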
Topological relationships will be added automatically for
any intersected objects. Directional relationships must be
defined explicitly through a dialog box. The user specifies
which axes (x-axis and/or y-axis) matter. The centroid of
the rectangles representing salient objects is used to calcu-
late the directional relationships. When both axes matter,
we can express complex spatial relationships such as north-
west, southeast, overlap, etc. When the user specifies that
only one axis matters, the spatial relationships are north,
south, east, or west.
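The centroid-based computation of directional relationships can be sketched as follows. This is a simplified model of the rule described above, not DISIMA's actual implementation; the function and parameter names are ours:

```python
def centroid(rect):
    # rect = (x1, y1, x2, y2): an axis-aligned bounding rectangle
    x1, y1, x2, y2 = rect
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def directional(a, b, use_x=True, use_y=True):
    """Directional relationship of salient object a relative to b,
    computed from the centroids of their bounding rectangles.
    y grows downward, as in screen coordinates; use_x/use_y model
    the user's choice of which axes matter."""
    (ax, ay), (bx, by) = centroid(a), centroid(b)
    parts = []
    if use_y:
        if ay < by:
            parts.append("north")
        elif ay > by:
            parts.append("south")
    if use_x:
        if ax < bx:
            parts.append("west")
        elif ax > bx:
            parts.append("east")
    return "".join(parts) or "overlap"
```

With both axes enabled the function yields compound relationships such as "northwest"; with a single axis it yields "north", "south", "east", or "west", matching the behavior described above.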
We will use the term sub-query to refer to query blocks
obtained from the working canvas. By clicking on the "Vali-
date" button, the user ends the sub-query specification. It is
then moved into the query canvas where it can be combined
with other sub-queries to form the final query.
3.3 Query Canvas
The query canvas is the space for the user to construct
compound queries. Each sub-query is represented by a
square box on the query canvas named Query n (n is an
integer). Compound queries are constructed by combining
sub-queries or smaller compound queries using AND, OR,
and NOT operators. A sub-query in the query canvas can
be modified and revalidated at any stage by using the "Edit"
button. This moves the sub-query to the working canvas.
Finally, the user presses the query button to submit the
query. Before translating this visual query, the system will
check the query canvas to make sure there are no dangling
queries. That is, all the sub-queries have to be linked using
the AND, OR, or NOT operators. It will then translate the
VisualMOQL query into MOQL and display the resulting
string before submitting it to the query processor.
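The dangling-query check can be modeled as follows, assuming the compound query is a small expression tree over sub-query leaves. This is a sketch of the idea; the class and function names are ours, not DISIMA's:

```python
class Node:
    """A node of the compound-query tree: either an operator
    (AND, OR, NOT) over children, or a leaf naming a sub-query."""
    def __init__(self, op=None, children=(), name=None):
        self.op = op              # "AND", "OR", "NOT", or None for a leaf
        self.children = list(children)
        self.name = name          # e.g., "Query1" for a leaf sub-query

def leaves(node):
    # Collect the names of all sub-queries reachable from node.
    if node.op is None:
        return {node.name}
    found = set()
    for child in node.children:
        found |= leaves(child)
    return found

def no_dangling(root, subqueries):
    """True iff every sub-query on the canvas is reachable from the
    final compound query, i.e., linked by AND, OR, or NOT operators."""
    return leaves(root) == set(subqueries)
```

A query canvas holding Query1 and Query2 combined as (Query1 AND NOT Query2) passes the check; if a third sub-query sat unlinked on the canvas, the check would fail and the query would not be translated.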
3.4 An Example of a VisualMOQL Query
Let us express the query "find images with 2 peo-
ple next to each other without any building, or images with
buildings without people, or images with animals" in Vi-
sualMOQL. This query is a combination of three queries:
Q1 (images with 2 people next to each other without any
building), Q2 (images with buildings without people), and
Q3 (images with animals). The final expression of the query
is given in Figure 3. Q1 is expressed in MOQL by (Query1
AND NOT Query2), where Query1 is a sub-query expressing
images with two people, one to the west of the other (see the
working canvas of Figure 3), and Query2 expresses images
with buildings. Q2 is expressed in MOQL by (Query3 AND
NOT Query4), where Query3 expresses images with build-
ings and Query4 expresses images with people. Q3 is ex-
pressed by the sub-query Query5, which expresses images with
animals. The final expression is obtained by combining the
sub-queries using the OR connective. The VisualMOQL ex-
pression is translated into MOQL (Figure 4) before being
submitted to the query processor.
Figure 3. Example of Query.
4 VisualMOQL Query Semantics and Translation
VisualMOQL has a well-defined semantics based on ob-
ject calculus. The working canvas represents an empty im-
age; by inserting salient objects into the working canvas, the
user expresses the kind of salient objects he/she wants to
see in query results (existence condition). The user can
attach additional conditions on salient objects, including
spatial relationships among salient objects. A query at
the working canvas level is viewed as a sub-query in the
query canvas. If we view the image and salient object
classes as complex value relations [1], a sub-query
from the query canvas is, in fact, a formula without any
free variable. The general framework of such a formula
is:

    ∃x ∈ C, ∃o1 ∈ O1, ..., ∃on ∈ On
        (x contains o1 ∧ ... ∧ x contains on ∧ Φ(o1, ..., on) ∧ Ψ(x))

where C is an image class, O1, ..., On are salient object classes,
Φ expresses boolean conditions on salient
objects, and Ψ expresses conditions on the image.
There is only one image class in a sub-query. The default
image class is Image, the root of the image class hierarchy,
but this can be changed. As a formula without any free vari-
able, a sub-query is evaluated to true or false.

Figure 4. Query Translation.
At the query canvas level, sub-queries are combined us-
ing boolean operators and turned into queries. If Q1 and Q2
are sub-queries then Q1 ∧ Q2, Q1 ∨ Q2, and ¬Q1 are valid sub-
queries. We ensure there is no dangling sub-query in the
case of more than one sub-query. The type of operators used
in the query canvas raises some safety problems common in
calculus languages. In the case of a query with a negation,
what is the range of the query result? For example, "images
without buildings" will give different results depending on
the universe it is applied to, normally a specification the user
does not control. The problem is usually solved by range-
restricting quantified or free variables. By construction, the
variables in a sub-query are range-restricted. A sub-query
has one free variable ranging over an image class and some
salient object variables ranging over salient object classes.
4.1 Simple Queries
A sub-query Q, in which x is the variable over an image
class C, is turned into a query as follows: {x | Q(x)}. The
free variable also has to be range-restricted if Q is a
negative formula: {x | x ∈ C ∧ Q(x)}.
The example "images without buildings" makes sense if it
is expressed as "images from the news image class without
buildings". The problem in this case is where to get the
images from. We set some simple rules to ensure the safety
of VisualMOQL queries. A sub-query is always applied to
an image class (the root image class is set by default). If Q
is an atomic sub-query with or without negation, the free
variable x can be restricted to the image class C in Q as
follows: {x | x ∈ C ∧ Q(x)}. The user will be
able to express queries like "images without buildings" only
within the context of a range of existing image classes.
4.2 Composed Queries
When the sub-query is a composed formula (Q1 op Q2 with
op ∈ {∧, ∨}), with two image variables that range over two
different image classes, combining result objects of differ-
ent types is problematic. We first determine the image class
in which the query can be expressed and understood inde-
pendently from its semantics. This is the least common an-
cestor of the two image classes in the image type system,
i.e., a class where images from both image classes have the
same type. Since the DISIMA image type system is rooted,
the common ancestor will be the image root class in the
worst case. Then we check the consistency of the query.
Assume we have an image variable x1 ranging over C1
in Q1 and a variable x2 ranging over C2 in Q2. This can
also be seen as combining the results of two queries in an
algebraic language: ({x1 | Q1} op {x2 | Q2}). In this case,
the images must be compatible:

- op = ∧: there are two cases for which this query makes
sense: (1) C1 = C2 or (2) C1 is an ancestor of C2
(or the reverse) in the type system hierarchy. Other-
wise an error is detected.

- op = ∨: the type of the query result is set to the common
ancestor C of C1 and C2. The query is understood
as {x | x ∈ C ∧ (Q1 ∨ Q2)} and the result is a set of C
objects.
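The compatibility and result-type rules above can be sketched over a small class hierarchy modeled on Figure 1(b). The hierarchy and function names here are illustrative, not DISIMA's code:

```python
# Maps each image class to its superclass; the root (Image) has no entry.
# This hierarchy is modeled on Figure 1(b) for illustration.
PARENT = {
    "PersonImage": "Image",
    "MedicalImage": "Image",
    "NewsImage": "Image",
    "EnvironmentalImage": "Image",
}

def ancestors(cls):
    # The class itself first, then its superclasses up to the root.
    chain = [cls]
    while cls in PARENT:
        cls = PARENT[cls]
        chain.append(cls)
    return chain

def and_compatible(c1, c2):
    """An AND of two sub-queries makes sense only if the image classes
    are equal or one is an ancestor of the other."""
    return c1 == c2 or c1 in ancestors(c2) or c2 in ancestors(c1)

def least_common_ancestor(c1, c2):
    """Result type of (Q1 OR Q2): the closest class from which both
    image classes inherit; the root class Image in the worst case."""
    anc2 = set(ancestors(c2))
    for a in ancestors(c1):
        if a in anc2:
            return a
    return None
```

For example, combining a NewsImage sub-query with a PersonImage sub-query by OR yields a result typed as Image, while combining them by AND is rejected because neither class is an ancestor of the other.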
The final query is then transformed into a safe-range nor-
mal form so that the child of each negation is an existen-
tially quantified formula (sub-query). The normalized for-
mula is obtained by:

- variable substitution: the same image variable cannot
range over two distinct image classes.

- push negation: replace ¬(Q1 ∧ Q2) by ¬Q1 ∨ ¬Q2,
¬(Q1 ∨ Q2) by ¬Q1 ∧ ¬Q2, and ¬¬Q by Q.

For example, a query like "images with people and without
buildings" can logically be expressed as "not(images with
buildings or images without people)". The normalization
facilitates the translations, as a negative existentially quanti-
fied formula expresses a set difference and can be translated
using a nested MOQL query.
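The push-negation step is the standard De Morgan rewriting, and can be sketched over a tiny formula representation. The encoding below (nested tuples over string leaves naming sub-queries) is our own illustration, not the paper's data structure:

```python
# A formula is a string leaf naming a sub-query, or one of
# ("not", f), ("and", f, g), ("or", f, g).

def push_negation(f, negated=False):
    """Rewrite a formula so that each negation sits directly on a
    sub-query leaf: not(A and B) -> not A or not B,
    not(A or B) -> not A and not B, not not A -> A."""
    if isinstance(f, str):
        return ("not", f) if negated else f
    if f[0] == "not":
        # Absorb the negation into the flag; double negation cancels.
        return push_negation(f[1], not negated)
    op = f[0]
    if negated:
        op = "or" if op == "and" else "and"  # De Morgan's laws
    return (op, push_negation(f[1], negated), push_negation(f[2], negated))
```

Applied to the example above, pushing negation through "not(B or not P)" yields "(not B) and P", i.e., images with people and without buildings, where each remaining negation wraps a single sub-query and can be translated as a nested MOQL query.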
5 Query Processing
Although ObjectStore provides some querying facilities
over collections, it does not have a built-in declarative query
language. Therefore, we have fully implemented a parser
and query processor for MOQL queries. Details on
the parser and the query processor can be found in [5]. The
result of the parser is an internal query tree structure which
is later transformed into an execution plan.

References
- Foundations of Databases (book).
- The Object Database Standard: ODMG 2.0 (book).
- QBD*: A Graphical Query Language with Recursion (journal article).
- John Z. Li et al. MOQL: A Multimedia Object Query Language.