Urban, J. and Jose, J.M. (2006) Adaptive image retrieval using a graph model for semantic feature integration. In: 8th ACM International Workshop on Multimedia Information Retrieval (MIR '06), 26-27 October 2006, Santa Barbara, CA, USA, pp. 117-126.
http://eprints.gla.ac.uk/3583/

Adaptive Image Retrieval using a Graph Model for
Semantic Feature Integration
Jana Urban and Joemon M. Jose
Dept. of Computing Science, University of Glasgow
Glasgow, UK
{jana,jj}@dcs.gla.ac.uk
ABSTRACT
The variety of features available to represent multimedia data con-
stitutes a rich pool of information. However, the plethora of data
poses a challenge in terms of feature selection and integration for
effective retrieval. Moreover, to further improve effectiveness, the
retrieval model should ideally incorporate context-dependent fea-
ture representations to allow for retrieval on a higher semantic level.
In this paper we present a retrieval model and learning framework
for the purpose of interactive information retrieval. We describe
how semantic relations between multimedia objects based on user
interaction can be learnt and then integrated with visual and textual
features into a unified framework. The framework models both fea-
ture similarities and semantic relations in a single graph. Querying
in this model is implemented using the theory of random walks. In
addition, we present ideas to implement short-term learning from
relevance feedback. Systematic experimental results validate the
effectiveness of the proposed approach for image retrieval. How-
ever, the model is not restricted to the image domain and could easily be employed for retrieving multimedia data (and even a combination of different domains, e.g. images, audio and text documents).
Categories and Subject Descriptors: H.3.3 [Information Storage
and Retrieval]: Information Search and Retrieval—relevance feed-
back, retrieval models
General Terms: Retrieval Models, Experimentation, Performance
Keywords: semantic features, image retrieval, relevance feedback,
random walks, fusion
1. INTRODUCTION
Ever since the deficiencies of primitive content-based features
were realised, interest has turned to “semantic features” and “se-
mantic retrieval”. Semantic features are now the ultimate goal for facilitating effective retrieval of visual data, but what are they? Smeulders et al. state that “Semantic features aim at encoding interpretations of the image which may be relevant to the application” [15, p. 1361]. There are two important points to note in this
assertion. Firstly, semantics are about interpretation, and secondly
the interpretation is to a large degree domain or context dependent.
An image by itself usually has no intrinsic meaning. The mean-
ing is bestowed upon the image by a human observer regarding the
context of both the observer and the image.
The goal of the semantic approach is to replace the low-level
feature space with a higher-level semantic space, which is closer
to the abstract concepts the user has in mind when looking for an
image. Since the endeavour of obtaining semantic features directly
from the visual attributes was unfruitful, mining for semantic con-
cepts from a knowledge-base has been the focus of research to this
end. Most of the existing attempts towards semantic features can
be broadly categorised in two classes: annotation-based [7, 12] and
user-based [18, 3, 4]. This distinction arises from the nature of the
knowledge-base used: the first method relies on an (at least par-
tially) annotated image corpus from which semantic concepts can
be learnt and propagated to other images, whereas the latter learns
semantic concepts from the user directly. While there are a number
of general concepts that can universally be agreed upon, e.g. an ‘in-
door’ vs. ‘outdoor’ classification, there are more subtle meanings
that are subject to the observer’s interpretation, e.g. ‘a romantic
scene’. The major difference in the two approaches hence lies in
the interpretation context considered for deciphering the image’s
meaning. It should become obvious that the annotation-based ap-
proach can only succeed in taking very general concepts into con-
sideration, as opposed to user-based approaches that are tailored to
the user’s expectations and interpretations.
Our approach is an example of the latter in that contextual in-
formation is mined from user interaction. We have developed a
system, EGO, that encourages its users to manage their retrieval re-
sults on a workspace provided in the interface [20]. While search-
ing for images, the creation of groupings of related images is sup-
ported, inciting the user to break up the task into related facets to
organise their ideas and concepts. The system can then assist the
user by recommending relevant images for selected groups. Previ-
ous user experiments have shown that EGO helps to overcome the
query formulation problem and leads to a more effective and en-
joyable search experience compared to a state-of-the-art relevance
feedback interface [22].
In this work, we use the groupings created in the user experi-
ments to infer a semantic feature. Our underlying assumption is
that all objects (images) in one group share some semantic concept
(user-, usage-, and task-dependent), e.g. images of snowy mountains,
images with high visual contrasts, images that could be used as
background on the front of a flyer. Instead of trying to label these
concepts, however, we simply record that there is a semantic rela-
tion between those images in a group. We refer to these relation-
ships as peer information. Appropriately recorded, the peer infor-
mation can be used to implement long-term learning of semantic
concepts in the system.

In addition to the peer information, low-level visual features and
textual annotations are further sources of information for the re-
trieval (and recommendation) system. However, the combination
of different feature modalities is a big challenge in multimedia re-
trieval [11, 6, 19]. Most state-of-the-art systems treat each feature
individually and fuse the result lists to obtain the final results. How-
ever, the method of fusion is far from obvious and such systems fail
to capture dependencies between the features. Even worse, such
systems have difficulties in exceeding the performance of a text-
only system in information retrieval tasks [11]. Instead of a late
fusion of results, we propose to integrate the different modalities in
a single graph and use the theory of random walks [10] to calculate
retrieval results.
In our model, images, terms, and visual features are represented
as nodes in an Image-Context Graph (ICG). The links between
nodes represent: (1) image attributes (relations between images and
their features); (2) intra-feature relations (feature similarities); and
(3) semantic relations (peer information). We describe a retrieval
model based on random walks that can retrieve both top-matching images and terms for a query (consisting of both image examples and terms). In addition, we show how short-term relevance
feedback learning can be integrated in our model by adapting the
link weights in the ICG. The main contributions of this paper are:
- We propose a group-based contextual feature (peer information) based on mining usage information while searching in a multimedia collection.
- We show how the peer information can be integrated with already existing low-level visual features and textual annotation in a graph model.
- We define various learning strategies in the graph model.
- Through systematic experimental results, the effectiveness of the proposed approach is validated and learning strategies are investigated.
The remainder of this document is organised as follows. Sec-
tion 2 reviews related work. We detail the graph-model and explain
the mathematical background in Section 3. Section 4 introduces
the baseline systems used in the evaluation. It consists of three sep-
arate retrieval models for each feature modality, whose results are
combined using a rank-based list aggregation method. We outline
the experimental methodology in Section 5, followed by the exper-
imental results in Section 6. Finally, we summarise and conclude
the paper in Section 7.
2. RELATED WORK
The theory of Random Walks has been applied to information
retrieval in the form of Google’s famous PageRank algorithm [1].
The idea can be sketched as follows. Imagine a random surfer on
the Web choosing to follow a link on each page at random. Occa-
sionally, the surfer gets stuck in a dead end or in cycles, or simply
gets bored. At these points, they may randomly jump to another page on the Web without following any links.
ank score is to reflect its quality depending on the number of other
pages linking to it based on the random surfer model. The PageR-
ank algorithm can be viewed as a random walk on the Web graph.
The mathematical details will be elaborated in Section 3.1.
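As a minimal, self-contained sketch of this computation (ours, not the paper's; the three-page link graph and all values are invented for illustration), power iteration with uniform teleportation looks like this:

import numpy as np

# Hypothetical three-page Web graph: adjacency[i][j] = 1 if page i links to page j.
adjacency = np.array([[0, 1, 1],
                      [1, 0, 0],
                      [0, 1, 0]], dtype=float)

alpha = 0.15                                           # teleportation probability
P = adjacency / adjacency.sum(axis=1, keepdims=True)   # row-stochastic transition matrix

n = len(P)
pi = np.full(n, 1.0 / n)                               # start from the uniform distribution
for _ in range(100):
    # Follow a random outlink with probability (1 - alpha),
    # teleport to a uniformly random page with probability alpha.
    pi = (1 - alpha) * P.T @ pi + alpha / n

print(pi)  # stationary scores: the PageRank of each page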
2.1 Random Walks in the Image Domain
Graph-based modelling techniques have recently found their way
into the image domain. The two most closely related approaches are its application to relevance feedback learning [3, 4] and to image captioning [12]. Han et al. have proposed to model the relationships between images based on their co-selection in relevance feedback sessions [3]. The weight of the link between two images is the ratio of the frequency of the two images being labelled as positive examples in the same retrieval session to the total frequency of their having been selected together (as positive or negative samples). A semantic similarity measure between two images is then derived from the overall correlation determined by analysing the resulting graph (referred to as the image link network). An overall similarity measure is defined as a weighted linear combination of the semantic similarity and the low-level feature similarity. In contrast, the theory of random
walks is explicitly employed on an image graph in which links be-
tween image nodes are also constructed from relevance feedback
information in [4]. Here the graph is constructed by adding two
special nodes to the graph: a positive absorbing node and a neg-
ative absorbing node. Each positively labelled image receives a
link to the positive absorbing node, while negative examples are
directly linked to the negative absorbing node. As this approach does not discriminate between query sessions, it can only be used for short-term learning.
The second application of random walks in the image domain
is to automatically learn annotations for previously unlabelled im-
ages [12]. A graph, called GCap, is constructed, which contains one node per image, a node for each image region per image, and one node for each term in the vocabulary. Each image is connected to its region nodes and to the terms it is annotated with. Further, regions are linked to their k nearest neighbours. Given an unlabelled image, i.e. an image node I_i that does not have any links to a term node in the graph, a random walk is performed to compute the most probable terms for this image. These are found by calculating the long-term (stationary) probabilities that a random walker finds itself at a particular node, given that it randomly restarts the walk from I_i. The top t terms with the highest stationary probability are returned as the suggested labels.
The semantic link approaches [3, 4] only model the information gained from relevance feedback, which has to be combined with feature-based similarity values in a further step, while the image captioning approach [12] only models image-feature similarities, without any facility for adaptation to relevance feedback. We propose to model both the image-feature relations and the inter-image (or semantic) relations together. Hence there are two vital ingredients to our approach: (1) the integration of semantic as well as low-level features in a graph model, and (2) a learning strategy in the graph model. The latter incorporates two levels of feedback to implement short- and long-term learning from user feedback. By adding links between images that are grouped together, the semantic network is iteratively constructed and reinforced through adaptive link weights, thus implementing a long-term learning strategy. Further, we show how short-term learning can be achieved by introducing feature weights, ensuring that links to feature nodes with a strong feature weight are favoured over feature links with small weights, given a particular query.
3. THE IMAGE-CONTEXT GRAPH
The problems addressed in this paper are (a) how to capture and
model personalised usage information to improve retrieval perfor-
mance, and (b) how to integrate this information with other features
(visual and textual) to model interdependencies between features.
The idea is to represent images and all their attributes (features)
in a graph. The graph consists of a number of layers of vertices:
vertices for all images in the collection, and one layer of vertices
per implemented feature. These layers will contain both visual and
textual features. There are two different types of edges connecting vertices: edges representing a “contains” relationship (i.e. edges
between the image vertices and their attributes), and edges repre-
senting the similarity amongst vertices in the same layer (“similar-
ity edges”). These edges are constructed based on the similarity
between features (similarity between visual feature vectors, simi-
larity between terms) or semantic relationships/co-occurrences of
images. Thus the graph represents the images in context and in the
following it is referred to as the Image-Context Graph, or ICG. An example graph containing three image nodes (I_1, ..., I_3), four term nodes (t_1, ..., t_4), and two types of visual features (f_1, f_2) is depicted in Figure 1.

Figure 1: An example image-context graph
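To make the layered structure concrete, the following sketch (ours; node identifiers are invented, mirroring the three images, four terms, and two visual features of Figure 1) assembles the node layers and the two edge types:

# Node layers of a toy ICG (identifiers are illustrative only).
images = ["I1", "I2", "I3"]
terms = ["t1", "t2", "t3", "t4"]
features = [f"f{j}_{i}" for i in images for j in (1, 2)]  # two visual features per image

edges = set()

def link(u, v):
    edges.add((u, v))
    edges.add((v, u))  # the graph is undirected

# "Contains" edges: each image is linked to its feature nodes and annotated terms.
for i in images:
    link(i, f"f1_{i}")
    link(i, f"f2_{i}")
link("I1", "t1"); link("I2", "t2"); link("I2", "t3"); link("I3", "t4")

# Similarity edges within one feature layer (e.g. nearest-neighbour feature vectors).
link("f1_I1", "f1_I2")

# Peer edge: a user grouped I1 and I3 together.
link("I1", "I3")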
The general recommendation problem (or retrieval problem for
that matter) can be stated as: Given a query, consisting of image
examples and/or terms, compute the most similar images to recom-
mend to the user. In the ICG, this translates to: given a start set of
vertices in the graph, compute those image vertices that are most
likely to be reached starting from the start set.
A solution to this problem can be found in the theory of Random
Walks. The likelihood of passing a node in the ICG is given by
calculating the stationary distribution of the Markov chain induced
by the ICG. By setting the restart vector to the nodes representing
the query items, we can stage a Random Walk with Restarts on the
ICG. This is equivalent to computing a query-biased “PageRank”
of the ICG as will be explained in the following section.
3.1 Mathematical Background
A random walk is a finite-state Markov chain that is time-revers-
ible. Markov chains are frequently used to model physical and con-
ceptual processes that evolve over time, for example the spread of
disease within a population or the modelling of gambling. An intro-
duction to Random Walks and Markov chains can be found in [10].
Let the Markov chain M consist of a finite number of states, say N = {1, 2, ..., n}, and probabilities of a transition occurring between states at discrete time steps. The (one-step) transition probability p_{ij} denotes the conditional probability that M will be in state j at time t + 1 given that it was observed in state i at time t. In general, p^k_{ij} denotes the probability that M proceeds from state i to state j after k transitions. The transition probability matrix P = [p_{ij}] is often used to represent M. The stationary distribution π^T = [π_1, π_2, ..., π_n] represents the long-run proportion of time the chain M spends in each state; π is also referred to as the steady state probability vector. Markov chains are often represented as a graph, or state transition diagram, G. Finally, to make the connection to PageRank: the PageRank scores are equivalent to the stationary distribution π of the Markov chain associated with the Web graph.
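As a small worked illustration (ours, not from the paper), consider a two-state chain with

P = [0.9 0.1; 0.5 0.5]

Solving π = P^T π together with π_1 + π_2 = 1 gives π_1 = 0.9 π_1 + 0.5 π_2, hence π_1 = 5 π_2 and π^T = (5/6, 1/6): this chain spends five sixths of its time in state 1.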
3.1.1 Calculating π
In general, the stationary distribution π of a Markov chain can be found by solving the following eigenvector problem:

π = P^T π    (1)

A unique stationary distribution is guaranteed to exist iff P is a stochastic, irreducible matrix [8].
In the PageRank model, a transition probability matrix P is built from the hyperlink structure of the Web. To create a stochastic, irreducible matrix, Brin and Page suggested to eliminate dangling pages (pages with no outlinks) by linking them to all other pages in the Web [1]. This is achieved by replacing the 0^T rows of the sparse matrix P with dense vectors: the uniform vector (1/n) e^T initially, or a more general probability distribution over all pages, v^T. This stochastic fix can be modelled implicitly by the following transformations (see [8]):

P̄ = P + a v^T    (2)

P̿ = (1 − α) P̄ + α e v^T    (3)

where a is a vector whose elements a_i = 1 if row i in P corresponds to a dangling node, and 0 otherwise; e is the vector of all 1s; 0 ≤ α ≤ 1; and v represents a general probability distribution over the nodes, often referred to as the personalisation or restart vector.
Substituting P̿ in Equation 1 then leads to:

π = ((1 − α) P + ((1 − α) a + α e) v^T)^T π    (4)

π = (1 − α) (P + a v^T)^T π + α v    (5)

with the constraint that π is normalised, such that |π| = 1 and thus e^T π = 1. α is the probability of restarting the random walk from any of the nodes in v.
3.1.2 Parameters of the PageRank Model
α. The value of α denotes the probability of a surfer choosing to jump to a new Web page (teleportation), while they choose to click on hyperlinks with probability (1 − α). A small α places more emphasis on the hyperlink structure of the graph and much less on the teleportation tendencies, and also slows convergence of the iterative computation of PageRank. Originally α = 0.15 was proposed [1].
In the image annotation graph of [12] a value of α = 0.65 was found to be better suited, which they could explain by a relationship to the estimated diameter of the graph.
The personalisation vector v^T. Instead of the uniform distribution (1/n) e^T, a more general distribution v^T > 0 can be used in its place. v^T is often referred to as the personalisation vector or restart vector in random walk terms.
The personalisation vector also allows PageRank to be made query-sensitive. The original PageRank assigns a score to a page proportional to the number of times a random surfer would visit that page, if they surfed indefinitely, following all outlinks with equal probability or occasionally jumping to a random new page chosen with equal probability. If we change the probability distribution given by the personalisation vector v^T, we can introduce a bias such that the surfer jumps with high probability to the pages emphasised in v^T.
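Continuing the toy sketch from Section 2 (again our illustration, not the paper's code; all values are invented), replacing the uniform teleportation term with a restart vector concentrated on one node shifts the scores toward that node's neighbourhood:

import numpy as np

adjacency = np.array([[0, 1, 1],
                      [1, 0, 0],
                      [0, 1, 0]], dtype=float)
P = adjacency / adjacency.sum(axis=1, keepdims=True)
alpha = 0.15

def personalised_pagerank(v, iters=100):
    """Power iteration with restart distribution v in place of the uniform vector."""
    pi = v.copy()
    for _ in range(iters):
        pi = (1 - alpha) * P.T @ pi + alpha * v
    return pi

uniform = np.full(3, 1 / 3)             # classic PageRank
biased = np.array([1.0, 0.0, 0.0])      # always restart at node 0 (query-sensitive)
print(personalised_pagerank(uniform))
print(personalised_pagerank(biased))    # node 0 and its neighbours gain probability mass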

3.2 Constructing the ICG
Let G be the ICG, V the set of vertices in G, and E the set of edges, so that G = (V, E). The graph will be stored in the form of its adjacency matrix M.
3.2.1 The Nodes
There are three types of nodes: image nodes I, term nodes T, and feature nodes F, with V = I ∪ T ∪ F:
- Let I denote the set of all image nodes in G. Add one node per image to the set of image nodes. I_i denotes the node for image i.
- Let T denote the set of all term nodes in G. Add one node for every term in the vocabulary to T. t_i denotes the node for term i.
- Construct the set of visual feature nodes F by adding one node per low-level visual feature for each image. If the number of implemented visual features is v (which is 6 in our case), then |F| = v × |I|. f_{ij} denotes the node for the j-th feature of image i.
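As an illustration of how these layers might be laid out as matrix indices (our sketch; the paper does not prescribe an ordering, and all sizes are invented), with n = |I| images, m = |T| terms, and v visual features per image:

# Hypothetical index layout for the ICG adjacency matrix (illustrative only):
# rows 0..n-1 are image nodes, the next m rows are term nodes, and the
# final n*v rows are the per-image visual feature nodes.
n, m, v = 1000, 500, 6          # toy collection sizes; v = 6 as in the paper

def image_node(i):              # I_i
    return i

def term_node(i):               # t_i
    return n + i

def feature_node(i, j):         # f_ij, the j-th visual feature of image i
    return n + m + i * v + j

total_nodes = n + m + n * v     # |V| = |I| + |T| + |F|, with |F| = v * |I|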
3.2.2 The Edges
There are two types of edges: attribute edges and similarity edges. The first type links images to their attributes; the second type links nodes of the same feature type (term and visual feature nodes) based on the similarity between these nodes. A special type of similarity edge is the peer edge between image nodes themselves, created based on users’ groupings of images.
Attribute Edges. Each image node I_i is linked to all its features. Thus an edge is created to each of its visual feature nodes f_{i1}, ..., f_{iv}. For the textual features, an edge is created between an image node I_i and a term node t_j if image i is annotated with term j.
Similarity Edges. Similar to [12], we propose to create edges between visual features based on their nearest neighbours. Consider a feature node f_{il} representing the l-th feature of image i; then compute the top k nearest neighbours by calculating the similarity score between the feature vector f_{il} and the feature vector f_{jl} for all other images j (1 ≤ j ≤ |I|, j ≠ i). This allows for an adaptive definition of closeness without having to fix a threshold value.
A similar idea could be applied to the term nodes, choosing a similarity measure between terms based on relationships between terms (e.g. using WordNet) or on a collection-based analysis. Since the number of terms contained in an image (annotations) is typically very low (compared to text documents), a collection-based analysis is probably not very significant. Instead we adopt a simple similarity measure: sim(t_i, t_j) = 1 if i = j and 0 otherwise. Using this similarity measure, we obtain an edge that links each term node to itself.
Peer Edges. Finally, the edges between the image nodes themselves are based on user feedback. For each group created by a user, edges are created connecting all the images in that group. An edge between two images i and j has a weight, which generally reflects the frequency of these images co-occurring in groups. However, the weight can also be reduced by negative feedback (see below). These edges represent high-level semantic relationships between images based on their usage.
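The following sketch (ours; the distance measure, k, and all sizes are placeholder choices, not the paper's exact settings) shows how the three edge types could be written into a weighted adjacency matrix M, reusing the index layout sketched in Section 3.2.1:

import numpy as np

n, m, v = 100, 50, 2                       # toy sizes: images, terms, features per image
image_node = lambda i: i
term_node = lambda t: n + t
feature_node = lambda i, l: n + m + i * v + l
M = np.zeros((n + m + n * v, n + m + n * v))

def link(a, b, w=1.0):                     # undirected, weighted edge
    M[a, b] += w
    M[b, a] += w

def build_edges(feature_vecs, annotations, groups, k=5):
    # feature_vecs[l][i]: vector of the l-th visual feature of image i;
    # annotations[i]: term indices of image i; groups: user-created image groups.
    for i in range(n):
        for l in range(v):
            link(image_node(i), feature_node(i, l))            # attribute edge
            dists = sorted((np.linalg.norm(feature_vecs[l][i] - feature_vecs[l][j]), j)
                           for j in range(n) if j != i)
            for _, j in dists[:k]:                             # k-NN similarity edges
                link(feature_node(i, l), feature_node(j, l))
        for t in annotations[i]:
            link(image_node(i), term_node(t))                  # image-term edge
    for group in groups:                                       # peer edges: weight grows
        for a in group:                                        # with group co-occurrence
            for b in group:
                if a < b:
                    link(image_node(a), image_node(b))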
3.3 Evaluating a Query
The objective of retrieval in the graph is to find those image
nodes I that are closest (or best connected) to the query nodes.
The overview of the algorithm is as follows. First, the restart vector
is built from the query nodes. Then, a Random Walk with Restarts
is performed on the graph to estimate the stationary probability distribution π. Finally, the image nodes are returned to the user sorted in descending order by their steady state probability scores. Algorithm 1 shows an overview of these steps.

Algorithm 1 Calculating the query results based on a Random Walk on the ICG
Require: Query consisting of image examples and query terms; M, the adjacency matrix of the ICG; constant 0 < α < 1
Ensure: ||π||_1 = 1 (the L_1 norm of π)
1: Initialise personalisation vector v.
2: M' = normalise(M).
3: Initialise π_0 = v.
4: Set k = 0, the number of iterations.
5: while not converged do
6:   π_{k+1} = (1 − α) M' π_k + α v
7:   Normalise π_{k+1}.
8:   k = k + 1
9: end while
10: return Image documents sorted by their π values after convergence.
Construction of the restart/personalisation vector. Assume a query contains a number of image examples and a set of terms. The personalisation vector v is initialised such that v(u) = 1/q for all nodes u representing the image examples and terms, where q is the size of the query. The remaining elements are set to 0. Choosing the personalisation vector this way ensures that these nodes are favoured in the following Random Walk computation.
Calculating π. Recall from Section 3.1.1 (cf. Equation 1) that the stationary distribution π of a Markov chain can be found by solving the eigenvector problem π = P^T π. In the ICG, there are no dangling nodes due to the way the ICG is constructed, so the transformation to create a stochastic, irreducible matrix representing the ICG (cf. Equation 2) can be simplified to:

P̿ = (1 − α) P + α e v^T    (6)

And the calculation of π can be achieved by:

π = (1 − α) M' π + α v    (7)

where M' (= P^T) is the column-normalised adjacency matrix of the ICG, and α is the probability of restarting the random walk from any of the nodes in v.
The estimation of π is solved in the iterative algorithm detailed in Algorithm 1. The algorithm converges when two consecutive estimates π_k and π_{k+1} are reasonably close together, i.e. |π_k − π_{k+1}| < threshold. The threshold is set to 10^{−6}.
Returning the query results. Finally, we choose the top r image nodes (i.e. the elements π(u_i) from π, where 1 ≤ i ≤ |I|) and present them to the user.
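Putting these steps together, a compact implementation of Algorithm 1 might look as follows (our sketch, with numpy; parameter defaults and the num_images argument are our own conventions, and it assumes the ICG has no isolated nodes, so every column of M can be normalised):

import numpy as np

def evaluate_query(M, query_nodes, num_images, alpha=0.15, threshold=1e-6):
    """Random Walk with Restarts on the ICG (Algorithm 1).
    M: weighted adjacency matrix; query_nodes: indices of the query's image and
    term nodes; image nodes occupy indices 0..num_images-1 by assumption."""
    M_prime = M / M.sum(axis=0, keepdims=True)        # column-normalised adjacency
    v = np.zeros(M.shape[0])
    v[list(query_nodes)] = 1.0 / len(query_nodes)     # restart vector: v(u) = 1/q
    pi = v.copy()                                     # pi_0 = v
    while True:
        pi_next = (1 - alpha) * M_prime @ pi + alpha * v
        pi_next /= pi_next.sum()                      # keep ||pi||_1 = 1
        if np.abs(pi - pi_next).sum() < threshold:    # |pi_k - pi_{k+1}| < 10^-6
            break
        pi = pi_next
    return np.argsort(-pi_next[:num_images])          # images by steady-state score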
3.4 Relevance Feedback
In this section we show how both long- and short-term learning can be implemented in the ICG to create a retrieval system that adapts to its users. On the one hand, relevance feedback is used to build up the semantic or peer network (the subgraph consisting of image nodes and the edges between them) over time. On the other hand, short-term learning is implemented by computing a set of feature weights adapted to the current query.

References
S. Brin and L. Page. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, 1998.
M.-K. Hu. Visual pattern recognition by moment invariants. IRE Transactions on Information Theory, 1962.
A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain. Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000.
M. Sonka, V. Hlavac, and R. Boyle. Image Processing: Analysis and Machine Vision.