Open Access

Organizing WWW Images Based on the Analysis of Page Layout

TLDR
In this paper, a web page is partitioned into blocks so that the textual and link information of an image can be accurately extracted from the block containing that image. Page-to-block, block-to-image, and block-to-page relationships are extracted through link structure and page layout analysis, and techniques from spectral graph theory are used for image clustering and embedding.
Abstract
Due to the rapid growth of the number of digital images on the Web, there is an increasing demand for effective and efficient methods for organizing and retrieving the available images. This paper describes a method for clustering and embedding WWW images. By using a vision-based page segmentation algorithm, a web page is partitioned into blocks, and the textual and link information of an image can be accurately extracted from the block containing that image. By extracting the page-to-block, block-to-image, and block-to-page relationships through link structure and page layout analysis, we construct an image graph. With the image graph model, we use techniques from spectral graph theory for image clustering and embedding. Some experimental results are given in the paper.
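
The pipeline described in the abstract can be illustrated with a small sketch. It is not the authors' exact formulation: the matrix names (`page_to_block`, `block_to_page`, `block_to_image`), the way they are multiplied together, the toy weights, and the two-way split on the second eigenvector are all assumptions chosen for illustration. The sketch composes the three relationships into an image-to-image weight matrix, embeds the images with the eigenvectors of the normalized graph Laplacian, and separates them into two clusters.

```python
import numpy as np


def build_image_graph(page_to_block, block_to_page, block_to_image):
    """Compose the three relationships into a symmetric image-image weight matrix.

    Assumed shapes (hypothetical, for illustration only):
      page_to_block:  (pages, blocks)   how strongly a page contains a block
      block_to_page:  (blocks, pages)   how strongly a block links to a page
      block_to_image: (blocks, images)  how strongly a block contains an image
    """
    # Block -> linked page -> blocks of that page gives a block-block graph.
    block_graph = block_to_page @ page_to_block
    # Project the block graph onto the images contained in the blocks.
    image_weights = block_to_image.T @ block_graph @ block_to_image
    # Symmetrize so the result can be treated as an undirected graph.
    return (image_weights + image_weights.T) / 2.0


def spectral_embedding(weights, dim=1):
    """Embed graph nodes using eigenvectors of the normalized Laplacian."""
    degree = weights.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    normalized = inv_sqrt[:, None] * weights * inv_sqrt[None, :]
    laplacian = np.eye(len(weights)) - normalized
    # eigh returns eigenvalues in ascending order; skip the trivial first one.
    _, eigvecs = np.linalg.eigh(laplacian)
    return eigvecs[:, 1:dim + 1]


# Toy data: 2 pages, 4 blocks (two per page), one image per block.
page_to_block = np.array([[0.5, 0.5, 0.0, 0.0],
                          [0.0, 0.0, 0.5, 0.5]])
# Each block links mostly to its own page and weakly to the other one.
block_to_page = np.array([[0.9, 0.1],
                          [0.9, 0.1],
                          [0.1, 0.9],
                          [0.1, 0.9]])
block_to_image = np.eye(4)

image_weights = build_image_graph(page_to_block, block_to_page, block_to_image)
embedding = spectral_embedding(image_weights, dim=1)
# Two-way clustering: split images by the sign of the second eigenvector.
labels = (embedding[:, 0] > 0).astype(int)
print(labels)
```

In this toy setup the images from the first page end up in one cluster and the images from the second page in the other; which cluster is labeled 0 or 1 depends on the sign the eigensolver happens to return. For more than two clusters, a common choice is to keep several eigenvectors and run k-means on the embedded points.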


Citations
Proceedings Article

Hierarchical clustering of WWW image search results using visual, textual and link information

TL;DR: This paper proposes a hierarchical clustering method that uses visual, textual and link analysis to organize image search results into different semantic clusters, making them easier for users to browse.
Journal Article

Clustering and searching WWW images using link and page layout analysis

TL;DR: In this article, a Web page is partitioned into blocks, and the textual and link information of an image can be accurately extracted from the block containing that image for image indexing.
Dissertation

Emergsem : une approche d'annotation collaborative et de recherche d'images basée sur les sémantiques émergentes

TL;DR: Extracting the semantics of an image is a process that requires an in-depth analysis of the image content in order to facilitate its retrieval; image semantics refers to the interpretation of images from a human point of view.

Supervised Segmentation of Web Pages for Vibro-Tactile Access on Touch-Screen Devices

TL;DR: This paper presents a new hybrid web page segmentation algorithm dedicated to vibro-tactile access on touch-screen devices, and compares automatically segmented pages (obtained by the proposed algorithm) with manually segmented pages.
Proceedings Article

Techniques for generating multiresolution repositories of XML subschemas

TL;DR: Techniques for generating multi-resolution repositories of reusable XML subschemas are investigated and two approaches are proposed to calculate the weights of XML elements, based on the analysis of links and their attributes.
References
Journal Article

The anatomy of a large-scale hypertextual Web search engine

TL;DR: This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
Proceedings Article

On Spectral Clustering: Analysis and an algorithm

TL;DR: A simple spectral clustering algorithm that can be implemented using a few lines of Matlab is presented, and tools from matrix perturbation theory are used to analyze the algorithm, and give conditions under which it can be expected to do well.
Journal Article

Authoritative sources in a hyperlinked environment

TL;DR: This work proposes and tests an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of “hub pages” that join them together in the link structure; the formulation has connections to the eigenvectors of certain matrices associated with the link graph.