
Showing papers by "Hiroyuki Kitagawa published in 2010"


Book ChapterDOI
12 Dec 2010
TL;DR: TURank (Twitter User Rank), an algorithm for evaluating users' authority scores in Twitter based on link analysis, is proposed; experimental results show that it outperforms existing algorithms.
Abstract: In this paper, we address the problem of finding authoritative users in Twitter, one of the most popular micro-blogging services [1]. Twitter has been gaining public attention as a new type of information resource, because an enormous number of users transmit diverse information in real time. In particular, authoritative users who frequently submit useful information are considered to play an important role, because their useful information is disseminated quickly and widely. To identify authoritative users, it is important to consider the actual information flow in Twitter. However, existing approaches only deal with relationships among users. In this paper, we propose TURank (Twitter User Rank), an algorithm for evaluating users' authority scores in Twitter based on link analysis. In TURank, users and tweets are represented in a user-tweet graph that models information flow, and ObjectRank is applied to evaluate users' authority scores. Experimental results show that the proposed algorithm outperforms existing algorithms.
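
A rough illustration of the ObjectRank-style scoring described above: a damped authority-propagation iteration over a small user-tweet graph. The edge types, transfer weights, and damping factor are illustrative assumptions, not the paper's actual parameters.

```python
# Sketch only: authority propagation over a typed user-tweet graph.
DAMPING = 0.85
EDGE_WEIGHTS = {"posts": 0.3, "posted_by": 0.7, "follows": 0.5, "retweets": 0.8}

def object_rank(nodes, edges, iterations=50):
    """nodes: iterable of node ids; edges: (src, dst, edge_type) triples."""
    nodes = list(nodes)
    score = {n: 1.0 / len(nodes) for n in nodes}
    out_total = {n: 0.0 for n in nodes}          # total outgoing weight per node
    for src, _, etype in edges:
        out_total[src] += EDGE_WEIGHTS[etype]
    for _ in range(iterations):
        nxt = {n: (1.0 - DAMPING) / len(nodes) for n in nodes}
        for src, dst, etype in edges:
            # Authority flows along typed edges, scaled per edge type.
            nxt[dst] += DAMPING * score[src] * EDGE_WEIGHTS[etype] / out_total[src]
        score = nxt
    return score

edges = [("u1", "t1", "posts"), ("t1", "u1", "posted_by"),
         ("u2", "t1", "retweets"), ("u2", "u1", "follows")]
print(object_rank({"u1", "u2", "t1"}, edges))
```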

146 citations


Journal ArticleDOI
TL;DR: A novel database encryption scheme called MV-OPES (Multivalued-Order Preserving Encryption Scheme) is proposed, which allows privacy-preserving queries over encrypted databases with an improved security level and preserves the order of integer values so that comparison operations can be applied directly to encrypted data.
Abstract: Encryption can provide strong security for sensitive data against inside and outside attacks. This is especially true in the "Database as Service" model, where confidentiality and privacy are important issues for the client. However, existing encryption approaches are vulnerable to statistical attacks because each value is always encrypted to the same fixed value. This paper presents a novel database encryption scheme called MV-OPES (Multivalued-Order Preserving Encryption Scheme), which allows privacy-preserving queries over encrypted databases with an improved security level. Our idea is to encrypt one value to multiple different values to prevent statistical attacks. At the same time, MV-OPES preserves the order of the integer values to allow comparison operations to be applied directly to encrypted data. Using a calculated distance (range), we propose a novel method that allows inequality joins between relations over encrypted values. We also present techniques to offload query execution to the database server as much as possible, thereby making better use of server resources in a database outsourcing environment. Our scheme can easily be integrated with current database systems, as it is designed to work with existing indexing structures. It is robust against statistical attacks and the estimation of true values. MV-OPES experiments show that security for sensitive data can be achieved with reasonable overhead, establishing the practicability of the scheme.
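
For intuition, a toy illustration of the multivalued order-preserving idea (not the paper's actual construction): each plaintext integer owns a disjoint ciphertext interval, and encryption picks a random point inside it, so equal plaintexts encrypt to varying ciphertexts while order comparisons still work directly on ciphertexts.

```python
import random

INTERVAL = 1000          # assumed width of each value's ciphertext interval
SECRET_OFFSET = 12345    # stand-in for key material

def encrypt(v):
    # Equal plaintexts land on (usually) different points of the same interval,
    # defeating the fixed one-to-one mapping a statistical attack relies on.
    return v * INTERVAL + SECRET_OFFSET + random.randrange(INTERVAL)

def decrypt(c):
    return (c - SECRET_OFFSET) // INTERVAL

a1, a2, b = encrypt(10), encrypt(10), encrypt(11)
assert max(a1, a2) < b                     # order across distinct values holds
assert decrypt(a1) == decrypt(a2) == 10    # both ciphertexts decrypt correctly
```

Because the per-value intervals never overlap, inequality predicates can in principle be evaluated directly on ciphertexts.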

58 citations


Proceedings ArticleDOI
04 Nov 2010
TL;DR: This paper proposes an optimized method that not only calculates the probability of compound-event outputs but also obtains the confidence of a user-given complex pattern against an uncertain raw input stream generated by unreliable network devices.
Abstract: Pattern matching over event streams is well developed. However, with increasing demands on measurement accuracy, the confidence of complex events derived from original, continuously arriving events generated by sensor-like electronic devices is drawing more and more concern. Applications such as RFID-based supply chain management and health-care monitoring require data streams with high reliability, but current hardware and wireless communication techniques cannot guarantee 100% confident data; a stream processing engine that can report the confidence of processed complex events over uncertain data is therefore needed. In this paper, we propose an optimized method that not only calculates the probability of compound-event outputs but also obtains the confidence of a user-given complex pattern against an uncertain raw input stream generated by unreliable network devices. Our proposal is based on the existing stream processing engine SASE+, and we extend its evaluation model, the NFAb automaton, to a new type of automaton in order to manage the runtime over probabilistic streams. In the design of the automaton, we consider optimizations that reduce the computation cost and keep response time at a realistic level even with long sliding time windows.
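
For intuition, a minimal sketch of the probability bookkeeping such an engine performs for a sequence pattern: assuming independent event confidences, a match's confidence is the product of its events' confidences, and matches below a threshold are dropped. The real system extends the NFAb runtime of SASE+; this brute-force scan over a made-up stream only illustrates the computation.

```python
# Events are (type, timestamp, confidence), assumed sorted by timestamp.
def matches(stream, pattern, window, min_conf=0.5):
    results = []
    def extend(partial, start, prob):
        if len(partial) == len(pattern):
            if prob >= min_conf:
                results.append((partial, prob))
            return
        for i in range(start, len(stream)):
            etype, ts, conf = stream[i]
            if partial and ts - partial[0][1] > window:
                break                      # beyond the sliding window
            if etype == pattern[len(partial)]:
                # Independence assumption: multiply per-event confidences.
                extend(partial + [stream[i]], i + 1, prob * conf)
    extend([], 0, 1.0)
    return results

stream = [("A", 1, 0.9), ("B", 2, 0.8), ("A", 3, 0.95), ("C", 4, 0.7)]
print(matches(stream, ["A", "B", "C"], window=10))   # one match, conf 0.504
```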

43 citations


Proceedings Article
01 Jan 2010
TL;DR: A novel database encryption scheme called MV-POPES (Multivalued Partial Order Preserving Encryption Scheme), which allows privacy-preserving queries over encrypted databases with an improved security level and is robust against known plaintext attacks and statistical attacks.
Abstract: Encryption is a well-studied technique for protecting the confidentiality of sensitive data. However, encrypting relational databases affects performance during query processing. Preserving the order of the encrypted values is a useful technique for performing queries over an encrypted database with reasonable overhead. Unfortunately, existing order-preserving encryption schemes are not secure against known-plaintext attacks and statistical attacks, in which the attacker is assumed to have prior knowledge about plaintext values or statistical information on the plaintext domain. This paper presents a novel database encryption scheme called MV-POPES (Multivalued Partial Order Preserving Encryption Scheme), which allows privacy-preserving queries over encrypted databases with an improved security level. Our idea is to divide the plaintext domain into many partitions and randomize them in the encrypted domain. One integer value is then encrypted to multiple different values to prevent statistical attacks. At the same time, MV-POPES preserves the order of the integer values within each partition to allow comparison operations to be applied directly to encrypted data. Our scheme is robust against known-plaintext attacks and statistical attacks. MV-POPES experiments show that security for sensitive data can be achieved with reasonable overhead, establishing the practicability of the scheme.
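
The partition-and-shuffle layout can be pictured with a toy example (assumed fixed-width partitions and a seeded permutation standing in for key material; this is not the actual MV-POPES construction):

```python
import random

PART_WIDTH, INTERVAL = 10, 100
rng = random.Random(42)                 # toy "key" seeding the permutation
placement = list(range(10))             # partition ids for domain [0, 100)
rng.shuffle(placement)                  # partition i lands at slot placement[i]

def encrypt(v):
    part, offset = divmod(v, PART_WIDTH)
    base = placement[part] * PART_WIDTH * INTERVAL
    # Multivalued as in MV-OPES: a random point in the value's interval.
    return base + offset * INTERVAL + rng.randrange(INTERVAL)

# Order is preserved only inside a partition; comparisons across shuffled
# partitions must be handled by the query-translation layer.
assert encrypt(3) < encrypt(7)          # 3 and 7 share partition 0
```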

37 citations


Book ChapterDOI
30 Aug 2010
TL;DR: This paper proposes an efficient algorithm for the RFN query with metric indexes; it also adapts the convex-hull property to enhance efficiency, but computes the hull in advance rather than on the fly.
Abstract: Variants of similarity queries, such as k-nearest neighbors (k-NN), range queries, and reverse nearest neighbors (RNN), have been widely studied over the past decade. Nowadays, the reverse furthest neighbor (RFN) query is attracting more attention because of its applicability. Given an object set O and a query object q, the RFN query retrieves the objects of O that take q as their furthest neighbor. Yao et al. proposed R-tree-based algorithms to handle the RFN query using Voronoi diagrams and the convex-hull property of the dataset. However, computing the convex hull and executing range queries on the R-tree are very expensive when done on the fly. In this paper, we propose an efficient algorithm for the RFN query with metric indexes. We also adapt the convex-hull property to enhance efficiency, but we compute the hull in advance rather than on the fly. We select external pivots to construct metric indexes and employ the triangle inequality for efficient pruning. Experimental evaluations on both synthetic and real datasets confirm the efficiency and scalability of our algorithm.
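
A hedged sketch of the pivot-based pruning: precomputed distances to a few external pivots yield triangle-inequality bounds on unseen distances, so many objects can be discarded without computing their actual distances. Pivot choice and the index structure here are simplifications of the paper's method.

```python
import math

def dist(a, b):
    return math.dist(a, b)

def rfn(objects, pivots, q):
    pdist = {o: [dist(o, p) for p in pivots] for o in objects}
    qdist = [dist(q, p) for p in pivots]
    result = []
    for o in objects:
        # Triangle inequality: |d(o,p) - d(q,p)| <= d(o,q) <= d(o,p) + d(q,p).
        upper_oq = min(op + qp for op, qp in zip(pdist[o], qdist))
        pruned = False
        for x in objects:
            if x is o:
                continue
            lower_ox = max(abs(op - xp) for op, xp in zip(pdist[o], pdist[x]))
            if lower_ox > upper_oq:      # x is surely farther from o than q
                pruned = True
                break
        if not pruned and all(dist(o, x) <= dist(o, q)
                              for x in objects if x is not o):
            result.append(o)             # q is o's furthest neighbor
    return result

objs = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
print(rfn(objs, pivots=[(0.0, 10.0), (10.0, 10.0)], q=(5.0, -5.0)))  # [(5.0, 8.0)]
```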

12 citations


Journal ArticleDOI
TL;DR: This paper proposes general parallelism techniques for holistic twig join algorithms to process queries against Extensible Markup Language (XML) databases on a multi‐core system.
Abstract: Purpose – The purpose of this paper is to propose general parallelism techniques for holistic twig join algorithms to process queries against Extensible Markup Language (XML) databases on a multi‐core system. Design/methodology/approach – The parallelism techniques comprised data and task parallelism. For data parallelism, the paper adopted stream‐based partitioning for XML to partition XML data as the basis of parallelism on multiple CPU cores. The XML data partitioning was performed at two levels. The first level created buckets to provide data independence and balance loads among CPU cores; each bucket was assigned to a CPU core. Within each bucket, the second level of XML data partitioning created finer partitions to provide finer parallelism. Each CPU core performed the holistic twig join algorithm on its own finer partitions in parallel with the other CPU cores. In task parallelism, the holistic twig join algorithm was decomposed into two main tasks, which wer...
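
The two-level data-parallel structure might be sketched as follows; the thread pool and the stand-in join function are illustrative only (the paper targets native multi-core execution with a real holistic twig join):

```python
from concurrent.futures import ThreadPoolExecutor

def twig_join(partition, predicate):
    # Stand-in for a holistic twig join over one fine partition of XML nodes.
    return [node for node in partition if predicate(node)]

def run_parallel(buckets, predicate, workers=4):
    # Level 1: one bucket per worker; level 2: finer partitions inside it.
    def per_bucket(bucket):
        out = []
        for fine_partition in bucket:
            out.extend(twig_join(fine_partition, predicate))
        return out
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return [m for part in pool.map(per_bucket, buckets) for m in part]

buckets = [[["a1", "b1"], ["a2"]], [["b2", "a3"]]]   # 2 buckets, finer parts
print(run_parallel(buckets, lambda n: n.startswith("a")))   # ['a1', 'a2', 'a3']
```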

10 citations


Book ChapterDOI
25 Oct 2010
TL;DR: Some optimization techniques to reduce the overhead for range queries in MV-POPES by simplifying the translated condition and controlling the randomness of the encrypted partitions are presented.
Abstract: Encryption is a well-studied technique for protecting the privacy of sensitive data. However, encrypting relational databases affects performance during query processing. The Multivalued-Partial Order Preserving Encryption Scheme (MV-POPES) allows privacy-preserving queries over encrypted databases with reasonable overhead and an improved security level. It divides the plaintext domain into many partitions and randomizes them in the encrypted domain; one integer value is then encrypted to multiple different values to prevent statistical attacks. At the same time, MV-POPES preserves the order of the integer values within each partition to allow comparison operations to be applied directly to encrypted data. However, MV-POPES supports range queries only at a high overhead. In this paper, we present optimization techniques that reduce the overhead of range queries in MV-POPES by simplifying the translated condition and controlling the randomness of the encrypted partitions. The basic idea of our approaches is to classify the partitions into supersets and then restrict the randomization within each superset. The supersets of partitions are created either based on predefined queries or using binary recursive partitioning. Experiments show substantial performance improvements with the proposed optimization approaches. We also study the effect of these optimization techniques on the privacy level of the encrypted data.
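
To see where the range-query overhead comes from, the sketch below translates one plaintext range into per-partition ciphertext intervals under a randomized layout (toy parameters as in the MV-POPES sketch above). The resulting disjunction of conditions is what the superset-based optimizations shrink.

```python
PART_WIDTH, INTERVAL = 10, 100

def translate_range(lo, hi, placement):
    """Ciphertext intervals covering plaintext range [lo, hi]."""
    conds = []
    for part in range(lo // PART_WIDTH, hi // PART_WIDTH + 1):
        base = placement[part] * PART_WIDTH * INTERVAL
        first = max(lo, part * PART_WIDTH) % PART_WIDTH
        last = min(hi, (part + 1) * PART_WIDTH - 1) % PART_WIDTH
        conds.append((base + first * INTERVAL,
                      base + (last + 1) * INTERVAL - 1))
    # e.g. WHERE (c BETWEEN lo1 AND hi1) OR (c BETWEEN lo2 AND hi2) OR ...
    return conds

print(translate_range(25, 42, placement=[7, 3, 9, 0, 5, 1, 8, 2, 6, 4]))
```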

9 citations


Journal ArticleDOI
TL;DR: This paper proposes an efficient data stream processing scheme for multiple event-driven continuous queries that are activated by external events such as data arrival and the progression of time, and introduces query result caching as a flexible way to share common operators among queries activated by unpredictable events.

8 citations


Proceedings ArticleDOI
04 Nov 2010
TL;DR: This work attempts to design and implement a dedicated faceted navigation system for QCDml on top of an XML database, and makes use of a relational database system as the engine to speed up the aggregate computation.
Abstract: In this paper we describe a faceted navigation system for QCDml ensemble XML data, an XML-based metadata format for ILDG (International Lattice Data Grid). A faceted navigation system allows a user to search for desired information in an exploratory way, browsing a set of XML data without using specialized query languages such as XPath and XQuery. However, designing a faceted navigation interface for XML data is not straightforward due to the flexible, tree-like nature of XML. In this work, we design and implement a dedicated faceted navigation system for QCDml on top of an XML database. The interface is designed with the domain experts' usability in mind. We also pay attention to the system's performance: in general, faceted navigation is computationally expensive because of the aggregate computation required for each available facet. To alleviate this, we use a relational database system as the engine to speed up the aggregate computation. We finally demonstrate the implemented faceted navigation system, which has been made available on the Web.
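
A minimal illustration of offloading facet counting to a relational engine, which is the performance approach described above; the ensemble schema and values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ensemble (collaboration TEXT, action TEXT, beta REAL)")
conn.executemany("INSERT INTO ensemble VALUES (?, ?, ?)", [
    ("JLQCD", "clover", 5.2), ("CP-PACS", "clover", 5.2),
    ("JLQCD", "wilson", 5.9),
])
# One GROUP BY per facet yields the counts shown next to each facet value.
for facet in ("collaboration", "action"):
    print(facet, conn.execute(
        f"SELECT {facet}, COUNT(*) FROM ensemble GROUP BY {facet}").fetchall())
```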

4 citations


Book ChapterDOI
TL;DR: This chapter presents a framework that directly supports efficient processing and a variety of advanced functions on event detection, which include complex event processing, probabilistic reasoning, and continuous media integration.
Abstract: For real-world-oriented applications to easily use sensor data obtained from multiple wireless sensor networks, a data management infrastructure is mandatory. The infrastructure design should be based on the philosophy of a novel framework beyond relational data management, for two reasons. First is the freshness of data: to keep sensor data fresh, an infrastructure should process data efficiently, so the conventional, time-consuming transaction processing methodology is inappropriate. Second is the diversity of functions: the primary purpose of sensor data applications is to detect events, and relational operators contribute little toward this purpose. This chapter presents a framework that directly supports efficient processing and a variety of advanced functions. Stream processing is the key concept of the framework. For the efficiency requirement, we present a multiple query optimization technique for query processing over data streams, as well as an efficient data archiving technique. To meet the functionality requirement, we present several techniques for event detection, including complex event processing, probabilistic reasoning, and continuous media integration.

3 citations


Proceedings ArticleDOI
23 May 2010
TL;DR: A new high-availability scheme called Adaptive Semi-Active Standby (A-SAS) is proposed, which enables an adaptive tradeoff between bandwidth usage and recovery time; experimental results suggest its effectiveness.
Abstract: Distributed stream processing engines (DSPEs) have recently been studied to meet the needs of continuous query processing. Because they are built on the cooperation of several stream processing engines (SPEs), node failures can cause the whole system to fail. This paper proposes a new high-availability scheme called Adaptive Semi-Active Standby (A-SAS), which enables an adaptive tradeoff between bandwidth usage and recovery time. The paper presents the properties of A-SAS and experimental results that suggest its effectiveness.

Proceedings ArticleDOI
08 Nov 2010
TL;DR: This paper proposes RDF packages, a time- and space-efficient format for RDF data to which RDFS entailment rules can be applied without modification, and demonstrates the performance of the proposed scheme in terms of triple size, reasoning speed, and querying speed.
Abstract: When querying RDF and RDFS data, it is common to derive all triples according to the RDFS entailment rules before query processing, in order to improve performance. An undesirable drawback of this approach is that the RDFS reasoning generates a large number of triples, so a considerable amount of storage space is required if we materialize the RDFS closure. In this paper, we propose RDF packages, a time- and space-efficient format for RDF data. In an RDF package, a set of triples of the same class, or triples having the same predicate, is grouped into a dedicated node named a Package. Using Packages, we can represent any metadata that can be expressed in RDF. An important feature of RDF packages is that, when performing RDFS reasoning, the same rules can be applied without any modification, allowing us to use existing RDFS reasoners. In this paper, we discuss the model of RDF packages and its rules, followed by the transformation between RDF and RDF packages. We also discuss the implementation of RDF packages using an existing RDF framework. Finally, we demonstrate the performance of the proposed scheme in terms of triple size, reasoning speed, and querying speed.
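
A toy sketch of the grouping idea (the actual RDF packages format and its reasoning rules are defined in the paper): triples sharing a predicate collapse into one package of subject/object pairs, and unpacking restores the original triples losslessly.

```python
from collections import defaultdict

triples = [
    ("alice", "rdf:type", "foaf:Person"),
    ("bob",   "rdf:type", "foaf:Person"),
    ("alice", "foaf:knows", "bob"),
]

packages = defaultdict(list)            # one "Package" node per predicate
for s, p, o in triples:
    packages[p].append((s, o))

def unpack(packages):
    # Lossless round-trip back to plain triples.
    return [(s, p, o) for p, pairs in packages.items() for s, o in pairs]

assert sorted(unpack(packages)) == sorted(triples)
print(dict(packages))
```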

Journal ArticleDOI
TL;DR: By focusing on the timestamp sequence of social bookmarks on web pages, this paper models their activation levels, representing current value, and improves a previously proposed ranking method for web search by introducing the activation-level concept.
Abstract: Social bookmarking services, which let us register and share our own bookmarks on the web, have recently been attracting attention. The services provide structured data (URL, username, timestamp, tag set) that represent user interest in web pages. The number of bookmarks is a barometer of a web page's value. However, even if a web page has many bookmarks, its value is not guaranteed: if most of the bookmarks were posted far in the past, the page may be obsolete. In this paper, by focusing on the timestamp sequence of social bookmarks on web pages, we model their activation levels, representing current value. Further, we improve our previously proposed ranking method for web search by introducing the activation-level concept. Finally, through experiments, we show the effectiveness of the proposed ranking method.
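
One plausible way to realize the activation level described above is exponential decay over bookmark timestamps, so that many recent bookmarks outweigh many old ones; the decay model and half-life are assumptions, not the paper's exact formula.

```python
import math
import time

HALF_LIFE_DAYS = 30.0    # assumed: a bookmark's weight halves every 30 days

def activation_level(timestamps, now=None):
    now = now or time.time()
    lam = math.log(2) / (HALF_LIFE_DAYS * 86400)
    return sum(math.exp(-lam * (now - ts)) for ts in timestamps)

now = time.time()
fresh = [now - d * 86400 for d in (1, 2, 3)]                  # 3 recent bookmarks
stale = [now - d * 86400 for d in (400, 410, 420, 430, 440)]  # 5 old bookmarks
# Fewer but fresher bookmarks yield a higher activation level.
assert activation_level(fresh, now) > activation_level(stale, now)
```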

Book ChapterDOI
01 Apr 2010
TL;DR: An event detection system is proposed that extracts candidate events from satellite images, collects information about them from the Web, and integrates the two; an evaluation showed that the system detected building construction events with appropriate web contents in Tsukuba, Japan.
Abstract: The evolution of computer technologies has made it easy to accumulate and deliver scientific data. The GEO Grid project has collected global satellite images from 2000 to the present, and the collection amounts to about 150 TB. It is necessary to generate new value by integrating satellite images with heterogeneous information such as Web contents or geographical data. Using GEO Grid satellite images, some studies detect feature changes such as earthquakes, fires, and newly constructed buildings. In this paper, detections of feature changes from time-series satellite images are referred to as events, and we focus on events concerning newly constructed buildings. Usually, there are articles about such newly constructed buildings on the Web: a newly opened shopping center is typically introduced in a news report, and a newly constructed apartment is often discussed by neighboring residents. We therefore propose an event detection system that extracts candidate events from satellite images, collects information about them from the Web, and integrates them. This system consists of an event detection module and a Web contents collection module. The event detection module detects geographical regions whose elevation values differ between two temporally separated satellite images. The regions are translated from latitude/longitude into building names using an inverse geocoder. Then, the contents collection module collects Web pages by querying the building names to a search engine. The collected pages are re-ranked based on temporal information close to the event occurrence time. We developed a prototype system, and the evaluation showed that it detected information about building construction events with appropriate web contents in Tsukuba, Japan.
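
The pipeline can be skeletonized as below; every lookup is a hypothetical placeholder (the real system uses GEO Grid imagery, an inverse geocoder, and a web search engine), and only the data flow from elevation diff to time-ranked pages follows the description.

```python
def detect_changed_cells(elev_t0, elev_t1, threshold=5.0):
    # Cells whose elevation rose by more than `threshold` meters.
    return [c for c in elev_t1 if elev_t1[c] - elev_t0.get(c, 0.0) > threshold]

def inverse_geocode(cell):
    return NAMES.get(cell, "unknown building")     # placeholder lookup

def search_web(name):
    return PAGES.get(name, [])                     # placeholder search engine

def rerank_by_time(pages, event_time):
    # Prefer pages whose timestamps lie close to the event occurrence time.
    return sorted(pages, key=lambda p: abs(p["time"] - event_time))

NAMES = {(36.08, 140.11): "Tsukuba Center Mall"}   # made-up building
PAGES = {"Tsukuba Center Mall": [{"url": "a", "time": 90},
                                 {"url": "b", "time": 55}]}

elev_t0 = {(36.08, 140.11): 2.0}
elev_t1 = {(36.08, 140.11): 20.0}
for cell in detect_changed_cells(elev_t0, elev_t1):
    name = inverse_geocode(cell)
    print(name, rerank_by_time(search_web(name), event_time=60))
```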

Proceedings ArticleDOI
08 Nov 2010
TL;DR: This paper proposes an algorithm for extracting complex records such as XML data by utilizing an existing IE technique, points out that a naive implementation does not work well, and proposes an improved scheme for more efficient XML record extraction.
Abstract: Information Extraction (IE) is a technique for extracting structured information (records) from unstructured documents such as Web pages. However, existing techniques basically aim at extracting simple records, such as binary relationships like "(company, location)" or named entities like "(organization)". In this paper, we propose an algorithm for extracting complex records such as XML data by utilizing an existing IE technique. Given a set of seed records in the form of XML data (XML records), we first infer the schema information from the XML records. Then, we transform the XML records into a set of relational records consisting of several tables. The obtained relational tables are decomposed into a set of binary relations, which are forwarded to a record extraction system. We reconstruct XML data from the results obtained from the record extraction system. We point out that a naive implementation does not work well, and propose an improved scheme for more efficient XML record extraction. We evaluate the effectiveness of our proposed algorithm in experiments.
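
A simplified reading of the decomposition step, with made-up tags: the XML record is flattened edge by edge into binary relations, the form a binary-relation extraction system can consume; reconstruction joins the relations back along the inferred schema tree.

```python
import xml.etree.ElementTree as ET
from itertools import count

ids = count()

def decompose(elem, relations, parent_id=None):
    # Each schema edge becomes a binary relation of (parent id, child id);
    # text values become (node id, text) pairs.
    my_id = next(ids)
    if parent_id is not None:
        relations.setdefault(("edge", elem.tag), []).append((parent_id, my_id))
    if elem.text and elem.text.strip():
        relations.setdefault((elem.tag, "text"), []).append((my_id, elem.text.strip()))
    for child in elem:
        decompose(child, relations, my_id)
    return relations

seed = ET.fromstring("<company><name>Acme</name>"
                     "<office><city>Tokyo</city></office></company>")
print(decompose(seed, {}))
```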

Proceedings ArticleDOI
14 Jan 2010
TL;DR: A system to maintain the content integrity of Web sites without backend databases is proposed, along with weak inclusion relationships, which are inclusion relationships associated with inclusion ratios.
Abstract: Today, publishing information on Web sites is common, and the volume of Web contents that must be managed keeps increasing. It is therefore important to maintain content integrity on the Web. This paper proposes a system to maintain the content integrity of Web sites without backend databases. First, we explain the architecture of the proposed system. Second, we address the problem of finding the integrity constraints used as input to the system. We focus on inclusion dependencies among HTML/XML elements and discuss how to find inclusion relationships that can serve as hints for finding inclusion dependencies. In particular, we propose weak inclusion relationships, which are inclusion relationships associated with inclusion ratios. Finally, we propose a filter-based approach to the efficient discovery of weak inclusion relationships and discuss some of its possible implementations.
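
The weak inclusion relationship can be made concrete as an inclusion ratio |A ∩ B| / |A| over element value sets; the brute-force pairing below only defines the measure, whereas the paper's filter-based approach avoids comparing all pairs.

```python
def inclusion_ratio(a_values, b_values):
    a, b = set(a_values), set(b_values)
    return len(a & b) / len(a) if a else 0.0

def weak_inclusions(columns, threshold=0.9):
    # A weak inclusion A -> B holds when most of A's values appear in B.
    return [(na, nb)
            for na, va in columns.items()
            for nb, vb in columns.items()
            if na != nb and inclusion_ratio(va, vb) >= threshold]

cols = {"nav_links": ["/a", "/b", "/c"],
        "site_pages": ["/a", "/b", "/c", "/d"]}
print(weak_inclusions(cols))   # [('nav_links', 'site_pages')]
```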

Book ChapterDOI
17 Sep 2010
TL;DR: A scheme for efficiently detecting functional dependencies in XML data (XFDs) that modifies the basic PipeSort algorithm with a pruning mechanism exploiting the features of XFDs, thereby making the whole process even faster.
Abstract: In this paper we discuss a scheme for efficiently detecting functional dependencies in XML data (XFDs). The ability to detect XFDs is useful in many real-life applications, such as XML schema design, relational schema design based on XML data, and redundancy detection in XML data. However, XFD detection is an expensive task, and an efficient algorithm is essential for dealing with large XML data collections. For this reason, we propose an efficient way to detect XFDs. We assume that the XML data being processed are represented as hierarchically organized relational tables. Given such data, we attempt to detect XFDs existing within and among the tables. Our basic idea is to adopt the PipeSort algorithm, which has been used successfully in OLAP, to detect XFDs within a table. We modify the basic PipeSort algorithm by incorporating a pruning mechanism that takes the features of XFDs into account, making the whole process even faster. Having obtained the set of XFDs existing within tables, we then detect XFDs existing among tables, again using the features of XFDs for pruning. We show the feasibility of our scheme through experiments.
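
The primitive evaluated many times during detection is a functional-dependency check over one table; a brute-force version is shown below (the PipeSort-style sort sharing and the XFD-specific pruning are omitted).

```python
def holds(rows, x_attrs, y_attr):
    # X -> Y holds iff equal X-values never map to different Y-values.
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in x_attrs)
        if key in seen and seen[key] != row[y_attr]:
            return False
        seen[key] = row[y_attr]
    return True

rows = [
    {"dept": "db", "building": "B1", "head": "kitagawa"},
    {"dept": "db", "building": "B1", "head": "kitagawa"},
    {"dept": "ai", "building": "B1", "head": "suzuki"},
]
print(holds(rows, ["dept"], "head"))       # True:  dept -> head
print(holds(rows, ["building"], "head"))   # False: building does not determine head
```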

Journal ArticleDOI
TL;DR: This paper addresses the problem of load balancing in large-scale, self-organizing P2P systems managing multidimensional data, and proposes simple and efficient decentralized mechanisms to evenly distribute the data load among the participating nodes in Content Addressable Networks.
Abstract: Balancing the load in a decentralized P2P system is a challenging problem due to the dynamic nature of such environments and the absence of global knowledge about the actual composition of the system. In this paper, we address the problem of load balancing in large-scale, self-organizing P2P systems managing multidimensional data. We propose simple and efficient decentralized mechanisms to evenly distribute the data load among the participating nodes in Content Addressable Networks. The basic idea is to enable a new node that joins the system to share the load with a heavily loaded node already in the system, such that the load remains evenly distributed among all participating nodes. In the multiple-random-choices method, the new node probes the load of some existing nodes selected uniformly at random, then chooses the heaviest node among them to share the load with. In this paper, we extend this method in three ways. First, the new node probes a pool of nodes proportional to the network size and composition; specifically, the number of probed nodes is logarithmic in the network size. This property achieves a very small, constant load-imbalance factor without the need to estimate the network size. Second, the probed nodes are not selected at random but are well spread over the key space, which enables a good estimation of the actual data distribution and network composition and thus copes well with large-scale data imbalance. Third, the selection of nodes to probe is restricted to the immediate and distant neighbors of a randomly chosen node. The cost incurred by our join-based load-balancing method is very small, since all load information is piggybacked on the periodic maintenance messages exchanged between nodes and their neighbors. Unlike other methods, we neither use an external index nor assume any global knowledge. We also generalize the first method to enable locating a heavily loaded node through a sequential walk starting from a randomly selected node; this new method incurs additional overhead but achieves a much smaller load imbalance. We further study the robustness of our join-based load-balancing method against adversarial attacks and, using simulation, analyze the impact of the number of entry points on load balancing. To the best of our knowledge, we are the first to address this problem. We conduct an experimental study using uniform and nonuniform data distributions to demonstrate the effectiveness and scalability of our proposals.
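
A toy simulation of the join-based balancing rule: a joining node probes roughly log2(n) existing nodes and splits the load of the heaviest one. CAN zones, the spread of probes over the key space, and piggybacked load information are all elided.

```python
import math
import random

def join(loads, rng):
    n_probes = max(1, int(math.log2(len(loads))))
    probes = rng.sample(range(len(loads)), n_probes)
    heaviest = max(probes, key=lambda i: loads[i])
    loads[heaviest] /= 2                 # split the heavy node's zone
    loads.append(loads[heaviest])        # the new node takes the other half

rng = random.Random(7)
loads = [1000.0]                         # the first node owns everything
for _ in range(999):
    join(loads, rng)
print(f"imbalance factor: {max(loads) / (sum(loads) / len(loads)):.2f}")
```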

Proceedings ArticleDOI
17 Dec 2010
TL;DR: This paper proposes a novel framework for a system that improves awareness of the outline of video retrieval results; it detects topics from a set of retrieved videos using time data, analyzes topic properties using author diversity, and offers an interface that lets a person grasp the outline of a video retrieval result at a glance.
Abstract: Video-sharing services, where users can upload videos, have spread rapidly in recent years. The number of videos in these services has increased dramatically, as has the number of users. With this rapid growth, we are faced with having to arrange and categorize videos. However, it is difficult to make a computer aware of the contents (or topics) of videos. This paper focuses on such video-sharing services and proposes a novel framework for a system that improves awareness of the outline of video retrieval results. Our proposed system detects topics from a set of retrieved videos using time data, analyzes topic properties using author diversity, and offers an interface that lets a person grasp the outline of a video retrieval result at a glance. We also present experimental results of applying the proposed methods to several sets of video retrieval results.
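
One hedged realization of the two signals named above (the paper's concrete definitions may differ): group retrieved videos into time-adjacent bursts as candidate topics, then score each topic by author diversity so that topics driven by many independent uploaders stand out.

```python
def detect_topics(videos, max_gap=86400):
    """videos: (timestamp, author) pairs; a gap over max_gap starts a new topic."""
    topics, current = [], []
    for ts, author in sorted(videos):
        if current and ts - current[-1][0] > max_gap:
            topics.append(current)
            current = []
        current.append((ts, author))
    if current:
        topics.append(current)
    return topics

def author_diversity(topic):
    return len({a for _, a in topic}) / len(topic)   # distinct uploaders / videos

videos = [(0, "u1"), (3600, "u2"), (7200, "u1"), (900000, "u3")]
for t in detect_topics(videos):
    print(len(t), "videos, diversity", round(author_diversity(t), 2))
```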