Showing papers by "Katsumi Tanaka published in 2007"


Proceedings ArticleDOI
18 Jun 2007
TL;DR: This paper proposes combining the widely used link-based ranking metric with the one derived using social bookmarking data, and shows the prototype system that implements the proposed approach and presents some preliminary results.
Abstract: Social bookmarking is an emerging type of a Web service that helps users share, classify, and discover interesting resources. In this paper, we explore the concept of an enhanced search, in which data from social bookmarking systems is exploited for enhancing search in the Web. We propose combining the widely used link-based ranking metric with the one derived using social bookmarking data. First, this increases the precision of a standard link-based search by incorporating popularity estimates from aggregated data of bookmarking users. Second, it provides an opportunity for extending the search capabilities of existing search engines. Individual contributions of bookmarking users as well as the general statistics of their activities are used here for a new kind of a complex search where contextual, temporal or sentiment-related information is used. We investigate the usefulness of social bookmarking systems for the purpose of enhancing Web search through a series of experiments done on datasets obtained from social bookmarking systems. Next, we show the prototype system that implements the proposed approach and we present some preliminary results.

213 citations
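To make the combination concrete, below is a minimal sketch of one way a link-based score could be merged with bookmark-derived popularity. The weight alpha, the log damping of bookmark counts, and all values are illustrative assumptions, not details taken from the paper.

```python
# A minimal sketch of combining a link-based ranking metric with
# bookmark-derived popularity; alpha and the damping are assumptions.
import math

def hybrid_score(pagerank: float, bookmark_count: int, alpha: float = 0.5) -> float:
    """Linear combination of a link-based score and bookmark-based popularity."""
    social = math.log1p(bookmark_count)   # damp heavy-tailed bookmark counts
    return alpha * pagerank + (1.0 - alpha) * social

# (url, link-based score, number of bookmarking users) -- toy values
results = [("page-a", 0.8, 12), ("page-b", 0.5, 950), ("page-c", 0.9, 3)]
reranked = sorted(results, key=lambda r: hybrid_score(r[1], r[2]), reverse=True)
print([url for url, _, _ in reranked])  # page-b rises on its strong social signal
```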


Book ChapterDOI
16 Sep 2007
TL;DR: A system to help users determine the trustworthiness of Web search results by computing and showing each returned page's topic majority, topic coverage, locality of supporting pages, and other information is developed.
Abstract: Increased usage of Web search engines in our daily lives means that the trustworthiness of search results has become crucial. This paper describes user studies on the usage of search engines and an analysis of the factors that determine the trust users have in search results. Based on the analysis, we developed a system to help users determine the trustworthiness of Web search results by computing and showing each returned page's topic majority, topic coverage, locality of supporting pages (i.e., pages linked to each search result), and other information. The measures proposed in the paper can be applied to the search of Web-based libraries or can be useful in the usage of digital library search systems.

77 citations


Proceedings ArticleDOI
15 Jan 2007
TL;DR: This paper proposes a novel method for context-aware query refinement based on the mobile user's current geographic location and real-world contexts, which are inferred from the place-names representing the location and from the Web as a knowledge database.
Abstract: Mobile Web search engines, as well as fixed ones, will become much more important for our Web access in the near future. However, mobile computing environments have troublesome limitations compared with traditional desktop computing environments. To solve the so-called "mismatched query problem" caused by the greater shortness and ambiguity of mobile users' original queries, this paper proposes a novel method for context-aware query refinement based on the user's current geographic location and real-world contexts, which are inferred from the place-names representing the location and from the Web used as a knowledge database.

40 citations


Proceedings ArticleDOI
09 Nov 2007
TL;DR: This paper describes a novel concept for detecting approximate creation dates of content elements in Web pages, based on dynamically reconstructing page histories from data extracted from external sources (Web archives) and efficiently searching inside them to detect the insertion dates of content elements.
Abstract: Web pages often contain objects created at different times. The information about the age of such objects may provide useful context for understanding page content and may serve many potential uses. In this paper, we describe a novel concept for detecting approximate creation dates of content elements in Web pages. Our approach is based on dynamically reconstructing page histories using data extracted from external sources (Web archives) and on efficiently searching inside them to detect the insertion dates of content elements. We discuss various issues involving the proposed approach and demonstrate an example application that enhances Web browsing by inserting annotations with temporal metadata into page content on user request.

35 citations
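The "efficiently searching inside them" step can be illustrated with a binary search over dated snapshots of a page: assuming an element stays in the page once inserted, the earliest capture containing it brackets the insertion date. The snapshot representation and the substring containment test below are simplifying assumptions; the paper's reconstruction procedure is more involved.

```python
# A sketch of binary-searching a reconstructed page history for the first
# capture containing an element, assuming the element persists once added.
from datetime import date

def first_appearance(snapshots: list[tuple[date, str]], element: str) -> date | None:
    """snapshots: (capture_date, page_html) pairs sorted oldest-first."""
    lo, hi, found = 0, len(snapshots) - 1, None
    while lo <= hi:
        mid = (lo + hi) // 2
        if element in snapshots[mid][1]:
            found = snapshots[mid][0]   # present here; try earlier captures
            hi = mid - 1
        else:
            lo = mid + 1                # absent here; inserted later
    return found

history = [(date(2005, 3, 1), "<p>old</p>"),
           (date(2006, 7, 9), "<p>old</p><p>breaking news</p>"),
           (date(2007, 1, 2), "<p>old</p><p>breaking news</p>")]
print(first_appearance(history, "breaking news"))  # 2006-07-09
```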


Patent
12 Jan 2007
TL;DR: A page re-ranking system produces, for each Web page returned as a search result for a user's query, a super page combining contents from the page's multiple versions; it analyzes the topic coverage of each super page and re-ranks the pages by comparing the analysis results, as discussed by the authors.
Abstract: A page re-ranking system includes a super page producing part that produces, for each of multiple Web pages obtained as search results for a user's query and assigned a page ranking, a super page in which page contents are combined across the page's multiple versions; a super page analyzing part that analyzes the covering degree of the topic representations contained in each super page; and a re-ranking part that assigns a renewed page ranking to each of the Web pages by comparing the analysis results between the super pages.

34 citations


Book ChapterDOI
16 Jun 2007
TL;DR: A system that helps the user determine the trustworthiness of a statement he or she is unconfident about is proposed, along with a method to estimate popularity from a temporal viewpoint.
Abstract: If the user wants to know the trustworthiness of a proposition, such as whether "the Japanese Prime Minister is Junichiro Koizumi" is true or false, conventional search engines are not appropriate. We therefore propose a system that helps the user determine the trustworthiness of a statement that he or she is unconfident about. In our research, we estimate the trustworthiness of a proposition by aggregating knowledge from the Web and analyzing the creation times of Web pages. We propose a method to estimate popularity from a temporal viewpoint by analyzing how many pages discussed the proposition in a certain period of time and how continuously it appeared on the Web.

33 citations
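A minimal sketch of the temporal idea, under assumed definitions: count the pages mentioning the proposition per month, then combine overall volume with continuity of appearance. The combining formula is an illustrative stand-in for the paper's actual measure.

```python
# Sketch: volume = pages mentioning the proposition, continuity = distinct
# months of appearance; the harmonic-style combination is an assumption.
from collections import Counter
from datetime import date

def temporal_popularity(page_dates: list[date]) -> float:
    by_month = Counter((d.year, d.month) for d in page_dates)
    volume = sum(by_month.values())      # how many pages discussed it overall
    continuity = len(by_month)           # how many distinct months it appeared in
    return volume * continuity / (volume + continuity)

pages = [date(2005, 1, 5), date(2005, 2, 9), date(2005, 2, 20), date(2005, 6, 1)]
print(round(temporal_popularity(pages), 2))  # 1.71
```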


Book ChapterDOI
09 Jan 2007
TL;DR: A system is described that extracts typical visitors' travel routes from blog entries and presents multimedia content relevant to those routes, working as an automatically generated tour guide accessible from a PC or a mobile device.
Abstract: It has recently become a common practice for people to post their sightseeing experiences on weblogs (blogs). Their blog entries often contain valuable information for potential tourists, who can learn about various aspects not found on the official websites of sightseeing spots. Bloggers provide images, videos, and texts regarding the places they visited. This implies that popular travel routes could be extracted from the information available in blogs. In this paper, we describe a system that extracts typical visitors' travel routes from blog entries and presents multimedia content relevant to those routes. Typical travel routes are extracted using a sequential pattern mining method. We also introduce a new user interface for presenting multimedia content along the route in a proactive manner. The system works as an automatically generated tour guide accessible from a PC or a mobile device.

31 citations
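As a toy illustration of the sequential pattern mining step, the sketch below counts ordered, consecutive pairs of sightseeing spots across blog-derived visit sequences and keeps pairs above a support threshold. The spot names are invented; a real miner such as PrefixSpan handles gaps and longer patterns.

```python
# Toy stand-in for sequential pattern mining over blog-derived routes:
# count consecutive spot pairs and keep those with enough support.
from collections import Counter

def frequent_routes(visits: list[list[str]], min_support: int = 2):
    pair_counts = Counter()
    for route in visits:                       # each route: spots in visit order
        pair_counts.update(zip(route, route[1:]))
    return [(pair, n) for pair, n in pair_counts.items() if n >= min_support]

blogs = [["Kiyomizu", "Gion", "Kinkakuji"],
         ["Gion", "Kinkakuji"],
         ["Kiyomizu", "Gion"]]
print(frequent_routes(blogs))  # ('Kiyomizu','Gion') and ('Gion','Kinkakuji'), twice each
```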


Book ChapterDOI
16 Jul 2007
TL;DR: This paper investigates the possibility and potential benefits of a hybrid page ranking approach that combines the ranking criteria of PageRank with one based on social bookmarks in order to improve Web search.
Abstract: Social bookmarking services have recently become popular on the Web. Along with the rapid increase in the amount of social bookmarks, future applications could leverage this data for enhancing Web search. This paper investigates the possibility and potential benefits of a hybrid page ranking approach that combines the ranking criteria of PageRank with one based on social bookmarks in order to improve Web search. We demonstrate and discuss the results of an analytical study conducted to compare both popularity estimates. In addition, we propose a simple hybrid search method that combines both ranking metrics, and we show some preliminary experiments using this approach. We hope that this study will shed new light on the character of data in social bookmarking systems and foster the development of new, effective search applications for the Web.

24 citations


Book ChapterDOI
03 Sep 2007
TL;DR: The method is based on text mining applied to Web search results and obtains a set of pairs of a visual modifier and a component/class for a given object name, which best describe its appearance.
Abstract: This paper presents a method to extract appearance descriptions for a given set of objects. Conversion between an object name and its appearance descriptions is useful for various applications, such as searching for an unknown object, memory recall support, and car/walk navigation. The method is based on text mining applied to Web search results. Using a manually constructed dictionary of visual modifiers, our system obtains a set of pairs of a visual modifier and a component/class for a given object name, which best describe its appearance. The experimental results have demonstrated the effectiveness of our method in discovering appearance descriptions of various types of objects.

21 citations
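A hedged sketch of the pair-extraction step: find a known visual modifier immediately preceding a noun in Web snippets and rank the resulting (modifier, component) pairs by frequency. The tiny dictionary and the adjacency heuristic are assumptions standing in for the paper's dictionary and mining method.

```python
# Sketch: count (visual modifier, component) pairs in Web snippets using a
# small assumed modifier dictionary and a simple adjacency heuristic.
import re
from collections import Counter

VISUAL_MODIFIERS = {"red", "round", "striped", "long"}  # assumed dictionary

def appearance_pairs(snippets: list[str], top_k: int = 3):
    counts = Counter()
    for text in snippets:
        tokens = re.findall(r"[a-z]+", text.lower())
        for mod, comp in zip(tokens, tokens[1:]):
            if mod in VISUAL_MODIFIERS:
                counts[(mod, comp)] += 1
    return counts.most_common(top_k)

snips = ["a red cap and a long tail", "the red cap of the bird", "round eyes"]
print(appearance_pairs(snips))  # ('red', 'cap') counted twice, then the rest
```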


Book ChapterDOI
03 Sep 2007
TL;DR: This paper proposes methods of reranking Web search results that depend on the user's delete and emphasis operations and proposes a method to support deletion and emphasis by using Tag-Clouds.
Abstract: The conventional Web search has two problems. The first is that users' search intentions are diverse. The second is that search engines return a huge number of search results which are not ordered correctly. These problems decrease the accuracy of Web searches. To solve these problems, in our past work, we proposed a reranking system based on the user's search intentions whereby the user edits a part of the search results and the editing operations are propagated to all of the results to rerank them. In this paper, we propose methods of reranking Web search results that depend on the user's delete and emphasis operations. Then, we describe their evaluation. In addition, we propose a method to support deletion and emphasis by using Tag-Clouds.

21 citations


Proceedings ArticleDOI
28 Jan 2007
TL;DR: This paper proposes a temporal filtering system called the Anti-Spoiler system, which analyzes user-requested Web content and uses filters to prevent the display of portions that might spoil the user's enjoyment.
Abstract: This paper proposes a temporal filtering system called the Anti-Spoiler system. The system changes filters dynamically based on user-specified preferences and the user's timetable. It then blocks contents that would spoil the user's enjoyment of previously unwatched content. The system analyzes user-requested Web content and uses filters to prevent the display of portions that might spoil the user's enjoyment. For example, the system hides the final score of a football match from the Web content before the user watches the match on TV.
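A minimal sketch of the temporal filter, under assumed data structures: if the user's timetable says a program has not been watched yet, fragments matching spoiler patterns for that program are masked. The timetable format and the score-like regular expression are illustrative assumptions.

```python
# Sketch: mask spoiler fragments for programs the user has not watched yet.
import re
from datetime import datetime

# Assumed formats: program id -> planned watching time / spoiler regex.
timetable = {"football-final": datetime(2007, 1, 28, 21, 0)}
spoiler_patterns = {"football-final": re.compile(r"\b\d+\s*[-:]\s*\d+\b")}

def filter_page(html: str, now: datetime) -> str:
    for program, watch_time in timetable.items():
        if now < watch_time:  # not watched yet, so mask potential spoilers
            html = spoiler_patterns[program].sub("[hidden]", html)
    return html

page = "Final score: 2-1 after extra time."
print(filter_page(page, datetime(2007, 1, 28, 12, 0)))  # the score is hidden
```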

Book ChapterDOI
12 Sep 2007
TL;DR: Evaluation of FNR showed that the user vectors can be determined by FNR based on the sentiments of the read articles about a topic and that it can provide a unique interface with categories containing the recommended articles.
Abstract: We have developed a news portal site called Fair News Reader (FNR) that recommends news articles with different sentiments for a user in each of the topics in which the user is interested. FNR can detect various sentiments of news articles and determine the sentimental preferences of a user based on the sentiments of the articles the user has previously read. While there are many news portal sites on the Web, such as Google News, Yahoo!, and MSN News, they cannot recommend and present news articles based on the sentiments they are likely to create, since they simply select articles based on whether they contain user-specified keywords. FNR collects and recommends news articles based on the topics in which the user is interested and the sentiments the articles are likely to create. Eight of the sentiments each article is likely to create are represented by an "article vector" with four elements; each element corresponds to a measure consisting of two symmetrical sentiments. The sentiments of the articles previously read with respect to a topic are then extracted and represented as a "user vector". Finally, based on a comparison between the user and article vectors in each topic, FNR recommends articles that have symmetric sentiments against the sentiments of the articles the user has read, for fair reading about the topic. An evaluation of FNR using two experiments showed that the user vectors can be determined based on the sentiments of the read articles about a topic and that FNR can provide a unique interface with categories containing the recommended articles.
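The vector scheme lends itself to a short sketch: four bipolar axes per article, a user vector averaged over read articles, and a recommender that prefers articles whose vector opposes the user's. The axis names and the negative-dot-product criterion are illustrative assumptions, not FNR's exact formulation.

```python
# Sketch of four-element bipolar sentiment vectors; axis names are invented.
AXES = ["happy-sad", "calm-angry", "hopeful-fearful", "approving-critical"]

def user_vector(read_articles: list[list[float]]) -> list[float]:
    """Average the article vectors of previously read articles on a topic."""
    n = len(read_articles)
    return [sum(a[i] for a in read_articles) / n for i in range(len(AXES))]

def fairness_score(user: list[float], article: list[float]) -> float:
    # A more negative dot product means more opposing (symmetric) sentiment,
    # so negate it: a higher score marks a better candidate for fair reading.
    return -sum(u * a for u, a in zip(user, article))

read = [[0.6, 0.1, 0.4, 0.7], [0.8, 0.0, 0.2, 0.5]]
candidates = {"a1": [0.7, 0.1, 0.3, 0.6], "a2": [-0.6, -0.1, -0.3, -0.5]}
u = user_vector(read)
print(max(candidates, key=lambda k: fairness_score(u, candidates[k])))  # "a2"
```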

Book ChapterDOI
16 Jul 2007
TL;DR: A system is described that gathers information from the Web, uses it to create a personal history, and presents it as a chronological table, simplifying the task of sorting out the information for various namesakes and dealing with information in widely scattered sources.
Abstract: We have developed a system for gathering information from the Web, using it to create a personal history, and presenting it as a chronological table. It simplifies the task of sorting out the information for various namesakes and dealing with information in widely scattered sources. The system comprises five components: namesake disambiguation, date expression extraction, date expression normalization and completion, relevant information extraction, and chronological table generation.
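One of the five components, date expression normalization and completion, can be sketched as follows: free-form date strings are mapped to ISO dates, a missing year is inherited from the preceding entry, and events are sorted chronologically. The input patterns and the inheritance rule are assumptions for illustration.

```python
# Sketch of date normalization and completion; assumed input formats are
# "YYYY/M/D" or "M/D" (year inherited from the previous entry).
import re

def normalize(events: list[tuple[str, str]]) -> list[tuple[str, str]]:
    out, last_year = [], None
    for raw, fact in events:
        m = re.match(r"(?:(\d{4})\D+)?(\d{1,2})\D+(\d{1,2})", raw)
        if not m:
            continue                       # unparseable expression: skip
        year = m.group(1) or last_year     # complete a missing year
        if year is None:
            continue
        last_year = year
        out.append((f"{year}-{int(m.group(2)):02d}-{int(m.group(3)):02d}", fact))
    return sorted(out)

rows = [("2001/4/1", "entered university"), ("7/15", "won a prize")]
print(normalize(rows))  # [('2001-04-01', ...), ('2001-07-15', ...)]
```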


Proceedings ArticleDOI
15 Apr 2007
TL;DR: This paper proposes methods of reranking Web search results according to the user's editing operations while browsing the search results, and describes their implementations.
Abstract: This paper proposes methods of reranking Web search results according to the user's editing operations while browsing the search results. It is difficult for search engines to adequately rank search results that satisfy the user's search intention, because users' Web search intentions are diverse. Hence, the user must check search results sequentially or search again with a new query. To solve this problem, we introduce editing operations, such as deletion and emphasis, for search results. When the user deletes a part of the search results, our system degrades search results that include the deleted term or sentence. When the user emphasizes a part of the search results, our system upgrades search results that include the emphasized term or sentence. We describe the implementations of these reranking methods and provide evaluations.
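The delete and emphasis operations suggest a straightforward scoring sketch: results containing a deleted term are degraded, results containing an emphasized term are upgraded, and the engine's original order is kept as a tiebreaker. The unit penalties and boosts are assumptions; the paper propagates edits over terms and sentences in a richer way.

```python
# Illustrative delete/emphasize reranking with assumed unit penalties/boosts.
def rerank(results, deleted=frozenset(), emphasized=frozenset()):
    """results: list of (url, snippet) tuples in the engine's original order."""
    def score(item):
        text = item[1].lower()
        penalty = sum(term in text for term in deleted)    # degrade deleted terms
        boost = sum(term in text for term in emphasized)   # upgrade emphasized terms
        tiebreak = -0.001 * results.index(item)            # keep original order otherwise
        return boost - penalty + tiebreak
    return sorted(results, key=score, reverse=True)

hits = [("u1", "cheap hotels in kyoto"), ("u2", "kyoto temples guide"),
        ("u3", "hotels and temples map")]
print(rerank(hits, deleted={"hotels"}, emphasized={"temples"}))  # u2, u3, u1
```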

Book ChapterDOI
16 Jun 2007
TL;DR: A method for querying inside lecture materials exploiting a textbook ontology, which is automatically constructed from a back-of-the-book index and table of contents, is proposed.
Abstract: One of the main problems of delivering online course materials to learners lies in a deficiency of search capability. We propose a method for querying inside lecture materials that exploits a textbook ontology, which is automatically constructed from a back-of-the-book index and a table of contents. The use of a textbook ontology is two-fold: (i) to help users formulate queries more efficiently, and (ii) to discriminate query results that are relevant to user information needs from those that merely contain query terms but are not informative.

Book ChapterDOI
03 Dec 2007
TL;DR: A Web browser that has an autonomous search capability for complementary information related to a currently browsed page that automatically searches for pages having the complementary information, and shows a keyword map, in which each keyword is a type of hyperlink anchor.
Abstract: In this paper, we propose a Web browser that has an autonomous search capability for complementary information related to the currently browsed page. The system automatically searches for pages having the complementary information and shows a keyword map, in which each keyword is a type of hyperlink anchor. When a user moves or double-clicks a keyword in the keyword map, the system enables navigation from the browsed page to the complementary page, just as if navigating by ordinary hyperlinks. The proposed Web browser is particularly useful for navigating between Web pages that are not connected by ordinary hyperlinks, in order to compare them.

Book ChapterDOI
21 Jun 2007
TL;DR: This paper aims at building truly Secure Spaces that protect both freedoms, enhancing the previously proposed model.
Abstract: In public places, information can be accessed by unauthorized users, while we are sometimes unexpectedly exposed to information we do not want. We have introduced the concept of Secure Spaces, physical environments where any information is always protected from unauthorized users, and have proposed a content-based entry control model and an architecture for Secure Spaces that protect their contents' freedom of information delivery but do not protect their visitors' freedom of information access. This paper aims at building truly Secure Spaces that protect both freedoms, and it enhances our proposed model.

Journal ArticleDOI
TL;DR: The issues involved with mining Web archive data are described, several concepts related to collecting and analysing the historical content of Web pages are discussed, and two knowledge discovery tasks are introduced: temporal summarization and object history detection.
Abstract: While much attention has recently focused on preserving the past content of the Web, there is still a lack of efficient tools for utilizing data stored in Web archives. Web archives constitute large data sources that could be extensively analysed and mined for knowledge discovery. In this paper, we describe the issues involved with mining Web archive data. We discuss several concepts related to collecting and analysing the historical content of Web pages and briefly describe two knowledge discovery tasks: temporal summarization and object history detection.


Proceedings ArticleDOI
03 Sep 2007
TL;DR: A new search index generation technique is presented that collects fields of office documents which are previously mapped to XML schema elements, and generates an XML-based search index including document types, file locations and fields collected from the documents and their semantic notations.
Abstract: In this paper, we present a new search index generation technique that collects fields of office documents previously mapped to XML schema elements and generates an XML-based search index including document types, file locations, and the fields collected from the documents together with their semantic notations. The search index allows users to effectively find target documents by specifying the document type and field values collected from the documents. In order to allow the search index to be securely shared by many users, an access control policy is unified into the search index. We also propose an algorithm that computes the answer to a search query according to the user's access permissions.
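A hedged sketch of what such an index generation step might produce: one XML entry per office document carrying its type, location, extracted fields, and an access-control list that a query processor could check against the user's permissions. The element and attribute names are assumptions; the paper defines its own schema.

```python
# Sketch of generating an XML-based search index with unified access control;
# element/attribute names are illustrative, not the paper's schema.
import xml.etree.ElementTree as ET

def build_index(docs: list[dict]) -> ET.Element:
    root = ET.Element("searchIndex")
    for d in docs:
        entry = ET.SubElement(root, "document",
                              type=d["type"], location=d["path"])
        for name, value in d["fields"].items():
            ET.SubElement(entry, "field", name=name).text = value
        acl = ET.SubElement(entry, "acl")  # unify access control into the index
        for user in d["allowed"]:
            ET.SubElement(acl, "allow", user=user)
    return root

docs = [{"type": "invoice", "path": "file:///docs/inv-042.docx",
         "fields": {"customer": "ACME", "total": "1200"}, "allowed": ["alice"]}]
print(ET.tostring(build_index(docs), encoding="unicode"))
```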

Book ChapterDOI
Katsumi Tanaka
26 Jun 2007
TL;DR: The research on fusion of Web and TV broadcasting contents conducted by Kyoto University 21st COE program and NICT communication/broadcasting contents fusion project is described and several ways to acquire information from diverse information sources of different media types, especially from Web and television broadcasting are proposed.
Abstract: We describe the research on the fusion of Web and TV broadcasting contents conducted by the Kyoto University 21st COE program and the NICT communication/broadcasting contents fusion project. Despite much talk about the fusion of broadcasting and the Internet, no technology has been established for fusing Web and TV program content. We proposed several ways to acquire information from diverse information sources of different media types, especially from the Web and TV broadcasting. A notable difference between Web contents and TV program contents is that the former is a document-based information medium and the latter is a time-based continuous information medium, which leads to different information access methods. Conventional "Web browsing" is an active manner of accessing information. On the other hand, conventional "TV watching" is a passive way of accessing information. In order to search, integrate, and view the information of the Web and TV, we explored (1) media conversion between Web and TV contents, (2) watching TV with live chats, (3) dynamic TV-content augmentation by the Web, and (4) searching for TV contents with the Web.

Book ChapterDOI
09 Jan 2007
TL;DR: This paper proposes a method of adjusting the way a map interface is presented by estimating the users' intentions based on their operation history, and focuses on the trajectory, a series of panning operations, and discusses the inference of users' implicit intentions.
Abstract: Advances in dynamic map interfaces have turned maps into interactive media. These dynamic interfaces respond to users' operations in real time and present fully visualized geographic information. However, the current systems have only reacted to explicitly specified user intentions. For example, users have been required to elaborately specify visible layers to fully utilize a map interface. In contrast, we propose a method of adjusting the way a map interface is presented by estimating the users' intentions based on their operation history. By reducing their operations, the system facilitates the use of maps, especially for novices. It is especially effective in online or mobile map interfaces, where it is difficult to adjust the presentation of the map interface due to the limited bandwidth and the size of the interface. This paper specifically focuses on the trajectory, which is a series of panning operations, and discusses our inference of users' implicit intentions.

Book ChapterDOI
20 Aug 2007 (Contexts)
TL;DR: This paper proposes a method of context-aware querying in mobile/ubiquitous Web searches, which provides mobile users with four capabilities: Context-aware keyphrase inference to help them input a keyphrase as a part of their keyword-based query, context- aware subtopic tree generation, discovery of comparable keyphrases to their original query, and meta vertical search focused on one subtopic to make the retrieval results more precise.
Abstract: This paper proposes a method of context-aware querying in mobile/ubiquitous Web searches, which provides mobile users with four capabilities: (1) context-aware keyphrase inference to help them input a keyphrase as a part of their keyword-based query, (2) context-aware subtopic tree generation to help them specify their information demand on one subtopic, (3) discovery of comparable keyphrases to their original query to help them make better decisions, and (4) meta vertical search focused on one subtopic to make the retrieval results more precise.

Book ChapterDOI
09 Jan 2007
TL;DR: Three types of hyperlinks that are helpful in enhancing user comprehension of events are proposed: context, precondition and causal, which are based on an analysis of Fellbaum's verb entailments.
Abstract: The objective of this paper is to help users navigate to additional information for better comprehension of events in video through an explanation-on-demand hypervideo system. Given an XML database consisting of MPEG-7 video annotations and ontologies specifying relationships among events in annotations, the system is responsible for dynamically identifying hyperlinks inside a video that users can follow to clarify the points they do not understand. Three types of hyperlinks that are helpful in enhancing user comprehension of events are proposed: context, precondition, and causal, based on an analysis of Fellbaum's verb entailments [4].

Book ChapterDOI
09 Jan 2007
TL;DR: A new data model is developed that provides operations for the composition, searching, navigation, and playback of ODV data based on video algebra and spatial properties.
Abstract: Omni-Directional Video (ODV), which is recorded using an omni-directional camera, has become widely used because of recent advancements in digital video technologies and photographic equipment. In ODV, as multiple subjects are recorded by the camera, keyword annotation is practically impossible. Furthermore, the basic concepts of indexing and retrieval of traditional video databases, such as frame and shot, are not applicable to ODV data. Therefore, the properties of ODV must be discovered and basic concepts must be defined to find effective ways of indexing and retrieving such video data. We developed a new data model that provides operations for the composition, searching, navigation, and playback of ODV data based on video algebra and spatial properties.

Proceedings ArticleDOI
Y. Kabutoya, Takayuki Yumoto, Satoshi Oyama, Keishi Tajima, Katsumi Tanaka
15 Apr 2007
TL;DR: A new system to complement TV contents by Web pages, called "TV contents spectrum analyzer," is implemented, which visualizes the degrees of generality and social acceptance of TV contents using WWW.
Abstract: It is difficult to watch TV contents in an active manner, such that the user can interactively select TV contents, because TV is originally a broadcast information medium. It is also difficult for users to judge whether the information in TV contents is valid, because conventional TV contents are not directly linked with related or evidential information. One method to cope with these problems is to provide complementary or comparative information on TV contents obtained from other media, such as the Web. In our research, using the topic structure proposed by Ma et al., we evaluated the quality of TV contents and visualized it. In this paper, we defined "contents coverage," "generality," and "social acceptance" as aspects of TV contents' quality, and examined to what extent Web pages contain information complementary to TV contents. We also implemented a new system that complements TV contents with Web pages, called the "TV contents spectrum analyzer," which visualizes the degrees of generality and social acceptance of TV contents using the WWW.

Journal Article
TL;DR: This technique provides a means of generating an XML-based search index that allows users to effectively find target office documents using search conditions whose definitions are based on document types, search terms, and term descriptions.
Abstract: Office applications are becoming a major pillar of today's organizations, since they are used to edit a vast amount of digital documents. Finding office documents that fit users' needs in large databases is becoming increasingly important. Broad one- or two-word searches in conventional search engines are often plagued by low precision, returning many irrelevant documents as their output. In order to solve this problem, we propose a technique that allows users to define search terms inside office documents, along with term descriptions presenting semantic relationships between the office documents and their search terms. This technique provides a means of generating an XML-based search index that allows users to effectively find target office documents using search conditions whose definitions are based on document types, search terms, and term descriptions. We also present the schema of the proposed search index, which allows users to effectively search office documents of various types, and a search framework that uses the proposed technique.

Proceedings ArticleDOI
09 Nov 2007
TL;DR: This paper proposes a novel technique for browsing Web pages called "Edit-and-Propagate" operation-based browsing, where editing is regarded as the third operation for interacting with the WWW, after conventional clicking and scrolling, toward realizing the Editable Web Browser.
Abstract: This paper proposes a novel technique for browsing Web pages called "Edit-and-Propagate" operation-based browsing, where editing is regarded as the third operation for interacting with the WWW, after conventional clicking and scrolling, toward realizing the Editable Web Browser. Our method enables users to delete or emphasize any portion of a browsed Web page at any time and modifies the page by propagating the edit operation. For example, the user can easily delete almost any uninteresting portion of a Web page merely by deleting an example of an unwanted portion. While browsing a Web search result page, the user can rerank the search results by deleting an unwanted term or by emphasizing an important term. In this paper, we describe the concept of "Edit-and-Propagate" based browsing and the implementation of our prototypes. We then describe the results of our evaluation, which demonstrate the usefulness of our system.

Journal ArticleDOI
TL;DR: This paper uses both text information from closed captions and visual information from video frames to generate links to enable users to easily explore not only the original video content but also augmented information from the Web.
Abstract: As the amount of recorded TV content is increasing rapidly, people need active and interactive browsing methods. In this paper, we use both text information from closed captions and visual information from video frames to generate links that enable users to easily explore not only the original video content but also augmented information from the Web. This solution is especially advantageous when the video content cannot be fully represented by closed captions. A prototype system was implemented and experiments were carried out to prove its effectiveness and efficiency.