
Showing papers by "Geun-Sik Jo" published in 2008


Book ChapterDOI
03 Sep 2008
TL;DR: A methodology of WordNet-based distance measures is proposed, and the meanings of concepts from upper ontologies are applied to an ontology integration process by providing a semantic network called OnConceptSNet.
Abstract: While there is a large body of previous work on WordNet-based methods for finding the semantic similarity of concepts and words, the application of these word-oriented methods to ontology integration tasks has not yet been explored. In this paper, we propose a methodology of WordNet-based distance measures, and we apply the meanings of concepts from upper ontologies to an ontology integration process by providing a semantic network called OnConceptSNet. It is a semantic network of ontology concepts in which the relations between concepts are derived from the upper ontology WordNet. We also describe a methodology for handling conflicts in the ontology integration process.
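
The abstract gives no implementation details; as a rough illustration of the WordNet-based distance idea (not the paper's OnConceptSNet construction), the similarity of two ontology concept labels might be computed with NLTK's WordNet interface as sketched below. The function name and the choice of Wu-Palmer similarity are assumptions for illustration.

```python
# Sketch: WordNet-based similarity between two ontology concept labels.
# Assumes NLTK with the WordNet corpus installed (nltk.download('wordnet')).
from nltk.corpus import wordnet as wn

def concept_similarity(label_a, label_b):
    """Return the best Wu-Palmer similarity between any noun senses
    of the two concept labels (0.0 if no senses are found)."""
    senses_a = wn.synsets(label_a, pos=wn.NOUN)
    senses_b = wn.synsets(label_b, pos=wn.NOUN)
    best = 0.0
    for sa in senses_a:
        for sb in senses_b:
            score = sa.wup_similarity(sb) or 0.0
            best = max(best, score)
    return best

if __name__ == "__main__":
    # Two concept labels that might appear in ontologies to be integrated.
    print(concept_similarity("car", "automobile"))   # high similarity
    print(concept_similarity("car", "professor"))    # low similarity
```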

37 citations


Proceedings Article
01 Jan 2008
TL;DR: This paper applies word-oriented methods to ontology integration tasks in which a noun phrase is analyzed to identify its head noun, which helps avoid incorrect relations between entities, and proposes a collaborative acquisition algorithm that combines WordNet-based measures with a text corpus.
Abstract: Most information in the world exists in the form of text, such as news articles and web pages. Different lines of research have been conducted to discover, understand and access knowledge about real-world entities and relations from text. However, the application of these word-oriented methods to ontology integration tasks has not yet been explored. In this paper, we apply these word-oriented methods to ontology integration tasks in which we analyze a noun phrase (NP) to identify its head noun, which helps avoid incorrect relations between entities. We also propose a collaborative acquisition algorithm that combines WordNet-based measures with a text corpus.
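
As a rough sketch of the head-noun step (not the authors' collaborative acquisition algorithm), the head of a simple English noun phrase can often be approximated as its rightmost noun after part-of-speech tagging; the heuristic below, using NLTK, is an illustrative assumption.

```python
# Sketch: approximate the head noun of a simple English noun phrase as its
# rightmost noun token. Requires NLTK tokenizer and tagger models
# (nltk.download('punkt'), nltk.download('averaged_perceptron_tagger')).
import nltk

def head_noun(noun_phrase):
    tokens = nltk.word_tokenize(noun_phrase)
    tagged = nltk.pos_tag(tokens)
    nouns = [word for word, tag in tagged if tag.startswith("NN")]
    return nouns[-1] if nouns else None

if __name__ == "__main__":
    print(head_noun("semantic web service technology"))  # -> 'technology'
    print(head_noun("ontology integration task"))        # -> 'task'
```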

16 citations


Proceedings ArticleDOI
09 Dec 2008
TL;DR: In this paper, the authors apply word-oriented methods to ontology integration tasks in which they analyze a noun phrase (NP) to identify its head noun, which helps avoid incorrect relations between entities.
Abstract: Most information in the world exists in the form of text, such as news articles and web pages. Different lines of research have been conducted to discover, understand and access knowledge about real-world entities and relations from text. However, the application of these word-oriented methods to ontology integration tasks has not yet been explored. In this paper, we apply these word-oriented methods to ontology integration tasks in which we analyze a noun phrase (NP) to identify its head noun, which helps avoid incorrect relations between entities. We also propose a collaborative acquisition algorithm that combines WordNet-based measures with a text corpus.

15 citations


Proceedings ArticleDOI
10 Jul 2008
TL;DR: An adaptive learning system that filters spam emails based on the user's action patterns over time, considering relationships between actions such as which action follows another and how long it takes.
Abstract: With the continuous increase of spam, 92.6% of all recent email is known to be spam. In this research, we present an adaptive learning system that filters spam emails based on the user's action patterns over time. We consider relationships between a user's actions, such as which action is taken after another and how long it takes. We analyze how much meaning each action carries and how it affects the filtering of spam emails; this in turn determines a weight for each email. In our experiments, we compare the results of the proposed system with a weighted Bayesian classifier using a real email data set. We also show how to handle personalization with respect to concept drift and adaptive learning.
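
The abstract describes weighting each email by the user's actions and their timing but gives no formula; the following sketch shows one way such an action-based weight could be computed. The action names, base weights, and time decay are assumptions for illustration, not the authors' model.

```python
# Sketch: derive a spam weight for an email from user actions and their timing.
# Action names, base weights, and the time decay are illustrative assumptions.
import math

# Positive values push toward "spam", negative toward "ham".
ACTION_WEIGHTS = {
    "delete_without_reading": 1.0,
    "mark_as_spam": 2.0,
    "reply": -2.0,
    "move_to_folder": -1.0,
}

def spam_weight(actions):
    """actions: list of (action_name, seconds_until_action) pairs.
    Quicker reactions are assumed to carry more meaning, so each
    base weight is scaled by an exponential time decay."""
    total = 0.0
    for name, seconds in actions:
        base = ACTION_WEIGHTS.get(name, 0.0)
        decay = math.exp(-seconds / 3600.0)  # one-hour decay constant (assumed)
        total += base * decay
    return total

if __name__ == "__main__":
    print(spam_weight([("delete_without_reading", 30)]))  # strongly spam-like
    print(spam_weight([("reply", 600)]))                   # ham-like
```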

10 citations


Proceedings ArticleDOI
10 Jul 2008
TL;DR: This paper presents a framework for displaying synchronized text around a speaker in video, which identifies speakers using face detection technologies and subsequently detects a subtitle region, and adapts DFXP, the interoperable timed-text format of the W3C, to support interchange with existing legacy systems.
Abstract: With the increasing popularity of online video, efficient captioning and displaying of the captioned text (subtitles) have also become accessibility issues. However, in most cases, subtitles are shown on a separate display below the screen. As a result, some viewers miss condensed information about the contents of the video. To improve readability and visibility for viewers, in this paper we present a framework for displaying synchronized text around a speaker in video. The proposed approach first identifies speakers using face detection technologies and subsequently detects a subtitle region. In addition, we adapt DFXP, the interoperable timed-text format of the W3C, to support interchange with existing legacy systems. In order to achieve smooth playback of multimedia presentations, such as SMIL and DFXP, a prototype system, namely MoNaPlayer, has been implemented. Our case studies show that the proposed system is feasible for several multimedia applications.
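
The face-detection step is only named in the abstract; a minimal sketch of detecting a speaker's face with OpenCV's Haar cascade and choosing a subtitle region just below it might look like the following. The cascade file and placement rule are assumptions, and the DFXP handling and MoNaPlayer player are not reproduced here.

```python
# Sketch: detect a face in a video frame and place a subtitle region below it.
# Uses OpenCV's bundled Haar cascade; the placement rule is an assumption.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def subtitle_region(frame):
    """Return (x, y, w, h) of a region just below the first detected face,
    or None if no face is found."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    frame_h = frame.shape[0]
    region_y = min(y + h + 10, frame_h - 40)  # keep the region inside the frame
    return (x, region_y, w, 40)

if __name__ == "__main__":
    cap = cv2.VideoCapture("example.mp4")  # hypothetical input video
    ok, frame = cap.read()
    if ok:
        print(subtitle_region(frame))
    cap.release()
```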

8 citations


Proceedings ArticleDOI
Taeho Jo, Geun-Sik Jo
10 Jul 2008
TL;DR: The goal of this research is to improve the performance of the single pass algorithm for text clustering by modifying it into a specialized version in which documents are encoded not as numerical vectors but in alternative forms.
Abstract: This research proposes a modified version of the single pass algorithm that is specialized for text clustering. Encoding documents into numerical vectors for the traditional version of the single pass algorithm causes two main problems: huge dimensionality and sparse distribution. Therefore, in order to address these two problems, this research modifies the single pass algorithm into a version in which documents are encoded not as numerical vectors but in alternative forms. In the proposed version, documents are mapped into tables, and the similarity of two documents is computed by comparing their tables with each other. The goal of this research is to improve the performance of the single pass algorithm for text clustering through this specialized version.
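
The abstract does not spell out the table encoding or the similarity; the sketch below shows single pass clustering in which each document is represented as a word-frequency table and similarity is the overlap of two tables. The encoding, the overlap measure, and the threshold are illustrative assumptions rather than the paper's exact definitions.

```python
# Sketch: single pass clustering over documents encoded as word-frequency tables.
# The table encoding, overlap similarity, and threshold are assumptions.
from collections import Counter

def to_table(text):
    """Encode a document as a word-frequency table (instead of a numerical vector)."""
    return Counter(text.lower().split())

def table_similarity(t1, t2):
    """Overlap of shared words, normalised by the smaller table size."""
    shared = sum(min(t1[w], t2[w]) for w in t1 if w in t2)
    return shared / max(1, min(sum(t1.values()), sum(t2.values())))

def single_pass(documents, threshold=0.3):
    clusters = []  # each cluster: {'table': merged table, 'members': [indices]}
    for i, doc in enumerate(documents):
        table = to_table(doc)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = table_similarity(table, cluster["table"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(i)
            best["table"] += table  # merge tables as the cluster representative
        else:
            clusters.append({"table": table, "members": [i]})
    return [c["members"] for c in clusters]

if __name__ == "__main__":
    docs = ["stocks fell on wall street",
            "wall street stocks rallied",
            "the team won the football match"]
    print(single_pass(docs))  # e.g. [[0, 1], [2]]
```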

7 citations


Proceedings ArticleDOI
10 Jul 2008
TL;DR: In this article, the authors apply semantic web service technology, which provides a promising common interoperable framework in which information is given well-defined meaning in unambiguous and machine-interpretable form by using ontology such that data and services can be used for more effective discovery, automation, integration, and reuse across various applications.
Abstract: In logistics, there is a variety of available data formats. This makes it difficult to quickly implement a system that communicates with other application systems. Furthermore, once a system that can handle mutually agreed data formats has been implemented, a considerable amount of effort is still required to reformat the data for use in other services, such as those that allow shippers to monitor and track their freight on the Web. To overcome these problems, we apply semantic Web service technology, which provides a promising common interoperable framework in which information is given well-defined meaning in an unambiguous and machine-interpretable form by using ontologies, such that data and services can be used for more effective discovery, automation, integration, and reuse across various applications. Finally, we show the reasonableness of adopting semantic Web services through a case study.
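
The approach is described only at a high level; as a small illustration of giving logistics data well-defined, machine-interpretable meaning with an ontology, the snippet below builds and queries a tiny RDF graph with rdflib. The namespace, class, and property names are invented for illustration and are not the paper's ontology.

```python
# Sketch: describe a freight shipment with an ontology-style RDF graph and query it.
# The namespace and property names are illustrative assumptions.
from rdflib import Graph, Namespace, Literal, RDF

LOG = Namespace("http://example.org/logistics#")  # hypothetical namespace

g = Graph()
g.add((LOG.shipment42, RDF.type, LOG.Shipment))
g.add((LOG.shipment42, LOG.origin, Literal("Incheon")))
g.add((LOG.shipment42, LOG.destination, Literal("Rotterdam")))
g.add((LOG.shipment42, LOG.status, Literal("in transit")))

# A tracking service can query the shared model instead of parsing ad-hoc formats.
query = """
PREFIX log: <http://example.org/logistics#>
SELECT ?shipment ?status WHERE {
    ?shipment a log:Shipment ;
              log:status ?status .
}
"""
for shipment, status in g.query(query):
    print(shipment, status)
```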

3 citations


Proceedings ArticleDOI
10 Jul 2008
TL;DR: A new automatic Web information extractor called the 'catch crawler', which uses style sheets to extract interesting data from a target site and achieves over 90% accuracy on average.
Abstract: Datasets should be free from noise in order to carry out Web mining tasks well. Commercial Web pages generally contain a lot of noise that is not relevant to the main content, such as navigation panels, advertisements, copyright notices, or other service links. In this paper, we present a new automatic Web information extractor called the 'catch crawler', which uses style sheets to extract interesting data from a target site. Style sheets are generally used for the uniform presentation of Web pages in a commercial Web site. To run the catch crawler, a user indicates the interesting data area by clicking the data on a Web page. The catch crawler automatically identifies the style-sheet class of that data and generates a dataset from the whole Web site by following the same style-sheet class. Experimental results show that our approach for extracting noiseless Web data achieves over 90% accuracy on average.
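
The catch crawler itself is not available; the sketch below illustrates the underlying idea of extracting every element that shares the style-sheet (CSS) class of a user-selected element, using BeautifulSoup. The URL and class name are placeholders, not from the paper.

```python
# Sketch: extract all page elements that share the CSS class of a selected element.
# The URL and class name are placeholders, not from the paper.
import requests
from bs4 import BeautifulSoup

def extract_by_class(url, css_class):
    """Return the text of every element carrying the given style-sheet class."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    return [el.get_text(strip=True) for el in soup.find_all(class_=css_class)]

if __name__ == "__main__":
    # In the paper's scenario the class would be learned from the element the
    # user clicked; here it is hard-coded for illustration.
    items = extract_by_class("https://example.com/products", "product-title")
    for item in items:
        print(item)
```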

2 citations


Book ChapterDOI
02 Dec 2008
TL;DR: This work proposes a collaborative approach to user modeling for generating personalized recommendations; the approach first discovers useful and meaningful user patterns, and then enriches a personal model through collaboration with other similar users.
Abstract: Recommender systems, which have emerged in response to the problem of information overload, provide users with recommendations of content that is likely to fit their needs. One notable challenge in a recommender system is the cold start problem. To address this issue, we propose a collaborative approach to user modeling for generating personalized recommendations for users. Our approach first discovers useful and meaningful user patterns, and then enriches a personal model with collaboration from other similar users. In order to evaluate the performance of our approach, we compare experimental results with those of a probabilistic learning model, user-based collaborative filtering, and a vector space model. We present experimental results that show how our model performs better than existing work.
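
As a rough illustration of enriching a sparse (cold start) user model with similar users' patterns (not the authors' exact model), the snippet below blends a new user's preference vector with the profiles of the most similar existing users by cosine similarity. The neighbourhood size and blending weight are assumptions.

```python
# Sketch: enrich a sparse user profile with the profiles of similar users.
# The neighbourhood size and blending weight are illustrative assumptions.
import numpy as np

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def enrich_profile(target, others, k=2, alpha=0.5):
    """target: item-preference vector of the new user (mostly zeros).
    others: matrix of existing users' preference vectors.
    Returns a blend of the target profile and its k nearest neighbours."""
    sims = np.array([cosine(target, u) for u in others])
    nearest = others[np.argsort(sims)[::-1][:k]]
    neighbour_model = nearest.mean(axis=0)
    return alpha * target + (1 - alpha) * neighbour_model

if __name__ == "__main__":
    users = np.array([[5, 4, 0, 1],
                      [4, 5, 1, 0],
                      [0, 1, 5, 4]], dtype=float)
    new_user = np.array([5, 0, 0, 0], dtype=float)  # almost no history
    print(enrich_profile(new_user, users))
```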

2 citations


01 Jan 2008
TL;DR: A new data structure, called a Frequent Pattern Network (FPN), is proposed; it represents items as vertices and 2-itemsets as edges of the network and generates association rules based on clusters.
Abstract: Data mining is defined as the process of discovering meaningful and useful patterns in large volumes of data. In particular, finding association rules between items in a database of customer transactions has become important. Since the Apriori algorithm, several data structures and algorithms have been proposed for storing meaningful information compressed from the original database in order to find frequent itemsets. Although existing methods find all association rules, analyzing them requires a great deal of effort because there are too many rules. In this paper, we propose a new data structure, called a Frequent Pattern Network (FPN), which represents items as vertices and 2-itemsets as edges of the network. To utilize the FPN, we construct it using item frequencies. We then use a clustering method to group the vertices of the network into clusters so that intra-cluster similarity is maximized and inter-cluster similarity is minimized, and we generate association rules based on the clusters. Our experiments evaluate the accuracy of clustering items on the network using confidence, correlation, and edge-weight similarity measures, and we compare the association rules generated from the clusters with those of the traditional method. From the results, the confidence similarity had a stronger influence than the others on the Frequent Pattern Network, and the FPN was flexible with respect to the minimum support value.
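
The FPN construction is described only in outline; the sketch below builds a small item graph with networkx in which vertices are items and edges are 2-itemsets weighted by co-occurrence counts (confidence or correlation could be substituted, as in the paper). The thresholded connected-component grouping stands in for the paper's clustering step and is an assumption.

```python
# Sketch: build a Frequent Pattern Network-style graph from transactions.
# Edge weights are raw 2-itemset counts; the component grouping is an assumption.
from itertools import combinations
import networkx as nx

transactions = [
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"beer", "diapers"},
    {"beer", "diapers", "chips"},
]

G = nx.Graph()
for items in transactions:
    for a, b in combinations(sorted(items), 2):
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1
        else:
            G.add_edge(a, b, weight=1)

# Keep only sufficiently frequent 2-itemsets, then group items.
min_support = 2
strong = nx.Graph()
strong.add_edges_from(
    (u, v, d) for u, v, d in G.edges(data=True) if d["weight"] >= min_support)
clusters = list(nx.connected_components(strong))
print(clusters)  # e.g. [{'bread', 'milk'}, {'beer', 'diapers'}]
```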

1 citation


Proceedings ArticleDOI
Taeho Jo, Geun-Sik Jo
10 Jul 2008
TL;DR: The goal of the research is to improve the performance of text categorization by solving the two problems of huge dimensionality and sparse distribution.
Abstract: This research proposes an alternative to machine learning based approaches for categorizing news articles given as plain texts. In order to use a machine learning based approach for this task, documents must be encoded into numerical vectors; this causes two problems: huge dimensionality and sparse distribution. The proposed approach is intended to address these two problems. In other words, the two problems are avoided by encoding a document or documents into a table instead of numerical vectors. Therefore, the goal of the research is to improve the performance of text categorization by solving the two problems.
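
Similar in spirit to the table-based clustering sketch shown earlier in this listing, the following illustrates the categorization side: each category keeps a merged word-frequency table built from its training documents, and a new article is assigned to the category whose table overlaps it most. The encoding, overlap measure, and example labels are illustrative assumptions, not the paper's exact definitions.

```python
# Sketch: categorize news articles using table (word-frequency) encodings.
# Category labels, training texts, and the overlap similarity are illustrative.
from collections import Counter

def to_table(text):
    return Counter(text.lower().split())

def table_similarity(t1, t2):
    shared = sum(min(t1[w], t2[w]) for w in t1 if w in t2)
    return shared / max(1, min(sum(t1.values()), sum(t2.values())))

def train(labeled_docs):
    """Merge each category's training documents into one representative table."""
    categories = {}
    for label, text in labeled_docs:
        categories.setdefault(label, Counter()).update(to_table(text))
    return categories

def categorize(text, categories):
    table = to_table(text)
    return max(categories, key=lambda c: table_similarity(table, categories[c]))

if __name__ == "__main__":
    training = [("sports", "the team won the football match"),
                ("finance", "stocks fell on wall street")]
    model = train(training)
    print(categorize("wall street stocks rallied today", model))  # -> 'finance'
```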

01 Mar 2008
TL;DR: In this article, a novel method is presented that uses a diversity metric to select dissimilar items among the items recommended by collaborative filtering; these items, together with the input, are fed into the content space to improve the recommendation and to include new items.
Abstract: Combining collaborative filtering with some other technique is most common in hybrid recommender systems. As many of the items recommended by collaborative filtering tend to be similar with respect to content, the collaborative-content hybrid system suffers in terms of recommendation quality and in recommending new items. To alleviate this problem, we have developed a novel method that uses a diversity metric to select dissimilar items among the items recommended by collaborative filtering; these items, together with the input, are fed into the content space, which lets us improve the recommendation and include new items. We present experimental results on the MovieLens dataset that show how our approach performs better than a simple content-based system and a naive hybrid system.
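
The diversity metric itself is not given in the abstract; as one illustration of the selection step, a greedy procedure can pick items from a collaborative filtering list that are maximally dissimilar (by cosine distance over content features) to those already chosen. The feature vectors and the greedy rule are assumptions, not the authors' exact metric.

```python
# Sketch: greedily pick dissimilar items from a collaborative-filtering list.
# Content feature vectors and the greedy rule are illustrative assumptions.
import numpy as np

def cosine_distance(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 1.0 - float(a @ b / denom) if denom else 1.0

def diverse_subset(candidates, features, k=2):
    """candidates: item ids ranked by collaborative filtering.
    features: dict of item id -> content feature vector.
    Greedily keeps the item farthest from everything already selected."""
    selected = [candidates[0]]  # always keep the top CF recommendation
    while len(selected) < k:
        best = max(
            (c for c in candidates if c not in selected),
            key=lambda c: min(cosine_distance(features[c], features[s])
                              for s in selected))
        selected.append(best)
    return selected

if __name__ == "__main__":
    feats = {"A": np.array([1.0, 0.0]), "B": np.array([0.9, 0.1]),
             "C": np.array([0.0, 1.0])}
    print(diverse_subset(["A", "B", "C"], feats, k=2))  # -> ['A', 'C']
```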