Author

Shojiro Nishio

Other affiliations: University of Tokyo, Kyoto University

Bio: Shojiro Nishio is an academic researcher at Osaka University. The author has contributed to research on mobile computing and wireless sensor networks. The author has an h-index of 33 and has co-authored 487 publications receiving 4875 citations. Previous affiliations of Shojiro Nishio include the University of Tokyo and Kyoto University.

Papers published on a yearly basis (chart, 1988 to 2018)

Papers

Proceedings Article•DOI•

H-mine: hyper-structure mining of frequent patterns in large databases

Jian Pei¹, Jiawei Han², Hongjun Lu³, Shojiro Nishio⁴, Shiwei Tang¹, Dongqing Yang¹•Institutions (4)

Peking University¹, Simon Fraser University², Hong Kong University of Science and Technology³, Osaka University⁴

29 Nov 2001

TL;DR: This study shows that H-mine has high performance in various kinds of data, outperforms the previously developed algorithms in different settings, and is highly scalable in mining large databases.

Abstract: Methods for efficient mining of frequent patterns have been studied extensively by many researchers. However, the previously proposed methods still encounter some performance bottlenecks when mining databases with different data characteristics, such as dense vs. sparse, long vs. short patterns, memory-based vs. disk-based, etc. In this study, we propose a simple and novel hyper-linked data structure, H-struct and a new mining algorithm, H-mine, which takes advantage of this data structure and dynamically adjusts links in the mining process. A distinct feature of this method is that it has very limited and precisely predictable space overhead and runs really fast in memory-based setting. Moreover it can be scaled up to very large databases by database partitioning, and when the data set becomes dense, (conditional) FP-trees can be constructed dynamically as part of the mining process. Our study shows that H-mine has high performance in various kinds of data, outperforms the previously developed algorithms in different settings, and is highly scalable in mining large databases. This study also proposes a new data mining methodology, space-preserving mining, which may have strong impact in the future development of efficient and scalable data mining methods.
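As a point of reference for the task this abstract addresses, the following is a minimal sketch of frequent-pattern mining by exhaustive support counting. It is not H-mine and does not use the H-struct; the function name `frequent_itemsets`, the `min_support` parameter, and the toy transactions are assumptions made for illustration only.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets found in >= min_support transactions.

    Brute-force baseline (hypothetical helper, not the H-mine algorithm):
    enumerates every subset of every transaction, so it is only usable on
    tiny inputs.
    """
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, duplicates dropped
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[combo] = counts.get(combo, 0) + 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Toy data, invented for this example.
transactions = [["a", "b", "c"], ["a", "c"], ["a", "d"], ["b", "c"]]
result = frequent_itemsets(transactions, min_support=2)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

The exponential subset enumeration in this baseline is the kind of cost that structures such as the FP-tree or the H-struct described above are designed to avoid.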

452 citations

Journal Article•DOI•

A survey on communication and data management issues in mobile sensor networks

Chunsheng Zhu¹, Lei Shu², Takahiro Hara², Lei Wang³, Shojiro Nishio², Laurence T. Yang¹•Institutions (3)

St. Francis Xavier University¹, Osaka University², Dalian University of Technology³

01 Jan 2014-Wireless Communications and Mobile Computing

TL;DR: Different research methods regarding communication and data management in MWSNs are discussed, and some further open research areas in MWSNs are proposed.

Abstract: Wireless sensor networks (WSNs), which were proposed in the late 1990s, have received unprecedented attention because of their exciting potential applications in military, industrial, and civilian areas (e.g., environmental and habitat monitoring). Although WSNs have become increasingly promising in human life with the development of hardware and communication technologies, they have some natural limitations (e.g., network connectivity, network lifetime) due to their static network style. Moreover, more and more application scenarios require the sensors in WSNs to be mobile rather than static, so as to make traditional WSN applications smarter and enable some new applications. All this has led to mobile wireless sensor networks (MWSNs), which can greatly promote the development and application of WSNs. However, to the best of our knowledge, there is no comprehensive survey of the communication and data management issues in MWSNs. In this paper, focusing on the communication and data management issues in MWSNs, we discuss different research methods regarding communication and data management in MWSNs and propose some further open research areas in MWSNs. Copyright © 2011 John Wiley & Sons, Ltd.

140 citations

Proceedings Article•DOI•

Mining people's trips from large scale geo-tagged photos

Yuki Arase¹, Xing Xie¹, Takahiro Hara², Shojiro Nishio²•Institutions (2)

Microsoft¹, Osaka University²

25 Oct 2010

TL;DR: This paper focuses on geo-tagged photos and proposes a method to detect people's frequent trip patterns, i.e., typical sequences of visited cities and durations of stay as well as descriptive tags that characterize the trip patterns.

Abstract: Photo sharing is one of the most popular Web services. Photo sharing sites provide functions to add tags and geo-tags to photos to make photo organization easy. Considering that people take photos to record something that attracts them, geo-tagged photos are a rich data source that reflects people's memorable events associated with locations. In this paper, we focus on geo-tagged photos and propose a method to detect people's frequent trip patterns, i.e., typical sequences of visited cities and durations of stay as well as descriptive tags that characterize the trip patterns. Our method first segments photo collections into trips and categorizes them based on their trip themes, such as visiting landmarks or communing with nature. Our method mines frequent trip patterns for each trip theme category. We crawled 5.7 million geo-tagged photos and performed photo trip pattern mining. The experimental result shows that our method outperforms other baseline methods and can correctly segment photo collections into photo trips with an accuracy of 78%. For trip categorization, our method can categorize about 80% of trips using tags and titles of photos and visited cities as features. Finally, we illustrate interesting examples of trip patterns detected from our dataset and show an application with which users can search frequent trip patterns by querying a destination, visit duration, and trip theme on the trip.

134 citations

Proceedings Article•DOI•

Dynamic TDMA slot assignment in ad hoc networks

Akimitsu Kanzaki¹, Toshiaki Uemukai¹, Takahiro Hara¹, Shojiro Nishio¹•Institutions (1)

Osaka University¹

27 Mar 2003

TL;DR: A TDMA slot assignment protocol that improves channel utilization by changing the frame length dynamically to control the excessive increase of unassigned slots. When no unassigned slot is available, it generates one by releasing one of the multiple slots assigned to a node or by enlarging the frame length of nodes that can collide with each other.

Abstract: In this paper, we propose a TDMA slot assignment protocol to improve channel utilization, which controls the excessive increase of unassigned slots by changing the frame length dynamically. Our proposed protocol assigns one of the unassigned slots to a node that joins the network. If no unassigned slots are available, the protocol generates unassigned slots by releasing one of the multiple slots assigned to a node, or by enlarging the frame length of nodes that can collide with each other. Moreover, by setting the frame length to a power of 2 slots, the protocol provides collision-free packet transmission among nodes with different frame lengths. The simulation results show that the protocol improves channel utilization dramatically compared with conventional protocols.
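The arithmetic behind the power-of-2 frame lengths can be sketched as follows. This is an illustration under assumptions, not the protocol from the paper: it models a node as transmitting at every time `t` with `t % frame_length == slot`, and the function names `collide` and `collide_brute_force` are hypothetical. Because any two powers of 2 divide one another, two such schedules overlap exactly when their slots agree modulo the shorter frame.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def collide(slot_a, frame_a, slot_b, frame_b):
    """Closed-form check: do two periodic schedules ever share a time slot?

    Assumes both frame lengths are powers of 2, so the shorter frame
    divides the longer one.
    """
    if not (is_power_of_two(frame_a) and is_power_of_two(frame_b)):
        raise ValueError("frame lengths must be powers of 2")
    shorter = min(frame_a, frame_b)
    return slot_a % shorter == slot_b % shorter


def collide_brute_force(slot_a, frame_a, slot_b, frame_b):
    """Reference check: scan one period of the longer frame."""
    period = max(frame_a, frame_b)  # equals the LCM for powers of 2
    return any(
        t % frame_a == slot_a and t % frame_b == slot_b for t in range(period)
    )


# Node A uses slot 1 of a 4-slot frame; node B uses a slot of an 8-slot frame.
print(collide(1, 4, 3, 8))  # B in slot 3: never overlaps A
print(collide(1, 4, 5, 8))  # B in slot 5: overlaps A at t = 5
```

Under this model, a node with a longer frame can coexist with a node using a shorter frame as long as it avoids the slots congruent to the shorter-frame node's slot.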

130 citations

Book Chapter•DOI•

Wikipedia mining for an association web thesaurus construction

Kotaro Nakayama¹, Takahiro Hara¹, Shojiro Nishio¹•Institutions (1)

Osaka University¹

03 Dec 2007

TL;DR: An efficient link mining method pfibf (Path Frequency - Inversed Backward link Frequency) and the extension method "forward / backward link weighting (FB weighting)" are proposed in order to construct a huge scale association thesaurus of Wikipedia.

Abstract: Wikipedia has become a huge phenomenon on the WWW. As a corpus for knowledge extraction, it has various impressive characteristics such as a huge number of articles, live updates, a dense link structure, brief link texts, and URL identification for concepts. In this paper, we propose an efficient link mining method, pfibf (Path Frequency - Inversed Backward link Frequency), and the extension method "forward / backward link weighting (FB weighting)" in order to construct a huge-scale association thesaurus. We demonstrated the effectiveness of our proposed methods compared with other conventional methods such as co-occurrence analysis and TF-IDF.

110 citations

Cited by

Book•

Data Mining: Concepts and Techniques

Jiawei Han¹, Micheline Kamber², Jian Pei²•Institutions (2)

University of Illinois at Urbana–Champaign¹, Simon Fraser University²

08 Sep 2000

TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

Abstract: The increasing volume of data in modern business and science calls for more complex and sophisticated tools. Although advances in data mining technology have made extensive data collection much easier, the field is still evolving, and there is a constant need for new techniques and tools that can help us transform this data into useful information and knowledge. Since the previous edition's publication, great advances have been made in the field of data mining. Not only does the third edition of Data Mining: Concepts and Techniques continue the tradition of equipping you with an understanding and application of the theory and practice of discovering patterns hidden in large data sets, it also focuses on new, important topics in the field: data warehouses and data cube technology, mining streams, mining social networks, and mining spatial, multimedia, and other complex data. Each chapter is a stand-alone guide to a critical topic, presenting proven algorithms and sound implementations ready to be used directly or with strategic modification against live data. This is the resource you need if you want to apply today's most powerful data mining techniques to meet real business challenges.
* Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects.
* Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields.
* Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

23,600 citations

Journal Article•DOI•

Machine learning

Thomas G. Dietterich¹•Institutions (1)

Oregon State University¹

01 Dec 1996-ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Data Mining - Concepts and Techniques.

Petra Perner

01 Jan 2002

9,314 citations

Posted Content•

SQuAD: 100,000+ Questions for Machine Comprehension of Text

Pranav Rajpurkar¹, Jian Zhang¹, Konstantin Lopyrev¹, Percy Liang¹•Institutions (1)

Stanford University¹

16 Jun 2016-arXiv: Computation and Language

TL;DR: The Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage.

Abstract: We present the Stanford Question Answering Dataset (SQuAD), a new reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage. We analyze the dataset to understand the types of reasoning required to answer the questions, leaning heavily on dependency and constituency trees. We build a strong logistic regression model, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%). However, human performance (86.8%) is much higher, indicating that the dataset presents a good challenge problem for future research. The dataset is freely available at this https URL
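The abstract reports F1 scores without stating how they are computed. A minimal sketch, assuming F1 is measured as token overlap between a predicted answer span and a reference answer, might look like the following. The function name `token_f1` and the whitespace tokenization with lowercasing are simplifications chosen for this example and may differ from the normalization used in the official evaluation.

```python
from collections import Counter


def token_f1(prediction, reference):
    """Harmonic mean of token precision and recall between two answer strings.

    Assumption: tokens are lowercase whitespace-separated words; no
    punctuation or article stripping is applied.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(token_f1("the cat sat", "the cat"))
print(token_f1("Paris", "paris"))
```

A partially correct span therefore earns partial credit, which is why F1 can exceed an exact-match score on the same predictions.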

4,336 citations