Author

Yutaka Matsuo

Bio: Yutaka Matsuo is an academic researcher from the University of Tokyo. The author has contributed to research in topics: Social network & Relationship extraction. The author has an h-index of 41 and has co-authored 275 publications receiving 10583 citations. Previous affiliations of Yutaka Matsuo include National Institute of Advanced Industrial Science and Technology & Carnegie Mellon University.


Papers
Proceedings ArticleDOI
26 Apr 2010
TL;DR: This paper investigates the real-time interaction of events such as earthquakes in Twitter and proposes an algorithm to monitor tweets and to detect a target event and produces a probabilistic spatiotemporal model for the target event that can find the center and the trajectory of the event location.
Abstract: Twitter, a popular microblogging service, has received much attention recently. An important characteristic of Twitter is its real-time nature. For example, when an earthquake occurs, people make many Twitter posts (tweets) related to the earthquake, which enables detection of earthquake occurrence promptly, simply by observing the tweets. As described in this paper, we investigate the real-time interaction of events such as earthquakes in Twitter and propose an algorithm to monitor tweets and to detect a target event. To detect a target event, we devise a classifier of tweets based on features such as the keywords in a tweet, the number of words, and their context. Subsequently, we produce a probabilistic spatiotemporal model for the target event that can find the center and the trajectory of the event location. We consider each Twitter user as a sensor and apply Kalman filtering and particle filtering, which are widely used for location estimation in ubiquitous/pervasive computing. The particle filter works better than other comparable methods for estimating the centers of earthquakes and the trajectories of typhoons. As an application, we construct an earthquake reporting system in Japan. Because of the numerous earthquakes and the large number of Twitter users throughout the country, we can detect an earthquake with high probability (96% of earthquakes of Japan Meteorological Agency (JMA) seismic intensity scale 3 or more are detected) merely by monitoring tweets. Our system detects earthquakes promptly and sends e-mails to registered users. Notification is delivered much faster than the announcements that are broadcast by the JMA.

3,976 citations
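
The abstract above treats each Twitter user as a sensor and applies particle filtering to estimate an event location. The following is a minimal sketch of a generic bootstrap particle filter for a fixed 2D center. It is not the paper's model: the Gaussian observation likelihood, the jitter step, the function name estimate_center, and all parameter values are assumptions made for illustration.

```python
import math
import random


def estimate_center(observations, bounds, num_particles=500,
                    obs_sigma=1.0, jitter=0.1, seed=0):
    """Estimate a 2D event center from noisy point observations.

    observations: iterable of (x, y) report locations (hypothetical data).
    bounds: ((xmin, xmax), (ymin, ymax)) region for the initial particles.
    """
    rng = random.Random(seed)
    (xmin, xmax), (ymin, ymax) = bounds

    # Start with particles spread uniformly over the search region.
    particles = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
                 for _ in range(num_particles)]

    for ox, oy in observations:
        # Weight each particle by an assumed Gaussian likelihood of the report.
        weights = [
            math.exp(-((px - ox) ** 2 + (py - oy) ** 2) / (2 * obs_sigma ** 2))
            for px, py in particles
        ]
        # Resample in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=num_particles)
        # Add small noise so resampled particles do not collapse to one point.
        particles = [(px + rng.gauss(0, jitter), py + rng.gauss(0, jitter))
                     for px, py in particles]

    # The estimate is the mean of the final particle cloud.
    cx = sum(p[0] for p in particles) / num_particles
    cy = sum(p[1] for p in particles) / num_particles
    return cx, cy
```

Calling estimate_center with reports clustered around one point and a bounding box that contains it returns a center close to that point. A real system would also need a step that maps tweets to coordinates, which this sketch leaves out.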

Journal ArticleDOI
TL;DR: This article presents a new keyword extraction algorithm that applies to a single document without using a corpus. The degree of bias of each term's co-occurrence distribution is measured by the χ²-measure, and the algorithm performs comparably to tf-idf.
Abstract: We present a new keyword extraction algorithm that applies to a single document without using a corpus. Frequent terms are extracted first, then a set of co-occurrences between each term and the frequent terms, i.e., occurrences in the same sentences, is generated. The co-occurrence distribution shows the importance of a term in the document as follows. If the probability distribution of co-occurrence between term a and the frequent terms is biased to a particular subset of frequent terms, then term a is likely to be a keyword. The degree of bias of the distribution is measured by the χ²-measure. Our algorithm shows comparable performance to tf-idf without using a corpus.

869 citations
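
The keyword extraction abstract above scores a term by how far its co-occurrence with the frequent terms departs from what would be expected. A simplified sketch of that idea follows. It is an illustration rather than the published algorithm: the function name chi2_keywords, the way expected counts are derived from column totals, and the exclusion of a term's co-occurrence with itself are assumptions, and the abstract does not specify any of them.

```python
from collections import Counter


def chi2_keywords(sentences, num_frequent=10):
    """Rank terms by a chi-squared score of their co-occurrence bias.

    sentences: list of token lists, one list per sentence.
    Returns (term, score) pairs sorted from most to least biased.
    """
    term_freq = Counter(t for s in sentences for t in s)
    frequent = [t for t, _ in term_freq.most_common(num_frequent)]

    # cooc[w][g]: number of sentences containing both w and frequent term g.
    cooc = {w: Counter() for w in term_freq}
    for s in sentences:
        present = set(s)
        for w in present:
            for g in frequent:
                if g != w and g in present:
                    cooc[w][g] += 1

    # Column totals give the overall share of co-occurrence each g attracts.
    col = Counter()
    for counts in cooc.values():
        col.update(counts)

    scores = {}
    for w, counts in cooc.items():
        n_w = sum(counts.values())
        others = [g for g in frequent if g != w]
        mass = sum(col[g] for g in others)
        chi2 = 0.0
        if n_w > 0 and mass > 0:
            for g in others:
                # Expected count if w co-occurred in the overall proportions.
                expected = n_w * col[g] / mass
                if expected > 0:
                    chi2 += (counts[g] - expected) ** 2 / expected
        scores[w] = chi2

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

A term that appears only alongside one frequent term receives a higher score than a term spread evenly across all of them, which matches the intuition stated in the abstract.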

Proceedings ArticleDOI
01 Jan 2007
TL;DR: A robust semantic similarity measure that uses the information available on the Web to measure similarity between words or entities and a novel approach to compute semantic similarity using automatically extracted lexico-syntactic patterns from text snippets is proposed.
Abstract: Semantic similarity measures play important roles in information retrieval and Natural Language Processing. Previous work in semantic web-related applications such as community mining, relation extraction, and automatic metadata extraction has used various semantic similarity measures. Despite the usefulness of semantic similarity measures in these applications, robustly measuring semantic similarity between two words (or entities) remains a challenging task. We propose a robust semantic similarity measure that uses the information available on the Web to measure similarity between words or entities. The proposed method exploits page counts and text snippets returned by a Web search engine. We define various similarity scores for two given words P and Q, using the page counts for the queries P, Q and P AND Q. Moreover, we propose a novel approach to compute semantic similarity using automatically extracted lexico-syntactic patterns from text snippets. These different similarity scores are integrated using support vector machines, to leverage a robust semantic similarity measure. Experimental results on Miller-Charles benchmark dataset show that the proposed measure outperforms all the existing web-based semantic similarity measures by a wide margin, achieving a correlation coefficient of 0.834. Moreover, the proposed semantic similarity measure significantly improves the accuracy (F-measure of 0.78) in a community mining task, and in an entity disambiguation task, thereby verifying the capability of the proposed measure to capture semantic similarity using web content.

601 citations
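
The semantic similarity abstract above derives scores from the page counts for the queries P, Q and P AND Q. As a rough illustration, the sketch below applies three standard set-overlap coefficients (Jaccard, Dice, overlap) to such counts. The abstract does not name the scores it uses, so the choice of coefficients and the function names are assumptions, and the counts in the usage note are made-up numbers rather than search engine results.

```python
def web_jaccard(count_p, count_q, count_pq):
    """Jaccard coefficient from page counts for P, Q and P AND Q."""
    if count_pq == 0:
        return 0.0
    return count_pq / (count_p + count_q - count_pq)


def web_dice(count_p, count_q, count_pq):
    """Dice coefficient from the same three page counts."""
    if count_pq == 0:
        return 0.0
    return 2 * count_pq / (count_p + count_q)


def web_overlap(count_p, count_q, count_pq):
    """Overlap coefficient: shared pages relative to the rarer word."""
    if count_pq == 0:
        return 0.0
    return count_pq / min(count_p, count_q)
```

With hypothetical counts of 100 pages for P, 50 for Q and 25 for P AND Q, the three functions give 0.2, about 0.333 and 0.5 respectively. In the setting the abstract describes, several such scores would be passed on as features to a support vector machine rather than used alone.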

Journal ArticleDOI
TL;DR: An earthquake reporting system for use in Japan is developed and an algorithm to monitor tweets and to detect a target event is proposed, which produces a probabilistic spatiotemporal model for the target event that can find the center of the event location.
Abstract: Twitter has received much attention recently. An important characteristic of Twitter is its real-time nature. We investigate the real-time interaction of events such as earthquakes in Twitter and propose an algorithm to monitor tweets and to detect a target event. To detect a target event, we devise a classifier of tweets based on features such as the keywords in a tweet, the number of words, and their context. Subsequently, we produce a probabilistic spatiotemporal model for the target event that can find the center of the event location. We regard each Twitter user as a sensor and apply particle filtering, which is widely used for location estimation. The particle filter works better than other comparable methods for estimating the locations of target events. As an application, we develop an earthquake reporting system for use in Japan. Because of the numerous earthquakes and the large number of Twitter users throughout the country, we can detect an earthquake with high probability (93 percent of earthquakes of Japan Meteorological Agency (JMA) seismic intensity scale 3 or more are detected) merely by monitoring tweets. Our system detects earthquakes promptly and notification is delivered much faster than JMA broadcast announcements.

483 citations

Journal ArticleDOI
TL;DR: A social network extraction system called POLYPHONET is proposed, which employs several advanced techniques to extract relations of persons, to detect groups of people, and to obtain keywords for a person.

258 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal Article
TL;DR: The continuing convergence of the digital marketing and sales funnels has created a strategic continuum from digital lead generation to digital sales, which identifies the current composition of this digital continuum while providing opportunities to evaluate sales and marketing digital strategies.
Abstract: MKT 6009 Marketing Internship (0 semester credit hours) Student gains experience and improves skills through appropriate developmental work assignments in a real business environment. Student must identify and submit specific business learning objectives at the beginning of the semester. The student must demonstrate exposure to the managerial perspective via involvement or observation. At semester end, student prepares an oral or poster presentation, or a written paper reflecting on the work experience. Student performance is evaluated by the work supervisor. Pass/Fail only. Prerequisites: (MAS 6102 or MBA major) and department consent required. (0-0) S MKT 6244 Digital Marketing Strategy (2 semester credit hours) Executive Education Course. The course explores three distinct areas within marketing and sales namely, digital marketing, traditional sales prospecting, and executive sales organization and strategy. The continuing convergence of the digital marketing and sales funnels has created a strategic continuum from digital lead generation to digital sales. The course identifies the current composition of this digital continuum while providing opportunities to evaluate sales and marketing digital strategies. Prerequisites: MKT 6301 and instructor consent required. (2-0) Y MKT 6301 (SYSM 6318) Marketing Management (3 semester credit hours) Overview of marketing management methods, principles and concepts including product, pricing, promotion and distribution decisions as well as segmentation, targeting and positioning. (3-0) S MKT 6309 Marketing Data Analysis and Research (3 semester credit hours) Methods employed in market research and data analysis to understand consumer behavior, customer journeys, and markets so as to enable better decision-making. Topics include understanding different sources of data, survey design, experiments, and sampling plans. The course will cover the techniques used for market sizing estimation and forecasting. 
In addition, the course will cover the foundational concepts and techniques used in data visualization and "story-telling" for clients and management. Corequisites: MKT 6301 and OPRE 6301. (3-0) Y MKT 6310 Consumer Behavior (3 semester credit hours) An exposition of the theoretical perspectives of consumer behavior along with practical marketing implication. Study of psychological, sociological and behavioral findings and frameworks with reference to consumer decision-making. Topics will include the consumer decision-making model, individual determinants of consumer behavior and environmental influences on consumer behavior and their impact on marketing. Prerequisite: MKT 6301. (3-0) Y MKT 6321 Interactive and Digital Marketing (3 semester credit hours) Introduction to the theory and practice of interactive and digital marketing. Topics covered include: online-market research, consumer behavior, conversion metrics, and segmentation considerations; ecommerce, search and display advertising, audiences, search engine marketing, email, mobile, video, social networks, and the Internet of Things. (3-0) T MKT 6322 Internet Business Models (3 semester credit hours) Topics to be covered are: consumer behavior on the Internet, advertising on the Internet, competitive strategies, market research using the Internet, brand management, managing distribution and supply chains, pricing strategies, electronic payment systems, and developing virtual organizations. Further, students learn auction theory, web content design, and clickstream analysis. Prerequisite: MKT 6301. (3-0) Y MKT 6323 Database Marketing (3 semester credit hours) Techniques to analyze, interpret, and utilize marketing databases of customers to identify a firm's best customers, understanding their needs, and targeting communications and promotions to retain such customers. Topics

5,537 citations

Proceedings ArticleDOI
26 Apr 2010
TL;DR: This paper investigates the real-time interaction of events such as earthquakes in Twitter and proposes an algorithm to monitor tweets and to detect a target event and produces a probabilistic spatiotemporal model for the target event that can find the center and the trajectory of the event location.
Abstract: Twitter, a popular microblogging service, has received much attention recently. An important characteristic of Twitter is its real-time nature. For example, when an earthquake occurs, people make many Twitter posts (tweets) related to the earthquake, which enables detection of earthquake occurrence promptly, simply by observing the tweets. As described in this paper, we investigate the real-time interaction of events such as earthquakes in Twitter and propose an algorithm to monitor tweets and to detect a target event. To detect a target event, we devise a classifier of tweets based on features such as the keywords in a tweet, the number of words, and their context. Subsequently, we produce a probabilistic spatiotemporal model for the target event that can find the center and the trajectory of the event location. We consider each Twitter user as a sensor and apply Kalman filtering and particle filtering, which are widely used for location estimation in ubiquitous/pervasive computing. The particle filter works better than other comparable methods for estimating the centers of earthquakes and the trajectories of typhoons. As an application, we construct an earthquake reporting system in Japan. Because of the numerous earthquakes and the large number of Twitter users throughout the country, we can detect an earthquake with high probability (96% of earthquakes of Japan Meteorological Agency (JMA) seismic intensity scale 3 or more are detected) merely by monitoring tweets. Our system detects earthquakes promptly and sends e-mails to registered users. Notification is delivered much faster than the announcements that are broadcast by the JMA.

3,976 citations

01 Jan 2012

3,692 citations