Author

Matthew K. Gray

Bio: Matthew K. Gray is an academic researcher at Google. The author has contributed to research on topics including search engines and web query classification. The author has an h-index of 4 and has co-authored 12 publications receiving 2,082 citations.

Papers
Journal Article
14 Jan 2011-Science
TL;DR: This work surveys the vast terrain of ‘culturomics,’ focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000, and shows how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology and the pursuit of fame.
Abstract: We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of 'culturomics,' focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. Culturomics extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.

2,257 citations

Patent
27 Sep 2011

31 citations

Patent
31 Jan 2013

5 citations


Cited by
Journal Article
25 Sep 2013-PLOS ONE
TL;DR: This represents the largest study, by an order of magnitude, of language and personality, and found striking variations in language with personality, gender, and age.
Abstract: We analyzed 700 million words, phrases, and topic instances collected from the Facebook messages of 75,000 volunteers, who also took standard personality tests, and found striking variations in language with personality, gender, and age. In our open-vocabulary technique, the data itself drives a comprehensive exploration of language that distinguishes people, finding connections that are not captured with traditional closed-vocabulary word-category analyses. Our analyses shed new light on psychosocial processes yielding results that are face valid (e.g., subjects living in high elevations talk about the mountains), tie in with other research (e.g., neurotic people disproportionately use the phrase ‘sick of’ and the word ‘depressed’), suggest new hypotheses (e.g., an active life implies emotional stability), and give detailed insights (males use the possessive ‘my’ when mentioning their ‘wife’ or ‘girlfriend’ more often than females use ‘my’ with ‘husband’ or ‘boyfriend’). To date, this represents the largest study, by an order of magnitude, of language and personality.

1,435 citations

Journal Article
TL;DR: A keyword analysis identifies the most popular subjects covered by bibliometric analysis, and multidisciplinary articles are shown to have the highest impact.
Abstract: Bibliometric methods or "analysis" are now firmly established as scientific specialties and are an integral part of research evaluation methodology, especially within the scientific and applied fields. The methods are used increasingly when studying various aspects of science and also in the way institutions and universities are ranked worldwide. A sufficient number of studies have been completed, and with the resulting literature, it is now possible to analyse the bibliometric method by using its own methodology. The bibliometric literature in this study, which was extracted from Web of Science, is divided into two parts using a method comparable to the method of Jonkers et al. (Characteristics of bibliometrics articles in library and information sciences (LIS) and other journals, pp. 449–551, 2012): the publications either lie within the Information and Library Science (ILS) category or within the non-ILS category, which includes more applied, "subject" based studies. The impact in the different groupings is judged by means of citation analysis using normalized data, and an almost linear increase can be observed from 1994 onwards in the non-ILS category. The implication for the dissemination and use of the bibliometric methods in the different contexts is discussed. A keyword analysis identifies the most popular subjects covered by bibliometric analysis, and multidisciplinary articles are shown to have the highest impact. A noticeable shift is observed in those countries which contribute to the pool of bibliometric analysis, as well as a self-perpetuating effect in giving and taking references.

1,098 citations

Journal Article
TL;DR: Consensus measurement plays an important role in Delphi research and is widely used in the Delphi multi-round survey procedure to aggregate expert opinions on future developments and incidents.

987 citations

Journal Article
TL;DR: This paper provides a comprehensive review of the current and rapidly emerging ecosystem of the Internet of Things (IOT) and outlines four critical functional steps: data creation, information generation, meaning-making, and action-taking.
Abstract: The number of devices on the Internet exceeded the number of people on the Internet in 2008, and is estimated to reach 50 billion in 2020. A wide-ranging Internet of Things (IOT) ecosystem is emerging to support the process of connecting real-world objects like buildings, roads, household appliances, and human bodies to the Internet via sensors and microprocessor chips that record and transmit data such as sound waves, temperature, movement, and other variables. The explosion in Internet-connected sensors means that new classes of technical capability and application are being created. More granular 24/7 quantified monitoring is leading to a deeper understanding of the internal and external worlds encountered by humans. New data literacy behaviors such as correlation assessment, anomaly detection, and high-frequency data processing are developing as humans adapt to the different kinds of data flows enabled by the IOT. The IOT ecosystem has four critical functional steps: data creation, information generation, meaning-making, and action-taking. This paper provides a comprehensive review of the current and rapidly emerging ecosystem of the Internet of Things (IOT).

895 citations

Journal Article
07 Dec 2011-PLOS ONE
TL;DR: Expressions made on the online, global microblog and social networking service Twitter are examined, uncovering and explaining temporal variations in happiness and information levels over timescales ranging from hours to years.
Abstract: Individual happiness is a fundamental societal metric. Normally measured through self-report, happiness has often been indirectly characterized and overshadowed by more readily quantifiable economic indicators such as gross domestic product. Here, we examine expressions made on the online, global microblog and social networking service Twitter, uncovering and explaining temporal variations in happiness and information levels over timescales ranging from hours to years. Our data set comprises over 46 billion words contained in nearly 4.6 billion expressions posted over a 33 month span by over 63 million unique users. In measuring happiness, we construct a tunable, real-time, remote-sensing, and non-invasive, text-based hedonometer. In building our metric, made available with this paper, we conducted a survey to obtain happiness evaluations of over 10,000 individual words, representing a tenfold size improvement over similar existing word sets. Rather than being ad hoc, our word list is chosen solely by frequency of usage, and we show how a highly robust and tunable metric can be constructed and defended.

761 citations
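
The text-based hedonometer described in the last abstract above can be illustrated with a rough sketch. It assumes the metric is a frequency-weighted average of per-word happiness scores, and that "tunable" means excluding words scored close to neutral. The 1 to 9 scale, the `delta` parameter, the function name, and the scores in `HAPPINESS` are all illustrative assumptions, not values taken from the paper's word list.

```python
from collections import Counter

# Hypothetical happiness scores on an assumed 1-9 scale.
# These values are made up for illustration only.
HAPPINESS = {
    "love": 8.4,
    "happy": 8.3,
    "the": 5.0,
    "of": 4.9,
    "sad": 2.4,
    "hate": 2.3,
}


def average_happiness(text, scores=HAPPINESS, delta=1.0, neutral=5.0):
    """Return the frequency-weighted mean score of the scored words in `text`.

    Words whose score lies strictly within `delta` of `neutral` are skipped;
    this is the assumed tuning knob. Unscored words are ignored.
    Returns None if no scored word remains.
    """
    counts = Counter(text.lower().split())
    total = 0.0
    weight = 0
    for word, n in counts.items():
        score = scores.get(word)
        if score is None or abs(score - neutral) < delta:
            continue
        total += score * n
        weight += n
    return total / weight if weight else None


if __name__ == "__main__":
    print(average_happiness("love love hate the"))
```

Raising `delta` drops more near-neutral words, so the average reacts more strongly to emotionally loaded vocabulary; setting it to 0 keeps every scored word.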