
THE QUANTIFIED SELF: Fundamental Disruption in Big Data Science and Biological Discovery

Melanie Swan, MS Futures Group, Palo Alto, California
Big Data, Vol. 1, No. 2, June 2013. DOI: 10.1089/big.2012.0002


Abstract

A key contemporary trend emerging in big data science is the quantified self (QS): individuals engaged in the self-tracking of any kind of biological, physical, behavioral, or environmental information, as n = 1 individuals or in groups. There are opportunities for big data scientists to develop new models to support QS data collection, integration, and analysis, and also to lead in defining open-access database resources and privacy standards for how personal data is used. Next-generation QS applications could include tools for rendering QS data meaningful in behavior change, establishing baselines and variability in objective metrics, applying new kinds of pattern recognition techniques, and aggregating multiple self-tracking data streams from wearable electronics, biosensors, mobile phones, genomic data, and cloud-based services. The long-term vision of QS activity is that of a systemic monitoring approach in which an individual's continuous personal information climate provides real-time performance optimization suggestions. There are some potential limitations related to QS activity, namely barriers to widespread adoption and a critique regarding scientific soundness, but these may be overcome. One interesting aspect of QS activity is that it is fundamentally both a quantitative and a qualitative phenomenon, since it includes both the collection of objective metrics data and the subjective experience of the impact of these data. Some of this dynamic is being explored as the quantified self becomes the qualified self in two new ways: by applying QS methods to the tracking of qualitative phenomena such as mood, and by understanding that QS data collection is just the first step in creating qualitative feedback loops for behavior change. In the long-term future, the quantified self may be further transformed into the extended exoself as data quantification and self-tracking enable the development of new sense capabilities that are not possible with ordinary senses. The individual body becomes a more knowable, calculable, and administrable object through QS activity, and individuals have an increasingly intimate relationship with data as it mediates the experience of reality.
Introduction
What is the quantified self?
The quantified self (QS) is any individual engaged in the self-tracking of any kind of biological, physical, behavioral, or environmental information. There is a proactive stance toward obtaining information and acting on it. A variety of areas may be tracked and analyzed, for example, weight, energy level, mood, time usage, sleep quality, health, cognitive performance, athletics, and learning strategies (Table 1).1 Health is an important but not exclusive focus, where objectives may range from general tracking to pathology resolution to physical and mental performance enhancement. In some sense everyone is already a self-tracker, since many individuals measure something about themselves or have things measured about them regularly, and also because humans have innate curiosity, tinkering, and problem-solving capabilities. One of the earliest recorded examples of quantified self-tracking is that of Sanctorius of Padua, who studied energy expenditure in living systems by tracking his weight versus food intake and elimination for 30 years in the 16th century.2 Likewise, there is a philosophical precedent for the quantified self, as intellectuals ranging from the Epicureans to Heidegger and Foucault have been concerned with the 'care of the self.' The terms 'quantified self' and 'self-tracker' are labels, contemporary formalizations belonging to the general progression in human history of using measurement, science, and technology to bring order, understanding, manipulation, and control to the natural world, including the human body. While the concept of the quantified self may have begun in n = 1 self-tracking at the individual level, the term is quickly being extended to include other permutations such as 'group data,' the idea of aggregated data from multiple quantified selves as self-trackers share and work collaboratively with their data.
The Quantified Self in More Detail
The quantified self is starting to be a mainstream phenomenon, as 60% of U.S. adults are currently tracking their weight, diet, or exercise routine, and 33% are monitoring other factors such as blood sugar, blood pressure, headaches, or sleep patterns.3,4 Further, 27% of U.S. Internet users track health data online,5 9% have signed up for text message health alerts,6 and there are 40,000 smartphone health applications available.7 Diverse publications such as the BBC,8 Forbes,9 and Vanity Fair10 have covered the quantified self movement, and it was a key theme at CES 2013, a global consumer electronics trade show.11 Commentators at a typical industry conference in 2012, Health 2.0, noted that more than 500 companies were making or developing self-management tools, up 35% from the beginning of the year, and that venture financing in the commensurate period had risen 20%.12 At the center of the quantified self movement is, appropriately, the Quantified Self community, which in October 2012 comprised 70 worldwide meetup groups, with 5,000 participants having attended 120 events since the community formed in 2008 (event videos are available online at http://quantifiedself.com/). At the 'show-and-tell' meetings, self-trackers come together in an environment of trust, sharing, and reciprocity to discuss projects, tools, techniques, and experiences. There is a standard format in which projects are presented in a simplified version of the scientific method, answering three questions: 'What did you do?' 'How did you do it?' and 'What did you learn?' The group's third conference was held at Stanford University in September 2012 with over 400 attendees. Other community groups address related issues, for example Habit Design (www.habitdesign.org), a U.S.-based national cooperative for sharing best practices in developing sustainable daily habits via behavior-change psychology and other mechanisms.
Exemplar quantified self projects
A variety of quantified self-tracking projects have been conducted, and a few have been selected and described here to give an overall sense of the diverse activity. One example is design student Lauren Manning's year of food visualization (Fig. 1), in which every type of food consumed was tracked over a one-year period and visualized in different infographic formats.13 Another project is Tim McCormick's Information Diet, an investigation of media consumption and reading practices in which he developed a mechanism for quantifying the value of different information inputs (e.g., Twitter feeds, online news sites, blogs) to derive a prioritized information stream for personal consumption.14 A third example is Rosane Oliveira's multiyear investigation into diabetes and heart disease risk, using her identical twin sister as a control, and testing vegan dietary shifts and metabolism markers such as insulin and glucose.15
A fourth project, nicely incorporating various elements of quantified self-tracking, hardware hacking, quality-of-life improvement, and serendipity, is Nancy Dougherty's smile-triggered electromyogram (EMG) muscle sensor with a light-emitting diode (LED) headband display. The project is designed to create unexpected moments of joy in human interaction.16

Table 1. Quantified Self Tracking Categories and Variables

Physical activities: miles, steps, calories, repetitions, sets, METs (metabolic equivalents)
Diet: calories consumed, carbs, fat, protein, specific ingredients, glycemic index, satiety, portions, supplement doses, tastiness, cost, location
Psychological states and traits: mood, happiness, irritation, emotions, anxiety, self-esteem, depression, confidence
Mental and cognitive states and traits: IQ, alertness, focus, selective/sustained/divided attention, reaction, memory, verbal fluency, patience, creativity, reasoning, psychomotor vigilance
Environmental variables: location, architecture, weather, noise, pollution, clutter, light, season
Situational variables: context, situation, gratification of situation, time of day, day of week
Social variables: influence, trust, charisma, karma, current role/status in the group or social network

Source: K. Augemberg.1 (Reproduced with permission from K. Augemberg.)

FIG. 1. One year of food consumption visualization by Lauren Manning.

A fifth project of ongoing investigation has been Robin Barooah's personalized analysis of coffee consumption, productivity, and meditation, with a finding that concentration increased with the cessation of coffee drinking.17 Finally, there is Amy Robinson's idea-tracking process, in which she e-mails ideas and inspirations to herself and later visualizes them in Gephi (an open-source graphing tool).18 These projects demonstrate the range of topics, depth of problem solving, and variety of methodologies characteristic of QS projects. An additional indication of the tenor and context of QS experimentation can be seen in exemplar comments from the community's 2012 conference (Table 2).

Table 2. Quotable Quotes from the 2012 Quantified Self Conference

Can I query my shirt, or am I limited to consuming the querying that comes packaged in my shirt?
Our mission as quantified selves is to discover our mission.
Data is the new oil.
The lean hardware movement becomes the lean heartware movement.
Information wants to be linked.
We think more about our cats/dogs than we do our real pets, our microbiome.
Information conveyance, not data visualization.
Quantified emotion and data sensation through haptics.
Display of numerical data and graphs are the interface. Quantifying is the intermediary step; exosenses (haptics, wearable electronic senses) is really what we want.
Perpetual data explosion.
The application of the metric distorts the data and the experience.
Tools for self-tracking and self-experimentation

The range of tools used for QS tracking and experimentation extends from the pen and paper of manual tracking to spreadsheets, mobile applications, and specialized devices. Standard contemporary QS devices include Fitbit pedometers, myZeo sleep trackers, and Nike+ and Jawbone UP fitness trackers. The Quantified Self web site listed over 500 tools as of October 2012 (http://quantifiedself.com/guide/), mostly concerning exercise, weight, health, and goal achievement. Unified tracking for multiple activities is available in mobile applications such as Track and Share (www.trackandshareapps.com) and Daily Tracker (www.thedailytracker.com/).19 Many QS solutions pair the device with a web interface for data aggregation, infographic display, and personal recommendations and action plans. At present, the vast majority of QS tools do not collect data automatically and instead require manual user data input. A recent emergence in the community is tools created explicitly for the rapid design and conduct of QS experiments, including PACO, the Personal Analytics Companion (https://quantifiedself.appspot.com/), and studycure (http://studycure.com/).
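Underneath, most of these tools store simple timestamped observations. As a rough, hypothetical sketch of a unified manual-tracking log (not any particular product's schema; the field names, categories, and file name are invented for illustration, with categories loosely following Table 1):

```python
# A minimal, hypothetical schema for unified manual self-tracking:
# one timestamped observation per row, tagged with category and variable.
# Not any specific product's format; purely illustrative.
import csv
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Observation:
    timestamp: datetime   # when the observation was made
    category: str         # e.g., "diet", "psychological", "physical"
    variable: str         # e.g., "calories", "mood", "steps"
    value: float          # numeric value (qualitative scales coded, say, 1-10)
    note: str = ""        # free-text context

def append_observation(path: str, obs: Observation) -> None:
    """Append one observation to a plain CSV log."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(
            [obs.timestamp.isoformat(), obs.category, obs.variable, obs.value, obs.note]
        )

append_observation("qs_log.csv",
                   Observation(datetime.now(), "psychological", "mood", 7, "after coffee"))
```

A flat log like this is trivially small for one person but, as the next section discusses, automated sensing multiplies the volume by many orders of magnitude.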
Motivations for quantified self experimentation

Self-experimenters may have a wide range of motivations. There is at least one study investigating self-tracking projects, the DIYgenomics Knowledge Generation through Self-Experimentation Study (http://genomera.com/studies/knowledge-generation-through-self-experimentation). The study has found that the main reason individuals conducted QS projects was to resolve or optimize a specific lifestyle issue such as sleep quality.20 Another key finding was that QS experimenters often iterated through many different solutions, and kinds of solutions, before finding a final resolution point. One specific finding was that poor sleep quality was the biggest factor affecting work productivity for multiple individuals; for one individual, raising the bed mattress solved the problem, and for another, tracking and reducing caffeine consumption did. Another finding was that there was not much introspection as to experimental results and their meaning, but rather a pragmatic attitude toward having had a problem that needed solving. A significant benefit of self-experimentation projects is that the velocity of question asking and experiment iterating can be much greater than with traditional methods.

At the meta-level, it is important to study the impact of the practice of self-tracking itself. One reason is that health information is itself an intervention.21 Some studies have found that there may be detrimental effects,22 while others have documented the overall benefits of self-tracking to health and wellness outcomes, as well as the psychology of empowerment and responsibility taking.23-25
How the Quantified Self is Becoming an Interesting Challenge for Big Data Science
Quantified self projects are becoming an interesting data management and manipulation challenge for big data science in the areas of data collection, integration, and analysis. While quantified self data streams may not seem to conform to the traditional concept and definition of big data, 'data sets too large and complex to process with on-hand database management tools' (http://en.wikipedia.org/wiki/Big_data), or to connote examples like Walmart's 1 million

transactions per hour being transmitted to databases that are 2.5 petabytes in size (http://wikibon.org/blog/big-data-statistics/), the quantified self, and health and biology more generally, are becoming full-fledged big data problems in many ways. First, individuals may not have the tools available on local computing resources to store, query, and manipulate QS data sets. Second, QS data sets are growing in size. Early QS projects may have consisted of manageable data sets of manually tracked data (i.e., 'small data'), but this is no longer the case, as much larger QS data sets are being generated. For example, heart rate monitors, important for predictive cardiac risk monitoring, take samples on the order of 250 times per second, which generates 9 gigabytes of data per person per month. Appropriate compression algorithms, and a translation of the raw data into aggregated data more appropriate for long-term storage, have not yet been developed.
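The scale of that figure can be sanity-checked with back-of-envelope arithmetic; the bytes-per-sample values below are assumptions for illustration (raw samples are small, but timestamps and metadata inflate stored size):

```python
# Back-of-envelope check of the in-text figure: a heart-rate monitor
# sampling 250 times per second. Bytes-per-sample values are assumed,
# not taken from the article.
SAMPLES_PER_SEC = 250
SECONDS_PER_MONTH = 60 * 60 * 24 * 30          # 30-day month

samples = SAMPLES_PER_SEC * SECONDS_PER_MONTH  # 648,000,000 samples/month
for bytes_per_sample in (2, 14):               # bare sample vs. timestamped record
    gb = samples * bytes_per_sample / 1e9
    print(f"{bytes_per_sample} B/sample -> {gb:,.1f} GB/person/month")
# 2 B/sample  -> 1.3 GB; 14 B/sample -> 9.1 GB, consistent with the
# ~9 GB/month cited above once per-sample storage overhead is included.
```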
Another example is personal genomic data from 'SNP chip' (i.e., single nucleotide polymorphism) companies like 23andMe, Navigenics, and deCODEme. These files constitute 1–2% of the human genome and typically have 1–1.2 million records, which are unwieldy to load and query (especially when comparing multiple files) without specific data-management tools. Whole human genome files are much larger than SNP files: vendors Illumina and Knome ship multi-terabyte-sized files to the consumer in a virtually unusable format on a standalone computer or zip drive.

In the short term, standard cloud-based services for QS data storage, sharing, and manipulation would be extremely useful. In the long term, big data solutions are needed to implement the vision of a systemic and continuous approach to automated, unobtrusive data collection from multiple sources that is processed into a stream of behavioral insights and interventions. Making progress on the critical contemporary challenge of preventive medicine, recognizing early warning signs and eliminating conditions during the 80% of their preclinical lifecycle, may likely require regular collection on the order of a billion data points per person.26
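To see why million-record SNP files call for dedicated data-management tooling, consider a hedged sketch of loading and querying one. It assumes the common consumer raw-export layout (tab-separated rsid/chromosome/position/genotype columns with '#' comment headers), which varies by vendor; file names and the example rsid are illustrative:

```python
# A hedged sketch of loading and querying a consumer SNP file. Assumes a
# 23andMe-style raw export (tab-separated, '#' comment header lines,
# columns: rsid, chromosome, position, genotype); other vendors differ.
import pandas as pd

def load_snp_file(path: str) -> pd.DataFrame:
    """Load ~1M SNP records into an rsid-indexed DataFrame for fast lookup."""
    df = pd.read_csv(
        path, sep="\t", comment="#",
        names=["rsid", "chromosome", "position", "genotype"],
        dtype={"rsid": str, "chromosome": str, "genotype": str},
    )
    return df.set_index("rsid")

def compare_genotypes(files: list[str], rsid: str) -> list[str]:
    """Look up one variant across several people's files (slow but simple)."""
    return [load_snp_file(f).loc[rsid, "genotype"] for f in files]

# e.g., compare_genotypes(["person_a.txt", "person_b.txt"], "rs4680")
```

Even this simple comparison re-reads each million-row file per query, which is exactly the kind of friction that motivates indexed, cloud-hosted services.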
Specific big data science opportunities in data collection, integration, and analysis are discussed below in the sections on data collection, data integration, data analysis, and opportunities in working with large data corpora.
Data collection: big health data streams
There is a need for big data scientists to facilitate the identification, collection, and storage of data streams related to QS activity. Both traditional institutional health professionals and QS individuals are starting to find themselves in a whole new era of massively expanded data, with the attendant challenge of employing these new data streams toward pathology resolution and wellness outcomes. Big health data streams can be grouped into three categories: traditional medical data (personal and family health history, medication history, lab reports, etc.), 'omics' data (genomics, microbiomics, proteomics, metabolomics, etc.), and quantified-self tracking data (Fig. 2).27 A key shift is that, due to the plummeting cost of sequencing and Internet-based data storage, many of these data streams are now available directly to consumers. In the omics category, as of January 2013 genomic profiling was available for $99 from 23andMe (sequencing 1 million of the most-researched SNPs), and microbiomic profiling was available for $79 from uBiome (www.indiegogo.com/ubiome) and $99 from the American Gut Project (www.indiegogo.com/americangut). A broad consumer application of integrated omics data streams is not yet available, as institutional projects28,29 are themselves in early stages, but one could quickly emerge from academia through consumer proteomics services such as Talking20 (named for the body's 20 amino acids), which offers $5 home blood-test cards for a multi-item panel (e.g., vitamins, steroids, and cholesterol).30

FIG. 2. Big health data streams are becoming increasingly consumer-available.
Data integration

A key challenge in QS projects, and in the realization of preventive medicine more generally, is integrating big health data streams, especially blending genomic and environmental data. As U.S. National Institutes of Health director Francis Collins remarked in 2010, 'Genetics loads the gun and environment pulls the trigger.'31 It is a general heuristic for common disease conditions like cancer and heart disease that genetics contributes one-third to outcome and environment two-thirds.32 There are some notable examples of QS projects involving the integration of multiple big health data streams. Self-trackers typically obtain underlying genomic and microbiomic profiling and review this information together with blood tests and proteomic tests to determine baseline levels and variability for a diversity of markers, and then experiment with different interventions for optimized health and pathology reduction. Examples of these kinds of QS data integration projects include DIYgenomics studies,33 Leroy Hood's 4P medicine (predictive, personalized, preventive, and participatory),26 David Duncan's Experimental Man project,34 Larry Smarr's Crohn's disease tracking, microbiomic sequencing, and lactoferrin analysis project,35 and Steven Fowkes's Thyroid Hormone Testing project.36 Studies may be conducted individually (n = 1), in groups (aggregations of n = 1 individuals), or in systems (e.g., families, athletic teams, or workplace groups). For group studies, crowdsourced research collaborations, health social networks, and mobile applications are allowing studies to be conducted at new levels of scale and specificity, for example with thousands of participants as opposed to dozens or hundreds.37,38
The ability to aggregate dozens of QS data streams to look for correlations is being developed by projects such as Singly, Fluxstream, Bodytrack, Sympho.Me, Sen.se, Cosm, and the Health Graph API.39 Figure 3 shows a 'multiviz' display from Sen.se that plots coffee consumption, social interaction, and mood to find an apparent linkage between social interaction and mood, although correlation is not necessarily causation.40

FIG. 3. Seeking correlations: multiviz data stream infographing available on the Sen.se platform.40 (Reproduced with permission from Sen.se.)
The aggregation of multiple data streams could be a preliminary step toward two-way communication in big data QS applications that offer real-time interventional suggestions based on insights from multifactor sensor input processing. This kind of functionality could be extended to the development of flexible services that respond in real time to demand, at not just the individual level but also the community level. A concrete example could be using the timing, type, and cyclicality of 4 million purchase transactions that occurred during Easter week in Spain (http://senseable.mit.edu/bbva/) to design flexible bank, gas station, and store hours, and purchase recommendation services that respond in real time to community demand.
Data analysis

Following data collection and integration, the next step is data analysis. A classic big data science problem is extracting signal from noise. The objective of many QS projects is sifting through large sets of collected data to find the exception that is the sign of a shift in pattern or an early warning signal. Ultimately, 99% of the data may be useless and easily discarded. However, since continuous monitoring and QS sensing is a new area in which use cases have not been defined and formalized, much of the data must be stored for characterization, investigation, and validation. A high-profile use case is heart failure, where there is typically a two-week prevention window before a cardiac event during which heart rate variability may be predictive of pathology development. Translating heart rate data sampled 250 times per second into early warnings and intervention is an unresolved challenge. One thing that could help is the invention of a new generation of data compression algorithms that allow searching and pattern-finding within compressed files. Similar to the challenge of producing meaningful signals from heart-rate variability data is the example of galvanic skin response (GSR). Here too, data metrics sampled many times per second have been available for decades, but the information has been too noisy to produce useful signals correlated with external stimulus and behavior. It is only through the application of innovations in multiple areas (hardware design, wearable biosensors, signal processing, and big data methods) that GSR information is starting to become more useful.41 Analyzing multiple QS data streams in real time (for example, heart-rate variability, galvanic skin response, temperature, movement, and EEG activity) may likely be required for accurate assessment and intervention regarding biophysical state.
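One simple, illustrative form of such pattern-shift detection is a rolling-baseline deviation test; the window, threshold, metric name, and file name below are arbitrary assumptions, and this is a sketch rather than a validated clinical detector:

```python
# A toy sketch of "finding the exception that is the sign of a shift in
# pattern": flag points where a daily metric (e.g., a heart-rate-variability
# summary) drifts beyond k standard deviations of its rolling baseline.
import pandas as pd

def early_warning(series: pd.Series, window: int = 14, k: float = 2.0) -> pd.Series:
    """Return a boolean Series marking deviations from the rolling baseline."""
    baseline = series.rolling(window).mean().shift(1)  # exclude current point
    spread = series.rolling(window).std().shift(1)
    return (series - baseline).abs() > k * spread

# Hypothetical usage with a daily HRV summary file:
# hrv = pd.read_csv("hrv_daily.csv", index_col=0, parse_dates=True)["rmssd"]
# alerts = early_warning(hrv); print(hrv[alerts])
```

Real early-warning systems would need validated thresholds and multi-stream inputs, but the structure, baseline plus deviation test over a continuous stream, is the same.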


References (partial list, as indexed)

Tversky, A., and Kahneman, D. "Judgment Under Uncertainty: Heuristics and Biases." Science, 1974.
Taleb, N. N. The Black Swan: The Impact of the Highly Improbable. 2007.
Unidentified reference, 01 Jan 2012.
Michel, J.-B., et al. "Quantitative Analysis of Culture Using Millions of Digitized Books." Science, 14 Jan 2011.
Le, Q. V., et al. "Building High-Level Features Using Large Scale Unsupervised Learning." Preprint.