Showing papers by "Eemil Lagerspetz" published in 2021


Journal ArticleDOI
TL;DR: In this article, the authors survey the rapidly growing research landscape of low-cost sensor technologies for air quality monitoring and their calibration using machine learning techniques, identify open research challenges, and present directions for future research.
Abstract: The significance of air pollution and the problems associated with it are fueling deployments of air quality monitoring stations worldwide. The most common approach for air quality monitoring is to rely on environmental monitoring stations, which unfortunately are very expensive both to acquire and to maintain. Hence, environmental monitoring stations are typically sparsely deployed, resulting in limited spatial resolution for measurements. Recently, low-cost air quality sensors have emerged as an alternative that can improve the granularity of monitoring. The use of low-cost air quality sensors, however, presents several challenges: They suffer from cross-sensitivities between different ambient pollutants; they can be affected by external factors, such as traffic, weather changes, and human behavior; and their accuracy degrades over time. Periodic re-calibration can improve the accuracy of low-cost sensors, particularly with machine-learning-based calibration, which has shown great promise due to its capability to calibrate sensors in-field. In this article, we survey the rapidly growing research landscape of low-cost sensor technologies for air quality monitoring and their calibration using machine learning techniques. We also identify open research challenges and present directions for future research.
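
The in-field calibration idea in the abstract above can be illustrated with a minimal sketch. It assumes the simplest possible model, a one-variable ordinary least squares fit of a low-cost sensor against a co-located reference station; the survey covers far richer machine learning models, and the function names and readings below are hypothetical.

```python
def fit_linear_calibration(raw, reference):
    """Least squares fit of: reference ~= slope * raw + intercept."""
    if len(raw) != len(reference) or len(raw) < 2:
        raise ValueError("need at least two paired readings")
    n = len(raw)
    mean_raw = sum(raw) / n
    mean_ref = sum(reference) / n
    var_raw = sum((x - mean_raw) ** 2 for x in raw)
    if var_raw == 0:
        raise ValueError("raw readings must not all be identical")
    cov = sum((x - mean_raw) * (y - mean_ref) for x, y in zip(raw, reference))
    slope = cov / var_raw
    intercept = mean_ref - slope * mean_raw
    return slope, intercept


def calibrate(raw_value, slope, intercept):
    """Map a raw low-cost reading onto the reference scale."""
    return slope * raw_value + intercept


# Hypothetical co-location data (not from the paper): paired readings from
# a low-cost sensor and a reference-grade station.
raw_pm25 = [10.0, 20.0, 30.0, 40.0]
reference_pm25 = [25.0, 45.0, 65.0, 85.0]

slope, intercept = fit_linear_calibration(raw_pm25, reference_pm25)
corrected = calibrate(50.0, slope, intercept)
```

Periodic re-calibration, as described in the abstract, would amount to re-running the fit on fresh co-location data as the sensor drifts.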

58 citations


Journal ArticleDOI
TL;DR: In this paper, the authors explored the relationship between the behavioral features and depression using correlation and bivariate linear mixed models (LMMs) and leveraged 5 supervised machine learning (ML) algorithms with hyperparameter optimization, nested cross-validation, and imbalanced data handling to predict depression.
Abstract: Background: Depression is a prevalent mental health challenge. Current depression assessment methods using self-reported and clinician-administered questionnaires have limitations. Instrumenting smartphones to passively and continuously collect moment-by-moment data sets to quantify human behaviors has the potential to augment current depression assessment methods for early diagnosis and for scalable, longitudinal monitoring of depression. Objective: The objective of this study was to investigate the feasibility of predicting depression with human behaviors quantified from smartphone data sets, and to identify behaviors that can influence depression. Methods: Smartphone data sets and self-reported 8-item Patient Health Questionnaire (PHQ-8) depression assessments were collected from 629 participants in an exploratory longitudinal study over an average of 22.1 days (SD 17.90; range 8-86). We quantified 22 regularity, entropy, and SD behavioral markers from the smartphone data. We explored the relationship between the behavioral features and depression using correlation and bivariate linear mixed models (LMMs). We leveraged 5 supervised machine learning (ML) algorithms with hyperparameter optimization, nested cross-validation, and imbalanced data handling to predict depression. Finally, with the permutation importance method, we identified influential behavioral markers in predicting depression. Results: Of the 629 participants from at least 56 countries, 69 (10.97%) were females, 546 (86.8%) were males, and 14 (2.2%) were nonbinary. Participants’ age distribution was as follows: 73/629 (11.6%) were aged between 18 and 24, 204/629 (32.4%) were aged between 25 and 34, 156/629 (24.8%) were aged between 35 and 44, 166/629 (26.4%) were aged between 45 and 64, and 30/629 (4.8%) were aged 65 years and over. Of the 1374 PHQ-8 assessments, 1143 (83.19%) responses were nondepressed scores (PHQ-8 score <10), while 231 (16.81%) were depressed scores (PHQ-8 score ≥10), as identified based on the PHQ-8 cut-off. A significant positive Pearson correlation was found between screen status–normalized entropy and depression (r=0.14, P<.001). The LMM demonstrated an intraclass correlation of 0.7584 and a significant positive association between screen status–normalized entropy and depression (β=.48, P=.03). The best ML algorithms achieved the following metrics: precision, 85.55%-92.51%; recall, 92.19%-95.56%; F1, 88.73%-94.00%; area under the receiver operating characteristic curve, 94.69%-99.06%; Cohen κ, 86.61%-92.90%; and accuracy, 96.44%-98.14%. Including age group and gender as predictors improved ML performance. Screen and internet connectivity features were the most influential in predicting depression. Conclusions: Our findings demonstrate that behavioral markers indicative of depression can be unobtrusively identified from smartphone sensors’ data. Traditional assessment of depression can be augmented with behavioral markers from smartphones for depression diagnosis and monitoring.
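
Two quantities named in the abstract above can be sketched in a few lines: a normalized entropy of screen status events and the PHQ-8 cut-off at 10. This is a sketch under stated assumptions, not the authors' feature pipeline: it assumes normalized entropy means Shannon entropy divided by the log of the number of observed states, and the function names are hypothetical. The PHQ-8 part relies only on the questionnaire having 8 items scored 0 to 3.

```python
import math
from collections import Counter


def normalized_entropy(states):
    """Shannon entropy of a state sequence, scaled to the range [0, 1].

    Assumption: normalization divides by log(number of observed states).
    """
    counts = Counter(states)
    n = len(states)
    if len(counts) < 2:
        return 0.0  # empty or single-state sequences carry no uncertainty
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(len(counts))


def phq8_is_depressed(item_scores, cutoff=10):
    """Apply the PHQ-8 cut-off: a total score >= 10 counts as depressed."""
    if len(item_scores) != 8 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("PHQ-8 needs 8 item scores, each in 0..3")
    return sum(item_scores) >= cutoff


# Hypothetical screen status log for one participant-day.
screen_events = ["on", "off", "on", "off", "on", "on", "off", "on"]
entropy_value = normalized_entropy(screen_events)

# Share of depressed assessments reported in the abstract: 231 of 1374.
depressed_share = round(100 * 231 / 1374, 2)
```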

35 citations


Journal ArticleDOI
TL;DR: In this paper, the authors analyze the feasibility of using wearable low-cost pollution sensors for capturing the total exposure of commuters through extensive experiments carried out in the Helsinki metropolitan region, and demonstrate that wearable sensors can capture subtle variations caused by differing routes, passenger density, location within a carriage, and other factors.
Abstract: Transit activities are a significant contributor to a person’s daily exposure to pollutants. Currently, obtaining accurate information about the personal exposure of a commuter is challenging as existing solutions either have a coarse monitoring resolution that omits subtle variations in pollutant concentrations or are laborious and costly to use. We contribute by systematically analysing the feasibility of using wearable low-cost pollution sensors for capturing the total exposure of commuters. Through extensive experiments carried out in the Helsinki metropolitan region, we demonstrate that low-cost sensors can capture the overall exposure with sufficient accuracy, while at the same time providing insights into variations within transport modalities. We also demonstrate that wearable sensors can capture subtle variations caused by differing routes, passenger density, location within a carriage, and other factors. For example, we demonstrate that location within the vehicle carriage can result in up to a 25% increase in daily pollution exposure, a significant difference that existing solutions are unable to capture. Finally, we highlight the practical benefits of low-cost sensors as a pollution monitoring solution by introducing applications that are enabled by low-cost wearable sensors.

15 citations


Journal ArticleDOI
TL;DR: This study finds that Covid-19 led to a decrease in users’ smartphone engagement and network switches but an increase in WiFi usage, and explores the value of smartphone usage data for fighting the epidemic.
Abstract: The outbreak of Covid-19 changed the world as well as human behavior. In this article, we study the impact of Covid-19 on smartphone usage. We gather smartphone usage records from a global data collection platform called Carat, including the usage of mobile users in North America from November 2019 to April 2020. We then conduct the first study on the differences in smartphone usage across the outbreak of Covid-19. We discover that Covid-19 leads to a decrease in users’ smartphone engagement and network switches, but an increase in WiFi usage. Also, its outbreak causes new typical diurnal patterns of both memory usage and WiFi usage. Additionally, we investigate the correlations between smartphone usage and daily confirmed cases of Covid-19. The results reveal that memory usage, WiFi usage, and network switches of smartphones correlate significantly with daily confirmed cases, with absolute Pearson coefficients greater than 0.8. Moreover, smartphone usage behavior has the strongest correlation with the Covid-19 cases occurring after it, which exhibits the potential of inferring outbreak status. By conducting extensive experiments, we demonstrate that for the inference of outbreak stages, both Macro-F1 and Micro-F1 can exceed 0.8. Our findings highlight the value of smartphone usage data for fighting the epidemic.
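
The observation that usage correlates most strongly with cases occurring after it can be made concrete with a lagged Pearson correlation. The following is a minimal sketch, not the authors' analysis: the function names, the lag search, and the toy series are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def lagged_pearson(usage, cases, lag):
    """Correlate usage on day t with cases on day t + lag (lag >= 0)."""
    if len(usage) != len(cases):
        raise ValueError("series must have equal length")
    if lag < 0 or lag > len(usage) - 2:
        raise ValueError("lag out of range")
    if lag == 0:
        return pearson(usage, cases)
    return pearson(usage[:-lag], cases[lag:])


def best_lag(usage, cases, max_lag):
    """Lag in 0..max_lag with the largest absolute correlation."""
    return max(range(max_lag + 1),
               key=lambda k: abs(lagged_pearson(usage, cases, k)))


# Toy series (not real data): cases follow usage with a two-day delay.
usage = [1, 3, 2, 5, 4, 6, 8, 7]
cases = [0, 0, 10, 30, 20, 50, 40, 60]
lag = best_lag(usage, cases, max_lag=3)
```

A positive best lag means the usage signal leads the case counts, which is the property the abstract points to as a basis for inferring outbreak status.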

15 citations


Proceedings ArticleDOI
14 Jun 2021
TL;DR: In this article, the feasibility of re-purposing existing infrastructure of occupancy monitoring sensors and environmental sensors for the dual purpose of monitoring social distancing and supporting disease transmission risk estimation was evaluated.
Abstract: Social distancing is a critical tool for mitigating disease transmission, particularly in crowded indoor spaces. In this paper, we contribute by assessing the feasibility of re-purposing existing infrastructure of occupancy monitoring sensors and environmental sensors for the dual purpose of monitoring social distancing and supporting disease transmission risk estimation. We consider 410 continuous days of measurements from CO2 and PIR (passive infrared) motion detectors collected from a collaborative smart space in 2017-2018, prior to the start of the pandemic. We demonstrate how these sensors can be used to estimate occupancy levels, as well as analyze occupancy patterns within the space. We also consider the use of overall air quality within the space for deriving insights about potential transmission risks. Based on our analysis, we derive insights into how infrastructure-based sensors can be used to detect problematic areas in the space and offer guidelines on how to modify these areas to be more social distancing aware.

7 citations


Journal ArticleDOI
09 Jul 2021
TL;DR: In this article, the authors formulate private data release as a probabilistic modeling problem, which turns the design of synthetic data into choosing a model for the data and allows the inclusion of prior knowledge that improves the quality of the synthetic data.
Abstract: Differential privacy allows quantifying privacy loss resulting from accessing sensitive personal data. Repeated accesses to underlying data incur increasing loss. Releasing data as privacy-preserving synthetic data would avoid this limitation but would leave open the problem of designing what kind of synthetic data to release. We propose formulating the problem of private data release through probabilistic modeling. This approach transforms the problem of designing the synthetic data into choosing a model for the data, allowing also the inclusion of prior knowledge, which improves the quality of the synthetic data. We demonstrate empirically, in an epidemiological study, that statistical discoveries can be reliably reproduced from the synthetic data. We expect the method to have broad use in creating high-quality anonymized data twins of key datasets for research.

6 citations


Journal ArticleDOI
21 Apr 2021
TL;DR: This study reveals significant evolution in long-term app usage: 60%-70% of users change their app usage patterns over a period of more than 3 years, and a variety of app pattern change modes are discovered.
Abstract: In the past decade, mobile app usage has played an important role in our daily life. Existing studies have shown that app usage is intrinsically linked with, among others, demographics, social and economic factors. However, due to data limitations, most of these studies have a short time span and treat users in a static manner. To date, no study has shown whether changes in socioeconomic status or other demographics are reflected in long-term app usage behavior. In this paper, we contribute by presenting the first ever long-term study of individual mobile app usage dynamics and how app usage behavior of individuals is influenced by changes in socioeconomic demographic factors over time. Through a novel app dataset we collected, from which we extracted records of 1608 long-term users with more than 3 years of app usage and their detailed socioeconomic attributes, we verify the stable correlation between user app usage and user socioeconomic attributes over time and identify a number of representative app usage patterns in connection with specific user attributes. On this basis, we analyze the long-term app usage dynamics and reveal that there is significant evolution in long-term app usage: 60–70% of users change their app usage patterns over a period of more than 3 years. We further discover a variety of app pattern change modes and demonstrate that the long-term app usage behavior change reflects corresponding transitions in socioeconomic attributes, such as changes in civil status or family size and transitions in job or economic status.

4 citations


Journal ArticleDOI
TL;DR: In this article, the authors study the mobility laws of location-based games and find that, although such games increase mobility, the characteristics governing personal mobility remain consistent with a truncated Levy-flight model and the increase can be explained by a larger number of short hops, i.e., individuals explore their local neighborhoods more thoroughly instead of actively visiting new areas.
Abstract: Mobility is a fundamental characteristic of human society that shapes various aspects of our everyday interactions. This pervasiveness of mobility makes it paramount to understand factors that govern human movement and how it varies across individuals. Currently, factors governing variations in personal mobility are understudied with existing research focusing on explaining the aggregate behaviour of individuals. Indeed, empirical studies have shown that the aggregate behaviour of individuals follows a truncated Levy-flight model, but little understanding exists of the laws that govern intra-individual variations in mobility resulting from transportation choices, social interactions, and exogenous factors such as location-based mobile applications. Understanding these variations is essential for improving our collective understanding of human mobility, and the factors governing it. In this article, we study the mobility laws of location-based gaming—an emerging and increasingly popular exogenous factor influencing personal mobility. We analyse the mobility changes considering the popular PokemonGO application as a representative example of location-based games and study two datasets with different reporting granularity, one captured through location-based social media, and the other through smartphone application logging. Our analysis shows that location-based games, such as PokemonGO, increase mobility—in line with previous findings—but the characteristics governing mobility remain consistent with a truncated Levy-flight model and that the increase can be explained by a larger number of short-hops, i.e., individuals explore their local neighborhoods more thoroughly instead of actively visiting new areas. Our results thus suggest that intra-individual variations resulting from location-based gaming can be captured by re-parameterization of existing mobility models.
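
The re-parameterization idea in the abstract above can be illustrated with a small sketch. It assumes the truncated Levy-flight form commonly used in the human mobility literature, a power law with an exponential cutoff, P(r) proportional to (r + r0)^(-beta) * exp(-r / kappa); the abstract does not spell out the exact form, and the parameter values below are arbitrary illustrations rather than fitted results.

```python
import math


def truncated_levy_density(r, beta, r0, kappa):
    """Unnormalized density of a jump of length r (power law with cutoff)."""
    return (r + r0) ** (-beta) * math.exp(-r / kappa)


def short_hop_fraction(threshold, beta, r0, kappa, r_max=1000.0, steps=20000):
    """Share of probability mass on jumps shorter than `threshold`.

    Uses a midpoint-rule numerical integral over [0, r_max].
    """
    dr = r_max / steps
    short = 0.0
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        weight = truncated_levy_density(r, beta, r0, kappa) * dr
        total += weight
        if r < threshold:
            short += weight
    return short / total


# Arbitrary illustrative parameters (distances in km): the same model family
# with a shorter cutoff puts more of its mass on short hops.
baseline = short_hop_fraction(10.0, beta=1.75, r0=1.5, kappa=400.0)
more_local = short_hop_fraction(10.0, beta=1.75, r0=1.5, kappa=100.0)
```

Under these assumptions, a shift toward short hops shows up as a change in the fitted parameters rather than as a different model family, which is the sense in which the abstract speaks of re-parameterizing existing mobility models.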

1 citation