Machine Learning with partially labeled Data for Indoor Outdoor Detection
Summary (2 min read)
Introduction
- The authors also propose to extend it to mobile networks, to address the challenge of detecting the environmental context of mobile users from the network side.
- The data measured by multiple UEs during their connections is sent to the eNB using standardized procedures.
- The authors are interested in Machine Learning (ML), a popular family of techniques, for automatic Indoor Outdoor Detection (IOD).
- Among ML families, the authors consider supervised learning and more particularly semi-supervised learning which can be seen as a mix of supervised and unsupervised approaches.
III. COLLECTED DATA FOR IOD
- The authors analyze the statistical differences between indoor and outdoor environments by comparing their empirical cumulative distribution functions (CDFs), using a large real-world dataset collected at multiple places and in many environments.
- The authors illustrate the impact of the two environments on the empirical CDFs, according to where the data is collected.
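The CDF comparison described above can be illustrated with a minimal sketch. The RSRP-like values below are synthetic and chosen only for illustration; the means, spreads, and the helper name `ecdf` are assumptions, not the paper's data or code.

```python
import random
from bisect import bisect_right

def ecdf(samples):
    """Return a function x -> fraction of samples <= x (the empirical CDF)."""
    ordered = sorted(samples)
    n = len(ordered)

    def cdf(x):
        # bisect_right counts how many sorted samples are <= x.
        return bisect_right(ordered, x) / n

    return cdf

# Synthetic RSRP-like values in dBm (illustrative assumption, not measured data).
rng = random.Random(0)
indoor = [rng.gauss(-105.0, 8.0) for _ in range(2000)]
outdoor = [rng.gauss(-90.0, 8.0) for _ in range(2000)]

indoor_cdf, outdoor_cdf = ecdf(indoor), ecdf(outdoor)

# Print both CDFs at a few signal levels to expose the offset between the curves.
for level in (-110, -100, -90, -80):
    print(level, round(indoor_cdf(level), 3), round(outdoor_cdf(level), 3))
```

With these synthetic inputs the indoor curve sits to the left of the outdoor one, which is the kind of offset the summary attributes to different propagation conditions.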
A. Data Description
- The authors' large dataset consists of a timestamp, 3 LUMD radio signals, the Timing Advance (TA) metric, and the label when it is known.
- These signals were collected around the clock over 9 months (from October 2017 until June 2018), with an average of 1 measurement per 15 seconds while the mobile phone session is active and 1 measurement per 2 minutes otherwise.
- The dataset is made up of 40% labeled data and 60% unlabeled data.
B. Data collection: crowdsourcing vs. drive-test mode
- In crowdsourcing mode, the collected data consists of signals measured by the mobile phone and sent to the eNB.
- The significant offset between the indoor and outdoor curves results from substantial differences in radio signal propagation and in attenuation.
- Also, the extreme values in the tails of the indoor and outdoor CDFs become similar, and the division between the two becomes blurred.
- To model this way of collecting data, referred to as drive-test mode, the authors extract a portion of the data (EPD) from the whole dataset.
IV. CLASSIFICATION USING SUPERVISED LEARNING OR CLUSTERING
- After analyzing the statistical properties of I/O environments, the authors first evaluate the accuracy and the performance of supervised classifiers for IOD.
- For this, the authors use the accuracy metric, which is the ratio of correctly classified instances to the total number of instances, and the F1-score, which is the harmonic mean of Precision and Recall: F1-score = 2 · Precision · Recall / (Precision + Recall).
- Additionally, Tables I and III show that (RSRP, CQI) as input provides results similar to (RSRP, RSRQ) when used for classifying EPD or crowdsourcing data.
- The results show that the information contained in CQI is also useful for IOD and thus (RSRP, CQI) is also a good candidate for IOD.
- Learning the user environment based only on drive-test data is thus not enough to capture the complexities of users' real lives.
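The two evaluation metrics named above can be computed as in the following sketch. It assumes a binary setting with "indoor" treated as the positive class; that choice, the function name `accuracy_and_f1`, and the toy labels are illustrative assumptions rather than details taken from the paper.

```python
def accuracy_and_f1(y_true, y_pred, positive="indoor"):
    """Return (accuracy, F1) for a binary task, with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Accuracy: correctly classified instances over all instances.
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)

    # Precision and recall, guarding against empty denominators.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1

# Toy labels (hypothetical, for illustration only).
y_true = ["indoor", "indoor", "indoor", "outdoor", "outdoor", "outdoor", "outdoor", "indoor"]
y_pred = ["indoor", "indoor", "outdoor", "outdoor", "outdoor", "indoor", "outdoor", "indoor"]

acc, f1 = accuracy_and_f1(y_true, y_pred)
print(acc, f1)
```

Reporting F1 alongside accuracy matters when one environment dominates the data, because accuracy alone can look good while one class is mostly misclassified.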
VI. RESULTS AND DISCUSSION
- This section evaluates the performance of HSSL on the crowdsourced data.
- ReLU is the most widely used activation function in neural network design today.
- The system receives both labeled and unlabeled data as inputs.
- For this, the authors compare HSSL (including SVM or DL) with SVM and DL alone, when trained on the same amount of tagged data; the only difference is that HSSL also uses untagged data.
- The authors observe that IOD performs better using DL than using SVM in both cases.
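The general idea of training on both tagged and untagged data can be sketched with plain self-training. This is not the authors' HSSL system: it swaps in a nearest-centroid classifier in place of SVM or DL, and the synthetic (RSRP, RSRQ)-like features, the margin threshold, and all function names are assumptions made for illustration.

```python
import math
import random

def centroids(points, labels):
    """Mean feature vector per class."""
    sums, counts = {}, {}
    for (x, y), lab in zip(points, labels):
        sx, sy = sums.get(lab, (0.0, 0.0))
        sums[lab] = (sx + x, sy + y)
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: (sx / counts[lab], sy / counts[lab]) for lab, (sx, sy) in sums.items()}

def predict(cents, point):
    """Return (label, margin): the nearest centroid and its lead over the runner-up."""
    dists = sorted((math.dist(point, c), lab) for lab, c in cents.items())
    return dists[0][1], dists[1][0] - dists[0][0]

def self_train(labeled, labels, unlabeled, margin=5.0, rounds=5):
    """Repeatedly pseudo-label confident untagged points and refit the centroids."""
    points, labs, pool = list(labeled), list(labels), list(unlabeled)
    for _ in range(rounds):
        cents = centroids(points, labs)
        confident, rest = [], []
        for p in pool:
            lab, m = predict(cents, p)
            (confident if m >= margin else rest).append((p, lab))
        if not confident:
            break  # nothing confident enough to add; stop early
        for p, lab in confident:
            points.append(p)
            labs.append(lab)
        pool = [p for p, _ in rest]
    return centroids(points, labs)

# Synthetic two-feature data (illustrative assumption, not the paper's dataset).
rng = random.Random(1)

def sample(center, n):
    return [(rng.gauss(center[0], 5.0), rng.gauss(center[1], 2.0)) for _ in range(n)]

INDOOR, OUTDOOR = (-105.0, -14.0), (-90.0, -9.0)
tagged = sample(INDOOR, 20) + sample(OUTDOOR, 20)
tags = ["indoor"] * 20 + ["outdoor"] * 20
untagged = sample(INDOOR, 300) + sample(OUTDOOR, 300)
test_x = sample(INDOOR, 200) + sample(OUTDOOR, 200)
test_y = ["indoor"] * 200 + ["outdoor"] * 200

model = self_train(tagged, tags, untagged)
hits = sum(predict(model, x)[0] == y for x, y in zip(test_x, test_y))
print("accuracy:", hits / len(test_y))
```

Only points whose nearest centroid leads the runner-up by at least `margin` are pseudo-labeled, which limits how many wrong labels from ambiguous measurements are fed back into training.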
VII. CONCLUSION
- The authors investigated the problem of IOD performed at the network side using 3GPP signals and Timing Advance data collected inside the infrastructure.
- The authors first showed that a drive-test dataset is insufficient to mimic real-world complexity and reveal real user behavior.
- Using a highly representative crowdsourced dataset, the authors showed that the more diverse the environments seen during the training phase, the better the supervised classifier performs.
- The authors also showed that adding a new parameter, Timing Advance, can improve IOD performance.
- The HSSL system presents satisfactory performance even when facing unknown environments.
Citations
7 citations
Cites background or methods from "Machine Learning with partially lab..."
...This unbalancing between the categories of various environments, observed in case of two classes in [3], is also augmented when classifying with more than two classes....
[...]
...In [12], authors used the same signals as [3], but with addition of a mobility indicator to solve some difficult cases of detection like when the user is in train (outdoor environment), and suffers from a drastic deterioration of the RSRP signal, then the Mobility Indicator helps to better detect such case....
[...]
...Both papers [3] and [12] show good performance of the UED...
[...]
...collected by the phone device and sent to the mobile network via 3GPP procedures [3]....
[...]
...In [3], [12] authors used grid search to optimize the hyperparameters of their deep learning model....
[...]
7 citations
Cites background or methods from "Machine Learning with partially lab..."
...On the other hand, the Indoor-Outdoor (IO) can be detected by analyzing signal strengths from external sensors [8], [9], by recognizing certain landmarks with image processing [10], or by looking for specific signal features with the help of machine learning algorithms [11]–[13]....
[...]
...This can be realized by machine leaning methods such as in [8], [11]–[13], [21] as well as by image processing methods to recognize certain landmarks [10]....
[...]
4 citations
Additional excerpts
...In fact, different types of data representing the channel state may be used, such as SNIR, CQI, and others [17]....
[...]
References
47,974 citations
"Machine Learning with partially lab..." refers methods in this paper
...We have used both scikit-learn [20] and keras [21] in python for the HSSL implementation....
[...]
440 citations
"Machine Learning with partially lab..." refers background in this paper
...Recently, mobile devices are being utilized to know the consuming habits of individuals and communities [1], [2], [3]....
[...]
330 citations
"Machine Learning with partially lab..." refers methods in this paper
...As in [5], [15], [16], our approach uses both tagged and untagged data in order to improve the IOD classifier training, while maintaining the same good performances for a given ratio of tagged and untagged data....
[...]