Journal Article

Tweet-scan-post: a system for analysis of sensitive private data disclosure in online social media

TL;DR: In this article, a Tweet-Scan-Post (TSP) framework is proposed to identify the presence of sensitive private data (SPD) in users’ posts under the personal, professional, and health domains.
Abstract: Social media technologies are open to users who intend to create a community and publish their opinions on recent incidents. Participants in online social networking sites remain unaware of how critical it is to disclose personal data to a public audience. Users’ private data are at high risk, which can lead to adverse effects such as cyberbullying, identity theft, and job loss. This research work defines user entities or data such as phone number, email address, family details, and health-related information as the user’s sensitive private data (SPD) on a social media platform. The proposed system, Tweet-Scan-Post (TSP), focuses on identifying the presence of SPD in users’ posts under the personal, professional, and health domains. The TSP framework is built on the standards and privacy regulations established by social networking sites and organizations like NIST, DHS, and GDPR. The TSP approach addresses the prevailing challenges in determining the presence of sensitive PII and user privacy within the bounds of confidentiality and trustworthiness. The TSP framework uses a novel layered classification approach with various state-of-the-art machine learning models to classify tweets as sensitive or insensitive. The findings of the TSP system include 201 Sensitive Privacy Keywords obtained using a boosting strategy and a sensitivity scaling that measures the degree of sensitivity associated with a tweet. The experimental results revealed that personal tweets were highly related to mother and children, professional tweets to apology, and health tweets to concern over the father’s health condition.
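The idea of scanning a post for SPD and attaching a degree of sensitivity can be illustrated with a minimal sketch. This is not the TSP system: the abstract describes a layered machine learning classifier, whereas the code below swaps in a simple rule-based scan. The keyword lists, the regular expressions, the `scan_tweet` function, and the scoring weights are all hypothetical and chosen only for illustration; the paper's 201 Sensitive Privacy Keywords are not reproduced here.

```python
import re

# Hypothetical keyword lists per domain (NOT the paper's 201 keywords).
DOMAIN_KEYWORDS = {
    "personal": {"mother", "children", "family", "birthday"},
    "professional": {"boss", "salary", "fired", "apology"},
    "health": {"diagnosis", "surgery", "hospital", "health"},
}

# Simplified, assumed patterns for two identifiers named in the abstract.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def scan_tweet(text):
    """Flag a tweet as sensitive and give it a toy sensitivity score."""
    # Lowercased word tokens; digits and punctuation are ignored here.
    tokens = set(re.findall(r"[a-z']+", text.lower()))

    # Keep only the domains in which at least one keyword was found.
    hits = {}
    for domain, words in DOMAIN_KEYWORDS.items():
        matched = sorted(tokens & words)
        if matched:
            hits[domain] = matched

    identifiers = EMAIL_RE.findall(text) + PHONE_RE.findall(text)

    # Assumed weighting: 1 per keyword, 2 per direct identifier.
    score = sum(len(words) for words in hits.values()) + 2 * len(identifiers)
    return {
        "sensitive": score > 0,
        "score": score,
        "domains": sorted(hits),
        "identifiers": identifiers,
    }


result = scan_tweet("My mother is in the hospital, call me at 555-123-4567")
print(result)
```

A tweet with no keyword or identifier match gets a score of 0 and is reported as insensitive. A learned classifier, as in the paper, would replace the fixed weights with a model trained on labelled tweets.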
Citations
Journal Article
TL;DR: In this article, the authors propose an approach based on machine learning and sentence embedding techniques with the primary goal of providing privacy awareness to users and, as a consequence, full control over their data during online activities.

22 citations

Proceedings Article
26 Jul 2022
TL;DR: An agent called Aegis is proposed to calculate the potential risk incurred by multi-party members in order to push privacy-preserving nudges to the sharer and is inspired by the consequentialist approach in normative ethical problem-solving techniques.
Abstract: The proliferation of social media set the foundation for the culture of over-disclosure where many people document every single event, incident, trip, etc. for everyone to see. Raising the individual's awareness of the privacy issues that they are subjecting themselves to can be challenging. This becomes more complex when the post being shared includes data "owned" by others. The existing approaches aiming to assist users in multi-party disclosure situations need to be revised to go beyond preferences to the "good" of the collective. This paper proposes an agent called Aegis to calculate the potential risk incurred by multi-party members in order to push privacy-preserving nudges to the sharer. Aegis is inspired by the consequentialist approach in normative ethical problem-solving techniques. The main contribution is the introduction of a social media-specific risk equation based on data valuation and the propagation of the post from intended to unintended audience. The proof-of-concept reports on how Aegis performs based on real-world data from the SNAP dataset and synthetically generated networks.

2 citations

Book Chapter
27 Sep 2022
TL;DR: In this paper, a novel model, K-MNSOA, is proposed for privacy-preserving data publishing; it protects against sensitive data privacy breaches even if the data set contains multiple sensitive numerical overlapped attributes.
Abstract: Knowledge is the most discussed and explored topic of today’s era. Everyone is working toward improving information and tries to treat it as a ladder to move forward. Data is the main source of information, and data is now regarded as big data because it contains vast amounts of information in all directions. Although knowledge is bliss, an adversary can also use this information to harm an individual. To protect data from an adversary, privacy-preserving data publishing techniques are used. But when a data set contains multiple sensitive attributes that are correlated with each other, several models are unable to protect the data efficiently. In this paper, a novel model, K-MNSOA, is proposed for privacy-preserving data publishing; it protects against sensitive data privacy breaches even if the data set contains multiple sensitive numerical overlapped attributes. The proposed model assumes that not all sensitive attributes are equally sensitive, so protecting all of them uniformly increases information loss. To overcome this issue, the new model divides sensitive data into levels of sensitivity and applies generalization only to protect highly sensitive attributes.
Journal Article
TL;DR: In this article, a blockchain-based framework for decentralised online social networks is proposed to provide advanced security and privacy to OSN users; the authors focus on the privacy and security challenges that accompany OSNs and illustrate some attack models.
Abstract: Online social networking has become an integral component of human life. Many users, on the other hand, are unaware of the security and privacy concerns that come along with its use. As a result of social media sites like Facebook and Twitter, people’s personal information, such as their date of birth, phone number, and profile photos, might be dangerously exposed. To get access to private information, such as a user’s banking credentials, or to launch a security attack, hackers first obtain the user’s public information from their social media posts. Assaults or leaks of personal information might have a significant impact on users’ daily lives. In this day and age of cutting-edge technology, internet users must understand the risks of using social media websites like Facebook and Twitter. The current status of online social networks, their hazards, and possible solutions are examined in depth in this paper. In this review paper, we deliberately focus on the privacy and security challenges that accompany OSNs, illustrate some attack models, and investigate some techniques used to secure a user’s private information and protect it from attackers, while leaving privacy destruction mostly unresolved. In the end, we propose a blockchain-based framework for decentralised online social networks that provides advanced security and privacy to OSN users.
References
Journal Article
TL;DR: This article investigates how content producers navigate ‘imagined audiences’ on Twitter, talking with participants who have different types of followings to understand their techniques, including targeting different audiences, concealing subjects, and maintaining authenticity.
Abstract: Social media technologies collapse multiple audiences into single contexts, making it difficult for people to use the same techniques online that they do to handle multiplicity in face-to-face conversation. This article investigates how content producers navigate ‘imagined audiences’ on Twitter. We talked with participants who have different types of followings to understand their techniques, including targeting different audiences, concealing subjects, and maintaining authenticity. Some techniques of audience management resemble the practices of ‘micro-celebrity’ and personal branding, both strategic self-commodification. Our model of the networked audience assumes a many-to-many communication through which individuals conceptualize an imagined audience evoked through their tweets.

3,062 citations

Book Chapter
Robert E. Schapire
01 Jan 2003
TL;DR: This chapter overviews some of the recent work on boosting including analyses of AdaBoost’s training error and generalization error; boosting’s connection to game theory and linear programming; the relationship between boosting and logistic regression; extensions of AdaBoost for multiclass classification problems; methods of incorporating human knowledge into boosting; and experimental and applied work using boosting.
Abstract: Boosting is a general method for improving the accuracy of any given learning algorithm. Focusing primarily on the AdaBoost algorithm, this chapter overviews some of the recent work on boosting including analyses of AdaBoost’s training error and generalization error; boosting’s connection to game theory and linear programming; the relationship between boosting and logistic regression; extensions of AdaBoost for multiclass classification problems; methods of incorporating human knowledge into boosting; and experimental and applied work using boosting.

1,979 citations
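The boosting idea summarized in this entry can be made concrete with a minimal sketch of AdaBoost for binary labels in {-1, +1}, using one-dimensional threshold stumps as the weak learner. The function names (`train_stump`, `stump_predict`, `adaboost`, `predict`) and the toy data set are assumptions made for illustration and do not come from the chapter.

```python
import math


def train_stump(xs, ys, weights):
    """Return (threshold, polarity, weighted_error) of the best stump.

    A stump predicts `polarity` when x < threshold and -polarity otherwise.
    """
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(
                w for x, y, w in zip(xs, ys, weights)
                if (polarity if x < threshold else -polarity) != y
            )
            if best is None or error < best[2]:
                best = (threshold, polarity, error)
    return best


def stump_predict(stump, x):
    threshold, polarity = stump
    return polarity if x < threshold else -polarity


def adaboost(xs, ys, rounds):
    """Train `rounds` weighted stumps; return a list of (alpha, stump)."""
    n = len(xs)
    weights = [1.0 / n] * n  # start from the uniform distribution
    ensemble = []
    for _ in range(rounds):
        threshold, polarity, error = train_stump(xs, ys, weights)
        # Clamp the error to avoid division by zero or log of zero.
        error = min(max(error, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - error) / error)
        stump = (threshold, polarity)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified points, lower the rest.
        weights = [
            w * math.exp(-alpha * y * stump_predict(stump, x))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single threshold stump can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs])
```

On this toy set a single stump misclassifies at least three points, while the weighted vote of three stumps fits all ten, which is the sense in which boosting improves the accuracy of a weak learner.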

Journal Article
TL;DR: It is argued that individuals deal with UGM in different ways for different purposes: they consume contents for fulfilling their information, entertainment, and mood management needs; they participate through interacting with the content as well as with other users for enhancing social connections and virtual communities.
Abstract: Purpose – User‐generated media (UGM) like YouTube, MySpace, and Wikipedia have become tremendously popular over the last few years. The purpose of this paper is to present an analytical framework for explaining the appeal of UGM. Design/methodology/approach – This paper is mainly theoretical due to a relative lack of empirical evidence. After an introduction on the emergence of UGM, this paper investigates in detail how and why people use UGM, and what factors make UGM particularly appealing, through a uses and gratifications perspective. Finally, the key elements of this study are summarized and the future research directions about UGM are discussed. Findings – This paper argues that individuals deal with UGM in different ways for different purposes: they consume contents for fulfilling their information, entertainment, and mood management needs; they participate through interacting with the content as well as with other users for enhancing social connections and virtual communities; and they produce their...

971 citations

Journal Article
TL;DR: A large quantity of techniques and methods are categorized and compared in the area of sentiment analysis, and different types of data and advanced tools for research are introduced, as well as their limitations.
Abstract: Sentiments or opinions from social media provide the most up-to-date and inclusive information, due to the proliferation of social media and the low barrier for posting the message. Despite the growing importance of sentiment analysis, this area lacks a concise and systematic arrangement of prior efforts. It is essential to: (1) analyze its progress over the years, (2) provide an overview of the main advances achieved so far, and (3) outline remaining limitations. Several essential aspects, therefore, are addressed within the scope of this survey. On the one hand, this paper focuses on presenting typical methods from three different perspectives (task-oriented, granularity-oriented, methodology-oriented) in the area of sentiment analysis. Specifically, a large quantity of techniques and methods are categorized and compared. On the other hand, different types of data and advanced tools for research are introduced, as well as their limitations. On the basis of these materials, the essential prospects lying ahead for sentiment analysis are identified and discussed.

290 citations