scispace - formally typeset
Author

R. Geetha

Bio: R. Geetha is an academic researcher from Sri Sivasubramaniya Nadar College of Engineering. The author has contributed to research in topics: Privacy law & Regret. The author has an h-index of 1 and has co-authored 2 publications receiving 2 citations.
Topics: Privacy law, Regret, Social media, Identity theft

Papers
Journal ArticleDOI
TL;DR: The Tweet-Scan-Post system scans tweets contextually for sensitive messages and formulates a sensitivity scaling called TSP’s Tweet Sensitivity Scale based on Senti-Cyber features composed of Sensitive Privacy Keywords, Cyber-keywords, Non-Sensitive Privacy Keywords and Non-Cyber-keywords to detect the degree of disclosed sensitive information.
Abstract: Twitter is an extensively used micro-blogging site for publishing users’ views on recent happenings. The wide reachability of messages over a large audience poses a threat, as the degree of personally identifiable information disclosed might lead to user regrets. The Tweet-Scan-Post system scans tweets contextually for sensitive messages. The tweet repository was generated using cyber-keywords for personal, professional and health tweets. The Rules of Sensitivity and Contextuality were defined based on standards established by various national regulatory bodies. The naive sensitivity regression function uses a Bag-of-Words model built from short text messages. The imbalanced classes in the dataset, with 25% sensitive and 75% insensitive tweets, result in misclassification. The system opted for stacked classification to combat the problem of imbalanced classes. The system initially applied various state-of-the-art algorithms and predicted 26% of the tweets to be sensitive; the proposed stacked classification approach increased the overall proportion of sensitive tweets to 35%. The system contributes a vocabulary set of 201 Sensitive Privacy Keywords built using a boosting approach for three tweet categories. Finally, the system formulates a sensitivity scaling called TSP’s Tweet Sensitivity Scale based on Senti-Cyber features composed of Sensitive Privacy Keywords, Cyber-keywords, Non-Sensitive Privacy Keywords and Non-Cyber-keywords to detect the degree of disclosed sensitive information.
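The keyword-driven Bag-of-Words scoring the abstract describes can be sketched roughly as follows; the keyword lists and the token-fraction scoring rule are illustrative assumptions, not the paper’s actual 201-keyword vocabulary or its regression function.

```python
from collections import Counter

# Hypothetical keyword lists for illustration; the paper's real vocabulary
# of 201 Sensitive Privacy Keywords is not reproduced here.
SENSITIVE_KEYWORDS = {"ssn", "diagnosed", "salary", "address", "passport"}
CYBER_KEYWORDS = {"account", "password", "login", "email"}

def sensitivity_score(tweet: str) -> float:
    """Score a tweet by the fraction of its tokens that match a sensitive
    or cyber keyword — a crude stand-in for a sensitivity scale."""
    tokens = tweet.lower().split()
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    hits = sum(c for t, c in counts.items()
               if t in SENSITIVE_KEYWORDS or t in CYBER_KEYWORDS)
    return hits / len(tokens)
```

In this sketch a tweet like "my passport and ssn leaked" scores 0.4 (2 keyword hits out of 5 tokens), illustrating how keyword density could map onto a graded sensitivity scale.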

8 citations

Journal ArticleDOI
TL;DR: In this article, a Tweet-Scan-Post (TSP) framework is proposed to identify the presence of sensitive private data (SPD) in user's posts under personal, professional, and health domains.
Abstract: Social media technologies are open to users who intend to create communities and publish their opinions on recent incidents. Participants in online social networking sites often remain unaware of how critical disclosing personal data to a public audience can be. Users’ private data are at high risk, leading to many adverse effects such as cyberbullying, identity theft, and job loss. This research work defines user entities or data such as phone numbers, email addresses, family details, and health-related information as a user’s sensitive private data (SPD) on a social media platform. The proposed system, Tweet-Scan-Post (TSP), focuses on identifying the presence of SPD in users’ posts under the personal, professional, and health domains. The TSP framework is built on the standards and privacy regulations established by social networking sites and organizations such as NIST, DHS, and the GDPR. The TSP approach addresses the prevailing challenges in determining the presence of sensitive PII and in preserving user privacy within the bounds of confidentiality and trustworthiness. A novel layered classification approach with various state-of-the-art machine learning models is used by the TSP framework to classify tweets as sensitive or insensitive. The findings of the TSP system include 201 Sensitive Privacy Keywords obtained using a boosting strategy and a sensitivity scaling that measures the degree of sensitivity associated with a tweet. The experimental results revealed that personal tweets were highly related to mothers and children, professional tweets to apologies, and health tweets to concern over a father’s health condition.
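A layered classification of the kind the abstract describes could be sketched as below: a first stage flags a tweet as sensitive or insensitive, and a second stage assigns the domain. The domain keyword sets and the set-intersection rule are hypothetical stand-ins for the trained machine-learning models.

```python
# Hypothetical domain keyword sets, loosely echoing the themes the paper
# reports (mother/children, apology, father's health); illustrative only.
DOMAINS = {
    "personal": {"mother", "children", "family", "birthday"},
    "professional": {"apology", "resign", "salary", "fired"},
    "health": {"father", "surgery", "diagnosed", "hospital"},
}

def classify(tweet: str):
    """Layer 1: flag the tweet as sensitive if any domain keyword appears.
    Layer 2: assign the domain with the most keyword matches."""
    tokens = set(tweet.lower().split())
    matches = {d: len(tokens & kws) for d, kws in DOMAINS.items()}
    if max(matches.values()) == 0:
        return ("insensitive", None)
    return ("sensitive", max(matches, key=matches.get))
```

For example, "my father had surgery today" would be flagged sensitive and routed to the health domain, while "nice weather today" passes through as insensitive.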

5 citations


Cited by
Journal ArticleDOI
TL;DR: In this article, the authors propose an approach based on machine learning and sentence embedding techniques with the primary goal of providing privacy awareness to users and, as a consequence, full control over their data during online activities.

22 citations

Proceedings ArticleDOI
26 Jul 2022
TL;DR: An agent called Aegis is proposed to calculate the potential risk incurred by multi-party members in order to push privacy-preserving nudges to the sharer and is inspired by the consequentialist approach in normative ethical problem-solving techniques.
Abstract: The proliferation of social media set the foundation for the culture of over-disclosure where many people document every single event, incident, trip, etc. for everyone to see. Raising the individual's awareness of the privacy issues that they are subjecting themselves to can be challenging. This becomes more complex when the post being shared includes data "owned" by others. The existing approaches aiming to assist users in multi-party disclosure situations need to be revised to go beyond preferences to the "good" of the collective. This paper proposes an agent called Aegis to calculate the potential risk incurred by multi-party members in order to push privacy-preserving nudges to the sharer. Aegis is inspired by the consequentialist approach in normative ethical problem-solving techniques. The main contribution is the introduction of a social media-specific risk equation based on data valuation and the propagation of the post from intended to unintended audience. The proof-of-concept reports on how Aegis performs based on real-world data from the SNAP dataset and synthetically generated networks.
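A propagation-based risk score of the kind Aegis’s abstract outlines might look like the sketch below. The formula, the parameter names, and the geometric reshare model are illustrative assumptions, not the paper’s actual risk equation.

```python
def disclosure_risk(data_value: float, audience: int,
                    reshare_prob: float, hops: int) -> float:
    """Toy risk score: the value of the disclosed data, weighted by the
    expected unintended reach as the post propagates beyond the intended
    audience over successive reshare hops (geometric decay)."""
    unintended_reach = audience * sum(reshare_prob ** h
                                      for h in range(1, hops + 1))
    return data_value * unintended_reach
```

Under these assumptions, a post with data value 1.0 shared to 100 followers with a 50% reshare probability over 2 hops scores 75.0; the intuition is simply that risk grows with both the valuation of the data and the expected spread from intended to unintended audiences.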

2 citations

Journal ArticleDOI
TL;DR: This work proposes and evaluates a set of approaches for automatically detecting second- and third-party disclosures on Twitter of sensitive private information, a subset of which constitutes doxing, and compares nine different approaches for automated detection based on string-matching and one-hot encoded heuristics, as well as word and contextualized string embedding representations of tweets.
Abstract: Doxing refers to the practice of disclosing sensitive personal information about a person without their consent. This form of cyberbullying is an unpleasant and sometimes dangerous phenomenon for online social networks. Although prior work exists on automated identification of other types of cyberbullying, a need exists for methods capable of detecting doxing on Twitter specifically. We propose and evaluate a set of approaches for automatically detecting second- and third-party disclosures on Twitter of sensitive private information, a subset of which constitutes doxing. We summarize our findings of common intentions behind doxing episodes and compare nine different approaches for automated detection based on string-matching and one-hot encoded heuristics, as well as word and contextualized string embedding representations of tweets. We identify an approach providing 96.86% accuracy and 97.37% recall using contextualized string embeddings and conclude by discussing the practicality of our proposed methods.
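One of the string-matching heuristics the paper compares could be sketched along these lines; the regex patterns and the rule of pairing a phone-number match with a third-party referent are hypothetical illustrations, not the authors’ actual detectors.

```python
import re

# A tweet is flagged only when a phone-number-like pattern co-occurs with a
# third-party referent, approximating second-/third-party disclosure.
PHONE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")
THIRD_PARTY = re.compile(r"\b(his|her|their)\b", re.IGNORECASE)

def flags_disclosure(tweet: str) -> bool:
    """Flag a tweet that discloses a phone number about someone else."""
    return bool(PHONE.search(tweet)) and bool(THIRD_PARTY.search(tweet))
```

A first-party disclosure ("call me at 555-123-4567") is deliberately not flagged, since doxing concerns information disclosed about others; the embedding-based approaches in the paper replace such brittle patterns with learned representations.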

2 citations

Book ChapterDOI
14 Nov 2022
TL;DR: In this paper, an adversarial process between a discriminator and a generator is proposed: the discriminator learns to distinguish false samples created by the generator from genuine ones, while the generator learns to map samples from a given (simple) prior distribution to synthetic data that appears realistic.
Abstract: The Generative Adversarial Network (GAN) has made incredible progress in creating realistic synthetic data. The authors propose a structure for creating convincing text through adversarial training. It enables the generation of new sentences that maintain the semantics and syntax of genuine phrases while being potentially distinct from any of the examples used to evaluate the model. The authors propose an adversarial process between a discriminator and a generator: the discriminator’s goal is to distinguish false samples created by the generator from those which are genuine, while the generator is trained to map samples from a given (simple) prior distribution to synthetic data that appears realistic. In this paper, the authors present various classifiers, test them against various performance metrics, and develop a suitable model to test the authenticity of tweets.
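The adversarial process the abstract describes can be illustrated with a deliberately tiny, pure-Python toy: a one-parameter generator shifts Gaussian noise, a logistic discriminator tries to tell its output from real data, and the two are updated in alternation. Everything here (the 1-D data, the learning rates, the non-saturating generator update) is an illustrative assumption, not the chapter’s text-generation model.

```python
import math
import random

random.seed(0)
REAL_MEAN = 3.0  # the "real" data distribution is N(3, 1)

def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, v))))

theta = 0.0      # generator parameter: G(z) = z + theta
w, b = 0.0, 0.0  # discriminator: D(x) = sigmoid(w*x + b)
lr, batch = 0.05, 64

for _ in range(500):
    real = [random.gauss(REAL_MEAN, 1.0) for _ in range(batch)]
    fake = [random.gauss(0.0, 1.0) + theta for _ in range(batch)]

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0
    # (gradient ascent on the logistic log-likelihood).
    gw = (sum((1 - sigmoid(w * x + b)) * x for x in real)
          - sum(sigmoid(w * x + b) * x for x in fake))
    gb = (sum(1 - sigmoid(w * x + b) for x in real)
          - sum(sigmoid(w * x + b) for x in fake))
    w += lr * gw / batch
    b += lr * gb / batch

    # Generator step (non-saturating loss): move fakes toward the region
    # the discriminator currently labels "real".
    gt = sum((1 - sigmoid(w * x + b)) * w for x in fake)
    theta += lr * gt / batch
```

After training, theta drifts toward REAL_MEAN: the generator has learned to imitate the real distribution purely from the discriminator’s feedback, which is the same adversarial mechanism the chapter applies to text.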