scispace - formally typeset
Author

Tanmoy Chakraborty

Bio: Tanmoy Chakraborty is an academic researcher from Indraprastha Institute of Information Technology. The author has contributed to research in topics: Computer science & Density functional theory. The author has an h-index of 26, co-authored 319 publications receiving 2782 citations. Previous affiliations of Tanmoy Chakraborty include Jadavpur University & Manipal University Jaipur.


Papers
Proceedings ArticleDOI
01 Sep 2019
TL;DR: SpotFake, a multi-modal framework that detects fake news without taking any other subtask into account, exploiting both the textual and visual features of an article.
Abstract: The rapid growth in the amount of fake news on social media is a serious concern for our society. Fake news is usually created by manipulating images, text, audio, and videos, which indicates the need for a multimodal fake news detection system. Existing multimodal fake news detection systems, however, tend to solve the problem by adding an auxiliary subtask, such as an event discriminator, or by finding correlations across the modalities. Their results are heavily dependent on the subtask: in the absence of subtask training, the performance of fake news detection degrades by 10% on average. To solve this issue, we introduce SpotFake, a multi-modal framework for fake news detection. Our proposed solution detects fake news without taking any other subtask into account. It exploits both the textual and visual features of an article: we use language models (such as BERT) to learn text features, while image features are learned from VGG-19 pre-trained on the ImageNet dataset. All experiments are performed on two publicly available datasets, Twitter and Weibo. The proposed model outperforms the current state of the art on the Twitter and Weibo datasets by 3.27% and 6.83%, respectively.
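The fusion scheme described in the abstract can be sketched in a few lines of plain Python. This is only an illustrative toy, not the authors' implementation: in SpotFake the text vector comes from BERT and the image vector from a pre-trained VGG-19, whereas here tiny hand-made vectors, a hypothetical `spotfake_score` helper, and made-up weights stand in for both.

```python
import math

def dense(vec, weights, bias):
    # One fully connected layer; weights holds one row of input
    # weights per output unit.
    return [sum(v * w for v, w in zip(vec, row)) + b
            for row, b in zip(weights, bias)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def spotfake_score(text_feat, image_feat, weights, bias):
    # Late fusion: concatenate the two modality vectors, then apply
    # a single-unit classifier producing a fake-news probability.
    fused = text_feat + image_feat
    return sigmoid(dense(fused, weights, bias)[0])

# Toy usage with stand-in features and weights.
score = spotfake_score([0.2, 0.7], [0.1], [[0.5, -0.3, 0.8]], [0.0])
```

The point is the shape of the architecture, not the numbers: no auxiliary subtask appears anywhere, matching the paper's claim.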

192 citations

Journal ArticleDOI
TL;DR: A survey of the metrics used for community detection and evaluation can be found in this paper, where the authors also conduct experiments on synthetic and real networks to present a comparative analysis of these metrics in measuring the goodness of the underlying community structure.
Abstract: Detecting and analyzing dense groups or communities from social and information networks has attracted immense attention over the last decade due to its enormous applicability in different domains. Community detection is an ill-defined problem, as the nature of the communities is not known in advance. The problem is further complicated by the fact that communities emerge in the network in various forms such as disjoint, overlapping, and hierarchical. Various heuristics have been proposed to address these challenges, depending on the application at hand. All these heuristics have been materialized in the form of new metrics, which in most cases are used as optimization functions for detecting the community structure, or provide an indication of the goodness of detected communities during evaluation. Over the last decade, a large number of such metrics have been proposed. Thus, there arises a need for an organized and detailed survey of the metrics proposed for community detection and evaluation. Here, we present a survey of the state-of-the-art metrics used for the detection and the evaluation of community structure. We also conduct experiments on synthetic and real networks to present a comparative analysis of these metrics in measuring the goodness of the underlying community structure.
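Among the optimization functions such a survey covers, Newman's modularity is the most widely used; as a concrete example, a self-contained sketch computing it on a toy graph of two triangles joined by a bridge (the graph and values are illustrative, not taken from the paper):

```python
def modularity(adj, comm):
    # Newman modularity:
    # Q = (1/2m) * sum_ij [A_ij - k_i*k_j/(2m)] * delta(c_i, c_j)
    deg = [sum(row) for row in adj]
    two_m = sum(deg)            # twice the number of edges
    q = 0.0
    for i in range(len(adj)):
        for j in range(len(adj)):
            if comm[i] == comm[j]:
                q += adj[i][j] - deg[i] * deg[j] / two_m
    return q / two_m

# Two triangles (0-1-2 and 3-4-5) joined by the bridge edge 2-3.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
q = modularity(adj, [0, 0, 0, 1, 1, 1])   # 5/14, about 0.357
```

Putting each triangle in its own community yields Q = 5/14; a metric like this is exactly what detection algorithms optimize and what evaluation compares against.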

189 citations

Book ChapterDOI
TL;DR: A manually annotated dataset of 10,700 social media posts and articles of real and fake news on COVID-19 is curated and released, and four machine learning baselines are benchmarked.
Abstract: Along with the COVID-19 pandemic we are also fighting an `infodemic': fake news and rumors are rampant on social media. Believing in rumors can cause significant harm, and this is further exacerbated at the time of a pandemic. To tackle this, we curate and release a manually annotated dataset of 10,700 social media posts and articles of real and fake news on COVID-19. We benchmark the annotated dataset with four machine learning baselines: Decision Tree, Logistic Regression, Gradient Boost, and Support Vector Machine (SVM). We obtain the best performance of 93.46% F1-score with the SVM. The data and code are available at: this https URL
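The 93.46% figure is an F1-score, the harmonic mean of precision and recall on the positive class. A minimal reference implementation for the binary case (the toy labels below are illustrative, not drawn from the dataset):

```python
def f1_score(y_true, y_pred, positive=1):
    # Count true positives, false positives, and false negatives.
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy example: 2 true positives, 1 false positive, 1 false negative.
f1 = f1_score([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])   # 2/3
```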

178 citations

Proceedings ArticleDOI
24 Aug 2014
TL;DR: Compared to other metrics, permanence provides a more accurate estimate of a derived community structure to the ground-truth community and is more sensitive to perturbations in the network.
Abstract: Despite the prevalence of community detection algorithms, relatively less work has been done on understanding whether a network is indeed modular and how resilient the community structure is under perturbations. To address this issue, we propose a new vertex-based metric called "permanence", that can quantitatively give an estimate of the community-like structure of the network. The central idea of permanence is based on the observation that the strength of membership of a vertex to a community depends upon the following two factors: (i) the distribution of external connectivity of the vertex to individual communities and not the total external connectivity, and (ii) the strength of its internal connectivity and not just the total internal edges. In this paper, we demonstrate that compared to other metrics, permanence provides (i) a more accurate estimate of a derived community structure to the ground-truth community and (ii) is more sensitive to perturbations in the network. As a by-product of this study, we have also developed a community detection algorithm based on maximizing permanence. For a modular network structure, the results of our algorithm match well with ground-truth communities.
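Following the two factors named in the abstract, permanence combines a vertex's internal connections I(v), its maximum connections E_max(v) to any single external community, its degree D(v), and the clustering coefficient c_in(v) of its internal neighbours: Perm(v) = I(v)/(E_max(v)*D(v)) - (1 - c_in(v)). The sketch below is a plain-Python reading of that formula on a toy graph; treating E_max as 1 when a vertex has no external neighbours is an assumption made here for the sketch, not taken from the paper.

```python
def permanence(v, adj, comm):
    # Perm(v) = I(v) / (E_max(v) * D(v)) - (1 - c_in(v))
    neighbors = [u for u in range(len(adj)) if adj[v][u]]
    internal = [u for u in neighbors if comm[u] == comm[v]]
    # Connections from v to each external community, kept separate
    # (factor (i): the distribution, not the total).
    ext = {}
    for u in neighbors:
        if comm[u] != comm[v]:
            ext[comm[u]] = ext.get(comm[u], 0) + 1
    e_max = max(ext.values()) if ext else 1   # assumed fallback
    # c_in: fraction of pairs of internal neighbours that are linked
    # (factor (ii): the strength of internal connectivity).
    pairs = [(a, b) for i, a in enumerate(internal) for b in internal[i + 1:]]
    c_in = sum(adj[a][b] for a, b in pairs) / len(pairs) if pairs else 0.0
    return len(internal) / (e_max * len(neighbors)) - (1.0 - c_in)

# Two triangles joined by the bridge edge 2-3; vertex 0 is fully
# internal to its community, vertex 2 also touches the other one.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
comm = [0, 0, 0, 1, 1, 1]
```

Here permanence(0, adj, comm) is 1.0, a maximally embedded vertex, while permanence(2, adj, comm) drops to 2/3 because of the bridge edge.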

111 citations

Proceedings ArticleDOI
06 Aug 2020
TL;DR: The SemEval-2020 Task 9 on Sentiment Analysis of Code-Mixed Tweets (SentiMix 2020) was the first shared task focusing on sentiment analysis of code-mixed tweets.
Abstract: In this paper, we present the results of the SemEval-2020 Task 9 on Sentiment Analysis of Code-Mixed Tweets (SentiMix 2020). We also release and describe our Hinglish (Hindi-English) and Spanglish (Spanish-English) corpora, annotated with word-level language identification and sentence-level sentiment labels. These corpora comprise 20K and 19K examples, respectively. The sentiment labels are Positive, Negative, and Neutral. SentiMix attracted 89 submissions in total: 61 teams participated in the Hinglish contest and 28 submitted systems to the Spanglish competition. The best performance achieved was a 75.0% F1-score for Hinglish and 80.6% F1 for Spanglish. We observe that BERT-like models and ensemble methods are the most common and successful approaches among the participants.

98 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: Probability distributions and linear models for regression and classification are given in this article, along with a discussion of combining models in the context of machine learning.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Proceedings ArticleDOI
22 Jan 2006
TL;DR: Some of the major results in random graphs and some of the more challenging open problems are reviewed, including those related to the WWW.
Abstract: We will review some of the major results in random graphs and some of the more challenging open problems. We will cover algorithmic and structural questions. We will touch on newer models, including those related to the WWW.

7,116 citations

Journal ArticleDOI
TL;DR: The authors found that people are much more likely to believe stories that favor their preferred candidate, especially if they have ideologically segregated social media networks, and that the average American adult saw on the order of one or perhaps several fake news stories in the months around the 2016 U.S. presidential election, with just over half of those who recalled seeing them believing them.
Abstract: Following the 2016 U.S. presidential election, many have expressed concern about the effects of false stories (“fake news”), circulated largely through social media. We discuss the economics of fake news and present new data on its consumption prior to the election. Drawing on web browsing data, archives of fact-checking websites, and results from a new online survey, we find: (i) social media was an important but not dominant source of election news, with 14 percent of Americans calling social media their “most important” source; (ii) of the known false news stories that appeared in the three months before the election, those favoring Trump were shared a total of 30 million times on Facebook, while those favoring Clinton were shared 8 million times; (iii) the average American adult saw on the order of one or perhaps several fake news stories in the months around the election, with just over half of those who recalled seeing them believing them; and (iv) people are much more likely to believe stories that favor their preferred candidate, especially if they have ideologically segregated social media networks.

3,959 citations

Journal ArticleDOI
TL;DR: While the book is a standard fixture in most chemical and physical laboratories, including those in medical centers, it is not as frequently seen in the laboratories of physician's offices (those either in solo or group practice), and I believe that the Handbook can be useful in those laboratories.
Abstract: There is a special reason for reviewing this book at this time: it is the 50th edition of a compendium that is known and used frequently in most chemical and physical laboratories in many parts of the world. Surely, a publication that has been published for 56 years, withstanding the vagaries of science in this century, must have had something to offer. There is another reason: while the book is a standard fixture in most chemical and physical laboratories, including those in medical centers, it is not as frequently seen in the laboratories of physician's offices (those either in solo or group practice). I believe that the Handbook can be useful in those laboratories. One of the reasons, among others, is that the various basic items of information it offers may be helpful in new tests, either physical or chemical, which are continuously being published. The basic information may relate

2,493 citations

Journal ArticleDOI
TL;DR: It is shown that the full set of hydromagnetic equations admit five more integrals, besides the energy integral, if dissipative processes are absent, which made it possible to formulate a variational principle for the force-free magnetic fields.
Abstract: It was shown in a previous paper that

I1 = ∫ A · H dV, (1)

where A represents the magnetic vector potential, is an integral of the hydromagnetic equations. This I1-integral made it possible to formulate a variational principle for the force-free magnetic fields. The integral expresses the fact that motions cannot transform a given field into an entirely arbitrary different field, if the conductivity of the medium is considered infinite. In this paper we shall show that the full set of hydromagnetic equations admits five more integrals, besides the energy integral, if dissipative processes are absent. These integrals, as we shall presently verify, are

I2 = ∫ H · v dV, (2)

1,858 citations