Institution

Mitre Corporation

Company
Bedford, Massachusetts, United States
About: Mitre Corporation is a company based in Bedford, Massachusetts, United States. It is known for research contributions in topics such as air traffic control and the National Airspace System. The organization has 4884 authors who have published 6053 publications receiving 124808 citations. It is also known as Mitre and MITRE.


Papers
Proceedings Article
27 Jul 2011
TL;DR: The construction of a large, multilingual dataset labeled with gender is described and statistical models for determining the gender of uncharacterized Twitter users are investigated, and several different classifier types are explored.
Abstract: Accurate prediction of demographic attributes from social media and other informal online content is valuable for marketing, personalization, and legal investigation. This paper describes the construction of a large, multilingual dataset labeled with gender, and investigates statistical models for determining the gender of uncharacterized Twitter users. We explore several different classifier types on this dataset. We show the degree to which classifier accuracy varies based on tweet volumes as well as when various kinds of profile metadata are included in the models. We also perform a large-scale human assessment using Amazon Mechanical Turk. Our methods significantly out-perform both baseline models and almost all humans on the same task.

632 citations

Journal Article
TL;DR: In this paper, the authors focus on the classification of human, bot, and cyborg accounts on Twitter and conduct a set of large-scale measurements with a collection of over 500,000 accounts.
Abstract: Twitter is a new web application playing dual roles of online social networking and microblogging. Users communicate with each other by publishing text-based posts. The popularity and open structure of Twitter have attracted a large number of automated programs, known as bots, which appear to be a double-edged sword for Twitter. Legitimate bots generate a large volume of benign tweets delivering news and updating feeds, while malicious bots spread spam or malicious content. More interestingly, between human and bot there has emerged the cyborg, which refers to either a bot-assisted human or a human-assisted bot. To help human users identify who they are interacting with, this paper focuses on the classification of human, bot, and cyborg accounts on Twitter. We first conduct a set of large-scale measurements with a collection of over 500,000 accounts. We observe the differences among humans, bots, and cyborgs in terms of tweeting behavior, tweet content, and account properties. Based on the measurement results, we propose a classification system that includes the following four parts: 1) an entropy-based component, 2) a spam detection component, 3) an account properties component, and 4) a decision maker. It uses a combination of features extracted from an unknown user to determine the likelihood of that user being a human, bot, or cyborg. Our experimental evaluation demonstrates the efficacy of the proposed classification system.

600 citations

Journal Article
TL;DR: The approach is distinguished from other work by the simplicity of the model, the precision of the results it produces, and the ease of developing intelligible and reliable proofs even without automated support.
Abstract: A strand is a sequence of events; it represents either an execution by a legitimate party in a security protocol or else a sequence of actions by a penetrator. A strand space is a collection of strands, equipped with a graph structure generated by causal interaction. In this framework, protocol correctness claims may be expressed in terms of the connections between strands of different kinds. Preparing for a first example, the Needham-Schroeder-Lowe protocol, we prove a lemma that gives a bound on the abilities of the penetrator in any protocol. Our analysis of the example gives a detailed view of the conditions under which it achieves authentication and protects the secrecy of the values exchanged. We also use our proof methods to explain why the original Needham-Schroeder protocol fails. Before turning to a second example, we introduce ideals as a method to prove additional bounds on the abilities of the penetrator. We can then prove a number of correctness properties of the Otway-Rees protocol, and we clarify its limitations. We believe that our approach is distinguished from other work by the simplicity of the model, the precision of the results it produces, and the ease of developing intelligible and reliable proofs even without automated support.

574 citations

Journal Article
TL;DR: The first BioCreAtIvE assessment provided state-of-the-art performance results for a basic task (gene name finding and normalization), where the best systems achieved a balanced 80% precision / recall or better, which potentially makes them suitable for real applications in biology.
Abstract: The goal of the first BioCreAtIvE challenge (Critical Assessment of Information Extraction in Biology) was to provide a set of common evaluation tasks to assess the state of the art for text mining applied to biological problems. The results were presented in a workshop held in Granada, Spain March 28–31, 2004. The articles collected in this BMC Bioinformatics supplement entitled "A critical assessment of text mining methods in molecular biology" describe the BioCreAtIvE tasks, systems, results and their independent evaluation. BioCreAtIvE focused on two tasks. The first dealt with extraction of gene or protein names from text, and their mapping into standardized gene identifiers for three model organism databases (fly, mouse, yeast). The second task addressed issues of functional annotation, requiring systems to identify specific text passages that supported Gene Ontology annotations for specific proteins, given full text articles. The first BioCreAtIvE assessment achieved a high level of international participation (27 groups from 10 countries). The assessment provided state-of-the-art performance results for a basic task (gene name finding and normalization), where the best systems achieved a balanced 80% precision / recall or better, which potentially makes them suitable for real applications in biology. The results for the advanced task (functional annotation from free text) were significantly lower, demonstrating the current limitations of text-mining approaches where knowledge extrapolation and interpretation are required. In addition, an important contribution of BioCreAtIvE has been the creation and release of training and test data sets for both tasks. There are 22 articles in this special issue, including six that provide analyses of results or data quality for the data sets, including a novel inter-annotator consistency assessment for the test set used in task 2.

552 citations

Posted Content
TL;DR: A novel architecture, called the TL-embedding network, is proposed, to learn an embedding space with generative and predictable properties, which enables tackling a number of tasks including voxel prediction from 2D images and 3D model retrieval.
Abstract: What is a good vector representation of an object? We believe that it should be generative in 3D, in the sense that it can produce new 3D objects; as well as be predictable from 2D, in the sense that it can be perceived from 2D images. We propose a novel architecture, called the TL-embedding network, to learn an embedding space with these properties. The network consists of two components: (a) an autoencoder that ensures the representation is generative; and (b) a convolutional network that ensures the representation is predictable. This enables tackling a number of tasks including voxel prediction from 2D images and 3D model retrieval. Extensive experimental analysis demonstrates the usefulness and versatility of this embedding.

532 citations


Authors

Showing all 4896 results

Name                      H-index  Papers  Citations
Sushil Jajodia            101      664     35556
Myles R. Allen            82       295     32668
Barbara Liskov            76       204     25026
Alfred D. Steinberg       74       295     20974
Peter T. Cummings         69       521     18942
Vincent H. Crespi         63       287     20347
Michael J. Pazzani        62       183     28036
David Goldhaber-Gordon    58       192     15709
Yeshaiahu Fainman         57       648     14661
Jonathan Anderson         57       195     10349
Limsoon Wong              55       367     13524
Chris Clifton             54       160     11501
Paul Ward                 52       408     12400
Richard M. Fujimoto       52       290     13584
Bhavani Thuraisingham     52       563     10562
Network Information
Related Institutions (5)
IBM
253.9K papers, 7.4M citations

83% related

Hewlett-Packard
59.8K papers, 1.4M citations

83% related

Carnegie Mellon University
104.3K papers, 5.9M citations

83% related

George Mason University
39.9K papers, 1.3M citations

83% related

Georgia Institute of Technology
119K papers, 4.6M citations

82% related

Performance
Metrics
No. of papers from the Institution in previous years
Year  Papers
2023  4
2022  10
2021  95
2020  139
2019  145
2018  132