Author

Ana Lucic

Other affiliations: DePaul University
Bio: Ana Lucic is an academic researcher from the University of Illinois at Urbana–Champaign. The author has contributed to research in topics: Machine learning & Noun. The author has an h-index of 5 and has co-authored 16 publications receiving 56 citations. Previous affiliations of Ana Lucic include DePaul University.

Papers
Journal ArticleDOI
TL;DR: A two-step approach is introduced to automatically extract three facets from direct comparative sentences in full-text articles: two entities (the agent and object) and the way in which the entities are compared (the endpoint). The goal is to accelerate the systematic review process and identify gaps where future research should be focused.

18 citations

Journal ArticleDOI
TL;DR: A new high-level feature based on local syntactic dependencies that an author uses when referring to a named entity (in this case a person’s name) is introduced and a series of experiments reveal how the amount of data in both the training and test sets influences predictive performance.
Abstract: Accurately determining who wrote a manuscript has captivated scholars of literary history for centuries, as the true author can have important ramifications in religion, law, literary studies, philosophy, and education. A wide array of lexical, character, syntactic, semantic, and application-specific features have been proposed to represent a text so that authorship attribution can be established automatically. Although surface-level features have been tested extensively, few studies have systematically explored high-level features, in part due to limitations in the natural language processing techniques required to capture high-level features. However, high-level features, such as sentence structure, are used subconsciously by a writer and thus may be more consistent than surface-level features, such as word choice. In this article, we introduce a new high-level feature based on local syntactic dependencies that an author uses when referring to a named entity (in our case a person’s name). The series of experiments in the contexts of movie reviews reveal how the amount of data in both the training and test sets influences predictive performance. Finally, we measure authorship consistency with respect to this new feature and show how consistency influences predictive performance. These results provide other researchers with a new model for how to evaluate new features and suggest that the local syntactic dependencies warrant further investigation.
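The named-entity dependency feature described above can be sketched in a simplified form. Here the parse is represented as precomputed (token, dependency-label, head-index) triples, as any dependency parser would produce; the window size and representation are illustrative assumptions, not the paper's exact feature definition:

```python
from collections import Counter

def local_dependency_profile(parse, name_indices, window=2):
    """Count dependency labels within a token window of a name mention.

    `parse` is a list of (token, dep_label, head_index) triples from any
    dependency parser; `name_indices` are the token positions where the
    person's name occurs. Returns a Counter usable as a feature vector.
    """
    counts = Counter()
    for i in name_indices:
        lo, hi = max(0, i - window), min(len(parse), i + window + 1)
        for _token, dep, _head in parse[lo:hi]:
            counts[dep] += 1
    return counts

# "Holmes examined the letter" -- Holmes is the nsubj of examined.
parse = [("Holmes", "nsubj", 1), ("examined", "ROOT", 1),
         ("the", "det", 3), ("letter", "dobj", 1)]
print(local_dependency_profile(parse, [0]))
```

Profiles like this, aggregated over all of an author's name mentions, could then be compared across candidate authors.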

9 citations

01 Jan 2020
TL;DR: This essay presents quantitative capture and predictive modeling for one of the largest and longest running mass reading programs of the past two decades: “One Book One Chicago” (OBOC) sponsored by the Chicago Public Library (CPL).
Abstract: This essay presents quantitative capture and predictive modeling for one of the largest and longest running mass reading programs of the past two decades: “One Book One Chicago” (OBOC) sponsored by the Chicago Public Library (CPL). The Reading Chicago Reading project uses data associated with OBOC as a probe into city-scale library usage and, by extension, as a window onto contemporary reading behavior. The first half of the essay explains why CPL’s OBOC program is conducive for modeling purposes, and the second half documents the creation of our models, their underlying data, and the results.

9 citations

Proceedings Article
01 Jan 2016
TL;DR: It is shown empirically that establishing whether the head noun is an amount or measure yields a statistically significant improvement, increasing endpoint precision from 0.42 to 0.56 on longer sentences and from 0.51 to 0.58 on shorter sentences, with corresponding gains in recall.
Abstract: Authors of biomedical articles use comparison sentences to communicate the findings of a study and to compare the results of the current study with earlier studies. The Claim Framework defines a comparison claim as a sentence that includes at least two entities that are being compared, and an endpoint that captures the way in which the entities are compared. Although automated methods have been developed to identify comparison sentences in text, identifying the role that a specific noun plays (i.e., entity or endpoint) is much more difficult. Automated methods have been successful at identifying the second entity, but classification models were unable to clearly differentiate between the first entity and the endpoint. We show empirically that establishing whether the head noun is an amount or measure provides a statistically significant improvement that increases endpoint precision from 0.42 to 0.56 on longer and from 0.51 to 0.58 on shorter sentences, and recall from 0.64 to 0.71 on longer and from 0.69 to 0.74 on shorter sentences. The differences were not statistically significant for the second compared entity.
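A toy version of the head-noun check described above might look like the following; the lexicon and unit-like suffixes are illustrative assumptions, not the paper's actual resources:

```python
# Flag a candidate head noun as an amount/measure term using a small
# lexicon plus a unit-like suffix heuristic (both are illustrative).
MEASURE_NOUNS = {"amount", "level", "rate", "concentration", "dose",
                 "count", "percentage", "ratio", "score", "duration"}
UNIT_SUFFIXES = ("mg", "ml", "kg", "mmol")

def is_amount_or_measure(head_noun: str) -> bool:
    noun = head_noun.lower().strip()
    return noun in MEASURE_NOUNS or noun.endswith(UNIT_SUFFIXES)

print(is_amount_or_measure("concentration"))  # True
print(is_amount_or_measure("patients"))       # False
```

A binary feature of this kind can be appended to the noun's feature vector before classifying it as entity or endpoint.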

6 citations

15 Mar 2015
TL;DR: This paper focuses on the identification of any claim in an article and on the identification of explicit claims, a subtype of the more general claim. It frames the problem as a classification task and employs three different domain-independent feature selection strategies.
Abstract: The idea of automating systematic reviews has been motivated by both advances in technology that have increased the availability of full-text scientific articles and by sociological changes that have increased the adoption of evidence-based medicine. Although much work has focused on automating the information retrieval step of the systematic review process, with a few exceptions the information extraction and analysis steps have been largely overlooked. In particular, there is a lack of systems that automatically identify the results of an empirical study. Our goal in this paper is to fill that gap. More specifically, we focus on the identification of 1) any claim in an article and 2) explicit claims, a subtype of the more general claim. We frame the problem as a classification task and employ three different domain-independent feature selection strategies (χ² statistic, information gain, and mutual information) with two different classifiers [support vector machines (SVM) and naive Bayes (NB)]. With respect to both accuracy and F1, the χ² statistic and information gain consistently outperform mutual information. The SVM and NB classifiers had similar accuracy when predicting any claim, but NB had better F1 performance for explicit claims. Lastly, we explored a semantic model developed for a different dataset. Accuracy was lower for the semantic model, but when used with SVM plus sentence location information, this model actually achieved a higher F1 score for predicting explicit claims than all of the feature selection strategies. When used with NB, the prior model for explicit claims performed better than mutual information, but the F1 score dropped 0.04 to 0.08 compared with models built on training data in the same collection. Further work is needed to understand how features developed for one collection might be used to minimize the amount of training data needed for a new collection.
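The pipeline described in the abstract, feature selection feeding an SVM or naive Bayes classifier, can be sketched with scikit-learn on a toy corpus; the sentences, labels, and parameter choices here are illustrative assumptions, not the paper's data:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data: 1 = sentence states an explicit claim, 0 = not.
sentences = [
    "our results show a significant increase",
    "the results show reduced mortality",
    "these results show improved outcomes",
    "previous work examined related methods",
    "the dataset was collected from hospitals",
    "participants were recruited over two years",
]
labels = [1, 1, 1, 0, 0, 0]

# Chi-squared feature selection feeding a naive Bayes classifier,
# one of the strategy/classifier pairings evaluated in the paper.
clf = make_pipeline(CountVectorizer(), SelectKBest(chi2, k=8), MultinomialNB())
clf.fit(sentences, labels)
print(int(clf.predict(["the results show lower rates"])[0]))  # 1
```

Swapping `MultinomialNB()` for `LinearSVC()` or `chi2` for `mutual_info_classif` reproduces the other configurations the paper compares.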

5 citations


Cited by
Journal ArticleDOI
TL;DR: This article discusses Imagined Communities: Reflections on the Origin and Spread of Nationalism. History of European Ideas, Vol. 21, No. 5, pp. 721-722.

13,842 citations

Journal ArticleDOI
TL;DR: This article reviews Robert J. Sampson's Great American City: Chicago and the Enduring Neighborhood Effect (Chicago: University of Chicago Press, 2012; 552 pp., $27.50 cloth).
Abstract: Sampson, Robert J. 2012. Great American city: Chicago and the enduring neighborhood effect. Chicago: University of Chicago Press. ISBN-13: 9780226734569. pp. 552, $27.50 cloth. Robert J. Sampson’s ...

1,089 citations

Proceedings Article
16 May 2014
TL;DR: The study highlights the need to understand more fully the role of social media in the decision-making process and the role that media outlets play in this process.
Abstract: Keywords: Crisis Informatics; Social Media Collection; Social Media Analysis. (EPFL record EPFL-CONF-203561.)

339 citations

Journal Article
TL;DR: The authors review the book "The Sixth Extinction: An Unnatural History" by Elizabeth Kolbert, recommending it to readers of history regardless of genre.
Abstract: The article reviews the book "The Sixth Extinction: An Unnatural History" by Elizabeth Kolbert.

220 citations

Journal ArticleDOI
TL;DR: Overall, progress on the 20-year transition plan laid out by the US NRC in 2007 has been substantial and government agencies within the United States and internationally are beginning to incorporate the new approach methodologies envisaged in the original TT21C vision into regulatory practice.
Abstract: Advances in the biological sciences have led to an ongoing paradigm shift in toxicity testing based on expanded application of high-throughput in vitro screening and in silico methods to assess potential health risks of environmental agents. This review examines progress on the vision for toxicity testing elaborated by the US National Research Council (NRC) during the decade that has passed since the 2007 NRC report on Toxicity Testing in the 21st Century (TT21C). Concomitant advances in exposure assessment, including computational approaches and high-throughput exposomics, are also documented. A vision for the next generation of risk science, incorporating risk assessment methodologies suitable for the analysis of new toxicological and exposure data, resulting in human exposure guidelines is described. Case study prototypes indicating how these new approaches to toxicity testing, exposure measurement, and risk assessment are beginning to be applied in practice are presented. Overall, progress on the 20-year transition plan laid out by the US NRC in 2007 has been substantial. Importantly, government agencies within the United States and internationally are beginning to incorporate the new approach methodologies envisaged in the original TT21C vision into regulatory practice. Future perspectives on the continued evolution of toxicity testing to strengthen regulatory risk assessment are provided.

177 citations