
Showing papers presented at "Cross-Language Evaluation Forum in 2012"


Proceedings Article
01 Jan 2012
TL;DR: The ninth edition of the ImageCLEF medical image retrieval and classification tasks was organized in 2012, using a larger collection of over 300,000 images than in 2011, with the retrieval tasks mainly adding complexity.
Abstract: The ninth edition of the ImageCLEF medical image retrieval and classification tasks was organized in 2012. A subset of the open access collection of PubMed Central was used as the database in 2012, with a larger number of over 300,000 images than in 2011. As in previous years, there were three subtasks: modality classification, image-based and case-based retrieval. A new hierarchy for article figures was created for the modality classification task. Modality detection could be one of the most important filters to limit the search and focus the result sets. The goals of the image-based and case-based retrieval tasks were similar to 2011, mainly adding complexity. The number of groups submitting runs remained stable at 17, and the number of submitted runs remained roughly the same at 202 (207 in 2011). Of these, 122 were image-based retrieval runs, 37 were case-based runs, and the remaining 43 were modality classification runs. Depending on the exact nature of the task, visual, textual or multimodal approaches performed better.

115 citations


Book ChapterDOI
17 Sep 2012
TL;DR: This text presents reflections and ideas from a concrete project on using cloud-based benchmarking paradigms for medical image analysis and retrieval; two evaluation campaigns are planned for 2013 and 2014 using the proposed technology.
Abstract: Benchmarks have proven to be an important tool to advance science in the fields of information analysis and retrieval. Problems of running benchmarks include obtaining large amounts of data, annotating them and then distributing them to the participants of a benchmark. Distribution of the data to participants is currently mostly done via data download, which can take hours for large data sets and, in countries with slow Internet connections, even days. Shipping physical hard disks has also been used for distributing very large data sets (for example by TRECVid), but this too becomes infeasible once data sets reach sizes of 5-10 TB. With cloud computing it is possible to make very large data sets available in a central place at limited cost. Instead of distributing the data to the participants, the participants can run their algorithms on virtual machines of the cloud providers. This text presents reflections and ideas from a concrete project on using cloud-based benchmarking paradigms for medical image analysis and retrieval. It is planned to run two evaluation campaigns in 2013 and 2014 using the proposed technology.

45 citations


Proceedings Article
17 Sep 2012
TL;DR: This overview presents the resources and assessment of the task in more detail, summarizes the retrieval approaches employed by the participating groups, and provides an analysis of the main evaluation results.
Abstract: The ImageCLEF plant identification task provides a testbed for the system-oriented evaluation of plant identification, more precisely the identification of 126 tree species based on leaf images. Three types of image content are considered: Scan, Scan-like (leaf photographs with a white uniform background), and Photograph (unconstrained leaf with natural background). The main originality of this data is that it was built through a citizen science initiative conducted by Tela Botanica, a French social network of amateur and expert botanists. This makes the task closer to the conditions of a real-world application. This overview presents the resources and assessment of the task in more detail, summarizes the retrieval approaches employed by the participating groups, and provides an analysis of the main evaluation results. With a total of eleven groups from eight countries and 30 submitted runs, involving distinct and original methods, this second-year pilot task confirms the image retrieval community's interest in biodiversity and botany, and highlights further challenging studies in plant identification.

36 citations


Book ChapterDOI
17 Sep 2012
TL;DR: This work proposes a concept-based similarity model for cross-language high-similarity and near-duplicate search and finds that, although the proposed model is very generic, it produces competitive results and is notably stable and consistent across the corpora.
Abstract: This work addresses the problem of cross-language high-similarity and near-duplicate search, where, for a given document, a highly similar one is to be identified in a large cross-language collection of documents. We propose a concept-based similarity model for the problem which is very light in computation and memory. We evaluate the model on three corpora of different nature and two language pairs, English-German and English-Spanish, using the Eurovoc conceptual thesaurus. Our model is compared with two state-of-the-art models, and we find that, although the proposed model is very generic, it produces competitive results and is notably stable and consistent across the corpora.
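The abstract does not detail the model itself; as a hedged illustration of the general idea, the sketch below maps documents in two languages onto vectors over a shared concept vocabulary (standing in for Eurovoc) and compares them with cosine similarity. The term-to-concept lexicons and the documents are invented placeholders, not the authors' resources.

from collections import Counter
from math import sqrt

def concept_vector(tokens, lexicon):
    """Map tokens of one language onto shared concept IDs via a lexicon."""
    return Counter(lexicon[t] for t in tokens if t in lexicon)

def cosine(u, v):
    dot = sum(u[c] * v[c] for c in u if c in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

# Hypothetical term -> concept mappings for two languages (illustrative only).
en_lex = {"parliament": "c100", "fishing": "c200", "quota": "c300"}
de_lex = {"parlament": "c100", "fischerei": "c200", "quote": "c300"}

en_doc = "the parliament adopted a fishing quota".split()
de_doc = "das parlament beschloss eine quote für die fischerei".split()

print(cosine(concept_vector(en_doc, en_lex), concept_vector(de_doc, de_lex)))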

32 citations


Book ChapterDOI
17 Sep 2012
TL;DR: This paper describes the specification for an Information Retrieval (IR) evaluation infrastructure by conceptually modeling the entities involved in information retrieval experimental evaluation and their relationships and by defining the architecture of the proposed evaluation infrastructure and the APIs for accessing it.
Abstract: Information Retrieval (IR) experimental evaluation is an essential part of the research on and development of information access methods and tools. Shared data sets and evaluation scenarios allow for comparing methods and systems, understanding their behaviour, and tracking performance and progress over time. On the other hand, experimental evaluation is an expensive activity in terms of the human effort, time, and costs required to carry it out. Software and hardware infrastructures that support experimental evaluation, as well as the management, enrichment, and exploitation of the produced scientific data, provide a key contribution to reducing such effort and costs and to carrying out systematic and thorough analysis and comparison of systems and methods, overall acting as enablers of scientific and technical advancement in the field. This paper describes the specification for an IR evaluation infrastructure by conceptually modeling the entities involved in IR experimental evaluation and their relationships, and by defining the architecture of the proposed evaluation infrastructure and the APIs for accessing it.

30 citations


Proceedings Article
01 Jan 2012
TL;DR: A dictionary of words and expressions relating to predators' grooming stages is described, which was used to identify which posts in the predators' conversations were most distinctive for their grooming behavior.
Abstract: In this paper we present a new approach for detecting online pedophiles in chat rooms that combines the results of predictions on the level of the individual post, the level of the user and the level of the entire conversation, and describe the results of this three-stage system in the PAN 2012 competition. Also, we describe a resampling and a filtering strategy to circumvent issues regarding the unbalanced dataset. Finally, we describe the creation of a dictionary of words and expressions relating to predators' grooming stages, which we used to identify which posts in the predators' conversations were most distinctive for their grooming behavior.

27 citations


Book ChapterDOI
17 Sep 2012
TL;DR: This study focuses not on the retrieval performance of the system but on the relevance assessments and the inter-assessor reliability, applying Fleiss' Kappa and Krippendorff's Alpha to quantify the agreement.
Abstract: During the last three years we conducted several information retrieval evaluation series with more than 180 LIS students who made relevance assessments on the outcomes of three specific retrieval services. In this study we do not focus on the retrieval performance of our system but on the relevance assessments and the inter-assessor reliability. To quantify the agreement we apply Fleiss' Kappa and Krippendorff's Alpha. Comparing these two statistical measures, Kappa values were on average 0.37 and Alpha values 0.15. We use the two agreement measures to drop overly unreliable assessments from our data set. When computing the differences between the unfiltered and the filtered data set we see a root mean square error between 0.02 and 0.12. We see this as a clear indicator that disagreement affects the reliability of retrieval evaluations. We suggest either not working with unfiltered results or clearly documenting the disagreement rates.
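Since the study reports agreement in terms of Fleiss' Kappa (average 0.37), a compact from-scratch computation of the statistic on a toy assessment matrix may help make the measure concrete; the counts below are invented, not the study's data.

def fleiss_kappa(ratings):
    """ratings[i][j] = number of assessors assigning category j to item i."""
    n_items = len(ratings)
    n_raters = sum(ratings[0])   # assessors per item (assumed constant here)
    n_cats = len(ratings[0])

    # Proportion of all assignments that went to each category.
    p_j = [sum(row[j] for row in ratings) / (n_items * n_raters) for j in range(n_cats)]

    # Per-item agreement: fraction of agreeing assessor pairs.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1)) for row in ratings]

    p_bar = sum(p_i) / n_items        # observed agreement
    p_e = sum(p * p for p in p_j)     # chance agreement
    return (p_bar - p_e) / (1 - p_e)

# Toy example: 4 documents, 3 assessors, categories (not relevant, relevant).
counts = [[3, 0], [2, 1], [1, 2], [0, 3]]
print(round(fleiss_kappa(counts), 3))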

22 citations


Book ChapterDOI
17 Sep 2012
TL;DR: The results demonstrate that it is possible to train a learning to rank algorithm on generated pseudo judgments, and show how the pseudo test collection ranks systems compared to editorial topics with editorial judgements.
Abstract: Pseudo test collections are automatically generated to provide training material for learning to rank methods. We propose a method for generating pseudo test collections in the domain of digital libraries, where data is relatively sparse, but comes with rich annotations. Our intuition is that documents are annotated to make them better findable for certain information needs. We use these annotations and the associated documents as a source for pairs of queries and relevant documents. We investigate how learning to rank performance varies when we use different methods for sampling annotations, and show how our pseudo test collection ranks systems compared to editorial topics with editorial judgements. Our results demonstrate that it is possible to train a learning to rank algorithm on generated pseudo judgments. In some cases, performance is on par with learning on manually obtained ground truth.
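A minimal sketch of the intuition described above: treat each annotation as a pseudo query and the documents carrying it as its pseudo-relevant set, then sample some of these pairs as training topics. The record structure, thresholds and sampling below are illustrative assumptions, not the authors' actual procedure.

import random
from collections import defaultdict

# Hypothetical annotated records from a digital library.
records = [
    {"id": "d1", "annotations": ["world war i", "propaganda"]},
    {"id": "d2", "annotations": ["propaganda", "posters"]},
    {"id": "d3", "annotations": ["world war i", "trench warfare"]},
    {"id": "d4", "annotations": ["posters"]},
]

def build_pseudo_collection(records, min_docs=2, sample_size=2, seed=0):
    """Turn annotations into (pseudo query, pseudo-relevant doc ids) pairs."""
    by_annotation = defaultdict(set)
    for rec in records:
        for ann in rec["annotations"]:
            by_annotation[ann].add(rec["id"])
    # Keep annotations with enough documents and sample a subset as "topics".
    usable = [(q, sorted(docs)) for q, docs in by_annotation.items() if len(docs) >= min_docs]
    return random.Random(seed).sample(usable, min(sample_size, len(usable)))

for query, relevant in build_pseudo_collection(records):
    print(query, "->", relevant)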

22 citations


Proceedings Article
01 Jan 2012
TL;DR: The pilot task Processing modality and negation, organized in the framework of the Question Answering for Machine Reading Evaluation Lab at CLEF 2012, was defined as an annotation exercise consisting in determining whether an event mentioned in a text is presented as negated, modalised (i.e. affected by an expression of modality), or both.
Abstract: This paper describes the pilot task Processing modality and negation, which was organized in the framework of the Question Answering for Machine Reading Evaluation Lab at CLEF 2012. The task was defined as an annotation exercise consisting in determining whether an event mentioned in a text is presented as negated, modalised (i.e. affected by an expression of modality), or both. Three teams participated in the task, submitting a total of 6 runs. The highest score obtained by a system was a macro-averaged F1 of 0.6368.

21 citations


Proceedings Article
17 Sep 2012
TL;DR: In this paper, a two-step model-driven segmentation and the evaluation of high-level characteristics that make a semantic interpretation possible, as well as more generic shape features, are combined in a random forest classification algorithm, and their significance is evaluated.
Abstract: This paper summarizes the participation of the ReVeS project in the ImageCLEF 2012 Plant Identification task. Aiming to develop a system for tree leaf identification on mobile devices, our method is designed to cope with the challenges of complex natural images and to enable a didactic interaction with the user. The approach relies on a two-step model-driven segmentation and on the evaluation of high-level characteristics that make a semantic interpretation possible, as well as more generic shape features. All these descriptors are combined in a random forest classification algorithm, and their significance is evaluated. Our team ranks 4th overall and 3rd on natural images, which constitutes a very satisfying performance with respect to the project's objectives.

17 citations


Book ChapterDOI
17 Sep 2012
TL;DR: This paper summarizes a major effort in interactive search investigation, the INEX i-track, a collective effort run over a seven-year period, and presents the experimental conditions, findings of the participating groups, and examines the challenges posed by this kind of collective experimental effort.
Abstract: This paper summarizes a major effort in interactive search investigation, the INEX i-track, a collective effort run over a seven-year period. We present the experimental conditions, report some of the findings of the participating groups, and examine the challenges posed by this kind of collective experimental effort.

Book ChapterDOI
17 Sep 2012
TL;DR: In this article, the authors propose a new metric for ranking evaluation, the CRP, which is based on the observation that a document of a given degree of relevance may be ranked too early or too late regarding the ideal ranking of documents for a query.
Abstract: The development of multilingual and multimedia information access systems calls for proper evaluation methodologies to ensure that they meet the expected user requirements and provide the desired effectiveness. IR research offers a strong evaluation methodology and a range of evaluation metrics, such as MAP and (n)DCG. In this paper, we propose a new metric for ranking evaluation, the Cumulated Relative Position (CRP). We start from the observation that a document of a given degree of relevance may be ranked too early or too late with respect to the ideal ranking of documents for a query. Its relative position may be negative, indicating too early a ranking; zero, indicating correct ranking; or positive, indicating too late a ranking. By cumulating these relative positions we indicate, at each ranked position, the net effect of document displacements: the CRP. We first define the metric formally and then discuss its properties, its relationship to prior metrics, and its visualization. Finally, we propose different visualizations of CRP, exploiting a test collection to demonstrate its behavior.
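As a hedged illustration only (the paper's formal definition may differ in detail), the sketch below follows the description above: each retrieved document's relative position is its signed distance to the interval of positions its relevance grade occupies in the ideal ranking, and these displacements are cumulated along the ranked list. The grades and the toy run are invented.

def cumulated_relative_position(run_grades, all_grades):
    """
    run_grades: relevance grade of the document at each rank of a run.
    all_grades: grades of every assessed document for the topic (defines the ideal ranking).
    Returns the list of cumulated relative positions, one value per rank.
    """
    ideal = sorted(all_grades, reverse=True)
    # Ideal position interval (first, last rank) for each relevance grade.
    bands = {}
    for pos, g in enumerate(ideal, start=1):
        first, last = bands.get(g, (pos, pos))
        bands[g] = (min(first, pos), max(last, pos))

    crp, total = [], 0
    for rank, g in enumerate(run_grades, start=1):
        first, last = bands[g]
        if rank < first:        # ranked too early -> negative displacement
            total += rank - first
        elif rank > last:       # ranked too late -> positive displacement
            total += rank - last
        crp.append(total)
    return crp

# Toy example with grades 0-2: the run places a non-relevant document first.
print(cumulated_relative_position([0, 2, 1, 1, 0], [2, 1, 1, 0, 0]))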

Book ChapterDOI
17 Sep 2012
TL;DR: The present paper describes the development of a language-independent, query-focused snippet generation module that takes the query and the content of each retrieved document and generates a query-dependent snippet for each retrieved document.
Abstract: The present paper describes the development of a language-independent, query-focused snippet generation module. This module takes the query and the content of each retrieved document and generates a query-dependent snippet for that document. The algorithm of this module is based on sentence extraction, sentence scoring and sentence ranking. A subjective evaluation has been carried out. The English snippets obtained the best evaluation score, i.e. 1, and an overall average evaluation score of 0.83 was achieved on a scale of 0 to 1.
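A minimal, language-independent sketch of the extract-score-rank pipeline outlined above: split the document into sentences, score each by overlap with the query terms, and return the top-ranked sentences in document order. The scoring function is an illustrative assumption, not the authors' exact formula.

import re

def generate_snippet(query, document, max_sentences=2):
    """Pick the sentences that best cover the query terms."""
    query_terms = set(re.findall(r"\w+", query.lower()))
    # Naive sentence splitting on ., ! and ? so no language-specific tools are needed.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]

    def score(sentence):
        terms = set(re.findall(r"\w+", sentence.lower()))
        return len(query_terms & terms) / (len(terms) ** 0.5 or 1.0)

    chosen = set(sorted(sentences, key=score, reverse=True)[:max_sentences])
    # Keep the original document order for readability.
    return " ".join(s for s in sentences if s in chosen)

doc = ("The forum was held in Rome. Plant identification was a pilot task. "
       "Leaf images were collected by volunteers.")
print(generate_snippet("plant identification task", doc))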

Proceedings Article
01 Jan 2012
TL;DR: The paper introduces the technique of index expansion, which relies on building a search index enriched with information gathered from a linguistic analysis of texts, and reports experiments in tackling the CLEF 2012 Pilot Task on Machine Reading for Question Answering.
Abstract: The paper reports our experiments in tackling the CLEF 2012 Pilot Task on Machine Reading for Question Answering. We introduce the technique of index expansion, which relies on building a search index enriched with information gathered from a linguistic analysis of texts. The index provides a highly tangled representation of the sentences where each word is directly connected to others representing both meaning and relations. Instead of keeping the knowledge base separate, the relevant knowledge gets embedded within the text. We can hence use efficient indexing techniques to represent such knowledge and query it very effectively. We explain how index expansion was used in the task and describe the experiments that we performed. The results achieved are quite positive and a final error analysis shows how the technique can be further improved.

Book ChapterDOI
17 Sep 2012
TL;DR: A strong baseline system derived from a combination of state-of-the-art machine translation and monolingual entity linking is enhanced to achieve 11.2% improvement in B-Cubed+ F-measure.
Abstract: In this paper we propose two novel approaches to enhance cross-lingual entity linking (CLEL). One is based on cross-lingual information networks, aligned based on monolingual information extraction, and the other uses topic modeling to ensure global consistency. We enhance a strong baseline system derived from a combination of state-of-the-art machine translation and monolingual entity linking to achieve 11.2% improvement in B-Cubed+ F-measure. Our system achieved highly competitive results in the NIST Text Analysis Conference (TAC) Knowledge Base Population (KBP2011) evaluation. We also provide detailed qualitative and quantitative analysis on the contributions of each approach and the remaining challenges.

Book ChapterDOI
17 Sep 2012
TL;DR: This paper gives an overview of several different approaches that have been applied by participants in the CLEF-IP evaluation initiative and suggests that other techniques and experimental paradigms could be helpful in further improving the results and making the experiments more realistic.
Abstract: This paper gives an overview of several different approaches that have been applied by participants in the CLEF-IP evaluation initiative. On this basis, it is suggested that other techniques and experimental paradigms could be helpful in further improving the results and making the experiments more realistic. The field of information seeking is therefore incorporated and its potential gain for patent retrieval explained. Furthermore, the different search tasks undertaken by patent searchers are introduced as possible use cases. They can serve as a basis for development in patent retrieval research in that they present the diverse scenarios with their special characteristics and therefore give the research community a realistic picture of the patent user's work.

Book ChapterDOI
17 Sep 2012
TL;DR: A lab test is performed to study the satisfaction of users of a speech retrieval system and to empirically estimate the optimal shape of the Penalty Function used in Mean Generalized Average Precision, the evaluation measure widely used for unsegmented speech retrieval.
Abstract: This paper deals with the evaluation of information retrieval from unsegmented speech. We focus on Mean Generalized Average Precision, the evaluation measure widely used for unsegmented speech retrieval. This measure is designed to allow a certain tolerance in matching retrieval results (starting points of relevant segments) against a gold-standard relevance assessment. It employs a Penalty Function which evaluates non-exact matches in the retrieval results based on their distance from the beginnings of their nearest true relevant segments. However, the choice of the Penalty Function is usually ad hoc and does not necessarily reflect users' perception of speech retrieval quality. We perform a lab test studying the satisfaction of users of a speech retrieval system in order to empirically estimate the optimal shape of the Penalty Function.
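The abstract does not give the Penalty Function's actual form, so the following is only a generic illustration of the kind of function involved: a retrieved starting point earns full credit on an exact match and linearly less as its distance from the true segment start grows, down to zero beyond a tolerance window. The linear shape and the 30-second window are invented parameters, not findings of the paper.

def penalty_weight(distance_seconds, tolerance=30.0):
    """Credit given to a retrieved starting point as a function of its
    distance from the nearest true relevant segment start (illustrative)."""
    return max(0.0, 1.0 - abs(distance_seconds) / tolerance)

for d in (0, 10, 25, 45):
    print(d, "->", penalty_weight(d))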

Book ChapterDOI
17 Sep 2012
TL;DR: The effects that various characteristics of the topic documents have on the effectiveness of systems for the task of finding prior art in the patent domain are revisited, and the reader interested in approaching the domain is given a guide to the issues that need to be addressed.
Abstract: We revisit the effects that various characteristics of the topic documents have on the effectiveness of systems for the task of finding prior art in the patent domain. In doing so, we provide the reader interested in approaching the domain with a guide to the issues that need to be addressed in this context. For the current study, we select two patent-based test collections with a common document representation schema and look at topic characteristics specific to the objectives of the collections. We look at the effect of languages on retrieval and at the length of the topic documents. We present the correlations between these topic facets and their retrieval results, as well as their relevant documents.

Book ChapterDOI
17 Sep 2012
TL;DR: The setup of MusiClef is described, showing how it complements existing benchmarking initiatives and fosters less explored methodological directions in Music Information Retrieval.
Abstract: MusiClef is a multimodal music benchmarking initiative that will be running a MediaEval 2012 Brave New Task on Multimodal Music Tagging. This paper describes the setup of this task, showing how it complements existing benchmarking initiatives and fosters less explored methodological directions in Music Information Retrieval. MusiClef deals with a concrete use case, encourages multimodal approaches based on these, and strives for transparency of results as much as possible. Transparency is encouraged at several levels and stages, from the feature extraction procedure up to the evaluation phase, in which a dedicated categorization of ground truth tags will be used to deepen the understanding of the relation between the proposed approaches and experimental results.

Proceedings Article
01 Jan 2012
TL;DR: The strong results confirm the suitability of linguistic heuristics for low-level semantic features and showcase their robustness across the different subgenres of the QA4MRE corpora.
Abstract: For the QA4MRE 2012 Pilot Task on Negation and Modality, CLaC Labs implemented a general, lightweight negation and modality module based on linguistic rules. The strong results confirm the suitability of linguistic heuristics for low-level semantic features and showcase their robustness across the different subgenres of the QA4MRE corpora.
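A toy illustration of the kind of lightweight rule such a module might contain: an event word is flagged as negated or modalised if a cue appears in a small window to its left. The cue lists and window size are invented for illustration and are not CLaC's actual rules.

NEGATION_CUES = {"not", "no", "never", "without", "n't"}
MODALITY_CUES = {"may", "might", "could", "should", "possibly", "probably"}

def annotate_event(tokens, event_index, window=4):
    """Return (negated, modalised) flags for the event word at event_index."""
    left = [t.lower() for t in tokens[max(0, event_index - window):event_index]]
    return (any(t in NEGATION_CUES for t in left),
            any(t in MODALITY_CUES for t in left))

sentence = "The drug may not reduce the symptoms".split()
print(annotate_event(sentence, sentence.index("reduce")))  # -> (True, True)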

Proceedings Article
01 Jan 2012
TL;DR: A Wikipedia-based two-pass algorithm is proposed to address the issue of disambiguating tweets with respect to company names, and the effectiveness of the proposed approach is demonstrated.
Abstract: Using Twitter as an effective marketing tool has become a gold mine for companies interested in their online reputation. A quite significant research challenge related to the above issue is to disambiguate tweets with respect to company names. In fact, finding whether a particular tweet is relevant or irrelevant to a company is an important task not yet satisfactorily solved; to address this issue, in this paper we propose a Wikipedia-based two-pass algorithm. The experimental evaluations demonstrate the effectiveness of the proposed approach.

Proceedings Article
01 Jan 2012
TL;DR: For the CLEF-IP 2012 Claims to Passage task, where the goal was to return relevant passages according to sets of claims for patentability or novelty search purposes, a two-step retrieval system was designed to cope with the problems induced by the large dataset.
Abstract: In CLEF-IP 2012, we participated in the Claims to Passage task, where the goal was to return relevant passages according to sets of claims, for patentability or novelty search purposes. The collection contained 2.3M documents, corresponding to an estimated volume of 250M passages. To cope with the problems induced by this large dataset, we designed a two-step retrieval system. In the first step, the 2.3M patent application documents were indexed; for each topic, we then retrieved the k most similar documents with a classical prior art search. Document representations and the tuning of the IR engine were set relying on training data and on the expertise we acquired in past similar tasks. In particular, we used not only the claims for topics, but also the full description of the application document and the applicants/inventors details; moreover, we discarded retrieved documents that did not share at least one IPC code with the topic. The k parameter ranged from 5 to 1000 according to the computed run. In the second step, for each topic (i.e. "on the fly"), we indexed the passages contained in these k most similar documents and queried them with the topic claims in order to obtain the final runs. Thus, we dealt with approximately 11M passages instead of 250M. The best k parameter on the training data was 10. Hence, we decided to submit four runs with k set to 10, 20, 50, and 100. Finally, we analyzed the training data and observed that the position of a passage in the document played a role, as passages at the end of the description were more likely to be relevant. Thus, we re-ranked each run according to passages' positions in the document in order to submit four supplementary runs.
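A schematic sketch of the two-step strategy described above: retrieve the k most similar application documents with a document-level search, keep only those sharing an IPC code with the topic, then index the passages of the survivors "on the fly" and query them with the claims alone. The toy retrieval functions below rank by simple term overlap and merely stand in for the full IR engine; the data is invented.

def _overlap(query, text):
    return len(set(query.lower().split()) & set(text.lower().split()))

def doc_search(query_text, collection, k):
    """Placeholder document retrieval: rank whole documents by term overlap."""
    return sorted(collection, key=lambda d: _overlap(query_text, d["text"]), reverse=True)[:k]

def passage_search(query_text, passages):
    """Placeholder passage retrieval: rank (doc_id, index, passage) triples."""
    return sorted(passages, key=lambda p: _overlap(query_text, p[2]), reverse=True)

def shares_ipc_code(topic, doc):
    return bool(set(topic["ipc_codes"]) & set(doc["ipc_codes"]))

def two_step_passage_search(topic, collection, k=10):
    # Step 1: prior-art style search over whole documents, using claims + description.
    query_text = topic["claims"] + " " + topic["description"]
    candidates = [d for d in doc_search(query_text, collection, k) if shares_ipc_code(topic, d)]
    # Step 2: index only the passages of those candidates and query with the claims alone.
    passages = [(d["id"], i, p) for d in candidates for i, p in enumerate(d["passages"])]
    return passage_search(topic["claims"], passages)

collection = [
    {"id": "EP1", "ipc_codes": ["H04L"], "text": "method for encrypting network packets",
     "passages": ["packets are encrypted with a shared key", "the key is rotated hourly"]},
    {"id": "EP2", "ipc_codes": ["A61K"], "text": "pharmaceutical composition of a tablet",
     "passages": ["the tablet contains an active agent"]},
]
topic = {"claims": "encrypting packets with a shared key",
         "description": "a method for encrypting network packets using keys",
         "ipc_codes": ["H04L"]}
print(two_step_passage_search(topic, collection, k=2))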

Proceedings Article
01 Jan 2012
TL;DR: An application of text mining techniques, using machine learning over a range of features, to automatically detect cases of patients with IFD from the text of the reports of CT scans performed on them is described.
Abstract: Invasive fungal diseases (IFDs) cause more than 1,000 deaths in hospitals and cost the health system more than AUD 100m in Australia each year. The most common life-threatening IFD is aspergillosis, and a patient with this IFD typically has 12 days of prolonged in-patient time in hospital and an 8% mortality rate. Surveillance and detection of IFDs irrespective of the stage of diagnosis (i.e., early or late in disease) is important. We describe an application of text mining techniques, using machine learning over a range of features, to automatically detect cases of patients with IFD from the text in the reports of CT scans performed on them. We focus on detecting the presence of aspergillosis; however, we anticipate the approach to be transferable to other diseases or conditions by training the text mining component over appropriate reports. Previous systems based on language technology have been deployed for processing radiology reports and for detecting hospital-acquired infection, with significant success. Our approach differs by using a purely statistical/machine-learning approach to the language technology, and by being trained and tested on data collected from a number of hospitals. We collected reports for 288 IFD and 291 control patients from three different hospitals in Melbourne, Australia: Alfred Health, Melbourne Health, and Peter MacCallum Cancer Centre. We extracted a sample of 69 IFD and 49 control patients to perform detailed analysis of the text with regard to IFD; each patient had possibly multiple scans (and associated reports), resulting in a total of 398 scan reports from IFD-positive patients and 83 scan reports from control patients. We had medical experts annotate the patient-level classification on all scan reports at both sentence and report level: the annotators had to decide, for each sentence and report, whether it was positive, neutral, or negative with regard to IFD. We classify reports and patients as IFD-positive if they contain at least one positive sentence, and as negative otherwise. We used the Weka SVM implementation and employed a variety of text- and concept-based features, including bag-of-words, punctuation, UMLS concepts and negated contexts extracted using MetaMap. We also automatically extract-
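As a hedged stand-in for the Weka SVM setup described above, the sketch below builds an analogous bag-of-words linear SVM with scikit-learn (a swapped-in library, not the authors' toolkit), classifies sentences, and then applies the report-level rule from the abstract: a report is positive if it contains at least one positive sentence. The toy sentences are invented, not real radiology reports.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented toy sentences; the real experiments used annotated CT-scan reports.
sentences = [
    "nodular opacity with halo sign suspicious for aspergillosis",
    "cavitating lesion in the upper lobe consistent with fungal infection",
    "lungs are clear with no focal consolidation",
    "no evidence of pulmonary abnormality",
]
labels = ["positive", "positive", "negative", "negative"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(sentences, labels)

report = ["new cavitating fungal lesion seen in the left upper lobe",
          "clear lungs, no abnormality"]
sentence_preds = model.predict(report)
# Report-level rule: positive if at least one sentence is classified positive.
print("IFD-positive report" if any(p == "positive" for p in sentence_preds)
      else "IFD-negative report")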

Proceedings Article
01 Jan 2012
TL;DR: The Cultural Heritage in CLEF 2012 (CHiC) pilot evaluation included three tasks: ad-hoc retrieval, semantic enrichment and variability; the University of Sheffield and the University of the Basque Country submitted a joint entry attempting the three English monolingual tasks.
Abstract: The Cultural Heritage in CLEF 2012 (CHiC) pilot evaluation included these tasks: ad-hoc retrieval, semantic enrichment and variability. At CHiC 2012, the University of Sheffield and the University of the Basque Country submitted a joint entry, attempting the three English monolingual tasks. For the ad-hoc task, the baseline approach used the Indri search engine. Query expansion approaches used random walks with Personalised PageRank over graphs constructed from Wikipedia and WordNet, and also exploited similar articles found within Wikipedia. For the semantic enrichment task, random walks using Personalised PageRank were again used. Additionally, links to Wikipedia were added and further approaches used this information to find enrichment terms. Finally, for the variability task, TF-IDF scores were calculated from text and metadata fields. The final results were selected using MMR (Maximal Marginal Relevance) and cosine similarity.
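The final selection step mentions MMR with cosine similarity; below is a generic Maximal Marginal Relevance re-ranking sketch in its standard formulation (not necessarily the exact parameterisation used by the authors), with invented similarity values.

def mmr(query_sim, pairwise_sim, k, lam=0.7):
    """
    Standard Maximal Marginal Relevance selection.
    query_sim[i]       : cosine similarity of candidate i to the query.
    pairwise_sim[i][j] : cosine similarity between candidates i and j.
    Returns the indices of the k selected, diversified candidates.
    """
    candidates = list(range(len(query_sim)))
    selected = []
    while candidates and len(selected) < k:
        def mmr_score(i):
            redundancy = max((pairwise_sim[i][j] for j in selected), default=0.0)
            return lam * query_sim[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy example: items 0 and 1 are near-duplicates, so MMR selects only one of them.
query_sim = [0.9, 0.88, 0.6]
pairwise_sim = [[1.0, 0.95, 0.1], [0.95, 1.0, 0.1], [0.1, 0.1, 1.0]]
print(mmr(query_sim, pairwise_sim, k=2))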

Book ChapterDOI
17 Sep 2012
TL;DR: Observing that sentence-like queries encode information about term importance in their grammatical structure, a Hidden Markov Model (HMM) based method is proposed to extract such information for term weighting.
Abstract: It has been observed that short queries generally perform better than their corresponding long versions when retrieved by the same IR model. This is mainly because most current models do not distinguish the importance of different terms in the query. Observing that sentence-like queries encode information related to term importance in their grammatical structure, we propose a Hidden Markov Model (HMM) based method to extract such information for term weighting. The basic idea of choosing an HMM is motivated by its successful application in capturing the relationship between adjacent terms in the NLP field. Since we are dealing with queries in natural language form, we think that an HMM can also be used to capture the dependence between the weights and the grammatical structures. Our experiments show that this assumption is quite reasonable and that such information, when utilized properly, can greatly improve retrieval performance.
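The abstract does not spell out the model's structure, so the following is only a hedged illustration of the general idea: a tiny two-state HMM (important vs. background term) emitting coarse part-of-speech tags, decoded with Viterbi to decide which query terms to up-weight. The states, tags and probabilities are invented for illustration and are not the authors' model.

from math import log

STATES = ("IMPORTANT", "BACKGROUND")
START = {"IMPORTANT": 0.5, "BACKGROUND": 0.5}
TRANS = {"IMPORTANT": {"IMPORTANT": 0.6, "BACKGROUND": 0.4},
         "BACKGROUND": {"IMPORTANT": 0.4, "BACKGROUND": 0.6}}
# Emission probabilities over coarse POS tags (invented numbers).
EMIT = {"IMPORTANT": {"NOUN": 0.6, "VERB": 0.25, "ADJ": 0.1, "OTHER": 0.05},
        "BACKGROUND": {"NOUN": 0.15, "VERB": 0.2, "ADJ": 0.15, "OTHER": 0.5}}

def viterbi(tags):
    """Most likely importance-state sequence for a POS-tagged query."""
    v = [{s: log(START[s]) + log(EMIT[s][tags[0]]) for s in STATES}]
    back = []
    for tag in tags[1:]:
        scores, ptrs = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: v[-1][p] + log(TRANS[p][s]))
            scores[s] = v[-1][prev] + log(TRANS[prev][s]) + log(EMIT[s][tag])
            ptrs[s] = prev
        v.append(scores)
        back.append(ptrs)
    state = max(STATES, key=lambda s: v[-1][s])
    path = [state]
    for ptrs in reversed(back):
        state = ptrs[state]
        path.append(state)
    return list(reversed(path))

query = [("find", "VERB"), ("documents", "NOUN"), ("about", "OTHER"), ("whaling", "NOUN")]
states = viterbi([tag for _, tag in query])
weights = {term: (2.0 if s == "IMPORTANT" else 1.0) for (term, _), s in zip(query, states)}
print(weights)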

Book ChapterDOI
17 Sep 2012
TL;DR: An approach to gain a better understanding of the interactions between search tasks, test collections and components and configurations of retrieval systems by testing a large set of experiment configurations against standard ad-hoc test collections is demonstrated.
Abstract: In this poster we demonstrate an approach to gain a better understanding of the interactions between search tasks, test collections and components and configurations of retrieval systems by testing a large set of experiment configurations against standard ad-hoc test collections.

Book ChapterDOI
17 Sep 2012
TL;DR: Experimental results prove the existence of the giant component in such graphs; based on the evaluation of this behaviour, the graphs and the corresponding descriptors are compared and validated in proof-of-concept retrieval tests.
Abstract: The paper presents a random-graph-based analysis approach for evaluating descriptors based on pairwise distance distributions on real data. Starting from the Erdős-Rényi model, the paper presents results of investigating random geometric graph behaviour in relation to the appearance of the giant component, as a basis for choosing descriptors based on their clustering properties. Experimental results prove the existence of the giant component in such graphs; based on the evaluation of this behaviour, the graphs and the corresponding descriptors are compared and validated in proof-of-concept retrieval tests.
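A small sketch of the kind of analysis described above: treat descriptor vectors as vertices, connect pairs closer than a threshold radius, and watch the relative size of the largest connected component grow with the radius (the emergence of the giant component). Random 2-D points and networkx stand in for real descriptors and the authors' tooling.

import random
import networkx as nx

def giant_component_fraction(points, radius):
    """Fraction of vertices in the largest connected component of the
    geometric graph linking points that are closer than `radius`."""
    g = nx.Graph()
    g.add_nodes_from(range(len(points)))
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5
            if dist < radius:
                g.add_edge(i, j)
    largest = max(nx.connected_components(g), key=len)
    return len(largest) / len(points)

# Stand-in for descriptor vectors: 200 random points in the unit square.
random.seed(1)
points = [(random.random(), random.random()) for _ in range(200)]
for r in (0.02, 0.05, 0.08, 0.12):
    print(r, round(giant_component_fraction(points, r), 2))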

Book ChapterDOI
17 Sep 2012
TL;DR: This paper presents a test collection with cases of plagiarism by missing and incorrect references, which contains automatically generated academic papers in which passages from other documents have been inserted.
Abstract: In recent years, several methods and tools have been developed, together with test collections, to aid in plagiarism detection. However, both methods and collections have focused on content analysis, overlooking citation analysis. In this paper, we aim at filling this gap and present a test collection with cases of plagiarism by missing and incorrect references. The collection contains automatically generated academic papers in which passages from other documents have been inserted. Such passages were either adequately referenced (i.e., not plagiarized), not referenced, or incorrectly referenced. Annotation files identifying each passage enable the evaluation of plagiarism detection systems.

Proceedings Article
01 Jan 2012
TL;DR: The paper presents the experiments carried out as part of the participation in the QA4MRE 2012 biomedical pilot task about Alzheimer's disease; the last two runs use a TF or TF-IDF weighting scheme as well as OMIM terms about Alzheimer's disease for query expansion.
Abstract: The paper presents the experiments carried out as part of the participation in the biomedical pilot task about Alzheimer's disease for QA4MRE at CLEF 2012. We submitted a total of five unique runs in the pilot task. One run uses the Term Frequency (TF) of the query words to weight the sentences. Two runs use the Term Frequency-Inverse Document Frequency (TF-IDF) of the query words to weight the sentences; these two runs differ in that, when multiple answers obtain the same score from our system, we choose a different answer in each run. The last two runs use the TF or TF-IDF weighting scheme as well as OMIM terms about Alzheimer's disease for query expansion. Stopwords are removed from the query words and answers. Each sentence in the associated document is assigned a weighting score with respect to the query words. The sentence that receives the highest weighting score for the query words is identified as the most relevant sentence in the document. The corresponding answer option to the given question is scored according to the sentence weighting score, and the highest-ranked answer is selected as the final answer.
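A minimal sketch of the sentence-weighting scheme described above: sentences of the supporting document are scored by the TF-IDF of the question (and answer-option) words, and the option whose best supporting sentence scores highest is returned. The tokenisation, stop-word list and toy passage are illustrative assumptions, not the authors' exact setup.

import math
import re
from collections import Counter

STOPWORDS = {"the", "of", "a", "is", "in", "and", "to", "what", "are", "also"}

def tokens(text):
    return [w for w in re.findall(r"\w+", text.lower()) if w not in STOPWORDS]

def best_answer(question, options, document):
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]

    # Inverse document frequency computed over the sentences of the document.
    df = Counter()
    for s in sentences:
        df.update(set(tokens(s)))

    def idf(w):
        return math.log(1 + len(sentences) / df[w]) if df[w] else 0.0

    def sentence_score(sentence, words):
        tf = Counter(tokens(sentence))
        return sum(tf[w] * idf(w) for w in words)

    q_words = tokens(question)
    scores = []
    for option in options:
        words = q_words + tokens(option)
        # An option is scored by its best supporting sentence.
        scores.append(max(sentence_score(s, words) for s in sentences))
    return options[scores.index(max(scores))]

doc = ("Amyloid plaques accumulate in the brain of Alzheimer patients. "
       "Tau tangles are also observed. Exercise improves general health.")
question = "What accumulates in the brain of Alzheimer patients?"
print(best_answer(question, ["amyloid plaques", "exercise", "vitamins"], doc))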

Proceedings Article
01 Jan 2012
TL;DR: Experimental results show that just combining concept detectors not specifically designed for handling the large variety of concepts does not allow reaching satisfactory results.
Abstract: This paper presents the first participation of the Pattern Recognition and Application Group (PRA Group), and the Ambient Intelligence Lab (AmILAB), at the ImageCLEF 2012 Photo Flickr Concept Annotation Task. In this task, the teams' goal is to detect the presence of 94 concepts in the images, and to provide a confidence score related to the confidence of the decision of each concept detector. We faced the challenge by relying on visual information only, combining different image descriptors by means of different score combination techniques. Experimental results show that just combining concept detectors not specifically designed for handling the large variety of concepts does not allow reaching satisfactory results.