
Showing papers presented at "Cross-Language Evaluation Forum in 2012"


Proceedings Article
01 Jan 2012
TL;DR: The ninth edition of the ImageCLEF medical image retrieval and classification tasks was organized in 2012, using a larger collection of over 300,000 images than in 2011, with the retrieval tasks mainly adding complexity.
Abstract: The ninth edition of the ImageCLEF medical image retrieval and classification tasks was organized in 2012. A subset of the open access collection of PubMed Central was used as the database in 2012, with a larger number of over 300,000 images than in 2011. As in previous years, there were three subtasks: modality classification, image-based and case-based retrieval. A new hierarchy for article figures was created for the modality classification task. Modality detection could be one of the most important filters to limit the search and focus the result sets. The goals of the image-based and case-based retrieval tasks were similar to 2011, mainly adding complexity. The number of groups submitting runs remained stable at 17, and the number of submitted runs remained roughly the same at 202 (207 in 2011). Of these, 122 were image-based retrieval runs, 37 were case-based runs, and the remaining 43 were modality classification runs. Depending on the exact nature of the task, visual, textual or multimodal approaches performed better.

115 citations


Book ChapterDOI
17 Sep 2012
TL;DR: This text presents reflections and ideas from a concrete project on using cloud-based benchmarking paradigms for medical image analysis and retrieval; two evaluation campaigns are planned for 2013 and 2014 using the proposed technology.
Abstract: Benchmarks have proven to be an important tool to advance science in the fields of information analysis and retrieval. Problems of running benchmarks include obtaining large amounts of data, annotating them and then distributing them to the participants of a benchmark. Distribution of the data to participants is currently mostly done via data download, which can take hours for large data sets and, in countries with slow Internet connections, even days. Shipping physical hard disks has also been used for distributing very large data sets (for example by TRECVid), but this too becomes infeasible once data sets reach sizes of 5-10 TB. With cloud computing it is possible to make very large data sets available in a central place at limited cost. Instead of distributing the data to the participants, the participants can run their algorithms on virtual machines of the cloud providers. This text presents reflections and ideas from a concrete project on using cloud-based benchmarking paradigms for medical image analysis and retrieval. It is planned to run two evaluation campaigns in 2013 and 2014 using the proposed technology.

45 citations


Proceedings Article
17 Sep 2012
TL;DR: This overview presents the resources and assessment of the task in more detail, summarizes the retrieval approaches employed by the participating groups, and provides an analysis of the main evaluation results.
Abstract: The ImageCLEF plant identification task provides a testbed for the system-oriented evaluation of plant identification, more precisely the identification of 126 tree species based on leaf images. Three types of image content are considered: Scan, Scan-like (leaf photographs with a white uniform background), and Photograph (unconstrained leaf with natural background). The main originality of this data is that it was built through a citizen science initiative conducted by Tela Botanica, a French social network of amateur and expert botanists. This makes the task closer to the conditions of a real-world application. This overview presents the resources and assessment of the task in more detail, summarizes the retrieval approaches employed by the participating groups, and provides an analysis of the main evaluation results. With a total of eleven groups from eight countries and 30 submitted runs, involving distinct and original methods, this second-year pilot task confirms the image retrieval community's interest in biodiversity and botany, and highlights further challenging studies in plant identification.

36 citations


Book ChapterDOI
17 Sep 2012
TL;DR: This work proposes a concept-based similarity model for cross-language high-similarity and near-duplicate search and finds that, although the proposed model is very generic, it produces competitive results and is notably stable and consistent across the corpora.
Abstract: This work addresses the problem of cross-language high-similarity and near-duplicate search, where, for a given document, a highly similar one is to be identified in a large cross-language collection of documents. We propose a concept-based similarity model for the problem which is very light in computation and memory. We evaluate the model on three corpora of different nature and two language pairs, English-German and English-Spanish, using the Eurovoc conceptual thesaurus. Our model is compared with two state-of-the-art models, and we find that, although the proposed model is very generic, it produces competitive results and is notably stable and consistent across the corpora.
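The abstract does not detail the model itself; as a hedged illustration of the general idea, the sketch below maps documents in two languages onto vectors over a shared concept vocabulary (standing in for Eurovoc) and compares them with cosine similarity. The term-to-concept lexicons and the documents are invented placeholders, not the authors' resources.

from collections import Counter
from math import sqrt

def concept_vector(tokens, lexicon):
    """Map tokens of one language onto shared concept IDs via a lexicon."""
    return Counter(lexicon[t] for t in tokens if t in lexicon)

def cosine(u, v):
    dot = sum(u[c] * v[c] for c in u if c in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

# Hypothetical term -> concept mappings for two languages (illustrative only).
en_lex = {"parliament": "c100", "fishing": "c200", "quota": "c300"}
de_lex = {"parlament": "c100", "fischerei": "c200", "quote": "c300"}

en_doc = "the parliament adopted a fishing quota".split()
de_doc = "das parlament beschloss eine quote für die fischerei".split()

print(cosine(concept_vector(en_doc, en_lex), concept_vector(de_doc, de_lex)))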

32 citations


Book ChapterDOI
17 Sep 2012
TL;DR: This paper describes the specification for an Information Retrieval (IR) evaluation infrastructure by conceptually modeling the entities involved in information retrieval experimental evaluation and their relationships and by defining the architecture of the proposed evaluation infrastructure and the APIs for accessing it.
Abstract: Information Retrieval (IR) experimental evaluation is an essential part of the research on and development of information access methods and tools. Shared data sets and evaluation scenarios allow for comparing methods and systems, understanding their behaviour, and tracking performance and progress over time. On the other hand, experimental evaluation is an expensive activity in terms of the human effort, time, and costs required to carry it out. Software and hardware infrastructures that support experimental evaluation, as well as the management, enrichment, and exploitation of the produced scientific data, provide a key contribution to reducing such effort and costs and to carrying out systematic and thorough analysis and comparison of systems and methods, overall acting as enablers of scientific and technical advancement in the field. This paper describes the specification for an IR evaluation infrastructure by conceptually modeling the entities involved in IR experimental evaluation and their relationships, and by defining the architecture of the proposed evaluation infrastructure and the APIs for accessing it.

30 citations


Proceedings Article
01 Jan 2012
TL;DR: A dictionary of words and expressions relating to predators' grooming stages is described, which was used to identify which posts in the predators' conversations were most distinctive for their grooming behavior.
Abstract: In this paper we present a new approach for detecting online pedophiles in chat rooms that combines the results of predictions on the level of the individual post, the level of the user and the level of the entire conversation, and describe the results of this three-stage system in the PAN 2012 competition. Also, we describe a resampling and a filtering strategy to circumvent issues regarding the unbalanced dataset. Finally, we describe the creation of a dictionary of words and expressions relating to predators' grooming stages, which we used to identify which posts in the predators' conversations were most distinctive for their grooming behavior.

27 citations


Book ChapterDOI
17 Sep 2012
TL;DR: This study focuses not on the retrieval performance of the system but on the relevance assessments and the inter-assessor reliability, applying Fleiss' Kappa and Krippendorff's Alpha to quantify the agreement.
Abstract: During the last three years we conducted several information retrieval evaluation series with more than 180 LIS students who made relevance assessments on the outcomes of three specific retrieval services. In this study we do not focus on the retrieval performance of our system but on the relevance assessments and the inter-assessor reliability. To quantify the agreement we apply Fleiss' Kappa and Krippendorff's Alpha. Comparing these two statistical measures, Kappa values were on average 0.37 and Alpha values 0.15. We use the two agreement measures to drop overly unreliable assessments from our data set. When computing the differences between the unfiltered and the filtered data set we see a root mean square error between 0.02 and 0.12. We see this as a clear indicator that disagreement affects the reliability of retrieval evaluations. We suggest either not working with unfiltered results or clearly documenting the disagreement rates.
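Since the study reports agreement in terms of Fleiss' Kappa (average 0.37), a compact from-scratch computation of the statistic on a toy assessment matrix may help make the measure concrete; the counts below are invented, not the study's data.

def fleiss_kappa(ratings):
    """ratings[i][j] = number of assessors assigning category j to item i."""
    n_items = len(ratings)
    n_raters = sum(ratings[0])   # assessors per item (assumed constant here)
    n_cats = len(ratings[0])

    # Proportion of all assignments that went to each category.
    p_j = [sum(row[j] for row in ratings) / (n_items * n_raters) for j in range(n_cats)]

    # Per-item agreement: fraction of agreeing assessor pairs.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1)) for row in ratings]

    p_bar = sum(p_i) / n_items        # observed agreement
    p_e = sum(p * p for p in p_j)     # chance agreement
    return (p_bar - p_e) / (1 - p_e)

# Toy example: 4 documents, 3 assessors, categories (not relevant, relevant).
counts = [[3, 0], [2, 1], [1, 2], [0, 3]]
print(round(fleiss_kappa(counts), 3))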

22 citations


Book ChapterDOI
17 Sep 2012
TL;DR: The results demonstrate that it is possible to train a learning to rank algorithm on generated pseudo judgments, and show how the pseudo test collection ranks systems compared to editorial topics with editorial judgements.
Abstract: Pseudo test collections are automatically generated to provide training material for learning to rank methods. We propose a method for generating pseudo test collections in the domain of digital libraries, where data is relatively sparse, but comes with rich annotations. Our intuition is that documents are annotated to make them better findable for certain information needs. We use these annotations and the associated documents as a source for pairs of queries and relevant documents. We investigate how learning to rank performance varies when we use different methods for sampling annotations, and show how our pseudo test collection ranks systems compared to editorial topics with editorial judgements. Our results demonstrate that it is possible to train a learning to rank algorithm on generated pseudo judgments. In some cases, performance is on par with learning on manually obtained ground truth.
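A minimal sketch of the intuition described above: treat each annotation as a pseudo query and the documents carrying it as its pseudo-relevant set, then sample some of these pairs as training topics. The record structure, thresholds and sampling below are illustrative assumptions, not the authors' actual procedure.

import random
from collections import defaultdict

# Hypothetical annotated records from a digital library.
records = [
    {"id": "d1", "annotations": ["world war i", "propaganda"]},
    {"id": "d2", "annotations": ["propaganda", "posters"]},
    {"id": "d3", "annotations": ["world war i", "trench warfare"]},
    {"id": "d4", "annotations": ["posters"]},
]

def build_pseudo_collection(records, min_docs=2, sample_size=2, seed=0):
    """Turn annotations into (pseudo query, pseudo-relevant doc ids) pairs."""
    by_annotation = defaultdict(set)
    for rec in records:
        for ann in rec["annotations"]:
            by_annotation[ann].add(rec["id"])
    # Keep annotations with enough documents and sample a subset as "topics".
    usable = [(q, sorted(docs)) for q, docs in by_annotation.items() if len(docs) >= min_docs]
    return random.Random(seed).sample(usable, min(sample_size, len(usable)))

for query, relevant in build_pseudo_collection(records):
    print(query, "->", relevant)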

22 citations


Proceedings Article
01 Jan 2012
TL;DR: The pilot task Processing modality and negation, organized in the framework of the Question Answering for Machine Reading Evaluation Lab at CLEF 2012, was defined as an annotation exercise consisting in determining whether an event mentioned in a text is presented as negated, modalised (i.e. affected by an expression of modality), or both.
Abstract: This paper describes the pilot task Processing modality and negation, which was organized in the framework of the Question Answering for Machine Reading Evaluation Lab at CLEF 2012. The task was defined as an annotation exercise consisting in determining whether an event mentioned in a text is presented as negated, modalised (i.e. affected by an expression of modality), or both. Three teams participated in the task, submitting a total of 6 runs. The highest score obtained by a system was a macro-averaged F1 of 0.6368.

21 citations


Proceedings Article
17 Sep 2012
TL;DR: In this paper, a two-step model-driven segmentation and the evaluation of high-level characteristics that make a semantic interpretation possible, as well as more generic shape features, are combined in a random forest classification algorithm, and their significance is evaluated.
Abstract: This paper summarizes the participation of the ReVeS project in the ImageCLEF 2012 Plant Identification task. Aiming to develop a system for tree leaf identification on mobile devices, our method is designed to cope with the challenges of complex natural images and to enable a didactic interaction with the user. The approach relies on a two-step model-driven segmentation and on the evaluation of high-level characteristics that make a semantic interpretation possible, as well as more generic shape features. All these descriptors are combined in a random forest classification algorithm, and their significance is evaluated. Our team ranks 4th overall and 3rd on natural images, which constitutes a very satisfying performance with respect to the project's objectives.

17 citations


Book ChapterDOI
17 Sep 2012
TL;DR: This paper summarizes a major effort in interactive search investigation, the INEX i-track, a collective effort run over a seven-year period, and presents the experimental conditions, findings of the participating groups, and examines the challenges posed by this kind of collective experimental effort.
Abstract: This paper summarizes a major effort in interactive search investigation, the INEX i-track, a collective effort run over a seven-year period. We present the experimental conditions, report some of the findings of the participating groups, and examine the challenges posed by this kind of collective experimental effort.

Book ChapterDOI
17 Sep 2012
TL;DR: In this article, the authors propose a new metric for ranking evaluation, the CRP, which is based on the observation that a document of a given degree of relevance may be ranked too early or too late regarding the ideal ranking of documents for a query.
Abstract: The development of multilingual and multimedia information access systems calls for proper evaluation methodologies to ensure that they meet the expected user requirements and provide the desired effectiveness. IR research offers a strong evaluation methodology and a range of evaluation metrics, such as MAP and (n)DCG. In this paper, we propose a new metric for ranking evaluation, the Cumulated Relative Position (CRP). We start from the observation that a document of a given degree of relevance may be ranked too early or too late with respect to the ideal ranking of documents for a query. Its relative position may be negative, indicating too early a ranking; zero, indicating correct ranking; or positive, indicating too late a ranking. By cumulating these relative positions we indicate, at each ranked position, the net effect of document displacements: the CRP. We first define the metric formally and then discuss its properties, its relationship to prior metrics, and its visualization. Finally, we propose different visualizations of CRP, exploiting a test collection to demonstrate its behavior.
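As a hedged illustration only (the paper's formal definition may differ in detail), the sketch below follows the description above: each retrieved document's relative position is its signed distance to the interval of positions its relevance grade occupies in the ideal ranking, and these displacements are cumulated along the ranked list. The grades and the toy run are invented.

def cumulated_relative_position(run_grades, all_grades):
    """
    run_grades: relevance grade of the document at each rank of a run.
    all_grades: grades of every assessed document for the topic (defines the ideal ranking).
    Returns the list of cumulated relative positions, one value per rank.
    """
    ideal = sorted(all_grades, reverse=True)
    # Ideal position interval (first, last rank) for each relevance grade.
    bands = {}
    for pos, g in enumerate(ideal, start=1):
        first, last = bands.get(g, (pos, pos))
        bands[g] = (min(first, pos), max(last, pos))

    crp, total = [], 0
    for rank, g in enumerate(run_grades, start=1):
        first, last = bands[g]
        if rank < first:        # ranked too early -> negative displacement
            total += rank - first
        elif rank > last:       # ranked too late -> positive displacement
            total += rank - last
        crp.append(total)
    return crp

# Toy example with grades 0-2: the run places a non-relevant document first.
print(cumulated_relative_position([0, 2, 1, 1, 0], [2, 1, 1, 0, 0]))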

Book ChapterDOI
17 Sep 2012
TL;DR: The present paper describes the development of a language-independent, query-focused snippet generation module that takes the query and the content of each retrieved document and generates a query-dependent snippet for each retrieved document.
Abstract: The present paper describes the development of a language-independent, query-focused snippet generation module. This module takes the query and the content of each retrieved document and generates a query-dependent snippet for that document. The algorithm of this module is based on sentence extraction, sentence scoring and sentence ranking. A subjective evaluation has been carried out. The English snippets obtained the best evaluation score, i.e. 1, and an overall average evaluation score of 0.83 was achieved on a scale of 0 to 1.
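A minimal, language-independent sketch of the extract-score-rank pipeline outlined above: split the document into sentences, score each by overlap with the query terms, and return the top-ranked sentences in document order. The scoring function is an illustrative assumption, not the authors' exact formula.

import re

def generate_snippet(query, document, max_sentences=2):
    """Pick the sentences that best cover the query terms."""
    query_terms = set(re.findall(r"\w+", query.lower()))
    # Naive sentence splitting on ., ! and ? so no language-specific tools are needed.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]

    def score(sentence):
        terms = set(re.findall(r"\w+", sentence.lower()))
        return len(query_terms & terms) / (len(terms) ** 0.5 or 1.0)

    chosen = set(sorted(sentences, key=score, reverse=True)[:max_sentences])
    # Keep the original document order for readability.
    return " ".join(s for s in sentences if s in chosen)

doc = ("The forum was held in Rome. Plant identification was a pilot task. "
       "Leaf images were collected by volunteers.")
print(generate_snippet("plant identification task", doc))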

Proceedings Article
01 Jan 2012
TL;DR: The paper introduces the technique of index expansion, which relies on building a search index enriched with information gathered from a linguistic analysis of texts, and reports experiments in tackling the CLEF 2012 Pilot Task on Machine Reading for Question Answering.
Abstract: The paper reports our experiments in tackling the CLEF 2012 Pilot Task on Machine Reading for Question Answering. We introduce the technique of index expansion, which relies on building a search index enriched with information gathered from a linguistic analysis of texts. The index provides a highly tangled representation of the sentences where each word is directly connected to others representing both meaning and relations. Instead of keeping the knowledge base separate, the relevant knowledge gets embedded within the text. We can hence use efficient indexing techniques to represent such knowledge and query it very effectively. We explain how index expansion was used in the task and describe the experiments that we performed. The results achieved are quite positive and a final error analysis shows how the technique can be further improved.

Book ChapterDOI
17 Sep 2012
TL;DR: A strong baseline system derived from a combination of state-of-the-art machine translation and monolingual entity linking is enhanced to achieve 11.2% improvement in B-Cubed+ F-measure.
Abstract: In this paper we propose two novel approaches to enhance cross-lingual entity linking (CLEL). One is based on cross-lingual information networks, aligned based on monolingual information extraction, and the other uses topic modeling to ensure global consistency. We enhance a strong baseline system derived from a combination of state-of-the-art machine translation and monolingual entity linking to achieve 11.2% improvement in B-Cubed+ F-measure. Our system achieved highly competitive results in the NIST Text Analysis Conference (TAC) Knowledge Base Population (KBP2011) evaluation. We also provide detailed qualitative and quantitative analysis on the contributions of each approach and the remaining challenges.

Book ChapterDOI
17 Sep 2012
TL;DR: This paper gives an overview of several different approaches that have been applied by participants in the CLEF-IP evaluation initiative and suggests that other techniques and experimental paradigms could be helpful in further improving the results and making the experiments more realistic.
Abstract: This paper gives an overview of several different approaches that have been applied by participants in the CLEF-IP evaluation initiative. On this basis, it is suggested that other techniques and experimental paradigms could be helpful in further improving the results and making the experiments more realistic. The field of information seeking is therefore incorporated and its potential gain for patent retrieval explained. Furthermore, the different search tasks undertaken by patent searchers are introduced as possible use cases. They can serve as a basis for development in patent retrieval research in that they present the diverse scenarios with their special characteristics and therefore give the research community a realistic picture of the patent user's work.

Book ChapterDOI
17 Sep 2012
TL;DR: A lab test is performed to study the satisfaction of users of a speech retrieval system and to empirically estimate the optimal shape of the Penalty Function used in Mean Generalized Average Precision, the evaluation measure widely used for unsegmented speech retrieval.
Abstract: This paper deals with the evaluation of information retrieval from unsegmented speech. We focus on Mean Generalized Average Precision, the evaluation measure widely used for unsegmented speech retrieval. This measure is designed to allow a certain tolerance in matching retrieval results (starting points of relevant segments) against a gold-standard relevance assessment. It employs a Penalty Function which evaluates non-exact matches in the retrieval results based on their distance from the beginnings of their nearest true relevant segments. However, the choice of the Penalty Function is usually ad hoc and does not necessarily reflect users' perception of speech retrieval quality. We perform a lab test studying the satisfaction of users of a speech retrieval system in order to empirically estimate the optimal shape of the Penalty Function.
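The abstract does not give the Penalty Function's actual form, so the following is only a generic illustration of the kind of function involved: a retrieved starting point earns full credit on an exact match and linearly less as its distance from the true segment start grows, down to zero beyond a tolerance window. The linear shape and the 30-second window are invented parameters, not findings of the paper.

def penalty_weight(distance_seconds, tolerance=30.0):
    """Credit given to a retrieved starting point as a function of its
    distance from the nearest true relevant segment start (illustrative)."""
    return max(0.0, 1.0 - abs(distance_seconds) / tolerance)

for d in (0, 10, 25, 45):
    print(d, "->", penalty_weight(d))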

Book ChapterDOI
17 Sep 2012
TL;DR: The effects that various characteristics of the topic documents have on the effectiveness of systems for the task of finding prior art in the patent domain are revisited, and the reader interested in approaching the domain is given a guide to the issues that need to be addressed.
Abstract: We revisit the effects that various characteristics of the topic documents have on the effectiveness of systems for the task of finding prior art in the patent domain. In doing so, we provide the reader interested in approaching the domain with a guide to the issues that need to be addressed in this context. For the current study, we select two patent-based test collections with a common document representation schema and look at topic characteristics specific to the objectives of the collections. We look at the effect of languages on retrieval and at the length of the topic documents. We present the correlations between these topic facets and their retrieval results, as well as their relevant documents.

Book ChapterDOI
17 Sep 2012
TL;DR: The setup of MusiClef is described, showing how it complements existing benchmarking initiatives and fosters less explored methodological directions in Music Information Retrieval.
Abstract: MusiClef is a multimodal music benchmarking initiative that will be running a MediaEval 2012 Brave New Task on Multimodal Music Tagging. This paper describes the setup of this task, showing how it complements existing benchmarking initiatives and fosters less explored methodological directions in Music Information Retrieval. MusiClef deals with a concrete use case, encourages multimodal approaches based on these, and strives for transparency of results as much as possible. Transparency is encouraged at several levels and stages, from the feature extraction procedure up to the evaluation phase, in which a dedicated categorization of ground truth tags will be used to deepen the understanding of the relation between the proposed approaches and experimental results.

Proceedings Article
01 Jan 2012
TL;DR: The strong results confirm the suitability of linguistic heuristics for low-level semantic features and showcase their robustness across the different subgenres of the QA4MRE corpora.
Abstract: For the QA4MRE 2012 Pilot Task on Negation and Modality, CLaC Labs implemented a general, lightweight negation and modality module based on linguistic rules. The strong results confirm the suitability of linguistic heuristics for low-level semantic features and showcase their robustness across the different subgenres of the QA4MRE corpora.
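A toy illustration of the kind of lightweight rule such a module might contain: an event word is flagged as negated or modalised if a cue appears in a small window to its left. The cue lists and window size are invented for illustration and are not CLaC's actual rules.

NEGATION_CUES = {"not", "no", "never", "without", "n't"}
MODALITY_CUES = {"may", "might", "could", "should", "possibly", "probably"}

def annotate_event(tokens, event_index, window=4):
    """Return (negated, modalised) flags for the event word at event_index."""
    left = [t.lower() for t in tokens[max(0, event_index - window):event_index]]
    return (any(t in NEGATION_CUES for t in left),
            any(t in MODALITY_CUES for t in left))

sentence = "The drug may not reduce the symptoms".split()
print(annotate_event(sentence, sentence.index("reduce")))  # -> (True, True)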

Proceedings Article
01 Jan 2012
TL;DR: A Wikipedia-based two-pass algorithm is proposed to address the issue of disambiguating tweets with respect to company names, and the effectiveness of the proposed approach is demonstrated.
Abstract: Using Twitter as an effective marketing tool has become a gold mine for companies interested in their online reputation. A quite significant research challenge related to the above issue is to disambiguate tweets with respect to company names. In fact, finding whether a particular tweet is relevant or irrelevant to a company is an important task not yet satisfactorily solved; to address this issue, in this paper we propose a Wikipedia-based two-pass algorithm. The experimental evaluations demonstrate the effectiveness of the proposed approach.

Proceedings Article
01 Jan 2012
TL;DR: For the CLEF-IP 2012 Claims to Passage task, where the goal was to return relevant passages according to sets of claims for patentability or novelty search purposes, a two-step retrieval system was designed to cope with the problems induced by the large dataset.
Abstract: In CLEF-IP 2012, we participated in the Claims to Passage task, where the goal was to return relevant passages according to sets of claims, for patentability or novelty search purposes. The collection contained 2.3M documents, corresponding to an estimated volume of 250M passages. To cope with the problems induced by this large dataset, we designed a two-step retrieval system. In the first step, the 2.3M patent application documents were indexed; for each topic, we then retrieved the k most similar documents with a classical prior art search. Document representations and the tuning of the IR engine were set relying on training data and on the expertise we acquired in past similar tasks. In particular, we used not only the claims for topics, but also the full description of the application document and the applicants/inventors details; moreover, we discarded retrieved documents that did not share at least one IPC code with the topic. The k parameter ranged from 5 to 1000 according to the computed run. In the second step, for each topic (i.e. "on the fly"), we indexed the passages contained in these k most similar documents and queried them with the topic claims in order to obtain the final runs. Thus, we dealt with approximately 11M passages instead of 250M. The best k parameter on the training data was 10. Hence, we decided to submit four runs with k set to 10, 20, 50, and 100. Finally, we analyzed the training data and observed that the position of a passage in the document played a role, as passages at the end of the description were more likely to be relevant. Thus, we re-ranked each run according to passages' positions in the document in order to submit four supplementary runs.
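A schematic sketch of the two-step strategy described above: retrieve the k most similar application documents with a document-level search, keep only those sharing an IPC code with the topic, then index the passages of the survivors "on the fly" and query them with the claims alone. The toy retrieval functions below rank by simple term overlap and merely stand in for the full IR engine; the data is invented.

def _overlap(query, text):
    return len(set(query.lower().split()) & set(text.lower().split()))

def doc_search(query_text, collection, k):
    """Placeholder document retrieval: rank whole documents by term overlap."""
    return sorted(collection, key=lambda d: _overlap(query_text, d["text"]), reverse=True)[:k]

def passage_search(query_text, passages):
    """Placeholder passage retrieval: rank (doc_id, index, passage) triples."""
    return sorted(passages, key=lambda p: _overlap(query_text, p[2]), reverse=True)

def shares_ipc_code(topic, doc):
    return bool(set(topic["ipc_codes"]) & set(doc["ipc_codes"]))

def two_step_passage_search(topic, collection, k=10):
    # Step 1: prior-art style search over whole documents, using claims + description.
    query_text = topic["claims"] + " " + topic["description"]
    candidates = [d for d in doc_search(query_text, collection, k) if shares_ipc_code(topic, d)]
    # Step 2: index only the passages of those candidates and query with the claims alone.
    passages = [(d["id"], i, p) for d in candidates for i, p in enumerate(d["passages"])]
    return passage_search(topic["claims"], passages)

collection = [
    {"id": "EP1", "ipc_codes": ["H04L"], "text": "method for encrypting network packets",
     "passages": ["packets are encrypted with a shared key", "the key is rotated hourly"]},
    {"id": "EP2", "ipc_codes": ["A61K"], "text": "pharmaceutical composition of a tablet",
     "passages": ["the tablet contains an active agent"]},
]
topic = {"claims": "encrypting packets with a shared key",
         "description": "a method for encrypting network packets using keys",
         "ipc_codes": ["H04L"]}
print(two_step_passage_search(topic, collection, k=2))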

Proceedings Article
01 Jan 2012
TL;DR: An application of text mining techniques, using machine learning over a range of features, to automatically detect cases of patients with IFD from the text of the reports of CT scans performed on them is described.
Abstract: Invasive fungal diseases (IFDs) cause more than 1,000 deaths in hospitals and cost the health system more than AUD 100m in Australia each year. The most common life-threatening IFD is aspergillosis, and a patient with this IFD typically has 12 days of prolonged in-patient time in hospital and an 8% mortality rate. Surveillance and detection of IFDs irrespective of the stage of diagnosis (i.e., early or late in disease) is important. We describe an application of text mining techniques, using machine learning over a range of features, to automatically detect cases of patients with IFD from the text in the reports of CT scans performed on them. We focus on detecting the presence of aspergillosis; however, we anticipate the approach to be transferable to other diseases or conditions by training the text mining component over appropriate reports. Previous systems based on language technology have been deployed for processing radiology reports and for detecting hospital-acquired infection, with significant success. Our approach differs by using a purely statistical/machine-learning approach to the language technology, and by being trained and tested on data collected from a number of hospitals. We collected reports for 288 IFD and 291 control patients from three different hospitals in Melbourne, Australia: Alfred Health, Melbourne Health, and Peter MacCallum Cancer Centre. We extracted a sample of 69 IFD and 49 control patients to perform detailed analysis of the text with regard to IFD; each patient had possibly multiple scans (and associated reports), resulting in a total of 398 scan reports from IFD-positive patients and 83 scan reports from control patients. We had medical experts annotate the patient-level classification on all scan reports at both sentence and report level: the annotators had to decide, for each sentence and report, whether it was positive, neutral, or negative with regard to IFD. We classify reports and patients as IFD-positive if they contain at least one positive sentence, and as negative otherwise. We used the Weka SVM implementation and employed a variety of text- and concept-based features, including bag-of-words, punctuation, UMLS concepts and negated contexts extracted using MetaMap. We also automatically extract-
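As a hedged stand-in for the Weka SVM setup described above, the sketch below builds an analogous bag-of-words linear SVM with scikit-learn (a swapped-in library, not the authors' toolkit), classifies sentences, and then applies the report-level rule from the abstract: a report is positive if it contains at least one positive sentence. The toy sentences are invented, not real radiology reports.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented toy sentences; the real experiments used annotated CT-scan reports.
sentences = [
    "nodular opacity with halo sign suspicious for aspergillosis",
    "cavitating lesion in the upper lobe consistent with fungal infection",
    "lungs are clear with no focal consolidation",
    "no evidence of pulmonary abnormality",
]
labels = ["positive", "positive", "negative", "negative"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(sentences, labels)

report = ["new cavitating fungal lesion seen in the left upper lobe",
          "clear lungs, no abnormality"]
sentence_preds = model.predict(report)
# Report-level rule: positive if at least one sentence is classified positive.
print("IFD-positive report" if any(p == "positive" for p in sentence_preds)
      else "IFD-negative report")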

Proceedings Article
01 Jan 2012
TL;DR: The Cultural Heritage in CLEF 2012 (CHiC) pilot evaluation included three tasks: ad-hoc retrieval, semantic enrichment and variability; the University of Sheffield and the University of the Basque Country submitted a joint entry attempting the three English monolingual tasks.
Abstract: The Cultural Heritage in CLEF 2012 (CHiC) pilot evaluation included these tasks: ad-hoc retrieval, semantic enrichment and variability. At CHiC 2012, the University of Sheffield and the University of the Basque Country submitted a joint entry, attempting the three English monolingual tasks. For the ad-hoc task, the baseline approach used the Indri search engine. Query expansion approaches used random walks with Personalised PageRank over graphs constructed from Wikipedia and WordNet, and also exploited similar articles found within Wikipedia. For the semantic enrichment task, random walks using Personalised PageRank were again used. Additionally, links to Wikipedia were added and further approaches used this information to find enrichment terms. Finally, for the variability task, TF-IDF scores were calculated from text and metadata fields. The final results were selected using MMR (Maximal Marginal Relevance) and cosine similarity.
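The final selection step mentions MMR with cosine similarity; below is a generic Maximal Marginal Relevance re-ranking sketch in its standard formulation (not necessarily the exact parameterisation used by the authors), with invented similarity values.

def mmr(query_sim, pairwise_sim, k, lam=0.7):
    """
    Standard Maximal Marginal Relevance selection.
    query_sim[i]       : cosine similarity of candidate i to the query.
    pairwise_sim[i][j] : cosine similarity between candidates i and j.
    Returns the indices of the k selected, diversified candidates.
    """
    candidates = list(range(len(query_sim)))
    selected = []
    while candidates and len(selected) < k:
        def mmr_score(i):
            redundancy = max((pairwise_sim[i][j] for j in selected), default=0.0)
            return lam * query_sim[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy example: items 0 and 1 are near-duplicates, so MMR selects only one of them.
query_sim = [0.9, 0.88, 0.6]
pairwise_sim = [[1.0, 0.95, 0.1], [0.95, 1.0, 0.1], [0.1, 0.1, 1.0]]
print(mmr(query_sim, pairwise_sim, k=2))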

Book ChapterDOI
17 Sep 2012
TL;DR: Observing that sentence-like queries encode information about term importance in their grammatical structure, a Hidden Markov Model (HMM) based method is proposed to extract such information for term weighting.
Abstract: It has been observed that short queries generally perform better than their corresponding long versions when retrieved by the same IR model. This is mainly because most current models do not distinguish the importance of different terms in the query. Observing that sentence-like queries encode information related to term importance in their grammatical structure, we propose a Hidden Markov Model (HMM) based method to extract such information for term weighting. The basic idea of choosing an HMM is motivated by its successful application in capturing the relationship between adjacent terms in the NLP field. Since we are dealing with queries in natural language form, we think that an HMM can also be used to capture the dependence between the weights and the grammatical structures. Our experiments show that this assumption is quite reasonable and that such information, when utilized properly, can greatly improve retrieval performance.
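The abstract does not spell out the model's structure, so the following is only a hedged illustration of the general idea: a tiny two-state HMM (important vs. background term) emitting coarse part-of-speech tags, decoded with Viterbi to decide which query terms to up-weight. The states, tags and probabilities are invented for illustration and are not the authors' model.

from math import log

STATES = ("IMPORTANT", "BACKGROUND")
START = {"IMPORTANT": 0.5, "BACKGROUND": 0.5}
TRANS = {"IMPORTANT": {"IMPORTANT": 0.6, "BACKGROUND": 0.4},
         "BACKGROUND": {"IMPORTANT": 0.4, "BACKGROUND": 0.6}}
# Emission probabilities over coarse POS tags (invented numbers).
EMIT = {"IMPORTANT": {"NOUN": 0.6, "VERB": 0.25, "ADJ": 0.1, "OTHER": 0.05},
        "BACKGROUND": {"NOUN": 0.15, "VERB": 0.2, "ADJ": 0.15, "OTHER": 0.5}}

def viterbi(tags):
    """Most likely importance-state sequence for a POS-tagged query."""
    v = [{s: log(START[s]) + log(EMIT[s][tags[0]]) for s in STATES}]
    back = []
    for tag in tags[1:]:
        scores, ptrs = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: v[-1][p] + log(TRANS[p][s]))
            scores[s] = v[-1][prev] + log(TRANS[prev][s]) + log(EMIT[s][tag])
            ptrs[s] = prev
        v.append(scores)
        back.append(ptrs)
    state = max(STATES, key=lambda s: v[-1][s])
    path = [state]
    for ptrs in reversed(back):
        state = ptrs[state]
        path.append(state)
    return list(reversed(path))

query = [("find", "VERB"), ("documents", "NOUN"), ("about", "OTHER"), ("whaling", "NOUN")]
states = viterbi([tag for _, tag in query])
weights = {term: (2.0 if s == "IMPORTANT" else 1.0) for (term, _), s in zip(query, states)}
print(weights)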

Book ChapterDOI
17 Sep 2012
TL;DR: An approach to gain a better understanding of the interactions between search tasks, test collections and components and configurations of retrieval systems by testing a large set of experiment configurations against standard ad-hoc test collections is demonstrated.
Abstract: In this poster we demonstrate an approach to gain a better understanding of the interactions between search tasks, test collections and components and configurations of retrieval systems by testing a large set of experiment configurations against standard ad-hoc test collections.

Book ChapterDOI
17 Sep 2012
TL;DR: Experimental results prove the existence of the giant component in such graphs; based on the evaluation of this behaviour, the graphs and the corresponding descriptors are compared and validated in proof-of-concept retrieval tests.
Abstract: The paper presents a random-graph-based analysis approach for evaluating descriptors based on pairwise distance distributions on real data. Starting from the Erdős-Rényi model, the paper presents results of investigating random geometric graph behaviour in relation to the appearance of the giant component, as a basis for choosing descriptors based on their clustering properties. Experimental results prove the existence of the giant component in such graphs; based on the evaluation of this behaviour, the graphs and the corresponding descriptors are compared and validated in proof-of-concept retrieval tests.
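A small sketch of the kind of analysis described above: treat descriptor vectors as vertices, connect pairs closer than a threshold radius, and watch the relative size of the largest connected component grow with the radius (the emergence of the giant component). Random 2-D points and networkx stand in for real descriptors and the authors' tooling.

import random
import networkx as nx

def giant_component_fraction(points, radius):
    """Fraction of vertices in the largest connected component of the
    geometric graph linking points that are closer than `radius`."""
    g = nx.Graph()
    g.add_nodes_from(range(len(points)))
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5
            if dist < radius:
                g.add_edge(i, j)
    largest = max(nx.connected_components(g), key=len)
    return len(largest) / len(points)

# Stand-in for descriptor vectors: 200 random points in the unit square.
random.seed(1)
points = [(random.random(), random.random()) for _ in range(200)]
for r in (0.02, 0.05, 0.08, 0.12):
    print(r, round(giant_component_fraction(points, r), 2))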

Book ChapterDOI
17 Sep 2012
TL;DR: This paper presents a test collection with cases of plagiarism by missing and incorrect references, which contains automatically generated academic papers in which passages from other documents have been inserted.
Abstract: In recent years, several methods and tools have been developed, together with test collections, to aid in plagiarism detection. However, both methods and collections have focused on content analysis, overlooking citation analysis. In this paper, we aim at filling this gap and present a test collection with cases of plagiarism by missing and incorrect references. The collection contains automatically generated academic papers in which passages from other documents have been inserted. Such passages were either adequately referenced (i.e., not plagiarized), not referenced, or incorrectly referenced. Annotation files identifying each passage enable the evaluation of plagiarism detection systems.

Proceedings Article
01 Jan 2012
TL;DR: The paper presents the experiments carried out as part of the participation in the QA4MRE 2012 biomedical pilot task about Alzheimer's disease; the last two runs use a TF or TF-IDF weighting scheme as well as OMIM terms about Alzheimer's disease for query expansion.
Abstract: The paper presents the experiments carried out as part of the participation in the biomedical pilot task about Alzheimer's disease for QA4MRE at CLEF 2012. We submitted a total of five unique runs in the pilot task. One run uses the Term Frequency (TF) of the query words to weight the sentences. Two runs use the Term Frequency-Inverse Document Frequency (TF-IDF) of the query words to weight the sentences; these two runs differ in that, when multiple answers obtain the same score from our system, we choose a different answer in each run. The last two runs use the TF or TF-IDF weighting scheme as well as OMIM terms about Alzheimer's disease for query expansion. Stopwords are removed from the query words and answers. Each sentence in the associated document is assigned a weighting score with respect to the query words. The sentence that receives the highest weighting score for the query words is identified as the most relevant sentence in the document. The corresponding answer option to the given question is scored according to the sentence weighting score, and the highest-ranked answer is selected as the final answer.
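A minimal sketch of the sentence-weighting scheme described above: sentences of the supporting document are scored by the TF-IDF of the question (and answer-option) words, and the option whose best supporting sentence scores highest is returned. The tokenisation, stop-word list and toy passage are illustrative assumptions, not the authors' exact setup.

import math
import re
from collections import Counter

STOPWORDS = {"the", "of", "a", "is", "in", "and", "to", "what", "are", "also"}

def tokens(text):
    return [w for w in re.findall(r"\w+", text.lower()) if w not in STOPWORDS]

def best_answer(question, options, document):
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]

    # Inverse document frequency computed over the sentences of the document.
    df = Counter()
    for s in sentences:
        df.update(set(tokens(s)))

    def idf(w):
        return math.log(1 + len(sentences) / df[w]) if df[w] else 0.0

    def sentence_score(sentence, words):
        tf = Counter(tokens(sentence))
        return sum(tf[w] * idf(w) for w in words)

    q_words = tokens(question)
    scores = []
    for option in options:
        words = q_words + tokens(option)
        # An option is scored by its best supporting sentence.
        scores.append(max(sentence_score(s, words) for s in sentences))
    return options[scores.index(max(scores))]

doc = ("Amyloid plaques accumulate in the brain of Alzheimer patients. "
       "Tau tangles are also observed. Exercise improves general health.")
question = "What accumulates in the brain of Alzheimer patients?"
print(best_answer(question, ["amyloid plaques", "exercise", "vitamins"], doc))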

Proceedings Article
01 Jan 2012
TL;DR: Experimental results show that just combining concept detectors not specifically designed for handling the large variety of concepts does not allow reaching satisfactory results.
Abstract: This paper presents the first participation of the Pattern Recognition and Application Group (PRA Group), and the Ambient Intelligence Lab (AmILAB), at the ImageCLEF 2012 Photo Flickr Concept Annotation Task. In this task, the teams' goal is to detect the presence of 94 concepts in the images, and to provide a confidence score related to the confidence of the decision of each concept detector. We faced the challenge by relying on visual information only, combining different image descriptors by means of different score combination techniques. Experimental results show that just combining concept detectors not specifically designed for handling the large variety of concepts does not allow reaching satisfactory results.