Author

Mohamed Tmar

Bio: Mohamed Tmar is an academic researcher from the University of Sfax. The author has contributed to research in topics: Image retrieval & Relevance feedback. The author has an h-index of 8 and has co-authored 54 publications receiving 202 citations.


Papers
Book ChapterDOI
21 Nov 2012
TL;DR: A new intrinsic information content metric is used with the Wikipedia category graph to measure the semantic relatedness between words; when tested on a common benchmark of similarity ratings, the proposed approach shows a good correlation value compared to other computational models.
Abstract: Computing semantic relatedness is a key component of information retrieval tasks and natural language processing applications. Wikipedia provides a knowledge base for computing word relatedness with wider coverage than WordNet. In this paper we use a new intrinsic information content (IC) metric with the Wikipedia category graph (WCG) to measure the semantic relatedness between words. Indeed, we have developed a dedicated algorithm to extract the categories assigned to a given word from the WCG. Moreover, this extraction strategy is coupled with a new intrinsic information content metric based on the subgraph composed of the hypernyms of a given concept. We have also developed a process to quantify the information content of this subgraph. When tested on a common benchmark of similarity ratings, the proposed approach shows a good correlation value compared to other computational models.
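
To make the pipeline concrete, here is a minimal, hypothetical sketch of category extraction and relatedness scoring over a category graph. The toy word-to-category mapping and hierarchy below are invented for illustration and stand in for the real Wikipedia category graph, and the Jaccard overlap is only a stand-in for the paper's IC-based measure.

```python
# Illustrative toy sketch, not the paper's implementation: collect a word's
# categories plus their ancestors in a small category hierarchy, then score
# relatedness by the overlap of the two ancestor-closed category sets.

# Hypothetical data; a real system queries the Wikipedia category graph (WCG).
WORD_CATEGORIES = {
    "car": {"Automobiles"},
    "bus": {"Buses"},
    "banana": {"Tropical fruit"},
}
CATEGORY_PARENTS = {
    "Automobiles": {"Road vehicles"},
    "Buses": {"Road vehicles"},
    "Road vehicles": {"Vehicles"},
    "Tropical fruit": {"Fruit"},
    "Vehicles": set(),
    "Fruit": set(),
}

def ancestor_closure(categories):
    """Return the categories and every ancestor reachable in the graph."""
    seen, stack = set(), list(categories)
    while stack:
        cat = stack.pop()
        if cat not in seen:
            seen.add(cat)
            stack.extend(CATEGORY_PARENTS.get(cat, ()))
    return seen

def relatedness(word1, word2):
    """Jaccard overlap of the ancestor-closed category sets (illustration only)."""
    c1 = ancestor_closure(WORD_CATEGORIES[word1])
    c2 = ancestor_closure(WORD_CATEGORIES[word2])
    return len(c1 & c2) / len(c1 | c2)

if __name__ == "__main__":
    print(relatedness("car", "bus"))     # share 'Road vehicles' and 'Vehicles' -> 0.5
    print(relatedness("car", "banana"))  # no shared categories -> 0.0
```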

15 citations

Proceedings ArticleDOI
01 Sep 2011
TL;DR: This paper presents a new approach for measuring semantic relatedness between words and concepts that makes thorough use of the hypernym/hyponym relation (the noun and verb “is a” taxonomy) without external corpus statistics.
Abstract: Semantic similarity techniques compute the semantic similarity (common shared information) between two concepts according to certain language or domain resources such as ontologies, taxonomies and corpora. They constitute important components in most Information Retrieval (IR) and knowledge-based systems. Taking semantics into account requires external semantic resources, coupled with the initial documents, over which semantic similarity measurements are needed to compare concepts. This paper presents a new approach for measuring semantic relatedness between words and concepts. It combines a new information content (IC) metric using the WordNet thesaurus with the nominalization relation provided by the Java WordNet Library (JWNL). Specifically, the proposed method makes thorough use of the hypernym/hyponym relation (the noun and verb “is a” taxonomy) without external corpus statistics. Mainly, we use the subgraph formed by the hypernyms of the concerned concept, which inherits all the features of its hypernyms, and we quantify the contribution of each concept belonging to this subgraph to its information content. When tested on a common data set of word-pair similarity ratings, the proposed approach outperforms other computational models. It gives the highest correlation value, 0.70, with a benchmark based on human similarity judgments, in particular a large dataset composed of 260 Finkelstein word pairs (Appendix 1 and 2).
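
As an illustration of the hypernym-subgraph idea, here is a minimal sketch assuming NLTK with its WordNet corpus installed (nltk.download('wordnet')). It uses a Seco-style intrinsic IC (based on hyponym counts) as a stand-in for the paper's own subgraph-based quantification and a Lin-like ratio for the final score, so it approximates the approach rather than reproducing it.

```python
# Minimal sketch: build the hypernym subgraph of each noun sense, take the
# most informative shared hypernym, and score with a Lin-like ratio.
import math
from nltk.corpus import wordnet as wn

MAX_NOUNS = len(list(wn.all_synsets('n')))  # corpus-independent normaliser

def intrinsic_ic(synset):
    """Seco-style intrinsic IC: concepts with many hyponyms are less informative."""
    hyponyms = len(list(synset.closure(lambda s: s.hyponyms())))
    return 1.0 - math.log(hyponyms + 1) / math.log(MAX_NOUNS)

def hypernym_subgraph(synset):
    """The synset plus every hypernym reachable above it ("is a" taxonomy)."""
    return {synset} | set(synset.closure(lambda s: s.hypernyms()))

def relatedness(word1, word2):
    """Best Lin-like score over the noun-sense pairs of the two words."""
    best = 0.0
    for s1 in wn.synsets(word1, pos=wn.NOUN):
        for s2 in wn.synsets(word2, pos=wn.NOUN):
            shared = hypernym_subgraph(s1) & hypernym_subgraph(s2)
            denom = intrinsic_ic(s1) + intrinsic_ic(s2)
            if not shared or denom == 0:
                continue
            ic_lcs = max(intrinsic_ic(s) for s in shared)
            best = max(best, 2 * ic_lcs / denom)
    return best

if __name__ == "__main__":
    print(relatedness("car", "automobile"))  # same synset -> 1.0
    print(relatedness("car", "forest"))      # much lower
```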

13 citations

Journal ArticleDOI
TL;DR: An efficient and effective retrieval framework is presented that combines a vectorization technique with a pseudo-relevance model to transform any similarity matching model (between images) into a vector space model providing a score.
Abstract: Image retrieval is an important problem for researchers in the computer vision and content-based image retrieval (CBIR) fields. Over the last decades, many image retrieval systems have represented an image as a set of extracted low-level features such as color, texture and shape. Such systems then compute similarity metrics between features in order to find images similar to a query image. The disadvantage of this approach is that images that are visually and semantically different may be similar in the low-level feature space, so tools are needed to optimize the retrieval of information. Integrating vector space models is one way to improve the performance of image retrieval. In this paper, we present an efficient and effective retrieval framework which combines a vectorization technique with a pseudo-relevance model. The idea is to transform any similarity matching model (between images) into a vector space model providing a score. A study of several methodologies for obtaining the vectorization is presented. Experiments have been carried out on the Wang, Oxford5k and Inria Holidays datasets to show the performance of the proposed framework.
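
A rough sketch of the vectorization idea follows, using only NumPy: each image is represented by its vector of similarity scores against a fixed set of reference images, and a Rocchio-style step plays the role of the pseudo-relevance model. The feature extractor, similarity function and reference set are placeholders, not the paper's actual components.

```python
# Illustrative sketch: turn any pairwise image-similarity function into a
# vector-space representation, then refine the query vector with a
# Rocchio-style pseudo-relevance feedback step.
import numpy as np

rng = np.random.default_rng(0)

def extract_features(image):
    """Placeholder low-level descriptor (color/texture/shape in a real CBIR system)."""
    return np.asarray(image, dtype=float)

def similarity(f1, f2):
    """Any similarity matching model works here; cosine is used for illustration."""
    return float(f1 @ f2 / (np.linalg.norm(f1) * np.linalg.norm(f2) + 1e-12))

def vectorize(image, references):
    """Represent an image as its vector of similarity scores to the references."""
    f = extract_features(image)
    return np.array([similarity(f, extract_features(r)) for r in references])

def retrieve(query, database, references, top_k=5, alpha=0.7):
    """Rank database images; refine the query with the top-k (pseudo-relevant) results."""
    qv = vectorize(query, references)
    dvs = np.stack([vectorize(img, references) for img in database])
    scores = dvs @ qv
    top = np.argsort(scores)[::-1][:top_k]
    qv = alpha * qv + (1 - alpha) * dvs[top].mean(axis=0)  # Rocchio-style feedback
    return np.argsort(dvs @ qv)[::-1]

if __name__ == "__main__":
    db = [rng.random(16) for _ in range(20)]   # stand-in "images"
    refs = [rng.random(16) for _ in range(8)]
    print(retrieve(db[3], db, refs)[:5])       # db[3] is expected near the top
```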

13 citations

Book ChapterDOI
02 Oct 2009
TL;DR: An automatic method that uses the MeSH (Medical Subject Headings) thesaurus to generate a semantic annotation of medical articles: NLP techniques extract the indexing terms, and concepts are weighted based on their frequencies, their locations in the article and their semantic relationships according to MeSH.
Abstract: This paper proposes an automatic method using the MeSH (Medical Subject Headings) thesaurus for generating a semantic annotation of medical articles. First, our approach uses NLP (Natural Language Processing) techniques to extract the indexing terms. Second, it extracts the MeSH concepts from this set of indexing terms. Then, these concepts are weighted based on their frequencies, their locations in the article and their semantic relationships according to MeSH. Next, a refinement phase is triggered in order to promote the frequent ontology concepts and determine which ones will be integrated into the annotation. Finally, the resulting structured annotation is built.
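
The weighting step can be pictured with a small, hypothetical sketch: each extracted MeSH concept is scored from its frequency, a boost for the section it appears in, and a bonus from semantically related concepts. The section boosts, relation map and concept names below are invented for illustration and are not the paper's actual scheme.

```python
# Toy concept-weighting sketch: frequency x section boost, plus a bonus when
# related concepts (per a tiny hand-made relation map) also occur.
from collections import Counter

SECTION_BOOST = {"title": 3.0, "abstract": 2.0, "body": 1.0}
RELATED = {"Neoplasms": {"Carcinoma"}, "Carcinoma": {"Neoplasms"}}  # toy MeSH links

def weight_concepts(occurrences, relation_bonus=0.5):
    """occurrences: list of (concept, section) pairs found in the article."""
    base = Counter()
    for concept, section in occurrences:
        base[concept] += SECTION_BOOST.get(section, 1.0)
    weights = dict(base)
    for concept in base:  # refinement: related concepts reinforce each other
        for rel in RELATED.get(concept, ()):
            if rel in base:
                weights[concept] += relation_bonus * base[rel]
    return dict(sorted(weights.items(), key=lambda kv: -kv[1]))

if __name__ == "__main__":
    found = [("Neoplasms", "title"), ("Carcinoma", "abstract"),
             ("Carcinoma", "body"), ("Aspirin", "body")]
    print(weight_concepts(found))
```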

12 citations

01 Jan 2014
TL;DR: This paper presents the first participation in the user-centred health information retrieval task at CLEFeHealth 2014, whose objective is to retrieve information that answers patients' questions when reading clinical reports.
Abstract: This paper presents our first participation in the user-centred health information retrieval task at CLEFeHealth 2014. The objective of this task is to retrieve information that answers patients' questions when they read clinical reports. We submitted only the mandatory run (baseline system). The results obtained are encouraging (MAP = 0.1677 and P@10 = 0.5460) but can be improved.
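
For reference, the two reported measures can be computed as follows; the sketch below uses made-up toy runs and relevance judgments, not the CLEFeHealth data.

```python
# Mean average precision (MAP) and precision at 10 (P@10) for ranked runs.

def average_precision(ranked_ids, relevant_ids):
    """Average of precision values at the ranks of the relevant documents."""
    hits, precision_sum = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0

def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids) / k

if __name__ == "__main__":
    run = {"q1": ["d3", "d1", "d7", "d2"], "q2": ["d5", "d9", "d4", "d8"]}   # toy ranking
    qrels = {"q1": {"d1", "d2"}, "q2": {"d9"}}                               # toy judgments
    ap = [average_precision(run[q], qrels[q]) for q in run]
    print("MAP  =", sum(ap) / len(ap))
    print("P@10 =", sum(precision_at_k(run[q], qrels[q]) for q in run) / len(run))
```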

11 citations


Cited by
Proceedings Article
01 Jan 2012
TL;DR: The ninth edition of the ImageCLEF medical image retrieval and classification tasks was organized in 2012, using a larger collection of over 300,000 images than in 2011 and mainly adding complexity.
Abstract: The ninth edition of the ImageCLEF medical image retrieval and classification tasks was organized in 2012. A subset of the open access collection of PubMed Central was used as the database in 2012, with a larger number of over 300,000 images than in 2011. As in previous years, there were three subtasks: modality classification, image-based and case-based retrieval. A new hierarchy for article figures was created for the modality classification task. Modality detection could be one of the most important filters to limit the search and focus the result sets. The goals of the image-based and case-based retrieval tasks were similar to 2011, mainly adding complexity. The number of groups submitting runs remained stable at 17, with the number of submitted runs remaining roughly the same at 202 (207 in 2011). Of these, 122 were image-based retrieval runs, 37 were case-based runs, while the remaining 43 were modality classification runs. Depending on the exact nature of the task, visual, textual or multimodal approaches performed better.

115 citations

Journal ArticleDOI
TL;DR: This paper utilized Wikipedia features (articles, categories, the Wikipedia category graph and redirection) in a system that combines this Wikipedia semantic information in its different components to quantify as accurately as possible the semantic relatedness between words.
Abstract: Measuring semantic relatedness is a critical task in many domains such as psychology, biology, linguistics, cognitive science and artificial intelligence. In this paper, we propose a novel system for computing semantic relatedness between words. Recent approaches have exploited Wikipedia as a huge semantic resource and have shown good performance. Therefore, we utilized Wikipedia features (articles, categories, the Wikipedia category graph and redirection) in a system that combines this Wikipedia semantic information in its different components. The approach is preceded by a pre-processing step that provides, for each category belonging to the Wikipedia category graph, a semantic description vector including the weights of stems extracted from the articles assigned to the target category. Next, for each candidate word, we collect its category set using an algorithm for category extraction from the Wikipedia category graph. Then, we compute the semantic relatedness degree using existing vector similarity metrics (Dice, Overlap and Cosine) and a newly proposed metric that performs as well as the cosine formula. The basic system is followed by a set of modules that exploit Wikipedia features to quantify as accurately as possible the semantic relatedness between words. We evaluate our measure on two tasks: comparison with human judgments using five datasets, and a specific application, ''solving choice problem''. Our system shows good performance and sometimes outperforms the ESA (Explicit Semantic Analysis) and TSA (Temporal Semantic Analysis) approaches.
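
The vector similarity metrics named in the abstract (Dice, Overlap and Cosine) can be sketched over stem-weight description vectors as follows; the two toy category vectors are hypothetical and only illustrate the comparison step, not the paper's new metric.

```python
# Dice, Overlap and Cosine similarity between two description vectors,
# here represented as dicts mapping stems to weights.
import math

def cosine(v1, v2):
    dot = sum(v1[s] * v2[s] for s in v1.keys() & v2.keys())
    n1 = math.sqrt(sum(w * w for w in v1.values()))
    n2 = math.sqrt(sum(w * w for w in v2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def dice(v1, v2):
    shared = v1.keys() & v2.keys()
    return 2 * len(shared) / (len(v1) + len(v2)) if v1 or v2 else 0.0

def overlap(v1, v2):
    shared = v1.keys() & v2.keys()
    return len(shared) / min(len(v1), len(v2)) if v1 and v2 else 0.0

if __name__ == "__main__":
    # Hypothetical description vectors of two Wikipedia categories.
    cat_car = {"vehicl": 0.9, "engin": 0.6, "road": 0.4}
    cat_bus = {"vehicl": 0.8, "passeng": 0.7, "road": 0.5}
    print(cosine(cat_car, cat_bus), dice(cat_car, cat_bus), overlap(cat_car, cat_bus))
```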

107 citations

18 Sep 2014
TL;DR: This paper presents the results of task 3 of the ShARe/CLEF eHealth Evaluation Lab 2014, which investigated the effectiveness of information retrieval systems in a monolingual and a multilingual context.

Abstract: This paper presents the results of task 3 of the ShARe/CLEF eHealth Evaluation Lab 2014. This evaluation lab focuses on improving access to medical information on the web. The task objective was to investigate the effect of using additional information such as a related discharge summary and external resources such as medical ontologies on the effectiveness of information retrieval systems, in a monolingual (Task 3a) and in a multilingual (Task 3b) context. The participants were allowed to submit up to seven runs for each language (English, Czech, French, German), one mandatory run using no additional information or external resources, and three each using or not using discharge summaries.

104 citations