Open Access

Using WordNet-based semantic similarity measurement in External Plagiarism Detection - Notebook for PAN at CLEF 2011

TLDR
It is shown that there exists a direct correlation between the obfuscation degree (method) and the achieved performance, thus defining the baseline for further studies.
Abstract
Continuing our previous work started at PAN 2009 and PAN 2010 [7], we considered further research options based on the baseline achieved by the best-performing algorithms. Potthast et al. [4] presented a sliced view of the submitted approaches, showing their performance on specific corpus metrics: external/intrinsic detection, obfuscation strategy (none, artificial high/low, simulated, translated), topic match, case length, and document length, thus defining the baseline for further studies. A brief analysis of these results [1,3] shows that there exists a direct correlation between the obfuscation degree (method) and the achieved performance.


Citations
Proceedings Article

Overview of the 2nd International Competition on Plagiarism Detection

TL;DR: In PAN'10, 18 plagiarism detectors were evaluated in detail, highlighting several important aspects of plagiarism detection, such as obfuscation, intrinsic vs. external plagiarism, and plagiarism case length.
Journal Article

Unmasking text plagiarism using syntactic-semantic based natural language processing techniques

TL;DR: The proposed approach presented considerable improvement in comparison with the top-ranked systems of the respective years and reflected the supremacy of deeper linguistic features for identifying manually plagiarized data.
Proceedings Article

On the mono- and cross-language detection of text reuse and plagiarism

TL;DR: The aim of this PhD thesis is to address three of the main problems in the development of better models for automatic plagiarism detection: the adequate identification of good potential sources for a given suspicious text, the detection of plagiarism despite modifications, and the generation of standard collections of cases of plagiarism and text reuse.
Proceedings Article

Using Natural Language Processing Techniques and Fuzzy-Semantic Similarity for Automatic External Plagiarism Detection

TL;DR: The paper explores different preprocessing methods based on Natural Language Processing (NLP) techniques, investigates fuzzy-semantic similarity measures for document comparison, and compares the performance of the different methods.

Technologies for Reusing Text from the Web

TL;DR: This thesis presents a comprehensive overview of the different ways in which text and language are reused today and of how information retrieval technologies can be applied in this respect, and it introduces technologies that solve three retrieval tasks based on language reuse.
References

Overview of the 1st international competition on plagiarism detection

TL;DR: This paper overviews 18 plagiarism detectors that have been developed and evaluated within PAN'10, highlighting several important aspects of plagiarism detection, such as obfuscation, intrinsic vs. external plagiarism, and plagiarism case length.

External and Intrinsic Plagiarism Detection using a Cross-Lingual Retrieval and Segmentation System Lab Report for PAN at CLEF 2010

TL;DR: This work presents a hybrid system that performs plagiarism detection for translated and non-translated externally as well as intrinsically plagiarized document passages, using heuristic post-processing to arrive at the final detection results.

Encoplot - Performance in the Second International Plagiarism Detection Challenge - Lab Report for PAN at CLEF 2010

TL;DR: This year's submission was generated by the same method, Encoplot, that was developed for last year's competition, with a single improvement.

Exploring Fingerprinting as External Plagiarism Detection Method - Lab Report for PAN at CLEF 2010

TL;DR: This paper outlines the main approach and the general design of the plagiarism detection prototype application that has been developed to take part in the 2nd International Plagiarism Detection Competition.