
Showing papers by "Bela Gipp published in 2013"


Proceedings ArticleDOI
12 Oct 2013
TL;DR: It is found that results of offline and online evaluations often contradict each other, and it is concluded that offline evaluations may be inappropriate for evaluating research paper recommender systems in many settings.
Abstract: Offline evaluations are the most common evaluation method for research paper recommender systems. However, no thorough discussion on the appropriateness of offline evaluations has taken place, despite some voiced criticism. We conducted a study in which we evaluated various recommendation approaches with both offline and online evaluations. We found that results of offline and online evaluations often contradict each other. We discuss this finding in detail and conclude that offline evaluations may be inappropriate for evaluating research paper recommender systems in many settings.
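The contrast the abstract describes can be made concrete with the two metric families typically involved: an offline metric computed against a held-out relevance set, and an online metric computed from live user interactions. This is a minimal sketch; the function names, item IDs, and numbers are illustrative and not taken from the paper.

```python
def offline_precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations found in a held-out relevance set."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

def online_ctr(clicks, impressions):
    """Click-through rate observed in a live (online) evaluation."""
    return clicks / impressions if impressions else 0.0

# Two hypothetical recommenders evaluated both ways: the offline winner
# need not be the online winner, which is the kind of contradiction
# between evaluation methods that the study reports.
offline_a = offline_precision_at_k(["p1", "p2", "p3"], {"p1", "p2"}, k=3)  # 2/3
offline_b = offline_precision_at_k(["p4", "p5", "p6"], {"p4"}, k=3)        # 1/3
online_a = online_ctr(clicks=30, impressions=1000)   # 0.03
online_b = online_ctr(clicks=80, impressions=1000)   # 0.08
```

Here recommender A wins offline while B wins online, so a ranking of approaches produced by one method does not transfer to the other.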

136 citations


Proceedings ArticleDOI
12 Oct 2013
TL;DR: It is currently not possible to determine which approaches for academic literature recommendation are the most promising, yet there is little value in the existence of more than 80 approaches if the best-performing ones are unknown.
Abstract: Over 80 approaches for academic literature recommendation exist today. The approaches were introduced and evaluated in more than 170 research articles, as well as patents, presentations and blogs. We reviewed these approaches and found most evaluations to contain major shortcomings. Of the approaches proposed, 21% were not evaluated. Among the evaluated approaches, 19% were not evaluated against a baseline. Of the user studies performed, 60% had 15 or fewer participants or did not report on the number of participants. Information on runtime and coverage was rarely provided. Due to these and several other shortcomings described in this paper, we conclude that it is currently not possible to determine which recommendation approaches for academic literature are the most promising. However, there is little value in the existence of more than 80 approaches if the best performing approaches are unknown.

131 citations


Journal ArticleDOI
TL;DR: In the future, plagiarism detection systems may benefit from combining traditional character-based detection methods with emerging detection approaches, including intrinsic, cross-lingual and citation-based plagiarism detection.
Abstract: The problem of academic plagiarism has been present for centuries. Yet, the widespread dissemination of information technology, including the internet, made plagiarising much easier. Consequently, methods and systems aiding in the detection of plagiarism have attracted much research within the last two decades. Researchers proposed a variety of solutions, which we will review comprehensively in this article. Available detection systems use sophisticated and highly efficient character-based text comparisons, which can reliably identify verbatim and moderately disguised copies. Automatically detecting more strongly disguised plagiarism, such as paraphrases, translations or idea plagiarism, is the focus of current research. Proposed approaches for this task include intrinsic, cross-lingual and citation-based plagiarism detection. Each method offers unique strengths and weaknesses; however, none is currently mature enough for practical use. In the future, plagiarism detection systems may benefit from combining traditional character-based detection methods with these emerging detection approaches.
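The character-based comparison the abstract refers to can be sketched as fingerprinting documents with overlapping character n-grams and measuring set overlap. This is a minimal illustration of the general idea, not any specific system's implementation; the n-gram length and example sentences are arbitrary choices.

```python
def char_ngrams(text, n=8):
    """Set of overlapping character n-grams (a common fingerprinting unit)."""
    text = " ".join(text.lower().split())  # normalize case and whitespace
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def overlap_score(doc_a, doc_b, n=8):
    """Jaccard similarity of the two documents' character n-gram sets."""
    a, b = char_ngrams(doc_a, n), char_ngrams(doc_b, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

original = "Plagiarism detection has attracted much research in recent decades."
verbatim = "Plagiarism detection has attracted much research in recent decades."
paraphrase = "Scholars have studied how to spot copied academic writing for years."

# Verbatim copies score maximally, while a paraphrase shares almost no
# character n-grams -- illustrating why disguised plagiarism evades
# character-based methods and motivates the emerging approaches named above.
assert overlap_score(original, verbatim) == 1.0
assert overlap_score(original, paraphrase) < 0.1
```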

99 citations


Proceedings ArticleDOI
22 Jul 2013
TL;DR: In the evaluation using papers from the arXiv collection, GROBID delivered the best results, followed by Mendeley Desktop; SciPlore Xtract, PDFMeat, and SVMHeaderParse also delivered good results depending on the metadata type to be extracted.
Abstract: This paper evaluates the performance of tools for the extraction of metadata from scientific articles. Accurate metadata extraction is an important task for automating the management of digital libraries. This comparative study is a guide for developers looking to integrate the most suitable and effective metadata extraction tool into their software. We shed light on the strengths and weaknesses of seven tools in common use. In our evaluation using papers from the arXiv collection, GROBID delivered the best results, followed by Mendeley Desktop. SciPlore Xtract, PDFMeat, and SVMHeaderParse also delivered good results depending on the metadata type to be extracted.
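An evaluation of this kind typically compares each tool's extracted metadata against a gold standard, field by field, after light normalization so that superficial differences in case or spacing do not count as errors. The sketch below illustrates that scheme under those assumptions; the records, field names, and matching rule are illustrative, not the paper's actual protocol.

```python
def normalize(value):
    """Case-fold and collapse whitespace before comparing field values."""
    return " ".join(value.lower().split())

def field_accuracy(extracted, gold, field):
    """Fraction of documents whose extracted value for `field` matches the gold value."""
    matches = sum(
        1 for ext, ref in zip(extracted, gold)
        if normalize(ext.get(field, "")) == normalize(ref[field])
    )
    return matches / len(gold)

gold = [
    {"title": "A Study of Citations", "author": "A. Author"},
    {"title": "Metadata at Scale", "author": "B. Writer"},
]
extracted = [
    {"title": "a study of  citations", "author": "A. Author"},  # matches after normalization
    {"title": "Metadata at Scale", "author": ""},               # author field missed
]

assert field_accuracy(extracted, gold, "title") == 1.0
assert field_accuracy(extracted, gold, "author") == 0.5
```

Scoring per field is what allows a conclusion like the one above: a tool can lead overall while another is stronger for a particular metadata type.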

65 citations


Proceedings ArticleDOI
28 Jul 2013
TL;DR: State-of-the-art plagiarism detection approaches capably identify copy & paste and, to some extent, slightly modified plagiarism, but cannot reliably identify strongly disguised forms of plagiarism, including paraphrases, translated plagiarism, and idea plagiarism.
Limitations of Plagiarism Detection Systems
Abstract: State-of-the-art plagiarism detection approaches capably identify copy & paste and, to some extent, slightly modified plagiarism. However, they cannot reliably identify strongly disguised forms of plagiarism, including paraphrases, translated plagiarism, and idea plagiarism, which are the forms of plagiarism more commonly found in scientific texts. This weakness of current systems results in a large fraction of today’s scientific plagiarism going undetected.

27 citations