scispace - formally typeset
Author

Laima Kamzola

Bio: Laima Kamzola is an academic researcher from Riga Technical University. The author has contributed to research in the topics of plagiarism detection and software systems, has an h-index of 1, and has co-authored 3 publications receiving 17 citations.

Papers
Journal ArticleDOI
TL;DR: The sobering results show that although some web-based text-matching systems can indeed help identify some plagiarized content, they clearly do not find all plagiarism and at times also identify non-plagiarized material as problematic.
Abstract: There is a general belief that software must be able to easily do things that humans find difficult. Since finding sources for plagiarism in a text is not an easy task, there is a widespread expectation that it must be simple for software to determine if a text is plagiarized or not. Software cannot determine plagiarism, but it can work as a support tool for identifying some text similarity that may constitute plagiarism. But how well do the various systems work? This paper reports on a collaborative test of 15 web-based text-matching systems that can be used when plagiarism is suspected. It was conducted by researchers from seven countries using test material in eight different languages, evaluating the effectiveness of the systems on single-source and multi-source documents. A usability examination was also performed. The sobering results show that although some systems can indeed help identify some plagiarized content, they clearly do not find all plagiarism and at times also identify non-plagiarized material as problematic.
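The distinction the abstract draws, that software detects text similarity rather than plagiarism, can be illustrated with a minimal sketch. The n-gram overlap approach below is an assumption for illustration only; the 15 tested systems use their own, far more elaborate matching and indexing methods.

```python
# Illustrative sketch: flag overlapping passages via shared word n-grams.
# This shows only what software can measure (similarity), which a human
# must then judge as plagiarism or not.

def word_ngrams(text: str, n: int = 5) -> set:
    """Return the set of word n-grams in a lowercased text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_ratio(submission: str, source: str, n: int = 5) -> float:
    """Fraction of the submission's n-grams that also occur in the source."""
    sub = word_ngrams(submission, n)
    if not sub:
        return 0.0
    return len(sub & word_ngrams(source, n)) / len(sub)

src = "software cannot determine plagiarism but it can work as a support tool"
sub = "software cannot determine plagiarism but it may help as a support tool"
print(f"{overlap_ratio(sub, src, n=3):.2f}")  # → 0.60
```

A high ratio on a verbatim copy is easy to obtain; paraphrased or translated text shares few exact n-grams, which is one reason the tested systems miss some plagiarism.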

53 citations

Posted Content
TL;DR: A collaborative test of 15 web-based text-matching systems that can be used when plagiarism is suspected was conducted by researchers from seven countries using test material in eight different languages, evaluating the effectiveness of the systems on single-source and multi-source documents as discussed by the authors.
Abstract: There is a general belief that software must be able to easily do things that humans find difficult. Since finding sources for plagiarism in a text is not an easy task, there is a widespread expectation that it must be simple for software to determine if a text is plagiarized or not. Software cannot determine plagiarism, but it can work as a support tool for identifying some text similarity that may constitute plagiarism. But how well do the various systems work? This paper reports on a collaborative test of 15 web-based text-matching systems that can be used when plagiarism is suspected. It was conducted by researchers from seven countries using test material in eight different languages, evaluating the effectiveness of the systems on single-source and multi-source documents. A usability examination was also performed. The sobering results show that although some systems can indeed help identify some plagiarized content, they clearly do not find all plagiarism and at times also identify non-plagiarized material as problematic.

1 citation

Journal ArticleDOI
TL;DR: This research tests existing text-matching software systems on a set of documents prepared in the Latvian language, comparing their plagiarism coverage on the prepared document corpus.
Abstract: There are many internationally developed text-matching software systems that help successfully identify potentially plagiarized content in English texts using both their internal databases and web resources. However, many other languages are not so widely spread, yet they are used daily to communicate, conduct research and acquire education. Each language has its peculiarities, so, in the context of finding content similarities, it is necessary to determine which systems are more suitable for a document set written in a specific language. The research focuses on testing existing text-matching software systems on a set of documents prepared in the Latvian language. The corpus includes documents containing verbatim plagiarism, paraphrasing, translation plagiarism and original text to test both false positive and false negative cases. In total, 16 different text-matching software systems are compared on plagiarism coverage using the prepared document corpus. The research presented is a part of an international initiative “Testing of Support Tools for Plagiarism Detection (TeSToP)” established under the European Network for Academic Integrity.
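The evaluation design the abstract describes, a labeled corpus used to surface both false positives and false negatives, can be sketched as follows. The filenames, labels, and flag set below are invented for illustration; they are not the actual TeSToP corpus or any real system's output.

```python
# Hypothetical sketch of evaluating one text-matching system against a
# labeled corpus: each document carries a ground-truth label (plagiarized
# or original), the system flags a subset, and we count its mistakes.

corpus = {
    "verbatim_1.txt": True,      # contains verbatim plagiarism
    "paraphrase_1.txt": True,    # paraphrased source text
    "translated_1.txt": True,    # translation plagiarism
    "original_1.txt": False,     # original text, should not be flagged
}
flagged = {"verbatim_1.txt", "original_1.txt"}  # invented system output

# Missed plagiarism (false negatives) and wrongly flagged originals
# (false positives), the two error types the corpus is built to expose.
false_neg = [d for d, plag in corpus.items() if plag and d not in flagged]
false_pos = [d for d, plag in corpus.items() if not plag and d in flagged]
print(false_neg)  # → ['paraphrase_1.txt', 'translated_1.txt']
print(false_pos)  # → ['original_1.txt']
```

Repeating such counts across all 16 systems gives the per-system plagiarism coverage the paper compares.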

1 citation


Cited by
Journal ArticleDOI
01 Oct 2021-Heliyon
TL;DR: In this article, search engine activity data on exam cheating in Spain was collected and analyzed for the five-year period between 2016 and 2020 inclusive, showing a significant increase in requests for information on cheating on online exams during the COVID-19 timeframe and the Spanish lockdown period.

24 citations

Journal ArticleDOI
TL;DR: The cases of plagiarism in non-English speaking countries carry a strong message for honest researchers: improve English writing skills and credit used sources by properly citing and referencing them.
Abstract: What constitutes plagiarism? What are the methods to detect plagiarism? How do “plagiarism detection tools” assist in detecting plagiarism? What is the difference between plagiarism and the similarity index? These are probably the most common questions regarding plagiarism that many research experts in scientific writing are faced with, but a definitive answer to them is less known to many. According to a report published in 2018, papers retracted for plagiarism have sharply increased over the last two decades, with higher rates in developing and non-English speaking countries.1 Several studies have reported similar findings, with Iran, China, India, Japan, Korea, Italy, Romania, Turkey, and France amongst the countries with the highest numbers of retractions due to plagiarism.1,2,3,4 A study reported that duplication of text, figures or tables without appropriate referencing accounted for 41.3% of post-2009 retractions of papers published from India.5 In Pakistan, the Journal of Pakistan Medical Association started a special section titled “Learning Research” and published a couple of papers on research writing skills, research integrity and scientific misconduct.6,7 However, the problem has not been adequately addressed, and specific issues about it remain unresolved and unclear. According to unpublished data on 1,679 students from four universities in Pakistan, 85.5% did not have a clear understanding of the difference between the similarity index and plagiarism. Smart et al.8 in their global survey of editors reported that around 63% experienced some plagiarized submissions, with Asian editors experiencing the highest levels of plagiarized/duplicated content.
In some papers, journals from non-English speaking countries have specifically discussed the cases of plagiarized submissions to them and have highlighted the drawbacks of relying on similarity-checking programs.9,10,11 The cases of plagiarism in non-English speaking countries carry a strong message for honest researchers: improve their English writing skills and credit used sources by properly citing and referencing them.12 Despite the aggregating literature on plagiarism from non-Anglophonic countries, the answers to the aforementioned questions remain unclear. In order to answer these questions, it is important to have a thorough understanding of plagiarism and bring clarity to the less known issues about it. Therefore, this paper aims to 1) define plagiarism and the growth in its prevalence as well as the literature on it; 2) explain the difference between similarity and plagiarism; 3) discuss the role of similarity-checking tools in detecting plagiarism and the flaws of relying completely on them; and 4) discuss the phenomenon called the Trojan citation. At the end, suggestions are provided for authors and editors from developing countries so that this issue may be collectively addressed.

15 citations

Journal ArticleDOI
TL;DR: A performance overview of various types of corpus-based models, especially deep learning (DL) models, with the task of paraphrase detection shows that DL models are very competitive with traditional state-of-the-art approaches and have potential that should be further developed.
Abstract: Paraphrase detection is important for a number of applications, including plagiarism detection, authorship attribution, question answering, text summarization, text mining in general, etc. In this paper, we give a performance overview of various types of corpus-based models, especially deep learning (DL) models, on the task of paraphrase detection. We report the results of eight models (LSI, TF-IDF, Word2Vec, Doc2Vec, GloVe, FastText, ELMo, and USE) evaluated on three different publicly available corpora: the Microsoft Research Paraphrase Corpus, the Clough and Stevenson corpus, and the Webis Crowd Paraphrase Corpus 2011. Through a great number of experiments, we decided on the most appropriate approaches for text pre-processing, hyper-parameters, sub-model selection where applicable (e.g., Skip-gram vs. CBOW), distance measures, and the semantic similarity/paraphrase detection threshold. Our findings and those of other researchers who have used deep learning models show that DL models are very competitive with traditional state-of-the-art approaches and have potential that should be further developed.
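The simplest corpus-based pipeline the survey covers, a TF-IDF sentence vector compared by cosine similarity against a tuned threshold, can be sketched in a few lines. This is an assumption-laden illustration, not the paper's implementation; the tiny corpus, the smoothed IDF formula, and the threshold value are all invented for the example.

```python
# Illustrative TF-IDF + cosine-similarity paraphrase check.
import math
from collections import Counter

def tfidf_vectors(docs):
    """TF-IDF vectors over a tiny corpus (smoothed IDF, for illustration)."""
    tokenized = [d.lower().split() for d in docs]
    df = Counter(w for toks in tokenized for w in set(toks))
    n = len(docs)
    vecs = []
    for toks in tokenized:
        tf = Counter(toks)
        vecs.append({w: (c / len(toks)) * math.log((1 + n) / (1 + df[w]))
                     for w, c in tf.items()})
    return vecs

def cosine(a, b):
    """Cosine similarity of two sparse (dict) vectors."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

corpus = [
    "the system flags the passage as a likely paraphrase",
    "the passage is flagged by the system as a probable paraphrase",
    "deep learning models learn sentence embeddings instead",
]
v = tfidf_vectors(corpus)
THRESHOLD = 0.2  # in practice tuned on held-out labeled pairs
print(cosine(v[0], v[1]) > THRESHOLD, cosine(v[0], v[2]) > THRESHOLD)  # → True False
```

Deep models in the survey replace the TF-IDF step with learned embeddings (Word2Vec, ELMo, USE, etc.) but keep the same compare-against-a-threshold decision, which is why tuning the threshold matters across all eight models.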

14 citations

Journal ArticleDOI
TL;DR: Medical researchers and authors may improve their writing skills and avoid the same errors by consulting the list of retractions due to plagiarism which are tracked on the PubMed platform and discussed on the Retraction Watch blog.
Abstract: Plagiarism is an ethical misconduct affecting the quality, readability, and trustworthiness of scholarly publications. Improving researcher awareness of plagiarism of words, ideas, and graphics is essential for avoiding unacceptable writing practices. Global editorial associations have publicized their statements on strategies to clean the literature from redundant, stolen, and misleading information. Consulting related documents is advisable for upgrading author instructions and warning plagiarists of academic and other consequences of the unethical conduct. A lack of creative thinking and poor academic English skills are believed to compound most instances of redundant and “copy-and-paste” writing. Plagiarism detection software largely relies on reporting text similarities. However, manual checks are required to reveal inappropriate referencing, copyright violations, and substandard English writing. Medical researchers and authors may improve their writing skills and avoid the same errors by consulting the lists of retractions due to plagiarism, which are tracked on the PubMed platform and discussed on the Retraction Watch blog.

13 citations

Journal ArticleDOI
TL;DR: In this article, a systematic review examines the research on online assessment security involving studies completed between 2016 and 2021, and proposes an Academic Dishonesty Mitigation Plan (ADMP) that encompasses strategies from both prevention and detection approaches for effective security and integrity of online assessments.

9 citations