Filtering Documents for Plagiarism Detection

Kensuke Baba

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)


Efficient methods are required for plagiarism detection. This paper proposes a fast and scalable method for detecting “copy and paste”-type plagiarism in documents. Detecting this type of plagiarism requires comprehensive matching of ordered word occurrences, which in turn demands either a long processing time or a large database. The author improved the scalability of an existing fast method based on the fast Fourier transform by applying frequency-domain filtering. He evaluated the effect of this improvement on the accuracy of the plagiarism detection method and achieved an effective trade-off between accuracy and the required database size.
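The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it names: counting ordered word matches at every alignment with a fast Fourier transform, then discarding high-frequency components to shrink what must be kept. The function names `fft` and `match_scores`, the per-word indicator encoding, and the simple low-pass cutoff `keep` are all assumptions made for illustration and are not taken from the paper.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def match_scores(text, pattern, keep=None):
    """Score every alignment of `pattern` (word list) against `text` (word list).

    With keep=None the score at shift s is the exact number of positions i
    where text[s + i] == pattern[i]. With an integer `keep`, only frequencies
    with index distance <= keep from zero are retained (an assumed low-pass
    filter), so the scores become approximations.
    """
    n, m = len(text), len(pattern)
    size = 1
    while size < n + m:  # zero-pad to a power of two
        size *= 2
    spectrum = [0j] * size
    for word in set(pattern):
        # Indicator sequences: 1.0 where the word occurs, 0.0 elsewhere.
        a = [1.0 if w == word else 0.0 for w in text] + [0.0] * (size - n)
        b = [1.0 if w == word else 0.0 for w in pattern] + [0.0] * (size - m)
        fa, fb = fft(a), fft(b)
        # Cross-correlation theorem: multiply by the complex conjugate.
        for k in range(size):
            spectrum[k] += fa[k] * fb[k].conjugate()
    if keep is not None:
        for k in range(size):
            if min(k, size - k) > keep:
                spectrum[k] = 0j  # drop high-frequency components
    scores = fft(spectrum, inverse=True)
    return [scores[s].real / size for s in range(n - m + 1)]


text = "the quick brown fox jumps over the lazy dog".split()
pattern = "over the lazy".split()
exact = [round(v) for v in match_scores(text, pattern)]
approx = match_scores(text, pattern, keep=4)
```

In this sketch, a smaller `keep` means fewer spectral coefficients to retain per comparison, at the cost of scores that only approximate the true match counts. That mirrors the trade-off between accuracy and database size described in the abstract, though the paper's actual filter and word encoding may differ.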

Original language: English
Title of host publication: Discovery Science - 21st International Conference, DS 2018, Proceedings
Editors: Michelangelo Ceci, Larisa Soldatova, Joaquin Vanschoren, George Papadopoulos
Publisher: Springer Verlag
Number of pages: 12
ISBN (Print): 9783030017705
Publication status: Published - 2018
Externally published: Yes
Event: 21st International Conference on Discovery Science, DS 2018 - Limassol, Cyprus
Duration: Oct 29 2018 → Oct 31 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11198 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 21st International Conference on Discovery Science, DS 2018


Keywords

  • Fast Fourier transform
  • Filtering
  • Plagiarism detection
  • Text processing
  • Vector representation of words

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

