Piyush-0120/PlagiarismDetector

Plagiarism Detector

The goal of this project is to help users detect plagiarism quickly and effectively. Plagiarism Detector checks for plagiarism between two documents and also reports the word count, line count, paragraph count and spelling errors. All of this is presented through a robust, user-friendly GUI. String matching uses the KMP (Knuth-Morris-Pratt) algorithm.

Time Complexity of KMP

  • Preprocessing time : O(m)
  • Matching time : O(n)

The total time complexity of the KMP algorithm is therefore O(m+n), which is linear, where m is the length of the pattern string and n is the length of the main string.
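The two phases above can be illustrated with a minimal Python sketch of textbook KMP. This is not the project's own code; the function names `build_prefix_table` and `kmp_search` are assumptions chosen for illustration.

```python
def build_prefix_table(pattern):
    """Preprocessing, O(m): for each position, the length of the longest
    proper prefix of the pattern that is also a suffix ending there."""
    table = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        # On a mismatch, fall back to the next shorter candidate prefix.
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_search(text, pattern):
    """Matching, O(n): return the start index of every occurrence
    of pattern in text (overlapping occurrences included)."""
    if not pattern:
        return []
    table = build_prefix_table(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = table[k - 1]  # keep going to find overlapping matches
    return matches
```

For example, `kmp_search("abababca", "aba")` finds the pattern at indices 0 and 2. The text index never moves backwards, which is what keeps the matching phase linear in n.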

Flow of execution of Plagiarism Detector

image

  • Input document:
    • An input document is supplied for checking. The document can be any journal article or paper.
  • Keyword extraction:
    • Keywords are extracted from the document so that they can be compared with the other document.
  • Algorithm:
    • The KMP algorithm is used to find suspicious (matching) content in the document.
  • Discovery of similarity:
    • Once the algorithm has run, any similarity between the documents can be identified.
  • Plagiarism percent:
    • The plagiarism percentage is calculated with the formula: 2 * (count of similar words) * 100 / (parsed words of doc1 + parsed words of doc2) %
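The percentage formula above can be written as a small Python function. This is only a sketch of the arithmetic: the function and parameter names are hypothetical, and the zero-word guard is an added assumption, since the source does not say how empty documents are handled.

```python
def plagiarism_percent(similar_words, words_doc1, words_doc2):
    """Apply the README formula:
    2 * similar * 100 / (parsed words of doc1 + parsed words of doc2)."""
    total = words_doc1 + words_doc2
    if total == 0:
        # Assumption: two empty documents are reported as 0% similar.
        return 0.0
    return 2 * similar_words * 100 / total
```

With 30 similar words and two documents of 100 parsed words each, this gives 2 * 30 * 100 / 200 = 30.0. The factor of 2 means two identical documents score 100%.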

Flow of execution of Special Features

image

  • Input document:
    • An input document is supplied for analysis.
  • Preprocessing:
    • The document is broken down into paragraphs, sentences and words.
  • Discovery of spelling errors:
    • The TextBlob library is used to find misspelled words.
  • Output:
    • The word, paragraph and sentence counts are displayed.
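The preprocessing and counting steps could look roughly like the sketch below. It uses only the standard library and simple regular expressions, so the splitting rules (blank lines separate paragraphs; `.`, `!` or `?` followed by whitespace ends a sentence) are assumptions rather than the project's actual rules. The TextBlob spell-check step is left out here.

```python
import re


def document_stats(text):
    """Return paragraph, sentence and word counts for a text.

    Hypothetical helper for illustration only."""
    # Paragraphs: blocks separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Sentences: split after ., ! or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Words: runs of letters, digits and apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    return {
        "paragraphs": len(paragraphs),
        "sentences": len(sentences),
        "words": len(words),
    }
```

This naive sentence splitter will miscount abbreviations such as "e.g."; a tokenizer from an NLP library would be more accurate.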

How it looks

image

image

Contribution

I would like to thank the collaborators.
