Home
JPlag is a system that finds similarities among multiple sets of source code files. This way it can detect software plagiarism and collusion in software development. JPlag does not merely compare bytes of text but is aware of programming language syntax and program structure and hence is robust against many kinds of attempts to disguise similarities between plagiarized files. JPlag currently supports Java, C#, C, C++, Python 3, Scheme, and natural language text.
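To illustrate why a syntax-aware comparison is harder to fool than a plain text comparison, here is a minimal sketch in Python. It is not JPlag's actual algorithm or API. The helper names `token_kinds`, `token_similarity`, and `text_similarity` are hypothetical, and the use of Python's standard `tokenize` and `difflib` modules is an assumption made purely for illustration. The sketch replaces every identifier and literal with a generic token kind and drops comments, so renaming variables or adding comments does not change the token sequence.

```python
import difflib
import io
import keyword
import tokenize

# Token types that only carry comments or layout; ignored in this sketch.
_IGNORED = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def token_kinds(source: str) -> list[str]:
    """Reduce Python source to abstract token kinds (hypothetical helper)."""
    kinds = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _IGNORED:
            continue
        if tok.type == tokenize.NAME:
            # Keep keywords, but collapse every identifier to "ID".
            kinds.append(tok.string if keyword.iskeyword(tok.string) else "ID")
        elif tok.type == tokenize.NUMBER:
            kinds.append("NUM")
        elif tok.type == tokenize.STRING:
            kinds.append("STR")
        else:
            kinds.append(tok.string)  # operators and punctuation
    return kinds


def token_similarity(a: str, b: str) -> float:
    """Similarity of the two token-kind sequences, between 0.0 and 1.0."""
    matcher = difflib.SequenceMatcher(None, token_kinds(a), token_kinds(b), autojunk=False)
    return matcher.ratio()


def text_similarity(a: str, b: str) -> float:
    """Character-level similarity of the raw text, for contrast."""
    return difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()


original = """
def total(values):
    result = 0
    for v in values:
        result += v
    return result
"""

# Same program with renamed identifiers and an added comment.
disguised = """
# my own work
def add_up(numbers):
    acc = 0
    for n in numbers:
        acc += n
    return acc
"""

print("text similarity: ", text_similarity(original, disguised))
print("token similarity:", token_similarity(original, disguised))
```

In this sketch the two programs produce identical token sequences, so the token-level score is 1.0 while the character-level score is lower. A real tool has to cope with reordered or inserted statements as well, which this simple sequence comparison does not attempt.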
JPlag is typically used to detect, and thus discourage, the unauthorized copying of student exercise programs in programming education. In principle, however, it can also be used to detect stolen software parts among large amounts of source text, or modules that have been duplicated (and only slightly modified). JPlag has already played a part in several intellectual property cases, where it has been used successfully by expert witnesses.
To be clear: JPlag does not compare submissions against the internet! It is designed to find similarities among the student solutions themselves, which is usually sufficient for computer programs.
JPlag was originally developed in 1996 by Guido Malpohl and others at the chair of Prof. Walter Tichy at Karlsruhe Institute of Technology (KIT). It was first documented in a Tech Report in 2000 and later more formally in the Journal of Universal Computer Science. Since 2015, JPlag has been hosted here on GitHub.
Download the latest version of JPlag here. If you encounter bugs or other issues, please report them here.
If you depend on the legacy version of JPlag, see the legacy release v2.12.1 and the legacy branch. Note that the legacy CLI usage is slightly different.
The following features are only available in v3.0.0 and later:
- a Java API for third-party integration
- a simplified command-line interface
- support for Java files containing new language features
- improved colors for source code matches in the report
- a parallel comparison mode
The following features are currently supported only in the legacy version but will eventually be restored:
- result clustering
- comparison based on maximum similarity