No clustering information persisted in the JSON files #609

Closed
tsaglam opened this issue Aug 23, 2022 · 10 comments
Labels: bug, minor, report-viewer


@tsaglam
Member

tsaglam commented Aug 23, 2022

As described here, the persistence mechanism of the clustering results seems to have some issues. The overview.json file contains no clusters. I experienced the issue with multiple frontends.

JPlag version: 1c401de
Input: PartialPlagiarism.zip
Result: result.zip (overview.json contains no clusters)
CLI parameters tried:
java -jar jplag.jar ./PartialPlagiarism
and
java -jar jplag.jar ./PartialPlagiarism --cluster-alg AGGLOMERATIVE --cluster-metric AVG

@nestabentum am I missing something?

@tsaglam added the bug, minor, and report-viewer labels Aug 23, 2022
@tsaglam added this to the v4.0.0 milestone Aug 23, 2022
@nestabentum
Contributor

Looks like this is a bug in the Clustering CLI:

See CLI.java line 144. When --cluster-skip is not present, enabled is set to false and clustering is skipped. It seems this behaviour should actually be inverted, so that clustering is skipped only when the flag is given.
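
The inversion can be sketched as follows. This is only a minimal illustration, not the actual code from CLI.java: the class and method names (ClusterFlagSketch, skipRequested, enabledBuggy, enabledFixed) are hypothetical, and only the --cluster-skip flag name is taken from this thread.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the flag inversion; not the actual JPlag CLI code. */
public class ClusterFlagSketch {

    /** Returns true if the --cluster-skip flag is among the arguments. */
    static boolean skipRequested(List<String> args) {
        return args.contains("--cluster-skip");
    }

    /** Behaviour as reported: clustering only runs when the skip flag is present. */
    static boolean enabledBuggy(List<String> args) {
        return skipRequested(args);
    }

    /** Intended behaviour: clustering runs unless the skip flag is present. */
    static boolean enabledFixed(List<String> args) {
        return !skipRequested(args);
    }

    public static void main(String[] args) {
        List<String> plain = Arrays.asList("./PartialPlagiarism");
        List<String> skip = Arrays.asList("./PartialPlagiarism", "--cluster-skip");

        System.out.println("buggy, no flag:   enabled=" + enabledBuggy(plain));
        System.out.println("fixed, no flag:   enabled=" + enabledFixed(plain));
        System.out.println("fixed, with flag: enabled=" + enabledFixed(skip));
    }
}
```

With the buggy variant, a plain run without any clustering options ends up with clustering disabled, which would match the empty overview.json reported above.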

@nestabentum
Contributor

But there seems to be a different bug where the clustering information does not show up when the icon is clicked; instead, the comparison view is shown. I'm looking into it.

@tsaglam
Member Author

tsaglam commented Aug 23, 2022

Thanks for looking into it. Should I fix the CLI issue, or do you want to fix both since you are on it anyway?

@nestabentum
Contributor

Feel free to fix it, as I am short on time currently.

@tsaglam
Member Author

tsaglam commented Aug 23, 2022

@nestabentum I fear there are still some issues with the report viewer (maybe introduced by the last fix?). I am going to investigate them a bit more tomorrow and give some details. However, this is not really time-critical, so we can discuss them in our next meeting and you can fix them when it suits you.

@tsaglam moved this to Todo in v4.0.0 Release Aug 24, 2022
@tsaglam
Member Author

tsaglam commented Aug 25, 2022

Input: PartialPlagiarism.zip

@nestabentum, here are some details (Win 10, Firefox):

Currently, the report viewer crashes for me when I pass the following results to the viewer and click on a submission pair:
result-nocluster.zip (CLI: java -jar jplag.jar ./PartialPlagiarism)

Same here, the only difference is that this result actually contains a cluster:
result-cluster.zip (CLI: java -jar jplag.jar ./PartialPlagiarism --cluster-alg AGGLOMERATIVE)

@nestabentum
Contributor

You're referring to the publicly deployed version of the report viewer, right? I tried to reproduce the crash locally but I didn't succeed. The deployed version does crash though. I'll look into it.

@tsaglam
Member Author

tsaglam commented Aug 25, 2022

Yes, the deployed version from this repo's master branch.

@tsaglam closed this as completed Sep 7, 2022
Repository owner moved this from Todo to Done in v4.0.0 Release Sep 7, 2022
@CristianCantoro

I am still getting the same behavior with release v4.1.0, i.e. no cluster information in the overview.json file.

Calling JPlag without options yields no clusters:

$ java -jar jplag-4.1.0-jar-with-dependencies.jar ./PartialPlagiarism
2022-12-24-02:04:32_182 [main] [INFO] LanguageLoader - Available languages: '[C/C++ Scanner [basic markup], C# 6 Parser, EMF metamodel, Go Parser, Javac based AST plugin, Kotlin Parser, Python3 Parser, R Parser, Rust Language Module, Scala parser, SchemeR4RS Parser [basic markup], Swift Parser, Text Parser (naive)]'
2022-12-24-02:04:32_519 [main] [INFO] ParallelComparisonStrategy - Comparing B-E: 0.0
2022-12-24-02:04:32_521 [ForkJoinPool.commonPool-worker-6] [INFO] ParallelComparisonStrategy - Comparing A-E: 0.0
2022-12-24-02:04:32_521 [ForkJoinPool.commonPool-worker-6] [INFO] ParallelComparisonStrategy - Comparing D-E: 0.0
2022-12-24-02:04:32_523 [ForkJoinPool.commonPool-worker-2] [INFO] ParallelComparisonStrategy - Comparing C-E: 0.0
2022-12-24-02:04:32_527 [ForkJoinPool.commonPool-worker-5] [INFO] ParallelComparisonStrategy - Comparing B-C: 0.2457689477557027
2022-12-24-02:04:32_527 [ForkJoinPool.commonPool-worker-4] [INFO] ParallelComparisonStrategy - Comparing B-D: 0.2827868852459016
2022-12-24-02:04:32_528 [ForkJoinPool.commonPool-worker-7] [INFO] ParallelComparisonStrategy - Comparing A-B: 0.2457689477557027
2022-12-24-02:04:32_532 [ForkJoinPool.commonPool-worker-1] [INFO] ParallelComparisonStrategy - Comparing A-D: 0.7787255393878575
2022-12-24-02:04:32_532 [ForkJoinPool.commonPool-worker-3] [INFO] ParallelComparisonStrategy - Comparing A-C: 0.9966329966329966
2022-12-24-02:04:32_532 [main] [INFO] ParallelComparisonStrategy - Comparing C-D: 0.7787255393878575
2022-12-24-02:04:32_535 [main] [INFO] JPlag - Total time for comparing submissions: 0 sec
2022-12-24-02:04:32_536 [main] [INFO] ClusteringFactory - Calculating clusters via spectral clustering with cumulative distribution function pre-processing...
2022-12-24-02:04:32_659 [main] [INFO] ClusteringFactory - 0 clusters were found:
2022-12-24-02:04:32_749 [main] [INFO] ReportObjectFactory - Start writing report files...
2022-12-24-02:04:32_844 [main] [INFO] ReportObjectFactory - Zipping report files...
2022-12-24-02:04:32_866 [main] [INFO] DirectoryManager - Successfully zipped report files: result.zip
2022-12-24-02:04:32_866 [main] [INFO] DirectoryManager - Display the results with the report viewer at https://jplag.github.io/JPlag/

and no clusters appear in the overview file: result-noclusters.zip

With --cluster-alg AGGLOMERATIVE I get clusters:

$ java -jar jplag-4.1.0-jar-with-dependencies.jar --cluster-alg AGGLOMERATIVE ./PartialPlagiarism
2022-12-24-02:05:24_321 [main] [INFO] LanguageLoader - Available languages: '[C/C++ Scanner [basic markup], C# 6 Parser, EMF metamodel, Go Parser, Javac based AST plugin, Kotlin Parser, Python3 Parser, R Parser, Rust Language Module, Scala parser, SchemeR4RS Parser [basic markup], Swift Parser, Text Parser (naive)]'
2022-12-24-02:05:24_686 [main] [INFO] ParallelComparisonStrategy - Comparing B-E: 0.0
2022-12-24-02:05:24_688 [ForkJoinPool.commonPool-worker-6] [INFO] ParallelComparisonStrategy - Comparing A-E: 0.0
2022-12-24-02:05:24_688 [ForkJoinPool.commonPool-worker-6] [INFO] ParallelComparisonStrategy - Comparing D-E: 0.0
2022-12-24-02:05:24_693 [ForkJoinPool.commonPool-worker-7] [INFO] ParallelComparisonStrategy - Comparing A-B: 0.2457689477557027
2022-12-24-02:05:24_698 [ForkJoinPool.commonPool-worker-5] [INFO] ParallelComparisonStrategy - Comparing B-C: 0.2457689477557027
2022-12-24-02:05:24_699 [ForkJoinPool.commonPool-worker-2] [INFO] ParallelComparisonStrategy - Comparing C-E: 0.0
2022-12-24-02:05:24_700 [ForkJoinPool.commonPool-worker-4] [INFO] ParallelComparisonStrategy - Comparing B-D: 0.2827868852459016
2022-12-24-02:05:24_701 [ForkJoinPool.commonPool-worker-3] [INFO] ParallelComparisonStrategy - Comparing A-C: 0.9966329966329966
2022-12-24-02:05:24_702 [main] [INFO] ParallelComparisonStrategy - Comparing C-D: 0.7787255393878575
2022-12-24-02:05:24_703 [ForkJoinPool.commonPool-worker-1] [INFO] ParallelComparisonStrategy - Comparing A-D: 0.7787255393878575
2022-12-24-02:05:24_705 [main] [INFO] JPlag - Total time for comparing submissions: 0 sec
2022-12-24-02:05:24_706 [main] [INFO] ClusteringFactory - Calculating clusters via agglomerative clustering with cumulative distribution function pre-processing...
2022-12-24-02:05:24_766 [main] [INFO] ClusteringFactory - 1 clusters were found:
2022-12-24-02:05:24_766 [main] [INFO] ClusteringFactory -  cluster strength: 1.1102230246251565E-16, avg similarity: 0.947469537553443%, members: [A, C, B, D]
2022-12-24-02:05:24_861 [main] [INFO] ReportObjectFactory - Start writing report files...
2022-12-24-02:05:24_966 [main] [INFO] ReportObjectFactory - Zipping report files...
2022-12-24-02:05:24_992 [main] [INFO] DirectoryManager - Successfully zipped report files: result.zip
2022-12-24-02:05:24_992 [main] [INFO] DirectoryManager - Display the results with the report viewer at https://jplag.github.io/JPlag/

but still no clusters appear in the overview file: result-clusters.zip
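
To check an extracted overview.json without opening it by hand, a crude check along these lines might help. It is only a sketch under assumptions: it assumes the file stores clusters in a "clusters" array (the field name is an assumption, not confirmed in this thread), it uses a regular expression rather than a real JSON parser, and the class name OverviewClusterCheck is made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;

/** Crude, hypothetical check for cluster entries in an overview.json file. */
public class OverviewClusterCheck {

    // Assumption: clusters are stored in an array under the key "clusters".
    private static final Pattern ANY_CLUSTERS =
            Pattern.compile("\"clusters\"\\s*:\\s*\\[");
    private static final Pattern EMPTY_CLUSTERS =
            Pattern.compile("\"clusters\"\\s*:\\s*\\[\\s*\\]");

    /** Returns true if a "clusters" array exists and is not empty. */
    static boolean hasClusters(String json) {
        return ANY_CLUSTERS.matcher(json).find()
                && !EMPTY_CLUSTERS.matcher(json).find();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java OverviewClusterCheck <path to overview.json>");
            return;
        }
        // Read the whole file; overview files are small enough for this.
        String json = Files.readString(Path.of(args[0]));
        System.out.println(hasClusters(json)
                ? "overview.json contains cluster entries"
                : "overview.json contains no cluster entries");
    }
}
```

A regular expression is used here only to keep the sketch free of dependencies; a JSON library would be the more robust choice.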

@tsaglam
Member Author

tsaglam commented Dec 26, 2022

Clusters in the JSON export are still disabled in the v4.1.0 release. However, this is on purpose, as they can break the report viewer. We are already working on that for the next release.
As far as I know, SPECTRAL clustering, which is the default option, can sometimes yield no clusters at all. Maybe we can address this in the future.
