Duplicate check during import marks articles from collection as possible duplicates #8885

Closed
2 tasks done
jorgman1 opened this issue Jun 2, 2022 · 10 comments · Fixed by #11217
Labels
bug (Confirmed bugs or reports that are very likely to be bugs), duplicateFinder

Comments

@jorgman1

jorgman1 commented Jun 2, 2022

JabRef version

Latest development branch build (please note build date below)

Operating system

GNU / Linux

Details on version and operating system

JabRef 5.7--2022-06-01--943489e || Linux 5.10.0-14-amd64 amd64 || Java 18.0.1 || JavaFX unknown

Checked with the latest development build

  • I made a backup of my libraries before testing the latest development version.
  • I have tested the latest development version and the problem persists

Steps to reproduce the behaviour

  1. Copy two InCollection entries (articles from the same book) into a new library.
  2. The second copied entry is marked as a possible duplicate, although its author, title, and citation key differ from those of the first entry.

Appendix

Import_erroneous_duplicates

@ThiloteE added the bug and duplicateFinder labels on Jun 2, 2022
@claell
Contributor

claell commented Jun 13, 2022

I think that I experienced a similar problem in the past.

@koppor moved this to Normal priority in Prioritization on Nov 10, 2022
@Siedlerchr
Member

Can you please test with the latest development version? We recently improved the merging and duplicate detection handling.

Please try a development build from https://builds.jabref.org/main and report back whether it works for you. Remember to make a backup of your library before trying out this version.

@jorgman1
Author

I still have the problem with the latest development version.

JabRef 5.10--2023-04-16--d47ed31
Linux 4.4.0-53-generic amd64
Java 19.0.2
JavaFX 20+19

Screenshot
20230417_Screenshot

@Siedlerchr
Member

Would you mind sharing the bib entries for testing?

@jorgman1
Author

jorgman1 commented Apr 17, 2023

I think I found a mistake in my database. I only had the book's ISBN in the entries. Hence, all entries had the same ISBN. Removing the ISBN from the entries solves the problem. Moreover, after adding a DOI to each entry, the entries are no longer recognized as duplicates. This only holds if both entries have DOIs: if one entry has a DOI and the other does not, they are still recognized as duplicates (due to the ISBN?).

Hence, I think it is my mistake. I don't think it is correct to have all entries with the same ISBN, which refers to the whole book.

These are five entries:
Collection_test.txt
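
The three cases reported above (same ISBN and no DOI, distinct DOIs on both entries, DOI on only one entry) are consistent with an identifier-first comparison in which a shared ISBN alone is enough to flag two entries. The following is a minimal Java sketch of that hypothesis, not JabRef's actual duplicate-check code. The class `IdentifierDuplicateSketch`, the `Entry` type, and the method `looksLikeDuplicate` are invented for illustration, and any ISBN or DOI values passed to it would be placeholders.

```java
// Hypothetical model of an identifier-based duplicate check.
// This is NOT JabRef's implementation; all names are illustrative.
final class IdentifierDuplicateSketch {

    /** Minimal stand-in for a bibliography entry; null means "field not set". */
    static final class Entry {
        final String title;
        final String isbn;
        final String doi;

        Entry(String title, String isbn, String doi) {
            this.title = title;
            this.isbn = isbn;
            this.doi = doi;
        }
    }

    /**
     * Assumed rule: if both entries carry a DOI, the DOI comparison decides.
     * Otherwise, a shared ISBN is treated as sufficient evidence of a duplicate.
     */
    static boolean looksLikeDuplicate(Entry a, Entry b) {
        if (a.doi != null && b.doi != null) {
            return a.doi.equalsIgnoreCase(b.doi);
        }
        if (a.isbn != null && b.isbn != null) {
            return a.isbn.equals(b.isbn);
        }
        // A real checker would fall back to comparing title, author, etc. here.
        return false;
    }
}
```

Under this assumed rule, two chapters that carry only the book's ISBN are flagged, two chapters with different DOIs are not, and a chapter with a DOI compared against one without is flagged again because the check falls through to the ISBN.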

@Siedlerchr
Member

You could handle such cases with entry links: https://docs.jabref.org/advanced/entryeditor/entrylinks
For example, you keep one book entry and make the others inbook entries. You then put the ISBN only in the book entry, and it will show up in the references for each inbook entry as well.
That way you avoid duplicate information.

@koppor
Member

koppor commented Apr 18, 2023

> I think I found a mistake in my database. I only had the book's ISBN in the entries.

Thank you for the hint. IMHO this is a bug in the duplicate detection. I "unzipped" the explanation at #9769 (comment).

> Hence, I think it is my mistake. I don't think it is correct to have all entries with the same ISBN, which refers to the whole book.

This is what ISBNs are used for. Depending on the BibTeX style required by the publisher, either the ISBN or the DOI is printed, or possibly both. If only the ISBN is printed, that information is IMHO useful for finding the entry.

@ThiloteE
Member

ThiloteE commented Apr 16, 2024

@AbdAlRahmanGad Would it be theoretically and technically possible to only remove the ISBN duplicate detection for InBook or InCollection in your pull-request?

@AbdAlRahmanGad
Contributor

> @AbdAlRahmanGad Would it be theoretically and technically possible to only remove the ISBN duplicate detection for InBook or InCollection in your pull-request?

I think it would be possible.

@koppor
Member

koppor commented Apr 16, 2024

@AbdAlRahmanGad For that, add a test case and then with trial and error fix the code ^^
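
The restriction discussed in this thread (ignoring the ISBN as a duplicate signal for InBook and InCollection entries) could look roughly like the sketch below. It is only an illustration under stated assumptions, not the code of the pull request that fixed this issue. The class `TypeAwareIsbnCheckSketch` and its methods are hypothetical names, and entry types are modelled as plain strings rather than JabRef's own types.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of the proposed restriction; not JabRef's actual code.
final class TypeAwareIsbnCheckSketch {

    // Entry types that describe a part of a book and therefore share the book's ISBN.
    private static final Set<String> PART_OF_BOOK_TYPES =
            new HashSet<>(Arrays.asList("inbook", "incollection"));

    static boolean isPartOfBook(String entryType) {
        return entryType != null
                && PART_OF_BOOK_TYPES.contains(entryType.toLowerCase(Locale.ROOT));
    }

    /**
     * Returns true only when the ISBN is a meaningful duplicate signal:
     * both ISBNs are set and equal, and neither entry is a part of a book.
     */
    static boolean isbnIndicatesDuplicate(String typeA, String isbnA,
                                          String typeB, String isbnB) {
        if (isPartOfBook(typeA) || isPartOfBook(typeB)) {
            return false; // chapters legitimately share the book's ISBN
        }
        return isbnA != null && isbnA.equals(isbnB);
    }
}
```

A test case in the spirit of the suggestion above would assert that two InCollection entries with the same ISBN are not reported as duplicates, while two Book entries with the same ISBN still are.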

Projects
Archived in project

6 participants