jabref-meta storage in bib file should be improved (by switching to embedded JSON) #10371

Open
koppor opened this issue Sep 11, 2023 · 5 comments
Labels
good second issue Issues that involve a tour of two or three interweaved components in JabRef

@koppor
Member

koppor commented Sep 11, 2023

Context

Looking at this diff, I thought something was really wrong:

[Image: the diff, with positions 1 and 2 marked]

The semicolon at position 1 indicates that multiple metadata items can be written into a single @Comment. This became clear to me only today (and not back in 2016, #960). This would be great, as it would minimize the number of @Comment entries. However, the saveActions also use ; as a delimiter (position 2).

The "feature" of not merging the meta fields has been present for a long time. See, e.g., the old issue report #250.

Thus, a straightforward merge is most probably not possible.

Code hint: Separation according to ; is done at org.jabref.logic.importer.util.MetaDataParser#getNextUnit
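
To illustrate why a straightforward merge is ambiguous, here is a minimal Java sketch of a naive splitter that cuts at every `;`. It is not JabRef's actual `getNextUnit` implementation; the class `SemicolonAmbiguityDemo`, the method `splitUnits`, and the merged input string are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SemicolonAmbiguityDemo {

    // Naive splitter: cuts the text at every ';' and trims each piece.
    // Assumption: this only mimics the idea of ';'-separated units;
    // it is NOT the real MetaDataParser#getNextUnit.
    static List<String> splitUnits(String metaText) {
        List<String> units = new ArrayList<>();
        for (String part : metaText.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                units.add(trimmed);
            }
        }
        return units;
    }

    public static void main(String[] args) {
        // Hypothetical merged comment body: two meta items in one string.
        String merged = "saveActions:enabled;date[normalize_date];databaseType:bibtex;";
        List<String> units = splitUnits(merged);
        // prints [saveActions:enabled, date[normalize_date], databaseType:bibtex]
        System.out.println(units);
    }
}
```

The splitter returns three flat units. Nothing in the result says that the second unit still belongs to saveActions while the third one starts a new meta item, which is the clash between positions 1 and 2 described above.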


Call for new metadata storage

Single JSON in @comment field

Example:

@Comment{jabref-meta-0.1.0
{
  "saveActions" :
  {
    "state": true,
    "date": ["normalize_date", "action2"],
    "pages" : ["normalize_page_numbers"],
    "month" : ["normalize_month"]
  }
}
}

Content:

{
  "saveActions" :
  {
    "state": true,
    "date": ["normalize_date", "action2"],
    "pages" : ["normalize_page_numbers"],
    "month" : ["normalize_month"]
  }
}
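
A rough sketch of how the version and the JSON content could be pulled out of such a comment, using only the Java standard library. It assumes the input is the text inside the outer braces of the @Comment; the class `JsonMetaCommentDemo` and both method names are hypothetical, and the resulting string would still have to be handed to a real JSON parser.

```java
import java.util.Optional;

public class JsonMetaCommentDemo {

    // Header prefix taken from the example above; the version follows it.
    static final String PREFIX = "jabref-meta-";

    // Returns the version string after the prefix, e.g. "0.1.0".
    static Optional<String> extractVersion(String commentBody) {
        String body = commentBody.trim();
        if (!body.startsWith(PREFIX)) {
            return Optional.empty();
        }
        int brace = body.indexOf('{');
        String header = brace < 0 ? body : body.substring(0, brace);
        String version = header.substring(PREFIX.length()).trim();
        return version.isEmpty() ? Optional.empty() : Optional.of(version);
    }

    // Returns the text from the first '{' to its matching '}'.
    // Braces inside JSON strings are ignored while counting.
    static Optional<String> extractJson(String commentBody) {
        String body = commentBody.trim();
        if (!body.startsWith(PREFIX)) {
            return Optional.empty();
        }
        int start = body.indexOf('{');
        if (start < 0) {
            return Optional.empty();
        }
        int depth = 0;
        boolean inString = false;
        for (int i = start; i < body.length(); i++) {
            char c = body.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    return Optional.of(body.substring(start, i + 1));
                }
            }
        }
        return Optional.empty(); // unbalanced braces
    }

    public static void main(String[] args) {
        String body = "jabref-meta-0.1.0\n{\n  \"saveActions\" : { \"state\": true }\n}\n";
        System.out.println(extractVersion(body).orElse("<none>"));
        System.out.println(extractJson(body).orElse("<none>"));
    }
}
```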

Decision outcome: Use "Single JSON in @comment field"


Migration path:

  • v6.0 can read and write both setting formats
    • when reading, the new format "wins" (if both exist)
  • v7.0 can read both settings, but writes only new setting format
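
The migration rules above could be sketched as follows. This is only an illustration: `MetaFormatMigrationDemo`, the `Format` enum, and both methods are hypothetical names, and the rule for versions before 6 (write the legacy format only) is my assumption based on the current behavior.

```java
import java.util.EnumSet;
import java.util.Optional;

public class MetaFormatMigrationDemo {

    enum Format { LEGACY, JSON }

    // Reading rule: if both formats are present, the new (JSON) one wins.
    static Optional<String> chooseForReading(Optional<String> legacy, Optional<String> json) {
        return json.isPresent() ? json : legacy;
    }

    // Writing rule: v6 writes both formats, v7 and later write JSON only.
    // Assumption: versions before 6 write the legacy format only.
    static EnumSet<Format> formatsToWrite(int majorVersion) {
        if (majorVersion >= 7) {
            return EnumSet.of(Format.JSON);
        }
        if (majorVersion == 6) {
            return EnumSet.of(Format.LEGACY, Format.JSON);
        }
        return EnumSet.of(Format.LEGACY);
    }

    public static void main(String[] args) {
        System.out.println(chooseForReading(Optional.of("old"), Optional.of("new")).get());
        System.out.println(formatsToWrite(6));
        System.out.println(formatsToWrite(7));
    }
}
```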

After this is implemented, we can work on #8701


ADR

Single JSON in @comment field

  • Good, because a single @Comment element is enough
  • Good, because JSON parser can directly be used
  • Good, because we can nest elements in the JSON without the need of a custom format
  • Neutral, because JSON is nested in BibTeX
  • Bad, because syntax highlighting won't work
  • Bad, because the meta format changes
  • Bad, because it looks "hacky"

Multiple JSON

Each preference could have a separate JSON nesting.

  • Bad, because a first lookup would be done using BibTeX data and a second lookup using JSON. The preferences should be in a consistent format.

BibTeX

Example (From JabRef#232)

old:

@Comment{jabref-meta: saveActions:enabled;
date[normalize_date]
pages[normalize_page_numbers]
month[normalize_month]
;}

new:

@JabRef{saveActions,
  state = {enabled},
  date = {normalize_date, action2},
  pages = {normalize_page_numbers},
  month = {normalize_month}
}
  • Good, because feels natural
  • Good, because no additional parsing logic needs to be implemented
  • Good, because we currently have only one level of key/value pairs for the meta data (to be checked)
  • Bad, because even a nested list (e.g., normalize_date, action2) is a custom format.
  • Bad, because multiple elements have to be used: One for each meta data key
  • Bad, because does not allow for nesting of properties
  • Bad, because other tools might treat these entries specially
  • Bad, because "old" JabRef versions will treat these entries as "normal" entries

@comment and then nested

JabRef v5.9 (and before) used that format.

  • Good, because arbitrary content can be used
  • Bad, because the parsing logic needs to be written for the content inside

JSON at the end of the file

New entries always start with @. Anything outside the “argument” of a “command” starting
with an @ is considered as a comment. This gives an easy way to comment a given entry: just
remove the initial @. As usual when a language allows comments, don’t hesitate to use them so
that you have a clean, ordered, and easy-to-maintain database. Conversely, anything starting
with an @ is considered as being a new entry

@Article{demo,
   note={just an example article to illustrate the **previous** entry}
}

// jabref-meta-0.1.0
{
  "saveActions" :  {
   "state": true,
   "date": ["normalize_date", "action2"],
   "pages" : ["normalize_page_numbers"],
   "month" : ["normalize_month"]
  }
}
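
For this variant, reading could work roughly like the sketch below: look for the last line starting with the marker and treat everything after that line as JSON. The marker string is taken from the example above; `TrailingMetaDemo` and `trailingJson` are hypothetical names, and the result would still need to go through a real JSON parser.

```java
import java.util.Optional;

public class TrailingMetaDemo {

    // Marker line from the example; the version follows the prefix.
    static final String MARKER = "// jabref-meta-";

    // Returns everything after the last marker line, or empty if there is
    // no marker at the start of a line or nothing follows it.
    static Optional<String> trailingJson(String fileContent) {
        int idx = fileContent.lastIndexOf(MARKER);
        if (idx < 0) {
            return Optional.empty();
        }
        boolean atLineStart = idx == 0 || fileContent.charAt(idx - 1) == '\n';
        if (!atLineStart) {
            return Optional.empty();
        }
        int lineEnd = fileContent.indexOf('\n', idx);
        if (lineEnd < 0) {
            return Optional.empty();
        }
        String rest = fileContent.substring(lineEnd + 1).trim();
        return rest.isEmpty() ? Optional.empty() : Optional.of(rest);
    }

    public static void main(String[] args) {
        String file = "@Article{demo,\n   note={just an example}\n}\n\n"
                + "// jabref-meta-0.1.0\n{\n  \"saveActions\" : { \"state\": true }\n}\n";
        System.out.println(trailingJson(file).orElse("<no metadata>"));
    }
}
```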
@Siedlerchr
Member

BibDesk on macOS stores its groups in Apple's plist XML format:

[Image]

@koppor
Member Author

koppor commented Sep 20, 2023

@koppor koppor changed the title from "jabref-meta storage in bib file should be improved" to "jabref-meta storage in bib file should be improved (by switching to embedded JSON)" Jul 3, 2024
@ThiloteE ThiloteE added this to the 6.0 milestone Sep 7, 2024
@leaf-soba
Contributor

leaf-soba commented Sep 11, 2024

Sorry, I'm new here and I'd like to work on this issue. I tried to break it into small steps; please check whether I understand it correctly.

  1. Write a unit test whose input is the example in "Single JSON in @comment field".
    • I don't know the exact expected output of the unit test yet, but I'll try to figure it out later.
@Comment{jabref-meta-0.1.0
{
  "saveActions" :
  {
    "state": true,
    "date": ["normalize_date", "action2"],
    "pages" : ["normalize_page_numbers"],
    "month" : ["normalize_month"]
  }
}
}
  2. Update MetaDataParser#getNextUnit to handle the new JSON format in the unit test case.
  3. Write the logic to parse, read, and write the new JSON format.
    • I didn't find the proper place to put this logic; maybe I should put it in MetaDataSerializer or MetaDataParser?
    • I also haven't found the old code that reads @Comment yet; maybe it is in BibtexDatabaseWriter?
  4. Add more corner cases to the unit tests for this update.

@koppor
Member Author

koppor commented Oct 30, 2024

1. write a unit test input is the Example in `Single JSON in @comment field`.

Yes

   * I don't know the expected output exactly in unit test now, but I'll try to figure it out later.

The JSON content itself. Maybe the Gson library is your friend. I have had good experiences with it in the HTTP server part.

2. Update `MetaDataParser#getNextUnit` to handle the new JSON format in unit test case

The place is `org.jabref.logic.importer.fileformat.BibtexParser#parseJabRefComment`.

3. Write logic code to parse, read and write new JSON format.

The whole MetaDataParser can be "deleted" and replaced by a new loading from JSON. I think it is JSON -> DTO -> MetaData. Maybe also directly from JSON to MetaData. ("Deleted" is not quite true, because JabRef should still be able to read "old" files; from version 7 on, the old metadata is no longer written. In version 6, both formats are read and written, with the new format taking precedence.)

   * I didn't find the proper place to put these logic code,  maybe I should put them in `MetaDataSerializer`, `MetaDataParser`?
  • Reading: See above.
  • Writing: org.jabref.logic.exporter.BibDatabaseWriter#writeMetaData
   * And I didn't find the old code to read `@Comment` in this step now, maybe in `BibtexDatabaseWriter`?

See above.

There will be many unit tests for that.
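
The DTO step of "JSON -> DTO -> MetaData" could look roughly like the sketch below. To keep it free of dependencies, it starts from a generic `Map<String, Object>` as a stand-in for whatever the JSON library hands back; the actual Gson call is left out. `SaveActionsDtoDemo`, `SaveActionsDto`, and `fromGenericMap` are hypothetical names, not existing JabRef classes, and the mapping from the DTO to MetaData is not shown.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SaveActionsDtoDemo {

    // Hypothetical DTO mirroring the "saveActions" object of the proposed JSON.
    static final class SaveActionsDto {
        final boolean enabled;
        final Map<String, List<String>> actionsByField;

        SaveActionsDto(boolean enabled, Map<String, List<String>> actionsByField) {
            this.enabled = enabled;
            this.actionsByField = actionsByField;
        }
    }

    // Builds the DTO from an already parsed JSON object.
    // "state" becomes the enabled flag; every other key whose value is a
    // list is treated as a field name with its list of action names.
    static SaveActionsDto fromGenericMap(Map<String, Object> saveActions) {
        boolean enabled = Boolean.TRUE.equals(saveActions.get("state"));
        Map<String, List<String>> actions = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : saveActions.entrySet()) {
            if ("state".equals(entry.getKey())) {
                continue;
            }
            if (entry.getValue() instanceof List) {
                List<String> names = new ArrayList<>();
                for (Object name : (List<?>) entry.getValue()) {
                    names.add(String.valueOf(name));
                }
                actions.put(entry.getKey(), names);
            }
        }
        return new SaveActionsDto(enabled, actions);
    }

    public static void main(String[] args) {
        Map<String, Object> parsed = new LinkedHashMap<>();
        parsed.put("state", true);
        parsed.put("date", List.of("normalize_date", "action2"));
        parsed.put("pages", List.of("normalize_page_numbers"));
        SaveActionsDto dto = fromGenericMap(parsed);
        System.out.println(dto.enabled + " " + dto.actionsByField);
    }
}
```

A unit test for the reading side could then compare such a DTO (or the resulting MetaData) against the values in the JSON example.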

@leaf-soba
Contributor

OK, it is clear now, please assign to me.

@github-actions github-actions bot added the 📍 Assigned Assigned by assign-issue-action (or manually assigned) label Oct 30, 2024
@koppor koppor moved this from Free to take to Assigned in Candidates for University Projects Oct 30, 2024
@koppor koppor removed 📍 Assigned Assigned by assign-issue-action (or manually assigned) 📌 Pinned labels Feb 27, 2025
@koppor koppor moved this from Assigned to Free to take in Candidates for University Projects Feb 27, 2025
@koppor koppor added the good second issue Issues that involve a tour of two or three interweaved components in JabRef label Feb 27, 2025
@JabRef JabRef deleted a comment from github-actions bot Feb 27, 2025