Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance for loading files #6332

Merged
merged 3 commits into from
Apr 22, 2020
Merged

Improve performance for loading files #6332

merged 3 commits into from
Apr 22, 2020

Conversation

tobiasdiez
Copy link
Member

  • Performance improvements around groups
  • Remove detection of duplicate ID when inserting entries (it's not really possible to create two entries with the same except if you use setId manually)
  • Remove detection of duplicate bibtex keys when opening a file (the result was not used and we have a integrity check for it)
  • Use EnumMap instead of HashMap to cache fields as keywords (which is only used for Keyword and Groups fields anyway)
  • Fixes Formatting of main table ignored #6329 where latex code was displayed in the maintable
  • Lazy init of source tab

In summary, we are now at very reasonable numbers:

  • Reading files is about 1k entries per second (with 30% spent on initialing the EventBus)
  • Groups membership counter is still expensive to calculate but kind of ok (30 sec for 1k groups x 12k entries = 12 mil checks)
  • 7s load of an empty JabRef instance
  • 1gb ram with two databases of together 15k entries

(all numbers are measured under profiling, thus they are worse than under actual production).

I think that concludes my series of performance improvements for now.

  • Change in CHANGELOG.md described (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked documentation: Is the information available and up to date? If not created an issue at https://github.com/JabRef/user-documentation/issues or, even better, submitted a pull request to the documentation repository.

- Performance improvements around groups
- Remove detection of duplicate ID when inserting entries (it's not really possible to create two entries with the same except if you use `setId` manually)
- Remove detection of duplicate bibtex keys when opening a file (the result was not used and we have a integrity check for it)
- Use EnumMap instead of HashMap to cache fields as keywords (which is only used for Keyword and Groups fields anyway)
- Fix bug where latex code was displayed in the maintable
- Lazy init of source tab
@tobiasdiez tobiasdiez added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Apr 21, 2020
@Siedlerchr
Copy link
Member

Regarding, entries with the same id: What happens when you have a shared database? @koppor

@tobiasdiez
Copy link
Member Author

database.getDatabase().removeEntry(oldEntry);
BibEntry mergedEntry = mergePanel.getMergeEntry();
mergedEntry.setId(oldEntry.getId()); // Keep ID

That's the only instance where setId is invoked and since the old entry is removed shortly before, there is really no chance of duplicate id.

@koppor
Copy link
Member

koppor commented Apr 22, 2020

Regarding, entries with the same id: What happens when you have a shared database? @koppor

In general, we have the sharedId at the communication with the server - to ensure that two users always have the same id for the same entry.

Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for getting rid of the pseudo primary key.

I have some minor comments on the other changes. I also pinged @stefan-kolb because of the InteralFields. Maybe @LinusDietz could also comment on that, because he worked on that on JabCon, too.

@tobiasdiez
Copy link
Member Author

Thanks for the reviews. I (hopefully) addressed each concern and updated the code accordingly.

@calixtus
Copy link
Member

Great work!

I recently thought about another way to make user experience more responsive: The database tab could be opened, before the parser starts working. This would need some more refactoring (maybe moving the call to the import routines inside the tab constructor?). This would be a project in the mid future.

@tobiasdiez
Copy link
Member Author

That's indeed a very good suggestion @calixtus. Let's keep it in mind.

@tobiasdiez tobiasdiez merged commit c7d7767 into master Apr 22, 2020
@tobiasdiez tobiasdiez deleted the openingPerf branch April 22, 2020 12:30
Siedlerchr added a commit that referenced this pull request Apr 24, 2020
* upstream/master:
  Update label names
  Squashed 'src/main/resources/csl-locales/' changes from d0ee4d13c9..79845b087b
  Squashed 'src/main/resources/csl-styles/' changes from c1793d2..143464e
  Cite work by @ayaankazerouni
  Improve performance for loading files (#6332)
  Add ADR von JUnit vs. AssertJ (#6335)
  'Name' textfield on focus instead of 'OK' button when user clicks on 'Add subgroup' option (#6330)
  Bump jurt from 6.3.2 to 6.4.3 (#6325)
  Bump unoil from 6.3.2 to 6.4.3 (#6320)
  Bump ridl from 6.3.2 to 6.4.3 (#6326)
  Bump classgraph from 4.8.69 to 4.8.71 (#6322)
  Bump flexmark from 0.61.6 to 0.61.16 (#6318)
  Bump richtextfx from 0.10.4 to 0.10.5 (#6319)
  Bump guava from 28.2-jre to 29.0-jre (#6323)
  Bump flexmark-ext-gfm-tasklist from 0.61.6 to 0.61.16 (#6327)
  Bump org.beryx.jlink from 2.17.5 to 2.17.7 (#6324)
  Improve performance massively by fixing a stupid binding mistake (#6316)
  Fixed missing paste command (#6313)
  Remove cache of auto completion results (#6310)
Siedlerchr added a commit that referenced this pull request Apr 26, 2020
# By Tobias Diez (10) and others
# Via dimitra-karadima (5) and others
* upstream/master: (62 commits)
  Backward compatibility for 4.3.1 (#6296)
  Fix Export to clipboard Dialog icon (#6345)
  Refactor EntryEditorPreferences (#6245)
  Update label names
  Squashed 'src/main/resources/csl-locales/' changes from d0ee4d13c9..79845b087b
  Squashed 'src/main/resources/csl-styles/' changes from c1793d2..143464e
  Cite work by @ayaankazerouni
  Improve performance for loading files (#6332)
  Add ADR von JUnit vs. AssertJ (#6335)
  'Name' textfield on focus instead of 'OK' button when user clicks on 'Add subgroup' option (#6330)
  Bump jurt from 6.3.2 to 6.4.3 (#6325)
  Bump unoil from 6.3.2 to 6.4.3 (#6320)
  Bump ridl from 6.3.2 to 6.4.3 (#6326)
  Bump classgraph from 4.8.69 to 4.8.71 (#6322)
  Bump flexmark from 0.61.6 to 0.61.16 (#6318)
  Bump richtextfx from 0.10.4 to 0.10.5 (#6319)
  Bump guava from 28.2-jre to 29.0-jre (#6323)
  Bump flexmark-ext-gfm-tasklist from 0.61.6 to 0.61.16 (#6327)
  Bump org.beryx.jlink from 2.17.5 to 2.17.7 (#6324)
  Improve performance massively by fixing a stupid binding mistake (#6316)
  ...

# Conflicts:
#	build.gradle
Siedlerchr added a commit that referenced this pull request Apr 30, 2020
…ionCaseInsensitive

* upstream/master: (32 commits)
  Fix Export to clipboard Dialog icon (#6345)
  Refactor EntryEditorPreferences (#6245)
  Update label names
  Squashed 'src/main/resources/csl-locales/' changes from d0ee4d13c9..79845b087b
  Squashed 'src/main/resources/csl-styles/' changes from c1793d2..143464e
  Cite work by @ayaankazerouni
  Improve performance for loading files (#6332)
  Add ADR von JUnit vs. AssertJ (#6335)
  'Name' textfield on focus instead of 'OK' button when user clicks on 'Add subgroup' option (#6330)
  Bump jurt from 6.3.2 to 6.4.3 (#6325)
  Bump unoil from 6.3.2 to 6.4.3 (#6320)
  Bump ridl from 6.3.2 to 6.4.3 (#6326)
  Bump classgraph from 4.8.69 to 4.8.71 (#6322)
  Bump flexmark from 0.61.6 to 0.61.16 (#6318)
  Bump richtextfx from 0.10.4 to 0.10.5 (#6319)
  Bump guava from 28.2-jre to 29.0-jre (#6323)
  Bump flexmark-ext-gfm-tasklist from 0.61.6 to 0.61.16 (#6327)
  Bump org.beryx.jlink from 2.17.5 to 2.17.7 (#6324)
  Improve performance massively by fixing a stupid binding mistake (#6316)
  Fixed missing paste command (#6313)
  ...
@koppor
Copy link
Member

koppor commented May 3, 2020

@calixtus Could you please open an issue for it. Seems to be a good-first-issue? (And link this PR from the description there)

koppor pushed a commit that referenced this pull request Dec 15, 2022
f6c778e Update emerald-harvard.csl (#6335)
d6c6a16 Fix Brazilian quotes on  chicago-author-date.csl (#6317)
a1549b6 Update medizinische-hochschule-hannover.csl (#6330)
da88073 Update journal-of-the-american-college-of-cardiology.csl (#6334)
a520d8e Bump nokogiri from 1.13.9 to 1.13.10 (#6333)
ba54b44 Update royal-society-of-chemistry.csl (#6328)
1378ba7 LUSEM: Remove full stop (#6332)
9e3cf89 Create interpreting.csl (#6254)
bef74ed Create conservation-science-and-practice.csl (#6258)
9fb7eb7 Bug fixes triangle.csl (#6251)
e6112ba Update ucl-university-college-apa.csl (#6250)
6dcba3a Update engineering-technology-and-applied-science-research.csl (#6247)
00fe4a2 Create constructivist-foundations.csl (#6243)
03ad71b Create les-mondes-du-travail.csl (#6234)
a2bce86 Corrections based on author instructions (#6306)

git-subtree-dir: buildres/csl/csl-styles
git-subtree-split: f6c778e
koppor pushed a commit that referenced this pull request Dec 17, 2022
84dba23 Update international-union-of-crystallography.csl (#6279)
13dd9e8 Update zeitschrift-fur-deutsche-philologie.csl (#6340)
d95b652 Create cahiers-mondes-anciens.csl (#6203)
ded567c Create rassegna-degli-archivi-di-stato-bibliografia-generale.csl (#6275)
124777a Create scientific-online-letters-on-the-atmosphere.csl (#6261)
3c276e7 Create american-medical-association-no-url-alphabetical.csl (#6252)
595ad95 Bump mathieudutour/github-tag-action from 6.0 to 6.1 (#6287)
7008128 Create cambridge-a (#6336)
17e930c Update norsk-apa-manual-note.csl (#6338)
b360859 Update norsk-apa-manual.csl (#6337)
f6c778e Update emerald-harvard.csl (#6335)
d6c6a16 Fix Brazilian quotes on  chicago-author-date.csl (#6317)
a1549b6 Update medizinische-hochschule-hannover.csl (#6330)
da88073 Update journal-of-the-american-college-of-cardiology.csl (#6334)
a520d8e Bump nokogiri from 1.13.9 to 1.13.10 (#6333)
ba54b44 Update royal-society-of-chemistry.csl (#6328)
1378ba7 LUSEM: Remove full stop (#6332)
9e3cf89 Create interpreting.csl (#6254)
bef74ed Create conservation-science-and-practice.csl (#6258)
9fb7eb7 Bug fixes triangle.csl (#6251)
e6112ba Update ucl-university-college-apa.csl (#6250)
6dcba3a Update engineering-technology-and-applied-science-research.csl (#6247)
00fe4a2 Create constructivist-foundations.csl (#6243)
03ad71b Create les-mondes-du-travail.csl (#6234)
a2bce86 Corrections based on author instructions (#6306)

git-subtree-dir: buildres/csl/csl-styles
git-subtree-split: 84dba23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Formatting of main table ignored
5 participants