Add fourth column for graph name #1337
This is still very hacky, and we still have to integrate all the other things.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #1337 +/- ##
==========================================
- Coverage 89.38% 89.36% -0.03%
==========================================
Files 345 345
Lines 24881 24945 +64
Branches 3307 3312 +5
==========================================
+ Hits 22241 22293 +52
- Misses 1500 1501 +1
- Partials 1140 1151 +11
Quality Gate passed
1-1 with Johannes, with some minor changes left.
I am amazed at how non-invasive this significant change is!
Awesome!
It remains to fix the failing macOS build.
Another quick round with Johannes + tested this on Wikidata (ca. 15% slower)
Add a parser for N-Quads and write the graph label to the new column introduced by #1337. `IndexBuilderMain` now supports three file types: `nt`, `ttl`, and `nq`.
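To illustrate what the graph label is, here is a minimal Python sketch of splitting one N-Quads line into four terms. It is not the parser added by this PR: the regular expression covers only a simplified subset of the N-Quads grammar, and `parse_nquads_line` and `DEFAULT_GRAPH` are hypothetical names chosen for the example.

```python
import re

# Simplified term pattern (an assumption of this sketch, not the full grammar):
# an IRI, a blank node, or a literal with an optional language tag or datatype.
TERM = re.compile(
    r'<[^>]*>'
    r'|_:\S+'
    r'|"(?:[^"\\]|\\.)*"(?:@[A-Za-z0-9-]+|\^\^<[^>]*>)?'
)

# Hypothetical placeholder for the graph used when a line has no graph label.
DEFAULT_GRAPH = "<hypothetical:default-graph>"


def parse_nquads_line(line):
    """Return (subject, predicate, object, graph), or None for blank/comment lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    if not line.endswith("."):
        raise ValueError("statement must end with '.'")
    body = line[:-1]  # drop the terminating '.'
    terms = []
    pos = 0
    while pos < len(body):
        if body[pos].isspace():
            pos += 1
            continue
        match = TERM.match(body, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        terms.append(match.group())
        pos = match.end()
    if len(terms) == 3:
        # No graph label: the statement goes into the default graph.
        terms.append(DEFAULT_GRAPH)
    if len(terms) != 4:
        raise ValueError("expected three or four terms")
    return tuple(terms)
```

A line with three terms is an ordinary triple. The optional fourth term is the graph label that ends up in the new column.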
This continues work from #1337 (add a column with the graph name to the index), #1444 (add parser for N-Quads), and #1482 (add graph info to block metadata). Queries with `FROM` and/or `GRAPH` with a fixed IRI can now be processed. Processing queries with `FROM NAMED` or `GRAPH` with a variable will be implemented in a future PR.
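Conceptually, a `FROM` or `GRAPH` clause with a fixed IRI restricts a scan to rows whose graph column holds one of the named graphs. The Python sketch below shows only that idea. It does not reflect the actual scan code or its use of block metadata, and the function name and the "`None` means no restriction" convention are assumptions made for the example.

```python
GRAPH_COLUMN = 3  # the graph name is the fourth column (zero-based index 3)


def scan_with_graph_filter(rows, active_graphs=None):
    """Keep only the rows whose graph column is one of `active_graphs`.

    `active_graphs=None` means "no restriction" in this sketch.
    """
    if active_graphs is None:
        return list(rows)
    allowed = set(active_graphs)
    return [row for row in rows if row[GRAPH_COLUMN] in allowed]


# Rows are (col0, col1, col2, graph, pattern, pattern) with made-up IDs.
rows = [
    (1, 2, 3, 100, 0, 0),
    (1, 2, 4, 200, 0, 0),
    (5, 2, 3, 100, 0, 0),
]
in_graph_100 = scan_with_graph_filter(rows, active_graphs=[100])
```

Here `in_graph_100` contains the first and third rows, which are the ones stored under graph `100`.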
All six permutations (SPO, SOP, OSP, OPS, PSO, POS) now have an additional column for the graph name. It is the fourth column, placed after the three triple columns and before the two columns for the patterns. For now, all internal triples get a special internal graph name, and all other triples belong to the default graph.
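The column order described above can be sketched as follows. This is a Python illustration of the layout only, not the index code: `to_row` is a hypothetical helper, and the two pattern values are opaque placeholders.

```python
PERMUTATIONS = ("SPO", "SOP", "OSP", "OPS", "PSO", "POS")


def to_row(quad, permutation, patterns=(0, 0)):
    """Arrange (s, p, o, g) as a six-column row for the given permutation.

    Columns 1-3: the triple in permutation order.
    Column 4:    the graph name.
    Columns 5-6: the two pattern columns (placeholder values here).
    """
    s, p, o, g = quad
    by_letter = {"S": s, "P": p, "O": o}
    return tuple(by_letter[c] for c in permutation) + (g,) + tuple(patterns)


# Made-up IDs: subject 1, predicate 2, object 3, graph 9.
quad = (1, 2, 3, 9)
rows_by_permutation = {perm: to_row(quad, perm, patterns=(7, 8)) for perm in PERMUTATIONS}
```

For the `POS` permutation this gives `(2, 3, 1, 9, 7, 8)`. The graph ID sits at the same position in every permutation, and only the first three columns are reordered.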
IMPORTANT: This changes the index format. It also slows down the indexing by about 15% (which is reasonable, given that the number of columns increased from 5 to 6).