Use BPlusTree to hold backlinks #6673

Merged
merged 6 commits into from
Jun 1, 2023

Conversation

jedelbo
Contributor

@jedelbo jedelbo commented May 25, 2023

What, How & Why?

Fixes #6577

☑️ ToDos

  • 📝 Changelog update
  • 🚦 Tests (or not relevant)
  • C-API, if public C++ API changed.

Removes the limit on how many backlinks we can handle
@jedelbo jedelbo force-pushed the je/backlink-upgrade branch from 8b368f2 to 8621ce6 Compare May 25, 2023 10:41
@danieltabacaru
Collaborator

Should we run some benchmarks for this change?

Contributor

@finnschiermer finnschiermer left a comment

LGTM

@@ -568,6 +566,13 @@ class BPlusTree : public BPlusTreeBase {
m_root->bptree_traverse(func);
}

void split_if_needed()
Contributor

Why is this needed? I am under the impression that BPlusTree::insert() will split automatically as needed?

Contributor Author

If the current node is just a plain array, then it should be split into a BPlusTree. I am open to another name; I struggled a bit to find one myself.

Contributor

So this is an optimization that defers creating a tree until we reach more than 1000 links and then does the transformation on the fly rather than in the migration function of the file format. If that is correct, we should keep current performance the same for fewer than 1000 backlinks?

Contributor Author

If the size is below 1000, then a BPlusTree is just a plain array, so no need for any transformation. But the BPlusTree implementation is just slower than a plain array. I did not want to create two code paths - one for sizes under 1000 and one for sizes over.

@jedelbo
Contributor Author

jedelbo commented May 26, 2023

@danieltabacaru This new approach is about 10% slower than the old one.

Comment on lines +839 to +845
auto node_size = Array::get_size_from_header(header);
if (Array::get_is_inner_bptree_node_from_header(header)) {
    auto data = Array::get_data_from_header(header);
    auto width = Array::get_width_from_header(header);
    node_size = size_t(get_direct(data, width, node_size - 1)) >> 1;
}
return node_size;
Contributor

Suggested change
- auto node_size = Array::get_size_from_header(header);
- if (Array::get_is_inner_bptree_node_from_header(header)) {
-     auto data = Array::get_data_from_header(header);
-     auto width = Array::get_width_from_header(header);
-     node_size = size_t(get_direct(data, width, node_size - 1)) >> 1;
- }
- return node_size;
+ if (Array::get_is_inner_bptree_node_from_header(header)) {
+     auto data = Array::get_data_from_header(header);
+     auto width = Array::get_width_from_header(header);
+     auto node_size = size_t(get_direct(data, width, node_size - 1));
+     REALM_ASSERT_EX(node_size >= 2, node_size);
+     return node_size >> 1;
+ }
+ return Array::get_size_from_header(header);

Contributor

Also, double-checking because my memory of this is fuzzy: is the divide by two always correct for an inner node? Or is it possible that it is not in the compact form? E.g. should we divide by get_elems_per_child()?

Contributor Author

It is more a shift right than a divide by two. The size of the subtree is always stored as the last entry, encoded as a tagged number: the size is shifted left by one bit and OR-ed with 1.
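To make the encoding described above concrete, here is a minimal C++ sketch. The helper names `encode_tagged`, `decode_tagged` and `is_tagged` are hypothetical and are not realm-core API; the sketch only assumes the scheme stated in the comment, where the stored value is the size shifted left by one bit with the low bit set.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration only; these are not realm-core functions.

// Store a size as a tagged integer: shift left by one bit and set the low bit.
constexpr std::int64_t encode_tagged(std::size_t size)
{
    return (static_cast<std::int64_t>(size) << 1) | 1;
}

// The low bit marks the entry as a tagged number.
constexpr bool is_tagged(std::int64_t value)
{
    return (value & 1) != 0;
}

// Recover the size: a right shift by one bit drops the tag.
constexpr std::size_t decode_tagged(std::int64_t value)
{
    return static_cast<std::size_t>(static_cast<std::uint64_t>(value) >> 1);
}

// Compile-time checks: a subtree holding 1500 elements is stored as 3001.
static_assert(encode_tagged(1500) == 3001, "1500 is stored as (1500 << 1) | 1");
static_assert(decode_tagged(encode_tagged(1500)) == 1500, "round trip");
```

The right shift in `decode_tagged` plays the same role as the `>> 1` in the code under review: it removes the tag bit rather than dividing by a per-child element count.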

Contributor

Not blocking, but you may want to use this suggestion to avoid an unnecessary read from the array header if looking at an inner node.

Contributor Author

I just don't think your suggestion would compile :-)

@@ -181,7 +183,7 @@ ObjKey ArrayBacklink::get_backlink(size_t ndx, size_t index) const
return ObjKey(int64_t(value >> 1));
}

- Array backlink_list(m_alloc);
+ BPlusTree<int64_t> backlink_list(m_alloc);
backlink_list.init_from_ref(ref_type(value));
Contributor

Is this going to interpret the data correctly if it is a simple Array that hasn't been split yet?

Contributor Author

Yes. A BPlusTree holding only one leaf array is just a leaf array.

@@ -568,6 +566,13 @@ class BPlusTree : public BPlusTreeBase {
m_root->bptree_traverse(func);
}

void split_if_needed()
{
while (m_root->get_node_size() > REALM_MAX_BPNODE_SIZE) {
Contributor

Please test more than one iteration of this loop, e.g. with more than 1000 * 1000 elements. I think the second iteration will fail because split_root() assumes that the root is a LeafNode, but on the second pass the nodes will all be BPlusTreeInner types.

Contributor Author

@jedelbo jedelbo May 30, 2023

This is tested more thoroughly when the node size is 4. get_node_size() will return the number of child elements - no matter if this is an inner node or a leaf.

Contributor

Oh, you had the implementation correct all along and I misread it. Thanks for adding the extra tests though!
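The loop discussed in this thread can be modelled with plain arithmetic. The sketch below is a simplification under a stated assumption: each pass of the hypothetical `split_passes` function regroups the root's entries into nodes of at most `max_node_size` and makes those nodes the children of a new root. The real `split_root()` may distribute entries differently; only the loop condition mirrors the diff.

```cpp
#include <cstddef>

// Hypothetical model, not realm-core code: count how many times a loop of the
// form `while (root_size > max_node_size) split_root();` would run, assuming
// each split packs the root's entries into nodes of at most max_node_size.
constexpr std::size_t split_passes(std::size_t root_size, std::size_t max_node_size)
{
    std::size_t passes = 0;
    while (root_size > max_node_size) {
        // Number of children the new root gets: ceil(root_size / max_node_size).
        root_size = (root_size + max_node_size - 1) / max_node_size;
        ++passes;
    }
    return passes;
}

// With a node size of 1000, a second pass needs more than 1000 * 1000 elements.
static_assert(split_passes(1000, 1000) == 0, "fits in a single leaf");
static_assert(split_passes(1001, 1000) == 1, "one split");
static_assert(split_passes(1000 * 1000 + 1, 1000) == 2, "two splits");

// With a node size of 4, 17 elements are already enough for two passes.
static_assert(split_passes(17, 4) == 2, "two splits with small nodes");
```

Under this model a small node size reaches the multi-pass case with very few elements, which fits the remark that the loop is exercised more thoroughly when the node size is 4.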


BPlusTree<Int> tree(Allocator::get_default());
tree.init_from_ref(arr.get_ref());
tree.split_if_needed();
Contributor

This doesn't do a split because there are fewer than 1000 elements. Can you add more tests like this checking various sizes: 999, 1000, 1001, 1000*1000 + 1, etc.?

Contributor Author

Added more sizes - but the test will still only be effective when node size is 4.

Contributor

@ironage ironage left a comment

Thanks for all the explanations! LGTM.

@jedelbo jedelbo merged commit 3369c63 into next-major Jun 1, 2023
@jedelbo jedelbo deleted the je/backlink-upgrade branch June 1, 2023 13:38
nicola-cab added a commit that referenced this pull request Mar 7, 2024
* Prepare next-major

* Remove support for upgrading from pre core-6 (v10) (#6090)

* Optimize size of ArrayDecimal128 (#6111)

Optimize storage of Decimal128 properties so that the individual values will take up 0 bits (if all nulls or all zero), 32 bits, 64 bits or 128 bits depending on what is needed.

* update next major to core 13.4.1 (#6310)

* temporary disable failing c api decimal test

* Revert "update next major to core 13.4.1" (#6312)

* Revert "update next major to core 13.4.1 (#6310)"

This reverts commit 59764a2.

* appease format checks

* Align dictionaries to Lists and Sets when they are cleared.  (#6254)

* Fix storage of Decimal128 NaNs

* Allow Collections to be owned by Collections (#6447)

Introduce a new class - CollectionParent - which a collection will refer to as its owner. This class can be
specialized as an Obj if the nesting level is 0 or a CollectionList if the collection is nested.

* Add interface for defining columns of nested collections

* Add CollectionList class

* Change CollectionBase::set_owner() interface

Make it clear that when an Obj is the owner, then the index must be a ColKey

* Implementation `CollectionList::remove()` (#6381) (#6458)

Co-authored-by: Jørgen Edelbo <[email protected]>

* Schema support for nesting collection (#6451)

Co-authored-by: Jørgen Edelbo <[email protected]>

* Handle links in nested collections (#6470)

* Handle nullifying links in nested collections
* Clear backlinks related to nested collections

* Return collection type in Mixed (#6520)

* Print nested collections to Json (#6534)

* dump to json support info about nested collections for schema

* reuse logic for printing nested collections

* main logic for expanding nested collections to json, requires to be polished

* more testing for nested containers

* complete algo for printing nested collections in json format

* add testing json files to project

* generate json files option set to false

* run whole test suite

* test nested collections with links

* format checks

* Move out_mixed_json... functionality to Mixed class

* Remove not needed template parameter from CollectionBaseImpl

* Delegate to_json to collections

* remove commented code

* fix audit conflicts

---------

Co-authored-by: Jørgen Edelbo <[email protected]>

* Simplify Obj::get_path()

* Store ref in ArrayMixed (#6565)

* Cleanup naming and consolidate update strategy
* Allow a Mixed containing a ref to be stored in ArrayMixed

* Actually store collection type in Mixed (#6583)

* Allow Dictionary to contain a collection (#6584)

* Make a template specialization for Lst<Mixed>

* Allow Lst<Mixed> to contain collections

* Streamlining interface (#6615)

Main change is that insert... will not return the created collection. This
has of course big influence on the test cases written for the old API.

Virtual interface for setting/getting nested collections created

* Api nested collections in OS (#6618)

Add interface on both object_store::Collection and the C API to handle collections in Mixed.
---------

Co-authored-by: Jørgen Edelbo <[email protected]>

* Set interface nested collections (#6648)

* testing for set<mixed>

* c-api for nested sets

* fix Set constructor

* Get path from collection objects (#6636)

* Move NoOpTransactionLogParser to transact_log.hpp

* Add nested collection path in transaction log

* Optimize get_path()

Avoid having the first element in Path being a std::string

* Small fixes

* Make m_path private in sync::instr::Path

* Remove `set_string_compare_method` (#6668)

Partially based on 5f2dda1 Delete some obsolete cruft

set_string_compare_method() and everything related to it has never actually been
used by any SDK, and is not really the correct solution to the problem anyway.

Co-authored-by: Thomas Goyne <[email protected]>

* Replication of operations on nested collections

* Remove support for query over typed links in Dictionary

This feature is not exposed, and should not be done in the way
it was implemented.

* Use BPlusTree to hold backlinks (#6673)

Removes the limit on how many backlinks we can handle

* Add StablePath concept

* Collection in mixed notification support. (#6660)

* Add support in the notification machinery for nested collections.

* Avoid passing string parameters by value in KeyPathMapping interface

* Use uniform Path representation in query parser

We need to be able to handle a path that is just a sequence of strings
and integers. The strings can then either be a property name or a key
in a dictionary. Before we have known that the last entry in a path
would be a property name. We can't assume that anymore, so we just have
to follow links as long as that is possible. The rest must then be a path
to the wanted value.

We must also allow the syntax "dict.key" and dict["key"] to be used
interchangeably. A nested dictionary can be used in the same way as
an embedded object is used and so the syntax for querying on a specific
property should be the same.

* Support query on nested collections

This includes supporting using index in query on list of primitives

* Copy replication nested collections (#6714)

* copy replication for nested collections

* Remove support for TypedLinks in LinkTranslator

Removes some complexity/code. Is easy to re-introduce. Can be safely
removed if we disallow creating columns of this type. This can also
safely be done, as this feature is not yet used.

* Support typed links in nested collections

This is about the usual stuff:
 - When a link is inserted, make sure a backlink is created
 - When a link is cleared, make sure the backlink is removed.
 - When object containing a collection containing links is deleted
   make sure the backlinks are removed.
 - When the linked-to object is deleted the link should be nullified/removed.
 - When the linked-to object is made into a tombstone, the link should be updated.
 - When the linked-to object is recreated, the link should be restored.

* Handle exceptions thrown from Obj::get_collection_ref

* Support assigning a json string to a mixed property

Collect all to_json related functions in one compilation unit. Then
it will only be included in the final binary if used.

* Support having [*] as part of a path (#6741)

Allows you to consider all elements at some level.

* added check for set in mixed in the C API (#6764)

* Fix collection mismatch for `Set` in `Mixed`. (#6802)

* Allow TypedLinks to be part of path to property in queries

* Sorting stage 2 (#6669)

* Remove `set_string_compare_method`

Partially based on 5f2dda1 Delete some obsolete cruft

set_string_compare_method() and everything related to it has never actually been
used by any SDK, and is not really the correct solution to the problem anyway.

* strings are no longer equal to binaries

* parser supports bin(...) to differentiate binaries

* fix sectioned results

* fix formatting

* review updates

* Fix syntax

* review feedback changes

* fix reported UB

* lint

---------

Co-authored-by: Thomas Goyne <[email protected]>
Co-authored-by: Jørgen Edelbo <[email protected]>

* Fix list type

* Syntactical sugar

* Throw if syncing a collection nested in Mixed

* Support indexing into link collections in Query (#6854)

* Support syncing nested Set

* Add missing support for getting Sets

* Return correct attachment state from nested collections (#6880)

Ensures that proper notifications on no longer existing collections are sent out

* small changes to the c api for collections in mixed (#6881)

* explicit insertion for collections in mixed and return the collection just inserted

* Make exceptions thrown by nested collections more consistent (#6875)

* Make information on the deletion of a collection available in C API (#6896)

* update set_collection for list and have an explicit function for each collection (#6900)

* Small changes

* Publish Obj::set_json in C API

* Check for stale accessors to a collection embedded directly in a Mixed property

Change the index held by the collection object from ColKey to a structure
(ColIndex) containing both the index of the column and a key generated for
that particular collection. The key value is stored alongside the ref and
compared with the key value found in index when trying to obtain the ref
for the collection.

This commit includes review updates

* Fix freezing a nested collection

* Remove support for static nested collections

* Use more bits in ColIndex key

* Improve StringIndex::dump_node_structure

* Check for stale accessors to a collection embedded in a dictionary

Change the index held by the collection object from std::string to a structure
(KeyIndex) containing both the beginning of the dictionary key (mostly for
debugging purposes) and an index key generated for that particular collection.
The key value is stored alongside the ref and compared with the key value found
in index when trying to obtain the ref for the collection.

* Optimize StableIndex

* Refactor StringIndex interface (#6787)

* refactor StringIndex interface

optimize

* StringIndex has a virtual parent SearchIndex

* review feedback and fix a warning

* More consistent exception handling for nested collections

This commit fixes the problem that trying to access position 0 in a newly
created nested list would give an exception saying that the collection
was gone instead of an out-of-bounds exception. This was because we had a
test for attached before validating the index.

The solution selected is to remove "ensure_attached" and let the exceptions
thrown in "get_collection_ref" flow all the way to the client. This is a rather
fundamental change in that we must remove the noexcept specification from
"update_if_needed_with_status" and make the "init_from_parent" functions rethrow
the exceptions caught. The noexcept functions calling "update_if_..." must
add a try..catch block.

* Add ability to get collections from Results (#6948)

Co-authored-by: Nicola Cabiddu <[email protected]>

* Fix compilation of RealmTrawler

* Logging mutations on tables (#6953)

To avoid having the same operation logged twice, the logging in instruction_applier
is removed.

* Simplify Logger class a bit

Logger::m_base_logger_ptr seems not to be used in the class itself.
The member is added to the sub-classes that need it.

get/set level_threshold need not be virtual if we remove support
for NullLogger.

* Limiting the output when logging large string and binary values (#6986)

* Introduce logging categories

* Sorting stage 3 (#6670)

* Add tests on BPlusTree upgrade

* change the sort order of strings

* Add test on upgrade of StringIndex and Set

* remove utf8_compare

* Add upgrade functionality

* Avoid string index upgrade

* Update test

* Move Set::do_resort() to .cpp file

* Generate test_upgrade_database_x_23.realm as on ARM

* Revert "Avoid string index upgrade"

This reverts commit 333982a.

* Fix upgrade logic for string index

- Only upgrade if char is signed
- Upgrade Mixed columns too

* memcmp is faster than std::lexi_cmp and doesn't require casting

* optimize: only compare strings once

* Upgrade of fulltext index not needed

* migrate sets of binaries and better migration test

* generate migration Realms on a signed platform

* fix lint

* avoid a string index migration by using linear search

---------

Co-authored-by: Jørgen Edelbo <[email protected]>

* Update Package.swift

* Client Reset for collections in mixed / nested collections (#6766)

* Fix error after merge

* Fix issue using REALM_ENABLE_MEMDEBUG=On

* Logging of schema migrations

* Use logging categories (#7052)

* Logging notification activity

* Logging details when opening DB

* Fix warning

* Update bindgen to support logging categories

* Add cases handling Json::value_t::binary

* Log free space and history sizes when opening file

* Remove unused stuff

* Rearrange some code in Set<T>

This is to prepare for merge with master. It is more or less a cherry-pick
of commit bf5ffd3.

* Fix missing NullLogger

* Remove support for nested sets

* Fix warnings

* Remove type_LinkList and col_type_LinkList (#7114)

Should have been removed a long time ago

* Index on list of strings (#7152)

* Prepare beta release

* Add path.hpp to installed headers

* Update release notes

* Allow keypath to be provided as argument (#7210)

* Remove set from realm_value_type for collections in mixed (#7245)

* Add support for collections in indexed mixed fields

* [C-API] Fix the return type of realm_set_collection (#7247)

* Fix the return type of realm_set_collection

* fix some leaking tests

* Simplify JSON functionality

* Don't leak implementation of BsonDocument and BsonArray to the users.

This is done by defining the interface explicitly and in a way that
makes it possible to easily change the underlying implementation.

* Throw when inserting an embedded object into a list of Mixed

* Fix queries on dictionaries in Mixed with @keys

* Optimize BsonDocument::find()

* Only output '_key': xxx when output mode is plain JSON

Fix 9169fa1

* Send notifications about mutations on nested collections

* Restore correct expected json files

* Support querying for @size on Mixed

This will make sense both for strings, binaries and nested collections.

* Support querying with @type on nested collections (#7288)

* Refactor ConstantNode::visit() (#7295)

This will allow us to use argument substitution in more places. In
particular if TypeOfValue is expected.

The visit function in ConstantNode is split up in 2 steps. First the
value is extracted into a Mixed - that being directly from the query
string or from the arguments. Then the value is adapted to what is
needed based on the 'hint' parameter.

* Fix merge error in dependency list

* Fix using stringops query on nested collections

* Fix using ANY, NONE, ALL in query on Mixed property

* Remove LinkList (#7308)

* fix == NONE {x} queries (#7333)

* fix == NONE {x} queries

* more tests

* Fix app URI tests for baasaas (#7342)

* Mitigate races in accessing `m_initated` and `m_finalized` in various REALM_ASSERTs (#7338)

* Fix a TOCTOU race when copying Realm files

Checking if the destination exists before copying is a race condition as the
file can be created in between the check and the copy. Instead we should
attempt to copy without overwriting the target if it exists.

* Use clonefile() when possible in File::copy()

* Delete unused sync file action metadata fields

* Schema migration tests to use admin API rather than querying backing cluster (#7345)

* Add bson library (#7324)

Can be used as a reference implementation in tests.

* Fix Results notification for changes to nested collections

* Use TestDirGuard where applicable

* Simplify session tests by consistently using TestSyncManager::fake_user()

* Separate TestSyncManager and OfflineAppSession

Co-authored-by: James Stone <[email protected]>

* Fix sync replication (#7343)

* Adjust CMake files to be used by vcpkg (#7334)

* Fix SPM compilation errors (#7360)

Include paths for the tests are set up slightly different for the SPM build
from the CMake build.

* Prepare release

* Update release notes

* Prepare release

* Update release note

* Rewrite the "app: app destroyed during token refresh" test (#7363)

* Update to CHANGELOG

* Don't allow Core targets to be installed if submodule (#7379)

Co-authored-by: Kenneth Geisshirt <[email protected]>
Co-authored-by: Jørgen Edelbo <[email protected]>

* Prepare release

* Update release notes

* Do not populate list KVO information for non-list collections (#7378)

* Prevent opening files with file format 23 in read-only mode

* Don't update backlinks in Mixed self-assignment (#7384)

Setting a Mixed field to ObjLink equal to the current value removed the
existing backlink and then exited before adding the new one, leaving things in
an invalid state.

* Eliminate copies when accessing values from Bson types (#7377)

Returning things by value performs a deep copy, which is very expensive when
those things are also bson containers.

Re-align the naming with the convention names for the functions rather than
being weird and different.

* Use the correct allocator for queries on dictionaries over links (#7382)

The base table's allocator was being used to read from the target table.

* Bson object should hold binary data in decoded form

If you construct a Bson object from a std::vector<char>, the  extjson
streaming format should encode the binary data.

* Fix passing a double as argument to query on Decimal128 (#7387)

* Treat missing keys in dictionaries as null in queries (#7391)

* Treat missing keys in dictionaries as null in queries

* Fix test

---------

Co-authored-by: Jørgen Edelbo <[email protected]>

* Allow using aggregate operations on Mixed properties in queries (#7398)

This is something which Cocoa and the query engine supports but the core query
parser did not.

* [bindgen] Enable support for collections in the `Mixed` data type (#7392)

* Adapt to breaking change.

* Expose APIs for flat collections in Mixed and add preparation for nested.

* Add and expose helper for getting the data type.

* Expose a data type enum that includes non-primitives.

The JS SDK needs this for checking types of Mixed.

* Update casting of enum constants.

* Replace access of 'm_type' with use of the 'int()' operator overload.

* Expose APIs for setting nested lists in Mixed.

* Expose APIs for setting nested dictionaries in Mixed.

* Expose APIs for getting nested lists in Mixed.

* Expose APIs for getting nested dictionaries in Mixed.

* Expose get_obj() on List and Set.

* Expose method for getting element type in List.

* Remove the need for element type helpers.

The JS SDK has managed to use sentinel values in the generated bindings instead.

* Remove unused header.

* Avoid doing unneeded logger work in Replication

Most of the replication log statements do some work including memory
allocations which are then thrown away if the log level is too high, so always
check the log level first. A few places don't actually benefit from this, but
it's easier to consistently check the log level every time.

* Prepare release

* RCORE-1990 Add X86 Windows Release builder to evergreen (#7383)

* Use the bfd linker in the armv7 toolchain (#7406)

* Fix several crashes in the object store benchmarks (#7403)

* add new index related benchmarks (#7401)

* Use updated curl on evergreen windows hosts (#7409)

* comment test not working + still missing handling for nested collections

* handle collection array for mixed types

* lint

* please windows builder warnings + x86

* proposed fix for 32-bit

* moved call outside assert macro

* Fix for 32 bit archs for encoded Arrays (#7427)

* tentative fix for 32 bit archs
* removed wrong cast to size_t from bf_iterator::set_value()
* fix inverted condition in 'unsigned_to_num_bits'
* fix inverted condition in 'unsigned_to_num_bits'

---------

Co-authored-by: Nicola Cabiddu <[email protected]>

* lint

* Revert "lint"

This reverts commit 7ac0073.

* lint

---------

Co-authored-by: Jørgen Edelbo <[email protected]>
Co-authored-by: James Stone <[email protected]>
Co-authored-by: Thomas Goyne <[email protected]>
Co-authored-by: Nikola Irinchev <[email protected]>
Co-authored-by: Claus Rørbech <[email protected]>
Co-authored-by: Kenneth Geisshirt <[email protected]>
Co-authored-by: Jonathan Reams <[email protected]>
Co-authored-by: Thomas Goyne <[email protected]>
Co-authored-by: Lee Maguire <[email protected]>
Co-authored-by: LJ <[email protected]>
Co-authored-by: Yavor Georgiev <[email protected]>
Co-authored-by: Finn Schiermer Andersen <[email protected]>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 21, 2024