Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DeepChangeChecker: follow links in Mixed columns #4828

Merged
merged 8 commits into from
Jul 30, 2021

Conversation

fealebenpae
Copy link
Member

@fealebenpae fealebenpae commented Jul 27, 2021

What, How & Why?

When checking outgoing links the DeepChangeChecker always assumes that the column containing the outgoing link is always a Link column, but it's now also possible for this to be a Mixed columns with a value containing a TypedLink.

Fixes #4803

☑️ ToDos

  • 📝 Changelog update
  • 🚦 Tests (or not relevant)

@fealebenpae fealebenpae requested review from tgoyne and ironage July 27, 2021 17:10
@fealebenpae fealebenpae self-assigned this Jul 27, 2021
Comment on lines 286 to 289
Mixed value = obj.get<Mixed>(outgoing_link_column);
REALM_ASSERT(value.get_type() == type_TypedLink);
dst_key = value.get_link().get_obj_key();
dst_table = table.get_parent_group()->get_table(value.get_link().get_table_key());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will crash if the Mixed column contains both links and non-links. I think the correct fix for this is just if (outgoing_link_column.get_type() == col_type_Mixed) { return do_check_mixed_for_link(...); }

bool did_run_section = false;

{
size_t col_ndx = GENERATE(1, 2, 3);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these tests are mostly the same as before, except that they use catch2's GENERATE to duplicate everything for each link column in the schema so that we can cover mixed as well

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat, I didn't know that was a thing. It's nicer than injecting a template type just for this.

_impl::CollectionKeyPathChangeChecker checker(info, *table, related_tables, key_path_array_five_levels,
false);
CHECK_THROWS(checker(19));
CHECK_THROWS(checker(18));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A functional change in the new test code is that calls like these for an object which has null links along the path to check for no longer throws an invalid object exception. I'm not sure why we would want to throw in that case so I changed it not to.

@@ -439,43 +464,37 @@ void CollectionKeyPathChangeChecker::find_changed_columns(std::vector<int64_t>&
else if (column_key.is_set()) {
if (column_type == col_type_Mixed) {
auto set = object.get_set<Mixed>(column_key);
for (size_t i = 0; i < set.size(); i++) {
auto target_object = set.get(i);
for (auto it = set.begin(); it != set.end(); ++it) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the changes to use iterators in this section is just an optimization

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use range-based for here?

@ironage ironage requested a review from jedelbo July 27, 2021 23:38
Comment on lines 279 to 292
ConstTableRef dst_table;
ObjKey dst_key;
if (outgoing_link_column.get_type() == col_type_Link) {
dst_table = table.get_link_target(outgoing_link_column);
dst_key = obj.get<ObjKey>(outgoing_link_column);
}
else if (outgoing_link_column.get_type() == col_type_Mixed) {
TableRef no_cached;
Mixed value = obj.get<Mixed>(outgoing_link_column);
return do_check_mixed_for_link(*table.get_parent_group(), no_cached, value, filtered_columns, depth);
}
else {
REALM_UNREACHABLE();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This flow is pretty weird now. It declares variables, populates them in one branch of the if, but then doesn't use them in the other branch (and returns so the fact that they aren't used isn't important either). I think it'd be clearer with just if (outgoing_link_column.get_type() == col_type_Mixed) { ... } REALM_ASSERT(outgoing_link_column.get_type() == col_type_Link); and then all the non-mixed logic after the check for Mixed.

@@ -439,43 +464,37 @@ void CollectionKeyPathChangeChecker::find_changed_columns(std::vector<int64_t>&
else if (column_key.is_set()) {
if (column_type == col_type_Mixed) {
auto set = object.get_set<Mixed>(column_key);
for (size_t i = 0; i < set.size(); i++) {
auto target_object = set.get(i);
for (auto it = set.begin(); it != set.end(); ++it) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use range-based for here?

@@ -291,6 +305,10 @@ bool DeepChangeChecker::check_row(Table const& table, ObjKeyType object_key,
{
TableKey table_key = table.get_key();

if (ObjKey(object_key).is_unresolved()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would we get an unresolved key here? Each of the callers is already checking for that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way is to call the checker directly with an unresolved key. This shouldn't happen in from any of the notifiers in production, but some of the new test code I added calls the checker for all objects under test, including an unresolved one. I felt it was better to handle it here explicitly rather than not test it. I could add a comment about this, unless you'd prefer we not do it this way. I could also move this up a level if it detracts from the intended design.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very much a personal opinion thing, but I'd prefer an assert that the key isn't unresolved.

The general idea is that the more that a function's inputs are constrained, the easier it is to modify that function in the future. It's hard to imagine how it would happen in this specific case, but suppose that a future change made properly handling unresolved links here a lot more complicated. A future maintainer could then invest a bunch of effort into some code which isn't actually reachable in normal usage, or alternatively have to spend a while trying to figure out how this function could get an unresolved link so that they could solve the problem at a different layer. Asserting that the link isn't unresolved instead tells anyone reading this code that they don't have to consider that possibility, while still making it likely that changes in other places which accidentally result in this function being passed an unresolved link will be caught.

Even better would be to shove this off to the type system and have different types for possibly unresolved or null ObjKeys and definitely valid ObjKeys so that we could statically check that they're used properly, but that's also a lot more work than just slapping in some asserts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the assert because I think that's the clearest option and could help us catch other mistakes. To accommodate the test code, I move the check up to the top level calls with a comment explaining why it is needed. Unresolved keys were intended to be a hidden detail and having to deal with them at this level is not ideal, so hopefully this is an acceptable compromise until we find a way to improve that situation.

bool did_run_section = false;

{
size_t col_ndx = GENERATE(1, 2, 3);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat, I didn't know that was a thing. It's nicer than injecting a template type just for this.

@ironage ironage requested a review from tgoyne July 29, 2021 19:50
@jedelbo jedelbo force-pushed the yg/links-in-mixed-notifications branch from 40f453a to 62cdcba Compare July 30, 2021 08:31
@ironage ironage merged commit be03b65 into master Jul 30, 2021
@ironage ironage deleted the yg/links-in-mixed-notifications branch July 30, 2021 16:42
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 22, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

crash N5realm15InvalidTableRefE
4 participants