Search still returns previous versions of objects for a while after they have been updated #324

Closed
ferdi-ritense opened this issue Mar 22, 2023 · 13 comments · Fixed by #444
Labels: approved, bug, enhancement, owner: den haag

Comments

ferdi-ritense commented Mar 22, 2023

We ran into an issue where, after we change a property, the previous version of the object still gets returned when searching for the old value of the property.

For example, we have an object with a nested property straat.

{
    "adres": {
        "straat": "Bospad"
    }
}

The object gets returned, as expected, with this query: data_attrs=adres__straat__exact__Bospad

Now we change the straat to a new value. Version 2 of the object now exists with this new value.

{
    "adres": {
        "straat": "Dorpsstraat"
    }
}

When we search data_attrs=adres__straat__exact__Dorpsstraat it behaves as expected. Version 2 of the object is returned.

However, when we search for the old value, data_attrs=adres__straat__exact__Bospad, version 1 is still returned, but only for a few hours. When we run the same query the next day, it no longer yields any results, which is what we would have expected from the start.

There seems to be some form of caching going on somewhere in the Objects API. We have reproduced this on a number of environments running versions 2.1.0 (v2) and 2.1.1 (v2). There is no caching outside of the Objects API in these environments.
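
For anyone trying to reproduce this, the queries above can be generated with a small helper. This is only a sketch: the function names `data_attrs_filter` and `build_query` are made up for illustration, and it assumes that conditions are joined with commas in a single data_attrs parameter, as in the queries quoted in this thread.

```python
from urllib.parse import urlencode


def data_attrs_filter(path, operator, value):
    """Build one condition such as adres__straat__exact__Bospad."""
    return "__".join([*path, operator, str(value)])


def build_query(*conditions, **extra_params):
    """Join conditions with commas and append any other query parameters."""
    params = {"data_attrs": ",".join(conditions), **extra_params}
    # Keep commas unescaped so the output matches the queries in this issue.
    return urlencode(params, safe=",")


old_value_query = build_query(data_attrs_filter(["adres", "straat"], "exact", "Bospad"))
new_value_query = build_query(data_attrs_filter(["adres", "straat"], "exact", "Dorpsstraat"))

print(old_value_query)  # data_attrs=adres__straat__exact__Bospad
print(new_value_query)  # data_attrs=adres__straat__exact__Dorpsstraat
```

The resulting string is meant to be appended to the objects list endpoint of the environment under test; the endpoint URL itself is left out here because it differs per installation.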

@alextreme alextreme self-assigned this Mar 23, 2023
alextreme added a commit that referenced this issue Mar 23, 2023
@alextreme
Member

Could you provide your PUT body to modify the object?

I've added two tests in #325 to attempt to reproduce the issue ( https://github.com/maykinmedia/objects-api/pull/325/files ) but this seems to work as expected on master. I'll double-check v2.1.1

@ferdi-ritense
Author

In this example we modified the object from the Objects API admin interface.

@alextreme
Member

Investigated further after a hunch from Ivo; the PR seems to solve the problem (Joeri is having a look).

@joeribekker
Member

joeribekker commented Apr 3, 2023

Made a deep dive into this. The "problem" here is the data filter.

The regular order of things goes like this:

  1. Filter records (if no date provided, it searches on valid records for today)
  2. If multiple records match, do conflict resolution.

Assuming we have an Object O with 2 records: Record A with prop=K and a newer record B with prop=L. In essence, we have an object with some property with value K and we updated it to say the property is now L.
Record A started on t1 and ended on t2. At date t2, record B starts (without any end date).

Object O records:
|-- A(prop=K) --|
                |-- B(prop=L) --
t1              t2

Which record is valid on date t2? In this case that would be record B.

  • Rule 1: Both records match
  • Rule 2: Record B is newer and therefore wins the "conflict"
  • Result: Record B, which is correct

If you wonder why record A matched at all: imagine that record B didn't exist. Should record A then be returned when searching on date t2? Yes, it should, so that's your answer.

Anyway, now let's repeat the same query "Which record is valid on date t2" but also with the filter prop=K.

  • Rule 1: Only record A matches date t2 and prop=K.
  • Rule 2: Conflict resolution not needed
  • Result: Record A, which is undesired.

So, we probably need to adjust the rules:

  1. Filter records on date/time (if no date provided, it searches on valid records for today)
  2. If multiple records match, do conflict resolution.
  3. Filter on data

If we now follow the rules, the result would be:

  • Rule 1: Date filter says record A and record B both match date t2.
  • Rule 2: Conflict resolution says record B
  • Rule 3: Data filter says no matches
  • Result: no matches
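
To make the difference between the two rule orders concrete, here is a minimal sketch that models Object O in plain Python. It is not the actual Objects API code: the `Record` class, the function names and the concrete dates chosen for t1 and t2 are all assumptions for illustration. The only behaviour taken from the description above is that a record ending on t2 still counts as valid on t2, and that the newest record wins a conflict.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Record:
    name: str
    prop: str
    start_at: date
    end_at: Optional[date]  # None means the record has no end date


def valid_on(record, day):
    # Inclusive end date: a record that ends on `day` is still valid on `day`.
    return record.start_at <= day and (record.end_at is None or record.end_at >= day)


def resolve_conflict(records):
    # The newest record (latest start date) wins; None if nothing matched.
    return max(records, key=lambda r: r.start_at, default=None)


def search_current(records, day, prop):
    # Current order: date and data filter together, then conflict resolution.
    matches = [r for r in records if valid_on(r, day) and r.prop == prop]
    return resolve_conflict(matches)


def search_adjusted(records, day, prop):
    # Adjusted order: date filter, conflict resolution, then data filter.
    winner = resolve_conflict([r for r in records if valid_on(r, day)])
    return winner if winner is not None and winner.prop == prop else None


t1, t2 = date(2023, 3, 21), date(2023, 3, 22)  # arbitrary example dates
record_a = Record("A", "K", start_at=t1, end_at=t2)
record_b = Record("B", "L", start_at=t2, end_at=None)
object_o = [record_a, record_b]

print(search_current(object_o, t2, "K"))   # record A (the undesired result)
print(search_adjusted(object_o, t2, "K"))  # None (no matches)
print(search_adjusted(object_o, t2, "L"))  # record B
```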

@joeribekker joeribekker added the bug and enhancement labels May 19, 2023
@joeribekker joeribekker added this to the Release 3.0.0 milestone May 19, 2023
@joeribekker
Member

I decided to cut down on the magic (read: simpler rules for understanding how it works, at the cost of a slight functional change that can cause confusion but is easily cleared up with documentation).

Thus, the suggested change in the diff in the PR below should be the final approach (replace gte with gt):

https://github.com/maykinmedia/objects-api/pull/325/files#diff-12c96a9f3455fec44a32b397a7e60060ded8488b6bc13c9dcf7c497c4604f68d
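
As a rough illustration of what swapping gte for gt does to the date filter, the sketch below compares an inclusive and an exclusive end date. The function `valid_on` and its `inclusive_end` flag are hypothetical and only model the comparison; the real change is the one in the linked diff.

```python
from datetime import date
from typing import Optional


def valid_on(start_at: date, end_at: Optional[date], day: date, inclusive_end: bool) -> bool:
    """Return True if a record spanning start_at..end_at is valid on `day`."""
    if start_at > day:
        return False
    if end_at is None:
        return True  # an open-ended record stays valid
    # inclusive_end=True models gte, inclusive_end=False models gt.
    return end_at >= day if inclusive_end else end_at > day


t1, t2 = date(2023, 3, 21), date(2023, 3, 22)  # arbitrary example dates

# Record A (t1..t2) queried on t2:
print(valid_on(t1, t2, t2, inclusive_end=True))   # True: A still matches with gte
print(valid_on(t1, t2, t2, inclusive_end=False))  # False: A no longer matches with gt

# Record B (t2.., no end date) queried on t2 matches either way:
print(valid_on(t2, None, t2, inclusive_end=False))  # True
```

With the exclusive comparison, record A drops out of the date filter on t2, so a search for prop=K no longer finds it. In this model a record that starts and ends on the same day never matches any date, which is the kind of functional change that would need documenting.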

The PR still needs some work because it also changes CI stuff that is no longer needed, I think. Because it is a breaking change, we need to make this part of the next major release (3.0.0) and not of 2.x.

@flinden68

We also hit this bug. Is there a timetable for when the related PR will be available in a release?

@joeribekker
Member

@flinden68 nope, but please explain your exact use case.

If it concerns multiple records that were created on the same day, it might need to be a correction.

@flinden68

We have an object in the Objects API where we update a property status to 'ingediend'. When we then query for all objects where the status is 'open', the results still contain the object whose status we just updated to 'ingediend', which should not be in the query results.

@joeribekker
Member

@flinden68 can you add all the records and your query to this issue?

@flinden68

We do the query &data_attrs=identificatie__type__exact__bsn,identificatie__value__exact__xxxx,status__exact__open&ordering=-record__startAt.

I cannot provide the records at the moment.

@joeribekker
Member

This ticket can be picked up further under Den Haag. I added "triage" because I want this investigated more. I also don't think this should be a major release anymore. It changes fuzzy behaviour to proper behaviour.

@PeterVanBragt

PeterVanBragt commented Aug 15, 2024

@joeribekker When is this bug expected to be fixed? Also asking on behalf of Wendy van Duijvenvoorde (Klantportaal).

@alextreme
Member

Discussed with Anna:

  • There are problems with objects that are created and get a new version on the same day. As Joeri mentioned, this can be done with corrections, but this would have to be filtered out via the data_attrs
  • Modifying the way this works means that the default ordering/filtering will also be adjusted
  • Discussed adding a query parameter to only show the latest versions of objects (or a feature flag / separate endpoint?)

This requires further discussion with @joeribekker after his holiday due to the impact

@joeribekker joeribekker removed the triage label Sep 6, 2024
@joeribekker joeribekker removed this from the Release 3.0.0 milestone Sep 6, 2024
annashamray added a commit that referenced this issue Sep 6, 2024
  * filter on date,
  * group records by object and keep records with max index
  * filter on rest of query params
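
The three steps listed in this commit message can be sketched as follows. This is an illustration in plain Python, not the code from the commit: the dictionary layout, the `search` function and the example dates are assumptions, and the status values are borrowed from the 'open' / 'ingediend' case reported earlier in this thread.

```python
from datetime import date


def search(records, day, **data_filters):
    # Step 1: filter on date (a record ending on `day` still counts as valid).
    valid = [
        r for r in records
        if r["start_at"] <= day and (r["end_at"] is None or r["end_at"] >= day)
    ]

    # Step 2: group records by object and keep the record with the max index.
    latest = {}
    for r in valid:
        current = latest.get(r["object"])
        if current is None or r["index"] > current["index"]:
            latest[r["object"]] = r

    # Step 3: filter on the rest of the query params (here: data attributes).
    return [
        r for r in latest.values()
        if all(r["data"].get(key) == value for key, value in data_filters.items())
    ]


today = date(2024, 9, 6)  # arbitrary example date
records = [
    # Object 1 was updated from 'open' to 'ingediend' on the same day.
    {"object": 1, "index": 1, "data": {"status": "open"}, "start_at": today, "end_at": today},
    {"object": 1, "index": 2, "data": {"status": "ingediend"}, "start_at": today, "end_at": None},
    # Object 2 is still open.
    {"object": 2, "index": 1, "data": {"status": "open"}, "start_at": date(2024, 9, 1), "end_at": None},
]

open_objects = [r["object"] for r in search(records, today, status="open")]
print(open_objects)  # [2]
```

Because the data filter only sees the latest valid record per object, the outdated 'open' record of object 1 can no longer surface in the results.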
@github-project-automation github-project-automation bot moved this from Implemented to Done in Data en API fundament Sep 18, 2024
6 participants