Empty datetime parsing issue #2310

Closed
Levatius opened this issue Apr 5, 2018 · 4 comments · Fixed by #2891
Labels
kind/bug Something is broken.

Comments

@Levatius

Levatius commented Apr 5, 2018

Version: v1.0.4

Observed problem:
We suspect it is somehow possible to insert an empty string ("") as a datetime but have been unable to replicate it.
The reason we think this is due to what we see here:
[Screenshot: "capture"]

The _email_address_hash_last_checked predicate is of type datetime and is indexed by hour; the schema alter that sets this up occurs before any data is added to the graph.

When trying to query this particular field as above, we get the error:

: rpc error: code = Unknown desc = parsing time "" as "2006": cannot parse "" as "2006"

Attempting to replicate the issue, mutating an empty string onto such a predicate gives us the error:

parsing time "" as "2006": cannot parse "" as "2006"
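For reference, this message has the same shape as the error Go's standard `time.Parse` returns when it is given an empty string with the year-only layout `"2006"`. Below is a minimal sketch that reproduces it outside the database. It assumes, without confirmation from this thread, that the server's last parse attempt uses a year-only layout. `parseYear` is a hypothetical name used only for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// parseYear is a hypothetical stand-in for a parser whose final attempt
// uses the year-only layout "2006" (an assumption, not the server's code).
func parseYear(s string) (time.Time, error) {
	return time.Parse("2006", s)
}

func main() {
	// An empty string has no digits to match the year, so parsing fails.
	_, err := parseYear("")
	fmt.Println(err)
}
```

Running this prints the same `parsing time "" as "2006"` text seen above, which suggests the empty value reaches a plain Go time parse unchanged.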

The question then becomes: how did an empty string enter the graph as a datetime if it is not possible to mutate it in?

The only way I can see this happening is if the schema definition for the predicate is lost at some point, allowing an empty string to be inserted before the definition is restored*.

*At the start of every processing task, we check whether any definitions are missing and add them to the schema if so.

@pawanrawal
Contributor

This can also happen if the predicate initially had default type and an empty value was inserted. If you then run a schema mutation, we skip errors while indexing. We should only skip them for predicates that have a lang tag (for full-text search); all other errors should be returned to the user.

This could be related to predicate move somehow. Do you verify the exact schema for a predicate or only whether the key for the predicate is returned before doing a schema mutation?

@Levatius
Author

> This can also happen if the predicate initially had default type and an empty value was inserted.

Indeed. However, we do not think that is what happened here: we hit the error well into the test, by which point the predicate in question had already accepted many non-empty values as properly indexed datetimes.

> Do you verify the exact schema for a predicate or only whether the key for the predicate is returned before doing a schema mutation?

Only whether the key exists in the schema:

  • If it does, we do not attempt to re-add the key.
  • If it does not, we then make a schema alteration to add it with its appropriate index.
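The difference between the two kinds of check can be sketched as follows. This is only an illustration: `PredicateSchema` and `needsAlter` are hypothetical names, and the map stands in for whatever a schema query returns. A key-only check would miss a predicate that is present but has fallen back to the default type, whereas comparing the full definition would catch it.

```go
package main

import "fmt"

// PredicateSchema is a hypothetical, simplified view of one schema entry.
type PredicateSchema struct {
	Type      string // e.g. "datetime" or "default"
	Index     bool
	Tokenizer string // e.g. "hour"
}

// needsAlter reports whether a schema alteration should be issued.
// It checks the whole definition, not just whether the key is present.
func needsAlter(current map[string]PredicateSchema, name string, want PredicateSchema) bool {
	got, ok := current[name]
	if !ok {
		return true // key missing: the case the key-only check already handles
	}
	return got != want // key present but definition differs
}

func main() {
	want := PredicateSchema{Type: "datetime", Index: true, Tokenizer: "hour"}

	// The key exists, but with the default type and no index.
	current := map[string]PredicateSchema{
		"_email_address_hash_last_checked": {Type: "default"},
	}

	fmt.Println(needsAlter(current, "_email_address_hash_last_checked", want)) // true
}
```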

It might be worth noting that an empty value in this particular predicate is even stranger, because the value is not imported from the original dataset. It is generated by Python's datetime.now() function, which we use to record the last time we modified the node, so we would never expect it to produce an empty value.

@danielmai danielmai added investigate Requires further investigation and removed improvement labels Oct 18, 2018
@manishrjain manishrjain assigned srfrog and unassigned gitlw Nov 26, 2018
@gitlw

gitlw commented Jan 3, 2019

@Levatius Is it ok for you to share your tests so that we can try to reproduce this error?

@srfrog srfrog added kind/bug Something is broken. and removed investigate Requires further investigation labels Jan 12, 2019
@srfrog
Contributor

srfrog commented Jan 12, 2019

Steps to reproduce:

  1. Set some correct datetime data and a few empty strings. No schema or string predicate.
{
  "set": [
    {
      "created": "2016-01-15T0:00:00.000Z",
      "name": "first"
    },
    {
      "created": "2017-01-17T0:00:00.000Z",
      "name": "second"
    },
    {
      "created": "",
      "name": "third"
    }
  ]
}
  2. Try to alter predicate to: created: datetime @index(year) .
  3. Fails

The problem is that "" will never parse to a Go time.Time. We need to convert "" to the zero time value.
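A minimal sketch of that idea, assuming RFC 3339 input for non-empty values. `parseTimeOrZero` is a hypothetical helper written for illustration and is not necessarily how the actual fix is implemented.

```go
package main

import (
	"fmt"
	"time"
)

// parseTimeOrZero is a hypothetical helper: it maps the empty string to the
// zero time.Time instead of returning a parse error. Non-empty input is
// parsed as RFC 3339 (an assumption made for this sketch).
func parseTimeOrZero(s string) (time.Time, error) {
	if s == "" {
		return time.Time{}, nil
	}
	return time.Parse(time.RFC3339, s)
}

func main() {
	// The empty string becomes the zero time and no error is reported.
	t, err := parseTimeOrZero("")
	fmt.Println(t.IsZero(), err)

	// A well-formed value still parses normally.
	t, err = parseTimeOrZero("2016-01-15T00:00:00.000Z")
	fmt.Println(t.Year(), err)
}
```

Callers can then use `IsZero()` to tell "no value" apart from a real timestamp, so indexing does not have to abort on the empty entry.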
