Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better way to handle NaT (not-a-time) #3414

Closed
seisman opened this issue Jun 1, 2020 · 7 comments · Fixed by #3415
Closed

Better way to handle NaT (not-a-time) #3414

seisman opened this issue Jun 1, 2020 · 7 comments · Fixed by #3415
Labels
bug Something isn't working

Comments

@seisman
Copy link
Member

seisman commented Jun 1, 2020

Description of the problem

Both Python numpy and Matlab can have a datetime called NaT, i.e., not-a-time. It's similar to NaN but for time. Not sure if Julia also has something similar.

GMT can't handle NaT. It's OK for command line, but not for external interfaces like PyGMT.

Currently, when a NaT string is passed to GMT, it reports an error and keeps running.

Unable to parse 1 datetime strings (ISO datetime format required)

We should at least change the error to warning.

What's worse is:

gmt/src/gmt_api.c

Lines 12769 to 12771 in 3003c50

for (row = 0; row < V->n_rows; row++) {
if (gmt_scanf (API->GMT, dt[row], GMT_IS_ABSTIME, &(t_vector[row])) == GMT_IS_NAN) n_bad++;
}

GMT converts the datetime strings as double values relative to the epoch time internally. In the codes above, if dt[row]="NaT", the function gmt_scanf() returns GMT_IS_NAN and the variable t_vector[row] is untouched. The value of t_vector[row] is 0, which means 1970-01-01T12:00:00. The plot is incorrect if we're plotting a calendar plot around 1970-01-01.

Ping @PaulWessel, @joa-quim and @weiji14.

@PaulWessel
Copy link
Member

So we need to change it to

 if (gmt_scanf (API->GMT, dt[row], GMT_IS_ABSTIME, &(t_vector[row])) == GMT_IS_NAN) {
     n_bad++; 
    t_vector[row] = API->GMT->session.d_NaN;
}

Pls try that. Alternatively, if you wish "NaT" string to be a valid entry that yields NaN, then a separate branch is needed:

if (strcmp (dt[row], "NaT") == 0)
       t_vector[row] = API->GMT->session.d_NaN;
else if (gmt_scanf ....
}

@joa-quim
Copy link
Member

joa-quim commented Jun 1, 2020

First option seem better. It will be visited only in case of failure while second introduces another if branch.

@seisman
Copy link
Member Author

seisman commented Jun 1, 2020

In the first option, NaT is converted to NaN, but we still see the warnings (n_bad++). I personally prefer the second option. Waiting for @weiji14's comments on this.

@seisman seisman added the bug Something isn't working label Jun 1, 2020
@weiji14
Copy link
Member

weiji14 commented Jun 1, 2020

Struggling with the C code here, no idea what the fuss of another if branch is about 😆. I say go with the first option? It's okay to have warnings printed out (as long as they're at the info/warning level, not error). I highly doubt that anyone actually stores NaT values literally as NaT in a text file, so we shouldn't have to check for them too literally.

@seisman
Copy link
Member Author

seisman commented Jun 1, 2020

OK. Go with the first option in PR #3415.

@joa-quim
Copy link
Member

joa-quim commented Jun 1, 2020

no idea what the fuss of another if branch is about

It's about branch prediction. For example loops with if branches cannot be vectorized. See first answer of
https://stackoverflow.com/questions/11227809/why-is-processing-a-sorted-array-faster-than-processing-an-unsorted-array

@weiji14
Copy link
Member

weiji14 commented Jun 1, 2020

Thanks for pointing me to that link (never saw such a highly upvoted stackoverflow question in my life!), was very educational. Definitely should go for the vectorizable option here, it sounds a bit like the EAFP style in Python, or Easier to Ask for Forgiveness than Permission, in contrast with LBYL or Look Before You Leap.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants