Zero-indexed array indices are ignored #31

Closed
marshallward opened this issue Jul 22, 2016 · 19 comments
@marshallward
Owner

If we have a zero-indexed array, e.g.

&nml_group
    y(0:3) = 1, 2, 3, 4
/

then the parser incorrectly interprets the 0 as None (due to a bug in FIndex) and falls back to the default start index of 1. The case above then fails because there are four values but only three indices remaining from 1, so the iterator runs out of indices.
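The failure mode is the classic Python pitfall of truth-testing an integer that can legitimately be zero. A minimal sketch of the pattern (illustrative, not f90nml's actual code):

```python
def start_index(parsed_index, default=1):
    # BUG: `if parsed_index` is False when the index is 0,
    # so a zero start index silently falls back to the default.
    if parsed_index:
        return parsed_index
    return default

def start_index_fixed(parsed_index, default=1):
    # FIX: test against None explicitly so 0 is a valid index.
    if parsed_index is not None:
        return parsed_index
    return default

print(start_index(0))        # 1  (wrong: 0 was a real index)
print(start_index_fixed(0))  # 0  (correct)
```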

@marshallward
Owner Author

Further bugs downstream in append_value, which seems to be assuming 1-based indexing:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "f90nml/__init__.py", line 50, in read
    return parser.read(nml_path)
  File "f90nml/parser.py", line 144, in read
    g_vars, patch_nml=grp_patch)
  File "f90nml/parser.py", line 308, in parse_variable
    self.append_value(v_values, next_value, v_idx, n_vals)
  File "f90nml/parser.py", line 486, in append_value
    v_tmp[v_i[-1] - 1] = next_value
IndexError: list assignment index out of range

Amazing that this was never tested before...

@marshallward
Owner Author

marshallward commented Jul 22, 2016

There seem to be more fundamental problems related to the starting index. Nearly everything in the current implementation assumes a one-indexed array. For example:

&nml
    y(3:5) = 1, 2, 3
/

will create this array:

y = [None, None, 1, 2, 3]

and save the output as

&nml
    y = , , 1, 2, 3
/

which could break if the Fortran array was not originally 1-indexed.

We need to include explicit array bounds, and track the original indices of data, when reading indexable arrays.

@marshallward
Owner Author

As of d1df701 this now appears to be fixed.

@jacobwilliams

Does this relate to #24 also? If I read a namelist like:

    a%b%c(2) = 2,
    a%b%c(3) = 3,

and then write it again, what will it print?

@jacobwilliams

It seems like this change breaks the ability to access variables using the indices given in the namelist? So, for my example above:

import f90nml
n = f90nml.read('test.nml')
print(n)
print(n['nml_group']['a']['b']['c'][1])
print(n['nml_group']['a']['b']['c'][2])

The previous version would print

2
3

Whereas the current version gives:

3
Traceback (most recent call last):
  File "./test.py", line 10, in <module>
    print(n['nml_group']['a']['b']['c'][2])
IndexError: list index out of range

Additionally, if we now write to another format like JSON, we lose the indices:

print(json.dumps(n,indent=4))

gives:

{
    "nml_group": {
        "a": {
            "b": {
                "c": [
                    2, 
                    3
                ]
            }
        }
    }
}

whereas the previous version would give:

{
    "nml_group": {
        "a": {
            "b": {
                "c": [
                    null, 
                    2, 
                    3
                ]
            }
        }
    }
}

In my case, the indices in the file are significant, and I want to preserve them. Is there a way to have the old behavior be an option?

@marshallward
Owner Author

I think a read().write() previously would have produced this:

a%b%c = , 2, 3

whereas now it produces this:

a%b%c(2:3) = 2, 3

which are both equivalent under Fortran -- assuming 1-indexing! If c was not 1-indexed, then there could be problems... or maybe the None values would keep things safe.

You're right that the Python output has changed, since [None, 2, 3] has been changed to [2, 3], with the starting index tracked inside the parent Namelist.

I feel like the newer one is more in line with the Fortran representation, and ultimately produces safer output, e.g. x(4) = 1 is better than x = , , , 1 or x(1:4) = , , , 1 since I cannot infer the x starting index.

But I just tested out a change to the default start index (using 1 instead of the first cited index of the vector) which restores the old behaviour, and I can add a property to the Parser (parser.assume_start_index = 1?) that makes this very painless. Would that help?

@marshallward
Owner Author

BTW I would probably use something like this:

self.first = [min(1, s) for s in self.start]

to prevent index errors caused by negative indexing (e.g. x(-2:2)). This would then put the onus on the user to know that -2 maps to 0 in Python.

However for x(3:5), 3 would map to 2 in Python, rather than 0. I'll just put the onus on the user to manage this situation.
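To make the index mapping concrete, here is a rough sketch of the offset arithmetic being discussed (the function name is illustrative, not part of f90nml's API):

```python
def python_offset(fortran_index, start_indices):
    # Clamp each start index at 1, as in `min(1, s)`, so negative
    # Fortran start indices like x(-2:2) still map into the list
    # without wrapping around via Python's negative indexing.
    first = [min(1, s) for s in start_indices]
    return [i - f for i, f in zip(fortran_index, first)]

# x(-2:2): Fortran index -2 maps to Python 0, and 2 maps to 4.
print(python_offset([-2], [-2]))  # [0]
print(python_offset([2], [-2]))   # [4]
# x(3:5): the start is clamped to 1, so Fortran index 3 maps to
# Python 2 rather than 0, as described above.
print(python_offset([3], [3]))    # [2]
```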

Also, I would probably want to turn this sort of behaviour off by default!

@marshallward
Owner Author

Re: #24
I think the issues are related. I had hoped to tackle both of them at the same time, but it seems like I really coded myself into a corner with how the vector index is written, and it became too difficult to sort the two issues out in my head.

There are some deeper problems with entanglement between arrays, derived types, and lists of derived types, and too many little checks to accommodate all cases (even ones that should not happen).

Unfortunately I think it will take a big refactoring to get #24 working well.

@marshallward marshallward reopened this Jul 25, 2016
@marshallward
Owner Author

marshallward commented Jul 25, 2016

Reopening this ticket since it is causing issues with some use cases (see the JSON output issue above).

@marshallward
Owner Author

I've added a property global_start_index to restore the old 1-based indexing. You can set it within the parser:

from f90nml import Parser
parser = Parser()
parser.global_start_index = 1
nml = parser.read("test.nml")

@jacobwilliams

Thanks! I think that will work for me. I'll test it soon.

@marshallward
Owner Author

Looks like I forgot to push this commit to github, tuturu...

Anyway I hope it works, I will probably do a PyPI update once the documentation has been updated.

@jacobwilliams

Terrific! It works for me. Thanks!
For the record:

>>> import json
>>> from f90nml import Parser
>>> p = Parser()
>>> p.global_start_index = 1
>>> n = p.read('test.nml')
>>> print(json.dumps(n,indent=4))
{
    "nml_group": {
        "a": {
            "b": {
                "c": [
                    null, 
                    2, 
                    3
                ]
            }
        }
    }
}
>>> print(n['nml_group']['a']['b']['c'][1])
2
>>> print(n['nml_group']['a']['b']['c'][2])
3

@marshallward
Owner Author

Finally documented, closing this!

@lstngr

lstngr commented Jun 6, 2023

I'm encountering a similar issue, which I assume is related to this ticket.
It looks like with multidimensional arrays (or negative indices?), some values are still parsed incorrectly.

In [21]: %cat tmp/input.F90
&MYLIST
myarr(-1,-1:3) = 0 1 2 3
myarr(0,-1:3) = 4 5 6 7
myarr(1,-1:3) = 8 9 10 11
/

In [22]: nml = f90nml.read("tmp/input.F90")

In [23]: nml
Out[23]:
Namelist([('mylist',
           Namelist([('myarr',
                      [[0, 4, 8, 6, 7],
                       [None, None, 9],
                       [None, None, 10],
                       [None, None, 11]])]))])

@marshallward
Owner Author

marshallward commented Jun 6, 2023

Thanks for reporting this. I can replicate it:

&list
    a(-1:3,-1) = 0, 4, 8, 6, 7
    a(-1:1,0) = , , 9
    a(-1:1,1) = , , 10
    a(-1:1,2) = , , 11
/

The starting index is correctly identified as (-1,-1) for all records, but it's incorrectly placing them. Something like this is happening at myarr(-1):

-> [[0, 1, 2, 3]]
-> [[0, 4, 5, 6, 7]]

and then [8, 9, 10, 11] is oddly placed correctly:

[[0, 4, 8, 6, 7]]
[[_, _, 9]]
[[_, _, 10]]
[[_, _, 11]]

In other words, the index order is swapped for the first two records, but is correct for the third record.

I've confirmed at least this much, and it's happening in append_value, but I'll need to keep looking to understand why.

@marshallward
Owner Author

I have a somewhat simpler (and more damning) example. The following namelists:

&nml2
    a(2,-1:3) = 0 1 2 3
/

&nml1
    a(1,-1:3) = 0 1 2 3
/

&nml0
    a(0,-1:3) = 0 1 2 3
/

&nml_m1
    a(-1,-1:3) = 0 1 2 3
/

return very different outputs:

&nml2
    a(2,-1) = 0
    a(2,0) = 1
    a(2,1) = 2
    a(2,2) = 3
/

&nml1
    a(1,-1) = 0
    a(1,0) = 1
    a(1,1) = 2
    a(1,2) = 3
/

&nml0
    a(0:3,-1) = 0, 1, 2, 3
/

&nml_m1
    a(-1:2,-1) = 0, 1, 2, 3
/

If the starting index is below 1, then the contents are somehow reversed, thus producing the odd output from the original example.

This could be related to the global_start_index or default_start_index values, which are perhaps not properly updated for multidimensional arrays. I will continue to investigate.

@marshallward
Copy link
Owner Author

marshallward commented Jun 6, 2023

The issue was - once again - an irresponsible use of if-testing on integers. In fact, 0 and -1 are the only cases which fail here, and only on the outer ranks.

(a(-2,-1:3) = 0 1 2 3 works perfectly fine. The dangers of extrapolation I suppose...)

There were in fact two completely unrelated problems:

  1. a(0,:) was incorrectly setting its upper bound to None rather than 1. In other words, unbounded.

  2. a(-1,:) raised an error in the multidimensional iterator, which incorrectly concluded that -1 + 1 was None rather than 0, and thus immediately jumped to the next rank.
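Both failures come down to truth-testing integers that can legitimately be 0. A minimal sketch of the two patterns (illustrative names, not the actual f90nml code):

```python
def upper_bound_buggy(b):
    # BUG 1: an explicit upper bound of 0 is falsy, so it is
    # treated as None, i.e. unbounded.
    return b if b else None

def next_index_buggy(i):
    # BUG 2: for i = -1, `i + 1` is 0, which is falsy, so the
    # iterator wrongly reports "no next index" (None) and jumps
    # to the next rank.
    return (i + 1) or None

def next_index_fixed(i):
    # FIX: just return the integer; test for None at the caller.
    return i + 1

print(upper_bound_buggy(0))  # None  (wrong: should be 0)
print(next_index_buggy(-1))  # None  (wrong: should be 0)
print(next_index_fixed(-1))  # 0
```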

I have pushed a fix which resolves these issues. I will do a PyPI release after I have added sufficient testing (perhaps your example case).

@lstngr

lstngr commented Jun 7, 2023

Many thanks for the prompt investigation!
I pulled your fix and can attest it also solves these issues on my end.
