Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase limits for overflow in formatting #713

Merged
merged 4 commits into from
Sep 18, 2019
Merged

Conversation

bluss
Copy link
Member

@bluss bluss commented Sep 15, 2019

Use a total element count limit, before abbreviating the format output.
This means "small" arrays are always printed in full.

Use separate limits for the last axis, the next to last axis and the
other axes before that.

The last two axes are printed like a matrix of elements which is why
they show more elements.

The other axes are "stacked" and printed by repeating the inner two axes
(like stacked 2D arrays). For this reason they are harder to read
already, and we use a lower limit.

Limits:

  • Array is printed in full below 500 elements
  • Max row length otherwise: 11
  • Max column length otherwise: 11
  • Max other axis length otherwise: 6.

These are further fixes on top of #705

Future features can be added to make the limits user configurable, but it is not
a priority to do that before the next release.

@bluss
Copy link
Member Author

bluss commented Sep 15, 2019

What's everyone's thoughts on the limits? It's tweaked but similar to jturner's idea in #705

@bluss
Copy link
Member Author

bluss commented Sep 15, 2019

This should be a draft PR, because tests and rustfmt remain to be updated. No longer the case.

@bluss bluss force-pushed the fmt-ellipsize-limit branch 3 times, most recently from 21b77e2 to d084f95 Compare September 16, 2019 16:58
Use a total element count limit, before abbreviating the format output.
This means "small" arrays are always printed in full.

Use separate limits for the last axis, the next to last axis and the
other axes before that.

The last two axes are printed like a matrix of elements which is why
they show more elements.

The other axes are "stacked" and printed by repeating the inner two axes
(like stacked 2D arrays). For this reason they are harder to read
already, and we use a lower limit.
@bluss bluss force-pushed the fmt-ellipsize-limit branch from d084f95 to ddfb349 Compare September 16, 2019 17:00
@bluss bluss mentioned this pull request Sep 16, 2019
@LukeMathWalker
Copy link
Member

What's everyone's thoughts on the limits? It's tweaked but similar to jturner's idea in #705

It looks good to me - the only one I am slightly worried about is printing the array in full if we have less than 500 elements.
Consider printing a 2D array with shape (1, 499) in a terminal... Not so nice.
Should we perhaps consider a lower number (e.g. 100), given that we are allowing the user to print the array in full anyway using # if they want to?

@bluss
Copy link
Member Author

bluss commented Sep 17, 2019

numpy thinks it is nice to print up to 1000 elements, so I'm not sure where to draw the line. Although of course, they line wrap if the lines get long. On the outset I think it doesn't matter if it's 100 or 500 - it will be longer than current linewidth anyway.

We just want to hit some number so that all the arrays that the users typically expect to see in full, are visible in full by default. The increased limit should help with that, and 100 in element count would be too small for my expectations at least. I could see that using jturner's idea of having a limit for the longest dimension (of 100 or 200) could work.

This is a slight de-bloating of this function, by converting to a dyn
dimensioned array before entering the main code. This saves a lot of code size,
even though there is more than can be done.

This also allows setting the top level parameters in only one place
(starting at depth 0 etc).
@bluss bluss force-pushed the fmt-ellipsize-limit branch from 5c97049 to 4fe6e07 Compare September 18, 2019 16:01
@bluss
Copy link
Member Author

bluss commented Sep 18, 2019

I used cargo bloat together with an example file to look at the reduction of code size for the last two commits. If someone is interested in formatting performance, maybe serialization is the right place to look then, but otherwise we can come back to optimizing it.

I wouldn't mind changing the rules or tweaking the numbers later. If we expose formatting parameters to users, it might be a good time to make the rules simple so that it uses few params.

@bluss bluss merged commit dad7c4a into master Sep 18, 2019
@bluss bluss deleted the fmt-ellipsize-limit branch September 18, 2019 17:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants