Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use PrettyTables.jl as HTML backend #3096

Merged
merged 33 commits into from
Sep 23, 2022
Merged

Conversation

ronisbr
Copy link
Member

@ronisbr ronisbr commented Jul 2, 2022

Hi!

I am submitting the initial version of the code that makes DataFrames.jl uses PrettyTables.jl as HTML backend.

I need features that are not released in PrettyTables.jl yet. Hence, I propose the following workflow:

  1. We discuss and adjust everything we need in this PR.
  2. Once we done, I tag the new version of PrettyTables.jl that already has breaking changes.
  3. I update this PR to make DataFrames use the new version of PrettyTables.jl.
  4. Only here we can merge this work.

@ronisbr
Copy link
Member Author

ronisbr commented Jul 2, 2022

I will post some screenshots obtained in Jupiter. It would be nice if someone can pass interesting cases to check if I missed something:

Captura de Tela 2022-07-02 às 19 24 21

Captura de Tela 2022-07-02 às 19 25 48

The alignment follows the same behavior of text output without the . alignment with numbers since it is not possible:

  1. Numbers are aligned to the right.
  2. Everything else (including the titles) are aligned to the left.

Captura de Tela 2022-07-02 às 19 31 38

The only part I am really not happy with is the header cell title. DataFrames uses the title option of HTML table to output the type without cropping. I needed to modify PrettyTables so that it can be add this option. However, I am not liking the solution. I will think if this can be improved.

Another thing we must discuss is how we will select the maximum number of rows and columns that will be rendered. @bkamins mentioned about using a env variable. In this PR, I just copied what we had before.

@ronisbr ronisbr marked this pull request as draft July 2, 2022 22:33
@bkamins bkamins added this to the 1.4 milestone Jul 2, 2022
@bkamins bkamins added the display label Jul 2, 2022
@bkamins
Copy link
Member

bkamins commented Jul 3, 2022

@ronisbr - to test this I understand we should:

  1. checkout master of PrettyTables.jl
  2. checkout ronisbr:html_backend of DataFrames.jl

?

@ronisbr
Copy link
Member Author

ronisbr commented Jul 3, 2022

Hi @bkamins !

Yes! Everything should work after those two steps.

@ronisbr
Copy link
Member Author

ronisbr commented Jul 3, 2022

By the way, I was thinking in changing the maximum number of rows and columns to get the values in ENV["DATAFRAMES_COLUMNS"] and ENV["DATAFRAMES_ROWS"]. What do you think?

@bkamins
Copy link
Member

bkamins commented Jul 3, 2022

I was thinking in changing the maximum number of rows and columns to get the values in ENV["DATAFRAMES_COLUMNS"] and ENV["DATAFRAMES_ROWS"]. What do you think?

This is exactly my thinking. And I would set some reasonable defaults when they are not present, but let us wait for @nalimilan to comment on this.

@ronisbr
Copy link
Member Author

ronisbr commented Jul 3, 2022

Ok! I can do this in this PR right now just to make it easier to test the cropping (it is somewhat different from what we have today and it was what lead to the most problems when adding features to PrettyTables :D). If we decided a different solution, I can remove it.

@bkamins
Copy link
Member

bkamins commented Jul 3, 2022

OK

@nalimilan
Copy link
Member

This is related to discussion at JuliaLang/IJulia.jl#1040. Currently we use displaysize to get the maximum size: any particular reason to change this? IJulia sets the COLUMNS environment variable, which is used by displaysize as a fallback, so if we use something else we won't respect that setting, right?

@bkamins
Copy link
Member

bkamins commented Jul 3, 2022

so if we use something else we won't respect that setting, right?

Right. The reason is as follows:

  1. default COLUMNS is too small for DataFrames.jl.
  2. increasing default COLUMNS is too large for other objects (matrices are one of the important problematic cases).

So, since DataFrames.jl is lower in precedence than Base Julia we cannot change the default COLUMNS but need to use something else. Note that currently DataFrames.jl users almost always change COLUMNS in Jupyter Notebook, because the default is not useful, but this destroys display of other objects.

@ronisbr
Copy link
Member Author

ronisbr commented Jul 3, 2022

One important comment: I could not test in Pluto! It seems that when we add data-frame class to the table and div it changes completely the output. However, the good side is that the new system seems to work without breaking :)

@ronisbr
Copy link
Member Author

ronisbr commented Jul 4, 2022

I wondered why I was taking almost 4 minutes to show the first DataFrame. However, it turns out the be a problem in Jupyter. It also happens with the old version. The interesting point is that this code:

df = DataFrame(rand(10, 10), :auto);
display(df)

takes almost 4s to run the first time. Whereas this code:

df = DataFrame(rand(10, 10), :auto);
show(stdout, MIME("text/html"), df)

takes less than 2s.

Given this, I am happy with the current state inside PrettyTables.jll, and I think I can remove the draft from this PR!

Of course I need to update the tests, but we need to decide somethings before that.

@ronisbr ronisbr marked this pull request as ready for review July 4, 2022 12:54
@nalimilan
Copy link
Member

What feels weird to me is that data frames and matrices are very similar objects when it comes to printing. In theory you could also scroll horizontally if the matrix doesn't fit the screen, and it would arguably be more useful than plain truncation. This probably doesn't apply to all objects, but it sounds like a more general issue than just tables. Maybe it should be called something like MAX_SCROLL_COLUMNS or MAX_COLUMNS?

@bkamins
Copy link
Member

bkamins commented Jul 5, 2022

What feels weird to me is that data frames and matrices are very similar objects when it comes to printing.

They conceptually are, but they are not in implementation, as matrices do not support "text/html" MIME.

Regarding the name then indeed something like SCROLL_COLUMNS would make sense (I think MAX can be dropped as COLUMNS does not have this prefix).

@bkamins
Copy link
Member

bkamins commented Jul 5, 2022

I could not test in Pluto.jl

You mean that this PR fails under Pluto.jl? I understand this is a general PrettyTables.jl issue - right?

@nalimilan
Copy link
Member

They conceptually are, but they are not in implementation, as matrices do not support "text/html" MIME.

Yeah, but shouldn't such a method exist ideally?

@bkamins
Copy link
Member

bkamins commented Jul 5, 2022

But it should be in Base Julia (same for LaTeX)

@ronisbr
Copy link
Member Author

ronisbr commented Jul 5, 2022

You mean that this PR fails under Pluto.jl? I understand this is a general PrettyTables.jl issue - right?

Actually no. Pluto.jl does some customization if the class in div and table is data-frame. Some of those customization overrides what we selected in this PR. It is not a bug, just a design choice from Pluto.jl's side.

@bkamins
Copy link
Member

bkamins commented Jul 5, 2022

@fonsp - could you please comment how you see this (I it would be good to keep this PR and Pluto.jl in sync). Thank you!

@bkamins
Copy link
Member

bkamins commented Jul 18, 2022

@ronisbr - fonsp/Pluto.jl#2206 was tagged, but I get no response from Pluto.jl maintainers what we should do. Do you think that the current design is "robust enough" that we can go ahead (i.e. the display in Pluto will be different, but it still will be nice and readable) or should we wait (i.e. things in Pluto.jl will not be nice so it is better to synchronize them before releasing)?

CC @fonsp

@ronisbr
Copy link
Member Author

ronisbr commented Jul 18, 2022

Hi @bkamins !

In my tests, I saw no problem. However, if you want, we can change the class name to something like data-frame-v2, leading to no customization at Pluto side.

@bkamins
Copy link
Member

bkamins commented Jul 18, 2022

Maybe you have a print screen what are the consequences of the current customization in Pluto?

@ronisbr
Copy link
Member Author

ronisbr commented Jul 20, 2022

Sure! I will post it tonight.

@ronisbr
Copy link
Member Author

ronisbr commented Jul 21, 2022

This is the default view (notice that this cropping is not added by PrettyTables.jl):

Captura de Tela 2022-07-21 às 09 29 13

I think Pluto sets :limit to false. Hence, if we keeping pressing more it eventually shows all the DataFrame.

Another thing is that the column alignment is not the same as selected by PrettyTables.jl. We are aligning numbers to the right (to keep consistency with text backend), but here the are aligned to the left.

@bkamins
Copy link
Member

bkamins commented Jul 21, 2022

So it seems Pluto.jl just has a custom way of showing things end-2-end, so probably we can just let them do what they want. I understand current settings (i.e. using the same class name as they use) achieves this?

@ronisbr
Copy link
Member Author

ronisbr commented Jul 21, 2022

I understand current settings (i.e. using the same class name as they use) achieves this?

Yes, it seems it overrides everything. However, I will keep testing as we modified this PR to check if some new setting breaks Pluto.

@ronisbr
Copy link
Member Author

ronisbr commented Sep 20, 2022

Hi @bkamins and @nalimilan

IIUC:

  1. truncate throws argument error in HTML.
  2. max_column_width must have a string (CSS length) to limit the column widths.
  3. max_column_width throws argument error in text mode.
  4. Rebase everything against main.

test/dataframerow.jl Outdated Show resolved Hide resolved
Copy link
Member

@bkamins bkamins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Let us wait for @nalimilan. I just left one small request for tests.
After this is merged it would be super nice if you wrote a section to the manual about pretty printing (it does not have to be long - just something to get started with most important and non-obvious things; later it can be expanded based on users' feedback).
Thank you!

@yakir12
Copy link
Contributor

yakir12 commented Sep 21, 2022

#3172 might help the CI failure for Julia 1 - windows-latest - x86?

src/abstractdataframe/io.jl Outdated Show resolved Hide resolved
src/abstractdataframe/io.jl Outdated Show resolved Hide resolved
@ronisbr
Copy link
Member Author

ronisbr commented Sep 21, 2022

Hi @bkamins and @nalimilan !

Done, I think I implemented all the requested changes.

@ronisbr
Copy link
Member Author

ronisbr commented Sep 21, 2022

oops, I forgot to include a line in the previous commit.

@bkamins
Copy link
Member

bkamins commented Sep 23, 2022

@nalimilan - OK to merge now?

Copy link
Member

@nalimilan nalimilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

@bkamins
Copy link
Member

bkamins commented Sep 23, 2022

Thank you all!

@bkamins bkamins merged commit 8acb679 into JuliaData:main Sep 23, 2022
@bkamins
Copy link
Member

bkamins commented Sep 23, 2022

@ronisbr - if you have time having a PR with a short tutorial on printing added to the manual before the DataFrames.jl 1.4 release would be great (no pressure of course).

@ronisbr
Copy link
Member Author

ronisbr commented Sep 24, 2022

@ronisbr - if you have time having a PR with a short tutorial on printing added to the manual before the DataFrames.jl 1.4 release would be great (no pressure of course).

Hi @bkamins !

I will try! Where should I write this section?

@ronisbr ronisbr deleted the html_backend branch September 24, 2022 17:42
@bkamins
Copy link
Member

bkamins commented Sep 24, 2022

In the "User Guide" section. I would put it after "Working with DataFrames" and before "Importing ..."
Just please create a new file and add it to the list of imported files.

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants