a high number of untracked files make the extension and jupyter lab unresponsive #667

Closed
pfarndt opened this issue Jun 5, 2020 · 12 comments · Fixed by #767

@pfarndt

pfarndt commented Jun 5, 2020

I have a repo with a lot of (10,000+) untracked files (mostly intermediate figures and tables that I do not need to track and that are not excluded through a .gitignore). They get listed in the corresponding section of the UI, but they make the extension and the whole JupyterLab unresponsive.

Is there any remedy?

@ianhi
Collaborator

ianhi commented Jun 5, 2020

Maybe the same as #663?

Thanks for reporting @pfarndt. I also find that if I create 10000 empty files in a git repository, the entire UI slows to an unusable ~0 fps. I guess this happens because every single file gets rendered into the changes tab no matter how many there are. I saved a Firefox performance profile of this (profile.zip), and it's lots and lots of Parse HTML and DOM Event.

{files.map((file: Git.IStatusFile) => {

(Image: the first status refresh is with 1000 files, the second with 10000.)

Is it easy to detect if a react component will be visible or not and only render it if you scroll onto it?

Also, I don't think the status call is at fault, because when I run time git status --porcelain -u -z I get:

real	0m0.031s
user	0m0.012s
sys	0m0.007s

Though if you add:

import time
time.sleep(5)

to the Python status function, then select parts of the UI are severely slowed down: making a new folder in the file browser, running notebook cells, etc. The common theme is things that require the Jupyter server to work.
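
The slowdown described above is what one would expect if the status handler blocks the server's event loop. Below is a minimal sketch with plain asyncio, assuming the handler runs on a single event loop. It is not the extension's actual handler, and `slow_status`, `heartbeat`, and `measure` are hypothetical names. It compares a blocking call with the same call offloaded to a thread:

```python
import asyncio
import time


def slow_status(delay: float) -> str:
    """Stand-in for a status function containing a blocking time.sleep()."""
    time.sleep(delay)
    return "done"


async def heartbeat(counter: list, interval: float) -> None:
    """Stand-in for unrelated server work (file browser, kernel messages)."""
    while True:
        counter[0] += 1
        await asyncio.sleep(interval)


async def measure(offload: bool, delay: float = 0.3) -> int:
    """Return how many heartbeats ran while the slow status call was in flight."""
    counter = [0]
    beat = asyncio.create_task(heartbeat(counter, 0.01))
    await asyncio.sleep(0)  # give the heartbeat a chance to start
    if offload:
        # Run the blocking call in a worker thread; the loop stays free.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, slow_status, delay)
    else:
        # Call it directly; nothing else on the loop can run meanwhile.
        slow_status(delay)
    beat.cancel()
    return counter[0]


if __name__ == "__main__":
    print("heartbeats while blocked:  ", asyncio.run(measure(offload=False)))
    print("heartbeats while offloaded:", asyncio.run(measure(offload=True)))
```

In the blocked case the heartbeat barely advances, which would match the symptom that anything needing the server stalls.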

@fcollonval
Member

@pfarndt unfortunately right now, you only have two possibilities:

  • reducing the number of versioned files
  • increasing the setting refreshInterval

Thanks @ianhi for the analysis.

Could you look at the response size of the status request for such a large number of files? This also impacts the responsiveness of the server.

The solution you are looking for, if you want to push a PR, is called virtualization. More specifically, I would suggest the react-window package.
You could look at the examples: https://react-window.now.sh/#/api/FixedSizeList
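
To make the idea concrete: virtualization renders only the rows that fall inside the visible viewport, plus a small buffer. The arithmetic is sketched below in Python for brevity. This is not react-window's API, and `visible_range` and its parameters are hypothetical names. In the extension, a component such as FixedSizeList would do the equivalent work in TypeScript.

```python
from typing import Tuple


def visible_range(
    scroll_top: int,
    item_height: int,
    viewport_height: int,
    item_count: int,
    overscan: int = 2,
) -> Tuple[int, int]:
    """Return the half-open [first, last) range of row indices worth rendering.

    All rows are assumed to have the same fixed height (in pixels).
    `overscan` adds a few extra rows above and below to hide scroll flicker.
    """
    if item_count == 0:
        return (0, 0)
    first_visible = scroll_top // item_height
    # Ceiling division: index just past the last row touching the viewport.
    end_visible = -(-(scroll_top + viewport_height) // item_height)
    first = max(0, first_visible - overscan)
    last = min(item_count, end_visible + overscan)
    return (first, last)


if __name__ == "__main__":
    # 10000 files, 25 px rows, 500 px panel: only a couple dozen rows are needed.
    first, last = visible_range(2500, 25, 500, 10000)
    print(f"render rows {first}..{last - 1} ({last - first} of 10000)")
```

The number of rendered rows then depends on the panel height, not on the number of files in the repository.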

@ianhi
Collaborator

ianhi commented Jun 7, 2020

With 10000 files, the status response is 887 KB.

@pfarndt
Author

pfarndt commented Jun 9, 2020

I did some more tests:

  • I can handle 5,000 files just fine, but from 10,000 files on things get considerably slower, and from 15,000 files it gets really hard to work with this extension and the notebooks
  • increasing refreshInterval by a factor of 100 does not help
  • I also saw (but this is probably unrelated to the slowdown and already happens with fewer files) that the UI gets some distortions in the top part, i.e. elements start to overlap; see the screenshot

Screen Shot 2020-06-09 at 6 51 29 PM

Hope that helps to pinpoint the problem.

@ivan-gomes

ivan-gomes commented Jun 11, 2020

For context,

This is plaguing my users who have large repositories cloned as well, which is making it difficult to leave the extension enabled by default since there is no way to anticipate how it will affect a user's experience. :/

To clarify, the user's browser locks up and prompts the user to suspend the tab. Debugging with them, I see in the browser dev tools long requests to the status endpoint with large response payloads (on the order of MB) occurring in a loop, i.e. a new request starts the moment the previous one responds. Temporarily disabling the jupyterlab-git labextension and serverextension resolves the browser locking. I suspect the large response is being processed in a way that blocks the event loop.

@ianhi:

Hey @ivan-gomes there is some discussion of the underlying cause for this over at #667. Could you see how long git status takes to run with time git status --porcelain -u -z and post the results in that issue? It sounds as though git status is taking longer than the refreshInterval setting (default 3 seconds). Other useful information would be: OS, git version, python package version, and npm extension version.

(@fcollonval maybe this issue should be closed to consolidate with 667?)

$ time git status --porcelain -u -z > /tmp/status.log
warning: could not open directory 'lost+found/': Permission denied

real    0m0.637s
user    0m0.320s
sys     0m0.307s
$ ls -lh /tmp/status.log
-rw-r--r--. 1 <> <> 17M Jun 11 21:33 /tmp/status.log

Looking at the short call time (<1s) and large size (17M) of the output, I think this has more to do with how long it takes to send that large of a payload over the network.
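
To characterize such output before it is sent, the raw bytes can be summarized directly. Below is a minimal sketch that assumes the documented `git status --porcelain -z` layout: NUL-terminated entries, with rename and copy entries followed by an extra field holding the original path. `summarize_status` is a hypothetical helper, not part of the extension.

```python
from typing import Dict


def summarize_status(raw: bytes) -> Dict[str, object]:
    """Count entries per two-letter status code in `git status --porcelain -z` output."""
    fields = raw.split(b"\0")
    if fields and fields[-1] == b"":
        fields.pop()  # drop the empty field after the final NUL terminator

    by_code: Dict[str, int] = {}
    i = 0
    while i < len(fields):
        code = fields[i][:2].decode("ascii")  # e.g. "??", " M", "R "
        by_code[code] = by_code.get(code, 0) + 1
        # Rename/copy entries carry the original path in the next field.
        i += 2 if ("R" in code or "C" in code) else 1

    return {
        "entries": sum(by_code.values()),
        "by_code": by_code,
        "bytes": len(raw),
    }


if __name__ == "__main__":
    sample = b"?? fig_0001.png\0?? fig_0002.png\0R  new.txt\0old.txt\0 M notes.py\0"
    print(summarize_status(sample))
```

Run against a saved log such as /tmp/status.log, this would show how many of the entries are untracked (`??`) and how many bytes they account for overall.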

@ianhi
Collaborator

ianhi commented Jul 11, 2020

Looking at the short call time (<1s) and large size (17M) of the output, I think this has more to do with how long it takes to send that large of a payload over the network.

I think two things are happening:

  1. List takes a long time to render
  2. As you note, the message can bog down the whole server.

react-window seems to solve the first issue. As for the second issue, maybe we could compress with zlib (Python built-in) and then decompress on the TypeScript side using pako. Though it would be good to determine whether that is worth it, i.e. characterize the slowdown as a function of message size.
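
A rough way to check whether compression is worth it is to measure it on a synthetic payload. The sketch below uses only Python's built-in `zlib` and `json`. The payload shape (`code`, `files`, `x`, `y`, `to`, `from`) and the file names are assumptions for illustration, not the extension's confirmed response format, and `fake_status_payload` and `compression_report` are hypothetical helpers.

```python
import json
import zlib
from typing import Dict


def fake_status_payload(n_files: int) -> bytes:
    """Build a status-like JSON body with n_files untracked entries (assumed shape)."""
    files = [
        {"x": "?", "y": "?", "to": path, "from": path}
        for path in (f"figures/run_{i:05d}/plot.png" for i in range(n_files))
    ]
    return json.dumps({"code": 0, "files": files}).encode("utf-8")


def compression_report(payload: bytes, level: int = 6) -> Dict[str, float]:
    """Compress with zlib, verify the round trip, and report the sizes."""
    compressed = zlib.compress(payload, level)
    if zlib.decompress(compressed) != payload:
        raise ValueError("round trip failed")
    return {
        "raw_bytes": len(payload),
        "compressed_bytes": len(compressed),
        "ratio": len(compressed) / len(payload),
    }


if __name__ == "__main__":
    for n in (1000, 10000, 100000):
        print(n, compression_report(fake_status_payload(n)))
```

Real repositories have less regular paths than this synthetic data, so the ratio measured here is likely optimistic. Compressed bytes would also need a binary-safe transport (or an encoding such as base64) before a client-side library like pako could inflate them.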

@ivan-gomes

List takes a long time to render

I assume we're referring to the list of changed files in the Git tab of the left pane. If so, it may be worth noting that the performance issue occurs even if the Git tab is not open.

react-window seems to solve the first issue.

To clarify, is this something that's already done / being worked on or is it a proposal to use it?

compress with zlib (python builtin) and then decompress on the typescript side using pako

Sounds like a good idea considering how compressible the payload likely is.

@ianhi
Collaborator

ianhi commented Jul 17, 2020

If so, it may be worth noting that the performance issue occurs even if the Git tab is not open.

I think the panel is being rendered in the background as there are no checks in the render method to see if the sidebar tab is open:

render(): React.ReactElement {
  return (
    <div className={panelWrapperClass}>
      {this.state.inGitRepository ? (
        <React.Fragment>
          {this._renderToolbar()}
          {this._renderMain()}
        </React.Fragment>
      ) : (
        this._renderWarning()
      )}
    </div>
  );
}

To clarify, is this something that's already done / being worked on or is it a proposal to use it?

Right now I think it is primarily a proposal. I started working on this a while ago (branch here) but never finished as my lab started re-opening and I didn't/don't know enough about react to implement it quickly. If anyone wants to pick this up they are more than welcome to.

@jaipreet-s
Member

This also happens when there are a large number of files in the staging area. For example, download a data set into data/ and run git add .

In general, the extension needs to gracefully handle these scenarios rather than making the whole UI unusable.

@pfarndt
Author

pfarndt commented Aug 10, 2020

I agree with the previous post. I think "gracefully" could very well mean that only a few (say the 20 most recently changed) files are listed (together with a remark that this list is actually longer).

This way the UI is not blocked. Users do not want to add/manage hundreds of files using the GUI anyway and would use the git CLI instead.

Probably this is easier to implement.
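
A minimal sketch of that truncation idea, assuming each changed file comes with a modification time. `ChangedFile` and `truncate_for_display` are hypothetical names, not part of the extension:

```python
import heapq
from typing import Dict, Iterable, NamedTuple


class ChangedFile(NamedTuple):
    path: str
    mtime: float  # modification time, e.g. from os.path.getmtime(path)


def truncate_for_display(files: Iterable[ChangedFile], limit: int = 20) -> Dict[str, object]:
    """Keep only the `limit` most recently modified files, plus counts for a remark."""
    all_files = list(files)
    # nlargest returns the newest files first without sorting the whole list.
    shown = heapq.nlargest(limit, all_files, key=lambda f: f.mtime)
    return {
        "shown": [f.path for f in shown],
        "total": len(all_files),
        "hidden": len(all_files) - len(shown),
    }


if __name__ == "__main__":
    files = [ChangedFile(f"figures/plot_{i}.png", float(i)) for i in range(10000)]
    result = truncate_for_display(files)
    print(f"showing {len(result['shown'])} of {result['total']} files "
          f"({result['hidden']} more not listed)")
```

The `hidden` count gives the UI what it needs for the remark that the list is actually longer.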

@talal-sen

I am facing the same issue: my JupyterLab running in Chrome became unresponsive after I clicked on the option to initialise a repository. The whole UI is now unresponsive. Is there any way to kill this command and make the UI usable again?

@fcollonval
Member

@talal-sen could you open a new issue with more information - in particular running in debug mode and reporting the messages.
