Prevent the upload of empty (0kb) files #33

Closed
HeavyTuned opened this issue Sep 21, 2017 · 30 comments

@HeavyTuned

Expected Behavior

Add an option to prevent uploading empty files.

Actual Behavior

I can't reproduce it on demand, but the plugin often overwrites files on our servers with empty 0 kb files.

I work on Windows 10.

Our config:

{
    "host": "",
    "port": 22,
    "username": "",
    "password": "",
    "protocol": "sftp",
    "agent": null,
    "privateKeyPath": null,
    "passphrase": null,
    "passive": false,
    "interactiveAuth": false,
    "remotePath": "/var/www/online/",
    "uploadOnSave": true,
    "syncMode": "update",
    "watcher": {
        "files": false,
        "autoUpload": false,
        "autoDelete": false
    },
    "ignore": []
}

@liximomo
Owner

Is the file on your local machine empty?

@HeavyTuned
Author

The only thing I can imagine is that an earlier download through SFTP (I set up VSCode about two weeks ago) created 0 kb files locally, and the directory was then synchronised to the remote server (I had syncMode full last week, until I noticed the empty files on the server).

@HeavyTuned
Author

It happened again 5 days ago with the config provided above... 500 files were uploaded empty :/

@ghost

ghost commented Nov 9, 2017

The same thing happened here with a few hundred files. I can confirm that it doesn't correctly download all files - lots of them have 0 length. And, if you use syncMode full, it probably uploads them back to the server.

Good news: I can reproduce it - it fails to download my whole directory structure every time. Bad news: no logs, even after activating the debug log :-(.

When I first tried it after activating the debug log, it was generating a lot of EMFILE errors. I increased the soft limit for the max number of open files using ulimit -Sn 2000 (it was originally set to 1024), and now I get no more logs, but it still creates lots of zero-length files :-( I wonder though: why does it need to keep so many files open at the same time?

After increasing the soft limit to 2000 and trying again, it says "[debug]: conncet to remote" and that's the last thing I see in the output panel. The messages in the status bar show that it downloads some files (it gets from one directory to another pretty quickly), but then it gets stuck at a certain file (the same one every time, it seems - an interesting clue!) and that's all. I left it like that for more than 15 minutes, but nothing else happened. No output, no error.

I am using Ubuntu Linux 16.04.3 and vscode 1.18.0 (but it did the same thing in 1.17).

@liximomo
Owner

liximomo commented Nov 10, 2017

I wonder though: why does it need to keep so many files open at the same time?

Because all the operations are parallelized. That's why it feels so fast.

@kataklys Thanks for the information. Could you provide your config? It would also be great if you could make a clone of your entire remote directory structure with the same filenames and formats (file content is unnecessary). I promise this will be fixed.
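For illustration (this is a sketch of the failure mode, not the extension's actual code): fully parallel transfers each hold their local and remote file handles open at the same time, so a large directory can exceed the OS per-process limit and fail with EMFILE.

// Illustrative sketch only - launching every download at once means
// every transfer's file handles are open simultaneously; with
// thousands of files this exceeds the fd limit (EMFILE on the client,
// or resource limits on the server).
async function downloadAll(
    files: string[],
    downloadOne: (remotePath: string) => Promise<void>
): Promise<void> {
    await Promise.all(files.map(downloadOne)); // no concurrency cap
}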

@ghost

ghost commented Nov 10, 2017

Config file (I replaced confidential data with ******):

{
    "host": "**********",
    "port": 22,
    "username": "*********",
    "protocol": "sftp",
    "agent": null,
    "privateKeyPath": "****************",
    "passphrase": null,
    "passive": false,
    "interactiveAuth": false,
    "remotePath": "********************",
    "uploadOnSave": false,
    "syncMode": "update",
    "watcher": {
        "files": false,
        "autoUpload": false,
        "autoDelete": false
    },
    "ignore": [
        "**/.vscode/**",
        "**/.git/**",
        "**/.DS_Store"
    ]
}

@liximomo
Owner

Could you try downloading each subdirectory separately to see if this happens?

@ghost

ghost commented Nov 10, 2017

By the way, I was partially wrong - it doesn't get stuck on the same file every time, but there are just a few files that it blocks on, and they tend to repeat.

How can I choose just a subdirectory? Let me explain why I ask: I start with an empty local folder, so I don't have any directory structure yet and can't click on a certain sub-folder to download it. I just click somewhere in the left panel (the Explorer, I think it's called) and then SFTP download, which brings me the entire remote directory.

@ghost

ghost commented Nov 10, 2017

I did some more tests. I deleted all folders from my project except one, and it still doesn't download it entirely. The good news is I can send you that folder, because it doesn't contain anything confidential: it's just the vendor directory of a Laravel installation!
vendor.zip

@liximomo
Owner

@kataklys Bad news. The vendor.zip works well for me.

@ghost

ghost commented Nov 11, 2017

...and other SFTP clients work well for me when downloading the same folder from the same server. So it's hard to draw a conclusion from this. We have to investigate more.

Can your extension be configured to generate more debugging data? As I told you, I activated sftp.printDebugLog, but it only says "connect to remote" when starting the download and that's all. It downloads a lot of files without saying a word and then fails silently (I even left it overnight! No additional messages). Is there any way to make it more verbose?
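For anyone following along: sftp.printDebugLog is a user setting; assuming the standard VS Code settings.json format, enabling it looks like this:

// settings.json (VS Code allows comments here) - turn on debug output
{
    "sftp.printDebugLog": true
}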

@liximomo
Owner

liximomo commented Nov 11, 2017

I will make it more verbose in the next update.

@ghost

ghost commented Nov 11, 2017

Thank you! I want to help you find this bug, but it's a little hard to do it blindfolded ;-)

@ghost

ghost commented Nov 11, 2017

I also reproduced the problem in 2 different ways:

  • running vscode on another distro: tested on 32-bit Fedora 23 in a virtual machine, with the same behaviour
  • running vscode on my machine (Ubuntu 64-bit) and downloading the vendor folder from the SSH server on the same machine - same thing, incomplete download

I kept on testing in the second scenario. I enabled the transfer log on my SSH server and did the following:

  • downloaded the vendor folder using your extension
  • checked the size of the resulting folder and it was incomplete (22MB instead of 62MB)
  • generated a list of files that have 0 length after download
  • chose one of those files and made sure that its original size is not zero
  • checked the SSH server log for that file and - surprise - it doesn't appear in the logs!

Which means that, for some reason, vscode is not even trying to download that file!

@liximomo
Owner

@kataklys Thanks so much. I've published a debug version with verbose logging on the debug-pacakge branch. Check it out!

@ghost

ghost commented Nov 12, 2017

Ok, let's rock!

Here's what it says for one of the files that has size 0 after downloading:

[error]: fs read from ReadStream when piping to /DATE/HD1/D/DESTERSVSCODE/vendor/egulias/email-validator/EmailValidator/Warning/AddressLiteral.php {
    "code": 4,
    "lang": ""
}
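For context (an aside on the error above): in the SFTP protocol, status code 4 is SSH_FX_FAILURE - a generic failure returned when no more specific code applies - which fits a server hitting a resource limit. The draft-standard codes, sketched as a TypeScript enum:

// SFTP status codes from draft-ietf-secsh-filexfer; code 4 is the
// catch-all failure, so it carries no detail about the real cause.
enum SftpStatusCode {
    OK = 0,
    EOF = 1,
    NO_SUCH_FILE = 2,
    PERMISSION_DENIED = 3,
    FAILURE = 4, // generic failure, e.g. resource limits
    BAD_MESSAGE = 5,
    NO_CONNECTION = 6,
    CONNECTION_LOST = 7,
    OP_UNSUPPORTED = 8,
}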

@liximomo
Owner

liximomo commented Nov 13, 2017

That only tells us there was a failure when trying to read from the ReadStream.

Please help me answer these:

  1. Does this error happen on the same file every time?

  2. Does it happen when only downloading a single problem file?

@ghost

ghost commented Nov 13, 2017

I doubt that it's the same set of affected files every time, because after every download I checked the size of the resulting folder, which should be 62MB. One time it's 44MB, another time 29MB, another time 9MB... it doesn't seem to follow a pattern.

I'll keep doing various experiments that I have in mind and I'll get back with more info.

@liximomo
Owner

I would guess it somehow triggers an OS limitation (memory, disk, max open files). I will try to limit the concurrency.
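A minimal sketch of one way to cap concurrency (illustrative only, not necessarily what the extension ended up doing): a fixed pool of workers pulls items from a shared index, so at most `limit` transfers - and their file handles - are in flight at once.

// Worker-pool limiter: at most `limit` calls to `fn` run concurrently.
// `next++` cannot race because JS runs callbacks on a single thread.
async function mapLimit<T, R>(
    items: T[],
    limit: number,
    fn: (item: T) => Promise<R>
): Promise<R[]> {
    const results: R[] = new Array(items.length);
    let next = 0;
    const worker = async (): Promise<void> => {
        while (next < items.length) {
            const i = next++;
            results[i] = await fn(items[i]);
        }
    };
    await Promise.all(
        Array.from({ length: Math.min(limit, items.length) }, () => worker())
    );
    return results;
}

With the 512 cap settled on below, a full download would then be roughly mapLimit(files, 512, downloadOne).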

@ghost

ghost commented Nov 13, 2017

The latest tests were done with a limit of 500000 for the number of open files. Maybe there's another OS limitation that I'm not aware of.

@liximomo
Owner

liximomo commented Nov 13, 2017

Does it happen when only downloading a single problem file?

So the answer is no?

Did you increase the limit of open files on your SSH server?

@ghost

ghost commented Nov 13, 2017

You mean specifically for the SSH server process? No. On the other hand, can you (as a client) ask a certain SSH server process to open a specific number of files? I assume that the server is written in such a way that it doesn't open them all at once, even if you ask for a large number.

@HeavyTuned
Author

Glad to see there is some progress. I'm currently on holiday, but it happened for me when I was downloading a whole directory with lots of files/directories in it.

Here are the server limits our dev server is using for the user:


ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 257436
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 257436
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

@ghost

ghost commented Nov 13, 2017

OK, I think I got it. It's a matter of limits on the server too.

Ubuntu has a default soft limit of 1024 open files and a hard limit of 4096 (and other distros have the same settings). I changed it to something huge (100000 or so) and I still had the problem. The catch is that, when I reproduced the problem locally using my own SSH server, I had only changed the limit for the client! When I changed the limit globally, in /etc/security/limits.conf, it magically started downloading all the files.
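For reference, the global change described above amounts to lines like these in /etc/security/limits.conf (values illustrative; users must log in again for them to apply):

# /etc/security/limits.conf - raise the open-file (nofile) limit
# for all users ("*")
*    soft    nofile    100000
*    hard    nofile    100000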

Unfortunately, developers usually can't change that kind of setting on production servers, so I would suggest adding a configurable extension parameter that controls the number of threads, or the max number of open files, or whatever @liximomo thinks is best. And/or just lower the defaults, if that doesn't affect performance.

@liximomo
Owner

@kataklys I’m truly grateful for your help. I think I will limit the max number to 512. Please wait for the update.

@ghost

ghost commented Nov 13, 2017

You are welcome. I am looking forward to using your extension, now that the main problem is gone.

@liximomo
Owner

liximomo commented Nov 14, 2017

@kataklys @HeavyTuned I've made an update limiting the max concurrent file transmissions to 512 on the debug-pacakge branch. Could you help me test it?

@HeavyTuned
Author

Thank you! Installed it. I'm leaving the country tomorrow for one week, so you probably won't hear anything from me until the middle of next week.

So far no issues.

@ghost

ghost commented Nov 14, 2017

Good job, @liximomo! It seems to be working fine now. I lowered the max file limit back to the default (1024) and downloaded a whole site (more than 11,000 files). There are no errors in the sftp logs. I also used an app to recursively compare the original dir to the downloaded one, and they are identical. Great!

I have a question not related to this: I have a few things to say/fix/suggest about the example config file on the site. Should I create a new issue for that, or is there some other way?

@liximomo
Owner

liximomo commented Nov 14, 2017

@kataklys PRs are very welcome.

liximomo closed this as completed Aug 6, 2018