machines out of disk space #1429

Closed
mcollina opened this issue Aug 6, 2018 · 12 comments

Comments

@mcollina
Member

mcollina commented Aug 6, 2018

npm WARN tar ENOSPC: no space left on device, write
npm WARN tar ENOSPC: no space left on device, write
npm WARN tar ENOSPC: no space left on device, write
npm WARN tar ENOSPC: no space left on device, write
npm WARN tar ENOSPC: no space left on device, write
npm WARN tar ENOSPC: no space left on device, open '/ramdisk0/citgm/28e61dee-cb4c-446f-a097-e233d31a37d4/binary-split/node_modules/.staging/ajv-f65f0ddf/lib/dotjs/oneOf.js'
npm WARN tar ENOSPC: no space left on device, write
npm WARN tar ENOSPC: no space left on device, write
npm WARN tar ENOSPC: no space left on device, write
npm WARN tar ENOSPC: no space left on device, open '/ramdisk0/citgm/28e61dee-cb4c-446f-a097-e233d31a37d4/binary-split/node_modules/.staging/ajv-f65f0ddf/lib/dotjs/pattern.js'

The machine is aix61-ppc64.

@maclover7
Contributor

Looks like the Jenkins jobs for this were (1, 2), so the machines that need to be fixed are test-osuosl-aix61-ppc64_be-1 and test-osuosl-aix61-ppc64_be-2

@maclover7
Contributor

@mcollina Cleaned up the disk a bit on both of those machines, can you try another citgm job and see if that one will work?

@mcollina
Member Author

mcollina commented Aug 6, 2018

@mhdawson
Member

mhdawson commented Aug 7, 2018

@gdams would it make sense to add a disk cleanup job in Ansible for these machines? It would be a good first example, and hopefully a useful one, since we have now seen an actual problem.

@mhdawson
Member

mhdawson commented Aug 7, 2018

@maclover7 can you add to this issue what you cleaned up, so @gdams could use that as the basis of the cleanup script?

Looking at test-osuosl-aix61-ppc64_be-1 right now the space seems to be reasonable.

# df -k
Filesystem    1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4           720896    337728   54%    18017    13% /
/dev/hd2          2949120    218400   93%    50490    47% /usr
/dev/hd9var        589824    189792   68%     6478    13% /var
/dev/hd3           327680    126460   62%    19489    32% /tmp
/dev/hd1            65536     65160    1%        7     1% /home2
/dev/hd11admin      131072    130692    1%        5     1% /admin
/proc                   -         -    -         -     -  /proc
/dev/hd10opt      1835008    954248   48%    17238     8% /opt
/dev/livedump      262144    261776    1%        4     1% /var/adm/ras/livedump
/dev/fslv00      58589184  26151160   56%  1503498    17% /home
/aha                    -         -    -       485     2% /aha
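
As a rough illustration of how output like the above could be checked automatically, here is a minimal Python sketch that parses AIX-style `df -k` text and lists mount points at or above a usage threshold. The function name `full_filesystems` and the 90% threshold are assumptions made for illustration; nothing in this thread uses them. It also assumes mount points contain no spaces.

```python
def full_filesystems(df_output, threshold=90):
    """Return (mount point, %used) pairs at or above `threshold`.

    Expects AIX-style `df -k` columns:
    Filesystem 1024-blocks Free %Used Iused %Iused Mounted-on
    """
    results = []
    for line in df_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, the shell prompt line,
        # and pseudo filesystems such as /proc that report "-".
        if len(fields) != 7 or not fields[3].endswith("%"):
            continue
        used = int(fields[3].rstrip("%"))
        if used >= threshold:
            results.append((fields[6], used))
    return results


# A few rows copied from the listing above.
sample = """\
/dev/hd4           720896    337728   54%    18017    13% /
/dev/hd2          2949120    218400   93%    50490    47% /usr
/proc                   -         -    -         -     -  /proc
"""
print(full_filesystems(sample))  # [('/usr', 93)]
```

With a 90% threshold, only /usr (93% used) would be flagged in the listing above.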

@maclover7
Contributor

@mhdawson Went to /home/iojs/build/workspace and removed directories for outdated jobs, and a couple of others just to make space. I've got a playbook I'm working on to try and automate some of this cleanup stuff, will try and open a PR for that in the next day or so
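
A minimal sketch of that kind of cleanup, written in Python rather than as the playbook mentioned above: it selects top-level workspace directories that have not been modified for a given number of days and optionally deletes them. The function name `remove_stale_workspaces`, the 14-day default, and the use of modification time as the test for "outdated" are all assumptions; the actual cleanup was done by hand. It makes no attempt to detect whether a Jenkins job is currently using a directory.

```python
import shutil
import time
from pathlib import Path


def remove_stale_workspaces(root, max_age_days=14, dry_run=True, now=None):
    """Select direct subdirectories of `root` not modified in `max_age_days`.

    Returns the list of selected directories. With dry_run=True (the
    default) nothing is deleted; with dry_run=False they are removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for entry in sorted(Path(root).iterdir()):
        # Only consider real directories; skip symlinks and plain files.
        if entry.is_dir() and not entry.is_symlink():
            if entry.stat().st_mtime < cutoff:
                stale.append(entry)
                if not dry_run:
                    shutil.rmtree(entry, ignore_errors=True)
    return stale
```

A dry run such as `remove_stale_workspaces("/home/iojs/build/workspace")` would only report candidates. Note that a directory's modification time changes only when entries directly inside it are added or removed, so this is a coarse heuristic.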

@mhdawson
Member

mhdawson commented Aug 9, 2018

@maclover7 sounds good :) I think that would be a great first test for using Ansible Tower to enable self-service cleanup.

@joaocgreis
Member

@maclover7 @mhdawson the problem with cleaning up from Ansible is that there is no way to be sure that a Jenkins job is not running on the machine. Even if we check for node or vcbuild in the running processes, that's not going to be very reliable. The best place for cleanup is a Jenkins job that can only run when no other jobs are running. We already have https://ci.nodejs.org/view/All/job/git-clean-rpi/ and https://ci.nodejs.org/view/All/job/git-clean-windows/ running weekly, so it should be easy enough to make a general Unix one that can run on other machines.

@mhdawson
Member

mhdawson commented Aug 9, 2018

I guess since regular users can't disable the machine, a CI job might make more sense. Having said that, I could see cases where we'd rather the machine be disabled and cleaned, as opposed to jobs continuing to fail until the cleanup job gets scheduled.

@BridgeAR
Member

Hey, I saw this a lot on a CITGM run:

https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/1561/#showFailuresLink

It would be nice if this could be looked into.

@joaocgreis
Member

@BridgeAR looks related to nodejs/node#22754 (comment), not this issue

@refack
Contributor

refack commented Sep 27, 2018

I saw ENOSPC on AIX - https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/1561/nodes=aix61-ppc64/console

I added a preemptive rimraf of /ramdisk0/citgm/* on AIX, to the job config:
https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/jobConfigHistory/showDiffFiles?timestamp1=2018-09-21_11-34-32&timestamp2=2018-09-27_11-21-28
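
For readers without access to the job config, the idea of a preemptive clean can be sketched in Python as follows. This is not the actual job step (which uses rimraf); the function name `clear_directory` is hypothetical, and unlike a shell glob such as `/ramdisk0/citgm/*` it also removes hidden entries.

```python
import shutil
from pathlib import Path


def clear_directory(path):
    """Remove everything inside `path` but keep `path` itself.

    Returns the number of top-level entries removed.
    """
    root = Path(path)
    if not root.is_dir():
        return 0  # nothing to clean, e.g. the directory does not exist yet
    removed = 0
    for entry in root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # real directory: remove recursively
        else:
            entry.unlink()  # plain file or symlink
        removed += 1
    return removed
```

Running something like this before each build trades a little startup time for not inheriting leftovers from earlier runs.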

This should close the issue for AIX; please reopen if this shows up again.
