
sl-runctl stop returns success before the supervisor is stopped #137

Open
danielwhite opened this issue Jun 16, 2015 · 6 comments
@danielwhite

The root cause of issue #136 was calling stop and start in quick succession: the supervisor from the first run was still shutting down, so start saw a running supervisor and reported success.

It would be easier to script against if the stop CLI command returned only once the master had actually stopped.

To reproduce (assuming a running process):

$ sl-runctl stop && sl-runctl status

If the problem exists, then the supervisor information is returned. For example:

master pid: 28943
worker count: 1
worker id 6: { pid: 28964, uptime: 1445, startTime: 1434419524881 }

If the problem is resolved, then there should be an indication that the master is stopped. For example:

Communication error (connect ECONNREFUSED), check master is listening

A fix didn't seem trivial, and I might not get a chance to work on this in the near future, so suggestions would be welcome.

@rmg
Member

rmg commented Jun 16, 2015

@sam-github thoughts on enhancements to strong-supervisor?

@sam-github
Contributor

I feel your pain: synchronous CLI commands are handy for scripting. But it's not worth enhancing supervisor; it is increasingly just a component of strong-pm, and I don't even know how long the runctl channel will continue to be supported (we're moving to websockets). I suggest you look at http://strong-pm.io!

That said, pm has the same issue of "when should a command return".

The problem is that stop can take a while: a soft-stop can take up to 5 minutes (https://github.com/strongloop/strong-cluster-control/blob/master/lib/master.js#L34) while it waits for any open connections to that worker to go away, and in the meantime the CLI would show no sign of progress. That would call for some kind of explicit option, --block-til-complete perhaps, to enable blocking behaviour.

For start, it's not even clear what start means:

  • when supervisor is running?
  • when first worker starts?
  • when all workers have started?
  • when all workers have listened on a TCP port?

I'd suggest that the low-friction way to do this is to write a loop around slc ctl status, waiting until its state is the expected one.
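
For example, a minimal sketch of that loop in shell, using the commands from the original report and assuming the "Communication error" message shown above is what a stopped master produces:

$ sl-runctl stop
$ until sl-runctl status 2>&1 | grep -q 'Communication error'; do sleep 1; done

You would probably want to bound that loop with a timeout as well, since (as noted above) a soft-stop can take several minutes.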

If you wanted to modify https://github.com/strongloop/strong-mesh-models/blob/master/bin/sl-meshctl.js to support a REST API poller that could wait for various states, that would be interesting, and possibly easier than writing your own loop. Maybe something like: slc ctl --control http://production.example.com await --service car-app --size 3, for example.

@danielwhite
Author

The problem with strong-pm is that it adds a lot of deployment complexity (i.e. creating another package just to host loopback applications) when all I need to do is host a single loopback application. My deployment must use standard package management (i.e. RPMs). We're deploying to environments that don't necessarily have outbound network access, and installing compilers (i.e. npm plus build dependencies) is strongly frowned on from a security point of view.

At least in the meantime, a status check loop will probably suffice.

I'll have to give it some more thought, since it's nice to get these problems fixed at the root.

@sam-github
Contributor

Do you need an RPM per app, or can you put strong-pm in an RPM? The latter should be easy. The former is an interesting use case; I'd like to know if that's a requirement, because we can address it.

Also, note that pm deals with a lot of deployment complexity: it accepts git pushes, or you can push updates with slc deploy. slc deploy works well with slc build, which can pack up an app and its deps into either an npm tarball or a git deploy branch (not master). Those deps can be pure JavaScript, or can be pre-built to include compiled addons; the latter is a good choice for deploying to an environment that doesn't have compilers or outbound network access.
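
As a rough sketch of that build-then-deploy flow (the pm host and port here are placeholders, and the exact flags are my assumption from the docs; check slc build --help and slc deploy --help):

$ slc build --install --pack                       # install deps (pre-building any addons) and pack into a tarball
$ slc deploy http://pm-host:8701 my-app-1.0.0.tgz  # push the tarball slc build produced to a running strong-pm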

pm also allows remote control and debugging using arc, and supports password-based auth, as well as tunneling its control channel over ssh.

But in summary, it sounds like the two features that would help you are:

  • synchronous start/stop/restart CLIs: for ease of scripting
  • slc pm <app.js>: a one-shot run of an app under strong-pm (similar to how slc run works)

Is this it?

@danielwhite
Author

I think you're probably right. Packaging and deploying strong-pm would be relatively easy. The missing piece for me would then be how to add the application to pm without having a .tgz file. To keep things discoverable, the RPM really needs to own the files.

One-shot runs might also work, though they'd be less of a requirement if there were more ways to add applications to the process manager.

@sam-github
Contributor

If you want the app and its runner packaged together, you'd need something like slc pm server/server.js - that's what I mean by one-shot.

You have a heavily RPM-based deploy and need the app to be packaged in the RPM, so using slc deploy is not an option?

@sam-github sam-github self-assigned this Jul 22, 2015