Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement support for the repl goal w/ the daemon #3774

Closed
kwlzn opened this issue Aug 10, 2016 · 4 comments · Fixed by #5040
Closed

implement support for the repl goal w/ the daemon #3774

kwlzn opened this issue Aug 10, 2016 · 4 comments · Fixed by #5040
Assignees
Milestone

Comments

@kwlzn
Copy link
Member

kwlzn commented Aug 10, 2016

currently, repl is known to not work with the daemon due to an unimplemented bi-directional user input path here:

# TODO(kwlzn): Implement remote input reading and fix the non-fork()-safe sys.stdin reference
# in NailgunClient to enable support for interactive goals like `repl` etc.

in order to get users using the daemon more generally, we'll need to get this implemented.

@kwlzn kwlzn added this to the v2 engine/daemon adoption milestone Aug 10, 2016
@stuhood stuhood modified the milestones: v2 engine: daemon-by-default beta, 1.4.0 Apr 7, 2017
@stuhood
Copy link
Member

stuhood commented Apr 7, 2017

For the 1.3.0 release, we'll only be telling folks to trial the daemon: this doesn't affect correctness, so I've targeted it to 1.4.0.

@kwlzn kwlzn modified the milestones: 1.4.x, Daemon Beta Oct 3, 2017
@stuhood stuhood self-assigned this Oct 27, 2017
@stuhood
Copy link
Member

stuhood commented Oct 31, 2017

I believe that I have a mostly-working implementation of code to read from the server's nailgun socket and write to stdin of the child process... I think there is one more tweak necessary though, as the child JVM process is exiting immediately as if stdin were empty/closed.

I'm also seeing in master that there is a failure where (seemingly) raw data (without a chunk header) is being written to the socket. This could potentially be a concurrency issue (if sock.sendall is not atomic), or just a case of some code accidentally accessing the socket directly. Will investigate tomorrow.

@stuhood
Copy link
Member

stuhood commented Oct 31, 2017

The issue I'm seeing in master is that a blob of stdout data without a header is being placed on the socket:

 6.1s recv(<pants.java.nailgun_client.NoisyClientSock object>, 5)
 6.6s -> 'Welco'
 6.6s recv(<pants.java.nailgun_client.NoisyClientSock object>, 1466264675)
 6.8s -> 'me to Scala 2.11.11 (Java HotSpot(TM) 64-Bit Ser...xpressions for evaluation. Or try :help.\n\nscala> ' (file:///var/folders/tc/kq93y4pd5jn6gy3pfzd2zd5h0000gn/T/q31418051.txt)

... I can't locate the writer of the value on the server, even tracing aggressively using q, and proxy-instrumenting the socket instance to ensure that only instrumented methods are called.


At this point, I can only assume that this is a concurrency issue due to the use of forked processes writing to those file-like objects in a thread-unsafe way. There are two things that don't quite align with this assumption though:

  1. It's extremely consistent... it doesn't "look" like a race.
  2. Even if it were a race, I would expect to be able to locate the code accessing the socket due to the q logging.

Lacking other ways to make forward process, I'm considering adjusting NailgunStreamWriter to write to a os.pipe, and then having a single thread consume both pipes to send to the socket, and hoping that this either fixes the problem or changes the error.

@stuhood
Copy link
Member

stuhood commented Oct 31, 2017

@kwlzn pointed to

def fileno(self):
return self._socket.fileno()
, which would explain how the child processes might end up with a handle directly to the socket, and thus bypass our framing. Huzzah!

Proceeding with the pipe approach.

stuhood pushed a commit that referenced this issue Nov 9, 2017
### Problem

The `DaemonPantsRunner` did not support consuming `STDIN` and `STDIN_EOF` chunks off of the nailgun socket, and so subprocesses of the pants daemon that read from `stdin` would immediately exit.

Additionally, because `stdin`/`stderr` were not actual pipes in the `DaemonPantsRunner` (rather, they were file-like objects that wrote directly to the nailgun socket) we were not able to fork and still proxy `stdout`/`stderr` from the forked process.  

### Solution

* Replace the synchronous file-like `NailgunStreamWriter` with an adjusted version of the previous reader code that uses a thread to copy from an anonymous `os.pipe`/`os.openpty` onto the socket.
* Use  `NailgunStreamWriter` to publish `sys.stdout`/`sys.stderr` from the server, and `sys.stdin` from the client.
* Implement `NailgunStreamStdinReader` to read `stdin` nailgun chunks on the server and proxy them to `sys.stdin`.
* Explicitly pass `sys.stdin` and `sys.stderr` when forking with SubprocessExecutor to disable additional handling in `Popen` that caused forked processes to write directly to fd 1 (the socket, in this case).
* Add an `@ensure_daemon` decorator, and use it to ensure that python and scala repls work in the context of the daemon.
* Move the watchman directory under the `subprocess_dir`, to fix an issue where calling `./pants kill-pantsd` would fail if the `subprocess_dir` matched but the `workdir` didn't.

### Result

Fixes #3774.

There is one slight oddity, which is that our `Ctrl+D` handling doesn't quite align with daemonless usage: opened #5058 about that.
kwlzn pushed a commit to twitter/pants that referenced this issue Nov 27, 2017
### Problem

The `DaemonPantsRunner` did not support consuming `STDIN` and `STDIN_EOF` chunks off of the nailgun socket, and so subprocesses of the pants daemon that read from `stdin` would immediately exit.

Additionally, because `stdin`/`stderr` were not actual pipes in the `DaemonPantsRunner` (rather, they were file-like objects that wrote directly to the nailgun socket) we were not able to fork and still proxy `stdout`/`stderr` from the forked process.  

### Solution

* Replace the synchronous file-like `NailgunStreamWriter` with an adjusted version of the previous reader code that uses a thread to copy from an anonymous `os.pipe`/`os.openpty` onto the socket.
* Use  `NailgunStreamWriter` to publish `sys.stdout`/`sys.stderr` from the server, and `sys.stdin` from the client.
* Implement `NailgunStreamStdinReader` to read `stdin` nailgun chunks on the server and proxy them to `sys.stdin`.
* Explicitly pass `sys.stdin` and `sys.stderr` when forking with SubprocessExecutor to disable additional handling in `Popen` that caused forked processes to write directly to fd 1 (the socket, in this case).
* Add an `@ensure_daemon` decorator, and use it to ensure that python and scala repls work in the context of the daemon.
* Move the watchman directory under the `subprocess_dir`, to fix an issue where calling `./pants kill-pantsd` would fail if the `subprocess_dir` matched but the `workdir` didn't.

### Result

Fixes pantsbuild#3774.

There is one slight oddity, which is that our `Ctrl+D` handling doesn't quite align with daemonless usage: opened pantsbuild#5058 about that.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants