Skip to content
This repository has been archived by the owner on Mar 21, 2022. It is now read-only.

[2.7.6] Logs w/ follow hangs, or finalizer error #125

Open
hlship opened this issue Feb 18, 2015 · 11 comments
Open

[2.7.6] Logs w/ follow hangs, or finalizer error #125

hlship opened this issue Feb 18, 2015 · 11 comments
Labels
bug help wanted pinned exempt from being marked as stale

Comments

@hlship
Copy link

hlship commented Feb 18, 2015

My code is starting a Docker container with a Postgres instance inside, and waiting for the output to include the text "database system is ready to accept connections".

My code works fine ... but after I see the target text things go wonky.

If I close the LogStream, my process hangs for a while, then gets an exception:

(start-container)
09:07:27.965 INFO  no-correlation-id [nREPL-worker-3] c.s.d.c.DefaultDockerClient - Creating container with ContainerConfig: ContainerConfig{hostname=null, domainname=null, username=null, memory=null, memorySwap=null, cpuShares=null, cpuset=null, attachStdin=null, attachStdout=null, attachStderr=null, portSpecs=[62890:5432], exposedPorts=null, tty=null, openStdin=null, stdinOnce=null, env=null, cmd=null, image=quay.io/aviso/combined-db-migrated, volumes=null, workingDir=null, entrypoint=null, networkDisabled=null, onBuild=null}
09:07:28.096 INFO  no-correlation-id [nREPL-worker-3] c.s.d.c.DefaultDockerClient - Starting container with HostConfig: HostConfig{binds=null, containerIDFile=null, lxcConf=null, privileged=null, portBindings=null, links=null, publishAllPorts=null, dns=null, dnsSearch=null, volumesFrom=null, networkMode=null}
2015-02-18 17:07:10 UTC LOG:  database system was interrupted; last known up at 2015-02-12 21:40:56 UTC
2015-02-18 17:07:10 UTC LOG:  database system was not properly shut down; automatic recovery in progress
2015-02-18 17:07:10 UTC LOG:  redo starts at 0/1792078
2015-02-18 17:07:10 UTC LOG:  record with zero length at 0/19DD660
2015-02-18 17:07:10 UTC LOG:  redo done at 0/19DD620
2015-02-18 17:07:10 UTC LOG:  last completed transaction was at log time 2015-02-12 21:41:06.015063+00
2015-02-18 17:07:11 UTC LOG:  autovacuum launcher started
2015-02-18 17:07:11 UTC LOG:  database system is ready to accept connections
up and running!
java.net.SocketTimeoutException: Read timed out
     java.lang.RuntimeException: java.net.SocketTimeoutException: Read timed out
(pst)
                                                                          clojure.core/eval                          core.clj: 3076
                                                                                        ...                                        
                                                                      docker-utils/eval6732  form-init5456222565433859833.clj:    1
                                                               docker-utils/start-container                  docker_utils.clj:  151
                                                              docker-utils/launch-container                  docker_utils.clj:  128
                                                             docker-utils/wait-for-startup2                  docker_utils.clj:  110
                                                  com.spotify.docker.client.LogStream.close                    LogStream.java:   74
                                                  com.spotify.docker.client.LogReader.close                    LogReader.java:   88
                                                      com.google.common.io.ByteStreams.copy                  ByteStreams.java:  175
org.glassfish.jersey.message.internal.ReaderInterceptorExecutor$UnCloseableInputStream.read    ReaderInterceptorExecutor.java:  295
                               org.glassfish.jersey.message.internal.EntityInputStream.read            EntityInputStream.java:   96
                                                             java.io.FilterInputStream.read            FilterInputStream.java:  107
                                                             java.io.FilterInputStream.read            FilterInputStream.java:  133
                                                           java.io.BufferedInputStream.read          BufferedInputStream.java:  334
                                                          java.io.BufferedInputStream.read1          BufferedInputStream.java:  275
                                                           java.io.BufferedInputStream.fill          BufferedInputStream.java:  235
                                             org.apache.http.conn.EofSensorInputStream.read         EofSensorInputStream.java:  137
                                            org.apache.http.impl.io.ChunkedInputStream.read           ChunkedInputStream.java:  169
                                       org.apache.http.impl.io.ChunkedInputStream.nextChunk           ChunkedInputStream.java:  206
                                    org.apache.http.impl.io.ChunkedInputStream.getChunkSize           ChunkedInputStream.java:  240
                                    org.apache.http.impl.io.SessionInputBufferImpl.readLine       SessionInputBufferImpl.java:  270
                                  org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer       SessionInputBufferImpl.java:  152
                                  org.apache.http.impl.io.SessionInputBufferImpl.streamRead       SessionInputBufferImpl.java:  136
                                                            java.net.SocketInputStream.read            SocketInputStream.java:  121
                                                            java.net.SocketInputStream.read            SocketInputStream.java:  150
                                                     java.net.SocketInputStream.socketRead0            SocketInputStream.java      
java.net.SocketTimeoutException: Read timed out
     java.lang.RuntimeException: java.net.SocketTimeoutException: Read timed out

(Obviously, I'm working in Clojure, but that should hopefully not be relevant).

On the other hand, if I skip the close step, things work fine ... but at some future point, I see this error in my console:

(System/gc)
=> nil
09:09:19.238 WARN  no-correlation-id [Finalizer] c.s.d.c.LogStream - com.spotify.docker.client.LogStream@79220f71 not closed properly

If it helps, here's the code:

(defn- wait-for-startup2
  "Waits for the container to startup. Returns nil or throws an exception."
  [docker-client container-id container-name]
  ;; docker logs outputs to STDERR for some reason
  (let [^LogStream logs (.. docker-client (logs container-id (into-array DockerClient$LogsParameter [DockerClient$LogsParameter/STDERR
                                                                                                     DockerClient$LogsParameter/FOLLOW])))
        utf-8           Charsets/UTF_8
        ;; A bit cumbersome to get the logs
        cmd-fn          (fn []
                          (if (.hasNext logs)
                            (->> logs .next .content (.decode utf-8) .toString)))
        test-fn         (fn [line]
                          (print line)
                          (and line
                               (re-find #"database system is ready to accept connections" line)))
        wait-result     (wait-until cmd-fn test-fn 5000)]

    (println "up and running!")

    ; (.close logs)

    (if (= ::timeout wait-result)
      (throw (DockerException. (format "Docker container `%s' did not complete initialization before timeout." container-name))))))
@rohansingh rohansingh added the bug label Feb 25, 2015
@jwgmeligmeyling
Copy link
Contributor

I'm not sure if @rohansingh already figured out that this really is a bug. But couldn't it be the case that after the expected message was received, you indeed keep listening for new messages - which do not arrive and then lead to a socket timeout exception? By the way the timeouts can be set and even disabled.

@rohansingh
Copy link
Contributor

Sorry, I haven't looked very closely at this, just gave it the bug label while organizing things.

@jwgmeligmeyling
Copy link
Contributor

Ah okay! I will try to come up with a test to proof this issue. Maybe the topic starter can provide some code?

Edit; oops there is code, missed that

Op 27 feb. 2015 om 01:33 heeft Rohan Singh <[email protected]mailto:[email protected]> het volgende geschreven:

Sorry, I haven't looked very closely at this, just gave it the bug label while organizing things.


Reply to this email directly or view it on GitHubhttps://github.com//issues/125#issuecomment-76308172.

@jwgmeligmeyling
Copy link
Contributor

This actually is an issue. I used the following test for proof: (Basically I create a DockerClient with a 4000 ms timeout and a Docker container with a 5 second sleep)

package com.spotify.docker.client;

import java.io.IOException;

import org.hamcrest.Matchers;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Rule;
import org.junit.Test;

import com.spotify.docker.client.DockerClient.LogsParameter;
import com.spotify.docker.test.CreateContainer;
import com.spotify.docker.test.DockerContainer;

import static com.google.common.base.Charsets.UTF_8;
import static org.junit.Assert.assertThat;

public class PrematureLogsCloseCall {

  private static DockerClient dockerClient;

  @Rule
  public DockerContainer dockerContainer = new DockerContainer(dockerClient);

  @BeforeClass
  public static void setUp() throws DockerCertificateException {
    dockerClient = DefaultDockerClient
        .fromEnv()
        .readTimeoutMillis(3000)
        .build();
  }

  @Test
  @CreateContainer(image = "busybox", command = {"sh", "-c", "echo start && sleep 5 && echo stop"}, start = true)
  public void testIt() throws IOException,
      DockerException, InterruptedException {

    String containerId = dockerContainer.getContainerId();

    try(LogStream logStream = dockerClient.logs(containerId, LogsParameter.STDOUT, LogsParameter.FOLLOW)) {
      assertThat(UTF_8.decode(logStream.next().content()).toString(),
          Matchers.containsString("start"));
      logStream.close();
    }

  }

  @AfterClass
  public static void cleanUp() {
    dockerClient.close();
  }

}

The problem is that in the LogReader the following happens on close:

  @Override
  public void close() throws IOException {
    closed = true;
    // Jersey will close the stream and release the connection after we read all the data.
    // We cannot call the stream's close method because it an instance of UncloseableInputStream,
    // where close is a no-op.
    copy(stream, nullOutputStream());
  }

So the remainder of the stream is being copied to a null output stream, but as no messages keep coming in an SocketTimeoutException will be thrown. Why is this being copied to a nullstream anyway? Why not just call close on the wrapped stream and ignore that it is no-op? (The code beneath makes te test pass)

  @Override
  public void close() throws IOException {
    closed = true;
    stream.close()
  }

@LyndonArmitage
Copy link

Noticed this today when trying to watch logs for a startup message and block until it is seen:

    private void waitUntilTomcatStarted() throws IOException, InterruptedException, DockerException {
        logger.info("Blocking until Tomcat has started");
        long start = System.currentTimeMillis();
        try (LogStream logStream = dockerClient.logs(containerId, DockerClient.LogsParameter.STDERR, DockerClient.LogsParameter.FOLLOW)) {
            boolean started = false;
            while (!started) {
                while(logStream.hasNext()) {
                    LogMessage next = logStream.next();
                    ByteBuffer content = next.content();
                    byte[] buffer = new byte[content.remaining()];
                    content.get(buffer);
                    String message = new String(buffer, "UTF-8");
                    System.out.print(message);
                    if (StringUtils.contains(message, "org.apache.catalina.startup.Catalina.start Server startup in ")) {
                        logger.info("Tomcat Server started after {}ms", (System.currentTimeMillis() - start));
                        started = true;
                        break;
                    }
                }
                logger.info("looped");
            }
            logger.info("I am stuck");
        }
        logger.info("Never reached");
    }

Admittedly this was implemented as a bit of a hack (seemed to be one of the easiest way of checking if the tomcat instance had started). I worked around it by not following the logs and running a dockerClient.logs(containerId, DockerClient.LogsParameter.STDERR) after some progressive sleeps until the message was found (although this feels like an even bigger hack).

@rgrunber
Copy link
Contributor

rgrunber commented Jan 6, 2016

I think the issue described in this bug was probably fixed in b76fc638 (Issue #223 ). The SocketTimeoutException was being hit when no data was being sent on the socket for a period of time longer than the read timeout.

@jwgmeligmeyling
Copy link
Contributor

I think, as the code within the close block is not fixed, this is probably worse. Now you just keep the socket open, until either the connection or container dies. If you don't need the logs anymore, it should just disconnect the socket on close.

@arakelian
Copy link

+1, no longer closing LogStream to work around issue

@grexe
Copy link

grexe commented May 18, 2017

just stumbled over this in v8.1.3, applied the fix from #125 (comment) and it works perfectly - can we please integrate this into master?
I see that https://github.com/spotify/docker-client/blob/master/src/main/java/com/spotify/docker/client/LogReader.java still has the old bug.
Update sadly still hangs the calling Thread, even with the fix to close() the stream.

@stale
Copy link

stale bot commented Sep 24, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Sep 24, 2018
@davidxia davidxia added help wanted pinned exempt from being marked as stale and removed stale labels Sep 25, 2018
@dmandalidis
Copy link
Contributor

Hi all,

Since this project went on mature status, please re-open this issue (if it still stands) to https://github.com/dmandalidis/docker-client. Thanks

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug help wanted pinned exempt from being marked as stale
Projects
None yet
Development

No branches or pull requests

9 participants