Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Postgres JDBC 42.0.0 produces tons of error logs on container startup #317

Closed
mihhail-lapushkin opened this issue Mar 24, 2017 · 14 comments · Fixed by #327
Closed

Postgres JDBC 42.0.0 produces tons of error logs on container startup #317

mihhail-lapushkin opened this issue Mar 24, 2017 · 14 comments · Fixed by #327

Comments

@mihhail-lapushkin
Copy link
Contributor

mihhail-lapushkin commented Mar 24, 2017

2017-03-24 18:56:50.647 ERROR --- [      pool-3-thread-2] o.p.Driver                               : Connection error: 
org.postgresql.util.PSQLException: The connection attempt failed.
	at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:275) ~[postgresql-42.0.0.jar:42.0.0]
	at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49) ~[postgresql-42.0.0.jar:42.0.0]
	at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:194) ~[postgresql-42.0.0.jar:42.0.0]
	at org.postgresql.Driver.makeConnection(Driver.java:431) ~[postgresql-42.0.0.jar:42.0.0]
	at org.postgresql.Driver.connect(Driver.java:247) ~[postgresql-42.0.0.jar:42.0.0]
	at org.testcontainers.containers.JdbcDatabaseContainer.lambda$createConnection$2(JdbcDatabaseContainer.java:123) ~[jdbc-1.2.0.jar:?]
	at org.rnorth.ducttape.unreliables.Unreliables.lambda$retryUntilSuccess$0(Unreliables.java:41) ~[duct-tape-1.0.6.jar:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_121]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
Caused by: java.io.EOFException
	at org.postgresql.core.PGStream.receiveChar(PGStream.java:284) ~[postgresql-42.0.0.jar:42.0.0]
	at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:418) ~[postgresql-42.0.0.jar:42.0.0]
	at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:222) ~[postgresql-42.0.0.jar:42.0.0]
	... 10 more

With 9.4.1212 they are produced as well, but not logged.

Probably this change made them appear by default with some logging configurations:
https://www.postgresql.org/about/news/1736/

java.util.logging is now used for logging: logging documentation

Is it something that could be fixed on your side or I should just disable all logging for PG driver?
Feels, like a bad idea...

@mihhail-lapushkin mihhail-lapushkin changed the title Postgres JDBC 42.0.0 produces tons of errors on container startup Postgres JDBC 42.0.0 produces tons of error logs on container startup Mar 24, 2017
@rnorth
Copy link
Member

rnorth commented Apr 1, 2017

Hi @mihhail-lapushkin
That sounds unfortunate - logging is always more difficult than it should be :(
Part of the problem is that testcontainers tries to establish a connection pretty aggressively during container startup. We could reduce the frequency of retries, but you're still going to receive some of these error logs.

I think you're probably going to need to adjust the logging, really, but it's not exactly good that your only option seems to be to disable all driver logging... I wonder if anybody else has any ideas?

Richard

@gesellix
Copy link
Contributor

gesellix commented Apr 1, 2017

Could the aggressive connection retries be replaced by a more specific "up and running" check? I'm thinking of the Docker healthcheck feature, though it alone won't really fix anything. The actual fix would be to use a PostgreSQL specific check. Does it provide some built in feature?

@kiview
Copy link
Member

kiview commented Apr 2, 2017

I've recently talked with rhuss, one of the authors of the docker-maven-plugin, since they of course face similar problems, in checking the up state of a container. He said they've gotten the best results by checking the containers's log output for a message, indicating a successful startup.

This of course very much depends on container's implementation, but maybe we could simply allow the usage of a LogOutputWaitStrategy?

@gesellix Postgres distributes a pg_isready client executable, we could launch a different container containing this executable and connect it to the postgres container. I wonder how I generic approach would look in this case, something like ExternalContainerWaitStrategy maybe?

@gesellix
Copy link
Contributor

gesellix commented Apr 2, 2017

@kiview how does pg_isready work? I assume it also somehow needs to perform requests to the PG server and handle potential connectivity errors gracefully.

Either way: one needs to decide whether a more generic approach (watching log output) vs. highly specific solutions (using or simulating pg_isready) is preferred. Maybe both options should be possible and the user needs to decide?

@kiview
Copy link
Member

kiview commented Apr 2, 2017

pg_isready simply returns different exit codes. So we'd have to loop the call until 0 is returned. It also need network connectivity to the database. https://www.postgresql.org/docs/9.3/static/app-pg-isready.html

I think we already planned to refactor the JdbcContainers to favor composition over inheritance, which would allow us, to specify the WaitStrategy for these specialized containers in a more decoupled way I've think. I already did some prototype work to extract the current wait behavior into a JdbcWaitStrategy:
https://github.com/testcontainers/testcontainers-spock/blob/jdbc-wait/src/main/groovy/com/groovycoder/spockdockerextension/wait/JdbcWaitStrategy.groovy
However this implementation would of course throw the same exceptions.

Currently I'd prefer to implement a LogOutputWaitStrategy based solution and also make this available for GenericContainer.

@gesellix
Copy link
Contributor

gesellix commented Apr 2, 2017

I would also start with a generic/log based solution - specific ones can always be built on top.

@kiview
Copy link
Member

kiview commented Apr 2, 2017

I'll start working on LogWaitStrategy we can use for GenericContainers then. Afterwards we have to look into PostgresContainer how to be incorperate the new wait strategy. We wanted to refactor this after all, so I assume this approach is fine?

@rnorth
Copy link
Member

rnorth commented Apr 2, 2017

@kiview This sounds good to me, though we're really aiming to make the inheritance->composition transition at the same time as general breaking API changes for v2.0. If it can be done and make things better without changing the API for users right now, that'd be great!

We collectively need to discuss the v2.0 API changes and composition model, and I'd really appreciate your input on that - it's something we should all talk about soon.

For the LogWaitStrategy I hope you might get some value/reuse out of WaitingConsumer, if you've not already come across that!

@kiview
Copy link
Member

kiview commented Apr 2, 2017

WaitingConsumer looks like a good match for this task, it's currently only used for internal testing, right?

I'll start with the LogWaitStrategy for GenericContainer and see if I can incorporate it into JdbcDatabaseContainer or PostgreSQLContainer without breaking anything.

@asafm
Copy link
Contributor

asafm commented Apr 4, 2017 via email

@kiview
Copy link
Member

kiview commented Apr 7, 2017

I assume we don't want to add a production scope HikariCP dependency to testcontainers-postgres.

When looking into the Postgres log output, I've discovered that the official postgres image will restart once, before being completely ready. We basically see the following output once the database is ready:

LOG:  database system was shut down at 2017-04-07 17:01:42 UTC
LOG:  MultiXact member wraparound protections are now enabled
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
 done
server started
ALTER ROLE

/usr/local/bin/docker-entrypoint.sh: ignoring /docker-entrypoint-initdb.d/*

LOG:  received fast shutdown request
LOG:  aborting any active transactions
waiting for server to shut down....LOG:  autovacuum launcher shutting down
LOG:  shutting down
LOG:  database system is shut down
 done
server stopped

PostgreSQL init process complete; ready for start up.

LOG:  database system was shut down at 2017-04-07 17:01:43 UTC
LOG:  MultiXact member wraparound protections are now enabled
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections

This means a simple logline based approach won't cut it, we need multiple occurrences of database system is ready to accept connections.

@kiview
Copy link
Member

kiview commented Apr 8, 2017

Seems they've discovered the same problem in the docker-maven plugin and added multiline log matching:
fabric8io/docker-maven-plugin#628
fabric8io/docker-maven-plugin#743

kiview pushed a commit to kiview/testcontainers-java that referenced this issue Apr 12, 2017
kiview pushed a commit to kiview/testcontainers-java that referenced this issue Apr 21, 2017
rnorth pushed a commit that referenced this issue Apr 30, 2017
* Update to latest postgres jdbc driver to make error logs visible

* Change PostgreSQLContainer to use WaitingConsumer for waiting instead of SQL query (#317)

Fix #317

* Make OutputFrame return empty String if containing no bytes

* Change PostgreSQLContainer to use LogMessageWaitStrategy (#317)

Fix #317
@laurikimmel
Copy link

@kiview thanks for the hint.

We had similar long standing issue with Postgres image in our own code base. We are using TCP probe for waiting strategy - with sensible timeouts trying to open connection to expected service port. Once successful service is considered up and running. Only problem we have seen so far was with Postgres container while running on Linux. At the same time it was working fine on MacOS and Windows.

I didn't realise the root cause was inside the container before seeing your comment above. Fix was quite simple. Wait bit more for Postgres - until n successful probes in a row.

Btw. we would like to migrate to testcontainers once TestNG is supported.

@kiview
Copy link
Member

kiview commented Apr 9, 2018

Hey @laurikimmel, great that this information could help.
Please see this comment in #132, you can already use Testcontainers in TestNG (and in general even outside of any test frameworks) if you start and stop containers manually (and still benefit from additional Testcontainers features such as WaitStrategies).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants