Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Finalizing JUnit 5 test case with Kafka Streams and Spring Boot takes several minutes #8867

Closed
MarvinVaillant opened this issue Jul 9, 2024 · 8 comments
Labels

Comments

@MarvinVaillant
Copy link

MarvinVaillant commented Jul 9, 2024

Module

Kafka

Testcontainers version

1.19.8

Using the latest Testcontainers version?

Yes

Host OS

MacOS, Linux, Windows

Host Arch

ARM, x86

Docker version

This is my local setup with Podman. In the pipeline we use Docker and other colleagues use Docker Desktop where the issue also happens.

Client: Docker Engine - Community
 Version:           27.0.3
 API version:       1.41 (downgraded from 1.46)
 Go version:        go1.22.4
 Git commit:        7d4bcd863a
 Built:             Fri Jun 28 14:56:30 2024
 OS/Arch:           darwin/arm64
 Context:           default

Server: linux/arm64/fedora-40
 Podman Engine:
  Version:          5.0.3
  APIVersion:       5.0.3
  Arch:             arm64
  BuildTime:        2024-05-10T02:00:00+02:00
  Experimental:     false
  GitCommit:
  GoVersion:        go1.22.2
  KernelVersion:    6.8.8-300.fc40.aarch64
  MinAPIVersion:    4.0.0
  Os:               linux
 Conmon:
  Version:          conmon version 2.1.10, commit:
  Package:          conmon-2.1.10-1.fc40.aarch64
 OCI Runtime (crun):
  Version:          crun version 1.14.4
commit: a220ca661ce078f2c37b38c92e66cf66c012d9c1
rundir: /run/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  Package:          crun-1.14.4-1.fc40.aarch64
 Engine:
  Version:          5.0.3
  API version:      1.41 (minimum version 1.24)
  Go version:       go1.22.2
  Git commit:
  Built:            Fri May 10 02:00:00 2024
  OS/Arch:          linux/arm64
  Experimental:     false

What happened?

We're running a Spring Boot project with Kafka Streams. We have one integration test (using @SpringBootTest and @Testcontainers) that sometimes takes several minutes to complete after all tests successfully ran. Lots of Kafka "no connection to node" and "Node 1 disconnected" are printed in the logs repeatedly.

It seems like the Kafka container is stopped before the Spring context is shutdown. And the Spring Context is waiting for Kafka to become available before shutting down.

Relevant log output

### The following 20 lines are printed repeatedly for several minutes
[11:00:35.147] [INFO] [org.apache.kafka.clients.NetworkClient] - [Consumer clientId=<our-consumer-group>-StreamThread-1-consumer, groupId=streams.notification-report-selector.notification-repartitioner] Node 1 disconnected.
[11:00:35.147] [WARN] [org.apache.kafka.clients.NetworkClient] - [Consumer clientId=<our-consumer-group>-StreamThread-1-consumer, groupId=streams.notification-report-selector.notification-repartitioner] Connection to node 1 (localhost/127.0.0.1:32772) could not be established. Node may not be available.
[11:00:35.218] [INFO] [org.apache.kafka.clients.NetworkClient] - [AdminClient clientId=<our-consumer-group>-admin] Node 1 disconnected.
[11:00:35.218] [WARN] [org.apache.kafka.clients.NetworkClient] - [AdminClient clientId=<our-consumer-group>-admin] Connection to node 1 (localhost/127.0.0.1:32772) could not be established. Node may not be available.
[11:00:35.350] [INFO] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=producer-3] Node 1 disconnected.
[11:00:35.350] [WARN] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=producer-3] Connection to node 1 (localhost/127.0.0.1:32772) could not be established. Node may not be available.
[11:00:35.591] [INFO] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=<our-consumer-group>-StreamThread-1-producer] Node 1 disconnected.
[11:00:35.591] [WARN] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=<our-consumer-group>-StreamThread-1-producer] Connection to node 1 (localhost/127.0.0.1:32772) could not be established. Node may not be available.
[11:00:35.782] [INFO] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=producer-1] Node 1 disconnected.
[11:00:35.782] [WARN] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=producer-1] Connection to node 1 (localhost/127.0.0.1:32772) could not be established. Node may not be available.
[11:00:36.072] [INFO] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=producer-2] Node 1 disconnected.
[11:00:36.072] [WARN] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=producer-2] Connection to node 1 (localhost/127.0.0.1:32772) could not be established. Node may not be available.
[11:00:36.148] [INFO] [org.apache.kafka.clients.NetworkClient] - [Consumer clientId=<our-consumer-group>-StreamThread-1-consumer, groupId=[<our-consumer-group>]] Node 1 disconnected.
[11:00:36.148] [WARN] [org.apache.kafka.clients.NetworkClient] - [Consumer clientId=<our-consumer-group>-StreamThread-1-consumer, groupId=[<our-consumer-group>]] Connection to node 1 (localhost/127.0.0.1:32772) could not be established. Node may not be available.
[11:00:36.221] [INFO] [org.apache.kafka.clients.NetworkClient] - [AdminClient clientId=<our-consumer-group>-admin] Node 1 disconnected.
[11:00:36.221] [WARN] [org.apache.kafka.clients.NetworkClient] - [AdminClient clientId=<our-consumer-group>-admin] Connection to node 1 (localhost/127.0.0.1:32772) could not be established. Node may not be available.
[11:00:36.370] [INFO] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=producer-3] Node 1 disconnected.
[11:00:36.371] [WARN] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=producer-3] Connection to node 1 (localhost/127.0.0.1:32772) could not be established. Node may not be available.
[11:00:36.544] [INFO] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=<our-consumer-group>-StreamThread-1-producer] Node 1 disconnected.
[11:00:36.544] [WARN] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=<our-consumer-group>-StreamThread-1-producer] Connection to node 1 (localhost/127.0.0.1:32772) could not be established. Node may not be available.
[11:00:36.731] [INFO] [org.apache.kafka.clients.admin.internals.AdminMetadataManager] - [AdminClient clientId=<our-consumer-group>-admin] Metadata update failed
org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: fetchMetadata
[11:00:36.804] [INFO] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=producer-1] Node 1 disconnected.
[11:00:36.804] [WARN] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=producer-1] Connection to node 1 (localhost/127.0.0.1:32772) could not be established. Node may not be available.
[11:00:36.925] [INFO] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=producer-2] Node 1 disconnected.
[11:00:36.925] [WARN] [org.apache.kafka.clients.NetworkClient] - [Producer clientId=producer-2] Connection to node 1 (localhost/127.0.0.1:32772) could not be established. Node may not be available.

### The following log lines are only printed once at when the tests are finalized

Error: 1:00:36.961] [ERROR] [org.apache.kafka.streams.processor.internals.TaskExecutor] - stream-thread [<our-consumer-group>-StreamThread-1] Committing task(s) 0_1 failed.
org.apache.kafka.common.errors.TimeoutException: Timeout of 60000ms expired before successfully committing offsets {notification.managed-1=OffsetAndMetadata{offset=5, leaderEpoch=null, metadata='AgAAAZCWstBN'}}
Error: 1:00:36.961] [ERROR] [org.apache.kafka.streams.processor.internals.TaskManager] - stream-thread [<our-consumer-group>-StreamThread-1] Exception caught while committing tasks [TaskId: 0_1
	ProcessorTopology:
		Notifications:
			topics:		[notification.managed]
			children:	[Event-Type-Filter]
		Event-Type-Filter:
			children:	[Key-Selector]
		Key-Selector:
			children:	[Notifications-By-Id]
		Notifications-By-Id:
			topic:		StaticTopicNameExtractor(notification.by-id)
Partitions [notification.managed-1]
]
org.apache.kafka.common.errors.TimeoutException: Timeout of 60000ms expired before successfully committing offsets {notification.managed-1=OffsetAndMetadata{offset=5, leaderEpoch=null, metadata='AgAAAZCWstBN'}}
[11:00:36.961] [INFO] [org.apache.kafka.streams.processor.internals.StreamTask] - stream-thread [<our-consumer-group>-StreamThread-1] task [0_0] Suspended from RUNNING
[11:00:36.962] [INFO] [org.apache.kafka.streams.processor.internals.RecordCollectorImpl] - stream-thread [<our-consumer-group>-StreamThread-1] stream-task [0_0] Closing record collector dirty
[11:00:36.962] [INFO] [org.apache.kafka.streams.processor.internals.StreamTask] - stream-thread [<our-consumer-group>-StreamThread-1] task [0_0] Closed dirty
[11:00:36.962] [INFO] [org.apache.kafka.streams.processor.internals.StreamTask] - stream-thread [<our-consumer-group>-StreamThread-1] task [0_1] Suspended from RUNNING
[11:00:36.963] [INFO] [org.apache.kafka.streams.processor.internals.RecordCollectorImpl] - stream-thread [<our-consumer-group>-StreamThread-1] stream-task [0_1] Closing record collector dirty
[11:00:36.963] [INFO] [org.apache.kafka.streams.processor.internals.StreamTask] - stream-thread [<our-consumer-group>-StreamThread-1] task [0_1] Closed dirty
[11:00:36.963] [INFO] [org.apache.kafka.clients.producer.KafkaProducer] - [Producer clientId=<our-consumer-group>-StreamThread-1-producer] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
[11:00:36.966] [INFO] [org.apache.kafka.common.metrics.Metrics] - Metrics scheduler closed
[11:00:36.966] [INFO] [org.apache.kafka.common.metrics.Metrics] - Closing reporter org.apache.kafka.common.metrics.JmxReporter
[11:00:36.966] [INFO] [org.apache.kafka.common.metrics.Metrics] - Closing reporter org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter
[11:00:36.966] [INFO] [org.apache.kafka.common.metrics.Metrics] - Metrics reporters closed
[11:00:36.967] [INFO] [org.apache.kafka.common.utils.AppInfoParser] - App info kafka.producer for <our-consumer-group>-StreamThread-1-producer unregistered
Error: 1:00:36.967] [ERROR] [org.apache.kafka.streams.processor.internals.StreamThread] - stream-thread [<our-consumer-group>-StreamThread-1] Failed to close task manager due to the following error:
org.apache.kafka.common.errors.TimeoutException: Timeout of 60000ms expired before successfully committing offsets {notification.managed-1=OffsetAndMetadata{offset=5, leaderEpoch=null, metadata='AgAAAZCWstBN'}}
[11:00:36.967] [INFO] [org.apache.kafka.clients.consumer.internals.LegacyKafkaConsumer] - [Consumer clientId=<our-consumer-group>-StreamThread-1-restore-consumer, groupId=null] Unsubscribed all topics or patterns and assigned partitions
[11:00:36.967] [INFO] [org.apache.kafka.common.metrics.Metrics] - Metrics scheduler closed
[11:00:36.967] [INFO] [org.apache.kafka.common.metrics.Metrics] - Closing reporter org.apache.kafka.common.metrics.JmxReporter
[11:00:36.967] [INFO] [org.apache.kafka.common.metrics.Metrics] - Closing reporter org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter
[11:00:36.967] [INFO] [org.apache.kafka.common.metrics.Metrics] - Metrics reporters closed
[11:00:36.970] [INFO] [org.apache.kafka.common.utils.AppInfoParser] - App info kafka.consumer for <our-consumer-group>-StreamThread-1-consumer unregistered
[11:00:36.970] [INFO] [org.apache.kafka.common.metrics.Metrics] - Metrics scheduler closed
[11:00:36.970] [INFO] [org.apache.kafka.common.metrics.Metrics] - Closing reporter org.apache.kafka.common.metrics.JmxReporter
[11:00:36.970] [INFO] [org.apache.kafka.common.metrics.Metrics] - Closing reporter org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter
[11:00:36.970] [INFO] [org.apache.kafka.common.metrics.Metrics] - Metrics reporters closed
[11:00:36.972] [INFO] [org.apache.kafka.common.utils.AppInfoParser] - App info kafka.consumer for <our-consumer-group>-StreamThread-1-restore-consumer unregistered
[11:00:36.973] [INFO] [org.apache.kafka.streams.processor.internals.StreamThread] - stream-thread [<our-consumer-group>-StreamThread-1] State transition from PENDING_SHUTDOWN to DEAD
[11:00:36.973] [INFO] [org.apache.kafka.streams.processor.internals.StreamThread] - stream-thread [<our-consumer-group>-StreamThread-1] Shutdown complete
[11:00:36.973] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Shutdown 1 stream threads complete
[11:00:36.973] [INFO] [org.apache.kafka.common.utils.AppInfoParser] - App info kafka.admin.client for <our-consumer-group>-admin unregistered
[11:00:36.973] [INFO] [org.apache.kafka.clients.admin.internals.AdminMetadataManager] - [AdminClient clientId=<our-consumer-group>-admin] Metadata update failed
org.apache.kafka.common.errors.TimeoutException: The AdminClient thread has exited. Call: fetchMetadata
[11:00:36.973] [INFO] [org.apache.kafka.clients.admin.KafkaAdminClient] - [AdminClient clientId=<our-consumer-group>-admin] Timed out 1 remaining operation(s) during close.
[11:00:36.984] [INFO] [org.apache.kafka.common.metrics.Metrics] - Metrics scheduler closed
[11:00:36.984] [INFO] [org.apache.kafka.common.metrics.Metrics] - Closing reporter org.apache.kafka.common.metrics.JmxReporter
[11:00:36.984] [INFO] [org.apache.kafka.common.metrics.Metrics] - Metrics reporters closed
[11:00:36.984] [INFO] [org.apache.kafka.common.metrics.Metrics] - Metrics scheduler closed
[11:00:36.984] [INFO] [org.apache.kafka.common.metrics.Metrics] - Closing reporter org.apache.kafka.common.metrics.JmxReporter
[11:00:36.984] [INFO] [org.apache.kafka.common.metrics.Metrics] - Metrics reporters closed
[11:00:36.985] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] State transition from PENDING_SHUTDOWN to NOT_RUNNING
[11:00:36.985] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client stopped completely
[11:00:37.006] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal NOT_RUNNING state, all resources are closed and the client has stopped.
[11:00:37.007] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal NOT_RUNNING state, all resources are closed and the client has stopped.
[11:00:37.007] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal NOT_RUNNING state, all resources are closed and the client has stopped.
[11:00:37.007] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal NOT_RUNNING state, all resources are closed and the client has stopped.
[11:00:37.011] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.
[11:00:37.011] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.
[11:00:37.011] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.
[11:00:37.011] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.
[11:00:37.016] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.
[11:00:37.016] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.
[11:00:37.016] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.
[11:00:37.016] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.
[11:00:37.028] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.
[11:00:37.028] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.
[11:00:37.029] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.
[11:00:37.029] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.
[11:00:37.044] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.
[11:00:37.044] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.
[11:00:37.044] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.
[11:00:37.044] [INFO] [org.apache.kafka.streams.KafkaStreams] - stream-client [<our-consumer-group>] Streams client is already in the terminal ERROR state, all resources are closed and the client has stopped.

Additional Information

The issue seems to be mitigated to some extend (we wait < 10 seconds for the test to finalize) if we start the container manually in the @BeforeAll method, instead of using the @Testcontainers and @Container annotations.

@MarvinVaillant MarvinVaillant changed the title [Bug]: [Bug]: Finalizing JUnit 5 test case with Kafka Streams and Spring Boot takes several minutes Jul 9, 2024
@MarvinVaillant
Copy link
Author

Does anyone have a clue why this can happen? Or how I can investigate further?

@eddumelendez
Copy link
Member

eddumelendez commented Jul 10, 2024

Hi, do you have a project to reproduce it?

@MarvinVaillant
Copy link
Author

I'll try to provide you with a minimal example today.

@MarvinVaillant
Copy link
Author

Sorry for the slow response. Here is the example repo: https://github.com/MarvinVaillant/demo-long-kafka-streams-teardown/tree/main

@MarvinVaillant
Copy link
Author

Here you'll also find a short video that shows the issue:
https://github.com/user-attachments/assets/535847ac-cbbc-4f72-b6c9-a09e682bf84e

@eddumelendez
Copy link
Member

@MarvinVaillant thanks for providing the example but I am not able to run the project. It fails because of Mapped port can only be obtained after the container is started. I can make some changes but I can not see the behavior described.

@MarvinVaillant
Copy link
Author

i'm really sorry. I was fiddeling around with the code to get some more insights. Could you please pull the latest commits?

@eddumelendez
Copy link
Member

Thanks for updating it. I'm moving this to a discussion because It is not an issue itself. Testcontainers is creating, running the container and killing it. However, in order to close the context you should call close explicitly or rather use @DirtiesContext. I do not have much experience on Kafka Streams but others can help on how to handle this.

@testcontainers testcontainers locked and limited conversation to collaborators Jul 15, 2024
@eddumelendez eddumelendez converted this issue into discussion #8948 Jul 15, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
Projects
None yet
Development

No branches or pull requests

2 participants