Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

java.lang.Thread.State: RUNNABLE at java.net.SocketOutputStream.socketWrite0(Native Method #2031

Open
Byte-n opened this issue Dec 17, 2024 · 2 comments
Labels
client-api-v2 investigating Investigation of a root cause is on going network network and IO related issues

Comments

@Byte-n
Copy link

Byte-n commented Dec 17, 2024

Describe the bug

The thread is stuck at this position for a long time:
java.lang.Thread.State: RUNNABLE at java.net.SocketOutputStream.socketWrite0(Native Method

Steps to reproduce

  1. new Client.Builder()....build() in main thread

Expected behaviour

Return to the client

Code example

public static void upload(String tableName, String rowsStr) throws Exception {
    String[] rows = rowsStr.split("\n");
    if (rows.length == 0) {
        return;
    }

    try (Client client = new Client.Builder()
            .enableConnectionPool(false)
            // .compressClientRequest(true)
            // .setMaxConnections(Runtime.getRuntime().availableProcessors())
            .setConnectionRequestTimeout(3, ChronoUnit.MINUTES)
            .addEndpoint("http://xxx:x")
            .setUsername("x")
            .setPassword("xxx")
            .setConnectionRequestTimeout(3, ChronoUnit.MINUTES)
            .setConnectTimeout(3, ChronoUnit.MINUTES)
            .setSocketTimeout(3, ChronoUnit.MINUTES)
            .setExecutionTimeout(3, ChronoUnit.MINUTES)
            .setKeepAliveTimeout(3, ChronoUnit.MINUTES)
            .setConnectTimeout(3, ChronoUnit.MINUTES)
            .setSocketKeepAlive(true)
            .setMaxRetries(0)
            .build();
         InsertResponse response = client.insert(
                 tableName,
                 new ByteArrayInputStream(rowsStr.getBytes(StandardCharsets.UTF_8)),
                 ClickHouseFormat.TSV
         ).get()
    ) {
        LOG.info("Insert finished: {} rows written, pre: {}", response.getMetrics().getMetric(ServerMetrics.NUM_ROWS_WRITTEN).getLong(), rows.length);
    } catch (Exception e) {
        LOG.error("Insert failed", e);
        Thread.sleep(1000 * 30);
        upload(tableName, rowsStr);
    }
}

Error log

...
7|javaApp  | 2024-12-17 17:01:05,874 [main] [org.byten.util.ClickHouseUtils.upload(ClickHouseUtils.java:53)] - [INFO] Insert finished: 701140 rows written, pre: 701140

7|javaApp  | 2024-12-17 17:01:06,003 [main] [com.clickhouse.client.api.Client$Builder.build(Client.java:938)] - [INFO] Using server timezone: UTC
7|javaApp  | 2024-12-17 17:01:06,005 [main] [com.clickhouse.client.api.internal.HttpAPIClientHelper.<init>(HttpAPIClientHelper.java:107)] - [INFO] client compression: false, server compression: true, http compression: false
7|javaApp  | 2024-12-17 17:01:06,005 [main] [com.clickhouse.client.api.Client.<init>(Client.java:173)] - [INFO] Using new http client implementation
7|javaApp  | 2024-12-17 17:01:08,086 [main] [org.byten.util.ClickHouseUtils.upload(ClickHouseUtils.java:53)] - [INFO] Insert finished: 639313 rows written, pre: 639313

7|javaApp  | 2024-12-17 17:01:08,431 [main] [com.clickhouse.client.api.Client$Builder.build(Client.java:938)] - [INFO] Using server timezone: UTC
7|javaApp  | 2024-12-17 17:01:08,432 [main] [com.clickhouse.client.api.internal.HttpAPIClientHelper.<init>(HttpAPIClientHelper.java:107)] - [INFO] client compression: false, server compression: true, http compression: false
7|javaApp  | 2024-12-17 17:01:08,432 [main] [com.clickhouse.client.api.Client.<init>(Client.java:173)] - [INFO] Using new http client implementation
7|javaApp  | 2024-12-17 17:01:10,423 [main] [org.byten.util.ClickHouseUtils.upload(ClickHouseUtils.java:53)] - [INFO] Insert finished: 612367 rows written, pre: 612367

7|javaApp  | 2024-12-17 17:01:10,556 [main] [com.clickhouse.client.api.Client$Builder.build(Client.java:938)] - [INFO] Using server timezone: UTC
7|javaApp  | 2024-12-17 17:01:10,557 [main] [com.clickhouse.client.api.internal.HttpAPIClientHelper.<init>(HttpAPIClientHelper.java:107)] - [INFO] client compression: false, server compression: true, http compression: false
7|javaApp  | 2024-12-17 17:01:10,557 [main] [com.clickhouse.client.api.Client.<init>(Client.java:173)] - [INFO] Using new http client implementation

^C
➜  / date
Tue Dec 17 17:22:27 UTC 2024
➜  / ps -ef | grep java
root     59252   427 13 17:00 ?        00:03:04 /usr/bin/java -Xmx25g -Dlog4j.configuration=file:/data/jar/log4j.properties -jar java-service.jar /data/baiduyun/
➜  / jstack 59252
2024-12-17 17:23:04
Full thread dump OpenJDK 64-Bit Server VM (25.432-bga mixed mode):

"Attach Listener" #18 daemon prio=9 os_prio=0 tid=0x00007f4204001000 nid=0xe903 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Service Thread" #17 daemon prio=9 os_prio=0 tid=0x00007f42942d3000 nid=0xe793 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread11" #16 daemon prio=9 os_prio=0 tid=0x00007f42942d0000 nid=0xe792 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread10" #15 daemon prio=9 os_prio=0 tid=0x00007f42942ce000 nid=0xe791 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread9" #14 daemon prio=9 os_prio=0 tid=0x00007f42942cc000 nid=0xe790 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread8" #13 daemon prio=9 os_prio=0 tid=0x00007f42942ca000 nid=0xe78f waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread7" #12 daemon prio=9 os_prio=0 tid=0x00007f42942c8000 nid=0xe78e waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread6" #11 daemon prio=9 os_prio=0 tid=0x00007f42942c5800 nid=0xe78d waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread5" #10 daemon prio=9 os_prio=0 tid=0x00007f42942c3800 nid=0xe78c waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread4" #9 daemon prio=9 os_prio=0 tid=0x00007f42942c1800 nid=0xe78b waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread3" #8 daemon prio=9 os_prio=0 tid=0x00007f42942bf800 nid=0xe78a waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007f42942bd800 nid=0xe789 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f42942bb800 nid=0xe788 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f42942b8800 nid=0xe787 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f42942b1000 nid=0xe786 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f4294281800 nid=0xe785 in Object.wait() [0x00007f424f4f4000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x000000018cb3f2d8> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
        - locked <0x000000018cb3f2d8> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:188)

"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f429427d000 nid=0xe784 in Object.wait() [0x00007f424f5f4000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000001828b8ff0> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:502)
        at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
        - locked <0x00000001828b8ff0> (a java.lang.ref.Reference$Lock)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

"main" #1 prio=5 os_prio=0 tid=0x00007f429400a000 nid=0xe775 runnable [0x00007f429b7a1000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
        at org.apache.hc.core5.http.impl.io.DefaultBHttpClientConnection$1.write(DefaultBHttpClientConnection.java:232)
        at org.apache.hc.core5.http.impl.io.SessionOutputBufferImpl.write(SessionOutputBufferImpl.java:143)
        at org.apache.hc.core5.http.impl.io.ChunkedOutputStream.flushCacheWithAppend(ChunkedOutputStream.java:143)
        at org.apache.hc.core5.http.impl.io.ChunkedOutputStream.write(ChunkedOutputStream.java:222)
        at com.clickhouse.client.api.Client.lambda$insert$6(Client.java:1404)
        at com.clickhouse.client.api.Client$$Lambda$34/339924917.execute(Unknown Source)
        at org.apache.hc.core5.http.io.entity.EntityTemplate.writeTo(EntityTemplate.java:82)
        at com.clickhouse.client.api.internal.LZ4Entity.writeTo(LZ4Entity.java:84)
        at org.apache.hc.core5.http.impl.io.DefaultBHttpClientConnection.sendRequestEntity(DefaultBHttpClientConnection.java:253)
        at org.apache.hc.core5.http.impl.io.HttpRequestExecutor.execute(HttpRequestExecutor.java:141)
        at org.apache.hc.core5.http.impl.io.HttpRequestExecutor.execute(HttpRequestExecutor.java:218)
        at org.apache.hc.client5.http.impl.io.BasicHttpClientConnectionManager$InternalConnectionEndpoint.execute(BasicHttpClientConnectionManager.java:555)
        at org.apache.hc.client5.http.impl.classic.InternalExecRuntime.execute(InternalExecRuntime.java:216)
        at org.apache.hc.client5.http.impl.classic.MainClientExec.execute(MainClientExec.java:116)
        at org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
        at org.apache.hc.client5.http.impl.classic.ExecChainElement$$Lambda$35/2146608740.proceed(Unknown Source)
        at org.apache.hc.client5.http.impl.classic.ConnectExec.execute(ConnectExec.java:188)
        at org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
        at org.apache.hc.client5.http.impl.classic.ExecChainElement$$Lambda$35/2146608740.proceed(Unknown Source)
        at org.apache.hc.client5.http.impl.classic.ProtocolExec.execute(ProtocolExec.java:192)
        at org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
        at org.apache.hc.client5.http.impl.classic.ExecChainElement$$Lambda$35/2146608740.proceed(Unknown Source)
        at org.apache.hc.client5.http.impl.classic.HttpRequestRetryExec.execute(HttpRequestRetryExec.java:113)
        at org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
        at org.apache.hc.client5.http.impl.classic.ExecChainElement$$Lambda$35/2146608740.proceed(Unknown Source)
        at org.apache.hc.client5.http.impl.classic.ContentCompressionExec.execute(ContentCompressionExec.java:152)
        at org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
        at org.apache.hc.client5.http.impl.classic.ExecChainElement$$Lambda$35/2146608740.proceed(Unknown Source)
        at org.apache.hc.client5.http.impl.classic.RedirectExec.execute(RedirectExec.java:116)
        at org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
        at org.apache.hc.client5.http.impl.classic.InternalHttpClient.doExecute(InternalHttpClient.java:170)
        at org.apache.hc.client5.http.impl.classic.CloseableHttpClient.execute(CloseableHttpClient.java:87)
        at org.apache.hc.client5.http.impl.classic.CloseableHttpClient.execute(CloseableHttpClient.java:55)
        at org.apache.hc.client5.http.classic.HttpClient.executeOpen(HttpClient.java:183)
        at com.clickhouse.client.api.internal.HttpAPIClientHelper.executeRequest(HttpAPIClientHelper.java:347)
        at com.clickhouse.client.api.Client.lambda$insert$7(Client.java:1393)
        at com.clickhouse.client.api.Client$$Lambda$33/660879561.get(Unknown Source)
        at com.clickhouse.client.api.Client.runAsyncOperation(Client.java:1942)
        at com.clickhouse.client.api.Client.insert(Client.java:1485)
        at com.clickhouse.client.api.Client.insert(Client.java:1341)
        at org.byten.util.ClickHouseUtils.upload(ClickHouseUtils.java:47)
        at org.byten.convert.WGS_single.lambda$run$6(WGS_single.java:252)
        at org.byten.convert.WGS_single$$Lambda$15/672320506.apply(Unknown Source)
        at org.byten.util.ClickHouseUtils.readFile(ClickHouseUtils.java:67)
        at org.byten.convert.WGS_single.run(WGS_single.java:150)
        at org.byten.convert.WGS_single.execute(WGS_single.java:137)
        at org.byten.convert.WGS_single.main(WGS_single.java:109)

"VM Thread" os_prio=0 tid=0x00007f4294273000 nid=0xe783 runnable 

"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f4294020000 nid=0xe776 runnable 

"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f4294022000 nid=0xe777 runnable 

"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007f4294024000 nid=0xe778 runnable 

"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007f4294025800 nid=0xe779 runnable 

"GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007f4294027800 nid=0xe77a runnable 

"GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007f4294029000 nid=0xe77b runnable 

"GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00007f429402b000 nid=0xe77c runnable 

"GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00007f429402c800 nid=0xe77d runnable 

"GC task thread#8 (ParallelGC)" os_prio=0 tid=0x00007f429402e800 nid=0xe77e runnable 

"GC task thread#9 (ParallelGC)" os_prio=0 tid=0x00007f4294030000 nid=0xe77f runnable 

"GC task thread#10 (ParallelGC)" os_prio=0 tid=0x00007f4294032000 nid=0xe780 runnable 

"GC task thread#11 (ParallelGC)" os_prio=0 tid=0x00007f4294034000 nid=0xe781 runnable 

"GC task thread#12 (ParallelGC)" os_prio=0 tid=0x00007f4294035800 nid=0xe782 runnable 

"VM Periodic Task Thread" os_prio=0 tid=0x00007f42942d5800 nid=0xe794 waiting on condition 

JNI global references: 342

➜  / 

Configuration

Environment

  • Client version: client-v2 0.7.1
  • Language version:
    openjdk version "1.8.0_432"
    OpenJDK Runtime Environment (build 1.8.0_432-8u432-gaus1-0ubuntu224.04-ga)
    OpenJDK 64-Bit Server VM (build 25.432-bga, mixed mode)
  • OS:
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description: Ubuntu 24.04.1 LTS
    Release: 24.04
    Codename: noble

ClickHouse server

  • ClickHouse Server version: (version 24.11.1.2557 (official build)

  • ClickHouse Server non-default settings, if any:
    set max_insert_threads = 32;
    set input_format_parallel_parsing = 0;
    set receive_timeout = 30000;
    set input_format_allow_errors_num = 0;
    set parts_to_throw_insert = 3000;
    set max_insert_block_size = 1048449;
    set min_insert_block_size_rows = 1048449;
    set min_insert_block_size_bytes = 268402944;
    set parts_to_delay_insert = 0;
    set parts_to_throw_insert = 300;

  • CREATE TABLE statements for tables involved:

  • Sample data for all these tables, use clickhouse-obfuscator if necessary

@Byte-n Byte-n added the bug label Dec 17, 2024
@laeg laeg added the community label Dec 17, 2024 — with Linear
@Byte-n
Copy link
Author

Byte-n commented Dec 18, 2024

This state can last for more than twelve hours. CPU usage is almost 0

@laeg laeg added the performance label Dec 18, 2024 — with Linear
@laeg laeg removed the community label Dec 18, 2024
@chernser chernser added client-api-v2 investigating Investigation of a root cause is on going and removed performance bug labels Dec 20, 2024
@chernser
Copy link
Contributor

Good day, @Byte-n !
Thank you for reporting it!

Is problem 100% reproducible?

It is quite strange that server or client doesn't timeout after 12hours. What is typical length of rowsStr ?
How much time is spend splitting it by lines?
The thread is RUNNABLE so it seems doing something. It would be blocked or done if nothing to send.

@chernser chernser added the network network and IO related issues label Jan 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client-api-v2 investigating Investigation of a root cause is on going network network and IO related issues
Projects
None yet
Development

No branches or pull requests

3 participants