Trino-cli has no option to accept targetResultSize query parameter #22303

Closed
anilsomisetty opened this issue Jun 6, 2024 · 16 comments

@anilsomisetty
Contributor

anilsomisetty commented Jun 6, 2024

The query result GET request to the coordinator supports a targetResultSize query parameter, which controls how much data the coordinator can send to the client per GET request.
I see no way to configure it from the Trino client.

@anilsomisetty anilsomisetty self-assigned this Jun 6, 2024
@mosabua
Member

mosabua commented Jun 19, 2024

What is the use case you are trying to solve by exposing this parameter? The protocol is already conversational, so there is no real limit on the overall amount of data returned.

@anilsomisetty
Contributor Author

anilsomisetty commented Jun 19, 2024

Hi @mosabua

Currently, in the code referenced below, I see that the query GET request used to fetch the result set defines a targetResultSize query parameter with a default of 16MB, i.e. 16MB of the result set is sent from the coordinator to the client per GET request.

Another thing to note is that this parameter is never set, i.e. it is always null, because there is no way to set it from the client. However, I do see a condition that, if the parameter is set, takes the minimum of that value and the maximum target size of 128MB.

If this parameter could be set through the client, queries that fetch a large result set from the coordinator would need fewer GET requests, which would reduce wait time and let queries complete faster.

ExecuteStatementResource.java
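For illustration, the behavior described above amounts to roughly the following sketch; the constants mirror the 16MB/128MB values mentioned here, and the names are illustrative rather than the actual ExecuteStatementResource code:

```java
// A minimal sketch of the clamping described above; names are illustrative,
// not the actual Trino source.
final class TargetResultSizeExample
{
    static final long DEFAULT_BYTES = 16L * 1024 * 1024;  // 16MB default per GET request
    static final long MAX_BYTES = 128L * 1024 * 1024;     // 128MB cap on the coordinator

    private TargetResultSizeExample() {}

    // requestedBytes models the ?targetResultSize= query parameter. It is effectively
    // always null today because no client sends it, so the 16MB default always wins.
    static long resolve(Long requestedBytes)
    {
        if (requestedBytes == null) {
            return DEFAULT_BYTES;
        }
        // When the parameter is provided, it is capped at the maximum target size.
        return Math.min(requestedBytes, MAX_BYTES);
    }
}
```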

Please let me know your thoughts.

@mosabua
Member

mosabua commented Jun 19, 2024

This might be something we can do in our current work on improving client protocol performance with Project Swift.

#22271

Note that we don't know whether increasing the size actually improves performance. Did you test this, and can you report any findings?

Also .. ideally users would not have to configure such things just to get a faster connection.

@anilsomisetty
Contributor Author

What I observed in the code is that targetResultSize is defined as a query parameter but is never used or set by the Trino client while running a query, so I was curious why that was not done.

I ran a query that returns approximately 6GB of data (7.7 million rows) on a Trino cluster from DbVisualizer using the trino-jdbc client driver jar. These are my findings (see the sketch after the list for how the parameter was applied in the test):

  1. With targetResultSize at the default of 16MB, it took 1 hour to fetch the result.
  2. With targetResultSize at the maximum of 128MB, it took 4 minutes to fetch the result.
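For reference, the stock clients do not expose this parameter, so testing it means appending the query parameter by hand to the statement polling URI returned by the coordinator. A rough, hypothetical sketch of that (placeholder URI and plain java.net.http, not the actual JDBC driver change):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical illustration only: Trino clients do not let you set targetResultSize,
// so this sketch appends the query parameter by hand to a nextUri returned by the
// coordinator. The URI below is a placeholder, and "128MB" assumes the DataSize format.
public class TargetResultSizeProbe
{
    public static void main(String[] args) throws Exception
    {
        URI nextUri = URI.create("http://coordinator:8080/v1/statement/...");  // placeholder nextUri
        URI withTargetSize = URI.create(nextUri + "?targetResultSize=128MB");  // assumed parameter format

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(withTargetSize)
                .header("X-Trino-User", "test")  // user header expected by the coordinator
                .GET()
                .build();

        // Each poll returns one batch of rows plus a nextUri to follow for the next batch.
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```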

@mosabua
Member

mosabua commented Jun 24, 2024

Wow .. very interesting finding. We should take this into account for Project Swift @wendigo @dain @electrum @martint

@wendigo
Contributor

wendigo commented Jun 24, 2024

@mosabua I can't agree with these numbers. I've benchmarked changing this value, and the improvement diminishes for values over 32MB. The buffering and compression happen on the coordinator, so setting this value high puts additional pressure on the coordinator. That doesn't scale well.

@anilsomisetty
Contributor Author

The Trino cluster where I ran the query has 6 nodes: 5 workers and 1 coordinator (which does not act as a worker). Each node has 1TB of memory and 128 CPUs.

@wendigo
Contributor

wendigo commented Jul 17, 2024

We don't want to expose this parameter to the clients. With the spooled client protocol extension (#22662) it won't be effective either way.

@wendigo wendigo closed this as completed Jul 17, 2024
@pichlerpa

Interesting. I just added this parameter to the Power BI connector for testing, and I did notice a significant performance change on my local machine, where 16MB was the fastest for loading 500k rows into Power BI.

TargetResultSize 1MB (default) --> time taken: 00:02:37

TargetResultSize 16MB --> time taken: 00:01:46

TargetResultSize 128MB --> time taken: 00:01:58

@wendigo
Contributor

wendigo commented Jan 6, 2025

@pichlerpa this puts additional pressure on the coordinator, since the last response is cached so that the client can retry the last call. That's why it's not a recommended approach.

@pichlerpa

@wendigo I understand, but I think it's great to at least have the option. In the case of Power BI, this can be very useful to speed up imports, which often happen just once per night or a few times during the day.

@wendigo
Contributor

wendigo commented Jan 6, 2025

@pichlerpa That's why we've added a spooled protocol that allows you to use segments of a bigger size. In our benchmarks the new JDBC driver is 2-4x faster.
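For anyone who wants to try this from JDBC, a sketch of opting into the spooled encoding is below. The `encoding` connection property and the `json+zstd` value are assumptions based on recent driver releases, and the coordinator must also have spooling enabled, so check the client documentation for your version:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

// Sketch of requesting spooled (segment-based) results from the Trino JDBC driver.
// The "encoding" property and the "json+zstd" value are assumptions from recent
// releases; the coordinator must be configured with the spooling protocol enabled.
public class SpooledJdbcExample
{
    public static void main(String[] args) throws Exception
    {
        Properties properties = new Properties();
        properties.setProperty("user", "example");
        properties.setProperty("encoding", "json+zstd"); // assumed property selecting spooled segments

        String url = "jdbc:trino://coordinator:8080/tpch/sf1"; // placeholder coordinator and catalog
        try (Connection connection = DriverManager.getConnection(url, properties);
                Statement statement = connection.createStatement();
                ResultSet resultSet = statement.executeQuery("SELECT count(*) FROM orders")) {
            while (resultSet.next()) {
                System.out.println(resultSet.getLong(1));
            }
        }
    }
}
```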

@pichlerpa

@wendigo OK, yes, it seems to work like Databricks' cloud fetch mechanism. But what if I don't have an object store available? Isn't this a relatively simple option to improve performance, as long as the coordinator isn't too busy at those times, so that OOM errors are avoided? I don't really see the problem: if you know the system and its behavior, why not? Isn't it better than endless round trips to collect query results?

@wendigo
Contributor

wendigo commented Jan 6, 2025

@pichlerpa we are considering adding a spooling manager that works with local worker storage.

@pichlerpa

@wendigo is there any documentation or reference available for how to implement the spooling protocol in a new connector/client not relying on JDBC? Thanks for your feedback BTW!

@wendigo
Contributor

wendigo commented Jan 7, 2025

@pichlerpa #22662
