Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Olog : Intermittent connection #3236

Closed
GDH-ISIS opened this issue Jan 15, 2025 · 15 comments
Closed

Olog : Intermittent connection #3236

GDH-ISIS opened this issue Jan 15, 2025 · 15 comments

Comments

@GDH-ISIS
Copy link

GDH-ISIS commented Jan 15, 2025

The latest build of the windows client (15/01/2025) appears to give very intermittent connection with the following message. On the rare occasion, it does connect.

image

In the past, we have had a good connection to the olog service (on docker). Our Windows 10 client connected to the olog docker service without problems. I previous windows client install still does connect without issue. The olog web client still works.

@GDH-ISIS
Copy link
Author

Error log
2025-01-15 16:22:16 INFO [org.phoebus.logbook.olog.ui.LogEntryTableViewController] Single search: desc=*&start=12 hours&end=now
2025-01-15 16:22:16 WARNING [org.phoebus.olog.es.api.OlogHttpClient] failed to retrieve log entries
java.io.IOException: selector manager closed
at java.net.http/jdk.internal.net.http.HttpClientImpl.send(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientFacade.send(Unknown Source)
at org.phoebus.olog.es.api.OlogHttpClient.findLogs(OlogHttpClient.java:249)
at org.phoebus.olog.es.api.OlogHttpClient.search(OlogHttpClient.java:317)
at org.phoebus.logbook.olog.ui.LogbookSearchJob.lambda$getRunnable$0(LogbookSearchJob.java:59)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: selector manager closed
at java.net.http/jdk.internal.net.http.HttpClientImpl$SelectorManager.selectorClosedException(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientImpl$SelectorManager.register(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientImpl.registerEvent(Unknown Source)
at java.net.http/jdk.internal.net.http.PlainHttpConnection.connectAsync(Unknown Source)
at java.net.http/jdk.internal.net.http.Http1Exchange.sendHeadersAsync(Unknown Source)
at java.net.http/jdk.internal.net.http.Exchange.lambda$responseAsyncImpl0$8(Unknown Source)
at java.net.http/jdk.internal.net.http.Exchange.checkFor407(Unknown Source)
at java.net.http/jdk.internal.net.http.Exchange.lambda$responseAsyncImpl0$9(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture.uniHandle(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture.uniHandleStage(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture.handle(Unknown Source)
at java.net.http/jdk.internal.net.http.Exchange.responseAsyncImpl0(Unknown Source)
at java.net.http/jdk.internal.net.http.Exchange.responseAsyncImpl(Unknown Source)
at java.net.http/jdk.internal.net.http.Exchange.responseAsync(Unknown Source)
at java.net.http/jdk.internal.net.http.MultiExchange.responseAsyncImpl(Unknown Source)
at java.net.http/jdk.internal.net.http.MultiExchange.lambda$responseAsync0$2(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientImpl$DelegatingExecutor.execute(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture.completeAsync(Unknown Source)
at java.net.http/jdk.internal.net.http.MultiExchange.responseAsync(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientImpl.sendAsync(Unknown Source)
... 12 more
Caused by: java.lang.ArithmeticException: long overflow
at java.base/java.lang.Math.multiplyExact(Unknown Source)
at java.base/java.time.Instant.millisUntil(Unknown Source)
at java.base/java.time.Instant.until(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientImpl.purgeTimeoutsAndReturnNextDeadline(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientImpl$SelectorManager.run(Unknown Source)

2025-01-15 16:22:16 SEVERE [org.phoebus.logbook.olog.ui.LogbookSearchJob] Failed to obtain logs
java.lang.RuntimeException: java.io.IOException: selector manager closed
at org.phoebus.olog.es.api.OlogHttpClient.findLogs(OlogHttpClient.java:255)
at org.phoebus.olog.es.api.OlogHttpClient.search(OlogHttpClient.java:317)
at org.phoebus.logbook.olog.ui.LogbookSearchJob.lambda$getRunnable$0(LogbookSearchJob.java:59)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: selector manager closed
at java.net.http/jdk.internal.net.http.HttpClientImpl.send(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientFacade.send(Unknown Source)
at org.phoebus.olog.es.api.OlogHttpClient.findLogs(OlogHttpClient.java:249)
... 9 more
Caused by: java.io.IOException: selector manager closed
at java.net.http/jdk.internal.net.http.HttpClientImpl$SelectorManager.selectorClosedException(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientImpl$SelectorManager.register(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientImpl.registerEvent(Unknown Source)
at java.net.http/jdk.internal.net.http.PlainHttpConnection.connectAsync(Unknown Source)
at java.net.http/jdk.internal.net.http.Http1Exchange.sendHeadersAsync(Unknown Source)
at java.net.http/jdk.internal.net.http.Exchange.lambda$responseAsyncImpl0$8(Unknown Source)
at java.net.http/jdk.internal.net.http.Exchange.checkFor407(Unknown Source)
at java.net.http/jdk.internal.net.http.Exchange.lambda$responseAsyncImpl0$9(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture.uniHandle(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture.uniHandleStage(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture.handle(Unknown Source)
at java.net.http/jdk.internal.net.http.Exchange.responseAsyncImpl0(Unknown Source)
at java.net.http/jdk.internal.net.http.Exchange.responseAsyncImpl(Unknown Source)
at java.net.http/jdk.internal.net.http.Exchange.responseAsync(Unknown Source)
at java.net.http/jdk.internal.net.http.MultiExchange.responseAsyncImpl(Unknown Source)
at java.net.http/jdk.internal.net.http.MultiExchange.lambda$responseAsync0$2(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientImpl$DelegatingExecutor.execute(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture.completeAsync(Unknown Source)
at java.net.http/jdk.internal.net.http.MultiExchange.responseAsync(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientImpl.sendAsync(Unknown Source)
... 12 more
Caused by: java.lang.ArithmeticException: long overflow
at java.base/java.lang.Math.multiplyExact(Unknown Source)
at java.base/java.time.Instant.millisUntil(Unknown Source)
at java.base/java.time.Instant.until(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientImpl.purgeTimeoutsAndReturnNextDeadline(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientImpl$SelectorManager.run(Unknown Source)

@kasemir
Copy link
Collaborator

kasemir commented Jan 15, 2025

No idea, but Caused by: java.lang.ArithmeticException: long overflow suggests it might be related to switching to the native HttpClient, #3219, and somehow setting a bad timeout

@GDH-ISIS
Copy link
Author

Looking at the dates of our working client Windows builds against those with problems, I think the switch to the native HttpClient could be the cause.

@shroffk
Copy link
Member

shroffk commented Jan 15, 2025

I think it might be cause of this

    private OlogHttpClient(String userName, String password) {
        httpClient = HttpClient.newBuilder()
                .cookieHandler(new CookieManager(null, CookiePolicy.ACCEPT_ALL))
                .followRedirects(HttpClient.Redirect.ALWAYS)
                // HttpClient rejects Duration.ZERO for the connect timeout value.
                // To support infinite timeout (preference value == 0), use Long.MAX_VALUE
                .connectTimeout(Duration.ofMillis(Preferences.connectTimeout <= 0 ? Long.MAX_VALUE : Preferences.connectTimeout))
                .build()

Can you try the following and see what the behaviour is

org.phoebus.olog.es.api/connectTimeout=60000

@shroffk
Copy link
Member

shroffk commented Jan 15, 2025

@GDH-ISIS with this a preference we would avoid using Long.MAX_VALUE

org.phoebus.olog.es.api/connectTimeout=60000

@GDH-ISIS
Copy link
Author

GDH-ISIS commented Jan 15, 2025

I do believe that your suggestion (org.phoebus.olog.es.api/connectTimeout=60000) does work. Much more reliable connection. No failures

@shroffk
Copy link
Member

shroffk commented Jan 15, 2025

Great, glad we can confirm my suspicion... now we need to figure out the correct way to define "infinite" timeout.

@georgweiss
Copy link
Collaborator

Need to admit I did not evaluate consequences of setting connect timeout to Long.MAX_VALUE.

That said, I have established that not setting an explicit timeout implies "infinite" timeout. Will update based on my findings.

@georgweiss
Copy link
Collaborator

Need to add: I've discovered a bug related to attachments handling. Attachment file names containing space char cannot be retrieved as URL encoding replaces space with + (instead of %20).

Will address that too.

@shroffk
Copy link
Member

shroffk commented Jan 16, 2025

@GDH-ISIS can you confirm if Georg's fix works

@GDH-ISIS
Copy link
Author

I can confirm that it looks like the fix has worked (with org.phoebus.olog.es.api/connectTimeout=0). My compliments to you both for the help - not familiar with Java.

May I politely enquire as to whether ChannelFinder may need to be checked (6307d6b)

@shroffk
Copy link
Member

shroffk commented Jan 16, 2025

yes, ChannelFinder and AlarmLogTable will also need the same.

@GDH-ISIS
Copy link
Author

Many thanks to both.

@georgweiss
Copy link
Collaborator

Will take on Alarm logger client and channel finder promptly.

@georgweiss
Copy link
Collaborator

Turns out neither channel finder not alarm log client are supporting user defined connect timeout.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants