QueryBatcher never stops when call to get URIs fails #1327

Closed
ralfhergert opened this issue Nov 3, 2021 · 5 comments

@ralfhergert

We are trying to execute a long-running query using withConsistentSnapshot=true. Depending on the configuration of our ML-DB, the QueryBatcher may receive a server error as soon as the ML-DB is no longer capable of providing the snapshot. That is not an issue. What is a problem for us is that the exception thrown in

try (UrisHandle results = queryMgr.uris(queryMethod, query, filtered, handle, start, afterUri, forest.getForestName())) {

slips by "unnoticed". Meaning:

  • the QueryBatcher does not stop itself and still considers itself to be working/running
  • no FailureListener attached to the QueryBatcher is called
  • the exception is just logged

This is what the log messages look like when all worker threads die due to a server-side error:
Exception in thread "pool-11-thread-1" com.marklogic.client.FailedRequestException: Local message: failed to apply resource at internal/uris: Internal Server Error. Server Message: Server (not a REST instance?) did not respond with an expected REST Error message.
at com.marklogic.client.impl.OkHttpServices.checkStatus(OkHttpServices.java:4449)
at com.marklogic.client.impl.OkHttpServices.postResource(OkHttpServices.java:3438)
at com.marklogic.client.impl.OkHttpServices.postResource(OkHttpServices.java:3382)
at com.marklogic.client.impl.OkHttpServices.postResource(OkHttpServices.java:3373)
at com.marklogic.client.impl.OkHttpServices.processQuery(OkHttpServices.java:3130)
at com.marklogic.client.impl.OkHttpServices.uris(OkHttpServices.java:3030)
at com.marklogic.client.impl.QueryManagerImpl.uris(QueryManagerImpl.java:169)
at com.marklogic.client.datamovement.impl.QueryBatcherImpl$QueryTask.run(QueryBatcherImpl.java:738)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Exception in thread "pool-11-thread-2" com.marklogic.client.FailedRequestException: Local message: failed to apply resource at internal/uris: Internal Server Error. Server Message: Server (not a REST instance?) did not respond with an expected REST Error message.
at com.marklogic.client.impl.OkHttpServices.checkStatus(OkHttpServices.java:4449)
at com.marklogic.client.impl.OkHttpServices.postResource(OkHttpServices.java:3438)
at com.marklogic.client.impl.OkHttpServices.postResource(OkHttpServices.java:3382)
at com.marklogic.client.impl.OkHttpServices.postResource(OkHttpServices.java:3373)
at com.marklogic.client.impl.OkHttpServices.processQuery(OkHttpServices.java:3130)
at com.marklogic.client.impl.OkHttpServices.uris(OkHttpServices.java:3030)
at com.marklogic.client.impl.QueryManagerImpl.uris(QueryManagerImpl.java:169)
at com.marklogic.client.datamovement.impl.QueryBatcherImpl$QueryTask.run(QueryBatcherImpl.java:738)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Exception in thread "pool-11-thread-3" com.marklogic.client.FailedRequestException: Local message: failed to apply resource at internal/uris: Internal Server Error. Server Message: Server (not a REST instance?) did not respond with an expected REST Error message.
at com.marklogic.client.impl.OkHttpServices.checkStatus(OkHttpServices.java:4449)
at com.marklogic.client.impl.OkHttpServices.postResource(OkHttpServices.java:3438)
at com.marklogic.client.impl.OkHttpServices.postResource(OkHttpServices.java:3382)
at com.marklogic.client.impl.OkHttpServices.postResource(OkHttpServices.java:3373)
at com.marklogic.client.impl.OkHttpServices.processQuery(OkHttpServices.java:3130)
at com.marklogic.client.impl.OkHttpServices.uris(OkHttpServices.java:3030)
at com.marklogic.client.impl.QueryManagerImpl.uris(QueryManagerImpl.java:169)
at com.marklogic.client.datamovement.impl.QueryBatcherImpl$QueryTask.run(QueryBatcherImpl.java:738)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
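
For readers unfamiliar with why such a failure only shows up as "Exception in thread ...", here is a minimal sketch using only java.util.concurrent. It is not the MarkLogic implementation: `SilentFailureDemo`, its `running` flag, and its `failuresReported` counter are hypothetical stand-ins for the batcher's job state and its failure listeners. It assumes the task is handed to the pool via `execute()` and does not catch its own exceptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a batcher job; none of these names come from the MarkLogic client.
class SilentFailureDemo {
    // The job's own view of its state; nothing in this sketch ever flips it to false.
    final AtomicBoolean running = new AtomicBoolean(true);
    // Counts failure-listener notifications; nothing in this sketch ever increments it.
    final AtomicInteger failuresReported = new AtomicInteger();

    static SilentFailureDemo runOnce() {
        SilentFailureDemo job = new SilentFailureDemo();
        ExecutorService pool = Executors.newFixedThreadPool(1);
        // With execute(), an exception thrown by the task propagates out of the worker
        // thread and reaches the default uncaught-exception handler, which prints
        // "Exception in thread ..." to stderr. The job object is never told.
        pool.execute(() -> {
            throw new IllegalStateException("simulated server error");
        });
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return job;
    }

    public static void main(String[] args) {
        SilentFailureDemo job = runOnce();
        System.out.println("running=" + job.running.get()
                + " failuresReported=" + job.failuresReported.get());
    }
}
```

After the worker dies, the job in this sketch still reports `running=true` and zero reported failures, which mirrors the symptoms listed above.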

We would expect:

  • that instead of logging the exception(s), the QueryBatcher calls the registered FailureListeners
  • that the QueryBatcher no longer considers itself to be "running" (since all its worker threads are now dead)
  • that the exception is not logged when the error is escalated

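The three expectations above could be sketched roughly as follows. This is an illustration under stated assumptions, not the fix that was eventually shipped: `FailFastJob`, `onFailure`, `submit`, `isStopped`, and `awaitCompletion` are hypothetical names, and plain `Consumer<Throwable>` callbacks stand in for the client's failure listeners.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical job illustrating the requested behaviour; not part of the MarkLogic API.
class FailFastJob {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final List<Consumer<Throwable>> failureListeners = new CopyOnWriteArrayList<>();
    private final AtomicBoolean stopped = new AtomicBoolean(false);

    void onFailure(Consumer<Throwable> listener) {
        failureListeners.add(listener);
    }

    boolean isStopped() {
        return stopped.get();
    }

    // Runs a unit of work; a failure is escalated instead of being left to the worker thread.
    void submit(Runnable fetchUris) {
        try {
            pool.execute(() -> {
                try {
                    fetchUris.run();
                } catch (RuntimeException e) {
                    // Escalate only once, even if several workers fail at the same time.
                    if (stopped.compareAndSet(false, true)) {
                        failureListeners.forEach(l -> l.accept(e));
                        pool.shutdownNow();
                    }
                }
            });
        } catch (RejectedExecutionException ignored) {
            // The pool was already shut down, for example by an earlier failure.
        }
    }

    // Stops accepting work and waits for in-flight tasks; true if the pool terminated in time.
    boolean awaitCompletion(long timeout, TimeUnit unit) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        FailFastJob job = new FailFastJob();
        job.onFailure(e -> System.out.println("listener got: " + e.getMessage()));
        job.submit(() -> {
            throw new IllegalStateException("simulated server error");
        });
        job.awaitCompletion(5, TimeUnit.SECONDS);
        System.out.println("stopped=" + job.isStopped());
    }
}
```

Because the exception is caught inside the task, nothing reaches the thread's uncaught-exception handler, so the error is reported through the listeners only and the job stops waiting for work that will never arrive.
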
@ehennum
Contributor

ehennum commented Nov 8, 2021

@ralfhergert, the expectations seem quite sensible and presumably will bear up under investigation.

Have you noticed whether the error log or the request log for the appserver on the enode host has a related server-side error?

@georgeajit, for what it's worth, my wild guess is that the server is not sending the error in JSON format. If so, the client would be unable to parse it. That is, the fix would have two parts. On the server, the appserver should respect the error format header for all errors. On the client, the catch should provide special handling for the unsupportable timestamp per the expectations given in the issue report.

@ralfhergert
Author

I checked for server-side errors but could not find any 400/500 errors. The AppServer for the database we are retrieving the documents from listens on port 8040 and is configured to use "/MarkLogic/rest-api/error-handler.xqy" as its error handler. The log level is currently "info".

In 8040_AccessLog.txt I did find the POST requests for the next pages and all subsequent GET requests for the individual documents, but then the requests suddenly stop. No request was answered with a 400- or 500-category response.
The 8040_ErrorLog.txt, 8040_RequestLog.txt, and general ErrorLog.txt show no correlating error message either.

BTW, we are using server version 9.0-13.4 and java-client version 5.5.0.

@ehennum
Contributor

ehennum commented Nov 9, 2021

Thanks, @ralfhergert, for following up.

@rjrudin
Contributor

rjrudin commented Nov 17, 2022

@ralfhergert Apologies for the delay in responding. Your diagnosis is correct, and this is the same issue as #1287 - if the call to get URIs fails for any reason, the exception is logged but the job is not stopped and thus it hangs indefinitely. We are tracking this internally but will leave this ticket open as a reminder for us to respond to you when a fix is included in a release.

rjrudin changed the title from "Exceptions thrown in QueryBatcherImpl#L738 slip by unnoticed" to "QueryBatcher never stops when call to get URIs fails" Nov 17, 2022
rjrudin removed this from the java-client-api-NEXT milestone Nov 17, 2022
rjrudin added this to the 6.1.0 milestone Dec 31, 2022
rjrudin added a commit that referenced this issue Jan 1, 2023
Addresses DEVEXP-147 (internal bug).
rjrudin added a commit that referenced this issue Jan 3, 2023
Addresses DEVEXP-147 (internal bug).
rjrudin added a commit that referenced this issue Jan 3, 2023
Addresses DEVEXP-147 (internal bug).
rjrudin added a commit that referenced this issue Jan 3, 2023
Addresses DEVEXP-147 (internal bug).
rjrudin added a commit that referenced this issue Jan 4, 2023
Addresses DEVEXP-147 (internal bug).
rjrudin added a commit that referenced this issue Jan 4, 2023
Addresses DEVEXP-147 (internal bug). 

Also enabled a test that had been disabled due to this bug.
rjrudin added a commit that referenced this issue Jan 5, 2023
#1327 QueryBatcher now stops when query fails
@rjrudin
Contributor

rjrudin commented Mar 28, 2023

Resolved in 6.1.0

rjrudin closed this as completed Mar 28, 2023