Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better exception when accessing two non-XA datasources in a transaction #40431

Closed
yrodiere opened this issue May 3, 2024 · 32 comments · Fixed by #40779
Closed

Better exception when accessing two non-XA datasources in a transaction #40431

yrodiere opened this issue May 3, 2024 · 32 comments · Fixed by #40779
Labels
area/agroal area/narayana Transactions / Narayana kind/enhancement New feature or request
Milestone

Comments

@yrodiere
Copy link
Member

yrodiere commented May 3, 2024

Description

#39283 revealed that Agroal 2.3+ now makes it possible to detect that an application is accessing two non-XA datasources in a transaction, which is unsafe.

When detected, we get an exception looking like this:

2024-05-02 11:55:18,201 WARN  [org.apa.cam.com.sch.SchedulerConsumer] (Camel (camel-1) thread #1 - scheduler://read-clean-write) Error processing exchange. Exchange[2388535E05D1BE3-0000000000000000]. Caused by: [org.springframework.jdbc.CannotGetJdbcConnectionException - Failed to obtain JDBC Connection]: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection
        at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:84)
        at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:653)
        at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:695)
        at org.apache.camel.component.sql.SqlProducer.processInternal(SqlProducer.java:150)
        at org.apache.camel.component.sql.SqlProducer.process(SqlProducer.java:137)
        at org.apache.camel.support.AsyncProcessorConverterHelper$ProcessorToAsyncProcessorBridge.process(AsyncProcessorConverterHelper.java:65)
        at org.apache.camel.processor.SendProcessor.process(SendProcessor.java:210)
        at org.apache.camel.processor.errorhandler.RedeliveryErrorHandler$SimpleTask.handleFirst(RedeliveryErrorHandler.java:462)
        at org.apache.camel.processor.errorhandler.RedeliveryErrorHandler$SimpleTask.run(RedeliveryErrorHandler.java:438)
        at org.apache.camel.impl.engine.DefaultReactiveExecutor$Worker.executeFromQueue(DefaultReactiveExecutor.java:240)
        at org.apache.camel.impl.engine.DefaultReactiveExecutor.executeFromQueue(DefaultReactiveExecutor.java:77)
        at org.apache.camel.impl.engine.DefaultAsyncProcessorAwaitManager.await(DefaultAsyncProcessorAwaitManager.java:95)
        at org.apache.camel.impl.engine.DefaultAsyncProcessorAwaitManager.process(DefaultAsyncProcessorAwaitManager.java:84)
        at org.apache.camel.processor.errorhandler.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:200)
        at org.apache.camel.impl.engine.CamelInternalProcessor.processTransacted(CamelInternalProcessor.java:397)
        at org.apache.camel.impl.engine.CamelInternalProcessor.process(CamelInternalProcessor.java:327)
        at org.apache.camel.processor.Pipeline$PipelineTask.run(Pipeline.java:102)
        at org.apache.camel.impl.engine.DefaultReactiveExecutor$Worker.executeFromQueue(DefaultReactiveExecutor.java:240)
        at org.apache.camel.impl.engine.DefaultReactiveExecutor.executeFromQueue(DefaultReactiveExecutor.java:77)
        at org.apache.camel.impl.engine.DefaultAsyncProcessorAwaitManager.await(DefaultAsyncProcessorAwaitManager.java:95)
        at org.apache.camel.impl.engine.DefaultAsyncProcessorAwaitManager.process(DefaultAsyncProcessorAwaitManager.java:84)
        at org.apache.camel.support.AsyncProcessorSupport.process(AsyncProcessorSupport.java:32)
        at org.apache.camel.impl.engine.CamelInternalProcessor.processTransacted(CamelInternalProcessor.java:397)
        at org.apache.camel.impl.engine.CamelInternalProcessor.process(CamelInternalProcessor.java:327)
        at org.apache.camel.processor.Pipeline$PipelineTask.run(Pipeline.java:102)
        at org.apache.camel.impl.engine.DefaultReactiveExecutor$Worker.executeFromQueue(DefaultReactiveExecutor.java:240)
        at org.apache.camel.impl.engine.DefaultReactiveExecutor.executeFromQueue(DefaultReactiveExecutor.java:77)
        at org.apache.camel.impl.engine.DefaultAsyncProcessorAwaitManager.await(DefaultAsyncProcessorAwaitManager.java:95)
        at org.apache.camel.impl.engine.DefaultAsyncProcessorAwaitManager.process(DefaultAsyncProcessorAwaitManager.java:84)
        at org.apache.camel.support.AsyncProcessorSupport.process(AsyncProcessorSupport.java:32)
        at org.apache.camel.jta.TransactionErrorHandler.processByErrorHandler(TransactionErrorHandler.java:234)
        at org.apache.camel.jta.TransactionErrorHandler$1.run(TransactionErrorHandler.java:197)
        at org.apache.camel.quarkus.component.jta.TransactionalJtaTransactionPolicy.runWithTransaction(TransactionalJtaTransactionPolicy.java:47)
        at org.apache.camel.quarkus.component.jta.RequiredJtaTransactionPolicy.run(RequiredJtaTransactionPolicy.java:26)
        at org.apache.camel.jta.TransactionErrorHandler.doInTransactionTemplate(TransactionErrorHandler.java:187)
        at org.apache.camel.jta.TransactionErrorHandler.processInTransaction(TransactionErrorHandler.java:138)
        at org.apache.camel.jta.TransactionErrorHandler.process(TransactionErrorHandler.java:102)
        at org.apache.camel.jta.TransactionErrorHandler.process(TransactionErrorHandler.java:111)
        at org.apache.camel.processor.errorhandler.RedeliveryErrorHandler$SimpleTask.handleFirst(RedeliveryErrorHandler.java:462)
        at org.apache.camel.processor.errorhandler.RedeliveryErrorHandler$SimpleTask.run(RedeliveryErrorHandler.java:438)
        at org.apache.camel.impl.engine.DefaultReactiveExecutor$Worker.doRun(DefaultReactiveExecutor.java:199)
        at org.apache.camel.impl.engine.DefaultReactiveExecutor$Worker.executeReactiveWork(DefaultReactiveExecutor.java:189)
        at org.apache.camel.impl.engine.DefaultReactiveExecutor$Worker.tryExecuteReactiveWork(DefaultReactiveExecutor.java:166)
        at org.apache.camel.impl.engine.DefaultReactiveExecutor$Worker.schedule(DefaultReactiveExecutor.java:148)
        at org.apache.camel.impl.engine.DefaultReactiveExecutor.scheduleMain(DefaultReactiveExecutor.java:59)
        at org.apache.camel.processor.Pipeline.process(Pipeline.java:163)
        at org.apache.camel.impl.engine.CamelInternalProcessor.processNonTransacted(CamelInternalProcessor.java:354)
        at org.apache.camel.impl.engine.CamelInternalProcessor.process(CamelInternalProcessor.java:330)
        at org.apache.camel.component.scheduler.SchedulerConsumer.sendTimerExchange(SchedulerConsumer.java:70)
        at org.apache.camel.component.scheduler.SchedulerConsumer.poll(SchedulerConsumer.java:50)
        at org.apache.camel.support.ScheduledPollConsumer.doRun(ScheduledPollConsumer.java:204)
        at org.apache.camel.support.ScheduledPollConsumer.run(ScheduledPollConsumer.java:118)
        at [email protected]/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
        at [email protected]/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:358)
        at [email protected]/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
        at [email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at [email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at [email protected]/java.lang.Thread.runWith(Thread.java:1596)
        at [email protected]/java.lang.Thread.run(Thread.java:1583)
        at org.graalvm.nativeimage.builder/com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:833)
        at org.graalvm.nativeimage.builder/com.oracle.svm.core.posix.thread.PosixPlatformThreads.pthreadStartRoutine(PosixPlatformThreads.java:211)
Caused by: java.sql.SQLException: Exception in association of connection to existing transaction
        at io.agroal.narayana.NarayanaTransactionIntegration.associate(NarayanaTransactionIntegration.java:130)
        at io.agroal.pool.ConnectionPool.getConnection(ConnectionPool.java:257)
        at io.agroal.pool.DataSource.getConnection(DataSource.java:86)
        at io.agroal.api.AgroalDataSource_syYGL47zwCt2LQzSIsPp-v9wXhg_Synthetic_ClientProxy.getConnection(Unknown Source)
        at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:160)
        at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:118)
        at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:81)
        ... 60 more
Caused by: java.sql.SQLException: Unable to enlist connection to existing transaction
        at io.agroal.narayana.NarayanaTransactionIntegration.associate(NarayanaTransactionIntegration.java:121)
        ... 66 more

The root cause is really not helpful, because it doesn't specify why the connection couldn't be enlisted to the transaction. We should improve that message, and that means changes in Agroal.

@barreiro I think you had plans in that area; could you please link the corresponding Agroal Jira?

Implementation ideas

Unfortunately Transaction.enlistResource only returns a boolean to state whether enlistment worked or not.

So here are our only solutions:

  1. Rely on Narayana-specific interfaces that would expose an alternate version of enlistResource which throws an exception on failure, with details regarding what failed exactly ("Can't add two non-XA resources")
  2. Just adapt the message in Agroal to provide a few common reasons for the failure, without certainty.
@mmusgrov
Copy link
Contributor

mmusgrov commented May 6, 2024

I assume that Agroal is swallowing the exception so just report that as the root cause, ie changing the JTA enlistment semantics is not the correct approach. Note that there is an open PR to allow mulitple XA unaware resources (#40365) .

@yrodiere
Copy link
Member Author

yrodiere commented May 6, 2024

I assume that Agroal is swallowing the exception

@mmusgrov There is no exception from Narayana here, just a method returning a boolean:

https://github.com/agroal/agroal/blob/30465b6eaae28366449832e1ee22bbe8a9b83562/agroal-narayana/src/main/java/io/agroal/narayana/NarayanaTransactionIntegration.java#L120-L122

so just report that as the root cause

I can't, see above :)

Note that there is an open PR to allow mulitple XA unaware resources (#40365) .

Yes I'm aware of that, having been involved quite a bit xD As we discussed though, this is a temporary solution. We need to think of longer term solutions, and providing clear exception messages when someone tries to do something incorrect is the first step.

@mmusgrov
Copy link
Contributor

mmusgrov commented May 6, 2024

So can we enable arjuna logging (unless there are already warnings) to see why the enlist call returns false?

@mmusgrov
Copy link
Contributor

mmusgrov commented May 6, 2024

If "Agroal 2.3+ now makes it possible to detect that an application is accessing two non-XA datasources in a transaction" why does it still attempt the enlistment?

@mmusgrov
Copy link
Contributor

mmusgrov commented May 6, 2024

I need more information in order to provide a better analysis of why the enlist is failing.

@mmusgrov
Copy link
Contributor

mmusgrov commented May 6, 2024

@shawkins and @yrodiere Also note that in the multiple one phase aware resources case, if the first one phase aware resource commits and the second one rolls back then overall result reported back to the caller is rollback even though the first one committed (they ought to get a heuristic outcome) and I am fixing that at the moment (I'm on a national holiday today so it may have to wait until tomorrow to be completed). Since we never anticipated use of multiple one phase resources I am busy testing it now and writing test cases where I find deficiencies.

@yrodiere
Copy link
Member Author

yrodiere commented May 6, 2024

If "Agroal 2.3+ now makes it possible to detect that an application is accessing two non-XA datasources in a transaction" why does it still attempt the enlistment?

It "makes it possible to detect" by implementing that LastResource* interface. Agroal doesn't do the detection itself, Narayana does, during enlistment, and it just returns a boolean with no information that Agroal can pass over to the user.

@shawkins and @yrodiere Also note that in the multiple one phase aware resources case, if the first one phase aware resource commits and the second one rolls back then overall result reported back to the caller is rollback even though the first one committed (they ought to get a heuristic outcome) and I am fixing that at the moment (I'm on a national holiday today so it may have to wait until tomorrow to be completed). Since we never anticipated use of multiple one phase resources I am busy testing it now and writing test cases where I find deficiencies.

Good to know, thanks. And this all definitely can wait until tomorrow (or later) as far as I'm concerned, so don't feel forced to work today :)

@mmusgrov
Copy link
Contributor

mmusgrov commented May 6, 2024

So is the test adding the two XA unaware resources with the allowUnsafeMultipleLastResources property implemented and set? If so then that is probably enough for me to reproduce the problem. But if the test is doing something else then I need more information to determine why the second resource is being rejected.

@shawkins
Copy link
Contributor

shawkins commented May 6, 2024

So is the test adding the two XA unaware resources with the allowUnsafeMultipleLastResources property implemented and set? If so then that is probably enough for me to reproduce the problem.

@mmusgrov Yes with allowUnsafeMultipleLastResources not enabled, then the second Connection access like this shawkins/quarkus-quickstarts@014e306#diff-761c636458b29cbabf9e4f4b500472088e222e1310327cade67b7db30a8febb0R25 will result in the exception in the description.

The AddOutcome is AR_DUPLICATE, but the calling logic is treating that the same as AR_REJECTED - https://github.com/jbosstm/narayana/blob/ae4755680f1e5dc7988f8525d1352d24a8adfffd/ArjunaJTA/jta/classes/com/arjuna/ats/internal/jta/transaction/arjunacore/TransactionImple.java#L638

@yrodiere
Copy link
Member Author

yrodiere commented May 7, 2024

So is the test adding the two XA unaware resources with the allowUnsafeMultipleLastResources property implemented and set?

You don't need allowUnsafeMultipleLastResources. This exception arises when you access two non-XA datasources in a single transaction, which by default is not allowed since Quarkus 3.8.2, as we're all quite aware by now :)

This is the exact problem we've been discussing for the past week(s), but I'm talking about a different improvement in that area:

  • Previous GitHub issues/PRs were all about finding workarounds for people who need to use two datasources in the same transaction (temporarily, because it's a bad idea).
  • This GitHub issue is about making sure that when it's not allowed (the default), the exception message people get is actionable, i.e. explains what's wrong exactly (at least) and what can be done to avoid it (ideally).

If so then that is probably enough for me to reproduce the problem. But if the test is doing something else then I need more information to determine why the second resource is being rejected.

Just access two Agroal, non-XA datasources in the same transaction, in Quarkus 3.8.2 or 3.9.latest or 3.10.latest or Quarkus main, and you'll get this exception.

@mmusgrov
Copy link
Contributor

mmusgrov commented May 7, 2024

Thanks @shawkins , I'll look into creating a narayana unit test to duplicate this behaviour.

@mdillenk
Copy link

mdillenk commented May 10, 2024

I get this same error when pointing Keycloak 24.0.4 at an empty MSSQL version 15 or 16 database, it fails after creating all the tables and starts to populate them.

c:\DCX5\keycloak-24.0.4\bin>kc.bat start-dev --db mssql --db-url "jdbc:sqlserver://localhost:1433;databaseName=dcx5_keycloak;trustServerCertificate=false;encrypt=false" --db-username sa --db-password XXXXXX
2024-05-10 14:35:27,169 INFO [org.keycloak.quarkus.runtime.hostname.DefaultHostnameProvider] (main) Hostname settings: Base URL: , Hostname: , Strict HTTPS: false, Path: , Strict BackChannel: false, Admin URL: , Admin: , Port: -1, Proxied: false
2024-05-10 14:35:27,448 WARN [org.infinispan.CONFIG] (keycloak-cache-init) ISPN000569: Unable to persist Infinispan internal caches as no global state enabled
2024-05-10 14:35:27,545 INFO [org.infinispan.CONTAINER] (keycloak-cache-init) ISPN000556: Starting user marshaller 'org.infinispan.jboss.marshalling.core.JBossUserMarshaller'
2024-05-10 14:35:29,944 WARN [io.quarkus.agroal.runtime.DataSources] (JPA Startup Thread) Datasource enables XA but transaction recovery is not enabled. Please enable transaction recovery by setting quarkus.transaction-manager.enable-recovery=true, otherwise data may be lost if the application is terminated abruptly
2024-05-10 14:35:31,142 INFO [org.keycloak.broker.provider.AbstractIdentityProviderMapper] (main) Registering class org.keycloak.broker.provider.mappersync.ConfigSyncEventListener
2024-05-10 14:35:32,769 INFO [org.keycloak.quarkus.runtime.storage.legacy.liquibase.QuarkusJpaUpdaterProvider] (main) Initializing database schema. Using changelog META-INF/jpa-changelog-master.xml
2024-05-10 14:35:34,544 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:34,701 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:34,701 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:34,701 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:34,888 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:34,919 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:34,982 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:34,982 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:34,998 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:34,998 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:35,013 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:35,013 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:35,029 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:35,029 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:35,044 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:35,486 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:35,486 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:37,022 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:37,022 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:37,038 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:37,038 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:37,330 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:37,346 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:37,346 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:37,346 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:37,361 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:37,569 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:37,991 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:38,752 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:39,029 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:39,123 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.
2024-05-10 14:35:39,217 WARN [liquibase.executor.jvm.JdbcExecutor] (main) Caution: Changing any part of an object name could break scripts and stored procedures.

UPDATE SUMMARY
Run: 124
Previously run: 0
Filtered out: 0

Total change sets: 124

2024-05-10 14:35:40,473 WARN [com.arjuna.ats.jta] (main) ARJUNA016061: TransactionImple.enlistResource - XAResource.start returned: XAException.XAER_RMERR for < formatId=131077, gtrid_length=35, bqual_length=36, tx_uid=0:ffff0a201e65:e842:663e770c:0, node_name=quarkus, branch_uid=0:ffff0a201e65:e842:663e770c:3f, subordinatenodename=null, eis_name=0 >: javax.transaction.xa.XAException: com.microsoft.sqlserver.jdbc.SQLServerException: Failed to create the XA control connection. Error: "The connection is closed."
at com.microsoft.sqlserver.jdbc.SQLServerXAResource.dtc_XA_interface(SQLServerXAResource.java:775)
at com.microsoft.sqlserver.jdbc.SQLServerXAResource.start(SQLServerXAResource.java:809)
at io.agroal.narayana.BaseXAResource.start(BaseXAResource.java:153)
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.enlistResource(TransactionImple.java:637)
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.enlistResource(TransactionImple.java:398)
at io.agroal.narayana.NarayanaTransactionIntegration.associate(NarayanaTransactionIntegration.java:120)
at io.agroal.pool.ConnectionPool.getConnection(ConnectionPool.java:257)
at io.agroal.pool.DataSource.getConnection(DataSource.java:86)
at io.quarkus.hibernate.orm.runtime.customized.QuarkusConnectionProvider.getConnection(QuarkusConnectionProvider.java:23)
at org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:46)
at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:113)
at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:143)
at org.hibernate.engine.jdbc.internal.StatementPreparerImpl.connection(StatementPreparerImpl.java:54)
at org.hibernate.engine.jdbc.internal.StatementPreparerImpl$5.doPrepare(StatementPreparerImpl.java:153)
at org.hibernate.engine.jdbc.internal.StatementPreparerImpl$StatementPreparationTemplate.prepareStatement(StatementPreparerImpl.java:183)
at org.hibernate.engine.jdbc.internal.StatementPreparerImpl.prepareQueryStatement(StatementPreparerImpl.java:155)
at org.hibernate.sql.exec.spi.JdbcSelectExecutor.lambda$list$0(JdbcSelectExecutor.java:85)
at org.hibernate.sql.results.jdbc.internal.DeferredResultSetAccess.executeQuery(DeferredResultSetAccess.java:231)
at org.hibernate.sql.results.jdbc.internal.DeferredResultSetAccess.getResultSet(DeferredResultSetAccess.java:167)
at org.hibernate.sql.results.jdbc.internal.JdbcValuesResultSetImpl.advanceNext(JdbcValuesResultSetImpl.java:218)
at org.hibernate.sql.results.jdbc.internal.JdbcValuesResultSetImpl.processNext(JdbcValuesResultSetImpl.java:98)
at org.hibernate.sql.results.jdbc.internal.AbstractJdbcValues.next(AbstractJdbcValues.java:19)
at org.hibernate.sql.results.internal.RowProcessingStateStandardImpl.next(RowProcessingStateStandardImpl.java:66)
at org.hibernate.sql.results.spi.ListResultsConsumer.consume(ListResultsConsumer.java:202)
at org.hibernate.sql.results.spi.ListResultsConsumer.consume(ListResultsConsumer.java:33)
at org.hibernate.sql.exec.internal.JdbcSelectExecutorStandardImpl.doExecuteQuery(JdbcSelectExecutorStandardImpl.java:209)
at org.hibernate.sql.exec.internal.JdbcSelectExecutorStandardImpl.executeQuery(JdbcSelectExecutorStandardImpl.java:83)
at org.hibernate.sql.exec.spi.JdbcSelectExecutor.list(JdbcSelectExecutor.java:76)
at org.hibernate.sql.exec.spi.JdbcSelectExecutor.list(JdbcSelectExecutor.java:65)
at org.hibernate.query.sqm.internal.ConcreteSqmSelectQueryPlan.lambda$new$2(ConcreteSqmSelectQueryPlan.java:137)
at org.hibernate.query.sqm.internal.ConcreteSqmSelectQueryPlan.withCacheableSqmInterpretation(ConcreteSqmSelectQueryPlan.java:362)
at org.hibernate.query.sqm.internal.ConcreteSqmSelectQueryPlan.performList(ConcreteSqmSelectQueryPlan.java:303)
at org.hibernate.query.sqm.internal.QuerySqmImpl.doList(QuerySqmImpl.java:509)
at org.hibernate.query.spi.AbstractSelectionQuery.list(AbstractSelectionQuery.java:427)
at org.hibernate.query.Query.getResultList(Query.java:120)
at org.keycloak.models.jpa.MigrationModelAdapter.init(MigrationModelAdapter.java:59)
at org.keycloak.models.jpa.MigrationModelAdapter.(MigrationModelAdapter.java:42)
at org.keycloak.models.jpa.JpaRealmProvider.getMigrationModel(JpaRealmProvider.java:99)
at org.keycloak.storage.datastore.DefaultMigrationManager.migrate(DefaultMigrationManager.java:131)
at org.keycloak.migration.MigrationModelManager.migrate(MigrationModelManager.java:33)
at org.keycloak.quarkus.runtime.storage.legacy.database.QuarkusJpaConnectionProviderFactory.migrateModel(QuarkusJpaConnectionProviderFactory.java:213)
at org.keycloak.quarkus.runtime.storage.legacy.database.QuarkusJpaConnectionProviderFactory.initSchema(QuarkusJpaConnectionProviderFactory.java:207)
at org.keycloak.models.utils.KeycloakModelUtils.lambda$runJobInTransaction$1(KeycloakModelUtils.java:257)
at org.keycloak.models.utils.KeycloakModelUtils.runJobInTransactionWithResult(KeycloakModelUtils.java:379)
at org.keycloak.models.utils.KeycloakModelUtils.runJobInTransaction(KeycloakModelUtils.java:256)
at org.keycloak.quarkus.runtime.storage.legacy.database.QuarkusJpaConnectionProviderFactory.postInit(QuarkusJpaConnectionProviderFactory.java:132)
at org.keycloak.quarkus.runtime.integration.QuarkusKeycloakSessionFactory.init(QuarkusKeycloakSessionFactory.java:105)
at org.keycloak.quarkus.runtime.integration.jaxrs.QuarkusKeycloakApplication.createSessionFactory(QuarkusKeycloakApplication.java:67)
at org.keycloak.services.resources.KeycloakApplication.startup(KeycloakApplication.java:131)
at org.keycloak.quarkus.runtime.integration.jaxrs.QuarkusKeycloakApplication.onStartupEvent(QuarkusKeycloakApplication.java:54)
at org.keycloak.quarkus.runtime.integration.jaxrs.QuarkusKeycloakApplication_Observer_onStartupEvent_GNZ8m5QenZ9h9VNelo7awjUZFDE.notify(Unknown Source)
at io.quarkus.arc.impl.EventImpl$Notifier.notifyObservers(EventImpl.java:346)
at io.quarkus.arc.impl.EventImpl$Notifier.notify(EventImpl.java:328)
at io.quarkus.arc.impl.EventImpl.fire(EventImpl.java:82)
at io.quarkus.arc.runtime.ArcRecorder.fireLifecycleEvent(ArcRecorder.java:155)
at io.quarkus.arc.runtime.ArcRecorder.handleLifecycleEvents(ArcRecorder.java:106)
at io.quarkus.deployment.steps.LifecycleEventsBuildStep$startupEvent1144526294.deploy_0(Unknown Source)
at io.quarkus.deployment.steps.LifecycleEventsBuildStep$startupEvent1144526294.deploy(Unknown Source)
at io.quarkus.runner.ApplicationImpl.doStart(Unknown Source)
at io.quarkus.runtime.Application.start(Application.java:101)
at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:111)
at io.quarkus.runtime.Quarkus.run(Quarkus.java:71)
at org.keycloak.quarkus.runtime.KeycloakMain.start(KeycloakMain.java:108)
at org.keycloak.quarkus.runtime.cli.command.AbstractStartCommand.run(AbstractStartCommand.java:33)
at picocli.CommandLine.executeUserObject(CommandLine.java:2026)
at picocli.CommandLine.access$1500(CommandLine.java:148)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
at picocli.CommandLine.execute(CommandLine.java:2170)
at org.keycloak.quarkus.runtime.cli.Picocli.parseAndRun(Picocli.java:125)
at org.keycloak.quarkus.runtime.KeycloakMain.main(KeycloakMain.java:98)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at io.quarkus.bootstrap.runner.QuarkusEntryPoint.doRun(QuarkusEntryPoint.java:62)
at io.quarkus.bootstrap.runner.QuarkusEntryPoint.main(QuarkusEntryPoint.java:33)

2024-05-10 14:35:40,489 WARN [com.arjuna.ats.jta] (main) ARJUNA016138: Failed to enlist XA resource io.agroal.narayana.BaseXAResource@6c861b25: jakarta.transaction.SystemException: TransactionImple.enlistResource - XAResource.start ARJUNA016054: could not register transaction: < formatId=131077, gtrid_length=35, bqual_length=36, tx_uid=0:ffff0a201e65:e842:663e770c:0, node_name=quarkus, branch_uid=0:ffff0a201e65:e842:663e770c:3f, subordinatenodename=null, eis_name=0 >
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.enlistResource(TransactionImple.java:690)
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.enlistResource(TransactionImple.java:398)
at io.agroal.narayana.NarayanaTransactionIntegration.associate(NarayanaTransactionIntegration.java:120)
at io.agroal.pool.ConnectionPool.getConnection(ConnectionPool.java:257)
at io.agroal.pool.DataSource.getConnection(DataSource.java:86)
at io.quarkus.hibernate.orm.runtime.customized.QuarkusConnectionProvider.getConnection(QuarkusConnectionProvider.java:23)
at org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:46)
at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:113)
at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:143)
at org.hibernate.engine.jdbc.internal.StatementPreparerImpl.connection(StatementPreparerImpl.java:54)
at org.hibernate.engine.jdbc.internal.StatementPreparerImpl$5.doPrepare(StatementPreparerImpl.java:153)
at org.hibernate.engine.jdbc.internal.StatementPreparerImpl$StatementPreparationTemplate.prepareStatement(StatementPreparerImpl.java:183)
at org.hibernate.engine.jdbc.internal.StatementPreparerImpl.prepareQueryStatement(StatementPreparerImpl.java:155)
at org.hibernate.sql.exec.spi.JdbcSelectExecutor.lambda$list$0(JdbcSelectExecutor.java:85)
at org.hibernate.sql.results.jdbc.internal.DeferredResultSetAccess.executeQuery(DeferredResultSetAccess.java:231)
at org.hibernate.sql.results.jdbc.internal.DeferredResultSetAccess.getResultSet(DeferredResultSetAccess.java:167)
at org.hibernate.sql.results.jdbc.internal.JdbcValuesResultSetImpl.advanceNext(JdbcValuesResultSetImpl.java:218)
at org.hibernate.sql.results.jdbc.internal.JdbcValuesResultSetImpl.processNext(JdbcValuesResultSetImpl.java:98)
at org.hibernate.sql.results.jdbc.internal.AbstractJdbcValues.next(AbstractJdbcValues.java:19)
at org.hibernate.sql.results.internal.RowProcessingStateStandardImpl.next(RowProcessingStateStandardImpl.java:66)
at org.hibernate.sql.results.spi.ListResultsConsumer.consume(ListResultsConsumer.java:202)
at org.hibernate.sql.results.spi.ListResultsConsumer.consume(ListResultsConsumer.java:33)
at org.hibernate.sql.exec.internal.JdbcSelectExecutorStandardImpl.doExecuteQuery(JdbcSelectExecutorStandardImpl.java:209)
at org.hibernate.sql.exec.internal.JdbcSelectExecutorStandardImpl.executeQuery(JdbcSelectExecutorStandardImpl.java:83)
at org.hibernate.sql.exec.spi.JdbcSelectExecutor.list(JdbcSelectExecutor.java:76)
at org.hibernate.sql.exec.spi.JdbcSelectExecutor.list(JdbcSelectExecutor.java:65)
at org.hibernate.query.sqm.internal.ConcreteSqmSelectQueryPlan.lambda$new$2(ConcreteSqmSelectQueryPlan.java:137)
at org.hibernate.query.sqm.internal.ConcreteSqmSelectQueryPlan.withCacheableSqmInterpretation(ConcreteSqmSelectQueryPlan.java:362)
at org.hibernate.query.sqm.internal.ConcreteSqmSelectQueryPlan.performList(ConcreteSqmSelectQueryPlan.java:303)
at org.hibernate.query.sqm.internal.QuerySqmImpl.doList(QuerySqmImpl.java:509)
at org.hibernate.query.spi.AbstractSelectionQuery.list(AbstractSelectionQuery.java:427)
at org.hibernate.query.Query.getResultList(Query.java:120)
at org.keycloak.models.jpa.MigrationModelAdapter.init(MigrationModelAdapter.java:59)
at org.keycloak.models.jpa.MigrationModelAdapter.(MigrationModelAdapter.java:42)
at org.keycloak.models.jpa.JpaRealmProvider.getMigrationModel(JpaRealmProvider.java:99)
at org.keycloak.storage.datastore.DefaultMigrationManager.migrate(DefaultMigrationManager.java:131)
at org.keycloak.migration.MigrationModelManager.migrate(MigrationModelManager.java:33)
at org.keycloak.quarkus.runtime.storage.legacy.database.QuarkusJpaConnectionProviderFactory.migrateModel(QuarkusJpaConnectionProviderFactory.java:213)
at org.keycloak.quarkus.runtime.storage.legacy.database.QuarkusJpaConnectionProviderFactory.initSchema(QuarkusJpaConnectionProviderFactory.java:207)
at org.keycloak.models.utils.KeycloakModelUtils.lambda$runJobInTransaction$1(KeycloakModelUtils.java:257)
at org.keycloak.models.utils.KeycloakModelUtils.runJobInTransactionWithResult(KeycloakModelUtils.java:379)
at org.keycloak.models.utils.KeycloakModelUtils.runJobInTransaction(KeycloakModelUtils.java:256)
at org.keycloak.quarkus.runtime.storage.legacy.database.QuarkusJpaConnectionProviderFactory.postInit(QuarkusJpaConnectionProviderFactory.java:132)
at org.keycloak.quarkus.runtime.integration.QuarkusKeycloakSessionFactory.init(QuarkusKeycloakSessionFactory.java:105)
at org.keycloak.quarkus.runtime.integration.jaxrs.QuarkusKeycloakApplication.createSessionFactory(QuarkusKeycloakApplication.java:67)
at org.keycloak.services.resources.KeycloakApplication.startup(KeycloakApplication.java:131)
at org.keycloak.quarkus.runtime.integration.jaxrs.QuarkusKeycloakApplication.onStartupEvent(QuarkusKeycloakApplication.java:54)
at org.keycloak.quarkus.runtime.integration.jaxrs.QuarkusKeycloakApplication_Observer_onStartupEvent_GNZ8m5QenZ9h9VNelo7awjUZFDE.notify(Unknown Source)
at io.quarkus.arc.impl.EventImpl$Notifier.notifyObservers(EventImpl.java:346)
at io.quarkus.arc.impl.EventImpl$Notifier.notify(EventImpl.java:328)
at io.quarkus.arc.impl.EventImpl.fire(EventImpl.java:82)
at io.quarkus.arc.runtime.ArcRecorder.fireLifecycleEvent(ArcRecorder.java:155)
at io.quarkus.arc.runtime.ArcRecorder.handleLifecycleEvents(ArcRecorder.java:106)
at io.quarkus.deployment.steps.LifecycleEventsBuildStep$startupEvent1144526294.deploy_0(Unknown Source)
at io.quarkus.deployment.steps.LifecycleEventsBuildStep$startupEvent1144526294.deploy(Unknown Source)
at io.quarkus.runner.ApplicationImpl.doStart(Unknown Source)
at io.quarkus.runtime.Application.start(Application.java:101)
at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:111)
at io.quarkus.runtime.Quarkus.run(Quarkus.java:71)
at org.keycloak.quarkus.runtime.KeycloakMain.start(KeycloakMain.java:108)
at org.keycloak.quarkus.runtime.cli.command.AbstractStartCommand.run(AbstractStartCommand.java:33)
at picocli.CommandLine.executeUserObject(CommandLine.java:2026)
at picocli.CommandLine.access$1500(CommandLine.java:148)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
at picocli.CommandLine.execute(CommandLine.java:2170)
at org.keycloak.quarkus.runtime.cli.Picocli.parseAndRun(Picocli.java:125)
at org.keycloak.quarkus.runtime.KeycloakMain.main(KeycloakMain.java:98)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at io.quarkus.bootstrap.runner.QuarkusEntryPoint.doRun(QuarkusEntryPoint.java:62)
at io.quarkus.bootstrap.runner.QuarkusEntryPoint.main(QuarkusEntryPoint.java:33)

2024-05-10 14:35:40,489 WARN [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (main) SQL Error: 0, SQLState: null
2024-05-10 14:35:40,489 ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (main) Exception in association of connection to existing transaction
2024-05-10 14:35:40,547 ERROR [org.keycloak.quarkus.runtime.cli.ExecutionExceptionHandler] (main) ERROR: Failed to start server in (development) mode
2024-05-10 14:35:40,547 ERROR [org.keycloak.quarkus.runtime.cli.ExecutionExceptionHandler] (main) ERROR: Unable to acquire JDBC Connection [Exception in association of connection to existing transaction] [n/a]
2024-05-10 14:35:40,547 ERROR [org.keycloak.quarkus.runtime.cli.ExecutionExceptionHandler] (main) ERROR: Exception in association of connection to existing transaction
2024-05-10 14:35:40,547 ERROR [org.keycloak.quarkus.runtime.cli.ExecutionExceptionHandler] (main) ERROR: Unable to enlist connection to existing transaction
2024-05-10 14:35:40,547 ERROR [org.keycloak.quarkus.runtime.cli.ExecutionExceptionHandler] (main) For more details run the same command passing the '--verbose' option. Also you can use '--help' to see the details about the usage of the particular command.

@mdillenk
Copy link

nm i fixed it by adding non-xa true to my start command

c:\DCX5\keycloak-24.0.4\bin>kc.bat start-dev --db mssql --db-url "jdbc:sqlserver://localhost:1433;databaseName=dcx5_keycloak;trustServerCertificate=false;encrypt=false" --db-username sa --db-password XXXXXX --transaction-xa-enabled false

@mmusgrov
Copy link
Contributor

@yrodiere and @shawkins Changing the exception message is quite invasive and impacts existing users so, strictly speaking, the original message would need deprecation first. How about the following explicit log message instead:

WARN: ARJUNA016154: Unable to enlist duplicate one phase resource because CoreEnvironmentBean.isAllowMultipleLastResources is disabled

@mmusgrov
Copy link
Contributor

That would be in addition to the warning issued during start up if the allowUnsafeMultipleLastResources property is set.

@shawkins
Copy link
Contributor

How about the following explicit log message instead:

@mmusgrov instead of duplicate, could it say multiple?

@yrodiere
Copy link
Member Author

yrodiere commented May 22, 2024

@yrodiere and @shawkins Changing the exception message is quite invasive and impacts existing users so, strictly speaking, the original message would need deprecation first.

I don't understand why we're talking merely about changing a message.

Once again, the more important obstacle is that the API between Agroal and Narayana is a method returning a boolean to indicate success/failure, which doesn't allow proper exception propagation. And, critically, doesn't allow Agroal to wrap Narayana's failure with its own, adding context (such as the fact that the "resource" we're talking about is a Datasource).
That is the problem which needs solving.

Now, introducing a new method might require deprecating the old one, sure (though that's not strictly required). Changing the behavior of the existing API, similarly, might require introducing a configuration flag defaulting to the old value, with some deprecation to warn about future changes.

I don't see how any of that is blocking anything, though.

How about the following explicit log message instead:

WARN: ARJUNA016154: Unable to enlist duplicate one phase resource because CoreEnvironmentBean.isAllowMultipleLastResources is disabled

IMO such a log is only useful as a quick, temporary stopgap if further work needs more time. As a user, getting an exception with little information and having to look at the logs to understand the exception is poor user experience, be it only because logs are easy to miss.

If you want to go that way, FWIW I'd explain the problem and potential solutions a bit more:

WARN: ARJUNA016154: Unable to add multiple one-phase resources in the same transaction, because this could result in partial rollbacks. To fix this, either enable XA transaction support in your resources, use one separate transaction for each resource, or enable CoreEnvironmentBean.isAllowMultipleLastResources.

Phrasing probably needs improvement, but I think this information needs to be there.

@jhalliday
Copy link
Contributor

Where programmatic exception handling can be used for runtime resolution of a problem, then the exception approach has some advantages. Here however the exception indicates a logic error rather than an environmental one and needs to be resolved at development time, not runtime. The only thing that code noticing the problem can reasonably do, is log the exception. So either narayana generates a log message (it already does) or something else catches the exception and then does. Either way, the end result is developers needing to look at the log. That log may be written to console in dev mode to slide the WARN log messages or stack traces under the developers nose, but its still just a log. Whilst the JTA spec that says how enlistResource works does leave some room for maneuver here, there is little to be gained by making behavior changes that risk breaking existing uses of the standard API.

And, critically, doesn't allow Agroal to wrap Narayana's failure with its own, adding context (such as the fact that the "resource" we're talking about is a Datasource)

Narayana did not fail. It correctly detected and prevented an unsafe situation that could have resulted in data loss. It's functioning as intended.
What additional information do you believe it can provide for the exception handler to write to the log, that narayana is not already itself writing to the log anyhow? Agroal already knows what the resource is - it passed it to narayana in the first place as an arg to enlistResource. When agroal throws the exception, nothing prevents it from adding that information. It currently doesn't include that in the exception message it creates, which is perhaps a usability issue in Agroal. On the other hand, WARN ARJUNA12140 logs the existing and offered resources anyhow, so perhaps it doesn't need to.

TL;DR: Improved usability is a good thing, but changing API behavior is not. Fortunately, one does appear to require the other on this occasion.

@yrodiere
Copy link
Member Author

yrodiere commented May 23, 2024

there is little to be gained by making behavior changes that risk breaking existing uses of the standard API

You don't necessarily need to change that behavior, or even Jakarta Transaction. You could simply expose a superset of the Jakarta Transaction APIs in Narayana. After all, the Agroal integration is already specific to Narayana.

What additional information do you believe it can provide

The reason for the inability to add a resource. All Agroal has is an enlistResource method that returns "true if the resource was enlisted successfully; otherwise false".

How in the world is Agroal supposed to go from here to giving an actionable message to users such as "cannot register two non-XA resources in the same transaction; enable XA for this datasource or use two separate transactions"?

for the exception handler to write to the log, that narayana is not already itself writing to the log anyhow?

Context is everything. A single exception shows you the problem, where it happened and (if it's worded correctly) how to address it.

With two log lines, on the other hand, you split this information in two places that are not necessarily going to be easy to reconcile:

  • Production logs may have many concurrent requests, so the two lines may be spaced apart from each other.
  • Traceability in production is another issue, as REST APIs will often return an error ID, which can then be passed to the REST APIs' support, who will do some lookup, and get the exception log line only... with no context.
  • People developing apps may first notice the exception in their browser, in some "Error 500" page... where the logs simply do not appear!
  • Even people developing apps and relying on console logs may just miss the line above the exception, mistaking it for part of the application startup logs.
  • Some people are just bad at looking at logs and waste our time if it's not crystal clear. You'd be surprised at how many people report a problem with just an exception message and don't even include the stack trace.

On the other hand, WARN ARJUNA12140 logs the existing and offered resources anyhow, so perhaps it doesn't need to.

Logs are no way to report problems of that kind of importance to end users. This is just a workaround for an API unsuitable to Agroal's needs. It's fine if you don't want/can't change it, but let's not pretend everything is alright here.

@jhalliday
Copy link
Contributor

After all, the Agroal integration is already specific to Narayana.

Really? Historically it was specific to the tx component integration SPI, which allowed it to share work with wildfly. Which bits are coupling directly against narayana? Or is the integration SPI now considered part of narayana as it's the only supported implementation?

The reason for the inability to add a resource.

Fair. I think your best bet here is just to assume that if you enlist a LastResource and get false, it's because there already is another one enlisted. There are very few other reasons that will ever cause an enlistment problem for a LastResource since its start call won't go over the network. I'd rather deal with the vanishingly small number of false positives than change the API to get to 100% certainty on the reason. I don't recall seeing one with any other cause in many years, but if you're worried then keep track of which LastResource a tx has by using TransactionSynchronizationRegistry.putResource/getResource

How in the world is Agroal supposed to go from here to giving an actionable message to users such as "cannot register two non-XA resources in the same transaction; enable XA for this datasource or use two separate transactions"?

see above, but: narayana already does that and your argument seems to be they don't read the messages anyhow :-)

With two log lines, on the other hand, you split this information in two places that are not necessarily going to be easy to reconcile

Indeed. What information do you need that isn't already in the one log line that ARJUNA12140 writes?

Logs are no way to report problems of that kind of importance to end users.

There is nothing else. Exceptions are just just log entries that happen to include a stack trace. Those frames that are within arjuna code don't provide any useful context to users. The end user only needs the exception API if they are going to try to handle it programmatically, which they aren't.

but let's not pretend everything is alright here.

I'm not. JTA is a mess of an API, because it's a kludge over XA's C centric conventions. But I'm also far from convinced that changing the API for this brings enough, or indeed any, benefit to justify the effort. There are definitely improvements that could be made, but we can get something good enough without going that far. If we're going to start making tx API changes, there are at least three on my list that are far more impactful than this one.

Some people are just bad at looking at logs and waste our time if it's not crystal clear. You'd be surprised at how many people report a problem with just an exception message and don't even include the stack trace.

I've been doing tx support for two decades, it's going to take quite a lot to surprise me. Seeing a material drop in support load from this change may - there just isn't the evidence it is causing enough problems to be worth the effort to implement a new API. Much quicker changes, like tweaking the existing agroal exception text to include more of the context that's already available but which it's not currently using, would give most of the benefit whilst leaving most engineering time for higher priority issues.

@mmusgrov
Copy link
Contributor

I assume that Agroal is swallowing the exception

@mmusgrov There is no exception from Narayana here, just a method returning a boolean:

https://github.com/agroal/agroal/blob/30465b6eaae28366449832e1ee22bbe8a9b83562/agroal-narayana/src/main/java/io/agroal/narayana/NarayanaTransactionIntegration.java#L120-L122

so just report that as the root cause

I can't, see above :)

Note that there is an open PR to allow mulitple XA unaware resources (#40365) .

Yes I'm aware of that, having been involved quite a bit xD As we discussed though, this is a temporary solution. We need to think of longer term solutions, and providing clear exception messages when someone tries to do something incorrect is the first step.

@yrodiere FYI We do actually indicate why the transaction rolls back when it is asked to commit. If you look at exception (exception.getCause()), although the reason string could be improved (I'll raise a Narayana PR to do that). Did you paste all of the exception trace? You should see something like:

 detailMessage = "setRollbackOnly called from:"
 cause = {Throwable@2478} "java.lang.Throwable: setRollbackOnly called from:"
 stackTrace = {StackTraceElement[33]@2495}
  0 = {StackTraceElement@2498} "com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.setRollbackOnly(TransactionImple.java:314)"
  1 = {StackTraceElement@2499} "com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.markRollbackOnly(TransactionImple.java:1634)"
  2 = {StackTraceElement@2500} "com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.enlistResource(TransactionImple.java:650)"
  3 = {StackTraceElement@2501} "com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.enlistResource(TransactionImple.java:398)"
  4 = {StackTraceElement@2502} "com.hp.mwtests.ts.jta.lastresource.LastResourceDisallowTestCase.testDisallowed(LastResourceDisallowTestCase.java:38)"
  etc

@yrodiere
Copy link
Member Author

After all, the Agroal integration is already specific to Narayana.

Really? Historically it was specific to the tx component integration SPI, which allowed it to share work with wildfly. Which bits are coupling directly against narayana? Or is the integration SPI now considered part of narayana as it's the only supported implementation?

I have no idea of the history behind this, I've never even contributed to Agroal.

But I can point to the integration, which is specific to Narayana (and as far as Quarkus is concerned, I'm happy it is): https://github.com/agroal/agroal/blob/30465b6eaae28366449832e1ee22bbe8a9b83562/agroal-narayana/src/main/java/io/agroal/narayana/NarayanaTransactionIntegration.java#L22-L25

Indeed. What information do you need that isn't already in the one log line that ARJUNA12140 writes?

The stacktrace. What part of the user code is triggering the issue.

Though I suppose you could include it in the warning log, however unusual that is.

there just isn't the evidence it is causing enough problems to be worth the effort to implement a new API

Alright, then the problem is priority/bandwidth -- that I can completely understand.

I think your best bet here is just to assume that if you enlist a LastResource and get false, it's because there already is another one enlisted. There are very few other reasons that will ever cause an enlistment problem for a LastResource since its start call won't go over the network. I'd rather deal with the vanishingly small number of false positives than change the API to get to 100% certainty on the reason. I don't recall seeing one with any other cause in many years

Alright, then @barreiro our best chance at solving this would be to change the Agroal message to something like below; would that be possible?

Unable to enlist connection to datasource %s to existing transaction. This is generally caused by an attempt to add multiple one-phase resources in the same transaction, because this could result in partial rollbacks. To fix this, either enable XA transaction support in your datasource, use one separate transaction for each resource, or enable workarounds related to 
"allowing multiple last resources" in the Narayana configuration of your framework.

@yrodiere
Copy link
Member Author

@yrodiere FYI We do actually indicate why the transaction rolls back when it is asked to commit. If you look at exception (exception.getCause()), although the reason string could be improved (I'll raise a Narayana PR to do that). Did you paste all of the exception trace? You should see something like:

 detailMessage = "setRollbackOnly called from:"
 cause = {Throwable@2478} "java.lang.Throwable: setRollbackOnly called from:"
 stackTrace = {StackTraceElement[33]@2495}
  0 = {StackTraceElement@2498} "com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.setRollbackOnly(TransactionImple.java:314)"
  1 = {StackTraceElement@2499} "com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.markRollbackOnly(TransactionImple.java:1634)"
  2 = {StackTraceElement@2500} "com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.enlistResource(TransactionImple.java:650)"
  3 = {StackTraceElement@2501} "com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.enlistResource(TransactionImple.java:398)"
  4 = {StackTraceElement@2502} "com.hp.mwtests.ts.jta.lastresource.LastResourceDisallowTestCase.testDisallowed(LastResourceDisallowTestCase.java:38)"
  etc

I didn't even know enliseResource called markRollbackOnly, actually. It's a bit weird that it doesn't throw an exception in that case, but whatever -- that's beside the point.

The thing is... if Agroal cannot enlist a resource, it's faced with a situation it cannot recover from, so it throws an exception. And I suspect that blocks this setRollbackOnly called from from ever being reported to the user, because the exception from Agroal will be reported instead.

See the stacktrace from #39283 , there is no mention of setRollbackOnly, just the exception from Agroal:

io.quarkus.arc.ArcUndeclaredThrowableException: Error invoking subclass method
        at io.test.agroal.bug.reproducer.ReproducerApp_Subclass.run(Unknown Source)
        at io.test.agroal.bug.reproducer.ReproducerApp_ClientProxy.run(Unknown Source)
        at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:132)
        at io.quarkus.runtime.Quarkus.run(Quarkus.java:71)
        at io.quarkus.runtime.Quarkus.run(Quarkus.java:44)
        at io.quarkus.runner.GeneratedMain.main(Unknown Source)
        at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
        at java.base/java.lang.reflect.Method.invoke(Method.java:580)
        at io.quarkus.runner.bootstrap.StartupActionImpl$1.run(StartupActionImpl.java:113)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: jakarta.transaction.RollbackException: ARJUNA016053: Could not commit transaction.
        at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1283)
        at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:104)
        at io.quarkus.narayana.jta.runtime.NotifyingTransactionManager.commit(NotifyingTransactionManager.java:70)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorBase.endTransaction(TransactionalInterceptorBase.java:406)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorBase.invokeInOurTx(TransactionalInterceptorBase.java:171)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorBase.invokeInOurTx(TransactionalInterceptorBase.java:107)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorRequired.doIntercept(TransactionalInterceptorRequired.java:38)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorBase.intercept(TransactionalInterceptorBase.java:61)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorRequired.intercept(TransactionalInterceptorRequired.java:32)
        at io.quarkus.narayana.jta.runtime.interceptor.TransactionalInterceptorRequired_Bean.intercept(Unknown Source)
        at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
        at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
        at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
        ... 10 more
Caused by: org.hibernate.exception.GenericJDBCException: Unable to acquire JDBC Connection [Exception in association of connection to existing transaction] [n/a]
        at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:63)
        at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:108)
        at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:94)
        at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:116)
        at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:143)
        at org.hibernate.engine.jdbc.internal.MutationStatementPreparerImpl.connection(MutationStatementPreparerImpl.java:137)
        at org.hibernate.engine.jdbc.internal.MutationStatementPreparerImpl$1.doPrepare(MutationStatementPreparerImpl.java:48)
        at org.hibernate.engine.jdbc.internal.MutationStatementPreparerImpl$StatementPreparationTemplate.prepareStatement(MutationStatementPreparerImpl.java:106)
        at org.hibernate.engine.jdbc.internal.MutationStatementPreparerImpl.prepareStatement(MutationStatementPreparerImpl.java:38)
        at org.hibernate.engine.jdbc.mutation.internal.ModelMutationHelper.standardStatementPreparation(ModelMutationHelper.java:145)
        at org.hibernate.engine.jdbc.mutation.internal.ModelMutationHelper.lambda$standardPreparation$0(ModelMutationHelper.java:118)
        at org.hibernate.engine.jdbc.mutation.internal.PreparedStatementDetailsStandard.resolveStatement(PreparedStatementDetailsStandard.java:87)
        at org.hibernate.engine.jdbc.mutation.internal.JdbcValueBindingsImpl.lambda$beforeStatement$0(JdbcValueBindingsImpl.java:88)
        at java.base/java.lang.Iterable.forEach(Iterable.java:75)
        at org.hibernate.engine.jdbc.mutation.spi.BindingGroup.forEachBinding(BindingGroup.java:51)
        at org.hibernate.engine.jdbc.mutation.internal.JdbcValueBindingsImpl.beforeStatement(JdbcValueBindingsImpl.java:85)
        at org.hibernate.engine.jdbc.mutation.internal.AbstractMutationExecutor.performNonBatchedMutation(AbstractMutationExecutor.java:104)
        at org.hibernate.engine.jdbc.mutation.internal.MutationExecutorSingleNonBatched.performNonBatchedOperations(MutationExecutorSingleNonBatched.java:40)
        at org.hibernate.engine.jdbc.mutation.internal.AbstractMutationExecutor.execute(AbstractMutationExecutor.java:52)
        at org.hibernate.persister.entity.mutation.InsertCoordinator.doStaticInserts(InsertCoordinator.java:175)
        at org.hibernate.persister.entity.mutation.InsertCoordinator.coordinateInsert(InsertCoordinator.java:113)
        at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2873)
        at org.hibernate.action.internal.EntityInsertAction.execute(EntityInsertAction.java:104)
        at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:632)
        at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:499)
        at org.hibernate.event.internal.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:363)
        at org.hibernate.event.internal.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:41)
        at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:127)
        at org.hibernate.internal.SessionImpl.doFlush(SessionImpl.java:1403)
        at org.hibernate.internal.SessionImpl.managedFlush(SessionImpl.java:484)
        at org.hibernate.internal.SessionImpl.flushBeforeTransactionCompletion(SessionImpl.java:2319)
        at org.hibernate.internal.SessionImpl.beforeTransactionCompletion(SessionImpl.java:1976)
        at org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.beforeTransactionCompletion(JdbcCoordinatorImpl.java:439)
        at org.hibernate.resource.transaction.backend.jta.internal.JtaTransactionCoordinatorImpl.beforeCompletion(JtaTransactionCoordinatorImpl.java:336)
        at org.hibernate.resource.transaction.backend.jta.internal.synchronization.SynchronizationCallbackCoordinatorNonTrackingImpl.beforeCompletion(SynchronizationCallbackCoordinatorNonTrackingImpl.java:47)
        at org.hibernate.resource.transaction.backend.jta.internal.synchronization.RegisteredSynchronization.beforeCompletion(RegisteredSynchronization.java:37)
        at com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.beforeCompletion(SynchronizationImple.java:52)
        at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:351)
        at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:69)
        at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:138)
        at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1271)
        ... 22 more
Caused by: java.sql.SQLException: Exception in association of connection to existing transaction
        at io.agroal.narayana.NarayanaTransactionIntegration.associate(NarayanaTransactionIntegration.java:130)
        at io.agroal.pool.ConnectionPool.getConnection(ConnectionPool.java:257)
        at io.agroal.pool.DataSource.getConnection(DataSource.java:86)
        at io.quarkus.hibernate.orm.runtime.customized.QuarkusConnectionProvider.getConnection(QuarkusConnectionProvider.java:23)
        at org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:46)
        at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnecagedImpl.java:113)
        ... 59 more
Caused by: java.sql.SQLException: Unable to enlist connection to existing transaction
        at io.agroal.narayana.NarayanaTransactionIntegration.associate(NarayanaTransactionIntegration.java:121)
        ... 64 more

We could try to look into ways of leveraging suppressed exceptions to report both to the user, but considering the exception from Narayana just reports a stacktrace we already have, without a detailed message like "you can't add two non-XA resources to a single transaction", I'm not sure that's worth the trouble.

@mmusgrov
Copy link
Contributor

Caused by: jakarta.transaction.RollbackException: ARJUNA016053: Could not commit transaction.

If you look at the cause of the exception (e.getCause()) then setRollbackOnly will appear in the backtrace.

@mmusgrov
Copy link
Contributor

We could try to look into ways of leveraging suppressed exceptions to report both to the user, but considering the exception from Narayana just reports a stacktrace we already have, without a detailed message like "you can't add two non-XA resources to a single transaction", I'm not sure that's worth the trouble.

Our exception includes the root cause and any exception that Agroal generates should maintain the root cause. I can look into including a more detailed reason message in the exception's root cause.

@gastaldi
Copy link
Contributor

#40779 introduces a better error message (see agroal/agroal@6c62f82)

@jhalliday
Copy link
Contributor

jhalliday commented May 23, 2024

The stacktrace. What part of the user code is triggering the issue.

The narayana internal frames won't mean anything to them. Their own code's frames are there anyhow in the exception that agroal already throws.

Caused by: jakarta.transaction.RollbackException: ARJUNA016053: Could not commit transaction.
at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1283)
at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:104)
Caused by: org.hibernate.exception.GenericJDBCException: Unable to acquire JDBC Connection [Exception in association of connection to existing transaction] [n/a]
at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:63)

That actually is the deferred exception support at work, just in a different way. When commit is called, the transaction system MUST complete termination of the transaction before returning control. That is, it can't stop half way through terminating the tx and throw an exception. So if there is a problem, such as the pre-commit flush failing as here, then it catches and stashes that exception, goes on to finish the job by rolling back, then attaches the exception to the RollbackException it throws at the end. The same pattern can be used when a user explicitly calls setRollbackOnly rather than the tx system catching an exception - the stack at the point where the user or library called setRollbackOnly is stashed and thrown as cause if the user later tries to call commit. We could do that internally - catch the hibernate exception and call setRollbackOnly, relying on it to do the stack capture, but then the cause would rooted in the exception handler, not in hibernate.

Nothing would stop us stashing some more context when we internally call setRollbackOnly before returning false from enlistResource, but that would help the tx rollback message, not the one that's thrown by agroal. A slightly kludgy workaround would be to have enlistResource stash the offending LastResource instances on the tx context, from which agroal could retrieve them by tsr.getResource.

@yrodiere yrodiere linked a pull request Jun 3, 2024 that will close this issue
@yrodiere
Copy link
Member Author

yrodiere commented Jun 3, 2024

The narayana internal frames won't mean anything to them. Their own code's frames are there anyhow in the exception that agroal already throws.

We're going in circles... Yes the stacktrace with user frames is in the Agroal exception, but that exception has no information about what failed exactly.

Anyway. Let's settle for the hardcoded message introduced in #40779 / agroal/agroal@6c62f82 , which is similar to what you suggested. It's probably the best compromise right now considering adding new (throwing) methods in Narayana is not an option.

Closing, thanks.

@yrodiere yrodiere closed this as completed Jun 3, 2024
@yrodiere yrodiere added this to the 3.12 - main milestone Jun 3, 2024
@gsmet gsmet modified the milestones: 3.12 - main, 3.11.1 Jun 4, 2024
@yrodiere
Copy link
Member Author

See also #41090

@mmusgrov
Copy link
Contributor

@yrodiere FYI we added a specific arjuna logging code for the message that get's reported in the rollback exception, notice that I took onboard your recommended text.

@mmusgrov
Copy link
Contributor

We did that in narayana pr jbosstm/narayana#2256

@yrodiere
Copy link
Member Author

Got it @mmusgrov , thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/agroal area/narayana Transactions / Narayana kind/enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants