SELECT MAX(u.SeqNr) as SequenceNr FROM (
SELECT e.{Configuration.SequenceNrColumnName} as SeqNr FROM {Configuration.FullJournalTableName} e WHERE e.{Configuration.PersistenceIdColumnName} = @PersistenceId
UNION
SELECT m.{Configuration.SequenceNrColumnName} as SeqNr FROM {Configuration.FullMetaTableName} m WHERE m.{Configuration.PersistenceIdColumnName} = @PersistenceId) as u";
Due to how the query is written, the max value is calculated from an intermediate table that selects everything matching the specified PersistenceId, as a union of the journal and the metadata tables. This is normally fine for actors with a low number of events for a specific PersistenceId, but building the intermediate table forces the database to select and retrieve all the matching rows. For actors that have a high number of events for a single PersistenceId, this causes very high I/O throughput and a long execution time.
We noticed this on our system, where some EventProcessors actors have millions of events for a single PersistenceId. Even though they have snapshots, this query is still executed, and it takes several seconds to complete. When Akka boots and the persisted actors are recovered, a dozen of those EventProcessors actors compete to complete this query, causing I/O exhaustion and, after several minutes, SQL timeouts:
System.Data.SqlClient.SqlException (0x80131904): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
---> System.ComponentModel.Win32Exception (258): The wait operation timed out.
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.SqlCommand.InternalEndExecuteReader(IAsyncResult asyncResult, String endMethod)
at System.Data.SqlClient.SqlCommand.EndExecuteReaderInternal(IAsyncResult asyncResult)
at System.Data.SqlClient.SqlCommand.EndExecuteReader(IAsyncResult asyncResult)
at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location ---
at Akka.Persistence.Sql.Common.Journal.AbstractQueryExecutor.SelectHighestSequenceNrAsync(DbConnection connection, CancellationToken cancellationToken, String persistenceId)
at Akka.Persistence.Sql.Common.Journal.SqlJournal.ReadHighestSequenceNrAsync(String persistenceId, Int64 fromSequenceNr)
at Akka.Util.Internal.AtomicState.CallThrough[T](Func`1 task)
at Akka.Util.Internal.AtomicState.CallThrough[T](Func`1 task)
ClientConnectionId:1ea18be6-f53f-43aa-b760-fea225669a7f
Error Number:-2,State:0,Class:11
To Reproduce
Steps to reproduce the behavior:
Run the affected SQL query for any PersistenceId. Example for an actor called CheckoutSagaManager:
SELECT MAX(u.SeqNr) as SequenceNr FROM ( SELECT e.SequenceNr as SeqNr FROM dbo.EventJournal e WHERE e.PersistenceId = 'CheckoutSagaManager' UNION SELECT m.SequenceNr as SeqNr FROM dbo.Metadata m WHERE m.PersistenceId = 'CheckoutSagaManager') as u
Analyze the execution plan and note that the number of rows read matches the number of events for that PersistenceId.
Expected behavior
This query does not need to read all rows; it should select only the MAX by using the existing index.
Environment
Windows
Solution
The fix is simple: applying MAX inside each sub-query of the UNION is sufficient, so that the database is no longer forced to retrieve all of these rows.
I'm currently working on the fix, and I will create a PR with the resolution shortly.
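To illustrate the idea, here is a minimal sketch of the rewritten query next to the original one. It is not the actual patch: it runs against an in-memory SQLite database through Python's built-in sqlite3 module as a stand-in for SQL Server, and the table layout, the sample rows, and the `highest_sequence_nr` helper are assumptions made for the example. It only checks that both forms return the same highest sequence number; whether SQL Server answers each inner MAX from the index depends on the index actually present on (PersistenceId, SequenceNr), so the execution plan still has to be checked on the real database.

```python
import sqlite3

# Hypothetical stand-in schema: table and column names follow the example
# query in this issue. SQLite replaces SQL Server for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EventJournal (
        PersistenceId TEXT NOT NULL,
        SequenceNr INTEGER NOT NULL,
        PRIMARY KEY (PersistenceId, SequenceNr));
    CREATE TABLE Metadata (
        PersistenceId TEXT NOT NULL,
        SequenceNr INTEGER NOT NULL,
        PRIMARY KEY (PersistenceId, SequenceNr));
""")

# Made-up sample data: 1000 journal rows plus a higher metadata entry.
conn.executemany(
    "INSERT INTO EventJournal VALUES ('CheckoutSagaManager', ?)",
    [(n,) for n in range(1, 1001)],
)
conn.execute("INSERT INTO Metadata VALUES ('CheckoutSagaManager', 1200)")
conn.execute("INSERT INTO EventJournal VALUES ('OtherActor', 5000)")

# Query shape reported in this issue: the UNION materialises every row.
ORIGINAL_SQL = """
SELECT MAX(u.SeqNr) AS SequenceNr FROM (
    SELECT e.SequenceNr AS SeqNr FROM EventJournal e WHERE e.PersistenceId = :pid
    UNION
    SELECT m.SequenceNr AS SeqNr FROM Metadata m WHERE m.PersistenceId = :pid) AS u
"""

# Proposed shape: each branch of the UNION yields a single aggregated row.
PROPOSED_SQL = """
SELECT MAX(u.SeqNr) AS SequenceNr FROM (
    SELECT MAX(e.SequenceNr) AS SeqNr FROM EventJournal e WHERE e.PersistenceId = :pid
    UNION
    SELECT MAX(m.SequenceNr) AS SeqNr FROM Metadata m WHERE m.PersistenceId = :pid) AS u
"""


def highest_sequence_nr(sql, persistence_id):
    """Run one of the queries above and return its single value (None if no rows)."""
    return conn.execute(sql, {"pid": persistence_id}).fetchone()[0]


# Both forms agree, including for an id with no rows at all.
for pid in ("CheckoutSagaManager", "OtherActor", "UnknownActor"):
    assert highest_sequence_nr(ORIGINAL_SQL, pid) == highest_sequence_nr(PROPOSED_SQL, pid)
```

With MAX inside each branch, the outer query only ever sees at most two rows (one per table), instead of one row per event.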
@Aaronontheweb, just dropping by to say that I've tested release 1.4.29, and I can see in SQL Profiler that the updated query is being applied and returns the expected result.
Version Information
Reproduced with Akka 1.4.10.
Uses Akka.Persistence.SqlServer for persistence.
Describe the bug
When persisted actors are recovered, the highest sequence number is retrieved with the following query:
akka.net/src/contrib/persistence/Akka.Persistence.Sql.Common/Journal/QueryExecutor.cs
Lines 343 to 348 in ccb4670