[BUG] Elsa 3.2.1 - Workflow instances aren't persisted when starting #6067
Comments
Is no one else experiencing this issue? This looks like a pretty serious problem, impacting both workflow execution monitoring and workflow instance resumption in the case of a server stop. There is no trace in the database of a workflow instance being executed until it finishes running.
Hi @valsov, this is actually by design. The rationale is that not all scenarios demand immediate persistence of a workflow instance; in some scenarios, performance is more important than an up-to-date state in the DB. To that end, we have this parent issue: #4833. Until that feature is implemented, the workflow state is committed to the DB only when there are no more activities to execute (i.e. when the workflow gets suspended or is finished). Now, as you have observed, some trigger implementations create the workflow instance before executing it. This is a deviation from the current pattern and was done in order to store workflow input with the workflow instance, rather than sending that input via a service bus message, which can lead to "message size exceeded" exceptions when providing large inputs. Until the aforementioned feature is implemented, you can try implementing the
Hello @sfmskywalker, thank you for your input on this. For reference, this is the code I'm using to restart a workflow instance:
    var runner = serviceProvider.GetRequiredService<IWorkflowRunner>();
    var definitionStore = serviceProvider.GetRequiredService<IWorkflowDefinitionService>();
    var instanceStore = serviceProvider.GetRequiredService<IWorkflowInstanceStore>();
    var instanceManager = serviceProvider.GetRequiredService<IWorkflowInstanceManager>();
    // Retrieve state
    var instance = await instanceStore.FindAsync(new Elsa.Workflows.Management.Filters.WorkflowInstanceFilter
    {
        WorkflowStatus = Elsa.Workflows.WorkflowStatus.Running
    });
    var workflowGraph = await definitionStore.FindWorkflowGraphAsync(instance.DefinitionVersionId);
    // Restart workflow
    var endState = await runner.RunAsync(workflowGraph, instance.WorkflowState);
    // Persist instance
    await instanceManager.SaveAsync(endState.WorkflowState, CancellationToken.None);
I look forward to the workflow execution recovery feature you mentioned.
It's great to see that you got this working! In order to persist and reload workflow variables, try using the
Thanks again. I managed to implement a simple recovery solution with your answers. Here is the code in case anyone needs it:
    public async Task HandleAsync(ActivityExecuted notification, CancellationToken cancellationToken)
    {
        try
        {
            // Persist the workflow instance after every executed activity
            var workflowInstanceManager = serviceProvider.GetRequiredService<IWorkflowInstanceManager>();
            await workflowInstanceManager.SaveAsync(notification.ActivityExecutionContext.WorkflowExecutionContext, cancellationToken);
            // Persist variable state when the activity sets a variable:
            // 1) SetVariable activity
            // 2) Set variable via output
            if (notification.ActivityExecutionContext.Activity is SetVariable ||
                notification.ActivityExecutionContext.Activity.GetOutputs().Any())
            {
                var variableService = serviceProvider.GetRequiredService<IVariablePersistenceManager>();
                await variableService.SaveVariablesAsync(notification.ActivityExecutionContext.WorkflowExecutionContext);
            }
        }
        catch (Exception e)
        {
            logger.LogError(e, "Workflow lifecycle notification handling failed ({LifecycleEvent})", nameof(ActivityExecuted));
        }
    }
    /// <summary>
    /// Service running on startup to resume workflows that were executing before the server stopped.
    /// </summary>
    /// <remarks>Monitor this issue to remove this custom code: <see href="https://github.com/elsa-workflows/elsa-core/issues/4833"/></remarks>
    public class WorkflowsRecoveryBackgroundService(IServiceScopeFactory serviceScopeFactory, ILogger<WorkflowsRecoveryBackgroundService> logger) : BackgroundService
    {
        /// <summary>
        /// Server startup time, should be set by the server configuration pipeline
        /// </summary>
        public static DateTime ServerStartupTime { get; set; }
        protected override async Task ExecuteAsync(CancellationToken stoppingToken)
        {
            using var scope = serviceScopeFactory.CreateScope();
            var instanceStore = scope.ServiceProvider.GetRequiredService<IWorkflowInstanceStore>();
            try
            {
                // Find all workflow instances with a running status that were last updated before the application started, thus needing recovery
                var instancesNeedingRecovery = (await instanceStore.FindManyAsync(new WorkflowInstanceFilter
                {
                    WorkflowStatus = WorkflowStatus.Running,
                    TimestampFilters =
                    [
                        new TimestampFilter
                        {
                            Column = nameof(WorkflowInstance.UpdatedAt),
                            Operator = TimestampFilterOperator.LessThan,
                            Timestamp = new DateTimeOffset(ServerStartupTime)
                        }
                    ]
                }, stoppingToken)).ToList();
                if (instancesNeedingRecovery.Count == 0)
                {
                    logger.LogDebug("Workflows recovery process finished: no instances to recover");
                    return;
                }
                logger.LogInformation("Workflows recovery process started for {WorkflowInstancesCount} workflow instances", instancesNeedingRecovery.Count);
                // Resume all instances concurrently and wait for them to finish
                var tasks = new List<Task>();
                foreach (var instance in instancesNeedingRecovery)
                {
                    tasks.Add(ResumeWorkflowExecution(instance, stoppingToken));
                }
                await Task.WhenAll(tasks);
                logger.LogInformation("Workflows recovery process finished");
            }
            catch (OperationCanceledException)
            {
                throw;
            }
            catch (Exception e)
            {
                logger.LogError(e, "Workflows recovery process failed");
            }
        }
        private async Task ResumeWorkflowExecution(WorkflowInstance instance, CancellationToken stoppingToken)
        {
            using var logScope = logger.AddLogPropertyToScope("CorrelationId", instance.CorrelationId);
            try
            {
                // Create a new service scope for each recovery: needed for a correct execution context
                using var serviceScope = serviceScopeFactory.CreateScope();
                var runner = serviceScope.ServiceProvider.GetRequiredService<IWorkflowRunner>();
                var definitionStore = serviceScope.ServiceProvider.GetRequiredService<IWorkflowDefinitionService>();
                var instanceManager = serviceScope.ServiceProvider.GetRequiredService<IWorkflowInstanceManager>();
                var variableService = serviceScope.ServiceProvider.GetRequiredService<IVariablePersistenceManager>();
                // Retrieve state
                var workflowGraph = await definitionStore.FindWorkflowGraphAsync(instance.DefinitionVersionId);
                var executionContext = await WorkflowExecutionContext.CreateAsync(
                    serviceScope.ServiceProvider,
                    workflowGraph,
                    instance.WorkflowState,
                    instance.CorrelationId,
                    instance.ParentWorkflowInstanceId,
                    instance.WorkflowState.Input,
                    instance.WorkflowState.Properties);
                // Reload the persisted variables into the execution context
                await variableService.LoadVariablesAsync(executionContext);
                logger.LogInformation("Workflow instance execution resuming started: {Workflow}", workflowGraph.Workflow.WorkflowMetadata.Name);
                // Start workflow
                var endState = await runner.RunAsync(executionContext);
                // Persist workflow instance
                await instanceManager.SaveAsync(endState.WorkflowState, stoppingToken);
            }
            catch (Exception e)
            {
                logger.LogError(e, "Failed to resume workflow instance execution with Id={WorkflowInstanceId}", instance.Id);
            }
        }
    }
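For anyone reusing the recovery service above, here is a minimal sketch of how it might be wired up at startup. It assumes an ASP.NET Core host with a top-level Program.cs and implicit usings. AddHostedService is the standard Microsoft.Extensions.Hosting registration call. Where ServerStartupTime gets assigned, and the use of DateTime.UtcNow, are assumptions: the comment only says the value should be set by the server configuration pipeline. The Elsa configuration itself is omitted.

```csharp
// Program.cs (sketch). Assumes the WorkflowsRecoveryBackgroundService class
// from the comment above is in scope; add a using for its namespace if needed.
var builder = WebApplication.CreateBuilder(args);

// Assumption: record the startup time before the host starts, so that instances
// last updated before this moment are the ones treated as needing recovery.
WorkflowsRecoveryBackgroundService.ServerStartupTime = DateTime.UtcNow;

// ... Elsa services would be configured here (omitted) ...

// Standard hosted-service registration: ExecuteAsync runs once when the host starts.
builder.Services.AddHostedService<WorkflowsRecoveryBackgroundService>();

var app = builder.Build();
app.Run();
```

The ActivityExecuted handler from the same comment also has to be registered with Elsa's notification pipeline for instances to show up in the database mid-run; that registration is not shown here because the thread does not include it.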
Description
When starting a workflow instance, either manually from Elsa Studio or automatically with a CRON expression (configured in Studio) and a configured Quartz scheduler, the instance isn't persisted to the database until it has completed execution.
This is concerning because if the server stops, a running workflow won't resume.
Steps to Reproduce
To help us identify the issue more quickly, please follow these guidelines:
In Elsa studio:
Elsa server configuration code:
Attachments:
cron-start-workflow.json
manual-start-workflow.json
Reproduction Rate: every time, both with scheduled and manual execution from Elsa studio
Video/Screenshots: N/A
Additional Configuration:
Expected Behavior
I would expect the workflow instance to be persisted to the database when its execution starts, so that I can view its progress in the instances view of the Elsa dashboard.
Actual Behavior
The instance is only persisted at the end of its execution. I checked IWorkflowInstanceStore, the workflow instances dashboard page and the [WorkflowInstances] SQL table. The instance is nowhere to be found until it has finished executing.
Environment
Troubleshooting Attempts
I looked for documentation on the subject but didn't find any. I tried to configure the workflow engine with multiple options, like manually configuring the workflow execution pipeline, but it didn't change the behavior.
It looks like BackgroundWorkflowDispatcher doesn't create a workflow instance before executing it. This is strange because the MassTransit implementation of IWorkflowDispatcher (MassTransitWorkflowDispatcher) creates one. Am I missing something?
Related Issues
I found this related issue, but I don't have access to the code executing the workflows in my use cases:
#5832