Improve ADF Tool #21
tongwu-sh commented Mar 18, 2020
- Support parallel processing in the ADF Tool
- Support cross-region source and destination
- Switch to streaming mode to avoid disk overhead
Feature/multithreads
Feature/multithreads
Merge from master
Fix data transfer issue
src/Fhir.Anonymizer.AzureDataFactoryPipeline/src/FhirBlobConsumer.cs
Console.WriteLine($"[{stopWatch.Elapsed.ToString()}][tid:{args.CurrentThreadId}]: {processedCount} Completed. {processedErrorCount} Failed. {consumedCount} consume completed.");
};

await executor.ExecuteAsync(CancellationToken.None, false, progress).ConfigureAwait(false);
Can (or should) we build resiliency against a node crash in the middle of processing a large blob, for example by picking up only the unprocessed data on restart?
Maybe we can add file-level retry in the ADF pipeline first? We could use a ForEach activity to run the anonymizer tool on a single file and retry if it fails.
To support resume, it looks like we need somewhere to store partial status. In batch mode, we could do this with an additional storage table or ... If we switch to Azure Functions, it looks like we could easily leverage the partial status.
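To make the resume idea concrete, here is a minimal sketch of checkpoint-based resume, not the tool's actual implementation. `ICheckpointStore`, `InMemoryCheckpointStore`, and `ResumableProcessor` are hypothetical names; the in-memory store stands in for wherever the partial status would really live (such as a storage table). It assumes records are processed sequentially and in order, which would need more thought with the parallel executor in this PR.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over wherever partial status is kept
// (for example a storage table); not part of the existing tool.
public interface ICheckpointStore
{
    int GetProcessedCount(string blobName);
    void SetProcessedCount(string blobName, int count);
}

// In-memory stand-in so the sketch runs without any Azure dependency.
public sealed class InMemoryCheckpointStore : ICheckpointStore
{
    private readonly Dictionary<string, int> _counts = new Dictionary<string, int>();

    public int GetProcessedCount(string blobName) =>
        _counts.TryGetValue(blobName, out var count) ? count : 0;

    public void SetProcessedCount(string blobName, int count) =>
        _counts[blobName] = count;
}

public static class ResumableProcessor
{
    // Processes records in order, skipping those already recorded as done.
    // Returns the number of records processed in this run.
    public static int Process(
        string blobName,
        IReadOnlyList<string> records,
        ICheckpointStore store,
        Action<string> processRecord)
    {
        int start = store.GetProcessedCount(blobName);
        int processed = 0;
        for (int i = start; i < records.Count; i++)
        {
            processRecord(records[i]);
            // Checkpoint after each record; a real tool would likely batch this.
            store.SetProcessedCount(blobName, i + 1);
            processed++;
        }
        return processed;
    }
}
```

Because the checkpoint is written after a record is processed, a crash between the two steps means that record is processed again on restart, so this gives at-least-once rather than exactly-once behavior.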
Should we add a story to track this work?
Created a story in the backlog to track this. Thanks! https://microsofthealth.visualstudio.com/Health/_workitems/edit/73121/