Initial creation of filesystem scanner for DMLib #21492
…ses for filesystem scanner
sdk/storage/Azure.Storage.Common.DataMovement/src/FilesystemScanner.cs
// Skip the file if opening stream fails
catch
{
    continue;
Same here, add a TODO for logging, it would be good for the user to know why we didn't upload certain files.
File check removed, see below.
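To illustrate the logging request above, here is a minimal sketch of skipping unreadable files while still telling the caller why each one was skipped. The class `SkipLoggingSketch`, the method `FilterReadable`, and the `onSkipped` callback are hypothetical names chosen for illustration; they are not part of the PR, which removed this file check.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SkipLoggingSketch
{
    // Hypothetical helper: returns the paths that can be opened for reading and
    // reports every skipped path, with the exception that caused the skip,
    // through the onSkipped callback instead of silently continuing.
    public static List<string> FilterReadable(
        IEnumerable<string> paths,
        Action<string, Exception> onSkipped)
    {
        var readable = new List<string>();
        foreach (string path in paths)
        {
            try
            {
                // Opening the stream is the permission probe; dispose it immediately.
                using (FileStream stream = File.OpenRead(path))
                {
                    readable.Add(path);
                }
            }
            catch (Exception ex) when (ex is IOException || ex is UnauthorizedAccessException)
            {
                onSkipped(path, ex);
            }
        }
        return readable;
    }
}
```

A caller could pass a callback that writes to whatever logger the library eventually adopts, so the reason a file was not uploaded is never lost.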
sdk/storage/Azure.Storage.Common.DataMovement/src/FilesystemScanner.cs
public void ScanFolderContainingMixedPermissions()
{
    // Arrange
    string testPath = "C:\\Test";
Let's use something like this https://docs.microsoft.com/en-us/dotnet/api/system.io.path.gettemppath?view=net-5.0&tabs=windows
Also this to create files
https://docs.microsoft.com/en-us/dotnet/api/system.io.path.gettempfilename?view=net-5.0
Same for the other tests
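A minimal sketch of the suggestion above, building the test folder under the system temp directory instead of a hard-coded `C:\Test`. The class and method names are hypothetical. It uses `Path.GetRandomFileName` to name the folder, which is an assumption on top of the two APIs linked above (`Path.GetTempFileName` creates a single zero-byte file rather than a folder).

```csharp
using System;
using System.IO;

public static class TempTreeSketch
{
    // Hypothetical test helper: creates a uniquely named folder under the
    // system temp directory and fills it with fileCount empty files.
    // Returns the absolute path of the new folder.
    public static string CreateTempFolderWithFiles(int fileCount)
    {
        string folder = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(folder);
        for (int i = 0; i < fileCount; i++)
        {
            File.WriteAllText(Path.Combine(folder, "file" + i + ".txt"), string.Empty);
        }
        return folder;
    }
}
```

Each test would delete the folder in its cleanup step, for example with `Directory.Delete(folder, true)`, so runs do not leak files on any platform.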
Rewrote tests to use this, all seem to pass on Windows.
TODO: Testing on Linux/macOS
I'm thinking the build errors from the pipelines at the moment might be coming from the tests. The Mono.Posix.NETStandard package I added is causing the following warning to show:
- MSB3245: Could not resolve this reference. Could not locate the assembly "Mono.Posix". Check to make sure the assembly exists on disk. If this reference is required by your code, you may get compilation errors.
- Azure.Storage.Common.DataMovement.Tests
- Azure.Storage.Blobs.DataMovement.Tests
Windows builds fine, but since the package is only relevant for Unix/Posix platforms, it could be causing the build failures due to missing assembly. I need to set up the VM and check on this + run tests.
…precede first successful yield
520abcc to 3c8090d
This pull request is protected by Check Enforcer.
What is Check Enforcer?
Check Enforcer helps ensure all pull requests are covered by at least one check-run (typically an Azure Pipeline). When all check-runs associated with this pull request pass then Check Enforcer itself will pass.
Why am I getting this message?
You are getting this message because Check Enforcer did not detect any check-runs being associated with this pull request within five minutes. This may indicate that your pull request is not covered by any pipelines and so Check Enforcer is correctly blocking the pull request being merged.
What should I do now?
If the check-enforcer check-run is not passing and all other check-runs associated with this PR are passing (excluding license-cla) then you could try telling Check Enforcer to evaluate your pull request again. You can do this by adding a comment to this pull request as follows:
What if I am onboarding a new service?
Often, new services do not have validation pipelines associated with them. In order to bootstrap pipelines for a new service, please perform following steps:
For data-plane/track 2 SDKs
Issue the following command as a pull request comment:
For track 1 management-plane SDKs
Please open a separate PR and to your service SDK path in this file. Once that PR has been merged, you can re-run the pipeline to trigger the verification.
/check-enforcer evaluate
/check-enforcer reset
/check-enforcer reset
/check-enforcer evaluate
3c8090d to faac640
/azp run net - storage - ci
Azure Pipelines successfully started running 1 pipeline(s).
/azp run net - storage - ci
Azure Pipelines successfully started running 1 pipeline(s).
@@ -202,6 +202,7 @@
<PackageReference Update="Microsoft.Rest.ClientRuntime.Azure.TestFramework" Version="[1.7.7, 2.0.0)" />
<PackageReference Update="Microsoft.ServiceFabric.Data" Version="3.3.624" />
<PackageReference Update="Microsoft.Spatial" Version="7.5.3" />
<PackageReference Update="Mono.Posix" Version="7.0.0-alpha8.21302.6" />
We should check if we can introduce these dependencies first.
My guess is that we can't as they appear to be 3rd party.
If these are only for testing we can add them in test project file directly.
#21715 I took the two dependencies I introduced for testing out of this file and used VersionOverride in the .csproj. The error spooked me and override sounded scary.
I can't tell if the Mono.Posix is/isn't 3rd party, since Microsoft is listed as an owner alongside some other users. There was another package called Mono.Posix.NETStandard with only Microsoft as an owner, but for some reason, that package was searching for the assembly "Mono.Posix". I couldn't think of anything to get that one to properly find the assembly without potentially causing other issues.
Do you think I should drop the dependency and maybe just execute bash + call chmod manually? I fiddled with manually loading the c library for chmod, but I couldn't do it with .NET Core 2.1 as a target.
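As a rough sketch of the "call chmod manually" alternative raised above, the test code could shell out to the `chmod` binary through `System.Diagnostics.Process` rather than take a package dependency. The class `ChmodSketch` and its method are hypothetical, the sketch assumes `chmod` is on the `PATH` (so it only applies on Unix-like systems), and the quoting is deliberately naive: a path containing a double quote would break it.

```csharp
using System;
using System.Diagnostics;

public static class ChmodSketch
{
    // Hypothetical test helper: runs `chmod <mode> "<path>"` and returns the
    // process exit code (0 on success). Only meaningful on Unix-like systems.
    public static int Chmod(string mode, string path)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = "chmod",
            Arguments = mode + " \"" + path + "\"",
            UseShellExecute = false,
            RedirectStandardError = true,
        };
        using (Process process = Process.Start(startInfo))
        {
            // Drain stderr so the child cannot block on a full pipe.
            process.StandardError.ReadToEnd();
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

Tests that rely on this would need to be skipped on Windows, for example by checking `RuntimeInformation.IsOSPlatform(OSPlatform.Windows)` first.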
/// </summary>
/// <param name="path">Filesystem location.</param>
/// <returns>Enumerable list of absolute paths containing all relevant files the user has permission to access.</returns>
public static IEnumerable<string> ScanLocation(string path)
#21715 I refactored the scanner to make it non-static and shift a couple things around, let me know if there's any issues with this implementation.
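For readers without access to #21715, the following is a minimal sketch of what a non-static, lazily enumerating scanner could look like. It is not the implementation from that PR: the class name `FolderScannerSketch`, the `Scan` method, and the `continueOnError` default are assumptions made for illustration. Because C# does not allow `yield return` inside a `try` block that has a `catch`, the directory listing happens inside the `try` and the yielding happens after it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public class FolderScannerSketch
{
    private readonly string _root;

    // Throws up front if the root folder does not exist.
    public FolderScannerSketch(string path)
    {
        if (!Directory.Exists(path))
        {
            throw new DirectoryNotFoundException("No such folder: " + path);
        }
        _root = Path.GetFullPath(path);
    }

    // Lazily yields the absolute path of every file under the root.
    // When continueOnError is true, folders that cannot be listed are skipped;
    // otherwise the exception propagates to the caller during enumeration.
    public IEnumerable<string> Scan(bool continueOnError = true)
    {
        var pending = new Stack<string>();
        pending.Push(_root);
        while (pending.Count > 0)
        {
            string current = pending.Pop();
            string[] files;
            string[] folders;
            try
            {
                files = Directory.GetFiles(current);
                folders = Directory.GetDirectories(current);
            }
            catch (Exception ex) when (continueOnError &&
                (ex is UnauthorizedAccessException || ex is IOException))
            {
                continue;
            }
            foreach (string file in files)
            {
                yield return file;
            }
            foreach (string folder in folders)
            {
                pending.Push(folder);
            }
        }
    }
}
```

Holding the root path in an instance makes the scanner easier to construct once and pass around or replace in tests than a static `ScanLocation(string path)` method.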
* Create packages for DM Common and Blobs * Making Test packages for DM Common and Blobs; Added Readme, Changelog, BreakingChanges stubs * WIP - Added BlobDirectoryUploadOptions and StorageTransferStatus * Initial creation of filesystem scanner for DMLib (#21492) * Filesystem scanner refactored to non-static implementation (#21715) * Created filesystem scanner for DM Common * Modifed scanner to properly handle missing permissions; added test cases for filesystem scanner * Tests remade using native temp files; Scanner now throws errors that precede first successful yield * Changed Posix compatibility dependency * Edited versioning and READMEs to adhere to pipelines * Refactored scanner to non-static implementation; provided user configurable options for error handling * Removed test dependencies from central package list * Refactored scanner to non-static implementation; provided user configurable options for error handling * Removed test dependencies from central package list * Scanner will only work on one path for now * Capitalization on FileSystemScanner * Changed scanner to internal * Refactored FS scanner to use factory model and work better with mocking (#21894) * Refactored FS scanner to use factory model and work better with mocking * Rename class/simplify factory implementation * Return folders as well; preview of ctor logic changes (only throw if path nonexistent/malformed) * Changed parameter name for scan (continueOnError), re-exported API * More exported API changes * DMLib Skeleton start (#22336) * WIP - Removed DataMovement Blobs package, consildate to one package * WIP - Storage Transfer Jobs * WIP - remove dm blobs * WIP - Added TraansferItemScheduler * Ran exportapis * WIP - Resolve package conflicts * Addressed most PR comments * Ran export-api script * Made job for each specific operation for blobs * Added specific copy directory jobs, added option bags for copy scenarios * Ran ExportApi script * Update comments in StorageTransferManager * 
Rename BlobUploadDirectoryOptions -> BlobDirectoryUploadOptions * Run ExportAPI * PR Comments * Merge fix * WIP * Directory Upload and Download basic tests work * Test recordings test * Rerecord tests * WIP - not all ListBlobs/GetBlobs tests for DirectoryClient pass * WIP - blobtransfermanager * WIP - Moving configuations for DM Blobs * WIP - blobtransferjobs * Updated storage solution file * WIP - pathScanner tests * WIP - champion scenarios * WIP - champ scenarios * WIP - small changes * WIP' * WIP * WIP * Create packages for DM Common and Blobs * Making Test packages for DM Common and Blobs; Added Readme, Changelog, BreakingChanges stubs * WIP - Added BlobDirectoryUploadOptions and StorageTransferStatus * Initial creation of filesystem scanner for DMLib (#21492) * Filesystem scanner refactored to non-static implementation (#21715) * Created filesystem scanner for DM Common * Modifed scanner to properly handle missing permissions; added test cases for filesystem scanner * Tests remade using native temp files; Scanner now throws errors that precede first successful yield * Changed Posix compatibility dependency * Edited versioning and READMEs to adhere to pipelines * Refactored scanner to non-static implementation; provided user configurable options for error handling * Removed test dependencies from central package list * Refactored scanner to non-static implementation; provided user configurable options for error handling * Removed test dependencies from central package list * Scanner will only work on one path for now * Capitalization on FileSystemScanner * Changed scanner to internal * Refactored FS scanner to use factory model and work better with mocking (#21894) * Refactored FS scanner to use factory model and work better with mocking * Rename class/simplify factory implementation * Return folders as well; preview of ctor logic changes (only throw if path nonexistent/malformed) * Changed parameter name for scan (continueOnError), re-exported API * More 
exported API changes * DMLib Skeleton start (#22336) * WIP - Removed DataMovement Blobs package, consildate to one package * WIP - Storage Transfer Jobs * WIP - remove dm blobs * WIP - Added TraansferItemScheduler * Ran exportapis * WIP - Resolve package conflicts * Addressed most PR comments * Ran export-api script * Made job for each specific operation for blobs * Added specific copy directory jobs, added option bags for copy scenarios * Ran ExportApi script * Update comments in StorageTransferManager * Rename BlobUploadDirectoryOptions -> BlobDirectoryUploadOptions * Run ExportAPI * PR Comments * Merge fix * Merge main update * WIP * Builds here without Azure.Storage.DataMovement.Blobs * Builds - DMLib common, DMlib blobs, DMlib samples * Added back in blobs tests * BlobTransferScheduler updated, logger updated, plan file updated * API generates * Rerun some tests, attempting to fix some parallel start problems * Resolve bad merge conflicts * DMLib builds but Blobs.Tests does not build * Conversion from internal job to job details * Run exports api update * Update logger information * Changed threadpool method to use inherit TransferScheduler * Remove previous implementation of getting job details, and combine into one * Removing mistake of committed files * Update to Job Plan header * Updating manager detail of API * Add abstract resumeJob to base storagetransfermanager * Update event arguments to have individual ones for each case, update progress handler to basic handler, update copy method * Removed base DM models, made base event args, made protected ctor StorageTransferManager * Changed Directory Download to DownloadTo, added overwrite options, updated internal pipeline transfer for directoryclient * change string to uri for local paths, remove unncessary things from blob job properties * WIP - changing job details out, added more champ scenarios regarding progress tracking * Updating Resume API, correcting event arg names, correctly linked internal deep 
copy for directory client * Readded upload directory tests with working json files, changed uploadDirectory API return type, Mild changes to some APIs, renamed part files * WIP * Cannot catch exception properly, tear downs upload call * Addressing Arch board comment * Some fixes from merging from main Remove test dependency on AesGcm for datamovement * WIP * Renamed Experimental to DataMovement * Fixed channel blocklist issue * WIP - changing event handler in uploader to trigger block status * Working commit block handler * WIP * Changes to Download and APIs regarding download * Copy Service Job cleanup * WIP - API changes to StorageResource and Controller * WIP * WIP - Aligning blobs API usage * WIP - Added dependenices to Azure.Storage.DataMovement.Test * WIP - Updated APIs to include checkpointing * WIP - ConsumeableStream -> GetConsumerableStream * WIP - make old API structure internal; todo: remove all old APIs * WIP - Remade API for blobs DM, removed CopyMethod * WIP -Update to StorageTransfer event args name * WIP - Removed factory calls, made dervived storage resource types public * Merged BlobDataController to main controller, renamed DataController to Transfermanager, removed ListType from StorageResource * WIP - Added Checkpointer API, removed unnecessary -1 enum values, updated job plan header bytes * WIP - removed options from respective derived storage resource calls, added options bag to blob storage resources * WIP - renamed CommitListTYpe to clearer type * WIP - Update to Copy Options api in blockblob storage, and samples * WIP - Updated APIs * WIP - Updated APIs to include offset streams * WIP - Rename writetooffsetoptions with storageresource prefixed * WIP - copy to and from and update to mmp job plan file * Added over the concurrency tuner * Remove ConfigureAwait from samples * WIP - changes to MMF, service to service copy and adding method to pass the token credential or bearer token to storage resource * WIP - fixes to event handler, 
removable of complete transfer check api * WIP - fix to closing stream when reading from local, setting blocklist order before commiting * WIP - tests * WIP - Remove unnecessary APIs and old code * Removing more unnecessary changes and test recordings for old tests * More removal of old test recordings * Removing BlobFolderClient / BlobVirtualDirectoryClient * Ran Export APIs, moved DataTransferExtensions to DataTransfer * ApiView Comments addressed * Renamed from Blobs.DataMovement to DataMovement.Blobs * Ran ExportApis * Updating assemblyinfo datamovement blobs namespace * Move over Storage Resource tests; Made some API corrections * Remove suppression on editorconfig * Added API for creation of blobs storage resource, max chunk size, more tests, fixes * Changed GetStorageResources to return a base class of storage resource; fixed bugs with append / sequential operations; Updated copy status handler for async copy * PR Comments - reverted necessary config files, moved constants to a separate file, rremvoed globalsupression files * Export APIs * PR Comments - removed merge mistakes, updated some xml comments, change some option bags, removed blobstorageresourcefactory, removed more globalsupression files * PR Comments - Move unnecessary return xml removed and removed localfilefactory * PR Comments - Removing leftover folder models from BlobVirtualFolderClient * Updating GetProperties comment XML, removing first value from cpu monitor reading, adding try block to delete file when failed download chunks occur * Fix to directory, and some test changes to use DataTransfer awaitcompletion * Update to tests and adding discovered length optimization * Ignore some tests for now, to push recording in a separate PR * Update readmes * Ignore more tests * Ignore more local directory tests * Temporarily remove nuget package link; readd when link works when package is released * Update snippets to include length Co-authored-by: Rushi Patel <[email protected]>
All SDK Contribution checklist:
This checklist is used to make sure that common guidelines for a pull request are followed.
Draft mode if it is:
General Guidelines and Best Practices
Testing Guidelines
SDK Generation Guidelines
*.csproj and AssemblyInfo.cs files have been updated with the new version of the SDK. Please double check nuget.org current release version.
Additional management plane SDK specific contribution checklist:
Note: Only applies to Microsoft.Azure.Management.[RP] or Azure.ResourceManager.[RP]
Management plane SDK Troubleshooting
If this is very first SDK for a services and you are adding new service folders directly under /SDK, please add new service label and/or contact assigned reviewer.
If the check fails at the Verify Code Generation step, please ensure:
generate.ps1/cmd to generate this PR instead of calling autorest directly.
Please pay attention to the @microsoft.csharp version output after running generate.ps1. If it is lower than current released version (2.3.82), please run it again as it should pull down the latest version.
Note: We have recently updated the PSH module called by generate.ps1 to emit additional data. This would help reduce/eliminate the Code Verification check error. Please run following command:
Old outstanding PR cleanup
Please note:
If PRs (including draft) has been out for more than 60 days and there are no responses from our query or followups, they will be closed to maintain a concise list for our reviewers.