Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TarEntry.ExtractToFile does not set the last write time of the file to the local time #73825

Closed
ayousuf23 opened this issue Aug 11, 2022 · 10 comments · Fixed by #74400
Closed
Assignees
Milestone

Comments

@ayousuf23
Copy link
Contributor

Description

When using TarEntry.ExtractToFile method, the last write time of the extracted file is the UTC time, not the local time.

Reproduction Steps

  1. Create a file with a last write time time in the past: Set-ItemProperty -Path "TestDrive:/oldfile.txt" -Name "LastWriteTime" -Value '2003-01-16 14:44'
  2. Create a tar archive containing this file using TarWriter
  3. Extract the file from the archive using TarEntry.ExtractToFile

Expected behavior

The last write time should have a time that is in the local timezone.

Actual behavior

The last write time is in the UTC timezone e.g. it is ahead of Pacific Time by 8 hours.

Regression?

No response

Known Workarounds

No response

Configuration

7.0.100-preview.6.22352.1
PowerShell 7.3.0-preview6
Windows 11, x64

Other information

No response

@ayousuf23
Copy link
Contributor Author

@carlossanlop

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Aug 11, 2022
@ghost
Copy link

ghost commented Aug 11, 2022

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

When using TarEntry.ExtractToFile method, the last write time of the extracted file is the UTC time, not the local time.

Reproduction Steps

  1. Create a file with a last write time time in the past: Set-ItemProperty -Path "TestDrive:/oldfile.txt" -Name "LastWriteTime" -Value '2003-01-16 14:44'
  2. Create a tar archive containing this file using TarWriter
  3. Extract the file from the archive using TarEntry.ExtractToFile

Expected behavior

The last write time should have a time that is in the local timezone.

Actual behavior

The last write time is in the UTC timezone e.g. it is ahead of Pacific Time by 8 hours.

Regression?

No response

Known Workarounds

No response

Configuration

7.0.100-preview.6.22352.1
PowerShell 7.3.0-preview6
Windows 11, x64

Other information

No response

Author: ayousuf23
Assignees: -
Labels:

area-System.IO

Milestone: -

@adamsitnik
Copy link
Member

I am moving it to .NET 8 milestone now (today is the last day to merge .NET 7 changes). If @carlossanlop decides that it's a bug we can always backport the fix to 7.0.

@adamsitnik adamsitnik added this to the 8.0.0 milestone Aug 12, 2022
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Aug 12, 2022
@danmoseley danmoseley modified the milestones: 8.0.0, 7.0.0 Aug 22, 2022
@danmoseley
Copy link
Member

We have to fix this, we can't set the wrong times on files. I'll fix it.

@danmoseley
Copy link
Member

Our date and time types are easy to get wrong -- the type system does not help prevent you mixing up UTC and local.

Looking at what Zip does first (offset at my home is -7):

  1. Reads original file on disk LastWriteTime.LocalDateTime, which produces DateTime with Kind= Local.

  2. Converts to DateTimeOffset using implicit cast, producing one with -7 offset.

  3. To set on the file reads dateTimeOffset.DateTime property and converts that into what it writes into the .zip.

  4. Reading the zip, it creates a DateTime with Kind = Unspecified.

  5. Converts to DateTimeOffset using the DateTimeOffset(DateTime) constructor, producing one with -7 offset.

  6. To set on the file reads dateTimeOffset.DateTime property (which creates Kind=Unspecified, local time) and passes that to SetLastWriteTime.

vs Tar

  1. Reads original file on disk LastWriteTimeUtc, which produces DateTime with Kind = Utc

  2. Converts to DateTimeOffset using implicit cast, producting one with 0 offset.

  3. To set on the file reads dateTimeOffset.DateTime property and converts that into what it writes into the .tar.

  4. Reading the tar, it creates a DateTimeOffset with a zero offset.

  5. To set on the file reads dateTimeOffset.DateTime property (which creates Kind=Unspecified, utc) and passes that to SetLastWriteTime. .. but it's UTC

Dizzying ..

The main gotcha seems to be getting dateTimeOffset.DateTime. The docs say

any information about the DateTimeOffset value's relationship to UTC is lost by the conversion when the DateTime property is used.

In the case of zip, it gets away with it because the DateTimeOffset has a zero offset, so it doesn't matter.

The constructor of DateTimeOffset over DateTime (or equivalently the implicit conversion) is also potentially lossy

For UTC and local DateTime values, the Offset property of the resulting DateTimeOffset value accurately reflects the UTC or local time zone offset.
However, for DateTime values whose Kind property is DateTimeKind.Unspecified, these two conversion methods produce a DateTimeOffset value whose offset is that of the local time zone

What I conclude from this is

  1. If you have a free hand in this kind of scenario, prefer DateTimeOffset to DateTime, and translate to and from DateTime only when you need to.
  2. When you translate to DateTime don't use the DateTime property on DateTimeOffset. use the UtcDateTime property or LocalDateTime property
  3. Avoid DateTime with Kind=Unspecified or if you can't, be sure to pass the timezone to the DateTimeOffset constructor, avoid the simple constructor and the implicit cast operator.

@mattjohnsonpint

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Aug 23, 2022
@mattjohnsonpint
Copy link
Contributor

mattjohnsonpint commented Aug 23, 2022

@danmoseley - the behaviors with DateTime and DateTimeOffset you describe are pretty well established. Especially DateTimeOffset.DateTime having DateTimeKind.Unspecified.

Part of the issue here is that FileSystemInfo only has LastWriteTime and LastWriteTimeUtc - both of which are DateTime values. LastWriteTime is converted to UTC based on Kind, and Unspecified is treated as local. There aren't properties that expose filesystem times as DateTimeOffset.

Also not sure if it's relevant here, but keep in mind there are differences in file system behavior. NTFS and most Unix file systems keep file times in terms of UTC, while FAT/FAT32/exFAT systems keep them in local time without offset. This tends to be an issue more with USB thumbdrives and camera SD cards though.

Ultimately with this issue, I think the problem must be somewhere in the TarEntry or TarWriter classes or dependent classes - wherever it is reading time from the file system. It should be fixed there - not on DateTime or DateTimeOffset

@mattjohnsonpint
Copy link
Contributor

mattjohnsonpint commented Aug 23, 2022

Oh I see your changes now. Yep - that seems right. The real culprit was here:

File.SetLastWriteTime(destinationFileName, lastWriteTime.DateTime);

That should probably be .LocalDateTime

@mattjohnsonpint
Copy link
Contributor

Or rather the inverse:

File.SetLastWriteTimeUtc(destinationFileName, lastWriteTime.UtcDateTime); 

@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Aug 23, 2022
@danmoseley
Copy link
Member

Thanks @mattjohnsonpint

Well established, but is there anything we could/should do to help avoid this kind of error? For example deprecate DateTmeOffset.DateTime.

@mattjohnsonpint
Copy link
Contributor

That would create too much pain, IMHO. Indeed if you have a DateTimeOffset and need to populate a DateTime field or property, it would be better to use LocalDateTime or UtcDateTime - but only if the target is expecting one of those. It very well may be that the target is fine with an Unspecified kind.

One example I can think of is when persisting a DateTimeOffset to a database that doesn't have a native DateTimeOffset type. You would need the .DateTime and .Offset properties separately, without modification.

I could see it might be valuable to have an analyzer that understands many common targets (like LastWriteTime and LastWriteTimeUtc) and warns you in those cases. Perhaps it could even be smart enough to see that there are pairs of properties with "Utc" or "Local" in their names. This would be similar to the existing "Method has async overload" analyzer that warns you when there's both Foo and FooAsync and you're calling Foo in an async context.

Also worth pointing out here that the inverse operation is another gotcha - specifically, the DateTime to DateTimeOffset implicit operator. I would flag any usages of that with an analyzer. I think there's an open issue about that already somewhere.

@ghost ghost locked as resolved and limited conversation to collaborators Sep 22, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants