Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optional root in Path and support for roots for Windows #170

Closed
szymon-rd opened this issue May 8, 2023 · 12 comments · Fixed by #196
Closed

Add optional root in Path and support for roots for Windows #170

szymon-rd opened this issue May 8, 2023 · 12 comments · Fixed by #196
Milestone

Comments

@szymon-rd
Copy link
Contributor

szymon-rd commented May 8, 2023

Per the following discussion: #136
I propose that we:

  • Introduce a getter for optional root in Path
  • Return the root in segments. Java's Path returns segments without the drive letter, but keeps the information. I.e. - the path is relative, root is not present, and the .getRoot on segments return null, but the segments can be brought back to be absolute with .toAbsolutePath. segments in os-lib do something different that Java's Path iterator, as it returns only Strings, not relative paths that can be made absolute.
  • TBD: Add RootPath class to expose apply accepting a drive letter. It could be the type of os.root, so people could do either os.root / ... or os.root("C:") / .... We could also add os.drive("C:") or something like that, and leave the os.root defaulting to the system drive.
@szymon-rd
Copy link
Contributor Author

szymon-rd commented May 8, 2023

@lefou @lihaoyi, if you think my proposal is a sensible direction to go, I can pick up this task for implementation.

@lihaoyi
Copy link
Member

lihaoyi commented May 8, 2023

I think it sounds reasonable

@lihaoyi
Copy link
Member

lihaoyi commented May 8, 2023

We do a similar thing with os.resource / "bar" / "qux", which can be called as os.resource(foo) / "bar" / "qux" to pass in an explicit classloader

@lefou
Copy link
Member

lefou commented May 8, 2023

We definitely need some test coverage and documentation for the edge cases, e.g. making a path relative to another fails when the root is not the same.

@szymon-rd
Copy link
Contributor Author

szymon-rd commented May 9, 2023

There is one issue that has to be given some consideration:

If we return the root (drive letter) in segments on Windows, then we run into a couple of problems:

  • The same code would return different values for Windows and Linux (offset by one on Windows).
  • Existing code could be broken without an error; we would change the behavior of segments by offsetting it by 1. Some people could bump the versions and only notice once it was too late.
  • It could be confusing as it would differ from the Java Path's behavior.

Instead, we could add a non-optional root and say it is / on Linux and Mac (and include it in segments). It solves the first problem but makes the second even worse. Let's consider making the root non-optional, given that / is de facto the root. However, adding the root to segments seems like a no-go.

The segments method seems to have problematic behavior anyway. In Java's Path, the relative path is returned, not a String, and it persists the context that allows the developer to reconstruct the absolute path from it. Iterating on segments in os-lib is more challenging because it returns just a String.

I propose changing the segments to return Iterator[Path] instead of Iterator[String]. It would allow operating on the segments more easily and fix the mentioned problems. It would also be a breaking source change; therefore - it would not break at runtime.

@lihaoyi
Copy link
Member

lihaoyi commented May 9, 2023

Do we have to return root in segments? what if we just don't?

Anyone who wants it can call .root, and anyone who wants the equivalent Java paths can call .toIO/ .toNIO. I don't think being able to reconstruct the os.Path purely from its segments is a hard requirement

@lefou
Copy link
Member

lefou commented May 9, 2023

Now, we claim that there should be no need to use lower level Java API, which isn't fully true unless we cover Windows file system roots. But there is a reason, why Unix-like OSes don't have the "drive" distinction. It sucks. I think, we should not include the root in the segments, but handle it properly in (de-)serialization.

@lefou
Copy link
Member

lefou commented May 9, 2023

Maybe, we can just invent some new type to deal with full Windows paths including a drive letter and provide some way to resolve it from a os.Path or vice versa.

@lihaoyi
Copy link
Member

lihaoyi commented May 9, 2023

I wonder if #167 is related to this issue? JVM on linux cna have multiple filesystem roots too IIRC, with ZipFileSystems and in-process virtual file systems, so it's not purely a windows-ism. Presumably whatever we do for Windows should work for those as well

@szymon-rd
Copy link
Contributor Author

Maybe, we can just invent some new type to deal with full Windows paths including a drive letter and provide some way to resolve it from a os.Path or vice versa.

One could argue that root on Unix-like OSes is just /, and on windows C:\. That gives somewhat consistent behavior.

@lefou
Copy link
Member

lefou commented May 10, 2023

JVM on linux cna have multiple filesystem roots too IIRC, with ZipFileSystems and in-process virtual file systems, so it's not purely a windows-ism. Presumably whatever we do for Windows should work for those as well

Do we target these use cases? Do we have a way in os-lib API to select a files system?

@lihaoyi
Copy link
Member

lihaoyi commented Aug 8, 2023

Do we target these use cases? Do we have a way in os-lib API to select a files system?

I've never thought about it, but if we're aiming for OS-Lib to be a feature-complete wrapper of the java.io/java.nio filesystem APIs, then being able to operate across filesystem roots does seem to make sense. That means Windows drives, ZipFileSystem, in-memory JimFS, etc.

And I think it shouldn't be too much complexity either. I think once we get the os.Paths able to support different filesystem roots, then everything else should mostly just work since os.* mostly just forwards to java.io/java.nio underneath. There may be some caveats (e.g. moves not being atomic across filesystem boundaries), but no more caveats than is present in the underlying filesystem APIs

lihaoyi pushed a commit that referenced this issue Sep 12, 2023
Thanks for a great library! Hopefully this change will increase interest
among windows shell environment developers.

This fixes #201

The essence of the fix is to support paths with a leading slash (`'/' or
'\'`) on Windows.

In this PR, this path type is referred to as `driveRelative`, due to
subtle differences in how it is handled on `Windows` versus other
platforms.

Example code:
```scala
    val p1 = java.nio.file.Paths.get("/omg")
    printf("isAbsolute: %s\n", p1.isAbsolute)
    printf("%s\n", p1)
    printf("%s\n", os.Path("/omg"))
```
Output on Linux or OSX:
```sh
isAbsolute: true
/omg
/omg
```
On Windows:
```scala
isAbsolute: false
\omg
java.lang.IllegalArgumentException: requirement failed: \omg is not an absolute path
        at os.Path.<init>(Path.scala:474)
        at os.Path$.apply(Path.scala:426)
        at oslibChek$package$.main(oslibChek.sc:11)
        at oslibChek$package.main(oslibChek.sc)
```
### Background
On Windows, a `driveRelative` path is considered relative because a
drive letter prefix is required to fully resolve it.
Like other platforms, Windows also supports `posix relative` paths,
which are relative to the `current working directory`.

Because all platforms, including `Windows`, support absolute paths and
`posix relative` paths, it's convenient when writing
platform-independent code to view `driveRelative` paths as absolute
paths, as they are on other platforms.

On Windows, the `current working drive` is an immutable value that is
captured on jvm startup. Therefore, a `driveRelative` path on Windows
unambiguously refers to a unique file with a hidden drive letter.

This PR treats `driveRelative` paths as absolute paths in Windows, and
has no effect on other platforms.

#### Making `os/test/src/PathTests` platform independent

This PR enables `Unix()` tests in `os/test/src/PathTests.scala` so they
also verify required semantics in Windows.

### How this PR relates to #170 and #196
The purpose of #196 seems to be to add full support on Windows, without
resorting to `java.nio` classes and methods.
The addition of `os.root(...)` or `os.drive(...)` would be convenient
for people writing windows-only code, whereas this PR is intended to
allow writing platform-independent code, so the concerns seem to be
orthogonal.

It seems important not to alter `Path.segments` in a way that is
incompatible with `java.nio` segments.
 
An alternate way to preserve drive letter information would be to add a
method to one of the `Path.scala` traits that returns a drive letter if
appropriate on `Windows` and an empty String elsewhere.

```scala
trait PathChunk {
  def segments: Seq[String]
  def rootPrefix: String  // empty String, or Windows drive letter
  def ups: Int
}
```

With this approach, the following assertion would be valid on all
platforms:

```scala
os.Path(pwd.rootPrefix) ==> pwd
```

Then `driveRoot` in this PR would become `pwd.rootPrefix`.

There is another (perhaps never used?) very subtle feature of `Windows`
filesystem: each drive has a different value for `pwd`. It may not be
necessary to model this in `os-lib`, since these values are immutable in
a running jvm, and there are existing workarounds.


### Additional Details regarding unique aspects of Windows Filesystem
Paths
For completeness, the following lengthy `scala` REPL session illustrates
Path values returned by `java.nio.file.Paths.get()`, some of which might
be surprising.
<details>

Relevant information: My system drives are C: and F:, but not J:

First, some unsurprising results:
```scala
c:\work-directory> scala.bat
scala> import java.nio.file.Paths
Welcome to Scala 3.3.1-RC5 (17.0.2, Java OpenJDK 64-Bit Server VM).
Type in expressions for evaluation. Or try :help.

scala> Paths.get("/")
val res0: java.nio.file.Path = \

scala> Paths.get("/").isAbsolute
val res2: Boolean = false

scala> Paths.get("/").toAbsolutePath
val res1: java.nio.file.Path = C:\

scala> Paths.get("c:/").toAbsolutePath
C:\work-directory

scala> Paths.get("f:/").toAbsolutePath
F:\

scala> Paths.get("f:/..").normalize.toAbsolutePath
f:\

scala> Paths.get("c:/..").normalize.toAbsolutePath
c:\
```
Now some less obvious results:
```scala
scala> Paths.get("c:").toAbsolutePath
C:\work-directory

scala> Paths.get("f:").toAbsolutePath
F:\

scala> Paths.get("f:/..").normalize.toAbsolutePath
f:\

scala> Paths.get("c:/..").normalize.toAbsolutePath
c:\

scala> Paths.get("c:..").normalize.toAbsolutePath
C:\work-directory

scala> Paths.get("f:..").normalize.toAbsolutePath
F:\..

scala> Paths.get("F:Users")
val res1: java.nio.file.Path = F:Users

scala> Paths.get("F:Users").toAbsolutePath
val res2: java.nio.file.Path = F:\work-directory\Users

scala> Paths.get("j:").toAbsolutePath
java.io.IOError: java.io.IOException: Unable to get working directory of drive 'J'
  at java.base/sun.nio.fs.WindowsPath.toAbsolutePath(WindowsPath.java:926)
  at java.base/sun.nio.fs.WindowsPath.toAbsolutePath(WindowsPath.java:42)
  ... 35 elided
Caused by: java.io.IOException: Unable to get working directory of drive 'J'
  ... 37 more
```

The error message returned for a non-existing drive seems to imply that
existing drives have a `working directory`.
This is consistent with the `Windows API` as described here:

[GetFullPathNameA](https://learn.microsoft.com/en-gb/windows/win32/api/fileapi/nf-fileapi-getfullpathnamea)

I can set the `working drive` for F: in a `CMD` session before I start
up the `jvm`, and it does affect the output of `Paths.get()`.
```CMD
c:\work-directory>F:

f:\>cd work-directory

F:\work-directory>scala.bat
Welcome to Scala 3.3.1-RC5 (17.0.2, Java OpenJDK 64-Bit Server VM).
Type in expressions for evaluation. Or try :help.

scala> import java.nio.file.Paths

scala> Paths.get("/")
val res0: java.nio.file.Path = \

scala> Paths.get("/").toAbsolutePath
val res1: java.nio.file.Path = C:\

scala> Paths.get("f:")
val res2: java.nio.file.Path = f:

scala> Paths.get("f:").toAbsolutePath
val res3: java.nio.file.Path = F:\work-directory

scala> Paths.get("c:").toAbsolutePath
val res4: java.nio.file.Path = C:\work-directory

scala> Paths.get("c:")
val res5: java.nio.file.Path = c:
```

Regarding the interpretation of a drive letter expression not followed
by a '/' or '\\', the
rule is that it represents the `working directory` for that drive.

These values are immutable: after the `JVM` starts up it's not possible
to change a drive `working directory`.
</details>
lihaoyi added a commit that referenced this issue Oct 18, 2023
Resolve #170 

This PR adds a `def root(root: String, fileSystem: FileSystem): Path` in
`os` package object. Additionally, it adds `root` and `filesystem`
members to `Path`. It addresses two problems:
 - Specifying custom roots for a path
 - Using custom filesystems

## Filesystem

Filesystem was added as a field of the `os.Path`. It was not added as a
Root field for simplicity and consistency with Java's `nio`. Root is a
part of the path; filesystem is the context of the whole path. One
filesystem can have many roots; one root can have many subdirectories.

## Root as String

A string is the simplest type that can represent the path's root. It
also corresponds to the mental model of the path - usually, developers
perceive it as a String. Therefore - String was chosen as the type of
root.

Pull request: #196

---------

Co-authored-by: Li Haoyi <[email protected]>
@lefou lefou added this to the after 0.9.1 milestone Oct 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants