Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement std::io traits for FileAccess #446

Closed
fpdotmonkey opened this issue Oct 8, 2023 · 14 comments · Fixed by #473
Closed

Implement std::io traits for FileAccess #446

fpdotmonkey opened this issue Oct 8, 2023 · 14 comments · Fixed by #473
Labels
c: engine Godot classes (nodes, resources, ...) feature Adds functionality to the library

Comments

@fpdotmonkey
Copy link
Contributor

API users interested in File IO in the res:// and user:// file systems need to work with Godot's FileAccess API. gdext should provide a Rusty API for this by implementing some subset of std::io::{BufRead, IsTerminal, Read, Seek, Write} for godot::engine::FileAccess.

@Bromeon Bromeon added feature Adds functionality to the library c: engine Godot classes (nodes, resources, ...) labels Oct 8, 2023
@Bromeon
Copy link
Member

Bromeon commented Oct 8, 2023

There is a more general question, regarding how far gdext should go and make engine APIs rusty. We would end up with conflicting APIs, e.g.

fn FileAccess::seek(&mut self, position: i32);
fn FileAccess::seek_end(&mut self, position: i32);
// vs.
fn Seek::seek(&mut self, pos: SeekFrom) -> Result<u64>;

Even if we can resolve naming conflicts, modifying godot::engine classes is quite unprecedented, so it would be good to have a more general strategy. Maybe it makes sense to define rusty wrapper types -- similar to how we provide extensions for load() and get_node_as()?

Can you elaborate what motivated your use case, beyond "it's possible"? 🙂

@fpdotmonkey
Copy link
Contributor Author

fpdotmonkey commented Oct 8, 2023

This came up in a discussion on Discord on getting data for a SQLite table into a relevant library where we found that getting the bytes for a files to require some massaging. However, I think that case landed on finding a better solution, and I dont personally have an application for it. So firmly in the realm of its possible and sounds reasonable, but doesnt solve an immediate problem.

@PgBiel
Copy link
Contributor

PgBiel commented Oct 8, 2023

Maybe it makes sense to define rusty wrapper types -- similar to how we provide extensions for load() and get_node_as()?

I think that makes more sense, and we could perhaps even decouple the wrappers from the underlying Godot things - have something that just works:tm: by using the underlying Godot APIs for it (and, if you need full flexibility, then just call them manually). In this example, we'd have some Filesystem wrapper or something which uses FileAccess and similar things under the hood.

(Of course, we could be kinda limited in the amount of decoupling possible since e.g. it could be necessary to place a node in the scene tree or something like that. But, in principle, it could be a nice path to follow.)

@StatisMike
Copy link
Contributor

StatisMike commented Nov 8, 2023

So I've implemented Read and Write on a FileAccess wrapper struct in my project. The implementation allows serializing and deserializing binary data on the fly, without ever loading whole Vec<u8> in the memory. There are some hacky things there, so it is not intended to be a part of exported API of my crate. These hacky things allows to read and write additional data at the beginning of the file, without seeking the positions, just using the built-in functionality of FileAccess which moves the cursor automatically after get_* and store_* calls.

https://github.com/StatisMike/godot_io/blob/3-provide-binary-saving-format/godot_io_defs/src/bin_resource.rs#L13-L111

With some pointers of how this functionality could be done safer, and more useful and how it should be implemented into gdext I would happily try to provide a good quality PR.

@StatisMike
Copy link
Contributor

@Bromeon This is something I came upon after cleaning up the implementation mentioned above. Any pointers, also regarding the style in which it could be implemented in gdext?

https://github.com/StatisMike/godot_io/blob/3-provide-binary-saving-format/godot_io_defs/src/file_access.rs

@Bromeon
Copy link
Member

Bromeon commented Nov 9, 2023

@StatisMike thanks for the initial implementation!

Some feedback: the current implementation is quite a thin and transparent wrapper around Gd<FileAccess>. We could maybe provide the most commonly used APIs directly on the wrapper class. Concretely:

  • There could be variants of open methods, similar to File::open()
    • These could map directly to the static FileAccess::open_* equivalents.
    • Maybe there are ways to combine/split some, to make usage more idiomatic. There's one default parameter compression_mode.
  • Some documentation would be nice, notably:
    • That the type is reference-counted.
    • That Drop will not close the file, but only when the last reference goes out of scope.
    • Link to Rust FileAccess class.
  • Name FaWrapper is not very descriptive, I would name the struct GFile (originally I intended GodotFile, but this might be more in line with the upcoming GodotString -> GString rename).
  • Instead of the 3 get variants, I would just provide a method fn to_inner() -> Gd<FileAccess> which copies the pointer, we can see if more is needed. If the API on the wrapper is rich enough, such cases should be rare.

Feel free to open a PR once those changes are applied 🙂

@StatisMike
Copy link
Contributor

StatisMike commented Nov 9, 2023

@Bromeon Struct name was only prototyping, will refer to the struct below as GFile.

My rationale behind (at very least) get_mut() -> &mut Gd<FileAccess> is to make it obvious that after moving the Gd<FileAccess> pointer either by reading or writing, the wrapper position should be also updated manually, by calling GFile::update_position().

I don't really think that there is any option to update the wrapper position after get_inner is cloned and used mutably - and I think the original Gd sitting inside GFile would also have its cursor updated. It could lead to potentials bugs and confusion. That's why I would prefer limiting the access to inner pointer as much as possible. GFile::to_owned() consumed the GFile and extracted the Gd<FileAccess>, putting it inside the GFile required using its constructor, which got its position updated.

I could potentially expose all of Gd<FileAccess> methods on the GFile and then the wrapper position could be updated directly without Gd<FileAccess>::get_position() calls (at least on methods where the position change is obvious and easy to track) - though I generally focused on implementing the std::io traits. Certainly could be done - will check out how many of them there are and how viable it is.

AD the gdext structure, where should the GFile be located? Referenced GodotString lies in godot_core::builtin, but it feels like a bad place for a wrapper struct like GFile or is it?

EDIT:
AD exposing the internal methods: there's also an error handling problem. All Gd<FileAccess> methods don't produce error themselves: we could check for them and make every wrapper method call a Result, but I don't really know how performant it is to make so many calls to Godot from Rust. By requiring the user to get the mutable pointer to inner value we move the responsibility for error checking on the user, and the choice in error handling to Godot. Alternatively, we could also expose the get_error() method, which would make the call itself and return Result<(), std::io::Error>, like the std::io::* traits implementations.

@Bromeon
Copy link
Member

Bromeon commented Nov 9, 2023

My rationale behind (at very least) get_mut() -> &mut Gd<FileAccess> is to make it obvious that after moving the Gd<FileAccess> pointer either by reading or writing, the wrapper position should be also updated manually, by calling GFile::update_position().

Oh, I missed that. This is actually quite a pitfall: users will certainly forget this, naming it get_mut won't help much here.

Instead of storing the position, would it be possible to sync it with Godot's FileAccess before every Rust call that relies on it? It's just an FFI call, would likely be negligible compared to larger file operations.


I could potentially expose all of Gd<FileAccess> methods on the GFile and then the wrapper position could be updated directly without Gd<FileAccess>::get_position() calls (at least on methods where the position change is obvious and easy to track) - though I generally focused on implementing the std::io traits. Certainly could be done - will check out how many of them there are and how viable it is.

I was thinking of locking it down completely, too -- however, the issue is that you lose all interoperability with Godot APIs. Not sure if there are many cases where a FileAccess is expected as a parameter, but you are then also unable to share it with GDScript via #[func] return values or Object.call() arguments.

Maybe a hybrid approach is possible: have a field position: Option<u64> which tracks the position in Rust. The first time someone calls to_inner(), you set the option to None. From then on, query Godot each time when a position is needed. This could be incapsulated in a position() function that simply calls self.position.unwrap_or_else(...).


AD exposing the internal methods: there's also an error handling problem. All Gd<FileAccess> methods don't produce error themselves: we could check for them and make every wrapper method call a Result, but I don't really know how performant it is to make so many calls to Godot from Rust.

I wouldn't worry about performance here. We're doing disk access, sometimes even with encryption or compression; a few indirect function calls don't fall into weight. If they do, users are free to always use FileAccess directly. This wrapper is primarily for convenience.


By requiring the user to get the mutable pointer to inner value we move the responsibility for error checking on the user, and the choice in error handling to Godot. Alternatively, we could also expose the get_error() method, which would make the call itself and return Result<(), std::io::Error>, like the std::io::* traits implementations.

I think this defeats the point of having an "idiomatic Rust" wrapper. If we can make user's life easier while we're at it, let's do it 🙂

(Btw, I don't want to impose extra work on you here -- it's totally fine if the initial PR comes without error checking. I'm just stating where I could imagine the API heading)

@StatisMike
Copy link
Contributor

StatisMike commented Nov 9, 2023

@Bromeon No worries, I like to discuss Rust and happy to provide a good quality code to the community :)

I wouldn't worry about performance here. We're doing disk access, sometimes even with encryption or compression; a few indirect function calls don't fall into weight. If they do, users are free to always use FileAccess directly. This wrapper is primarily for convenience.

This calms me down a little, wrapper methods will then always check for errors and will actually produce std::io::Result<T>. I feel like this is more rusty way of doing things than returning T and hoping for the best.

AD the exposing inner, I think that wrapping most methods will be actually for the best and will actually make the possibility of needing an FileAccess itself perfectly menageable:

pub struct GFile {
  // The FileAccess is kept as private
  fa: Gd<FileAccess>,
  pos: u64,
  buffer: Vec<u8>,
  buffer_size: usize
}

impl GFile {
 // New GFile is constructed by calling GFile::open_* methods
   pub fn open(path: GodotString, flags: ModeFlags) -> std::io::Result<Self> { ... }
   pub fn open_compressed(path: GodotString, flags: ModeFlags, compression_mode: CompressionMode) -> std::io::Result<Self> { ... }
   pub fn open_encrypted(path: GodotString, flags: ModeFlags, key: PackedByteArray) -> std::io::Result<Self> { ... }
   pub fn open_encrypted_with_pass(path: GodotString, flags: ModeFlags, pass: GodotString) -> std::io::Result<Self> { ... }
   
// Obvious cursor moves can be tracked on the Rust side without needing the call to `self.fa.get_position()`
  pub fn get_8(&mut self) -> IoResult<u8> {
    let val = self.fa.get_8();
    self.check_error()?;
    self.pos += 1;
    Ok(val)
  }
  
 // User can retrieve the `Gd<FileAccess>` from `GFile` by destroying the wrapper
  pub fn into_inner(self) -> Gd<FileAccess> {
    self.fa
  }
}
  
  // But can also easily create the `GFile` from the `Gd<FileAccess>` anytime they want:
impl From<Gd<FileAccess>> for GFile {
  fn from(value: Gd<FileAccess>) -> Self {
    let pos = value.get_position();

    Self {
      fa: value,
      pos,
      buffer: Vec::new(),
      buffer_size: Self::DEFAULT_BUFFER_SIZE
    }
  }
}

If you feel that it is a deal-breaker, I will call get_position() to get updated position from Godot on every operation and skip relying on it in GFile (outside of Read, Write and BufRead trait implementations, which I feel that should be as optimized as they can). Or fall back to your proposition of keeping the internal position as None

Edit: Actually, disregard my musing about the position. Actually the tracking of the position is useful only in handful of methods and it's impact should be negligible

Edit: I'm thinking about skipping some wrapping methods:

  • store_buffer, get_buffer, seek, seek_end (that's why we implement std::io::* traits)
  • get_file_as_string (possible duplicate of get_as_text - both returns all contents of the file as GodotString, get_as_text provide additional option)

@Bromeon
Copy link
Member

Bromeon commented Nov 10, 2023

Yes, definitely skip those methods which are available in std::io.

About reading/writing integers of various sizes, should we provide this on the wrapper types? I guess yes?
The problem is that they assume a given endianess, but I guess that's OK if Godot sticks to one.

I would also rename them slightly, starting with read_ and write_.
Example:

pub fn read_u8(&mut self) -> IoResult<u8> {
    self.read_internal(FileAccess::get_8) // maybe closure needed
}

fn read_internal<T, F>(&mut self, read_fn: F) -> IoResult<T>
where
    F: Fn(&FileAccess) -> T,
{
    let val = read_fn(&*self.inner);
    self.check_error()?;
    self.pos += std::mem::size_of::<T>();
    Ok(val)
}

(probably needs some tweaks, but that's the general idea).

Feel free to open a pull request, it's easier to comment on concrete code lines than having a discussion here, and the code being referred to is also not lost over time.


There are two ways to deal with wrapper vs. inner:

  1. We allow reference-counting like FileAccess itself. That means the user can concurrently modify the file pointer. Not just in threads (this is not thread-safe, which is fine), but basically a &mut GFile does not mean one has exclusive access to the file pointer. This would for sure need to be documented.

  2. The alternative is to make it more strict, like you suggested. So instead of to_inner(), we have into_inner() that consumes the unique object, so at that point there's no GFile anymore. That means there is no sharing. Could be that it's less ergonomic, but maybe it's good to start with this and open it up over time if really necessary.

@StatisMike
Copy link
Contributor

I've actually removed the pos field completely. My overestimation of its usefulness came from initial focus on traits, namely Read would use it extensively to spare calls through FileAccess::get_position(). After reviewing it again, I deemed it as overcomplicated, especially if FileAccess::get_error() was already called to check for eventual error during operations. get_position() is now called at the beginning of every GFile::read() operation.

Another strange thing I've seen is that the FileAccess::get() calls are marked as non-mutable, which can be interpreted wrongly - the cursor on the Godot side is moved, even if the file itself isn't changed at all. Additionally, get_pascal_string() is the sole get_* method that is actually called mutably

@Bromeon
Copy link
Member

Bromeon commented Nov 10, 2023

Another strange thing I've seen is that the FileAccess::get() calls are marked as non-mutable, which can be interpreted wrongly - the cursor on the Godot side is moved, even if the file itself isn't changed at all.

Yeah, Godot's API is full of wrong const-correctness, unfortunately. Something we can fix in Rust though 🙂

After reviewing it again, I deemed it as overcomplicated,

To me it seemed like the position would be used as long as we are still in "Rust mode", and it's safe to say that no one else modifies the position of the FileAccess, because the latter is locked away. It's fine to always use Godot's position, but this part didn't seem overly complex to me; maybe I'm missing something?

@StatisMike
Copy link
Contributor

StatisMike commented Nov 10, 2023

I've opened the WIP PR with current work, most of the documentation is still missing. I also plan to create itests, at least for trait implementations and error handling, which are the sole substantial additions above simle Godot functions redirection.

I would also rename them slightly, starting with read_ and write_.

Renaming the functions to read_* and write_* could actually make a clash with some FileAccess methods. Eg. Read provides read_line() method, which returns String, but there is get_line() method on FileAccess that returns GodotString - unless we want to ditch the GodotString returning method for String returning one.

I think we also should mark the reexports through the GFile (like these get_ methods somehow, to make it obvious that they are just thin wrapper. It could also resolve the name clash stemming from renaming get_* to read_* and set_* to write_* (gread_*/gwrite_*?). Though if it stays at not renaming them, I think it should also be clear from their documentation that they are as simple as get

position is needed only in read() method (where it potentially could get expensive for repeated calls) and in Seek trait implementation, and only when moving cursor while basing on SeekFrom::Current() (which I don't think will be used as much, actually). The problem steems from implementing From<Gd<FileAccess>>, which I think is intuitive, but could lead to situation where the pointer is cloned before calling From, and its second instance is used somewhere else. Unless we will make instances created by From always check the position from godot, and the one that never gave its inner pointer use the internal position... unless there are some additional caveats in pointer movements that I'm not aware of, which is likely. Thats why after considering it I deem the pointer-tracking in GFile mid-risk low-reward, so better to drop it completely.

Yeah, Godot's API is full of wrong const-correctness, unfortunately. Something we can fix in Rust though 🙂

So I guess I should make the get_* methods intake &mut self?

@Bromeon
Copy link
Member

Bromeon commented Nov 10, 2023

Renaming the functions to read_* and write_* could actually make a clash with some FileAccess methods. Eg. Read provides read_line() method, which returns String, but there is get_line() method on FileAccess that returns GodotString - unless we want to ditch the GodotString returning method for String returning one.

Good point, maybe get_line could then be renamed to get_line_gstring or so.

I think we also should mark the reexports through the GFile (like these get_ methods somehow, to make it obvious that they are just thin wrapper.

Why? From the user perspective, it doesn't matter. They just want to read/write files.
Where distinction matters is if the involved types are String or GodotString, and I agree this could be named accordingly.

So I guess I should make the get_* methods intake &mut self?

Yes, would be nice! 🙂
Also added a lot of comments in the PR itself. Let's maybe continue there to avoid parallel discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c: engine Godot classes (nodes, resources, ...) feature Adds functionality to the library
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants