[Improvement]: Add list, remove, clear, and size operations to shuttle-persist #1066

sentinel1909 · 2023-07-02T20:42:50Z

Description of change

This PR adds list and remove operations to shuttle-persist and closes issue #1052.
Looking for feedback to see if I'm on the right track. I feel I'm not handling errors in quite the right way yet. The only way to learn is to practice and ask for feedback.

How has this been tested? (if applicable)

Created two unit tests, test_list and test_remove, to test the new functionality. All tests passed.

jonaro00 · 2023-07-03T00:50:19Z

Use cargo fmt to get past that step in CI.
You can refactor the error handling by following the pattern in the other methods. The key is map_err and the ? shorthand return.
You can probably squeeze in the remove operation in this same PR. This feature is not in a rush, and it is nice if the updates are released together.
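
For readers following along, the map_err plus ? pattern mentioned above can be sketched roughly as follows. This is a minimal, std-only illustration: `StoreError` and `read_key` are hypothetical names standing in for the crate's own error enum and methods, and the Display impl is written by hand instead of using thiserror.

```rust
use std::fmt;
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical error type, standing in for the crate's `PersistError`.
#[derive(Debug)]
pub enum StoreError {
    Open(io::Error),
    Read(io::Error),
}

impl fmt::Display for StoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StoreError::Open(e) => write!(f, "failed to open file: {}", e),
            StoreError::Read(e) => write!(f, "failed to read file: {}", e),
        }
    }
}

impl std::error::Error for StoreError {}

/// Reads a whole file into a `String` (hypothetical helper).
pub fn read_key(path: &Path) -> Result<String, StoreError> {
    // `map_err` wraps the io::Error in the matching variant;
    // `?` returns early with that wrapped error on failure.
    let mut file = File::open(path).map_err(StoreError::Open)?;
    let mut contents = String::new();
    file.read_to_string(&mut contents)
        .map_err(StoreError::Read)?;
    Ok(contents)
}
```

Because a tuple variant such as `StoreError::Open` is itself a function from `io::Error` to `StoreError`, it can be passed to `map_err` directly without a closure.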

sentinel1909 · 2023-07-03T06:25:00Z

> Use cargo fmt to get past that step in CI.
>
> You can refactor the error handling by following the pattern in the other methods. The key is map_err and the ? shorthand return.
>
> You can probably squeeze in the remove operation in this same PR. This feature is not in a rush, and it is nice if the updates are released together.

I will be going back to the drawing board, as the saying goes. I'm having great difficulty with the ? operator and error conversion.

Will definitely work in the remove feature.

Glad there's no rush here because this is a meaty bone for me to chew on and it's going to take more time.

sentinel1909 · 2023-07-03T15:32:07Z

One thing I forgot to ask, comments or no comments? I noticed there aren't any elsewhere around where I'm working.

jonaro00 · 2023-07-03T16:00:31Z

If they provide value/context about why the code does X when it can't be directly understood from reading the code, you can leave them in (and even add ones to the other methods). It is also good to use doc comments /// on public functions, but the function names are pretty self-explanatory in this case.
In summary, yes, I think some of your comments are too verbose.

sentinel1909 · 2023-07-03T19:13:11Z

Tests are a failin' where it matters...having a "but it worked on my machine" moment. :) Back to the head banging.

jonaro00 · 2023-07-03T22:34:54Z

It is starting to look really good now!

One thing that I would like to see is for the file name to be stripped of .bin when listing, so that (pseudo-code) remove(list()[0]) will work.

Did you also want to add clear()? It should not be much more than one line of code.

sentinel1909 · 2023-07-03T22:42:36Z

Ooooo...challenge accepted :) The clear() method would nuke all the keys from orbit I assume?

sentinel1909 · 2023-07-03T23:03:52Z

> Ooooo...challenge accepted :) The clear() method would nuke all the keys from orbit I assume?

Looks like fs::remove_dir_all will do the trick and is indeed a one-liner. fs::remove_dir won't cut it because it fails if the directory isn't empty.
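
The difference between the two std calls can be shown with a small sketch. `try_both` is a hypothetical helper written only for this illustration; it creates a folder with one file in it and then attempts both removals.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical demo helper: fills `dir` with one file, then tries both calls.
/// Returns (did `remove_dir` fail?, did `remove_dir_all` succeed?).
pub fn try_both(dir: &Path) -> io::Result<(bool, bool)> {
    fs::create_dir_all(dir)?;
    fs::write(dir.join("key.bin"), b"value")?;

    // `remove_dir` only removes empty directories, so this is expected to fail.
    let plain_failed = fs::remove_dir(dir).is_err();

    // `remove_dir_all` removes the directory together with its contents.
    let recursive_ok = fs::remove_dir_all(dir).is_ok();

    Ok((plain_failed, recursive_ok))
}
```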

sentinel1909 · 2023-07-04T01:07:03Z

> One thing that I would like to see is for the file name to be stripped of .bin when listing, so that (pseudo-code) remove(list()[0]) will work.

Ok, clear() was easy and is done with a test. For the refinement to list, it should strip off the .bin from the file extension somewhere before going into the list vector that gets returned, correct? This would allow you to pass in a list with the index as you say to another method like remove. This will cause some headscratching but I'm gonna get it! :)

sentinel1909 · 2023-07-04T04:39:48Z

The remove method will accept a list_item parameter, which is a string. This could be the output from list(), called as you outlined in the comment above, i.e. remove(list()[0])

jonaro00 · 2023-07-04T10:02:18Z

The current state of the remove function will just iterate over each item and then delete the last index. I still think remove should just delete one key. What I meant with remove(list()[0]) is that remove() should be able to successfully delete any arbitrary key from the vector that list() returns (in my example, the 0'th). This was why I hinted about removing the .bin suffix.

sentinel1909 · 2023-07-04T13:15:22Z

> The current state of the remove function will just iterate over each item and then delete the last index. I still think remove should just delete one key. What I meant with remove(list()[0]) is that remove() should be able to successfully delete any arbitrary key from the vector that list() returns (in my example, the 0'th). This was why I hinted about removing the .bin suffix.

Understood.

jonaro00

You are making good progress! 🥳 Here are some suggestions.

jonaro00 · 2023-07-05T11:23:25Z

resources/persist/src/lib.rs

+                    .into_os_string()
+                    .into_string()
+                    .unwrap_or_else(|_| "path contains non-UTF-8 characters".to_string())
+                    .split(".bin")


Splitting here works, but can break: if a key aaa.binbbb is saved, that would make the file aaa.binbbb.bin, which after this split+collect becomes aaabbb. I would say trimming the file extension from the end is better. You can check out std::path::Path::with_extension.

Struggling here. Not seeing how std::path::Path::with_extension helps. I'm getting better at interpreting the docs content, but I have a ways to go. Will resume the head banging.

My mistake. The dir.path() method gives you a PathBuf, and there the method is called set_extension(). The idea is to modify the path of the dir entry before converting it to a string, making it more "properly" handled.

> My mistake. The dir.path() method gives you a PathBuf, and there the method is called set_extension(). The idea is to modify the path of the dir entry before converting it to a string, making it more "properly" handled.

Yes!! I was just looking around on StackOverflow and came to the same conclusion...dir.path() gives back a PathBuf! Not being clear on that was the missing piece. Unfortunately I have to return to work today, but I will continue to work this evening.

Have adjusted the list method to a good form now, I feel.
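
To make the aaa.binbbb case from this thread concrete, here is a small sketch comparing the two approaches. Both function names are hypothetical; the second one uses `PathBuf::set_extension` as suggested above, and is not necessarily the form the PR ended up with.

```rust
use std::path::Path;

/// The approach from the reviewed diff: split on ".bin" and glue the pieces together.
/// Every occurrence of ".bin" disappears, not just the trailing extension.
pub fn key_by_split(file_name: &str) -> String {
    file_name.split(".bin").collect()
}

/// Strips only the final extension by editing the path before converting it.
/// Returns `None` if there is no file name or it is not valid UTF-8.
pub fn key_by_set_extension(path: &Path) -> Option<String> {
    let mut path = path.to_path_buf();
    // An empty extension removes the last extension (and its dot).
    path.set_extension("");
    path.file_name()?.to_str().map(str::to_owned)
}
```

With a stored file named aaa.binbbb.bin, the split version yields aaabbb, while the set_extension version yields the original key aaa.binbbb.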

jonaro00 · 2023-07-05T11:26:34Z

resources/persist/src/lib.rs

+    }
+
+    /// remove method deletes a key from the PersistInstance
+    pub fn remove(&self, key: String) -> Result<(), PersistError> {


Since save and load take the key as a &str, it makes sense that this also does. This means that you will need .as_str() in the test case.

Thanks. Waffled endlessly here. The compiler suggested it would be ok to borrow, in the test. I think you're right though, as_str() conveys intent better. Thanks for confirming that staying consistent with the rest is best.

Now using .as_str() in the associated test case.
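
As a small illustration of the point about &str parameters: a function that borrows the key accepts string literals, `&String` (through deref coercion), and an explicit `.as_str()` alike. `file_name_for` is a hypothetical helper, not a function from the crate.

```rust
/// Hypothetical helper: maps a key to its file name, borrowing the key.
pub fn file_name_for(key: &str) -> String {
    format!("{}.bin", key)
}

/// Shows the three equivalent call styles for an owned `String` and a literal.
pub fn call_styles() -> [String; 3] {
    let owned = String::from("abc");
    [
        file_name_for(owned.as_str()), // explicit conversion, states the intent
        file_name_for(&owned),         // deref coercion from &String to &str
        file_name_for("abc"),          // string literal is already a &str
    ]
}
```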

jonaro00 · 2023-07-05T11:28:23Z

resources/persist/src/lib.rs

+
+        persist.save("test_list", "test_list").unwrap();
+
+        let result = vec!["shuttle_persist/test_list/test_list".to_string()];


Should the resulting list item really contain the full path like this?

Fixed, list returns the key name only.

jonaro00 · 2023-07-05T11:32:18Z

resources/persist/src/lib.rs

+        assert_eq!(
+            clear_result.to_string(),
+            "failed to list contents of folder: No such file or directory (os error 2)"
+        );


Maybe it is unintuitive that calling list() on an empty persist store returns an error. If you also call create_dir_all in list() like save() does, you can make sure that the folder always exists when asked for, and an empty Vec can be returned.

Now just removing the files instead of the entire folder, test adjusted accordingly.
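
For comparison, the reviewer's suggestion of calling create_dir_all inside list() might look something like the sketch below. `list_keys` is a hypothetical free function working on a plain folder path with `io::Error`, not the crate's method or error type; the sort is added only to make the output deterministic.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical sketch: lists the keys (file stems) stored in `folder`.
/// The folder is created first, so an unused store yields an empty Vec
/// instead of a "No such file or directory" error.
pub fn list_keys(folder: &Path) -> io::Result<Vec<String>> {
    fs::create_dir_all(folder)?;
    let mut keys = Vec::new();
    for entry in fs::read_dir(folder)? {
        let path = entry?.path();
        let stem = path
            .file_stem()
            .and_then(|s| s.to_str())
            .ok_or_else(|| {
                io::Error::new(io::ErrorKind::InvalidData, "file name is not valid UTF-8")
            })?;
        keys.push(stem.to_owned());
    }
    // read_dir gives no ordering guarantee; sorting is an assumption of this sketch.
    keys.sort();
    Ok(keys)
}
```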

sentinel1909 · 2023-07-08T04:57:08Z

I would like to embellish the test for the clear method to create multiple keys within the sample instance of persist, list the result (should contain multiple keys), then remove a random result. This doesn't seem to work. I only ever get one instance of persist with one key. Creating multiples (with a for loop) just overwrites the same one. I will continue to think about this.

jonaro00 · 2023-07-15T10:19:19Z

The implementation of size() surprised me, as the initial suggestion was

> Nice! I think it would be interesting to have a .size() method, right? It could just count the amount of files in the folder

I was thinking this would simply be a shorthand for checking the len() of list(), but that is quite trivial for the end user. Knowing the persist's storage size on disk might be useful if the future introduces storage limits, but I don't see it being useful for a user yet. We can keep it as is if the doc comment clearly states what it does, so it is not confused with the above.
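
The two meanings of "size" discussed here can be contrasted with a short sketch. Both functions are hypothetical and operate on a plain folder path; the PR's actual size() implementation is not reproduced in this thread.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Interpretation 1: number of entries in the folder (same as `list().len()`).
pub fn key_count(folder: &Path) -> io::Result<usize> {
    let mut count = 0;
    for entry in fs::read_dir(folder)? {
        entry?; // surface per-entry I/O errors instead of ignoring them
        count += 1;
    }
    Ok(count)
}

/// Interpretation 2: total size in bytes of the regular files in the folder.
pub fn size_on_disk(folder: &Path) -> io::Result<u64> {
    let mut total = 0;
    for entry in fs::read_dir(folder)? {
        let metadata = entry?.metadata()?;
        if metadata.is_file() {
            total += metadata.len();
        }
    }
    Ok(total)
}
```

Whichever meaning is chosen, a doc comment stating it explicitly avoids the confusion described above.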

The randomness added in the tests is strange to me. You generally want to avoid randomness in tests. While the loops you constructed will seem to always work, the randomness adds no "proof" that it works.
Now, the number of keys N handled in those tests is between 1 and 20, but ensuring it works for 1 and 2 should be enough to be sure that it works for larger N.
One thing that the tests are missing is N=0 (aka does the persist store work as expected when empty?).
You could add N=0 checks before and after some of the tests, for example

persist.remove("abc").unwrap_err(); // store is empty, so this must be an Err
persist.save("abc", "xyz").unwrap();
persist.remove("abc").unwrap(); // the key exists, so this must be an Ok
persist.remove("abc").unwrap_err(); // the key is gone again, so this must be an Err

The same pattern can repeat where relevant, for example, checking if list() gives an empty list or an error when empty.
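
A self-contained version of this N=0 pattern might look like the sketch below. `TinyStore` is a hypothetical stand-in for `PersistInstance` that stores one file per key and uses `io::Error` directly; only the shape of the checks is meant to carry over.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical stand-in for `PersistInstance`: one `<key>.bin` file per key.
pub struct TinyStore {
    folder: PathBuf,
}

impl TinyStore {
    pub fn new(folder: PathBuf) -> io::Result<Self> {
        fs::create_dir_all(&folder)?;
        Ok(Self { folder })
    }

    fn file_for(&self, key: &str) -> PathBuf {
        self.folder.join(format!("{}.bin", key))
    }

    pub fn save(&self, key: &str, value: &str) -> io::Result<()> {
        fs::write(self.file_for(key), value)
    }

    pub fn remove(&self, key: &str) -> io::Result<()> {
        fs::remove_file(self.file_for(key))
    }

    /// Lists key names; non-UTF-8 file names are skipped for brevity.
    pub fn list(&self) -> io::Result<Vec<String>> {
        let mut keys = Vec::new();
        for entry in fs::read_dir(&self.folder)? {
            let path = entry?.path();
            if let Some(stem) = path.file_stem().and_then(|s| s.to_str()) {
                keys.push(stem.to_string());
            }
        }
        Ok(keys)
    }
}

/// Deterministic N=0, N=1, N=0 sequence; no randomness involved.
pub fn check_remove_around_empty(store: &TinyStore) {
    assert!(store.list().unwrap().is_empty()); // N=0 before
    assert!(store.remove("abc").is_err()); // nothing to remove yet
    store.save("abc", "xyz").unwrap();
    assert_eq!(store.list().unwrap(), vec!["abc".to_string()]); // N=1
    store.remove("abc").unwrap(); // the key exists
    assert!(store.remove("abc").is_err()); // already gone
    assert!(store.list().unwrap().is_empty()); // N=0 after
}
```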

sentinel1909 · 2023-07-17T04:54:47Z

I added a new() method, which looks like this:

pub fn new(name: &str) -> Self {
    PersistInstance { service_name: ServiceName::from_str(name).unwrap() }
}

The name gets passed in as a parameter when you call new() and then you get back your PersistInstance. I used this in the tests and it works nicely. Two things:

Not sure if this is wanted, if I'm going off script, let me know and I'll change it back.
The unwrap() worries me and I'm not sure how to propagate the error nicely with the ? operator.

I've thrown all the tests out and redid them. I haven't committed yet as I still need to address the N=0 checks as noted above. I'm also thinking the clear() method is wrong and far too verbose.

jonaro00 · 2023-07-17T11:15:57Z

I'd say we can wait with involving ServiceName since it looks like there are huge changes regarding those coming soon. (To propagate with ?, you would have made the new() function return Result<Self, ...>)

clear() could take the approach of just removing and re-creating the directory.
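
The remove-and-re-create approach to clear() can be sketched in two lines of std calls. `clear_folder` is a hypothetical free function on a folder path; the real method would map the errors into the crate's own error variants.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical sketch of clear(): drop the folder with everything in it,
/// then create it again so later calls still find an (empty) store.
/// Note that this returns an error if `folder` does not exist to begin with.
pub fn clear_folder(folder: &Path) -> io::Result<()> {
    fs::remove_dir_all(folder)?;
    fs::create_dir_all(folder)
}
```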

sentinel1909 · 2023-07-24T02:55:09Z

Looks like I failed a couple of the integration tests for cargo-shuttle. Looked at the details, not sure how to resolve the issues.

jonaro00

Well done! We are close to being finished now :)

jonaro00 · 2023-07-24T09:47:28Z

resources/persist/Cargo.toml

@@ -12,3 +12,6 @@ bincode = "1.2.1"
 serde = { version = "1.0.0", features = ["derive"] }
 shuttle-service = { path = "../../service", version = "0.21.0" }
 thiserror = "1.0.32"
+
+[dev-dependencies]
+rand = "0.8.5"


Don't need this anymore :)

jonaro00 · 2023-07-24T09:51:47Z

resources/persist/src/lib.rs

+                .file_stem()
+                .unwrap_or_default()
+                .to_str()
+                .unwrap_or("file name contains non-UTF-8 characters")


This error should propagate instead of being turned into a key name

This one is causing a bit of head bashing, but will continue to bash. I'm not seeing how to propagate right in this moment, but will get there.

You can take a shortcut and map the error to a ListFolder and propagate with ? like the statement before it.

This is exactly what I tried but it's not working. The .to_str() method gives off an Option, which I thought .ok_or() would help me convert into a Result<T, E> which I could then .map_err on, but nope. Will get back at it this evening.

Aha, well then don't you already have a Result<&str, PersistError> after doing .ok_or(PersistError::ListFolder)? If so, you can do the ? immediately, then to_string().

Hmmmmm...might just have to take my laptop with me to work today and continue to work on this at lunch :)

Ok, I think I've got this worked out. I figured out how to use ok_or to convert the error related to the potential for invalid Unicode characters. Here's the revised list method:

/// list method returns a vector of strings containing all the keys associated with a PersistInstance
pub fn list(&self) -> Result<Vec<String>, PersistError> {
    let storage_folder = self.get_storage_folder();
    let mut list = Vec::new();
    let entries = fs::read_dir(storage_folder).map_err(PersistError::ListFolder)?;
    for entry in entries {
        let key = entry.map_err(PersistError::ListFolder)?;
        let key_name = key
            .path()
            .file_stem()
            .unwrap_or_default()
            .to_str()
            .ok_or("the file name contains invalid characters")
            .map_err(PersistError::ListName)?
            .to_string();
        list.push(key_name);
    }
    Ok(list)
}

I did have to introduce a lifetime specifier on our PersistError enum, to satisfy the compiler:

#[derive(Error, Debug)]
pub enum PersistError<'a> {
    #[error("failed to open file: {0}")]
    Open(std::io::Error),
    #[error("failed to create folder: {0}")]
    CreateFolder(std::io::Error),
    #[error("failed to list contents of folder: {0}")]
    ListFolder(std::io::Error),
    #[error("failed to list the file name: {0}")]
    ListName(&'a str),
    #[error("failed to clear folder: {0}")]
    RemoveFolder(std::io::Error),
    #[error("failed to remove file: {0}")]
    RemoveFile(std::io::Error),
    #[error("failed to serialize data: {0}")]
    Serialize(BincodeError),
    #[error("failed to deserialize data: {0}")]
    Deserialize(BincodeError),
}

I would say that error does not need a test case, since it should realistically never happen. Since the keys (file names) are Strings when they are saved, they should be valid utf-8 when reading the directory as well.
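
As an aside on the ok_or step: one possible way to avoid the lifetime parameter is to let the variant own its message. The sketch below is an alternative under that assumption, not the code that was merged; `ListError` and `key_name` are hypothetical names, and Display is implemented by hand to stay std-only.

```rust
use std::fmt;
use std::path::Path;

/// Hypothetical error type whose variant owns its message,
/// so the enum needs no lifetime parameter.
#[derive(Debug, PartialEq)]
pub enum ListError {
    ListName(String),
}

impl fmt::Display for ListError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ListError::ListName(msg) => write!(f, "failed to list the file name: {}", msg),
        }
    }
}

impl std::error::Error for ListError {}

/// Extracts the key name from a stored file path.
pub fn key_name(path: &Path) -> Result<String, ListError> {
    let stem = path
        .file_stem()
        .unwrap_or_default()
        .to_str() // Option<&str>: None when the name is not valid UTF-8
        .ok_or_else(|| {
            ListError::ListName("the file name contains invalid characters".to_string())
        })?; // Option -> Result, then propagate immediately
    Ok(stem.to_string())
}
```

Here `ok_or_else` already produces the final error type, so no separate `map_err` is needed before the `?`.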

jonaro00 · 2023-07-24T10:05:40Z

resources/persist/src/lib.rs

+    #[test]
+    fn test_list_and_size() {
+        let persist = PersistInstance {
+            service_name: ServiceName::from_str("test1").unwrap(),


The tests are looking really good now! However, there is one case that is not yet covered, but it might simplify the code.

If list/load/remove etc are called before any key has been saved, they will fail due to the folder not existing yet. This can be avoided if we ensure the folder exists before any method is called.

My idea is to add a method new(service_name: ServiceName) -> Result<Self, PersistError> that creates the folder upfront. Creating the folder at this time also means it is not needed in save.
This would mean all constructions of PersistInstance { ... } change to PersistInstance::new(...) + .unwrap()/?. It also means that the error needs to map_err into a shuttle_service::Error (or other way of converting) in ResourceBuilder::output.

So a new method after all then...got it.

I'm doing a push as a checkpoint to show where I'm at. As a start, I've made a skeleton new method that instantiates a PersistInstance struct, given a ServiceName. I've substituted this in the tests and all works as before. I'm having difficulty now understanding how to get the folder creation incorporated. If I put a &self in as a parameter to new, how do I then instantiate in the tests? Just need a couple of breadcrumbs to set the path forward.

Ah! Good point.
I think this is a nice workaround. Using the fact that we "have" the instance (self) in the new() function since we are creating it right there.

fn new(...) -> Result<...> {
    let instance = Self { ... };
    let directory = instance.get_storage_folder();
    fs::create_dir_all(...).map_err(...)?;
    Ok(instance)
}

> It also means that the error needs to map_err into a shuttle_service::Error (or other way of converting) in ResourceBuilder::output.

Having some difficulty now with this piece. I understand what you're saying but am not sure how to make it happen. I will reflect through the day today.

Alright, time to admit it. After sitting staring at the code for a fair bit of time, I have no clue how to convert the errors into a shuttle_service::Error in ResourceBuilder::output. I think I need a hint.

This works

// Get the PersistError in, let's say, a match statement, then
return Err(shuttle_service::Error::Custom(
    /* The PersistError */.into(),
));

EDIT: based on the above I would imagine that something like this will work:

PersistInstance::new(...).map_err(|e| shuttle_service::Error::Custom(e.into()))

Ok, I feel I've got the new method working. It will convert a PersistError into a shuttle_service::Error. I'm having difficulty with a test though. All I managed to do yesterday evening was test ServiceName, which panics if you pass it something invalid. This doesn't really test the new method erroring out because of an issue with create_dir_all. Also, the cargo-shuttle circleci check is still failing, and I'm not clear why (I have tried to read the log output) or how to resolve it.

Yeah, the create_dir_all is hard to test failures of. Perhaps creating a folder in a root-owned directory, but that might not be reproducible on all machines. I would say that we can trust it without a test.

The CI fail on cargo-shuttle init is a sporadic error (a hard one to fix :/ ), and not related to your changes.

jonaro00 · 2023-07-29T11:02:10Z

resources/persist/src/lib.rs

+                return Err(shuttle_service::Error::Custom(
+                    PersistError::CreateFolder(e).into(),
+                ))


My thought with converting to a shuttle_service::Error::Custom was to have that happen in ResourceBuilder::output() (since the trait requires a shuttle_service::Error type in that result). The new function can simply return this PersistError with map_err and ?, similar to all other errors in this struct. The construction of PersistInstance in output should then use ::new(...) and convert the error if it happens there (with this match clause or map_err).
(The code looks correct, it is just missing the usage of new() in the builder 😄)
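
The split of responsibilities described in this comment can be sketched as follows. Everything here is a std-only stand-in: `StoreError` plays the role of `PersistError`, `ServiceError::Custom` is a hypothetical substitute for `shuttle_service::Error::Custom` (boxing the error instead of using the real type), and `build` plays the role of `ResourceBuilder::output`.

```rust
use std::error::Error;
use std::fmt;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Stand-in for `PersistError`.
#[derive(Debug)]
pub enum StoreError {
    CreateFolder(io::Error),
}

impl fmt::Display for StoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StoreError::CreateFolder(e) => write!(f, "failed to create folder: {}", e),
        }
    }
}

impl Error for StoreError {}

/// Hypothetical stand-in for the service-level error type.
#[derive(Debug)]
pub enum ServiceError {
    Custom(Box<dyn Error + Send + Sync>),
}

pub struct Store {
    folder: PathBuf,
}

impl Store {
    /// The constructor only knows about its own error type.
    pub fn new(folder: PathBuf) -> Result<Self, StoreError> {
        let instance = Self { folder };
        fs::create_dir_all(instance.storage_folder()).map_err(StoreError::CreateFolder)?;
        Ok(instance)
    }

    pub fn storage_folder(&self) -> &Path {
        &self.folder
    }
}

/// The builder is where the conversion to the service-level error happens.
pub fn build(folder: PathBuf) -> Result<Store, ServiceError> {
    Store::new(folder).map_err(|e| ServiceError::Custom(e.into()))
}
```

Keeping the conversion in the builder means the constructor stays consistent with the other methods, which all return the crate's own error type.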

Got it. Will sort it out this weekend.

I think I got it! Except for the failing cargo-shuttle circleci :(

jonaro00 · 2023-07-30T11:40:35Z

This is looking pretty complete now 🥳! Thanks for the consistent work and tolerating my annoying feedback 😄. I'll let the team know this is ready for review.

sentinel1909 · 2023-07-30T15:05:12Z

> This is looking pretty complete now 🥳! Thanks for the consistent work and tolerating my annoying feedback 😄. I'll let the team know this is ready for review.

Thank you for your patience and guidance! Working this solidified a lot of things for me, it was extremely valuable. The code base is way less scary now :)

sentinel1909 · 2023-08-16T16:20:34Z

Wondering if there is any further work to be done here? Are upcoming changes going to make this obsolete and not necessary? Happy to refine this work further if needed.

oddgrd · 2023-08-16T19:32:33Z

Sorry for the delay, Jeff! This looks great, we just haven't been able to get to the final review yet. We'll get to it this week or early next week. 🙂

oddgrd

Looks great to me, thanks for all your work on this @sentinel1909! 🥳 And thanks for the mentoring @jonaro00! I left a very minor comment about the doc comments, we may want to change them a bit.

Oh, and it would be great to have the persist docs in the docs repo updated with these new features as well. If you have time, if not we can also do it. 🙂

oddgrd · 2023-08-17T13:02:15Z

resources/persist/src/lib.rs

        let file_path = self.get_storage_file(key);
        let file = File::create(file_path).map_err(PersistError::Open)?;
        let mut writer = BufWriter::new(file);
        Ok(serialize_into(&mut writer, &struc).map_err(PersistError::Serialize))?
    }

+    /// list method returns a vector of strings containing all the keys associated with a PersistInstance


Nit: should we capitalize these doc comments like we do elsewhere in the codebase? I also don't think we need "list method" as the first part of the comment.

Suggested change

-    /// list method returns a vector of strings containing all the keys associated with a PersistInstance
+    /// Returns a vector of strings containing all the keys associated with a [PersistInstance]

sentinel1909 added 2 commits June 28, 2023 22:38

Initial commit (framed up function signature)

40fc370

improvement: add list feature to shuttle persist

5357cb4

Merge remote-tracking branch 'shuttle-hq/main' into improvement-shutt…

c58fac0

…le-persist-list 2023-07-03 update with shuttle-hq rep

Rewrite list, add remove and associated tests

96f11a6

sentinel1909 changed the title ~~[Improvement]: Add list operation to shuttle-persist~~ [Improvement]: Add list and remove operations to shuttle-persist Jul 3, 2023

Run cargo fmt to fix failing CircleCI persist

9e3eea1

Resolve failing unit tests for list and remove

9ea36f7

sentinel1909 added 2 commits July 3, 2023 21:07

Add clear method & remove by passing indexed item

f866b73

Adjustment to list_item parameter in remove method

ece68b7

sentinel1909 added 2 commits July 4, 2023 12:19

Merge remote-tracking branch 'shuttle-hq/main' into improvement-shutt…

f7eee8b

…le-persist-list 2023-07-04 General Update

list method strips .bin, updated remove unit test

407c95a

jonaro00 reviewed Jul 5, 2023

sentinel1909 added 2 commits July 6, 2023 19:50

Merge remote-tracking branch 'shuttle-hq/main' into improvement-shutt…

45be093

…le-persist-list 2023-07-06 remote update with shuttle-hq

list, remove, and clear, with tests, updated

f7186e1

sentinel1909 added 3 commits July 7, 2023 21:57

Address clippy warnings in CI check

abc1704

Address sloppy typo and clippy warning

5ff0e37

Fix another clippy warning in CI

5a8a883

sentinel1909 changed the title ~~[Improvement]: Add list and remove operations to shuttle-persist~~ [Improvement]: Add list, remove, clear, and size operations to shuttle-persist Jul 15, 2023

jonaro00 added S labels Jul 15, 2023

Add tests for all list, size, remove and clear

568e1c2

jonaro00 reviewed Jul 24, 2023

sentinel1909 added 5 commits July 24, 2023 21:45

Clean deps, fix error propagation in list, add new

9c84868

Resolve issue in circleci check

eced7cd

New method creates the storage folder

1912a92

New method returns shuttle_service::Error

107c461

Fix issue causing circleci/persist to fail

c51f298

jonaro00 reviewed Jul 29, 2023

sentinel1909 added 3 commits July 29, 2023 13:49

Update resource builder to use new method

d02dcc1

Merge remote-tracking branch 'shuttle-hq/main' into improvement-shutt…

4058a69

…le-persist-list 2023-07-29 General update

Merge branch 'improvement-shuttle-persist-list' of github.com:sentine…

c7fe00b

…l1909/shuttle-sentinel1909 into improvement-shuttle-persist-list

jonaro00 requested review from iulianbarbu and oddgrd August 16, 2023 18:54

orhun approved these changes Aug 17, 2023

oddgrd approved these changes Aug 17, 2023

chore: update document comments per review

0472444

jonaro00 merged commit 31dec11 into shuttle-hq:main Aug 17, 2023

oddgrd mentioned this pull request Aug 21, 2023

Update persist docs with new API shuttle-hq/shuttle-docs#169

Closed

sentinel1909 deleted the improvement-shuttle-persist-list branch August 26, 2023 02:46

