Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

globset: serde feature - cannot deserialize String into Glob #2386

Closed
jeertmans opened this issue Dec 31, 2022 · 4 comments
Closed

globset: serde feature - cannot deserialize String into Glob #2386

jeertmans opened this issue Dec 31, 2022 · 4 comments
Labels
rollup A PR that has been merged with many others in a rollup.

Comments

@jeertmans
Copy link
Contributor

Hi, I hope that posting issues about subcrates is ok.

What version of globset are you using?

globset = { version = "0.4.9", features = ["serde1"] }.

Describe your bug.

Because the implementation of trait Deserialize uses &str, it does not accept deserializing owned String.

impl<'de> Deserialize<'de> for Glob {
fn deserialize<D: Deserializer<'de>>(
deserializer: D,
) -> Result<Self, D::Error> {
let glob = <&str as Deserialize>::deserialize(deserializer)?;
Glob::new(glob).map_err(D::Error::custom)
}
}

What are the steps to reproduce the behavior?

This problem occurred to me when I tried to parse Globs from a toml::Value object (already obtained from a file). As toml::Value owns strings, it caused a problem.

If needed, I can include a MWE (but quite large).

What is the actual behavior?

When parsing some globs, here is the error obtained:

Error: TomlDecode(Error { inner: ErrorInner { kind: Custom, line: None, col: 0, at: None, message: "invalid type: string \"src/**.rs\", expected a borrowed string", key: [] } })

What is the expected behavior?

Not sure if it is interesting performance-wise, but using a custom deserialize_with function, with String instead of &str, fixes the problem (glob.as_str()) has to be used afterward).

I can provide tests and make a PR if you are ok with this solution. Otherwise, I am also open to discussions :-)

@BurntSushi
Copy link
Owner

I don't think perf is relevant here, since building a glob from a string is quite expensive. Here are the things that will make a PR easier to merge:

  • Small change with tests.
  • No breaking changes.
  • Have someone with more experience with Serde APIs sign off on this. From your issue, I'm not actually totally sure whether this is a problem with how you're using Serde or if it's truly a problem with how Serde is impled in `globset.

@LPGhatguy could you please review this? You're the one who originally added Serde support to globset.

@jeertmans
Copy link
Contributor Author

Thanks for the quick reply!

After comparing with what is done in the regex_serde crate, I changed from String to Cow<str>, in the hope (?) to avoid unnecessary copy.

I created a runnable example that illustrates the error, and how using a different deserialize function can solve it: see https://www.rustexplorer.com/b/qagwmp.

The code is reproduced here:

/*
[dependencies]
serde = { version = "1", features = ["derive"] }
globset = { version = "0.4.9", features = ["serde1"] }
toml = "0.5.10"
*/

use globset::Glob;
use serde::{Deserialize, Deserializer};

#[derive(Debug, Deserialize)]
struct S {
    // (un)comment the following line to see the error
    //#[serde(deserialize_with = "deserialize_glob")]
    #[allow(unused)]
    glob: Glob,
}

#[allow(unused)]
fn deserialize_glob<'de, D: Deserializer<'de>>(
    deserializer: D,
) -> std::result::Result<Glob, D::Error> {
    use serde::de::Error;
    use std::borrow::Cow;
    let glob = <Cow<str> as Deserialize>::deserialize(deserializer)?;

    Glob::new(&glob).map_err(D::Error::custom)
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let string = "glob = '*.md'";

    // Parse toml from borrowed string into S
    let _: S = toml::from_str(string)?;

    // Parse toml from borrowed string into intermediate toml::Value
    let v: toml::Value = toml::from_str(string)?;

    // Convert toml::Value into S
    //   will fail without custom deserialize function
    //   because v owns the String
    let _: S = v.try_into()?;

    Ok(())
}

@LPGhatguy
Copy link
Contributor

Based on how serde::Deserialize is implemented for Cow<'a, T>, I don't think that using Cow<str> will behave any differently than String.

I think the correct solution here is to implement our own visitor that handles both the borrowed and owned case. Always using the owned case (using String or Cow<str>) will always allocate, even when we don't need to.

@jeertmans
Copy link
Contributor Author

Hi @LPGhatguy, I have updated the PR accordingly :-)

@BurntSushi BurntSushi added the rollup A PR that has been merged with many others in a rollup. label Jul 7, 2023
BurntSushi pushed a commit that referenced this issue Jul 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rollup A PR that has been merged with many others in a rollup.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants