Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get the filename without an extension/suffix #29

Closed
corasaurus-hex opened this issue Jul 20, 2021 · 14 comments
Closed

Get the filename without an extension/suffix #29

corasaurus-hex opened this issue Jul 20, 2021 · 14 comments

Comments

@corasaurus-hex
Copy link
Contributor

Linux includes a handy function called basename which lets you get a file name from a path and optionally remove a suffix at the same time. Is this something babashka.fs should include?

I have already opened #28 for this but it should be considered a WIP when/if it's decided what should be included.

Some possible options:

  1. Include it as part of the file-name function just as was done in raynes.fs.
  2. Add a new function for doing just the removal part as the PR does.
  3. Add a function that strips the extension or a suffix from a path and returns the entire path, not just the file-name.
  4. Not include this at all?

Looking for input and thoughts from the community, thanks!!

@borkdude
Copy link
Contributor

Alternatives when we go for 4: (first (split-ext f)), (str/replace (str f) #"\.md$" ""), etc.
split-ext already preserves the entire path so you can call file-name on that afterwards: (file-name (first (split-ext f))).

Comment in Slack:

@lukaszkorecki I almost opened a PR for this (it was adding an option to stripe a suffix) but realized that fs/file-name and fs/extension basically give you all the building blocks. After reviewing the code and how we use babashka.fs in majority of the cases we wanted to full name (as in: with extension), so I dropped it.

@corasaurus-hex
Copy link
Contributor Author

split-ext doesn't preserve the path, though? https://github.com/babashka/fs/blob/master/test/babashka/fs_test.clj#L306-L307

@lukaszkorecki
Copy link

To add to that: I think it would be reasonable addition if babashka/fs wants to be the equivalent of something like Ruby's FileUtils: https://ruby-doc.org/stdlib-2.5.5/libdoc/fileutils/rdoc/FileUtils.html and File class (https://ruby-doc.org/core-2.5.5/File.html) which are modeled after common shell operations and/or sys calls.

My initial assumption was that's how things will be named in bb/fs - maybe worth considering some sort of additional namespace/lib (babashka.fs.unix ????) - especially in babashka context, porting shell code which does a lot of path operations (and relies on things like dirname, basename, globbing etc) could be helpful.

But as I mentioned in my original comment in Slack: all the building blocks are present, and starting with a help namespace in your own code might be the way to go (that's how my team approaches things: we have a whole layer of code on top of cognitect/aws-api or carmine, but it wouldn't make sense to merge that in into base libraries.

Just some food for thought of course!

@corasaurus-hex
Copy link
Contributor Author

It might be worth talking about why a thing gets included in babashka/fs, too. I tried to think through the categories of needs a library like this can fill and it seems like there are at least three (let me know if you can think of more!):

  1. Shim over Java filesystem APIs.
  2. Handle common shell scripting needs for filesystem handling.
  3. Handle complex shell scripting needs for filesystem handling.

I think babashka/fs definitely does 1 and 3. It seems like the trouble here is that this code sort of falls into category 2.

Can we make a good argument for 2? I think so. If people commonly need to do it exactly what a function does, and it's filesystem handling, then it makes sense to have it maintained in one place. It saves quite a bit of labor over time and allows babashka scripts to be shorter and not duplicate behavior from script to script. On top of that, it reveals the intention of the code more to see (file-name-without-extension path) vs (first (split-ext path)) which is always a nice thing to have. These are substantial benefits to babashka and babashka scripts.

However, this argument rests on whether this is something people commonly need to do, which I don't know how to show either way. It's common enough that it's included in POSIX shells but that also might be because removing the suffix from a file is so much more difficult in POSIX shells (which would be 3, not 2). Is this a common need? I certainly have used it to remove suffixes in the past in Ruby, but ever using isn't the same as commonly using.

@corasaurus-hex
Copy link
Contributor Author

I'm not sure how relevant it is but for a survey of scripting language stdlibs and if they include suffix removal:

  • python - ❌
  • ruby - ✅
  • node - ✅
  • perl - ✅
  • php - ✅
  • joker - ❌
  • lua - ❌ (needs third party library for path handling)
  • elisp - ✅
  • janet - ❌ (needs third party library for path handling)

@borkdude
Copy link
Contributor

borkdude commented Jul 21, 2021

split-ext doesn't preserve the path, though?

Oops, I think it maybe should. Perhaps it won't be really breaking if we fixed that.

Suffix removal.
Proposal: strip-ext: without an option it removes the extension and returns the full path... and it can take an option {:ext "md"} so it will only remove the ".md" extension and leaves all other extensions.
You can then call file-name on the result if you just wanted the file-name.

Up for discussion.

@corasaurus-hex
Copy link
Contributor Author

Would that work with multiple extensions? So like (strip-ext path {:ext "html.template"}) would remove .html.template? Or would it be more like (strip-ext path {:ext ["html" "template"]})? or would you call it multiple times like (-> path (strip-ext {:ext "template"}) (strip-ext {:ext "html"})?

@borkdude
Copy link
Contributor

We can make it work with (strip-ext path {:ext "html.template"}). By default it would only remove the .template part I think?

@corasaurus-hex
Copy link
Contributor Author

awesome! I was just typing up a comment on why that would work better 😄

@corasaurus-hex
Copy link
Contributor Author

I'd guess that basename removes suffix instead of extension in order to handle things like if you had widgets_view.html.template and wanted to remove _view.html.template. I can only remember needing to do that once or twice ever, but I thought I'd mention it

@corasaurus-hex
Copy link
Contributor Author

And at that point it's just string handling and not really extension removal, so not really fitting for this library.

@corasaurus-hex
Copy link
Contributor Author

I updated the PR to do strip-ext. I went with two positional arguments instead of making the second one an options map. Let me know if you feel strongly about that one, it just felt weird while I was implementing.

@corasaurus-hex
Copy link
Contributor Author

I fixed the PR to with your comments. Let me know what you think.

@corasaurus-hex
Copy link
Contributor Author

Looks like it was merged!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants