This repository has been archived by the owner on Feb 16, 2023. It is now read-only.

Make tools importable #114

Open
mvdan opened this issue Jan 12, 2018 · 24 comments

Comments

@mvdan

mvdan commented Jan 12, 2018

Hi Eric! I am developing a shell package - see https://github.com/mvdan/sh.

One of its components is an interpreter. That means I have to implement the shell builtins like echo and cd. One of the big wins of that library is that Go packages that used to need bash to be installed can simply drop that dependency, and use the shell package as a replacement, statically linked into their binary.

However, that breaks down quite easily on systems that don't have coreutils installed. Lots of shell scripts out in the wild depend on coreutils programs like cat, rm and wc. This is why I opened mvdan/sh#93 - to add them to the interpreter as a sort of second layer of builtins.

However, as you probably very well know, adding even just some of them is a ton of work. Which is why I've been looking around for implementations of coreutils.

I could use upstream or the popular implementation in Rust, but that would mean somehow bundling the binaries into the final binary. Something nasty like including them at compile-time as assets and unpacking them into the filesystem at run-time.

But that's not the case with Go, since I can simply import Go packages. Then, the only roadblock that I see is that your tools (nice job, by the way!) are not importable - they are all main packages.

Have you given thought to adding a common interface for all the tools? For example, similar to what os/exec does:

type Ctx struct {
        Dir    string
        GetEnv func(string) string
        Stdin  io.Reader
        Stdout io.Writer
        Stderr io.Writer
}

func Run(c Ctx, name string, args ...string) error

Then one could do something like coreutils.Run(Ctx{...}, "wc", "somefile").

If you have any input, or would like any help to implement this, do let me know.

@ericlagergren
Owner

Yes I have. I've started to turn some of them into libraries, but I've been more focused on my decimal library as of late. I plan to spend more time on this library once v3.0 of my decimal package drops, which should be whenever trig functions are added.

If you'd like to help in any way you're more than welcome. I'm down to finally make this library useful and help you out!

@mvdan
Author

mvdan commented Jan 12, 2018

Great to hear that! I won't submit a PR right away, as this would require quite a bit of design and refactoring, and I'm not familiar with this codebase. And it would likely save everyone time if you have a look at it first.

When you start working on this or have a design/prototype, do let me know and I'll be happy to help - be it reviews, testing, or coding.

@ericlagergren
Owner

ericlagergren commented Jan 12, 2018

Sounds good. I might fiddle around with it a bit today. If you don't hear from me in a week or so, feel free to ping me. I don't mind being bothered. I'm glad somebody's getting use out of this library!

@ericlagergren
Owner

So, I spent a little while and sketched out an implementation using wc:

Example:

// +build ignore

package main

import (
	"os"

	"github.com/ericlagergren/go-coreutils/coreutils"

	_ "github.com/ericlagergren/go-coreutils/wc"
)

func main() {
	ctx := coreutils.Ctx{
		Stdin:  os.Stdin,
		Stdout: os.Stdout,
		Stderr: os.Stderr,
	}
	coreutils.Run(ctx, "wc", "-l", "cmd.go")
}

@mvdan
Author

mvdan commented Jan 13, 2018

Did you forget to commit the coreutils package? I'm also not a huge fan of the coreutils/coreutils path :) Perhaps you could simply use the root package, or do something else like coreutils/exec.

I would also need Dir in the context struct, similar to what's in the os/exec package. Otherwise, the current dir from the process is forced, which is no good for my interpreter.

Otherwise looks good!

@ericlagergren
Owner

ericlagergren commented Jan 13, 2018 via email

@mvdan
Author

mvdan commented Jan 15, 2018

Yes, this is similar to what I was thinking. Registering the commands sounds fine. Ping me when there's a working version I can test out :)

@ericlagergren
Owner

Ok, here's what I meant to commit the other day: 8b35c72

@mvdan
Author

mvdan commented Jan 15, 2018

Trying it out now, getting this build error on linux/amd64:

# github.com/ericlagergren/go-coreutils/wc/internal/sys
../../../../land/src/github.com/ericlagergren/go-coreutils/wc/internal/sys/fadv_unix.go:7:22: Fadvise redeclared in this block
        previous declaration at ../../../../land/src/github.com/ericlagergren/go-coreutils/wc/internal/sys/fadv.go:5:21

@ericlagergren
Owner

Oh. Just a goofed-up build tag inside wc/internal/sys/fadv.go. It should be a comma, not a space. Fadv isn't a requirement, anyway. It just theoretically speeds up reading a file by letting the kernel know the desired read pattern.
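For anyone wondering why a space versus a comma matters: in a `// +build` line, space-separated terms are ORed and comma-separated terms are ANDed. The sketch below evaluates both forms with the standard go/build/constraint package. The tags `!linux` and `!freebsd` are hypothetical, picked only to show how a fallback file can end up compiled next to the platform-specific one; the actual tags in fadv.go are not shown in this thread.

```go
package main

import (
	"fmt"
	"go/build/constraint"
	"os"
)

// matches reports whether a build constraint line is satisfied when exactly
// the listed tags are set.
func matches(line string, tags ...string) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	set := make(map[string]bool, len(tags))
	for _, t := range tags {
		set[t] = true
	}
	return expr.Eval(func(tag string) bool { return set[tag] }), nil
}

func main() {
	// Hypothetical constraint lines for a "fallback" file.
	lines := []string{
		"// +build !linux !freebsd", // space: !linux OR !freebsd
		"// +build !linux,!freebsd", // comma: !linux AND !freebsd
	}
	for _, line := range lines {
		ok, err := matches(line, "linux", "amd64")
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Printf("%-28s compiled on linux/amd64: %v\n", line, ok)
	}
}
```

With the space form the fallback file is still compiled on linux (because `!freebsd` is true there), so both files declare the same function, which is the kind of "redeclared in this block" error shown above.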

@mvdan
Author

mvdan commented Jan 15, 2018

Thanks, now it builds. It behaves differently from GNU wc, though. For example, wc -c somefile gives \t<number>\n instead of just <number>\n. And prog | wc gives -\n instead of \t<number>\t<number>\t<number>\n.

Do you happen to have tests that check input/output of your implementations versus GNU's?

@mvdan
Author

mvdan commented Jan 15, 2018

Also, if you have more time, here's another suggestion to add to the common context - a context.Context. This has multiple advantages, such as setting a timeout or being able to cancel. For most programs that won't be very useful, but imagine sleep, cp, or dd.

@ericlagergren
Owner

> It behaves differently from GNU wc, though.

It does? What version of coreutils are you running? Mine's identical with coreutils 8.29.

$ go run m.go
25986317 /Users/ericlagergren/out2.s
0:1 /tmp $ gwc -c /Users/ericlagergren/out2.s
25986317 /Users/ericlagergren/out2.s
0:1 /tmp $ go run m.go > go.txt; gwc -c /Users/ericlagergren/out2.s > gnu.txt; diff go.txt gnu.txt
0:1 /tmp $

> Do you happen to have tests that check input/output of your implementations versus GNU's?

For some, yeah. wc does.

I like the context.Context idea.

@mvdan
Author

mvdan commented Jan 15, 2018

Simpler example:

$ wc --version
wc (GNU coreutils) 8.28
$ wc /dev/null
      0       0       0 /dev/null
$ cat /dev/null | wc
      0       0       0
$ cat /dev/null | wc -c
0

Unless I got something very wrong in my prototype, your implementation seems to always include the filename (even if it reads from stdin) and when given no flags, it seems to not print those three numbers. That's what I meant by the examples above.

@ericlagergren
Owner

ericlagergren commented Jan 15, 2018

Gotcha. One of the goals of this project is to have it be byte-for-byte exact with GNU, but sometimes there are good reasons for it not to be. For example, coreutils is meant to run on VAX and stuff, so there's lots of weird edge-case code and sometimes they go from A -> B -> C -> D to do something that Go (because it can abstract more and doesn't need to support machines from the '80s) can do simply by going from A -> D, if that makes sense.

For example, GNU wc uses 7 spaces minimum for all printing, unless it can't stat the input (i.e., it's not a regular file). Then it just dumps it with 0 spaces.

It should be easy enough to make byte-for-byte perfect.

@mvdan
Author

mvdan commented Jan 15, 2018

Thanks - your recent changes make sense. Now my tests almost pass - the only problem is that when reading from stdin it still prints a trailing space, like wc -c <somefile prints 8 \n. Other than that, all tests should now pass :)

@ericlagergren
Owner

ericlagergren commented Jan 15, 2018 via email

@mvdan
Author

mvdan commented Jan 15, 2018

Sounds good. Note that I absolutely don't need all the tools at once. In particular, the original issue was just about some of the common ones. This will act as an overlay on top of a real os/exec call, so on most environments coreutils will be installed and available anyway.

Even if only one or a few tools are importable as libraries, that's plenty for the interpreter to start using them.

@ericlagergren
Owner

ericlagergren commented Jan 15, 2018 via email

@mvdan
Author

mvdan commented Jan 15, 2018

Basically any that would be used frequently in shell scripts - rm, cp, mv, mkdir, ls, touch, chmod are perhaps the most common ones.

@ericlagergren
Owner

ericlagergren commented Jan 15, 2018

  • rm
  • cp
  • mv
  • mkdir
  • ls
  • touch
  • chmod
  • wc

@mvdan
Author

mvdan commented Oct 28, 2022

For those of you who saw this thread, I'm trying to coordinate with a different project now :) u-root/u-root#2527

@ericlagergren
Owner

Sorry :)

@mvdan
Author

mvdan commented Oct 28, 2022

Certainly not trying to dig up old stuff or put blame - I also have some semi-abandoned projects due to lack of free time and energy :) Just want to point others who might still be interested towards more recent developments.
