Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run ble.sh (2022) #1069

Open
andychu opened this issue Jan 7, 2022 · 58 comments
Open

Run ble.sh (2022) #1069

andychu opened this issue Jan 7, 2022 · 58 comments

Comments

@andychu
Copy link
Contributor

andychu commented Jan 7, 2022

Continuation of #653

At the very least we should see if we can reproduce the results of running ble.sh unit tests in #762 on our new container-based continuous build

I'm also curious what functionality is missing

@andychu
Copy link
Contributor Author

andychu commented Jan 7, 2022

Just ran it again: http://travis-ci.oilshell.org/github-jobs/2022-01-07__05-34-01.wwz/_tmp/soil/logs/ble-test.txt

[section] util: 895/1011 (116 fail, 0 crash, 0 skip)
osh-0.9.6$ ^D
DONE

@andychu
Copy link
Contributor Author

andychu commented Jan 7, 2022

Hm from #762 we were at

[section] util: 913/1011 (98 fail, 0 crash, 0 skip)

Hm maybe we're on the wrong branch? I think there is a branch that is maybe not maintained

But still there is a regression

@akinomyoga Can you tell me if this osh branch is still maintained? Or should I try running the tests on master?

https://github.com/oilshell/oil/blob/master/test/ble.sh#L18

I revived this build we did in 2020! I haven't touched it since then but it still kinda works. Just curious if there have been any changes

@andychu
Copy link
Contributor Author

andychu commented Jan 7, 2022

Also continuing the thread from #257 , here is a page I drafted about help I would like to raise money to pay for

https://github.com/oilshell/oil/wiki/Compiler-Engineer-Job

Although now that I read it over it looks too detailed. The short summary is: Write a 10K line Python compiler in Python, and 10K line C++ runtime, and get a free shell :-)

(and get paid, although I'm still working on raising the funds)

I think this is a very fun project for the right person. It's fun but I think will take full time effort. (And remember it is over half done, but the code is messy because of the way we used MyPy, and probably needs a rewrite)


I made this web page with line counts, although for some reason it makes things look bigger than they are ... it is a small amount of code, at least compared with GNU bash itself:

https://www.oilshell.org/release/0.9.6/pub/metrics.wwz/line-counts/for-translation.html

@akinomyoga
Copy link
Collaborator

#257 (comment) by @andychu

So, actually, I guess, after a few hooks for the user input are provided by Oil, it is not so hard for me to adjust ble.sh (and probably to submit additional small fixes to Oil) so that it works as a line editor of Oil.

Oh really that would be very interesting -- do you have an idea of the missing features? Maybe we can discuss more on #653 or start a new issue for "ble.sh 2022 :)"

I don't remember the details after two years, but basically

  1. some alternative mechanism for bind -x, which is also related to the signal handling,
  2. read -t 0 or poll/select
  3. some mechanism of eval -g (Provide an API to evaluate code in a subinterpreter #704) or some API for the save/restore the state of options, signal handling, etc.

I think there can be several approaches to 1 and 2:

  • One way is to emulate Bash's behavior perfectly including the behavior of bind -x and signal handling. This is the easiest way from the perspective of ble.sh, but I think it will take time to implement the perfect emulation of the signal handling of Bash. Also, this approach that ble.sh takes for Bash is not so clean.
  • Also, the design will largely depend on who controls the main loop of the interactive shell. If the line editor written in shell (ble.sh) can control the main loop of the shell, the line editor (ble.sh) can behave in the pull style (it is easy to write a pull adapter for push implementation such as while read-byte; do process-byte; done). In that case, hooks for the user input are not needed because the line editor (ble.sh) can directly read a byte from stdin using read.
  • Also we need to consider the special treatment of the signal handling for ble.sh (or a line editor written in shell). For example, SIGINT (C-c) should not terminate the line editor (ble.sh) nor should not trigger the trap handler set by the user in the middle of the processing of the line editor (ble.sh).

oil-native is making slow progress -- as of the latest release there are 1131 tests passing, out of 1900 or so.

Hmm, OK. If we use ble.sh with the Python osh/oil, it might be a good demonstration of osh/oil, but I have to say that the response is too slow to be used as a line editor of daily use. Also, in my vague memory, the footprint of osh with ble.sh was of the order of a few or several hundred megabytes (but it might be different).

@akinomyoga
Copy link
Collaborator

@akinomyoga Can you tell me if this osh branch is still maintained? Or should I try running the tests on master?

Not maintained, meaning that it doesn't track the changes in the latest master. I think the osh branch is basically based on ble.sh in about the middle of 2020. Actually, the latest ble.sh has many changes since May, 2020 and relies on other Bash's behavior that ble.sh didn't use on May, 2020 and Oil doesn't implement. For example, saved_func=$(declare -f func) / eval "$saved_func" #647 are now used in important places, but I guess it will require much efforts to support declare -f func in osh.

@andychu
Copy link
Contributor Author

andychu commented Jan 7, 2022

Ah OK thanks for the info. Off the top of my head I think we have read -t 0 already, which does select().

I don't think declare -f is too hard but I would have to think about it. (We might just print the source code rather than pretty-printing the AST like bash does.)

oil-native should make things faster and smaller! It should be almost as fast as bash to start, and it will use less memory, although I don't know exactly how much. And as mentioned I think we can easily optimize speed by 2x.

There is a rough idea of memory usage here:

https://www.oilshell.org/release/0.9.6/benchmarks.wwz/mycpp-examples/

e.g. on small examples the Python version uses 7.3 MB while the C++ version uses 3.4 MB. That seems like it's going in the right direction but probably not good enough. (hence why I am looking for help!)

How much memory does ble.sh use when run under bash?

I agree oil-native is the important part, and that's why I'm seeking help for it. In the meantime, if you have any feedback about Oil let me know!

@andychu
Copy link
Contributor Author

andychu commented Jan 7, 2022

Hm actually now I realize a good memory benchmark would be source ble.sh in bash and OSH, and compare memory usage. This will even work in oil-native now I think, since it just has to parse and load all the function bodies.

I suspect it will be a lot higher because Oil has a very detailed representation of the code, and it does eager parsing inside $(( )) and ${}, unlike bash. But it will be interesting to see how much. Also I'm sure it can be reduced 2x with optimization, although maybe it's 10x higher already ...

@akinomyoga
Copy link
Collaborator

akinomyoga commented Jan 8, 2022

Ah OK thanks for the info. Off the top of my head I think we have read -t 0 already, which does select().

Oh, OK, I have now confirmed that. I think I need to update the wiki page Running ble.sh With Oil · oilshell/oil Wiki.

I don't think declare -f is too hard but I would have to think about it.

That's nice to hear.

(We might just print the source code rather than pretty-printing the AST like bash does.)

Yeah, that is also one option.

I remember that the source of the functions in JavaScript which is obtained by .toString() is also the actual source code (instead of the one reconstructed from AST), where the code comments are also preserved. There was an interesting trick that one writes the documentation of JavaScript functions in the first comment in the function body so that it can be extracted from func.toString(). If one writes sufficient information in the comments in the function body, that could be used for some kind of reflection.

How much memory does ble.sh use when run under bash?

Actually, it depends on the Bash version. The memory use of ble.sh is larger for recent Bash. For example, the latest ble.sh uses about 50 MB in the latest Bash. Actually, this is one of the recent problems of ble.sh.

Hm actually now I realize a good memory benchmark would be source ble.sh in bash and OSH, and compare memory usage.

I have tried it.

Bash version plain ble.sh (2020-07)
only source / +module+attach
ble.sh (2022-01)
only source / +module+attach
Bash-3.2 3.6 MB 14 MB / 17 MB 15 MB / 19 MB
Bash-4.3 3.8 MB 17 MB / 29 MB 23 MB / 37 MB
Bash-5.1 4.3 MB 21 MB / 37 MB 27 MB / 47 MB
osh-0.9.6 15 MB 76 MB / n/a 90 MB / n/a

In the previous reply, I wrote a few hundred megabytes used by osh in my memory, but that was wrong. (Actually, I have slightly modified ble.sh for osh to suppress the memory use. I think I remember the footprint before that modification.) The memory increase of osh is roughly about (76 - 15) / (21 - 4.3) = 3.653x bash. If ble.sh is fully loaded and activated, I think the footprint of osh will surpass 100 MB: 15 + 3.653 * (47 - 4.3) = 171 MB with a simple extrapolation.

@andychu
Copy link
Contributor Author

andychu commented Jan 8, 2022

Great, thank you for trying it! This is very helpful. I filed #1070 to automate this -- it will help with the C++ translation, which is the focus of the upcoming year.

Actually it's better than I thought -- osh-0.9.6 creates a huge number of Python classes, which you can see with osh -n myscript.

Python objects are very large:

$ python2

class C(object):
  pass

>>> x = C()

>>> sys.getsizeof(x)
64

>>> sys.getsizeof(x.__dict__)
280

So the translation to C++ should reduce that by a lot -- a pointer will be 8 bytes, an integer will be 4 bytes, etc. It's statically laid out rather than dynamic with a __dict__.

I think we can get within 2x of bash on the first try, and then optimize it with enough effort.

@andychu
Copy link
Contributor Author

andychu commented Jan 8, 2022

Also Oil has something like the JavaScript feature you're talking about. It uses ### for doc comments:

http://www.oilshell.org/blog/2021/09/multiline.html#reminder-doc-comments-with

This works with traditional shell functions as well as Oil proc:

f() {
  ### my comment
  echo
}

pp proc f  # pretty print a table

You could use these in ble.sh right now because the syntax is compatible with bash. That is, you can use Oil as a documentation extractor if that proves useful.

Although now I see you use a ## as a prefix, and it has multiple lines. Hm let me think about this.

https://github.com/akinomyoga/ble.sh/blob/master/lib/core-complete.sh

What kind of metadata would you want to put in there?


Anyway, I welcome more feedback on Oil. I hope it will converge with ble.sh at some point, and we both agree C++ translation is important.

Also, I realized that compat_array which you implemented, to make ${a} equal to ${a[0]} should be the default (we should change it to strict_array and opt in). Even though that behavior is confusing, multiple shells agree on it (I think), and Oil has a separate array syntax that behaves more conventionally (const x = a[i], etc.).

This is #967

Also I remember you know quite a bit about C++ too, and I think the translation task could be pretty fun in that respect. Here is a listing of the generated code:

https://www.oilshell.org/release/0.9.6/pub/metrics.wwz/line-counts/oil-cpp.txt

   1366 _tmp/native-tar-test/oil-native-0.9.6/mycpp/gc_heap.h
   2321 _tmp/native-tar-test/oil-native-0.9.6/_build/cpp/runtime_asdl.cc
   4221 _tmp/native-tar-test/oil-native-0.9.6/_build/cpp/syntax_asdl.h
   9999 _tmp/native-tar-test/oil-native-0.9.6/_build/cpp/syntax_asdl.cc
  29060 _tmp/native-tar-test/oil-native-0.9.6/_devbuild/gen/osh-lex.h
  32805 _tmp/native-tar-test/oil-native-0.9.6/_build/cpp/osh_eval.cc
  93228 total

And this already passes 1131 out of 1900 spec tests! It can run those fibonacci and bubble_sort examples, etc. I think it should be able to source ble.sh, or at least the old copy. I will try that out.


Actually what is the reason for the n/a in the table? Is ble.sh now using a feature that OSH cannot source? Or was there a regression?

(I didn't look at why the unit test numbers went down, but I was a little surprised by that, since I don't know of any OSH regressions. It is very highly tested. I may dig into it more)

@andychu
Copy link
Contributor Author

andychu commented Jan 8, 2022

Thinking about the doc comments a little more, I think it wouldn't be that hard to support ## as well as ### and then add another column for pre_doc_comment in addition to post_doc_comment

osh$ f() {
> ### hi
> echo
> }
osh$ pp proc f
proc_name       doc_comment
f       hi

The strings are encoded in QSN format, so this can be parsed and used by a doc generator. I'm not sure if this is important for ble.sh, but just mentioning it in case. I guess it is probably not that hard to make a custom parser but I think it is nice if the language can do it for you.

@akinomyoga
Copy link
Collaborator

Also Oil has something like the JavaScript feature you're talking about. It uses ### for doc comments:

Oh, pp proc f looks very useful although I currently don't have a plan to use them from ble.sh itself. I just remember some use cases in JavaScript. The purpose of document comments in ble.sh (starting from ##) is purely notes for me (at least currently).

What kind of metadata would you want to put in there?

I usually put 1) what arguments the shell function accepts, 2) what variable the shell function accesses through the dynamic scoping, and 3) the description of exit status, along with the details of each data format, etc. I actually don't think the short description of the function is so important because I feel that that information should be usually represented by the function name,

Actually what is the reason for the n/a in the table? Is ble.sh now using a feature that OSH cannot source? Or was there a regression?

"n/a" for +module+activate: Yes, there are still some syntaxes that OSH fails to parse in modules. It's not a regression because I was actually aware back then. These were related to or relying on the module for the syntax analysis that is designed for Bash syntax but not for Osh syntax, so I didn't make much effort to find out the causes and to report it here at that time.

"n/a" for ble.sh (2022-01): Just because I haven't tried it yet. First of all, ble.sh cannot be loaded in osh without modifications and there were already many modifications to the osh branch of ble.sh in 2020-07. I first need to rebase these changes to the latest version of ble.sh. In addition, the codebase is largely changed so I guess there will be again many new problems (just as in #653). I did know that there are still many errors in parsing other modules of ble.sh (including syntax analysis), so I don't think all the incompatibility of osh is solved by the discussion in #653. I haven't tried it, but I anticipate that we need efforts to make the latest version of ble.sh loadable in osh.

(I didn't look at why the unit test numbers went down, but I was a little surprised by that, since I don't know of any OSH regressions. It is very highly tested. I may dig into it more)

I'm not sure what you mean by regression, but any behavioral changes could affect the results. It's not surprising to me that the changes to osh behavior affect the result of tests.

Thinking about the doc comments a little more, I think it wouldn't be that hard to support ## as well as ### and then add another column for pre_doc_comment in addition to post_doc_comment

I actually don't use the document comments from inside the script. Maybe I do some static analysis in the future, but I don't think I will do it by the script but would rather write a program in C++ or other languages.

@akinomyoga
Copy link
Collaborator

akinomyoga commented Jan 10, 2022

ble.sh (2022-01) for osh

I have finished the rebasing of the branch osh on the latest master.

  • read -e (edit line using ble.sh), which had worked with osh-0.8.3, does not work now with osh-0.9.6.
  • The footprint after sourcing ble.osh (2022-01) is 90 MB (I have updated the table in the above comment).

Test results

Test summary in osh-0.9.6. The full log is here: blesh-test.txt [ Note: this text file contains escape sequences for color codes, so I recommend opening it with less -r in a terminal ].

[section] ble/main: 16/19 (3 fail, 0 crash, 0 skip)
[section] ble/util: 985/1193 (208 fail, 0 crash, 0 skip)
[section] ble/canvas/trace (relative:confine:measure-bbox): 0/0 (0 fail, 0 crash, 17 skip)
[section] ble/decode: 33/33 (0 fail, 0 crash, 0 skip)
  • Many errors are caused by missing declare -f for the function definition.

For comparison, this is the expected result generated by Bash:

[section] ble/main: 19/19 (0 fail, 0 crash, 0 skip)
[section] ble/util: 1193/1193 (0 fail, 0 crash, 0 skip)
[section] ble/canvas/trace (relative:confine:measure-bbox): 17/17 (0 fail, 0 crash, 0 skip)
[section] ble/canvas/trace (cfuncs): 18/18 (0 fail, 0 crash, 0 skip)
[section] ble/canvas/trace (justify): 24/24 (0 fail, 0 crash, 0 skip)
[section] ble/canvas/trace-text: 11/11 (0 fail, 0 crash, 0 skip)
[section] ble/textmap#update: 5/5 (0 fail, 0 crash, 0 skip)
[section] ble/unicode/GraphemeCluster/c2break: 72/72 (0 fail, 0 crash, 0 skip)
[section] ble/unicode/GraphemeCluster/c2break (GraphemeBreakTest.txt): 3251/3251 (0 fail, 0 crash, 0 skip)
[section] ble/decode: 33/33 (0 fail, 0 crash, 0 skip)

@andychu
Copy link
Contributor Author

andychu commented Jan 11, 2022

Great thank you for testing it! It's good to see some progress with 985/1193 vs. 895/1011.

I'm not sure what happened with read -e, but I think we can easily implement it.

90 MB doesn't seem too bad considering how large the Python objects are. I'm looking forward to testing out oil-native in #1070. I'll let you know when we do that.


Although I also wonder what method you use to get 3.8 MB "plain" for bash 4.3 ?

I'm trying to compare plain bash vs. OSH here too. The way I do it is $sh -c 'sleep 0.001; cat /proc/$$/status' and then look at VmRSS and VmPeak.

https://github.com/oilshell/oil/blob/master/benchmarks/vm-baseline.sh#L39

Actually bash's peak seems to be higher than OSH which seems wrong because of the Python issue, but I can't see any bug in the benchmark. (This chart doesn't include oil-native.)

https://www.oilshell.org/release/0.9.0/benchmarks.wwz/vm-baseline/

This seems reasonable but maybe there is a better way, and a way to make our measurements more consistent.

@andychu
Copy link
Contributor Author

andychu commented Jan 11, 2022

Wow, I found the bug in the benchmark !!!

https://github.com/oilshell/benchmark-data/tree/master/vm-baseline/spring.2021-12-29__22-16-29

Notice that the process name is actually cat for OSH and zsh. But not for bash. This is because I optimized OSH to exec the last process, instead of fork + exec.

(Shells do this to varying degrees; I noticed yash is very optimal in this regard)

If I run this, it shows that bash creates 3 different processes for that command.

sh=bash; strace -o _tmp/$sh -ff -- $sh -c 'sleep 0.1; cat /proc/$$/status'; ls _tmp/$sh.*

But I you run that with sh=zsh or sh=osh, then only 2 processes are created. So "lose" the shell process image and are instead measuring the cat process image, which has the same PID as the shell!

OK I will fix this ...

@andychu
Copy link
Contributor Author

andychu commented Jan 11, 2022

Yup in this old release before OSH was optimized, you can see the memory is correctly measured ... (but zsh is still incorrectly measured, since it is optimized)

https://www.oilshell.org/release/0.7.0/benchmarks.wwz/vm-baseline/

Compared to the latest one:

https://www.oilshell.org/release/0.9.6/benchmarks.wwz/vm-baseline/

@akinomyoga
Copy link
Collaborator

akinomyoga commented Jan 11, 2022

38. Regression ${var@a} with compat_array

I'm not sure what happened with read -e, but I think we can easily implement it.

Oh, sorry for my confusing writing. read -e is actually not about builtin read but a function read implemented by ble.sh which emulates the behavior of read -e of Readline.

In osh-0.8.3 0.8.pre5, the emulation of read -e by ble.sh (2020-07) worked. In osh-0.9.6, the emulation of read -e by ble.sh (both 2020-07 and 2022-01 versions) fails in its initialization stage.

I've checked what causes the regression. I'm not sure if this is the only regression, but at least ${var@a} introduced in #690 now doesn't work when compat_array is turned on. See the following example

$ osh
osh-0.9.6$ declare -A alpha=(['1']=2)
osh-0.9.6$ echo ${alpha@a}
A
osh-0.9.6$ shopt -s compat_array
osh-0.9.6$ echo ${alpha@a}

osh-0.9.6$

The way that I used to measure the footprint

Although I also wonder what method you use to get 3.8 MB "plain" for bash 4.3 ?

I'm measuring it for an interactive Bash process, which means that 3.8 MB also includes the memory used by GNU Readline which is not loaded in non-interactive Bash processes. Since the history size also affects the footprint, I specify an empty file for the history file. I'm looking at RSS because that is what I care about when I have many interactive Bash processes in a terminal multiplexer. In this regard, I haven't cared about an instantaneous peak of the memory use.

$ HISTFILE=empty.txt bash-4.3 --norc
$ ps -o rss $$
  RSS
 3760

andychu pushed a commit that referenced this issue Jan 12, 2022
Fixes issue tickled by ble.sh, mentioned in #1069.
@andychu
Copy link
Contributor Author

andychu commented Jan 12, 2022

Ah thank you, I just fixed that bug! It was introduced during a refactoring.

Thanks for your help -- I think having the ble.sh tests run will be useful, and the memory benchmark as mentioned.

I'm working on publicizing this "compiler engineer" position to accelerate the project:

https://github.com/oilshell/oil/wiki/Compiler-Engineer-Job

I spent a lot of time debugging the C++ runtime, and it's about half done, but needs more hands on it !

@andychu
Copy link
Contributor Author

andychu commented Jan 20, 2022

Just making a note about running ble tests in our CI:

ble.sh: Insane environment: $USER is empty.
ble.sh: modified USER=uke
  ble/function#suppress-stderr:ble/util/is-stdin-ready
  ^~~~~~~~~~~~
[ eval word at line 4022 of 'out/ble.osh' ]:1: 'ble/function#suppress-stderr:ble/util/is-stdin-ready' not found
  exec 30>&- 30< /dev/tty
             ^~~
[ eval word at line 4403 of 'out/ble.osh' ]:1: Can't open '/dev/tty': No such device or address
Error closing descriptor 30: Bad file descriptor
osh I/O error: Bad file descriptor

It seems like there is a problem with our container; it doesn't like accessing /dev/tty as a non-root user? Not sure how that is usually set up.

$ soil/images.sh cmd ovm-tarball bash -c 'exec 30>&- 30< /dev/tty; echo hi'
bash: /dev/tty: No such device or address
hi

On my host machine, it's not in the root group? It's in the tty group?

$ soil/images.sh cmd ovm-tarball bash -c 'ls -l /dev/tty'
crw-rw-rw- 1 root root 5, 0 Jan 20 07:21 /dev/tty
andy@lenny ~/git/oilshell/oil (dev/andy-31)$ ls -l /dev/tty
crw-rw-rw- 1 root tty 5, 0 Jan 20 02:18 /dev/tty

@akinomyoga
Copy link
Collaborator

akinomyoga commented Jan 20, 2022

  ble/function#suppress-stderr:ble/util/is-stdin-ready
  ^~~~~~~~~~~~
[ eval word at line 4022 of 'out/ble.osh' ]:1: 'ble/function#suppress-stderr:ble/util/is-stdin-ready' not found

This error is caused by declare -f not outputting the definition of the function. There are many other instances of the codes using declare -f in the current ble.sh. I don't want to write workarounds for every place that uses declare -f. Actually, currently I'm waiting for declare -f being implemented.


  exec 30>&- 30< /dev/tty
             ^~~
[ eval word at line 4403 of 'out/ble.osh' ]:1: Can't open '/dev/tty': No such device or address
Error closing descriptor 30: Bad file descriptor
osh I/O error: Bad file descriptor

It seems like there is a problem with our container; it doesn't like accessing /dev/tty as a non-root user? Not sure how that is usually set up.

I don't think the permission of /dev/tty is related. /dev/tty is only available when a TTY/PTY is allocated to the process-group session. I don't know how the CI tests of oil are run, but if it is performed in a Docker container, it seems there is an option -t for docker run or docker exec.

Name, shorthand Default Description
--tty-t Allocate a pseudo-TTY

andychu pushed a commit that referenced this issue Jan 20, 2022
Now the tests run to completion.

    [section] ble/main: 16/19 (3 fail, 0 crash, 0 skip)
    [section] ble/util: 999/1193 (194 fail, 0 crash, 0 skip)
    [section] ble/canvas/trace (relative:confine:measure-bbox): 0/5 (5 fail,
    0 crash, 12 skip)
    [section] ble/canvas/trace (cfuncs): 0/0 (0 fail, 0 crash, 18 skip)
    [section] ble/decode: 33/33 (0 fail, 0 crash, -1 skip)

Part of #1069.
@andychu
Copy link
Contributor Author

andychu commented Jan 20, 2022

Oh great point, thank you! I am just starting to use Docker/podman.

With the -t flag, the test run, and it looks like this, which looks about right (although I wonder why ble/decode says -1 skip):

    [section] ble/main: 16/19 (3 fail, 0 crash, 0 skip)
    [section] ble/util: 999/1193 (194 fail, 0 crash, 0 skip)
    [section] ble/canvas/trace (relative:confine:measure-bbox): 0/5 (5 fail,
    0 crash, 12 skip)
    [section] ble/canvas/trace (cfuncs): 0/0 (0 fail, 0 crash, 18 skip)
    [section] ble/decode: 33/33 (0 fail, 0 crash, -1 skip)

@akinomyoga
Copy link
Collaborator

akinomyoga commented May 14, 2023

I decided to assign a sequential number continuing from #653 (I renumbered the above regression for ${var@a}). I think I might be going to raise many issues again, but they can easily be forgotten if I directly write them in the threads or new conversations of Zulip. It is easier for me to manage the list here. If you would like to discuss them on Zulip, we can leave the link to the Zulip conversation in this issue.

39. NYI: EPOCHREALTIME, SECONDS, and other dynamic variables

ble.sh uses it. I believe it is not so difficult to support. We can add the new if branch in Mem.GetValue (core/state.py). There are also EPOCHSECONDS, RANDOM, and SRANDOM.

The implementation of RANDOM seemed to be once attempted in #253 (with also a mention of SECONDS), but I don't think the approach is so correct. The approach #253 seems to add a new grammar for $RANDOM just like $*, $#, etc., but the parsing rule for these special variable names is exactly the same as the normal variables. There are already existing dynamic variables such as FUNCNAME, so we should consider implementing them the same way as FUNCNAME, etc.

@akinomyoga
Copy link
Collaborator

40. BUG: a=(declare v); "${a[@]}" fails

An associative array cannot be initialized by "${a[@]}", where a contains the command to initialize it.

$ bash -c 'a=(declare -A assoc); "${a[@]}"; declare -p assoc'
declare -A assoc
$ osh -c 'a=(declare -A assoc); "${a[@]}"; declare -p assoc'
  a=(declare -A assoc); "${a[@]}"; declare -p assoc
                        ^
[ -c flag ]:1: Can't run assignment builtin recursively

It doesn't even work for a scalar declaration.

$ bash -c 'a=(declare v); "${a[@]}"; declare -p v'
declare -- v
$ osh -c 'a=(declare v); "${a[@]}"; declare -p v'
  a=(declare v); "${a[@]}"; declare -p v
                 ^
[ -c flag ]:1: Can't run assignment builtin recursively

Preferably, it should also work for array assignments, etc.

$ bash -c 'a=(declare -a "arr=(1 2 3)"); "${a[@]}"; declare -p arr'
declare -a arr=([0]="1" [1]="2" [2]="3")

@akinomyoga
Copy link
Collaborator

akinomyoga commented May 15, 2023

41. BUG: [[ -c /dev/null ]] fails in C++ version

As I have already written in the Zulip conversation at the following link, it fails with an assertion error.
https://oilshell.zulipchat.com/#narrow/stream/121539-oil-dev/topic/ble.2Esh.20Testing

$ osh -c '[[ -c /dev/zero ]]'
osh: cpp/osh.cc:129: bool bool_stat::DoUnaryOp(id_kind_asdl::Id_t, Str*): Assertion `false' failed.

With some workarounds, I could run the tests. I attach the full log: test-util.txt. The test summary reads

 84.2% [section] ble/main: 16/19 (3 fail, 0 crash, 0 skip)
 85.9% [section] ble/util: 1056/1228 (172 fail, 0 crash, 6 skip)
100.0% [section] ble/canvas/trace (relative:confine:measure-bbox): 5/5 (0 fail, 0 crash, 12 skip)
100.0% [section] ble/canvas/trace (cfuncs): 13/13 (0 fail, 0 crash, 5 skip)
100.0% [section] ble/decode: 33/33 (0 fail, 0 crash, 0 skip)

The expected results with Bash are as follows. Some of the tests for ble/canvas are not even run above:

$ bash --rcfile bashrc.test-util
100.0% [section] ble/main: 19/19 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/util: 1228/1228 (0 fail, 0 crash, 6 skip)
100.0% [section] ble/canvas/trace (relative:confine:measure-bbox): 5/5 (0 fail, 0 crash, 12 skip)
100.0% [section] ble/canvas/trace (cfuncs): 17/17 (0 fail, 0 crash, 1 skip)
100.0% [section] ble/canvas/trace (justify): 2/2 (0 fail, 0 crash, 28 skip)
100.0% [section] ble/canvas/trace-text: 11/11 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/textmap#update: 5/5 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/unicode/GraphemeCluster/c2break: 77/77 (0 fail, 0 crash, 0 skip)
---.-% [section] ble/unicode/GraphemeCluster/c2break (GraphemeBreakTest.txt): 0/0 (0 fail, 0 crash, 3251 skip)
100.0% [section] ble/decode: 33/33 (0 fail, 0 crash, 0 skip)

@akinomyoga
Copy link
Collaborator

akinomyoga commented Feb 22, 2025

54. BUG: shopt -u expand_aliases fails and stops execution

There are use cases. It shouldn't fail. It shouldn't cause warning messages either.

(edit: The original test case was broken. I updated the test case as I reported at #1069 (comment)).

$ cat expand_aliases.sh
#!/usr/bin/env bash

alias echo=false

function f {
  shopt -u expand_aliases
  eval -- "$1"
  shopt -s expand_aliases
}

f 'echo hello'
$ osh expand_aliases.sh
    shopt -u expand_aliases
    ^~~~~
expand_aliases.sh:6: fatal: Syntax options must be set at the top level (outside any function)

@andychu
Copy link
Contributor Author

andychu commented Feb 22, 2025

on 52. - builtin local

I think it was always like this? But many people have ran into it, so I think we should fix it

IIRC my issue was that the word splitting is different when unquoted, e.g. consider

local x=$y  # no word splitting of y
builtin local x=$y  # yes word splitting

however all shells behave like this IIRC, so it can be OK to do .... not 100% sure this is the issue, but I think we can do something

on 53. LHS array parsing

Yes we are taking a shortcut here -- I think only "word" chars are allowed, like a[1+2]=3

if we can can come up with good algorithm, we could change it

Although I think this is the first report I've heard of it

on 54. func warning

I think this is about the func warning? Not about aliases? let me see

@andychu
Copy link
Contributor Author

andychu commented Feb 22, 2025

Yeah I think 54. is about the func keyword. If you change it to myfunc, it should work

It is a bit special, like bash takes coproc as a keyword

@akinomyoga
Copy link
Collaborator

akinomyoga commented Feb 22, 2025

Re: 52. Regression: \builtin local v=1 fails

I think it was always like this? But many people have ran into it, so I think we should fix it

No, it's not about builtin local. The problem happens with \builtin local but not with builtin local. This is not a problem of word splitting.

Re: 54.

Yeah I think 54. is about the func keyword. If you change it to myfunc, it should work

It is a bit special, like bash takes coproc as a keyword

Ah, I'm sorry. I broke the test case while reducing. The original problem can be reproduced with this one (which happens with a function name e.g. f):

$ cat expand_aliases.sh
#!/usr/bin/env bash

alias echo=false

function f {
  shopt -u expand_aliases
  eval -- "$1"
  shopt -s expand_aliases
}

f 'echo hello'
$ osh expand_aliases.sh
    shopt -u expand_aliases
    ^~~~~
expand_aliases.sh:6: fatal: Syntax options must be set at the top level (outside any function)

@andychu
Copy link
Contributor Author

andychu commented Feb 22, 2025

I filed #2260

Yes the solution to func is to quote it:

$ osh -c 'function func { echo hi; }; func'
  function func { echo hi; }; func
                              ^~~~
[ -c flag ]:1: func is a YSH keyword, but this is OSH.

$ osh -c 'function func { echo hi; }; "func"'
hi
$ bash -c 'function coproc { echo hi; }; coproc'
bash: -c: line 1: syntax error near unexpected token `newline'
bash: -c: line 1: `function coproc { echo hi; }; coproc'

$ bash -c 'function coproc { echo hi; }; "coproc"'
hi

( BTW I would like to turn on func in OSH, so it will be parsed differently)

@andychu
Copy link
Contributor Author

andychu commented Feb 22, 2025

expand_aliases.sh:6: fatal: Syntax options must be set at the top level (outside any function)

Oh OK I see

I'll look into that, I'm not sure if that is new or not

@andychu
Copy link
Contributor Author

andychu commented Feb 22, 2025

OK I see that it makes sense when yo'ure doing eval or source !! OK so maybe that is overly aggressive

Maybe we have shopt --set strict_parse or something

  • for OSH - it's off
  • for YSH - it's on
    • in YSH, we generally use reflection on blocks, not string processing like eval. For example p { echo hi }

andychu pushed a commit that referenced this issue Feb 23, 2025
- For ble.sh compatibility with shopt -u expand_aliases - #1069
- the check worked in shell functions, but was broken in proc/func
@andychu
Copy link
Contributor Author

andychu commented Feb 23, 2025

BTW I just disabled the check for expand_aliases, so that should no longer be a problem. (The check was subtly broken for proc/func - we could add it back later in YSH only, if it comes up)

I will work on builtin local x=$y as well

@akinomyoga
Copy link
Collaborator

akinomyoga commented Feb 23, 2025

Thank you.

55. Memory footprint 1.2 GB

By the way, I noticed that the memory footprint becomes extremely large when eval_unsafe_arith is turned on in the C++ version. This is the result for sourcing ble.osh.

Python version C++ version Bash 5.3-beta (for ref)
Without eval_unsafe_arith 69.23 MB (41.64s) 157.9 MB (6.364s) 14.77 MB (0.178s)
With eval_unsafe_arith 69.17 MB (41.45s) 1.207 GB (6.421s) N/A
(Detailed commands and results)

With eval_unsafe_arith

$ cat oshrc.source
#!/usr/bin/env osh
HISTFILE=$HOME/.osh_history
shopt -s eval_unsafe_arith
source out/ble.osh --norc
echo "RSS: $(ps -o rss -p $$ | tail -n 1)" >&2
exit
$ ~/.mwg/git/oilshell/oil/bin/osh --rcfile oshrc.source
  shopt -s checkwinsize
  ^~~~~
out/ble.osh:2815: 'shopt' got invalid option 'checkwinsize'
osh warning: The 'RETURN' hook isn't implemented
RSS: 69168
[ble: elapsed 41.452s, CPU 99.8% (self)usr2ms/sys2ms+(child)usr41.109/sys264ms]
$ ~/.mwg/git/oilshell/oil/_bin/cxx-asan/osh --rcfile oshrc.source
  shopt -s checkwinsize
  ^~~~~
out/ble.osh:2815: 'shopt' got invalid option 'checkwinsize'
osh warning: The 'RETURN' hook isn't implemented
RSS: 1207040
[ble: elapsed 6.421s, CPU 99.5% (self)usr1ms/sys3ms+(child)usr5.824/sys570ms]

Without eval_unsafe_arith

$ cat oshrc.source
#!/usr/bin/env osh
HISTFILE=$HOME/.osh_history
#shopt -s eval_unsafe_arith
source out/ble.osh --norc
echo "RSS: $(ps -o rss -p $$ | tail -n 1)" >&2
exit
$ ~/.mwg/git/oilshell/oil/bin/osh --rcfile oshrc.source
  shopt -s checkwinsize
  ^~~~~
out/ble.osh:2815: 'shopt' got invalid option 'checkwinsize'
osh warning: The 'RETURN' hook isn't implemented
RSS: 69236
[ble: elapsed 41.640s, CPU 99.8% (self)usr3ms/sys1ms+(child)usr41.288/sys273ms]
$ ~/.mwg/git/oilshell/oil/_bin/cxx-asan/osh --rcfile oshrc.source
  shopt -s checkwinsize
  ^~~~~
out/ble.osh:2815: 'shopt' got invalid option 'checkwinsize'
osh warning: The 'RETURN' hook isn't implemented
RSS: 157896
[ble: elapsed 6.364s, CPU 99.6% (self)usr1ms/sys3ms+(child)usr5.845/sys494ms]

Bash 5.3-beta

$ cat bashrc.source
#!/usr/bin/env bash

HISTFILE=$HOME/.mwg/git/oilshell/oil/tmp/bash_history
source out/ble.sh --norc --bash-debug-version=ignore
echo "RSS: $(ps -o rss -p $$ | tail -n 1)" >&2
exit
$ bash --rcfile bashrc.source
RSS: 14796
[ble: elapsed 178ms, CPU 100.5% (self)usr2ms/sys2ms+(child)usr149ms/sys28ms]

The same behavior is observed with oils-for-unix compiled from the tarball on the release page.

Why does this happen?

edit: By the way, the C++ version is faster than the Python version, but it is still 36x slower compared to Bash. The memory size is 10x as large as Bash even without eval_unsafe_arith. Hmm, but maybe this is simply because Bash doesn't do the full parsing at the sourcing stage. Bash performs some parsing at the execution step, so the performance should be different for the actual processing.

@andychu
Copy link
Contributor Author

andychu commented Feb 23, 2025

Oh wow

I would have thought the 36x slower is related to the memory usage bug -- i.e. the GC is causing slowness

But it seems like it is slower either way?


I am also surprised that it is 36x slower. Our microbenchmarks are showing we are close to batch speed, like 0.5x - 2x the speed of bash

First thought: I think eval_unsafe_arith lacks garbage collection, and ble.sh is doing a lot of work there? We may have to add some "manual GC points"

There was a similar bug I fixed elsewhere recently ... it caused a lot of slowness and memory usage


Let me reproduce this a bit later today ... thank you for the report!

I suspect this is a bug, not the normal behavior

@andychu
Copy link
Contributor Author

andychu commented Feb 23, 2025

It is definitely a bug if the C++ version is using more memory than Python! That is not the case on microbenchmarks

https://oils.pub/release/0.27.0/benchmarks.wwz/mycpp-examples/

Although I guess the only benchmark that is really testing memory is "containers" ... but the other ones are testing that there are no leaks


I suspect there are GC bugs in both cases ... i.e. if C++ is using more memory than Python, it's because it's not GC'ing as often, in the right places ...

@andychu
Copy link
Contributor Author

andychu commented Feb 23, 2025

~/.mwg/git/oilshell/oil/_bin/cxx-asan/osh --rcfile oshrc.source

Oh actually don't test with cxx-asan !

You should do ninja _bin/cxx-opt/osh, and then run that binary. It is the optimized binary, that's the same as what the tarball builds.

ASAN adds a lot of memory usage, and it also slows the binary down ! We only use it for testing, to catch memory errors

@andychu
Copy link
Contributor Author

andychu commented Feb 23, 2025

There are some very old wiki pages here

https://github.com/google/sanitizers/wiki/addresssanitizer

https://github.com/google/sanitizers/wiki/AddressSanitizerPerformanceNumbers

ASAN definitely uses much more memory, I remember hitting that when testing the GC ... It is not supposed to introduce that much more time overhead, but let's see

@andychu
Copy link
Contributor Author

andychu commented Feb 23, 2025

I guess we don't have a clear place to document cxx-asan and cxx-opt, but it is mentioned a bit here - https://github.com/oils-for-unix/oils/wiki/Oil-Dev-Cheat-Sheet - let me know if we should put it somewhere else

build/ninja_lib.py lists some of the build variants

@akinomyoga
Copy link
Collaborator

akinomyoga commented Feb 23, 2025

~/.mwg/git/oilshell/oil/_bin/cxx-asan/osh --rcfile oshrc.source

Oh actually don't test with cxx-asan !

Ah, yeah. I actually wasn't sure about how I should generate the non-ASAN binary. That's the reason I also tested the behavior with the tarball of 0.27.0 from the release page. As I've already written in #1069 (comment), the same behavior is observed with the release tarball of 0.27.0. It still uses 1.1 GB memory with shopt -s eval_unsafe_arith.

You should do ninja _bin/cxx-opt/osh, and then run that binary.

Without eval_unsafe_arith

$ ~/.mwg/git/oilshell/oil/_bin/cxx-opt/osh --rcfile oshrc.source
  shopt -s checkwinsize
  ^~~~~
out/ble.osh:2815: 'shopt' got invalid option 'checkwinsize'
osh warning: The 'RETURN' hook isn't implemented
[??? no location ???] warning: Invalid start of UTF-8 sequence
[??? no location ???] warning: Invalid start of UTF-8 sequence
RSS: 38244
[ble: elapsed 930ms, CPU 98.9% (self)usr4ms/sys0+(child)usr530ms/sys388ms]

With eval_unsafe_arith

$ ~/.mwg/git/oilshell/oil/_bin/cxx-opt/osh --rcfile oshrc.source
  shopt -s checkwinsize
  ^~~~~
out/ble.osh:2815: 'shopt' got invalid option 'checkwinsize'
osh warning: The 'RETURN' hook isn't implemented
[??? no location ???] warning: Invalid start of UTF-8 sequence
[??? no location ???] warning: Invalid start of UTF-8 sequence
RSS: 1087120
[ble: elapsed 964ms, CPU 98.9% (self)usr3ms/sys2ms+(child)usr538ms/sys413ms]

So, the updated table would be this:

Python version osh (cxx-asan) osh (cxx-opt) Bash 5.3-beta (for ref)
Without eval_unsafe_arith 69.23 MB (41.64s) 157.9 MB (6.364s) 38.24 MB (0.930s) 14.77 MB (0.178s)
With eval_unsafe_arith 69.17 MB (41.45s) 1.207 GB (6.421s) 1.087 GB (0.964s) N/A

OK, the memory footprint without eval_unsafe_arith is smaller than the Python version and only 2x as large as Bash 5.3. Also, the execution time was much improved to be 5x slower compared to Bash. However, with eval_unsafe_arith, it still uses 1.1 GB.

@akinomyoga
Copy link
Collaborator

akinomyoga commented Feb 23, 2025

Note: About Bash 5.3-beta

To be precise, for "Bash 5.3-beta" in the previous replies, I use the devel branch of Bash (which is numbered as Bash 5.3-beta yet contains more recent changes compared to the official tarball of Bash 5.3-beta). In addition, memory debugging code is turned on in Bash 5.3-beta, so I turn it off by using the following configure option:

$ cd ~/prog/ext/bash-dev
$ ./config.status --config
--with-bash-malloc=no --prefix=/home/murase/.opt/bash/dev 'CFLAGS=-march=native -O3'

@akinomyoga
Copy link
Collaborator

56: BUG: declare -p unset does not print any error message

I noticed this before, but I haven't reported so far. However, this is very confusing. Bash prints the error message when the specified variable name is not found. OSH fails with exit status 1, but it prints nothing.

$ bash -c 'declare -p a'
bash: line 1: declare: a: not found
$ echo $?
1
$ osh -c 'declare -p a'
$ echo $?
1
$

57. BUG: variable v is invisible after IFS= eval 'local v=...'

Now, ble.sh seems to load fine, so I tried to run ble.sh''s emulation of the read` command, but nothing is output. This was caused by this OSH bug.

$ bash -c 'function f { IFS= eval "local v=\"\$*\""; declare -p v; }; f h e l l o'
declare -- v="hello"
$ osh -c 'function f { IFS= eval "local v=\"\$*\""; declare -p v; }; f h e l l o'
$

The variable is defined in a temporary variable scope and disappears after the eval command finishes. However, local v should create a variable in the function scope.

This problem does not happen when no tmpenv is specified (though the result is different from the desired one as expected):

$ osh -c 'function f { eval "local v=\"\$*\""; declare -p v; }; f h e l l o'
declare -- v='h e l l o'

This problem does not happen when eval is not used (though the result is different from the desired one as expected):

$ osh -c 'function f { IFS= local v="$*"; declare -p v; }; f h e l l o'
declare -- v='h e l l o'

@akinomyoga
Copy link
Collaborator

58. BUG: mapfile -t stops reading the input with an empty line

$ bash -c 'mapfile -t lines; declare -p lines' <<< $'hello\n\nworld'
declare -a lines=([0]="hello" [1]="" [2]="world")
$ osh -c 'mapfile -t lines; declare -p lines' <<< $'hello\n\nworld'
declare -a lines=(hello)

Also, it doesn't preserve the initial empty line or the ending newline.

$ bash -c 'mapfile -t lines; declare -p lines' <<< $'\nhello'
declare -a lines=([0]="" [1]="hello")
$ bash -c 'mapfile -t lines; declare -p lines' <<< $'hello\n'
declare -a lines=([0]="hello" [1]="")
$ osh -c 'mapfile -t lines; declare -p lines' <<< $'\nhello'
declare -a lines=()
$ osh -c 'mapfile -t lines; declare -p lines' <<< $'hello\n'
declare -a lines=(hello)
$

@akinomyoga
Copy link
Collaborator

59. Assigning Str to BashArray/BashAssoc removes BashArray/BashAssoc

$ bash -c 'ret=(1 2 3); ret=3; declare -p ret'
declare -a ret=([0]="3" [1]="2" [2]="3")
$ osh -c 'ret=(1 2 3); ret=3; declare -p ret'
declare -- ret=3
$

@andychu
Copy link
Contributor Author

andychu commented Mar 2, 2025

Thanks for finding these issues!

  1. we should add this error message
  2. I am not sure why this is the case - I thought this might be related to a case where we chose dash-like tmpenv rather than bash-like? I remember in 2020 or so we discovered that shells behaved very differently with tmpenv. Or maybe that was unset with tmpenv? I think we need a failing spec test case to see what other shells do. If OSH disagrees with other shells (not just bash), then it's a bug
  3. Ah sorry I think mapfile -t is just not implemented, we can do that
  4. To me this behavior is fairly confusing ... that means we have to check the LHS when doing ret=3 ... gah! I also wonder what other shells do (zsh has arrays). Or maybe it means we need change some APIs in core/state.py

@andychu
Copy link
Contributor Author

andychu commented Mar 2, 2025

I will look at #58 - mapfile -t

Maybe you can add failing cases to spec/ble-idioms for 57 and 59 ? To see what other shells do

And I actually I wonder how many of the above issues are still open. Do we have 50 left or 30 or 20 ?? :-) It would be nice if we can collect the failing test cases (although I know it's sometimes hard to find them all)


I also remember builtin local, we can do something about that I think

@akinomyoga
Copy link
Collaborator

And I actually I wonder how many of the above issues are still open. Do we have 50 left or 30 or 20 ?? :-)

I keep track of everything needed to run ble.sh in https://github.com/oils-for-unix/oils/wiki/Running-ble.sh-With-Oil.

I also remember builtin local, we can do something about that I think

builtin local is fine for ble.sh. What's not working is \builtin local.

@andychu
Copy link
Contributor Author

andychu commented Mar 2, 2025

OK thanks for keeping it up to date!

Related, this was just reported by a user:

On the page, there is a bug for both \builtin local, and another one for builtin declare, which is tagged with


I think we should probably do something about this, but I am not sure what exactly yet ...

I kinda liked that control flow was static rather than dynamic, i.e. break continue return and even exit ... It relates to a thread "Which languages features need a static subset?"

But yeah I guess we probably disagree with every shell there

@akinomyoga
Copy link
Collaborator

On the page, there is a bug for both \builtin local, and another one for builtin declare, which is tagged with

I checked the original report of item 22. The command that caused the error was builtin declare a=1, which is now fixed in the current version of osh. I think I can mark item 22 as solved.

$ osh -c 'builtin declare a=1; declare -p a'
declare -- a=1

On the other hand, as far as I see #1069 (comment), I think what you refer to with "builtin local" or "builtin declare" is word splitting performed with builtin local and builtin declare.

$ bash -c 'x="1 b=2"; builtin declare a=$x; declare -p a b'
declare -- a="1"
declare -- b="2"
$ osh -c 'x="1 b=2"; builtin declare a=$x; declare -p a b'
declare -- a='1 b=2'

However, ble.sh doesn't use this for now, so I haven't included it on the list.

andychu pushed a commit that referenced this issue Mar 2, 2025
Caused by a broken underlying ReadLineSlowly()

This is issue #2287, reported originally with ble.sh in #1069.
@andychu
Copy link
Contributor Author

andychu commented Mar 2, 2025

I fixed mapfile, that was a bad bug -- recorded in

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants