-
Notifications
You must be signed in to change notification settings - Fork 0
mbucc/erlfmt
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Read Erlang source on stdin, reformat with erl_tidy, and write to stdout. No options, you get what erl_prettypr:format/2 gives you. Back story The problem seemed simple: read Erlang source code on stdin and send a nicely formatted result to stdout. In vim, I got used to this when I was learning Go: :%!gofmt Just type the code without any concern for formatting, and then format the entire file in one command. I now have xmlfmt, jsfmt, javafmt and I wanted a similar utility for Erlang. I started with erl_tidy, which Tidies and pretty-prints Erlang source code, removing unused functions, updating obsolete constructs and function calls, etc. Seems perfect ... except that erl_tidy does not read from stdin. I posted a question to the Erlang questions mailing list asking if I missed something and the answer was a clear no. People pointed me to alternative approaches: an Emacs elisp module, a rebar3 module that wraps erl_tidy, and a state-machine/parser written in Erlang that the vim Erlang module uses. The first two didn't solve my problem, and I didn't like the complexity of the third approach. First attempt: use basic Erlang modules While Googling around, I found a short post showing how you can format Erlang code using erl_scan (string -> tokens), erl_parse (tokens -> form) and erl_pp (form -> string). So I started coding that up. It worked great. Initially. The first problem I hit was that erl_parse:parse_form/1 does not handle white space or comment tokens. OK, no problem. To keep moving forward I implemented the quick hack of only dropping comments that came inside a form, and keeping the ones that came before. Not great, but I wanted to get something working. The next problem I hit was pre-processor constructs. It turns out that erl_parse does not understand pre-processor bits either (macros, imports, etc). So, another hack: when we hit a dot-terminated token sequence that includes a pre-processor construct, just print out the raw text that generated those tokens and leave it at that. When that was done, I decided dropping comments was not acceptable. While searching around for how to re-insert comments, I came across a reference to the epp_dodger module and a function that re-inserts comments. Both of which is used by erl_tidy. It seemed stupid to re-write erl_tidy so I went back to square one. Second attempt: teach erl_tidy about stdin Erlang can read stdin just fine. In fact, most of the input functions in the io module read from stdin by default. Erlang provides the standard_io atom that you can use for an IODevice argument. I started hacking on erl_tidy, intending to use the "special" file name of a single dash (-) to tell erl_tidy to read from stdin. It was simple to add a new read_module("-", Opts) and pass standard_io to epp_dodger:parse/3. But, the next chunk of erl_tidy logic reads comments from the same file again, which is not possible with stdin, since the stream was already consumed by parsing. I briefly looking to see if I could somehow turn a string into an IODevice (like you can in Java), but nothing turned up. So, it seems like I have to write stdin to a file in order to use erl_tidy. Third attempt: call erl_tidy:file/1 from a shell script. It was trivial to write a shell script that pipes stdin to a file. Calling erl_tidy from the shell script was not. My first attempt looked like this: $ erl -run erl_tidy file $TMPFN which produced: =ERROR REPORT==== 9-May-2016::20:13:30 === erl_comment_scan: bad filename: `['hmmmm_sup.erl']' and hung there, waiting in the Erlang interpreter. It turns out that when use the -run flag and pass it arguments, Erlang assumes the receiving function has one argument---a list. That's why the filename in the error message has brackets around it ... it's an list not a string.. Final attempt: call escript from a shell script The final result is what you see in the repository now. It's so easy to pipe stdin to a file in a shell script that I saw no reason to re-write that in Erlang. So I added a short escript that unpacks the file name from the list and passes that to erl_tidy. The result: I was able to erlfmt all 1,795 source files under /usr/local/Cellar/erlang/18.2.1, with only one error: /usr/local/Cellar/erlang/18.2.1/lib/erlang/lib/wx-1.6/src/gen/gl.erl ./erlfmt: line 6: 25399 User defined signal 2: 31 ./erlfmt.escript "$TMPF" That file is a generated one, and is 971KB in size. I got the above error when I had erl_tidy write the reformatted file to the temporary file. When instead I told erl_tidy to write the result to stdout, I got this error: escript: exception exit: badarg in function erl_tidy:file/2 (erl_tidy.erl, line 295) in call from erl_eval:local_func/6 (erl_eval.erl, line 557) in call from escript:interpret/4 (escript.erl, line 787) in call from escript:start/1 (escript.erl, line 277) in call from init:start_it/1 in call from init:start_em/1 It successfully parsed the other 1,794 files, which gives success rate of 0.99945. Not too hot in the Erlang world, but I don't expect to be formatting files that big. Good enough. The back story back story. Why such a long README? Erlang questions mailing list recently had an interesting thread on code documentation ("rhetorical structure of code"). It was a long thread and a lot of things were said, but one in particular stuck with me: code documentation rarely shows the starts and stops, the litany of failed experiments that you encounter on the way to the final product. And that those failed experiements are sometimes useful in the future.
About
A command-line utility for formatting Erlang code.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published