Skip to content
This repository has been archived by the owner on Jun 26, 2020. It is now read-only.

Port the DSL to Rust #342

Closed
sunfishcode opened this issue May 18, 2018 · 25 comments
Closed

Port the DSL to Rust #342

sunfishcode opened this issue May 18, 2018 · 25 comments
Assignees
Labels
E-compiler-design Open-ended compiler design issues.

Comments

@sunfishcode
Copy link
Member

Cretonne's Python-based DSL allows target ISAs to be described in a concise manner, and can generate tables, code, and documentation from them. It's common for compilers to use DSLs for such purposes; see also LLVM's tablegen, GCC's machine description language, libfirm's perl (example) and python (example), and others.

Python compares pretty favorably to obvious alternatives, in particular by being a real programming language, and a fairly friendly one at that. However, using Python in Cretonne has some downsides too:

  • There's extra overhead for developers that don't work with Python on a regular basis. They may not have mypy, flake8, or python-aware or even mypy-aware IDE plugins installed, they may not remember how variable scoping works, and so on.
  • The current Python code makes use of some "magical" features, such as __getitem__, isinstance, etc. which may be fine for people who work with Python regularly, but which make it harder to follow the code for people who aren't.
  • The current Python code also behaves in "magical" ways sometimes, such as PUT_OP in the x86 target. It's used in Rust code, but it's not a Rust construct; it's edited out by replace_put_op. In some cases, it's replaced by put_mp2, but it's not easy to determine when that name is used, as there are no other occurrences of put_mp2 or mp2 in the repository. As another example, the DSL infers instruction formats from instruction fields implicitly, in a way that makes it confusing when one is adding a new format and accidentally colliding with an existing format.
  • Cretonne has ambitions of being able to compile Rust code in the future, so it would be cooler if it could compile all of itself.

None of these are show-stoppers, and one option is to just keep the current code and refactor it to help out with some of these issues. But we should also consider where we want to be in the long term; it'd be easier to do a large architectural change now, while the code is (relatively) simple, as it'll grow more complex in the future as we add more features. One option is Rust's procedural macros, except that they're not yet stable. A possible intermediate step could be to use a build.rs.

The instruction descriptions are fairly elaborate, so one way to start experimenting with this would be to start by rewriting some of the simper parts, such as just gen_types.py, to get started setting up the infrastructure. gen_types.py's output file is target/debug/build/cretonne-codegen-*/out/types.rs, which is pretty simple and demonstrates what it does.

If anyone's interested in working on this, I'd be happy to mentor.

@sunfishcode sunfishcode added the E-compiler-design Open-ended compiler design issues. label May 18, 2018
@data-pup
Copy link
Member

data-pup commented May 24, 2018

Implementing this in Rust seems like it would be nice, given that the rest of the codebase is in Rust. At least from afar, I can't think of any reasons that would make switching to another language similar to Python beneficial.

Whatever is chosen as the direction forward however, happy to help out with this 😄

@sunfishcode
Copy link
Member Author

Thinking about this more, I think using Rust, with a build.rs approach, is the best one right now.

So as I mentioned above, I think gen_types.py is a good initial goal. And looking at it more closely, gen_types.py depends on srcgen.py. And, srcgen.py is a very simple file that should be very straightforward.

So the very first step here is to make a Rust version of srcgen.py. It's a simple file that just makes it easy to produce Rust source code, managing things like indent levels.

@data-pup Are you interested? :-).

@data-pup
Copy link
Member

I'm in, that sounds fun! I'll start looking into this, and I can follow up if I have further questions. I'll keep you posted in IRC too :o)

@sunfishcode
Copy link
Member Author

Woohoo! The first step, #403, is now merged!

The next logical step is to look at the compiler settings, and in particular settings.rs. It's generated by gen_settings.py, which also generates target-specific settings files, but we can split those out into a separate step. That way, we won't need the isas in this step.

https://github.com/CraneStation/cranelift/blob/master/lib/codegen/meta/base/settings.py shows what settings are all about. Note that SettingsGroup's constructor registers the created object in a static _current variable, and all the settings are automatically collected from Python's globals() at the end. This is an example of the "DSL" approach, and it does make the file a little more streamlined. A less magical approach would be to just add settings to a SettingsGroup object explicitly, which would also be more Rust-friendly.

@data-pup How does this sound to you? :-)

@data-pup
Copy link
Member

I can start looking into this further this weekend, but that sounds good to me! I'll let you know if I have any further questions about that 😺

@lachlansneff
Copy link
Contributor

lachlansneff commented Jul 27, 2018

Procedural macros are stable now. (Or will be soon).

rust-lang/rust#52081

It looks like you can define a procedural macro, but not import it yet.

@data-pup
Copy link
Member

I think the plan is definitely to move towards procedural macros in the future. Currently however, cranelift supports 1.25, which I do not believe would support this? @sunfishcode would know more regarding that, but currently we'll have to work around that :)

@sunfishcode
Copy link
Member Author

That's right. However, if there's a major advantage to switching to procedural macros now, let's talk about it, as we may have options.

@sunfishcode
Copy link
Member Author

Ok, after experimenting with several approaches with cargo publish, it seems we really need to publish the meta crate, so I've now made that change (cranelift-codegen-meta).

The next step here is the settings.rs file, which @data-pup expressed an interest in.

@data-pup
Copy link
Member

data-pup commented Aug 4, 2018

🎉 Yay! Glad we've figured out the publishing thing. I've started looking into the settings.rs thing, I can keep you updated on progress over IRC or here :)

@lachlansneff
Copy link
Contributor

lachlansneff commented Aug 6, 2018

Another way to do this would be to use a dynamic language like dyon (which is written in rust) in build.rs.

@bnjbvr
Copy link
Member

bnjbvr commented Oct 15, 2018

Hey @data-pup, I'd like to help on this issue, but I wouldn't want to step onto your toes. Are you still working on the settings.rs file? Or have you been working on other files? If not, I'll start migrating other stuff, and will make sure to announce it here first.

@data-pup
Copy link
Member

Hi @bnjbvr thanks for checking in! Feel free to start working on this. I have been a little caught up with life things the past couple weeks, and would happily let you take care of this if you have the time :)

@bnjbvr
Copy link
Member

bnjbvr commented Oct 16, 2018

Thanks for letting me know! I'll try to work on the Registers now.

@bnjbvr
Copy link
Member

bnjbvr commented Nov 8, 2018

Work on registers has landed, and there's now basic support for multiple ISAs! I'll now work on settings.

@bnjbvr
Copy link
Member

bnjbvr commented Feb 15, 2019

Settings have partially landed (they're blocked by instruction recipes which use unnamed ISA predicates).

For what it's worth, unless somebody else has started work on this (in which case it'd be nice to show up :)), I'll be working on the rest of the DSL migration to Rust. I am not sure it is easy to distribute this work now, because all the remaining concepts seem to be entangled in some ways (although I suspect there's a subset of code related to binemit that can be migrated separately (Encodings etc), which is what i'm working on now).

@bnjbvr bnjbvr self-assigned this Feb 15, 2019
@bjorn3
Copy link
Contributor

bjorn3 commented Feb 27, 2019

because all the remaining concepts seem to be entangled in some ways

Maybe port one part to rust and add code to write the original python for use by meta-python. Eg. port instructions replacing https://github.com/CraneStation/cranelift/blob/2a3b0467ed36a655608bac6eb067311e24a4d8f1/cranelift-codegen/meta-python/gen_instr.py and generate python code to mimic https://github.com/CraneStation/cranelift/blob/2a3b0467ed36a655608bac6eb067311e24a4d8f1/cranelift-codegen/meta-python/base/instructions.py.

@bnjbvr
Copy link
Member

bnjbvr commented Mar 12, 2019

Maybe port one part to rust

That's a great idea! That being said, with the recent progress (TypeVar, Operand, Instruction, InstructionGroup ported), it might be as straightforward to migrate the rest in a few steps directly.

My understanding is that the following order is possible:

  1. In parallel:
    a. Recipe definitions / gen_binemit (although gen_binemit first collects the recipes from the encodings, but it might be fine to just do it independently of encodings)
    b. Legalize actions (TransformGroups) / gen_legalizer
  2. Encodings (which depend on both recipes and legalize action indices) and gen_encodings (+ gen_binemit if there's an actual dependency).

I'll be now working on legalization actions and generating them; if somebody is interested in porting recipes, please let us know here so we don't duplicate work!

@bjorn3
Copy link
Contributor

bjorn3 commented Mar 12, 2019

if somebody is interested in porting recipes, please let us know here so we don't duplicate work!

I would like to do it.

@bnjbvr
Copy link
Member

bnjbvr commented Mar 13, 2019

I would like to do it.

That's great to hear, thank you! If you're not afraid of rebasing code, I'd recommend basing your work on top of #694 (last part of the Instructions port), or even wait for it to be landed to minimize bad rebases. There's some code that might be redundant with respect to CpuMode and other stuff, so we can:

  • use the same repository (your clone or mine, whatever) to work on this, and then make sure to reorder commits and fuse them.
  • make dummy structures for CpuMode and committing this before we start the next migrations
  • or just repeat some code, and one of us would just rebase and adapt to the other's code

Any preference? Happy to discuss this in real time over irc (#cranelift on irc.mozilla.org).

@bjorn3
Copy link
Contributor

bjorn3 commented Mar 13, 2019

If you're not afraid of rebasing code, I'd recommend basing your work on top of #694 (last part of the Instructions port),

+1

or just repeat some code, and one of us would just rebase and adapt to the other's code

CpuMode is very small, so repeating it won't be much of a waste I think.

@bnjbvr
Copy link
Member

bnjbvr commented Mar 13, 2019

Sounds good to me, cheers!

@bnjbvr bnjbvr changed the title Choice of base language for the DSL Port the DSL to Rust May 20, 2019
@bnjbvr
Copy link
Member

bnjbvr commented May 20, 2019

I'm now focusing on this. If anyone is about to implement a new architecture, I'd strongly advise to wait for this to be finished first. I expect around one month of work on this topic, considering the time it took for the previous parts.

bnjbvr added a commit to bnjbvr/cranelift that referenced this issue Jun 21, 2019
@bnjbvr
Copy link
Member

bnjbvr commented Jul 5, 2019

Fixed and landed!

@bnjbvr bnjbvr closed this as completed Jul 5, 2019
@aleksmelnikov
Copy link
Contributor

I will be good to release a new version after this change.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
E-compiler-design Open-ended compiler design issues.
Projects
None yet
Development

No branches or pull requests

6 participants