Scala Preprocessor / Conditional Compilation #640

Open
szeiger opened this issue Jul 1, 2019 · 67 comments
Comments

@szeiger

szeiger commented Jul 1, 2019

Scala has done quite well so far without any preprocessor but in some situations it would be quite handy to just drop an #ifdef or #include into the source code. Let's resist this temptation (of using cpp) and focus instead on solving the actual problems that we have without adding too much complexity.

Goals

  • Conditional compilation which is more fine-grained than conditional source files.
  • Well integrated into the compiler: No change to build toolchains required. Positions work normally.

Non-goals

  • Lexical macros
  • Template expansion
  • Advanced predicate language

Status quo in Scala

  • Conditional source files
  • Code generation
    • Various code generation tools in use: Plain Scala code, FMPP, M4, etc.
  • https://github.com/sbt/sbt-buildinfo as a lightweight alternative for getting config values into source code

All of these require build tool support. Conditional source files are supported out of the box (for simple cross-versioning in sbt) or relatively easy to add manually. sbt-buildinfo is also ready to use. Code generation is more difficult to implement. Different projects use various ad-hoc solutions.

Conditional compilation in other languages

C

Using the C preprocessor (cpp):

  • Powerful
  • Low-level
  • Error-prone (macro expansion, hygiene)
  • Solves many problems (badly) that Scala doesn't have (e.g. imports, macros)

HTML

Conditional comments:

  • Allows simple conditional processing
  • Dangerous errors possible when not supported by tooling (because it appears to be backwards compatible but is really not)

Rust

Built-in conditional compilation:

  • Predicates are limited to key==value checks, exists(key), any(ps), all(ps), not(p)
  • Configuration options set by the build system (some automatically, like platform and version, others user-definable)
  • Keys are not unique (i.e. every key is associated with a set of values)
  • 3 ways of conditional compilation:
    • cfg attribute (annotation in Scala) allowed where other attributes are allowed
    • cfg_attr generates attributes conditionally
    • cfg macro includes config values in the source code
  • Syntactic processing: Excluded source code must be parseable

Java

  • No preprocessor or conditional compilation support
  • static final boolean flags can be used for conditional compilation of well-typed code
  • Various preprocessing hacks based on preprocessor tools or conditional comments are used in practice

Haskell

Conditional compilation is supported by Cabal:

  • Using cpp with macros provided by Cabal for version-specific compilation

Design space

At which level should conditional compilation work?

  1. Before parsing: This keeps the config language separate from Scala. It is the most powerful option that allows arbitrary pieces of source code to be made conditional (or replaced by config values) but it is also difficult to reason about and can be abused to create very unreadable code.

  2. After lexing: This option is taken by cpp (at least conceptually by using the same lexer as C, even when implemented by a separate tool). It avoids some of the ugly corner cases of the first option (like being able to make the beginning or end of a comment conditional) while still being very flexible. An implementation for Scala would probably be limited to the default tokenizer state (i.e. no conditional compilation within XML expressions or string interpolation). Tokenization rules do not change very often or very much so that cross-compiling to multiple Scala versions should be easy.

  3. After parsing: This is the approach taken by Rust. It limits what can be made conditional (e.g. only single methods but not groups of multiple methods with a single directive) and requires valid syntax in all conditional parts. It cannot be used for version-dependent compilation that requires new syntax not supported by the older versions. An additional concern for Scala is the syntax. Using annotations like in Rust is possible but it would break existing Scala conventions that annotations must not change the interpretation of source code. It is also much harder to justify now (rather than from the beginning when designing a new language) because old tools would misinterpret source code that uses this new feature.

  4. After typechecking: This is too limiting in practice and can already be implemented (either using macros or with Scala's optimizer and compile-time constants, just like in Java).
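Option 4 can be illustrated with what already works today. This is a minimal sketch with hypothetical names (Flags, Demo): a final val without a type annotation and with a literal right-hand side is a constant value definition, so uses of it are replaced by the literal and the guarded branch becomes dead code that the compiler can drop. Both branches still have to typecheck against the same classpath, which is why this approach does not help with cross-version differences.

```scala
object Flags {
  // Constant value definition: `final`, no type annotation, literal right-hand side.
  final val Debug = false
}

object Demo {
  def compute(x: Int): Int = {
    // Both branches must typecheck; the guarded one is dead code when Debug is false.
    if (Flags.Debug) println(s"compute($x)")
    x * 2
  }
}
```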

From my experience of cross-compiling Scala code and using conditional source directories, I think that option 3 is sufficiently powerful for most use cases. However, if we have to add a new syntax for it anyway (instead of using annotations), option 2 is worth considering.

Which features do we need?

Rust's cfg attribute + macro combination looks like a good solution for most cases. I don't expect a big demand for conditional annotations, so we can probably skip cfg_attr. The cfg macro can be implemented as a (compiler-intrinsic) macro in Scala, the attribute will probably require a dedicated syntax.

Sources of config options

Conditions for conditional compilation can be very complex. There are two options where this complexity can be expressed:

  • Keep the predicates in the Scala sources simple (e.g. only key==value checks), requiring the additional logic to be put into the build definition.
  • Or keep the build definition simple and allow more complexity in the predicates.

I prefer the first option. We already have a standard build tool which allows arbitrary Scala code to be run as part of the build definition. Other build tools have developed scripting support, too. The standalone scalac tool would not have to support anything more than allow configuration options to be set from the command line. We should consider some predefined options but even in seemingly simple cases (like the version number) this could quickly lead to a demand for a more complex predicate language.

@dwijnand
Member

dwijnand commented Jul 2, 2019

Looks really good, Stefan!

Do you think you could expand a bit on what is meant by Rust's cfg attribute and macro behaviour? Either just describe it or better yet with examples. Thanks!

@lrytz
Member

lrytz commented Jul 2, 2019

Yes, very nice writeup! Thanks for doing the hard work and not just dumping out some syntax ideas :-)

@szeiger
Author

szeiger commented Jul 2, 2019

The cfg annotation (or "attribute" in Rust) conditionally enables a piece of code (where an attribute is allowed, e.g. a function definition but not arbitrary places). In Scala it could be something like this:

@cfg(""" binaryVersion = "2.13" """)
def foo: Int = ... // 2.13 version

@cfg(""" not(binaryVersion = "2.13") """)
def foo: Int = ... // 2.11/2.12 version

binaryVersion in this example is a config option. They live in a namespace which is distinct from any regular one in Scala code. These annotations are processed logically after parser but before typer (probably not quite so in practice because I expect you'll need to do some typing just to recognize the name cfg) so the disabled versions of the method have to parse but not typecheck.

The cfg macro provides a way to bring config values into Scala terms, e.g.

println("The binary version is " + cfg("binaryVersion"))

Values produced by the macro are expanded into literals at compile time.

@szeiger
Author

szeiger commented Jul 4, 2019

A possible way to avoid the namer issue (especially at the top level) without too much complexity would be a new annotation-like syntax like @if(...). This would also allow us to avoid the quotes and instead treat all names within the predicate as config names.

@lrytz
Member

lrytz commented Jul 4, 2019

These annotations are processed logically after parser but before typer

Could this express, for example

  • if (binaryVersion > 2.13) import a.X else import b.X
  • if (...) class A extends X else class A extends Y

The cfg macro provides a way to bring config values into Scala terms

Do we need / want that? :-)

@szeiger
Author

szeiger commented Jul 4, 2019

  • In the scheme with the simple predicate language more complex predicates like binaryVersion > 2.13 need to be split up into a flag that can be checked by the predicate and some code in the build script to compute the flag. Additional operators could be added to the predicate language (but not user-definable).

  • I don't think normal annotations can be used on imports at the moment but this should be easy to add (especially if we go with an annotation-like special syntax instead of a real annotation).

  • The macro could replace sbt-buildinfo. We're adding a standard way of defining config variables and passing them to the compiler. I think it makes sense to use this mechanism for reifying them at the term level if we already have it.

@lrytz
Member

lrytz commented Jul 4, 2019

Thanks!

Can you think of cases where the annotation based syntax would not work well enough? My example above is a superclass, that could be worked around with a type alias. But for example if I want to add a parent conditionally (and not extend anything in the other case), I don't see how that could be done (except making two copies of the entire class).

@szeiger
Author

szeiger commented Jul 4, 2019

But for example if I want to add a parent conditionally (and not extend anything in the other case)

You can always extend AnyRef or Any. This doesn't work anymore if you need to pass arguments to the superclass. You'd have to write two separate versions.

@szeiger
Author

szeiger commented Jul 5, 2019

Here's my prototype so far: https://github.com/szeiger/scala/tree/wip/preprocessor

I'm not quite happy with the set-based key/value checks. It doesn't feel correct with Scala syntax.

Supporting imports will need a bit of refactoring in the parser. It's not as straight-forward to add as I had hoped.

I wanted to try it with collections-compat but discovered that annotations do not work for package objects. This is also a limitation of the parser, so it affects my pseudo-annotations as well. I'm not sure if this is intentional or a bug. Fixing it should be on the same order of difficulty as supporting imports.

Except for these limitations it should be fully functional.

@lrytz
Member

lrytz commented Jul 8, 2019

The patch has

         //case t @ Annotated(annot, arg) => t

so supporting annotation ascriptions is planned, right?

@szeiger
Author

szeiger commented Jul 8, 2019

I assume it's trivial to implement but didn't get around to testing it yet.

@szeiger
Author

szeiger commented Jul 8, 2019

Looks like the restriction on disallowing annotations in general for package objects is intentional: https://www.scala-lang.org/files/archive/spec/2.13/09-top-level-definitions.html#compilation-units. But since @if is not a real annotation we can special-case it for package objects the same way as for imports.

@szeiger
Author

szeiger commented Jul 8, 2019

The latest update supports imports, package objects and annotated expressions.

@szeiger
Author

szeiger commented Jul 9, 2019

Here's a version of scala-collection-compat that does all the conditional compilation with the preprocessor: https://github.com/szeiger/scala-collection-compat/tree/wip/preprocessor-test. This shows the limits of what is possible. In practice I would probably keep 2.13 completely separate but use conditional compilation for the small differences between 2.11 and 2.12.

@lrytz lrytz transferred this issue from another repository Jul 12, 2019
@nafg

nafg commented Jul 12, 2019

What are the concrete use cases for this?

IMO proposals should always start with a set of use cases, and their design should be driven and guided by how well they solve those use cases.

@olafurpg

Thanks for the detailed write up! Some quick questions.

How do you envision that the code editing and navigation experience would work in IDEs for conditionally compiled statements?

Can you maybe elaborate on the goal below with an example situation where conditional source files have been insufficient in practice?

Conditional compilation which is more fine-grained than conditional source files.

I am concerned that preprocessing introduces one more way to solve the same problem that conditional source files already solve. Conditional source files have their flaws but they work mostly OK with IDE tooling.

@mdedetrich

mdedetrich commented Jul 12, 2019

I would love this. The biggest pain for library maintainers is having to keep (mostly) redundant branches because we can't do conditionals based on the current Scala version.

How do you envision that the code editing and navigation experience would work in IDEs for conditionally compiled statements?

The conditionals should use the value that corresponds to the current compiler version that is set by the IDE?

What are the concrete use cases for this?

Migrating to the new scala collections is a major one if you use CanBuildFrom and stuff like breakOut in your code.

https://github.com/mdedetrich/scalajson/blob/master/build.sbt#L98 is another example

@szeiger
Author

szeiger commented Jul 12, 2019

How do you envision that the code editing and navigation experience would work in IDEs for conditionally compiled statements?

The same way that different source folders work. An IDE that imports the sbt build (like IntelliJ currently does) would also see the config options that are passed to scalac (and computed in the build in the same way as the source folders).

@szeiger
Author

szeiger commented Jul 12, 2019

The motivating use case is indeed the collections migration where we see the need for separate source files in many cases. I neglected to put that into the proposal because the proposal "we should have a preprocessor for conditional compilation" already existed when I adopted it to create a design.

Here is a version of scala-collection-compat that takes my current preprocessor prototype to the limit: https://github.com/szeiger/scala-collection-compat/tree/wip/preprocessor-test. Note that this is not a style I would recommend. For collection-compat, assuming that 2.11 and 2.12 already had the preprocessor, I would have used the preprocessor to combine and simplify the 2.11 and 2.12 versions (which are very similar) and kept the entirely different 2.13 version separate.

@lihaoyi-databricks

lihaoyi-databricks commented Jul 12, 2019

I am personally somewhat doubtful of this. Cross-version sources have worked well enough, are supported by every build tool (SBT, Mill, our Bazel build at work), and encourage the best practice of keeping your version-specific stuff encapsulated in a small number of files rather than scattering if-defs all over the codebase.

"No change to build toolchains required. Positions work normally." is already the case right now with version-dependent folders. No change to anyone's build toolchain is necessary - everything already works - and positions are correct. Even IDE support works great, better than it does in #ifdef-heavy CSharp/C++ code anyway!

Not mentioned in this proposal is Scala.js. The Scala.js community has been working with platform-specific source folders forever. It's worked well. I don't think I've heard any great groundswell of demand for #ifdef preprocessor directives (I believe only one project in the past 5 years cared enough to even try implementing them).

@szeiger
Author

szeiger commented Jul 12, 2019

Here's a summary of my AST-based preprocessor prototype (https://github.com/szeiger/scala/tree/wip/preprocessor). It's the same approach that Rust uses for the cfg attribute and macro.

Syntax

Conditional compilation is done with a pseudo-annotation called @if. Note that if is a keyword, which makes this illegal as regular annotation syntax (you would have to write @`if` instead). It takes one argument, which is a preprocessor predicate (see below).

@if can be used in the following places:

  • Wherever normal annotations are allowed
  • In front of package objects
  • In front of package p { ... } style package definitions (but not package p; ...)
  • In front of import statements

Note that the source code must be completely parseable into an AST before preprocessing. For example, this is allowed:

@if(scala213) val x = 1
@if(!scala213) val x = 0

Whereas this is not:

val x = (1: @if(scala213))
        (0: @if(!scala213))

Configuration Options

Configuration options consist of string keys associated with a set of string values. They are passed to scalac with -Ckey=value. In sbt they can be set via scalacOptions. For example:

scalacOptions ++= Seq("-Cfeature=foo", "-Cfeature=bar")

This gives the config option feature the value Set("foo", "bar").
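How such options might be computed in the build can be sketched as follows. The helper forScalaVersion is hypothetical (it is not part of sbt or of the prototype), the option names binaryVersion and scala213 follow the examples used in this thread, and giving scala213 the value "true" is an assumption made only so that the option has some value.

```scala
object CfgOptions {
  /** Hypothetical helper: derive -Ckey=value options from a full Scala version. */
  def forScalaVersion(version: String): Seq[String] = {
    // "2.13.0" -> "2.13"
    val binary = version.split('.').take(2).mkString(".")
    // The version logic lives here in the build, so source predicates can stay simple.
    val flag = if (binary == "2.13") Seq("-Cscala213=true") else Seq.empty
    s"-CbinaryVersion=$binary" +: flag
  }
}
```

In an sbt build, the result of such a helper could be appended to scalacOptions in the same way as the literal options above.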

Preprocessor Predicates

Predicates for the @if pseudo-annotation are parsed as Scala expressions (like any other annotation argument) but they are processed by a special interpreter which supports only a limited type of expressions:

  • Identifier == String Literal: Evaluates to true if the config option designated by the identifier has the string literal as one of its values, false otherwise.
  • Identifier: Evaluates to true if the config option designated by the identifier is defined (i.e. it has a non-empty set of values), false otherwise.
  • Boolean expressions on predicates using &&, || and ! with the usual meaning.
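The three rules above are small enough to model directly. The following is only an illustrative sketch, not the code from the prototype: the names Pred, Defined, HasValue and eval are made up, and the predicate is assumed to be already parsed into a small data type.

```scala
object Predicates {
  sealed trait Pred
  final case class Defined(key: String) extends Pred                 // Identifier
  final case class HasValue(key: String, value: String) extends Pred // Identifier == "literal"
  final case class And(l: Pred, r: Pred) extends Pred                // &&
  final case class Or(l: Pred, r: Pred) extends Pred                 // ||
  final case class Not(p: Pred) extends Pred                         // !

  // Every config key maps to a set of string values.
  type Config = Map[String, Set[String]]

  def eval(p: Pred, cfg: Config): Boolean = p match {
    case Defined(k)     => cfg.get(k).exists(_.nonEmpty)
    case HasValue(k, v) => cfg.get(k).exists(_.contains(v))
    case And(l, r)      => eval(l, cfg) && eval(r, cfg)
    case Or(l, r)       => eval(l, cfg) || eval(r, cfg)
    case Not(q)         => !eval(q, cfg)
  }
}
```

With the feature option from the previous section, HasValue("feature", "foo") evaluates to true and Defined("scala213") to false.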

Preprocessing

The preprocessor runs in the new preprocessor phase directly after parser. It evaluates all preprocessor annotations, removing both the annotations themselves and all trees for which the predicates evaluate to false. The effect is the same as if the annotated part was not there in the first place. No typechecking is attempted on the removed parts and no names are entered into symbol tables.
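As a rough model of this step, here is a toy tree pruner. It is a sketch under simplifying assumptions, not the prototype's code: Node is a made-up stand-in for compiler trees, and each guard is assumed to have been evaluated to a Boolean already.

```scala
object ToyPreprocessor {
  // guard = Some(result of the @if predicate), or None if the node has no @if.
  final case class Node(name: String, guard: Option[Boolean], children: List[Node] = Nil)

  /** Drops every subtree whose guard is false and strips the remaining guards. */
  def prune(n: Node): Option[Node] = n.guard match {
    case Some(false) => None
    case _           => Some(n.copy(guard = None, children = n.children.flatMap(c => prune(c))))
  }
}
```

Because removed subtrees are never visited again, nothing inside them would reach later phases, which matches the statement that no typechecking is attempted on the removed parts.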

Reifying Configuration Options

The scala.sys.cfg macro can be used to expand a config option at compile-time into its values. For example, using the previous definition of feature,

val features = sys.cfg("feature")

is expanded into

val features = Set[String]("foo", "bar")

@julienrf

julienrf commented Jul 12, 2019

I don’t think having a preprocessor is a good idea. It adds another meta-language layer above the Scala language, which increases the cognitive load for reading source code. Also, unless such a meta-language is as powerful as a general-purpose language (which is something we don’t want!), we will still need to rely on the build system to accommodate some specific needs that are not supported by the few preprocessor directives of the system. I believe such a system would have very limited applicability compared to its cost.

@nafg

nafg commented Jul 12, 2019

@szeiger can you spill a bit more ink on that motivating use case? And are there others?

For instance,

  1. What about the collections migration requires it? Is it a special case? Is it a design flaw?
  2. What is wrong with current solutions, which clearly exist, for example separate source directories? Is this an overall win over it, and why?
  3. Who is affected by it? How large of an audience is it for?

@mdedetrich

I am personally somewhat doubtful of this. Cross-version sources have worked well enough, are supported by every build tool (SBT, Mill, our Bazel build at work), and encourage the best practice of keeping your version-specific stuff encapsulated in a small number of files rather scattering if-defs all over the codebase.

In my experience, cross-version sources have resulted in massive amounts of code duplication. You basically have to duplicate the entire source file(s) minus the difference you are targeting. In some cases you can get around this by using traits, but that then opens up other problems.

@nafg

nafg commented Jul 12, 2019

@som-snytt I don't mean to come across that way, but shouldn't said research be done before making such a significant investment?

@mdedetrich that's interesting. Can you explain why separate directories results in so much duplication?

Part of why I'm asking is because the deeper you understand a problem, the more you understand the solution space. There could be solutions that haven't been thought of or explored fully.

But partially, it's because if we don't write these things down, people won't appreciate the direction being taken.

Anyway I seem to recall such a case mentioned recently by @sjrd on gitter that may have pointed in this direction. I'm not sure it generalizes though.

@lihaoyi-databricks

lihaoyi-databricks commented Jul 12, 2019

Building for two platforms should mean just building two branches.

I've tried this, it's not a great solution. You lose all sorts of things doing things in multiple git branches v.s. just having separate folders:

  • No more parallel builds in your build tool
  • No more find-and-replace in your editor
  • No more search on Github, which only lets you search master
  • No dependencies across the cross-axis: what if I want my Scala-JVM server to depend on my Scala.js executable, with shared code? What if my Scala 2.12 deploy script needs to use the assembly of my Scala 2.11 Spark 2.3 job?
  • No just running publishAll without interleaving your publishing with a whole lot of git-fu

Using git branches for cross versioning always sounds great initially, but you are giving up a lot of commonly-used functionality and workflows in order to get there.


Version-specific source files are pretty easy, and are the de-facto convention throughout the entire community. Matthew doesn't specify why he doesn't like splitting things into version-specific traits, but I've done it for years over a million lines of cross-versioned code across three different cross axes and it's never been particularly difficult.

I don't think it would be an exaggeration to say I maintain more cross-built Scala code than anyone else in the community. Even at the extremes, like when Ammonite has to deal with incompatibilities in path-dependent nested-trait-cakes inside scala.tools.nsc.Global, it's always broken down pretty easily into separate static methods or separate traits for each cross-version. Maybe there are cases where you actually do have to duplicate large amounts of code, but I haven't seen them

IMO this proposal as stated suffers from the same problem as many others: the proposed solution is clearly stated, well analyzed and thoroughly studied, but the problem it is trying to solve is relegated to a single throwaway sentence without elaboration. That seems very much like the wrong way of approaching things, and is an approach very prone to ending up with a solution looking for a problem

@mdedetrich

mdedetrich commented Jul 12, 2019

Matthew doesn't specify why he doesn't like splitting things into version-specific traits, but I've done it for years over a million lines of cross-versioned code across three different cross axes and it's never been particularly difficult.

Well, for starters, it's a workaround. Traits are a language-level abstraction for structuring your code. They're not designed for dealing with breaking differences between different Scala versions. They're used this way because it's the only sane way to handle differences between platforms and Scala versions if you are able to do so (apart from Git branches, which you already expanded upon). They are kind of like pollution: they wouldn't normally be there if it wasn't for the fact you were targeting another Scala version/platform which doesn't support feature X.

There are also difficulties in using traits: they can cause issues with binary compatibility in non-trivial circumstances. And they also don't abstract over everything cleanly, as shown in the repo I linked earlier.

@szeiger
Author

szeiger commented Sep 6, 2019

I've pushed some bug fixes to the prototype. And I converted the existing cross-building setup of akka-http to use the preprocessors:

Some observations from converting akka-http:

  • The annotation-based syntax is easier in simple cases but for this kind of cross-compilation where you usually have two different versions it loses its advantage over the lexical syntax because the latter's if...else...endif form does not require a repetition of the predicate and makes the intent ("either this or that") clearer:
    @if(scala213)
    def foo = ...
    
    @if(!scala213)
    def foo = ...
    vs
    #if scala213
      def foo = ...
    #else
      def foo = ...
    #endif
  • The VarArgsFunction1 issue is easily avoided with the lexical syntax because the skipped parts of the input must only be tokenizable (which should always be the case) but not parseable. On the other hand, these kinds of syntactical changes are rare and this particular one could be handled in a compatible way by rejecting the old syntax in a later compiler phase after preprocessing.
  • Putting many annotations on individual features is quite ugly when the changes between the different source versions are big but works well for isolated differences in large source files. This is evident in the original akka-http setup which already had two macro annotations @pre213 and @since213 (corresponding to @if(!scala213) and @if(scala213)) which are used in parts of the codebase, but other parts still use separate source folders. I removed all separate source folders in both versions but in practice I would only do this with the lexical syntax.
  • Being able to use the lexical syntax everywhere (and not only in certain places where they annotate a predefined scope) is less of an advantage than I expected. Due to the interaction with semicolon inference you need to add parentheses (where supported) to avoid this and the preprocessor directives still need to be on their own lines. There were a few cases of one- or two-line functions that only differed in return types where I still opted to use separate versions of the whole function instead of trying to swap out only the return type.
  • When there are two separate versions of a method, the lexical syntax allows you to keep a common scaladoc comment instead of having to duplicate it.
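To make the #if / #else / #endif form concrete, here is a deliberately simplified, line-based sketch. It is not how the prototype works (that one operates on tokens), and the names LexicalSketch and preprocess are made up. It assumes that conditions are bare config keys and that directives sit on their own lines. Directive lines and skipped lines are replaced by empty lines, so the line numbers of the remaining code stay the same.

```scala
object LexicalSketch {
  /** Keeps or blanks out lines according to #if / #else / #endif directives.
    * `defined` is the set of config keys that are set.
    */
  def preprocess(source: String, defined: Set[String]): String = {
    // Innermost region first; true means the current branch is kept.
    var frames = List.empty[Boolean]
    def active: Boolean = frames.headOption.getOrElse(true)

    val out = source.split("\n", -1).toList.map { line =>
      line.trim.split("\\s+").toList match {
        case "#if" :: key :: Nil =>
          frames = (active && defined(key)) :: frames
          ""
        case "#else" :: Nil if frames.nonEmpty =>
          // The else branch is kept only if the enclosing region is kept
          // and the #if branch was not.
          val parentActive = frames.tail.headOption.getOrElse(true)
          frames = (parentActive && !frames.head) :: frames.tail
          ""
        case "#endif" :: Nil if frames.nonEmpty =>
          frames = frames.tail
          ""
        case _ =>
          if (active) line else ""
      }
    }
    out.mkString("\n")
  }
}
```

Note that the skipped branch is never looked at beyond splitting it into lines, which corresponds to the point above that skipped input does not have to be parseable.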

Overall, I'm coming around to preferring the lexical syntax, but the annotation-based syntax is a lot simpler, both in terms of surface area, implementation complexity and interaction with other language features.

And then there's the elephant in the room: Dotty's new indentation-based syntax. IMHO this would rule out the lexical preprocessor. You should be able to use indentation to structure your code between preprocessor directives but this indentation would then interfere with Scala's own indentation-based parser.

@olafurpg

olafurpg commented Oct 2, 2019

I believe this proposal is a serious step backwards with regards to tooling and I don't think the advertised benefits have been adequately motivated.

There's a significant difference between this proposal and cfg in Rust or conditional compilation based on source files: it's not possible to typecheck a source file with pre-processing using one compiler instance. Consider the following example

@if(scala213)
def foo = ...

@if(!scala213)
def foo = ...

The classpath is different for the 2.13 and non-2.13 foo implementations. To respond to a "goto definition" or a "completion" request at a given position, an IDE will first need to detect which version of the compiler can typecheck the expression at the request position. In the case of Rust's cfg, this is not a problem.

Conditional compilation based on source files has its drawbacks but I believe it still is the best solution to address the problem statement of this issue.

@julienrf

julienrf commented Oct 2, 2019

Given the number of negative comments and reactions on this thread, I’d like to know better what is the plan. How far does Lightbend want to experiment with this idea?

@adriaanm
Contributor

adriaanm commented Oct 2, 2019

Of course these changes are subject to a decision by the SIP committee. Stefan has prepared a few prototypes and a proposal. Complications to tooling are an important consideration in evaluating the complexity of this proposal. We are willing to keep refining until we find consensus (or it gets voted down).

I hear your concerns, but I'd also like to remind everyone that tooling authors are outnumbered by users by a large factor. How big is the effort to implement this compared to the benefits to a large user base? Currently our users are forced to maintain duplicated code in different source files -- an approach not taken by any other language they are likely familiar with. We've gotten feedback that large shops have implemented their own pre-processor.

Technically speaking, I think the tooling burden is limited if we treat "inactive" code as comments: you can still edit them, but you don't get the same level of code assist. Definitions in skipped fragments shouldn't show up in find symbol, IMO. This means the tool/IDE only needs to know which properties are passed to the compiler (I'd imagine those are canonically specified in the build and can be exported using BSP) and either use the official parser or implement the same logic that skips parts of the code for which the if directive evaluates to false.

@Blaisorblade

Re tooling, one issue for many refactorings is that they should ideally act on all variants. And they need to modify sources before preprocessing, based on preprocessing results. Tooling authors are fewer, but the real issue is whether all users get worse tools.

  • A potential concern: With lexical preprocessors, checking that all variants of a project even parse is exponential in the number of features: if you have N flags, you'd have to check that 2^N variants parse. And many C projects have N > 100 — what do large Scala shops use? (With preprocessing after parsing the problem moves later). That can also affect tooling that deals with all variants.

    There is research on dealing with this (e.g. I've collaborated on https://github.com/ckaestne/TypeChef years ago), but it's pretty complex, and you don't want to create the need for it: We implemented parsing combinators that produce conditional ASTs, using SAT solvers for excluding impossible combinations of conditions.

@adriaanm
Contributor

adriaanm commented Oct 2, 2019

That's a good point, but again -- I'm not convinced of the hypothesis that tools should act on all possible combinations. I would say they only act on the currently active one. Since we already have logic in sbt for cross-versioning, they could easily be run in sequence on the small selection of valid combinations (usually I'd expect you only key on the targeted Scala version).

@adriaanm
Contributor

adriaanm commented Oct 2, 2019

To be clear, that reduces the complexity to our current one, where each valid combination is represented by a version-name-mangled folder that contains the stuff that varies between combinations, as well as everything that happened to be in the same file without being sensitive to different configurations.

Our preprocessor proposal inverts this: only the things that change are duplicated, and we keep the same source folder structure. Your build determines a sequence of valid combinations, and your tools will have to run on all of those, just like the build already runs for different target versions of Scala (cross-building).

@dwijnand
Member

dwijnand commented Oct 2, 2019

Personally I hope that Scala 3.x will be backwards-compatible (in the language and the standard library) so users can write their code in the minimum Scala 3.x version their project supports.

Thus dropping the need for variant source files/directories or a source file preprocessor.

@szeiger
Author

szeiger commented Oct 2, 2019

FYI: I'm currently working on a SIP document based on my earlier posts and other information in this thread. The main proposal will be the lexical preprocessor, which was the second of the approaches I tried.

@adriaanm
Contributor

adriaanm commented Oct 2, 2019

The potential differences you'd want to abstract over with a preprocessor are not just about syntactic changes. Migration to libraries that make incompatible API changes is probably the most common, and that's often just one method call that needs to be written in two ways.

@adriaanm
Contributor

adriaanm commented Oct 2, 2019

No, we were not pressured or incentivized financially to propose/implement this. 🤷‍♂ How about the simpler explanation -- that I'm genuinely trying to make Scala a language with wide appeal across the industry!?

PS: As soon as our VCs ask us to implement Scala features, I'll be sure to let you know!

@mslinn

mslinn commented Oct 2, 2019

If Scala had conditional compilation, the library update that I've been working on for the last several days would have only taken a few hours, and I would not have had to drop support for older versions of Scala, and the library would not have had duplicated code. Right now about 80% of the code must be duplicated for each major Scala release.

I would love to see conditional compilation in Scala, ASAP, and I believe it is possible to implement such that reasoning about types is deterministic.

@som-snytt

I also needed CoCo recently. The OP omits the other Scala idiom

// TODO: uncomment in Scala 2.14
//def apiNeededYesterday = ???

I'd assume it's sufficient to slip Adriaan something under the table?

It's probably even more efficient to slip Adriaan under the table.

@martijnhoekstra

An assumption I had was that the use-case for this feature was primarily cross-building against different (versions of) dependencies, particularly the stdlib.

It has been pointed out to me that this may not actually be the case.

That raises the question, what is the actual problem that conditional compilation aims to solve? I somewhat fear that the answer there has to do with encoding business logic in compile-time conditionals as some sort of dependency injection framework, but that's still just guesswork.

So what are we looking at exactly? What are the cases that people want to use a pre-processor for? And does a plain boolean conditional compilation preprocessor adequately solve the problems of those use cases?

@adriaanm
Contributor

adriaanm commented Oct 3, 2019

The org that I've talked to most that has experience with an (in-house) preprocessor actually bans using it to encode business logic or debug/release modes. It's purely about managing migration in the presence of breaking changes to code external to the current project.

@mslinn

mslinn commented Oct 3, 2019

Here is a section of build.sbt for the project I mentioned:

libraryDependencies ++= scalaVersion {
  case sv if sv.startsWith("2.13") =>
    Seq(
      "javax.inject"           %  "javax.inject"       % "1"     withSources(),
      "com.typesafe.play"      %% "play"               % "2.7.3" % Provided,
      "com.typesafe.play"      %% "play-json"          % "2.7.4" % Provided,
      "org.scalatestplus.play" %% "scalatestplus-play" % "4.0.3" % Test,
      "ch.qos.logback"         %  "logback-classic"    % "1.2.3"
    )

  case sv if sv.startsWith("2.12") =>
    Seq(
      "com.typesafe.play"      %% "play"               % "2.6.23" % Provided,
      "com.typesafe.play"      %% "play-json"          % "2.7.4"  % Provided,
      "org.scalatestplus.play" %% "scalatestplus-play" % "3.1.2"  % Test,
      "ch.qos.logback"         %  "logback-classic"    % "1.2.3"
    )

  case sv if sv.startsWith("2.11") =>
    Seq(
      "com.typesafe.play"      %% "play"               % "2.5.16" % Provided,
      "com.typesafe.play"      %% "play-json"          % "2.7.4"  % Provided,
      "org.scalatestplus.play" %% "scalatestplus-play" % "2.0.1"  % Test,
      "ch.qos.logback"         %  "logback-classic"    % "1.2.3"
    )

  case sv if sv.startsWith("2.10") =>
    Seq(
      "com.typesafe.play"     %% "play"                % "2.2.6" % Provided,
      "org.scalatestplus"     %% "play"                % "1.5.1" % Test
    )
}.value

From this I observe:

  1. Dependencies might demand imports that vary and type definitions that vary between major releases of the Scala compiler and/or dependencies. Conditional compilation would help smooth the differences out.
  2. The use of multiway branching would probably dominate over binary choices (if/then) as code bases mature.
  3. The build system (in this case SBT) should be considered together with the Scala program code when designing conditional compilation. Conditional compilation is really just a new type of meta-project.
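The multiway branching in observation 2 can be sketched in plain Scala by parsing the version once and matching on the (major, minor) pair instead of chaining `startsWith` checks. This is only an illustration: `VersionBranch`, `partialVersion` and `playVersion` are hypothetical names, and the Play versions are copied from the build.sbt excerpt above. sbt ships a similar helper, `CrossVersion.partialVersion`, which could take the place of the hand-written parser inside a real build.

```scala
object VersionBranch {
  /** Parses the "major.minor" prefix of a version string such as "2.13.1". */
  def partialVersion(v: String): Option[(Int, Int)] =
    v.split('.').toList match {
      case maj :: min :: _
          if maj.nonEmpty && min.nonEmpty &&
            maj.forall(_.isDigit) && min.forall(_.isDigit) =>
        Some((maj.toInt, min.toInt))
      case _ => None
    }

  /** Picks the Play version for a Scala version; None if the version is unsupported. */
  def playVersion(scalaVersion: String): Option[String] =
    partialVersion(scalaVersion) match {
      case Some((2, 13)) => Some("2.7.3")
      case Some((2, 12)) => Some("2.6.23")
      case Some((2, 11)) => Some("2.5.16")
      case Some((2, 10)) => Some("2.2.6")
      case _             => None // explicit fallback instead of a MatchError
    }
}
```

Returning an `Option` makes the unsupported case visible to the caller, whereas the pattern match in the excerpt has no default branch.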

And a question arises: should conditional compilation be accomplished by a tool that "knows" Scala, as opposed to a dumb generic macro processor such as the one provided in C?

I think conditional compilation should be performed by a proxy with a well-defined interface that could be driven by the build system. This proxy could be a new phase of the Scala compiler, or it might be a Scala-knowledgeable precompiler.

Conditional compilation might include:

  1. Injection of generated code
  2. New or conditionally derived type definitions
  3. Responding to dependency versions
  4. Responding to type definitions in dependencies

It would also be wonderful if the build system made more use of expressions and fewer statements. Perhaps the design of conditional compilation might drive that change.

@nafg

nafg commented Oct 3, 2019 via email

@szeiger
Author

szeiger commented Oct 7, 2019

Here's the PR for the SIP: scala/docs.scala-lang#1541

@jrudolph

jrudolph commented Feb 5, 2020

I am personally somewhat doubtful of this. Cross-version sources have worked well enough, are supported by every build tool (SBT, Mill, our Bazel build at work), and encourage the best practice of keeping your version-specific stuff encapsulated in a small number of files rather than scattering if-defs all over the codebase.

In my experience, cross-building is hard and error-prone. Cross-version source files have made cross-building possible but have also introduced bugs in every project where we used them so far (every project has only a low single-digit number of files cross-built). The reason is that maintaining near-copies of files just does not work in practice. People always forget to cross-check that all fixes have been applied to every copy. You could say that this would be detected early because of test coverage, and that's somewhat right, but only if you have 100% test coverage of all branches, versions, platforms, OSs, etc. Even if you have that, you probably won't cover all the potential performance issues you might have (this is amplified by the fact that cross-building is often necessary because of Scala collection API incompatibilities, where you then run into subtle Scala collection performance incompatibilities uncovered only much later).

In general, the less code is duplicated the better. However, because of other constraints, some files cannot be split up. In that case, some preprocessor would help a lot. In akka-http, we introduced some macros as proposed here. By now, I'm not sure that is the best idea because

  • macros are somewhat brittle
  • they mix up compilation stages and prevent you from generating a final source file to go into the source jar
  • they encourage cross-building where, in fact, you should keep branching to the very minimum to prevent maintenance headaches.

For similar reasons, I find that support in Scala itself might not necessarily be the best solution.

I will now experiment with a preprocessor plugin at the sbt level, which will apply the preprocessor to selected files in special source directories and generate the final sources into src_managed. That way IDEs should work without changes (by including src_managed in the source path), sbt handles the generated files well and puts them into the right source jars, and you can still make very targeted cross-building changes.
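As a rough illustration of what such a source-level preprocessing step could look like, here is a minimal line-based sketch in plain Scala. It is not the plugin described above: the `//#if`, `//#else` and `//#endif` directive syntax, the object name `LinePreprocessor`, and the flag names are all assumptions made for this example. Directive lines and inactive lines are replaced by empty lines so that line numbers in the generated file still match the original.

```scala
object LinePreprocessor {
  /** Keeps only the lines whose enclosing //#if regions are all active.
    * `flags` is the set of names that count as defined.
    */
  def process(source: String, flags: Set[String]): String = {
    // One entry per open //#if: true if its current branch is active.
    var stack: List[Boolean] = Nil
    val out = source.split("\n", -1).map { line =>
      val t = line.trim
      if (t.startsWith("//#if ")) {
        stack = flags.contains(t.stripPrefix("//#if ").trim) :: stack
        ""
      } else if (t == "//#else") {
        stack = stack match {
          case head :: tail => !head :: tail // flip the innermost branch
          case Nil          => sys.error("//#else without //#if")
        }
        ""
      } else if (t == "//#endif") {
        stack = stack match {
          case _ :: tail => tail
          case Nil       => sys.error("//#endif without //#if")
        }
        ""
      } else if (stack.forall(identity)) line // active in every enclosing region
      else ""
    }
    if (stack.nonEmpty) sys.error("unterminated //#if")
    out.mkString("\n")
  }
}
```

A build task could run `process` over each file in a dedicated source directory and write the result under src_managed, passing a flag set derived from the Scala version being built. Because the directives are ordinary comments, the unprocessed file may still parse, although both branches would then be visible to the compiler at once.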

@lrytz
Member

lrytz commented Feb 5, 2020

It seems that would give a bad user experience when editing the original source files (with the conditional compilation statements) in the IDE. This solution would also be build-tool dependent.

Conditional compilation would not be the only feature where the compiler flags need to be known to be able to compile the source files (classpath, Yimports, Xsource). We can maybe find a good way to persist this information.

@clee704

clee704 commented Apr 28, 2021

As a C++ programmer who recently started using Scala, I needed to write a library that cross-builds against multiple versions of other libraries and thought this kind of feature would be useful. Personally, I hate having multiple Git branches because it's actually duplicating a lot of code in different branches and it increases the maintenance burden. Currently I'm using version-specific source directories but I think something like enable_if/is_detected in C++ will be useful because sometimes it's awkward to extract the small differences into separate source files and a small amount of duplication is inevitable with version-specific sources. Note that enable_if/is_detected are implemented with C++ templates (exploiting the SFINAE "feature"), not with preprocessor macros. Maybe this is already possible with Scala's macros?

@jrudolph

@clee704

Maybe this is already possible with Scala's macros?

That's what we do in akka-http in places: https://github.com/akka/akka-http/blob/master/akka-parsing/src/main/scala/akka/http/ccompat/pre213macro.scala

However, how it's done is more of an implementation detail. The important thing would be to have a standardized solution that avoids duplicating code.

@lrytz

It seems that would give a bad user experience when editing the original source files (with the conditional compilation statements) in the IDE. This solution would also be build-tool dependent.

I agree that standardization would be good for that reason. That said, given that IDEs usually don't have a concept of compiling/analyzing the same code under different configurations, I don't expect a big opportunity for improvement here.

Existing solutions like version-based directories or letting sbt generate files into src_managed (like sbt-boilerplate does) also work in IDEs to some degree. In most cases, IDEs can at least understand the version for the selected compile configuration (e.g. "Scala 2.13") but may not help when editing (especially when editing code belonging to an inactive compile configuration).

@eed3si9n
Member

I've implemented something - https://eed3si9n.com/ifdef-macro-in-scala/

@julian-a-avar-c

julian-a-avar-c commented Oct 19, 2023

I have a quick question. Why are we not using something like @match instead of @if? Especially for definitions, I would prefer definitions to always exist, and I would also like some sort of exhaustiveness check, plus an escape hatch in case I know what I'm doing. Sorry if this is a dumb question, I just saw this and got excited, but also confused.

@eed3si9n
Member

I have something more - ifdef in Scala via pre-typer processing

Why are we not using something like @match instead of @if?

I guess because no one has implemented it. I wasn't sure if conditional compilation is possible at all, so I opted to use simple String checking.

@lrytz
Member

lrytz commented Oct 20, 2023

Previous SIP for reference (also linked above in this ticket): scala/docs.scala-lang#1541

@bjornregnell

bjornregnell commented Oct 20, 2023

@eed3si9n Interesting work!! I think it would be good if you either revive the previous SIP linked by @lrytz or start a new one (or maybe start a thread on Contributors?) to have a discussion on the implications of ifdef for tooling etc.

Also, if I understood @sjrd and @smarter right from a discussion at today's SIP meeting, it is a bug that it is possible to run compiler plugins before the typer outside of nightly versions? So, for this to fly, there needs to be some explicit support from the compiler/language, if I have understood this correctly.
