Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIP-50 - Struct Classes #50

Closed
wants to merge 5 commits into from
Closed

Conversation

julienrf
Copy link
Contributor

Based on the pre-SIP discussion.

@Ichoran
Copy link

Ichoran commented Oct 20, 2022

I am not convinced that such modifications are wise instead of an anti-pattern, so I'm not convinced of the wisdom of including it as a core language feature.

The reason is that adding optional fields is not a neutral operation when it comes to very many likely operations. It's good to make people recompile and rethink.

For instance, suppose you add an optional version field, and suddenly there are a profusion of versions...and...you store the thing in a TreeMap. Now what? If the Ordering doesn't respect the new field, the version will get randomly lost--it's not just that you default to the old behavior, but rather that you'll get a random selection from the new things available. But there's no mechanism by which to automatically respect the new field (especially if it's a custom Ordering).

Suppose you have a Permissions class with two fields that represent superuser level for two different areas, and you decide to add a third area. You want to have the idea of a maximal superuser, to test the full capacity you've given users, which you implement like so:

val superuser = Permissions(
  allUsers.map(u => u.level1).max,
  allUsers.map(u => u.level2).max
)

Except you're now silently broken, with no clue even by the requirement to recompile suggesting that your logic is broken.

Or what about custom serialization?

Or what about transformation of a data field where the code doesn't use copy?

user match
  case User(first, last) => User(capitalize(first), capitalize(last))

Oh boy, I just lost all my addresses!

It seems that to get this right, you require a whole list of best practices, and leaning on best practices is the least desirable route to correctness. And you have to expect people will have done them in anticipation of possible changes, without a recompile (which is at least slightly hinted at by having it be a struct class, but maybe not, since there might be other reasons to like the more limited set of synthetic methods).

I grant that all these issues already exist, and that the boilerplate-heavy manual method allows you to get around them (as do the other mechanisms), with the same risks. But just like .asInstanceOf[X] discourages a danger-prone pattern, to me, it seems like the difficulty of doing this is entirely appropriate given the danger of silently adding fields without even so much as a warning of binary incompatibility to alert people to re-check their logic.

content/struct-classes.md Outdated Show resolved Hide resolved
content/struct-classes.md Outdated Show resolved Hide resolved
content/struct-classes.md Outdated Show resolved Hide resolved
A struct class definition `c[tps](ps_1)...(ps_n)` with type parameters `tps` and
value parameters `ps` is handled as follows. A `private` method named `copy` is
implicitly added to the class definition unless the class already has a member
(directly defined or inherited) with that name, or the class has a repeated parameter.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it just the name copy used to decide generation? Or would other parts of an existing function signature, such as the return type used to determine if it needs to generate a copy method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied and adapted the specification of case classes from here. The reason is that I want the same schema to be applied both to struct classes and case classes.

In that specification, only the name is used, but not the signature. I’ve noticed that the Scala 2 compiler does not synthesize a copy method if I already define one (even if it’s signature does not match the case class fields). However, I’ve just noticed that the Scala 3 compiler does not behave the same here: if you add a method copy, the compiler adds another one as an overload. I’ve created scala/scala3#16309 to clarify whether this is expected or not.


Besides the two solutions shown in the Motivation section (based on regular classes
or case classes), we also considered a more powerful variant of `struct class` that
would also automatically generate the transformation methods (`withName`, `withAge`,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel not having this kind of feature will be frustrating to users. Most will reach out to struct classes as a way to deal with lighter-weight case classes or solve the bin compact issues, but the problem of updating and dealing with immutable structures does not go away.

We have a similar problem in Kotlin in the context of value classes where they may include special syntax for immutable copy updates https://github.com/Kotlin/KEEP/blob/master/notes/value-classes.md#updating-immutable-classes. I don't propose we follow a similar approach; just pointing out that not having a feature like this and forcing users into considerable boilerplate to get it working may introduce more issues than struct classes solve. In contrast, case classes that at least give you copy are not great to work with and with bin compact issues, but they are still an alternative to writing a bunch of boilerplate.

I think struct classes will be even better if they improve on case classes not just from a bin compact standpoint but also the state of what working with immutable data looks like in Scala.

@bjornregnell
Copy link

bjornregnell commented Oct 31, 2022

I'd like to consider this from a beginner perspective: is yet-another kind of class really worth it?

We already need to explain the difference between plain classes and case classes, which digs into tricky things such as equality.

But does a beginner need to bother about such struct classes? Well, if it is in the language learners will sooner or later encounter the strange word struct and need to have at least a superficial understanding of what they are and why the are in the language. But trying to understand binary compatibility is probably not the best spending of a learner's grit...

Perhaps an annotation similar to @struct case class would be less "intrusive" in terms of generating less "conceptual load" when learning the language? The rationale for having "only" an annotation, instead of a whole new kind of class, would be that this is a specialized thing for those that care about implementation details such as binary compatibility. This has after all something to do with code generation, similar to @tailrec.

@sjrd
Copy link
Member

sjrd commented Oct 31, 2022

It changes the public API of the class. As such, it falls way beyond what an annotation is supposed to be able to do.

@odersky
Copy link
Contributor

odersky commented Oct 31, 2022

I share Bjorn's sentiment. Adding another kind of class with all its special rules adds considerable complexity to the language.

But I am also dubious about "hiding" the complexity under an annotation @struct. This could open the floodgates to all sorts of complicating additions that seem easy since it's just an annotation that is added.

In my mind, an annotation can

  • influence how code is generated
  • add some restrictions how a feature is used

@targetName is a good example, which does both. But it should not cause API changes.

Which leads me to ask: Do we really need this? What do other languages have to offer in this respect? Is this an area where Scala falls clearly short, or is it an area where we think that Scala should be leading the pack? If the latter, why?

Otherwise, it looks like a major complication for a relatively minor use case. One could argue that we could just keep using normal classes and implement structural operations by hand. Or, use case classes with the tweaks described in the proposal. Adding a third kind of class is a very large complication. Is it really worth it?

@nafg
Copy link

nafg commented Nov 1, 2022

I think it's mostly useful for library authors who want to provide case classes and evolve them. Most languages don't have something comparable.

Maybe it should be a keyword modifier of case class.

Or maybe we should have a more general mechanism to limit what case class normally generates.

Regarding generating withXXX methods it would be nice to have a way to do that but maybe orthogonal to this. But I'd actually rather modXXX which take a function instead of a constant. Or even better, lenses, which doesn't currently seem possible in Scala 3. But it's not specific to this new kind of class.

I also don't think struct is the best name. It's not more of a value class than case classes. I guess without an extractor "case" may not be the best name. I guess it's just a simple product type, or data class.

Coming back to the issue of learning a new keyword, I think data class would be much less of an issue than struct class. It's basically self explanatory. And besides, Kotlin has it, although I'm not sure how it compares.

Anyway it's not hard to teach that case classes are data class plus pattern matching and a public copy method.

@Ichoran
Copy link

Ichoran commented Nov 1, 2022

If it is really important to support this use case with less boilerplate, can we extend derives to also derive parts of the case class functionality?

So if you want just copy and element-by-element equality, but nothing else, you

class Foo(i: Int) {
  derives def copy(_): Foo
  derives override def equals(_): Boolean
}

where _ in this case means "let the deriving fill in the argument list". (Arguably, it could just be derives def copy and the return type would be filled in too.)

If it's really really important to have even less boilerplate, then a predictable assemblage of such things could be created, maybe something like

transparent trait DataClass derives {
  override def equals
  override def hashCode
}

which you would then deploy with

class Foo(val i: Int, val j: Int) derives DataClass {}

as if it were a typeclass, except it would work for this quasi-structural-type instead.

Then we support more use-cases, we don't have to argue about exactly what does and does not go into a struct class, we punt on anything difficult and make the user implement it themselves, and what is entailed by "data class" becomes somewhat more explicit.

I'm still not sure this is a very good idea, but I think it has substantially fewer dangers and greater benefits than a struct class feature. (One big downside--it's a bigger change to the language and more work to implement.)

defined as `x_ij: T_ij`. In all cases, `x_ij` and `T_ij` refer to the name and type
of the corresponding class parameter `ps_ij`.

Every struct class implicitly overrides some method definitions of class `scala.AnyRef`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall struct classes be implicitly final?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s a good question. I am tempted to stay conservative and follow the same behavior as case classes (ie, forbid struct-to-struct inheritance, struct-to-case inheritance, and case-to-struct inheritance, but not make struct classes final).

That being said, if we allow a class to extend a struct class, we should probably make struct classes extend Equals and use its canEqual method to implement equals to make sure we don’t generate an implementation of equals that is not an equivalence relation.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My take would be that these should be final. What would be the usecase for subclassing these?

@sjrd
Copy link
Member

sjrd commented Nov 9, 2022

Which leads me to ask: Do we really need this? What do other languages have to offer in this respect? Is this an area where Scala falls clearly short, or is it an area where we think that Scala should be leading the pack? If the latter, why?

As mentioned in the Motivation section (under "Using Code Generation or Meta-Programming"), the need is clearly there. Several libraries have reinvented the pattern using a variety of ways. And that's not even counting libraries who simply write everything by hand (e.g., the Scala.js linker library).

This is a very well-known pattern, reinvented many times, but requiring a lot of boilerplate if done in user space, as well as a lot of care not to mess up. Isn't that the best motivation we could possibly have for putting something into the language? Isn't it the same kind of motivation that led to derives? Or even object in the first place, as opposed to the singleton pattern?

@kubukoz
Copy link

kubukoz commented Nov 9, 2022

I wonder if we should have a structEnum in Scala 3 as well 🤔 - allow library authors to benefit from struct class but also the conciseness of enums.

@sjrd
Copy link
Member

sjrd commented Nov 9, 2022

I wonder if we should have a structEnum in Scala 3 as well 🤔 - allow library authors to benefit from struct class but also the conciseness of enums.

enums without pattern matching?

@kubukoz
Copy link

kubukoz commented Nov 9, 2022

Riiight. Forget that one. :)

@bjornregnell
Copy link

I guess the SIP should mention its relation to Java's record https://openjdk.org/jeps/359 with similarities and differences?

@bjornregnell
Copy link

bjornregnell commented Nov 10, 2022

If we think binary compatible case classes without copy etc is important enough to include in the language, I'd like to challenge the keyword struct which I think is a bit cryptic. The terms record or data are less cryptic IMHO - perhaps there are others even better. I think the SIP should discuss pros and cons of different soft keyword proposals. Or even if it should be a single hard keyword. like just record (remember Pascal...)

@morgen-peschke
Copy link

The terms record or data are less cryptic IMHO - perhaps there are others even better.

It's probably better to avoid "record", depending on how sure we are that Java is going to use it. There's enough similarity in what https://openjdk.org/jeps/359 is proposing that using the same keyword will likely be more confusing than helpful.

@julienrf
Copy link
Contributor Author

julienrf commented Dec 12, 2022

I would rather use the workaround mentioned in the motivation section:

case class Person private (name: String, age: Int):
  def withName(newName: String): Person = copy(name = newName)
  def withAge(newAge: Int): Person = copy(age = newAge)

object Person:
  def apply(name: String, age: Int): Person = new Person(name, age)
  private def unapply(person: Person): Person = person

@odersky
Copy link
Contributor

odersky commented Dec 12, 2022

@julienrf Looking at it again, I now agree with you. This looks like an OK solution to me. I even think you don't need the secondary constructor. Here's the setup of an extensible Person class.

case class Person private (name: String, age: Int):
  def withName(name: String) = copy(name = name)
  def withAge(age: Int) = copy(age = age)

object Person:
  def apply(name: String, age: Int) = new Person(name, age)
  private def unapply(p: Person) = p

and here's the delta if you want to add a new field:

case class Person private (name: String, age: Int, address: String = ""):
  ...
  def withAddress(address: String) = copy(address = address)

object Person:
  ...
 def apply(name: String, age: Int, address: String) = new Person(name, age, address)

It does not look too burdensome to me.

@bjornregnell
Copy link

@odersky Does that mean that no change is needed for this or is there some changed behavior of the compiler assumed by your last proposal to make the generated code of the updated with "delta" binary compatible?

@odersky
Copy link
Contributor

odersky commented Dec 12, 2022

@bjornregnell That works out of the box. No compiler change is needed.

@bjornregnell
Copy link

@odersky Ok great!

I guess there is still the problem that you might start off with a "normal" case class and then realize that it should be evolve-able in a bincompat fashion, so maybe there is still some value in introducing some kind of marker trait or something that can signal this intent? And perhaps then it would still be valuable to scrap some boilerplate by including some compiler magic? If it's worth it, the SIP proposal could then perhaps be transformed to something like "less boilerplate for binary compatible evolution of case classes" or similar.

@sideeffffect
Copy link

@odersky

Here's the setup of an extensible Person class.

In Scala, we have the @tailRec annotation. It ensures, that the annotated method is compilable using tail call optimization. In an analogous manner, can we have @compatible which verifies that an annotated class doesn't contain members which pose issues for backward compatibility when further evolving the class (adding more fields)?

@smarter
Copy link
Member

smarter commented Dec 12, 2022

I've written up more on how I envision a "built-in codegen" approach working in https://contributors.scala-lang.org/t/scala-3-macro-annotations-and-code-generation/6035

@julienrf
Copy link
Contributor Author

Thank you everyone for your feedback on this proposal!

I do understand the concern of introducing yet another type of type definition that would be complicated to explain to beginners. I think the workaround mentioned in this comment is acceptable, although not ideal. Therefore, I withdraw the proposal.

If someone else wants to revive the proposal, or add anything to the discussion, feel free to comment here.

@sideeffffect
Copy link

I think the workaround mentioned in this comment is acceptable, although not ideal.

Maybe we could have this pattern in official documentation, explaining how do design evolvable/compatible case classes. It doesn't look overly complicated, just private constructor and private unapply in companion object. But it surely isn't something easy to figure out for novices (and maybe not even novices).

@sideeffffect
Copy link

Let's document this pattern: scala/docs.scala-lang#2662

@julienrf
Copy link
Contributor Author

I agree that the pattern should be well documented. Thank you @sideeffffect for going ahead with a PR!

@sideeffffect
Copy link

So it turns out we have to do 3 things now:

  • add a new constructor, private to the companion object, with the original fields
  • add a custom withAddress method and
  • add an apply factory method to the companion.

This pattern doesn't seem as attractive as it did in the beginning... 😕
scala/docs.scala-lang#2662 (comment)

@keynmol
Copy link

keynmol commented Feb 16, 2023

I was asked to post here an example of me trying to use a cross-compilable builder pattern (can't dataclass because I need Scala 3 as well).

The end result is very verbose, requires discipline, and from what I can tell isn't even going far enough to maintain compatibility (I didn't apply all the suggestions from the referenced docs PR).

I would ask the SIP committee to reconsider this SIP or a version thereof.

Very long screenshots of the code

@bjornregnell
Copy link

Thanks @keynmol for this example "from the wild" illustrating the current boilerplate situation well. I will bring this up at the SIP meeting tomorrow (if time permits @julienrf )

@julienrf
Copy link
Contributor Author

Indeed, we don’t have a good solution when you need to cross-compile your code with Scala 2 and 3 (which is common when you write a library…).

But I am not sure even if we had a good proposal that it would be implemented in both Scala 2 and Scala 3?

@SethTisue
Copy link
Member

Indeed, we don’t have a good solution when you need to cross-compile your code with Scala 2 and 3

This was discussed at the meeting today — apparently -Xsource:3 might help here...?

@julienrf
Copy link
Contributor Author

julienrf commented Feb 17, 2023

Yes, I was checking that and indeed with -Xsource:3 it is possible to write a solution that cross-compiles. However, I have read here scala/bug#11661 that -Xsource should not be used in production, so I am not sure we should recommend using it.

@keynmol
Copy link

keynmol commented Feb 17, 2023

IMO there are two problems here:

  1. The Scala 3 specific solution is verbose in its own right and absolutely not scalable beyond trivial number of fields
  2. Even if possible to just use that solution, seems like it doesn't cross-compile without a flag that carries its own dangers?

That seems like a lose-lose situation, IMO.

@julienrf
Copy link
Contributor Author

  1. The Scala 3 specific solution is verbose in its own right and absolutely not scalable beyond trivial number of fields

Could you please elaborate on that point?

@smarter
Copy link
Member

smarter commented Feb 17, 2023

However, I have read here scala/bug#11661 that -Xsource should not be used in production

-Xsource:3 has improved quite a bit since this issue was posted, so this could be revisited.

@bjornregnell
Copy link

@smarter How far can the coming annotation macros in Scala 3 take us with this (also when cross-compiling to Scala 2)?

@smarter
Copy link
Member

smarter commented Feb 17, 2023

  • With macro definitions as they exist in both Scala 2 and Scala 3, one can avoid having to write down the body of methods to reduce the boilerplate:
    class Foo(n: Int, s: String) {
      def withN(n: Int) = data.generated()
      def withS(s: String) = data.generated()
    
      override def equals(x: Any): Boolean = data.generated()
      override def hashCode: Int = data.generated()
    }
  • With macro annotations as they already exist in Scala 2 and will exist in Scala 3, one can check that the boilerplate we write is correct:
    @data class Foo(n: Int, s: String) { // error: missing method `def withS(s: String) = data.generated()`
      def withN(n: String) = data.generated() // error: method parameter should have type `Int`
      ...
    }
  • In https://contributors.scala-lang.org/t/scala-3-macro-annotations-and-code-generation/6035 I further proposed reusing the rewrite mechanism present in Scala 3 to semi-automate the boilerplate generation.

I've prototyped these ideas in scala/scala3#16545.

@SethTisue
Copy link
Member

-Xsource:3 has improved quite a bit since this issue was posted, so this could be revisited.

Agree, and I've added a note to scala/bug#11661 to say so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.