Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Rename T{...} and T[_] to task{...} and Task[_], standardize "Task"/"Target" terminology around "Task" #3338

Closed
lihaoyi opened this issue Aug 5, 2024 · 11 comments · May be fixed by #3356
Milestone

Comments

@lihaoyi
Copy link
Member

lihaoyi commented Aug 5, 2024

T{...} and T[_] syntaxes are kind of weird. They are concise, but I don't think the conciseness they bring pays for the weirdness. And the weirdness does get noticed, e.g. recently on Reddit. I notice it too. Part of the reason it's there is that Target[_] and Target{...} is too verbose, but maybe that's fixable

This proposal would be to leave the T aliases in place but deprecated, add a new task alias for the expressions:

  • T{...} -> task{...}
  • T.command{...} -> task.command{...}
  • T.task{...} -> task.anon{...}
  • T.dest -> task.dest, ...)
  • T[_] -> Task[_]

Task[_] and task{...} are maybe short enough (compared to Target) that we can do without the T alias

This would make the codebase a lot more approachable for folks not used to weird Scala DSLs

There's some messiness around how the Task/Target hierarchy is arranged right now that may require a breaking change to make this work well.e.g. Target may need to become Task, if we want to use the Task terminology everywhere instead of T. But then Task would need to be renamed to avoid confusion (BaseTask?). This would basically be cleaning up some of the messiness that we failed to totally resolve earlier in #2402, as the status quo today the terminology of whether we should call things tasks or targets is still a bit muddled.

e.g. we may need to go from today's:

  • Task
    • Task.Sequence, Task.TraverseCtx, Task.Mapped, Task.Zipped, T.task, etc.
    • NamedTask
      • Command
      • Worker
      • Target
        • TargetImpl
          • PersistentImpl
        • InputImpl
          • SourceImpl
          • SourcesImpl

To something like

  • BaseTask
    • Task.Sequence, Task.TraverseCtx, Task.Mapped, Task.Zipped, T.task, etc.
    • NamedBaseTask
      • Command
      • Worker
      • Task
        • TargetImpl
          • PersistentImpl
        • InputImpl
          • SourceImpl
          • SourcesImpl

That would allow us to consistently use the term Task everywhere, with BaseTask and NamedBaseTask being niche interfaces that are mostly interesting only for advanced users.

@lihaoyi lihaoyi changed the title Rename T{...} and T[_] to task{...} and Task[_]? Rename T{...} and T[_] to task{...} and Task[_], standardize "Task"/"Target" terminology around "Task" Aug 5, 2024
@lihaoyi lihaoyi changed the title Rename T{...} and T[_] to task{...} and Task[_], standardize "Task"/"Target" terminology around "Task" [RFC] Rename T{...} and T[_] to task{...} and Task[_], standardize "Task"/"Target" terminology around "Task" Aug 5, 2024
@lefou
Copy link
Member

lefou commented Aug 6, 2024

There is a high change to produce collisions hard to resolve in older external code bases or projects.

E.g.

  • T[_] -> Task[_]

is ambiguous, since today we already have a Task type.

I'm also not so sure whether task.command{...} is better than Task.command{...}. For the lower-case version I would expect it being a inherited def, maybe inside Module. But anonymous tasks can also be defined outside of Module. The upper-case version is easier to comprehend, since I just need to find Task and inspect it.

@Ichoran
Copy link

Ichoran commented Aug 6, 2024

I would tend to just leave it alone. Target is only six characters; any argument you have going from one character to four will apply even better going from four to six. You can use Target in docs if you want less T weirdness. The only reason to change it would be if Target really ought to always have been called Task.

@lefou
Copy link
Member

lefou commented Aug 6, 2024

That's a good point. Target{..} or Target.command{..} isn't that worse than Task{..} or Task.command{..}.

And since T is just an alias to Target, users can already write Target{..} instead of T{..}.

Personally, I'd like to not deprecate the short T alias. It's not that weird after all. IMHO, The argument from reddit, that it isn't "normal" code is lame, since this is in fact normal code, even for Java devs.

@lihaoyi
Copy link
Member Author

lihaoyi commented Aug 8, 2024

I'd like to push on this just a bit more: It's not just that one random guy on r/java, folks on r/scala have called Mill alien and cryptic as well. Obviously as maintainers we do not feel the same since we are 100% comfortable with living and breathing Mill, and some amount of unfamiliarity is to be expected from outsiders. But I'd like to take this as an opportunity to consider if we can do better.

What is cryptic or alien about Mill code for Scala developers? Although Mill looks mostly like normal Scala code - method calls, blocks, inheritance, etc. - I can list a few things:

  1. Agg rather than Seq. This is already something I brought up in the other thread Planning for breaking changes in 0.12.x / 0.13.x ? #3209, and I plan to remove in 0.13.0. Agg as an abstraction is cool, and we can keep it around for internal code the uses its correctness guarantees, but it definitely is not really giving us any benefits in most user-facing code, and costs a lot in terms of unfamiliarity. We can leave aliases type Agg[+T] = Seq[T]; val Agg = Seq for source compatibility, and let people import the real thing from Strict.Agg or Loose.Agg if they really want to use this data structure in user code

  2. T { ... } and T.foo. Sure there is precedence: Lift used S.foo, FastParse uses P { ... } and P.foo. But Lift and FastParse are very much on the more-cryptic/alien part of the Scala style spectrum. Some Java code also uses single-letter variables for convenience - e.g. I've seen Foo.v() to reference to Foo's singleton - but I'd argue that is also on the more-cryptic side of the Java coding style.

  3. Having to juggle def foo(x: ???) = T.task { ... }s: calling them with foo(???)() or operating on them with various combinators T.sequence(...)(), T.traverse(...)(), etc.. The double parens (...)() thing in particular is super weird. We'll never get rid of this completely because sometimes you do need to operate on the task graph before it is fully constructed, but I hope to relegate this to mostly advanced use cases, and have been trying to get them out of "common" use cases like in Add overloads to try and simplify resolveDeps workflows #3330. We also may be able to suggest coding styles that avoid the double parens (e.g. by assigning the result of T.sequence/T.traverse to a name before calling it later)

Other than these things listed above, Mill code looks exactly like vanilla Scala code: function calls taking blocks foo { ... }, calling methods with parens foo(), etc.. (1) and (3) are already being addressed, and IMO it would be great if we could address (2). T { ... } isn't the end of the world, but IMO it would be really nice if people could write def foo = task { ... }. Especially for folks coming to Mill for the first time, def foo = task { ... } would be immediately familiar with people coming from other build tools (e.g. Gradle's task.named("foo"){ ...}, Rake's task :foo do ... end).


This whole T alias is also a bit of a wart semantically, even ignoring the syntax and spelling. T stands for Target, but you also use it to spawn T.commands and T.workers which are not targets, and access T.dest which isn't unique to targets. Semantically T should really stand for Task, with T { ... } or Task { ... } being a factory apply method that constructs the default type Target, but whose other factory methods construct other subtypes like Command or Worker. The only reason T looks reasonable today is because it is kind of fuzzy whether it is short for Task or Target, but if we started spelling out Target.command or Target.dest that becomes totally nonsensical given the class hierarchy.

As of now, Task as an abstraction is mostly used inside Mill internals or in advanced use cases, while Target is used in user-facing code. Task is used in most of the documentation because it makes a lot more sense to a newcomer (e.g. every build tool has tasks, only Bazel has targets), and because for many user-facing purposes non-target tasks are treated the same as target tasks (graph visualization, execution scheduling, etc.)

I don't think I have a full solution to propose here yet. But I'd like to discuss this and see if this can go anywhere.

@lefou
Copy link
Member

lefou commented Aug 9, 2024

I agree, that mentally T stands for Task. I really wish all those apply-like methods would be defined in Task and not Target, so I'm in the same boat already. It's then up to some user preference to use Task.command or T.command, and I'm comfortable with using the longer version if that helps co-workers.

I also see all those Mill projects and plugins in the wild and really fear a rough migration path paired with frustration. Since we run the normal Scala compiler, pure compiler error messages is what our users face once they hit a changed API. We need to make that experience smooth. I've seen abandoned older Mill plugin projects on GitHub and trying to get them to work in newer Mill versions was rather frustrating, even for me how really knows the Mill API. Once you have an old build.sc you sometimes even don't known which Mill version was supposed to properly handle it, so we should limit the guess game to a minimum.

Improving for new users is great, but we should not leave existing users behind.

@lefou
Copy link
Member

lefou commented Aug 9, 2024

What always bugged me is the fact that a T produces a T but a T.worker produces a Worker. From a consistency standpoint, we want to produce either a T.worker or we want to rename the factory to Worker. (At least in documentation or in chats, whenever I said "you need a T.command", I thought, damn, that's not the type she wants.)

So, wouldn't it make sense to move some types like Worker and PersistentImpl into the T (or Task or whatever we come up with) companion.

def worker: Task.Worker[Work] = Task.Worker { .. }
def persistent: Task[String] = Task { .. }
def anon: Task.Amon[String] = Task.Anon { .. }

We could also make it lower case, but it would be alien to have lower-case class names, wouldn't it?

def worker: Task.worker[Work] = Task.worker { .. }
def persistent: Task[String] = Task { .. }
def anon: Task.amon[String] = Task.anon { .. }

And if we move the factories into the respective companion object apply methods this would be equivalent (given the proper imports):

def cmd(): Task.Command[String] = Task.Command { .. }
def cmd(): Command[String] = Command { .. }
def cmd(): Task.Command[String] = Command { .. }
def cmd(): Command[String] = Task.Command { .. }

@lihaoyi
Copy link
Member Author

lihaoyi commented Aug 9, 2024

For migration, there are things we can do since this is such a core API. e.g.

  1. A longer deprecation cycle: 1 big release without warnings, 1 release turning on warnings, before removal. If we do one big release every year, that gives people 3 years to update

  2. We can @compileTimeOnly the removed APIs before full removal. That would give people a good error message when the APIs stop working, so they don't just hit a brick wall.

  3. Given how core these APIs are, I would be honestly fine leaving them deprecated forever. Doesn't cost us much, but allows backwards compat forever

Migration is definitely a concern, especially for such core APIs, but I think we can do something reasonable to address it as long as we are careful

@lihaoyi
Copy link
Member Author

lihaoyi commented Aug 9, 2024

I have also thought about making T.worker into just Worker { ... } or Task.Worker { ... } and the type Task.Worker[T]. Apart from matching the type more closely, this would also better differentiate the factory methods like .Worker or .Persistent from contextual API methods like .dest or .log.

Some of the weirdness around def foo: Target[T] = Task { ... } would also be remedied by making it def foo: Task.Target[T] = Task { ... }. This would make it clear that Target[T] is some kind of Task, although at the expense of verbosity

@lefou
Copy link
Member

lefou commented Aug 10, 2024

The existing T.command / Command[R] always played a special role, since it's some factory for tasks, based on their parameters. Its "constructor" accepts a task and mentally, it's more some wrapping around a task with some user convenience. To re-use or override a command programmatically, special precautions need to be taken, e.g. making all argument itself tasks. To support CLI parsing, special @mainargs annotations and limitations need to be considered. Binary-compatible evolution and overloads are hard or impossible.

So here is my question. Can we simplify it's design or make it more principled? Since it's at its core two components, an anonymous task and a CLI module wrapping it, maybe we should make it either two defs or a class containing these two defs?

@lihaoyi
Copy link
Member Author

lihaoyi commented Aug 10, 2024

Took another pass at this at #3356. I think I'm pretty happy with the result. It does make some tradeoffs, but I think the approach taken is solid and the improvement in user experience is worth it.

For T.commands, I think there is something we can do there, but it would require more involved work.

  1. Splitting it into a task/entrypoint pair is possible, but would add a lot of boilerplate to the definition sites
  2. Another possibility would be to use a compiler plugin or annotation macro to automatically wrap the arguments in Task[_]s and automatically wrap the use sites in (). I think this could work and be really nice to use.

I think (2) could work and be a pretty seamless experience. Worth exploring

@lihaoyi
Copy link
Member Author

lihaoyi commented Oct 14, 2024

Calling this done; half done in 0.12.0 standardizing on Task for factories and helpers, and the term "task" over "target. We can follow up with the type-hierarchy re-organization in 0.13.0 in #3356

@lihaoyi lihaoyi closed this as completed Oct 14, 2024
@lefou lefou modified the milestones: 0.13.0, 0.12.0 Oct 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants