chore: refactor go schema extraction #1700

worstell · 2024-06-07T03:50:36Z

initial step towards #1518

integrates with (forked) go/analysis to run static analysis tasks for schema extraction
implements first extraction task, type aliases
refactors existing build stack to use the new extractor, combining results with legacy so we can incrementally migrate
removes typealias extraction logic from legacy code

worstell · 2024-06-07T03:55:21Z

go-runtime/compile/parser.go

@@ -1,5 +1,8 @@
 package compile

+// TODO: This file is now duplicated in go-runtime/schema/analyzers/parser.go. It should be removed
+// from here once the schema extraction refactoring is complete.


just duplicated this file to avoid having to export everything from it's new location (so the legacy extractor can continue using it) only to make it all private again once the refactor is complete

worstell · 2024-06-07T04:04:41Z

go-runtime/compile/schema.go

@@ -106,44 +107,34 @@ type ParseResult struct {
 	Errors []*schema.Error
 }

-// ExtractModuleSchema statically parses Go FTL module source into a schema.Module.
-func ExtractModuleSchema(dir string, sch *schema.Schema) (optional.Option[ParseResult], error) {
+func legacyExtractModuleSchema(dir string, sch *schema.Schema, out *analyzers.ExtractResult) error {


got a bit hairy attempting to run the legacy extractor as a completely isolated process since things are so entangled

instead modified the legacy code to accept an output pointer as param (out *analyzers.ExtractResult). the initial caller provides the partial result from the prior extraction step, which gives the legacy extractor access to details (native names and decls) that it needs. the output/result is updated in-place during the legacy extraction

worstell · 2024-06-07T05:34:19Z

so many go.mod changes 😭 separated them into their own commit for easier reviewing

alecthomas

This is amazing Lizzy. This will be so much cleaner.

A couple of thoughts/questions/comments:

This is not an immediate concern, but if we could switch to using the analysis package's Diagnostic system, we'd get cleaner integration with the package.

Typically with the analysis package each pass will produce a distinct type that other analysers can require, and they will then be run in the correct order, and values passed through. It seems we could utilise this system rather than having a single extractorContext passed through and mutated, and I think (though am not sure) that we would benefit from parallelisation then. For example, the TypeAliasExtractor could output a []*schema.TypeAlias.

Another one for later, but I think we could also utilise the facts system. My first thought along those lines is that a first pass could add a schema.Ref fact to every node that is a reference to an FTL type or value. This would allow latter passes to just retrieve these facts directly from the node, rather than having to recompute them.

But yeah, overall, absolutely awesome.

alecthomas · 2024-06-07T05:32:33Z

go-runtime/compile/build.go

+	if err != nil {
+		return optional.None[analyzers.ExtractResult](), err
+	}
+	result, ok := maybeResult.Get()


Should we just return an error from extract.Extract() in this case, so we can make these results not be optional?

i think i got a bit optional-happy 😅

alecthomas · 2024-06-07T06:16:49Z

so many go.mod changes 😭 separated them into their own commit for easier reviewing

Fear not, I just found this which made it actually quite okay :)

worstell · 2024-06-07T06:47:50Z

This is not an immediate concern, but if we could switch to using the analysis package's Diagnostic system, we'd get cleaner integration with the package.

ah in place of shoving the schema errors into the result? will check it out!

Typically with the analysis package each pass will produce a distinct type that other analysers can require, and they will then be run in the correct order, and values passed through. It seems we could utilise this system rather than having a single extractorContext passed through and mutated, and I think (though am not sure) that we would benefit from parallelisation then.

currently the extractors actually do produce their own results (rather than mutate a single result), they just share the same type (extractorContext, maybe a confusing name)— so i believe we're already getting parallelization except for dependencies, which may end up being just finalizer -> everything else

For example, the TypeAliasExtractor could output a []*schema.TypeAlias.

i initially had the same thought, actually, to return []*schema.TypeAlias directly. i moved to the current approach to handle transitive visibility, e.g.:

pass A: export typealias Alias foo.Thing, export data Thing{}
pass B: data Thing{}, data Other{}
final result (merged): export typealias Alias foo.Thing, export data Thing{}, data Other{}

basically everyone producing a partial view of the world; then these get merged and the highest visibility Decl wins out in the case of conflicts. now in thinking about it more, this does feel cleaner as a separate calculation in finalizer that looks at the complete schema and uses refs to update visibility as needed. will take a look at updating tomorrow

Another one for later, but I think we could also utilise the facts system. My first thought along those lines is that a first pass could add a schema.Ref fact to every node that is a reference to an FTL type or value. This would allow latter passes to just retrieve these facts directly from the node, rather than having to recompute them.

amazing idea!!

worstell · 2024-06-07T19:54:41Z

@alecthomas this has been updated with the following:

uses diagnostics instead of manually propagating schema errors
extractors just return list of relevant Decls (also native names for now, but we might be able to get rid of this down the line by managing it as a separate pass)
Decl visibility is updated transitively in finalizer — at the moment we're also running the same logic in compile/build.go after the legacy extractor runs, but we will remove this once we're migrated and just rely on the finalizer

initial step towards #1518 - integrates with (forked) go/analysis to run static analysis tasks for schema extraction - implements first extraction task, type aliases - refactors existing build stack to use the new extractor, combining results with legacy so we can incrementally migrate - removes typealias extraction logic from legacy code

worstell requested a review from alecthomas as a code owner June 7, 2024 03:50

worstell requested review from a team and wesbillman and removed request for a team June 7, 2024 03:50

worstell commented Jun 7, 2024

View reviewed changes

ftl-robot mentioned this pull request Jun 7, 2024

Dashboard #728

Open

worstell commented Jun 7, 2024

View reviewed changes

worstell force-pushed the worstell/20240604-refactor-go-schema-parsing branch 3 times, most recently from 4546309 to 751069c Compare June 7, 2024 05:33

worstell force-pushed the worstell/20240604-refactor-go-schema-parsing branch 2 times, most recently from 0f2c7ec to 7850af9 Compare June 7, 2024 06:03

alecthomas approved these changes Jun 7, 2024

View reviewed changes

worstell force-pushed the worstell/20240604-refactor-go-schema-parsing branch 4 times, most recently from a51bf8c to 6ecedcd Compare June 7, 2024 19:49

worstell force-pushed the worstell/20240604-refactor-go-schema-parsing branch from 6ecedcd to cb33912 Compare June 7, 2024 20:02

worstell merged commit 21930e3 into main Jun 7, 2024
37 checks passed

worstell deleted the worstell/20240604-refactor-go-schema-parsing branch June 7, 2024 20:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore: refactor go schema extraction #1700

chore: refactor go schema extraction #1700

worstell commented Jun 7, 2024 •

edited

Loading

worstell Jun 7, 2024

worstell Jun 7, 2024

worstell commented Jun 7, 2024 •

edited

Loading

alecthomas left a comment

alecthomas Jun 7, 2024

worstell Jun 7, 2024

alecthomas commented Jun 7, 2024

worstell commented Jun 7, 2024 •

edited

Loading

worstell commented Jun 7, 2024

chore: refactor go schema extraction #1700

chore: refactor go schema extraction #1700

Conversation

worstell commented Jun 7, 2024 • edited Loading

worstell Jun 7, 2024

Choose a reason for hiding this comment

worstell Jun 7, 2024

Choose a reason for hiding this comment

worstell commented Jun 7, 2024 • edited Loading

alecthomas left a comment

Choose a reason for hiding this comment

alecthomas Jun 7, 2024

Choose a reason for hiding this comment

worstell Jun 7, 2024

Choose a reason for hiding this comment

alecthomas commented Jun 7, 2024

worstell commented Jun 7, 2024 • edited Loading

worstell commented Jun 7, 2024

worstell commented Jun 7, 2024 •

edited

Loading

worstell commented Jun 7, 2024 •

edited

Loading

worstell commented Jun 7, 2024 •

edited

Loading