Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Autoscoping for Table aggregate #9614

Closed
JaroslavTulach opened this issue Apr 3, 2024 · 21 comments
Closed

Autoscoping for Table aggregate #9614

JaroslavTulach opened this issue Apr 3, 2024 · 21 comments
Assignees

Comments

@JaroslavTulach
Copy link
Member

JaroslavTulach commented Apr 3, 2024

Once #9277 works we need to get Table.aggregate columns working in the IDE.

@JaroslavTulach JaroslavTulach converted this from a draft issue Apr 3, 2024
@JaroslavTulach JaroslavTulach self-assigned this Apr 3, 2024
@JaroslavTulach JaroslavTulach moved this from ❓New to ⚙️ Design in Issues Board Apr 3, 2024
@JaroslavTulach
Copy link
Member Author

JaroslavTulach commented Apr 4, 2024

The IDE doesn't render autoscoped constructor arguments in Vector of Aggregate_Column elements in TableAggregateAutoConstructor.zip project:

Missing autoscoping arguments

To reproduce unzip TableAggregateAutoConstructor.zip and either inspect it in the IDE or rather execute as enso --run TableAggregateAutoConstructor. The important code is:

from Standard.Base import all
from Standard.Table import all
from Standard.Database import all
from Standard.AWS import all
from Standard.Google_Api import all
import Standard.Visualization

collapsed operator16985 =
    operator58114 = operator16985.aggregate ['Organisation'] columns=[(..Count), ..Sum]
    operator35300 = operator16985.aggregate ['Organisation'] columns=[(Aggregate_Column.Count), Aggregate_Column.Sum]
    operator58114

and the output of running the program with IDE's instrumentation is:

enter  c6fe8aaf-e963-4f63-a990-d15d230bc9ab
enter  c6fe8aaf-e963-4f63-a990-d15d230bc9ab
enter  c6fe8aaf-e963-4f63-a990-d15d230bc9ab
callfn c6fe8aaf-e963-4f63-a990-d15d230bc9ab fn: (Count '') args: [Nothing]
result c6fe8aaf-e963-4f63-a990-d15d230bc9ab result: (Count '') type: Aggregate_Column
enter  00e33517-6d19-4f5a-84c5-126f59a93b67
enter  788f7b33-54ae-4452-85ff-4b7b5688b598
result 788f7b33-54ae-4452-85ff-4b7b5688b598 result: Aggregate_Column type: Aggregate_Column.type
enter  d1844837-30e7-46b7-bde3-72f7afec52cf
result d1844837-30e7-46b7-bde3-72f7afec52cf result: UnresolvedSymbol<Count> type: Function
enter  00e33517-6d19-4f5a-84c5-126f59a93b67
callfn 00e33517-6d19-4f5a-84c5-126f59a93b67 fn: null self=_ args: [Aggregate_Column]
result 00e33517-6d19-4f5a-84c5-126f59a93b67 result: (Count '') type: Aggregate_Column
[2, 3]

The UUID c6fe8aaf-e963-4f63-a990-d15d230bc9ab represents ..Count all the other UUIDs are related to (Aggregate_Column.Count). 00e33517-6d19-4f5a-84c5-126f59a93b67 represents the whole Aggregate_Column.Count constructor. The 788f7b33-54ae-4452-85ff-4b7b5688b598 stands just for Aggregate_Column, while UUID d1844837-30e7-46b7-bde3-72f7afec52cf is assigned to UnresolvedSymbol<Count>.

Clearly the autoscoped constructor case gives IDE less information. Now we need to find out what is the important missing information that IDE needs to render the constructor arguments? CCing @4e6 and @farmaazon.

@4e6
Copy link
Contributor

4e6 commented Apr 4, 2024

I think the issue is that the autoscope constructors are not resolved inside vector

from Standard.Base import all

type T
    A

main =
    test [..A]

test t:(Vector T) = t

Result

$ ./built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/bin/enso --no-ir-caches --no-global-cache --run ~/enso/projects/Autoscope_Demo_2/src/Main.enso
[WARN] [2024-04-04T10:44:13+01:00] [org.enso.librarymanager.local.DefaultLocalLibraryProvider] Local library search path [***/libraries] does not exist.
[..A]

Expected

[A]

@JaroslavTulach
Copy link
Member Author

I think the issue is that the autoscope constructors are not resolved inside vector

No, that's not the cause. Autoscope constructors are resolved, if one requests that. Just modify your test function to:

test t:(Vector T) = t.map (e-> e:T)

that's actually something I had to do in #9630 in 2c4b2c3 commit. The new test verifies that both Tables contain the same elements - e.g. resolution works fine.

The problem must be in differences in instrumentation between c6fe8aaf-e963-4f63-a990-d15d230bc9ab expression and the other expressions.

@4e6
Copy link
Contributor

4e6 commented Apr 4, 2024

Also, Table.aggregate does not have type ascriptions for its arguments

aggregate : Vector (Integer | Text | Regex | Aggregate_Column) | Text | Integer | Regex -> Vector Aggregate_Column -> Boolean -> Problem_Behavior -> Table ! No_Output_Columns | Invalid_Aggregate_Column | Invalid_Column_Names | Duplicate_Output_Column_Names | Floating_Point_Equality | Invalid_Aggregation | Unquoted_Delimiter | Additional_Warnings
aggregate self group_by=[] columns=[] (error_on_missing_columns=False) (on_problems=Report_Warning) =

@4e6
Copy link
Contributor

4e6 commented Apr 4, 2024

There's also a different result for columns=[Aggregate_Table.Count] vs columns=[..Count]

from Standard.Base import all
from Standard.Table import all
from Standard.Database import all
from Standard.AWS import all
from Standard.Google_Api import all
import Standard.Visualization

main =
    t = Table.from_rows ['Organisation']  [[1], [2], [3]]
    r = t.aggregate ['Organisation'] columns=[..Count]
    r.print

Result

$ ./built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/bin/enso --no-ir-caches --no-global-cache --run ~/enso/projects/TableAggregateAutoConstructor/src/Main.enso
[WARN] [2024-04-04T11:59:05+01:00] [org.enso.librarymanager.local.DefaultLocalLibraryProvider] Local library search path [***/libraries] does not exist.
Execution finished with an error: Inexhaustive pattern match: no branch matches Function.
        at <enso> Aggregate_Column_Helper.resolve_aggregate(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Table/0.0.0-dev/src/Internal/Aggregate_Column_Helper.enso:191-213)
        at <enso> Function.<<(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Function.enso:48:26-38)
        at <enso> wrapped_function(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Internal/Array_Like_Helpers.enso:89:18-27)
        at <enso> Array_Like_Helpers.vector_from_function(Internal)
        at <enso> Array_Like_Helpers.vector_from_function(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Internal/Array_Like_Helpers.enso:104:15-68)
        at <enso> Array_Like_Helpers.map(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Internal/Array_Like_Helpers.enso:244:5-74)
        at <enso> Vector.map(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso:617:9-56)
        at <enso> Aggregate_Column_Helper.prepare_aggregate_columns<arg-0>(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Table/0.0.0-dev/src/Internal/Aggregate_Column_Helper.enso:70:48-120)
        at <enso> Aggregate_Column_Helper.prepare_aggregate_columns<arg-1>(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Table/0.0.0-dev/src/Internal/Aggregate_Column_Helper.enso:70:48-150)
        at <enso> case_branch(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Errors/Problem_Behavior.enso:88-89)
        at <enso> Problem_Behavior.attach_problems_before(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Errors/Problem_Behavior.enso:81-89)
        at <enso> case_branch(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Table/0.0.0-dev/src/Internal/Problem_Builder.enso:66:17-76)
        at <enso> Problem_Builder.attach_problems_before(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Table/0.0.0-dev/src/Internal/Problem_Builder.enso:64-66)
        at <enso> Aggregate_Column_Helper.prepare_aggregate_columns<arg-2>(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Table/0.0.0-dev/src/Internal/Aggregate_Column_Helper.enso:66-104)
        at <enso> Aggregate_Column_Helper.prepare_aggregate_columns(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Table/0.0.0-dev/src/Internal/Aggregate_Column_Helper.enso:56-104)
        at <enso> Table.aggregate<arg-2>(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Table/0.0.0-dev/src/Table.enso:786:25-182)
        at <enso> Table.aggregate(/home/dbushev/projects/luna/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Table/0.0.0-dev/src/Table.enso:785-795)
        at <enso> Main.main(Main.enso:10:9-54)

Expected

With Aggregate_Column.Count it prints the table contents

   | Organisation | Count
---+--------------+-------
 0 | 1            | 1
 1 | 2            | 1
 2 | 3            | 1

@4e6
Copy link
Contributor

4e6 commented Apr 4, 2024

the example with [..Count] works if I explicitly resolve the vector in the aggregate method

--- a/distribution/lib/Standard/Table/0.0.0-dev/src/Table.enso
+++ b/distribution/lib/Standard/Table/0.0.0-dev/src/Table.enso
@@ -780,7 +780,8 @@ type Table
     @group_by Widget_Helpers.make_column_name_vector_selector
     @columns Widget_Helpers.make_aggregate_column_vector_selector
     aggregate : Vector (Integer | Text | Regex | Aggregate_Column) | Text | Integer | Regex -> Vector Aggregate_Column -> Boolean -> Problem_Behavior -> Table ! No_Output_Columns | Invalid_Aggregate_Column | Invalid_Column_Names | Duplicate_Output_Column_Names | Floating_Point_Equality | Invalid_Aggregation | Unquoted_Delimiter | Additional_Warnings
-    aggregate self group_by=[] columns=[] (error_on_missing_columns=False) (on_problems=Report_Warning) =
+    aggregate self group_by=[] columns_raw=[] (error_on_missing_columns=False) (on_problems=Report_Warning) =
+        columns = columns_raw.map(e -> e:Aggregate_Column)
         normalized_group_by = Vector.unify_vector_or_element group_by
         if normalized_group_by.is_empty && columns.is_empty then Error.throw (No_Output_Columns.Error "At least one column must be specified.") else
             validated = Aggregate_Column_Helper.prepare_aggregate_columns self.column_naming_helper normalized_group_by columns self error_on_missing_columns=error_on_missing_columns

@JaroslavTulach
Copy link
Member Author

There's also a different result for columns=[Aggregate_Table.Count] vs columns=[..Count]

There was a different result. That's why I created

@4e6
Copy link
Contributor

4e6 commented Apr 4, 2024

I made a runtime test checking the instrumentation of autoscope constructors inside vector

"""from Standard.Base import all
|
|type T
| A
|
|main =
| a = test [..A]
| a
|
|test t:(Vector T) = t

In case of the test [..A] call, the instrumentation only sees the UnresolvedConstructor (which is being ignored). It seems that autoscope constructors inside vectors are instrumented the same way as ordinary autoscope constructors before they were fixed in #9452

@JaroslavTulach
Copy link
Member Author

The test is wrong. It doesn't even deliver [T.A] result, does it? You have to define test method as test t:(Vector T) = t.map (e-> e:T) to make the main return [T.A]. Once the right result is returned, some instrumentation will trigger.

@4e6
Copy link
Contributor

4e6 commented Apr 4, 2024

It doesn't even deliver [T.A] result, does it?

It does. If you replace the autoscope call with [T.A], it will be instrumented, and the test will pass.

@JaroslavTulach
Copy link
Member Author

JaroslavTulach commented Apr 4, 2024

It doesn't even deliver [T.A] result, does it?

It does. If you replace the autoscope call with [T.A], it will be instrumented, and the test will pass.

Running:

from Standard.Base import all

type T
    A

main =
    a = test [..A]
    a

test t:(Vector T) = t

yields [..A]. That's not the right answer - e.g. that's not [T.A]. Altering the code with test t:(Vector T) = t.map(e-> e:T) and running again yields [A] - that's the right answer. I am pretty sure that in the second case the instrumentation will get triggered. Btw. try:

from Standard.Base import all

type T
    A

main =
    a = test [..A]
    a == [T.A]

test t:(Vector T) = t #.map(e-> e:T)

yields False. Uncomment the comment on the last line to get True.

@4e6
Copy link
Contributor

4e6 commented Apr 4, 2024

The question is not what the right answer is, but how ..A appears in the instrumentation.

Now it is clear that for ..A to be resolved, it has to be consumed (with a proper type ascription) at some point in the program. And this will make the test succeed.

Then, in the case of Table.aggregate, I can assume that the columns vector is never properly ascribed when it is consumed, and its elements stay unresolved.

@4e6
Copy link
Contributor

4e6 commented Apr 4, 2024

I wonder if this will fix the instrumentation of Table.aggregate

--- a/distribution/lib/Standard/Table/0.0.0-dev/src/Table.enso
+++ b/distribution/lib/Standard/Table/0.0.0-dev/src/Table.enso
@@ -778,9 +778,10 @@ type Table

               table.aggregate ["Key"] [Aggregate_Column.Count]
     @group_by Widget_Helpers.make_column_name_vector_selector
-    @columns Widget_Helpers.make_aggregate_column_vector_selector
+    @columns_raw Widget_Helpers.make_aggregate_column_vector_selector
     aggregate : Vector (Integer | Text | Regex | Aggregate_Column) | Text | Integer | Regex -> Vector Agg
regate_Column -> Boolean -> Problem_Behavior -> Table ! No_Output_Columns | Invalid_Aggregate_Column | Inv
alid_Column_Names | Duplicate_Output_Column_Names | Floating_Point_Equality | Invalid_Aggregation | Unquoted_Delimiter | Additional_Warnings
-    aggregate self group_by=[] columns=[] (error_on_missing_columns=False) (on_problems=Report_Warning) =
+    aggregate self group_by=[] columns_raw=[] (error_on_missing_columns=False) (on_problems=Report_Warning) =
+        columns = columns_raw.map(e-> e:Aggregate_Column)
         normalized_group_by = Vector.unify_vector_or_element group_by
         if normalized_group_by.is_empty && columns.is_empty then Error.throw (No_Output_Columns.Error "At least one column must be specified.") else
             validated = Aggregate_Column_Helper.prepare_aggregate_columns self.column_naming_helper normalized_group_by columns self error_on_missing_columns=error_on_missing_columns

@JaroslavTulach
Copy link
Member Author

columns vector is never properly ascribed when it is consumed, and its elements stay unresolved.

It is consumed by 5f9aa25 - it has to be consumed, otherwise the two Table instances wouldn't be identical!

@4e6
Copy link
Contributor

4e6 commented Apr 4, 2024

Does it mean 5f9aa25 fixes the issue?

@JaroslavTulach
Copy link
Member Author

JaroslavTulach commented Apr 5, 2024

Does it mean 5f9aa25 fixes the issue?

It does "fix" the issue in runtime. But the IDE isn't fixed yet. @kazcw reported

to address the necessary changes on the IDE side. As @farmaazon puts it:

As far as I recognize, we don't need anything more from engine, we need to fix IDE

@enso-bot
Copy link

enso-bot bot commented Apr 6, 2024

Jaroslav Tulach reports a new STANDUP for yesterday (2024-04-05):

Progress: - Dataflow_Stack_Trace: #9625 (comment)

Next Day: Table.join & co.

@enso-bot
Copy link

enso-bot bot commented Apr 9, 2024

Jaroslav Tulach reports a new STANDUP for yesterday (2024-04-08):

Progress: - emails, reviews: #9642 (review)

Next Day: Table.join & co.

Discord
Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.
Discord
Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.

@JaroslavTulach JaroslavTulach moved this from ⚙️ Design to 🌟 Q/A review in Issues Board Apr 16, 2024
@JaroslavTulach
Copy link
Member Author

@JaroslavTulach
Copy link
Member Author

JaroslavTulach commented Apr 29, 2024

I decided to reopen issue #9635 as I don't see the drop down with column selection for ..Sum autoscoped constructor. CCing @vitvakatu

@JaroslavTulach
Copy link
Member Author

Will be fixed once #9916 (review) gets in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

No branches or pull requests

2 participants