Implement Table.replace for the database backend #8986

Merged
merged 94 commits into from
Feb 29, 2024
Changes from 80 commits
Commits
756dfa9
one test
GregoryTravis Jan 30, 2024
58da2fd
hack
GregoryTravis Jan 31, 2024
f2cd94a
Merge branch 'develop' into wip/gmt/8578-Table.replace
GregoryTravis Jan 31, 2024
0501d05
revert hack
GregoryTravis Jan 31, 2024
fcab4e0
use merge
GregoryTravis Jan 31, 2024
f2a58f6
unhack
GregoryTravis Jan 31, 2024
5ed3fed
example
GregoryTravis Jan 31, 2024
8caac00
tests
GregoryTravis Jan 31, 2024
74bc949
tests
GregoryTravis Jan 31, 2024
69753fa
scramble lookup table order in tests
GregoryTravis Jan 31, 2024
4cf1309
duplicate inputs
GregoryTravis Jan 31, 2024
aae2bd2
self-lookup
GregoryTravis Jan 31, 2024
d458ed8
type test
GregoryTravis Jan 31, 2024
f924690
remove incorrect test
GregoryTravis Feb 1, 2024
bf142c0
remove materialize
GregoryTravis Feb 1, 2024
b97841c
db stub
GregoryTravis Feb 1, 2024
b88633e
unused imports, widgets
GregoryTravis Feb 1, 2024
2a02eb0
cleanup
GregoryTravis Feb 1, 2024
2b7ad13
Merge branch 'develop' into wip/gmt/8578-Table.replace
GregoryTravis Feb 1, 2024
8b7c361
rename replace_column to column
GregoryTravis Feb 1, 2024
9a084d0
docs, comment, cleanup
GregoryTravis Feb 1, 2024
64d43f3
comment
GregoryTravis Feb 1, 2024
4acb007
fix docs
GregoryTravis Feb 1, 2024
df35418
convert from Map
GregoryTravis Feb 1, 2024
5ac57ce
changelog
GregoryTravis Feb 1, 2024
d01152d
cleanup
GregoryTravis Feb 1, 2024
69cdb47
Merge branch 'develop' into wip/gmt/8578-Table.replace
GregoryTravis Feb 2, 2024
76368e7
review
GregoryTravis Feb 2, 2024
4ea9c22
review
GregoryTravis Feb 2, 2024
a269166
db except map
GregoryTravis Feb 2, 2024
b6cb8cc
wip
GregoryTravis Feb 2, 2024
940bfd3
move implementation to Replace_Helpers
GregoryTravis Feb 2, 2024
bb7cb2f
make_table_from_map method
GregoryTravis Feb 2, 2024
6c1d530
wip
GregoryTravis Feb 2, 2024
a475905
wip
GregoryTravis Feb 2, 2024
f9aaf71
review
GregoryTravis Feb 5, 2024
7ad9723
Merge branch 'develop' into wip/gmt/8578-Table.replace
GregoryTravis Feb 5, 2024
fe389f8
review
GregoryTravis Feb 5, 2024
645805b
update docs
GregoryTravis Feb 5, 2024
5450561
merge
GregoryTravis Feb 5, 2024
a6523c0
one col
GregoryTravis Feb 5, 2024
4dec328
wip
GregoryTravis Feb 5, 2024
d56bb09
two columns
GregoryTravis Feb 5, 2024
947ebcf
tests pass
GregoryTravis Feb 5, 2024
8ad298d
vector.transpose
GregoryTravis Feb 5, 2024
e9c1a2c
cleanup
GregoryTravis Feb 5, 2024
324cbef
wip
GregoryTravis Feb 5, 2024
7a5854c
parser error
GregoryTravis Feb 6, 2024
02a0d54
Merge branch 'develop' into wip/gmt/8578-Table.replace
GregoryTravis Feb 6, 2024
de5f91e
Merge branch 'wip/gmt/8578-Table.replace' into wip/gmt/8578-Table.rep…
GregoryTravis Feb 6, 2024
ea2876d
fix parser failure
GregoryTravis Feb 6, 2024
6fff368
wip
GregoryTravis Feb 6, 2024
80dbc76
Merge branch 'develop' into wip/gmt/8578-Table.replace
GregoryTravis Feb 6, 2024
8afbb74
Merge branch 'wip/gmt/8578-Table.replace' into wip/gmt/8578-Table.rep…
GregoryTravis Feb 6, 2024
e74b67d
wip
GregoryTravis Feb 6, 2024
4b34631
one test passes
GregoryTravis Feb 6, 2024
b241fbe
more tests
GregoryTravis Feb 6, 2024
21b6b80
wip
GregoryTravis Feb 6, 2024
fd9212a
max size tests
GregoryTravis Feb 6, 2024
fa7c024
from map tests
GregoryTravis Feb 6, 2024
fbea5a9
docs
GregoryTravis Feb 6, 2024
0073e46
Literal_Values
GregoryTravis Feb 6, 2024
a5bf58a
cleanup
GregoryTravis Feb 6, 2024
5a1b6d0
no table_builder
GregoryTravis Feb 6, 2024
ebf5087
Merge branch 'wip/gmt/8578-Table.replace' into wip/gmt/8578-Table.rep…
GregoryTravis Feb 6, 2024
cdddcc8
merge
GregoryTravis Feb 7, 2024
76ff386
wip
GregoryTravis Feb 7, 2024
22a76bb
wip
GregoryTravis Feb 7, 2024
0090a20
changelog
GregoryTravis Feb 7, 2024
b99627c
merge
GregoryTravis Feb 8, 2024
0028687
use proxy
GregoryTravis Feb 8, 2024
5eaf502
Merge branch 'develop' into wip/gmt/8578-Table.replace-db
GregoryTravis Feb 9, 2024
5a772f0
better error for length mismatch
GregoryTravis Feb 9, 2024
2c47cfe
wip
GregoryTravis Feb 9, 2024
32452e8
move transpose into Vector
GregoryTravis Feb 9, 2024
9bce18d
vector/array docs match
GregoryTravis Feb 9, 2024
bd3a300
merge
GregoryTravis Feb 12, 2024
4174e9b
warning on empty lookup table, tests
GregoryTravis Feb 12, 2024
ae66716
merge
GregoryTravis Feb 12, 2024
a1558b7
edge cases in empty lookup table
GregoryTravis Feb 12, 2024
3aeb792
merge
GregoryTravis Feb 13, 2024
5c0a54a
enable empty-table tests for db backend
GregoryTravis Feb 13, 2024
6af697b
review
GregoryTravis Feb 14, 2024
12a4c8e
Merge branch 'develop' into wip/gmt/8578-Table.replace-db
GregoryTravis Feb 14, 2024
88e022f
merge
GregoryTravis Feb 14, 2024
cd3fab6
no widgets for from/to columns
GregoryTravis Feb 16, 2024
046488c
use parameter name, not variable name
GregoryTravis Feb 16, 2024
2a7d8ae
Merge branch 'develop' into wip/gmt/8578-Table.replace-db
GregoryTravis Feb 16, 2024
fe72e3b
merge
GregoryTravis Feb 20, 2024
ea47095
fix merge
GregoryTravis Feb 20, 2024
7df7499
Merge branch 'develop' into wip/gmt/8578-Table.replace-db
GregoryTravis Feb 26, 2024
7cd0bb8
Merge branch 'develop' into wip/gmt/8578-Table.replace-db
GregoryTravis Feb 29, 2024
6a4ad4b
move implementation to Array_Like_Helpers
GregoryTravis Feb 29, 2024
72f36c7
fix tests
GregoryTravis Feb 29, 2024
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -615,6 +615,7 @@
- [Allow removing rows using a Filter_Condition.][8861]
- [Added `Table.to_xml`.][8979]
- [Implemented Write support for `S3_File`.][8921]
- [Implemented `Table.replace` for the database backend.][8986]

[debug-shortcuts]:
https://github.com/enso-org/enso/blob/develop/app/gui/docs/product/shortcuts.md#debug
@@ -884,8 +885,9 @@
[8865]: https://github.com/enso-org/enso/pull/8865
[8935]: https://github.com/enso-org/enso/pull/8935
[8861]: https://github.com/enso-org/enso/pull/8861
[8979]: https://github.com/enso-org/enso/pull/8979
[8921]: https://github.com/enso-org/enso/pull/8921
[8979]: https://github.com/enso-org/enso/pull/8979
[8986]: https://github.com/enso-org/enso/pull/8986

#### Enso Compiler

35 changes: 35 additions & 0 deletions distribution/lib/Standard/Base/0.0.0-dev/src/Data/Array.enso
@@ -14,6 +14,7 @@ import project.Errors.Common.Incomparable_Values
import project.Errors.Common.Index_Out_Of_Bounds
import project.Errors.Common.Not_Found
import project.Errors.Empty_Error.Empty_Error
import project.Errors.Illegal_Argument.Illegal_Argument
import project.Errors.Problem_Behavior.Problem_Behavior
import project.Errors.Unimplemented.Unimplemented
import project.Internal.Array_Like_Helpers
@@ -559,6 +560,40 @@ type Array
partition_with_index : (Integer -> Any -> Boolean) -> Pair (Vector Any) (Vector Any)
partition_with_index self predicate = Vector.partition_with_index self predicate

## GROUP Selections
Swaps the rows and columns of a matrix represented by an array of arrays.

! Error Conditions

- If the rows (subarrays) do not all have the same length, an
`Illegal_Argument` error is raised.

> Example
Transpose an array of arrays.

matrix = [[0, 1, 2].to_array, [3, 4, 5].to_array, [6, 7, 8].to_array].to_array
# +---+---+---+
# | 0 | 1 | 2 |
# +---+---+---+
# | 3 | 4 | 5 |
# +---+---+---+
# | 6 | 7 | 8 |
# +---+---+---+

transposed = [[0, 3, 6].to_array, [1, 4, 7].to_array, [2, 5, 8].to_array].to_array
# +---+---+---+
# | 0 | 3 | 6 |
# +---+---+---+
# | 1 | 4 | 7 |
# +---+---+---+
# | 2 | 5 | 8 |
# +---+---+---+

matrix.transpose == transposed
# => True
transpose : Array (Array Any) ! Illegal_Argument
transpose self = Vector.transpose self

## Applies a function to each element of the array, returning the `Vector`
of results.

57 changes: 57 additions & 0 deletions distribution/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso
@@ -1,5 +1,6 @@
import project.Any.Any
import project.Data.Array.Array
import project.Data.Array_Proxy.Array_Proxy
import project.Data.Filter_Condition.Filter_Condition
import project.Data.List.List
import project.Data.Map.Map
@@ -580,6 +581,47 @@ type Vector a
False -> Pair.new acc.first (acc.second.append elem)
pair.map .to_vector

## GROUP Selections
Swaps the rows and columns of a matrix represented by a vector of vectors.

! Error Conditions

- If the rows (subvectors) do not all have the same length, an
`Illegal_Argument` error is raised.

> Example
Transpose a vector of vectors.

matrix = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
# +---+---+---+
# | 0 | 1 | 2 |
# +---+---+---+
# | 3 | 4 | 5 |
# +---+---+---+
# | 6 | 7 | 8 |
# +---+---+---+

transposed = [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
# +---+---+---+
# | 0 | 3 | 6 |
# +---+---+---+
# | 1 | 4 | 7 |
# +---+---+---+
# | 2 | 5 | 8 |
# +---+---+---+

matrix.transpose == transposed
# => True
transpose : Vector (Vector Any) ! Illegal_Argument
transpose self =
if self.is_empty then [] else
length = self.length
first_subvector_length = self.at 0 . length
check_same_length self <|
inner i = Vector.from_polyglot_array (Array_Proxy.new length j-> ((self.at j).at i))
proxy = Array_Proxy.new first_subvector_length inner
Vector.from_polyglot_array proxy

## ICON dataframe_map_column
Applies a function to each element of the vector, returning the `Vector` of
results.
@@ -1432,3 +1474,18 @@ check_start_valid start length function =
used_start = if start < 0 then start + length else start
if used_start < 0 || used_start > length then Error.throw (Index_Out_Of_Bounds.Error start length+1) else
function used_start

## PRIVATE
Check that all vectors have the same length and return an informative message
if they don't.

Compares all vectors to the first one and reports the first one that differs.
check_same_length : Vector (Vector Any) -> Any -> Any ! Illegal_Argument
check_same_length vecs ~action =
if vecs.is_empty then action else
num_vecs = vecs.length
len = vecs.at 0 . length
go i = if i >= num_vecs then action else
if vecs.at i . length == len then @Tail_Call go (i+1) else
Error.throw (Illegal_Argument.Error "Transpose requires that all vectors be the same length, but rows 0 and "+i.to_text+" had different lengths ("+len.to_text+" and "+(vecs.at i . length).to_text+")")
go 0
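
For readers less familiar with Enso, the `transpose` and `check_same_length` pair above can be approximated in Python. This is a minimal sketch rather than the PR's implementation: it builds the result eagerly, whereas the Enso version wraps the index arithmetic in `Array_Proxy` so the transposed rows are produced on access. The function names mirror the diff; using `ValueError` in place of `Illegal_Argument` is an assumption made for the illustration.

```python
from typing import Any, List, Sequence


def check_same_length(rows: Sequence[Sequence[Any]]) -> None:
    """Raise ValueError naming the first row whose length differs from row 0."""
    if not rows:
        return
    expected = len(rows[0])
    for i, row in enumerate(rows):
        if len(row) != expected:
            raise ValueError(
                "Transpose requires that all vectors be the same length, "
                f"but rows 0 and {i} had different lengths "
                f"({expected} and {len(row)})"
            )


def transpose(rows: Sequence[Sequence[Any]]) -> List[List[Any]]:
    """Swap rows and columns of a matrix given as a sequence of sequences."""
    if not rows:
        return []
    check_same_length(rows)
    width = len(rows[0])
    # Element (i, j) of the result is element (j, i) of the input.
    return [[row[i] for row in rows] for i in range(width)]
```

With the matrix from the doc comment, `transpose([[0, 1, 2], [3, 4, 5], [6, 7, 8]])` returns `[[0, 3, 6], [1, 4, 7], [2, 5, 8]]`, and a ragged input such as `[[1, 2], [3]]` raises an error that names rows 0 and 1.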
87 changes: 81 additions & 6 deletions distribution/lib/Standard/Database/0.0.0-dev/src/Data/Table.enso
@@ -35,6 +35,7 @@ import Standard.Table.Internal.Aggregate_Column_Helper
import Standard.Table.Internal.Column_Naming_Helper.Column_Naming_Helper
import Standard.Table.Internal.Constant_Column.Constant_Column
import Standard.Table.Internal.Problem_Builder.Problem_Builder
import Standard.Table.Internal.Replace_Helpers
import Standard.Table.Internal.Table_Helpers
import Standard.Table.Internal.Table_Helpers.Table_Column_Helper
import Standard.Table.Internal.Unique_Name_Strategy.Unique_Name_Strategy
@@ -929,6 +930,74 @@ type Table
on_problems.attach_problems_before problems <|
Warning.set result []

## PRIVATE
A helper that creates a two-column table from a Map.

The keys of the `Map` become the first column, with name
`key_column_name`, and the values of the `Map` become the second column,
with name `value_column_name`.

For the in-memory backend, the `Map` can be empty. For the database
backends, it must not be empty.

Arguments:
- map: The `Map` to create the table from.
- key_column_name: The name to use for the first column.
- value_column_name: The name to use for the second column.
make_table_from_map : Map Any Any -> Text -> Text -> Table
make_table_from_map self map key_column_name value_column_name =
total_size = map.size * 2

if map.is_empty then Error.throw (Illegal_Argument.Error "Map argument cannot be empty") else
if total_size > MAX_LITERAL_ELEMENT_COUNT then Error.throw (Illegal_Argument.Error "Map argument is too large ("+map.size.to_text+" entries): materialize a table into the database instead") else
keys_and_values = map.to_vector
self.make_table_from_vectors [keys_and_values.map .first, keys_and_values.map .second] [key_column_name, value_column_name]

## PRIVATE
A helper that creates a literal table from `Vector`s.

For the in-memory backend, the columns can be empty. For the database
backends, they must not be empty.

Arguments:
- column_vectors: A `Vector` of `Vector`s; each inner `Vector` becomes a
column of the table.
- column_names: The names of the columns of the new table.
make_table_from_vectors : Vector (Vector Any) -> Vector Text -> Table
make_table_from_vectors self column_vectors column_names =
Runtime.assert (column_vectors.length == column_names.length) "column_vectors and column_names must have the same length"

# Assume the columns are all the same length; if not, it will be an error anyway.
total_size = if column_vectors.is_empty || column_vectors.at 0 . is_empty then 0 else
column_vectors.length * (column_vectors.at 0 . length)

if total_size == 0 then Error.throw (Illegal_Argument.Error "Vectors cannot be empty") else
if total_size > MAX_LITERAL_ELEMENT_COUNT then Error.throw (Illegal_Argument.Error "Too many elements for table literal ("+total_size.to_text+"): materialize a table into the database instead") else
type_mapping = self.connection.dialect.get_type_mapping

values_to_type_ref column_vector =
value_type = Value_Type_Helpers.find_common_type_for_arguments column_vector
sql_type = case value_type of
Nothing -> SQL_Type.null
_ -> type_mapping.value_type_to_sql value_type Problem_Behavior.Ignore
SQL_Type_Reference.from_constant sql_type

literal_table_name = self.connection.base_connection.table_naming_helper.generate_random_table_name "enso-literal-"

from_spec = From_Spec.Literal_Values column_vectors column_names literal_table_name
context = Context.for_subquery from_spec

internal_columns = 0.up_to column_vectors.length . map i->
column_vector = column_vectors.at i
column_name = column_names.at i

type_ref = values_to_type_ref column_vector.to_vector
generated_literal_column_name = "column"+(i+1).to_text
sql_expression = SQL_Expression.Column literal_table_name generated_literal_column_name
Internal_Column.Value column_name type_ref sql_expression

Table.Value literal_table_name self.connection internal_columns context

## PRIVATE

Create a constant column from a value.
@@ -1397,7 +1466,7 @@ type Table
In the Database backend, there are no guarantees related to ordering of
results.

? Error Conditions
! Error Conditions

- If this table or the lookup table is lacking any of the columns
specified in `key_columns`, a `Missing_Input_Columns` error is raised.
@@ -1453,7 +1522,7 @@ type Table
In the Database backend, there are no guarantees related to ordering of
results.

? Error Conditions
! Error Conditions

- If this table or the lookup table is lacking any of the columns
specified by `from_column`, `to_column`, or `column`, a
@@ -1499,10 +1568,12 @@ type Table
# 1 | 20 | b | f
# 2 | 30 | c | g
# 3 | 40 | d | h
replace : Table | Map -> (Text | Integer) -> (Text | Integer) -> (Text | Integer) -> Boolean -> Problem_Behavior -> Table ! Missing_Input_Columns | Non_Unique_Key | Unmatched_Rows_In_Lookup
replace self lookup_table:(Table | Map) column:(Text | Integer) from_column:(Text | Integer)=0 to_column:(Text | Integer)=1 allow_unmatched_rows:Boolean=True on_problems:Problem_Behavior=Problem_Behavior.Report_Warning =
_ = [lookup_table, column, from_column, to_column, allow_unmatched_rows, on_problems]
Error.throw (Unsupported_Database_Operation.Error "Table.replace is not implemented yet for the Database backends.")
@column Widget_Helpers.make_column_name_selector
@from_column Widget_Helpers.make_column_name_selector
@to_column Widget_Helpers.make_column_name_selector
Member

Aren't these columns from the `lookup_table`? If so, the selector is wrong.
I think we need widgets derived from the first argument for this.

Contributor Author

I'm not sure what the notation would be for this -- is there documentation for the @ clauses? Or, where is it implemented? I don't see an example of a widget attached to a value other than self.

Contributor Author

Removed.

replace : Table | Map -> (Text | Integer) -> (Text | Integer | Nothing) -> (Text | Integer | Nothing) -> Boolean -> Problem_Behavior -> Table ! Missing_Input_Columns | Non_Unique_Key | Unmatched_Rows_In_Lookup
replace self lookup_table:(Table | Map) column:(Text | Integer) from_column:(Text | Integer | Nothing)=Nothing to_column:(Text | Integer | Nothing)=Nothing allow_unmatched_rows:Boolean=True on_problems:Problem_Behavior=Problem_Behavior.Report_Warning =
Replace_Helpers.replace self lookup_table column from_column to_column allow_unmatched_rows on_problems

## ALIAS join by row position
GROUP Standard.Base.Calculations
@@ -2786,3 +2857,7 @@ Table.from (that:Materialized_Table) =

## PRIVATE
Table_Ref.from (that:Table) = Table_Ref.Value that

## PRIVATE
The largest dataset that can be used to make a literal table, expressed in number of elements.
MAX_LITERAL_ELEMENT_COUNT = 256
Contributor Author

Perhaps this should be bigger?

Member

Is there any DB limit? I guess it is limited by the query size... I wonder whether such a literal table will be able to use hash joins or will always fall back to a linear scan. I guess for <256 it doesn't matter much. For larger values the size of the SQL query may start being quite problematic.

I think ideally we shouldn't cut off but instead we should be creating temporary tables - but that is definitely an improvement for a separate PR.
(I also wonder, given how often Enso currently re-evaluates the expressions, if we could create too much garbage in such a way.)

Contributor Author

Yes, that was my concern. My first thought was to create a temporary table, but that will be happening repeatedly during each evaluation.

I was able to create tables with ~18000 rows this way in both postgres and sqlite, and at that point I stopped testing. I'm more concerned about the size of the query being sent over the wire, but I don't really know at what point that becomes a problem. I figured that giving it a small limit would be a good first step. I think it covers the majority of use cases.

Member

Interesting that it worked for such large sizes.

I guess the primary concern is indeed the query size, especially as a temp table is likely sent in a more compact binary format.

Anyway, seems all ok for now.

Member

256 is a reasonable default for literals. Agree with @radeusgd that it should be a temp table when it gets bigger!

Contributor Author

Will this result in a lot of temporary tables being created and not eagerly cleaned up? That was my concern when doing it the literal way.

Member

> Will this result in a lot of temporary tables being created and not eagerly cleaned up? That was my concern when doing it the literal way.

Yes indeed that is a concern.

I think we should implement such a feature as a separate PR, as it has non-trivial complexity.

Also, I think the concern of 'too many' temporary tables may also be increased due to the fact that currently the IDE seems to recompute unrelated nodes 'too often'. I think that if we resolve that, the amount of re-computations could be lower and thus the problem would not be as big.

As for implementing these temporary tables efficiently, we can leverage a few tricks:

  1. I expect that very often when the operation is re-run, the in-memory lookup table will actually be the same between re-runs. We can exploit that and try keeping a cache of already uploaded temporary tables, indexed by the hashcode of the table's contents (uploading the table requires us to scan its whole contents anyway, so the additional cost of computing the hashcode is negligible in this case). This way we can avoid uploading a new temporary table on each run, if we can detect that a 'matching' table was already uploaded before.
  2. We can exploit the Managed_Resource framework to try to clean up the tables once the references to them are GCed. We actually already implement a very similar feature - Hidden_Table_Registry. It is used to be able to re-use temporary hidden tables for dry-run operations, and clean them up once they are no longer needed. We could extend this registry to also support such temporary tables that are not used for dry-run (and accessed by name) but accessed by e.g. content's hash.

In fact we can merge the 2 approaches. If between runs the tables are the same we can avoid re-uploading them. Once all references to them are GCed we can clean them up.
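
The first trick in the list above (reusing an already uploaded table when the lookup contents have not changed) can be sketched as follows. This is an illustration in Python against SQLite's standard-library driver, not Enso code and not part of this PR. The class `TempTableCache`, its method names, and the `lookup_` table-name prefix are all hypothetical. The sketch covers only the content-hash cache; cleanup when references are garbage-collected (the `Managed_Resource` / `Hidden_Table_Registry` idea) is deliberately left out.

```python
import hashlib
import sqlite3


class TempTableCache:
    """Uploads a two-column lookup table once per distinct content (hypothetical helper)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.tables = {}  # content hash -> temporary table name
        self.uploads = 0  # number of real uploads, for demonstration only

    @staticmethod
    def content_hash(rows) -> str:
        # The rows have to be scanned for the upload anyway, so hashing them
        # adds little extra cost. The hash is sensitive to row order.
        digest = hashlib.sha256()
        for row in rows:
            digest.update(repr(tuple(row)).encode("utf-8"))
            digest.update(b"\n")
        return digest.hexdigest()

    def get_or_upload(self, rows) -> str:
        rows = [tuple(row) for row in rows]
        key = self.content_hash(rows)
        if key in self.tables:
            return self.tables[key]  # same contents as before: reuse the table
        name = "lookup_" + key[:16]
        self.conn.execute(f'CREATE TEMP TABLE "{name}" (from_value, to_value)')
        self.conn.executemany(f'INSERT INTO "{name}" VALUES (?, ?)', rows)
        self.tables[key] = name
        self.uploads += 1
        return name
```

Calling `get_or_upload` twice with the same rows returns the same table name and performs a single upload; changing any row produces a new table.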

@@ -341,6 +341,10 @@ generate_from_part dialect from_spec = case from_spec of
dialect.wrap_identifier name ++ alias dialect as_name
From_Spec.Query raw_sql as_name ->
Builder.code raw_sql . paren ++ alias dialect as_name
From_Spec.Literal_Values vecs column_names as_name ->
Runtime.assert (vecs.length == column_names.length) "Vectors and column names must have the same length"
values = Builder.join ", " (vecs.transpose.map (vec-> Builder.join ", " (vec.map Builder.interpolation) . paren))
Builder.code "(VALUES " ++ values ++ ")" ++ alias dialect as_name
From_Spec.Join kind left_spec right_spec on ->
left = generate_from_part dialect left_spec
right = generate_from_part dialect right_spec
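
To make the shape of the generated SQL concrete, here is a small Python sketch of what the `Literal_Values` branch above produces: the column vectors are transposed into rows, each value becomes a bound parameter, and the whole thing is wrapped as `(VALUES ...) AS alias`. The helper names `literal_values_sql` and `run_literal_query` are hypothetical, and the sketch assumes the alias contains no double quotes. It also relies on SQLite naming the columns of a `VALUES` subquery `column1`, `column2`, and so on, which matches the `"column"+(i+1).to_text` names used in `make_table_from_vectors`.

```python
import sqlite3

MAX_LITERAL_ELEMENT_COUNT = 256  # same limit as the constant added in this PR


def literal_values_sql(column_vectors, alias):
    """Return a '(VALUES ...) AS alias' fragment and its flat parameter list."""
    if not column_vectors or not column_vectors[0]:
        raise ValueError("Vectors cannot be empty")
    row_count = len(column_vectors[0])
    if any(len(col) != row_count for col in column_vectors):
        raise ValueError("All column vectors must have the same length")
    total_size = len(column_vectors) * row_count
    if total_size > MAX_LITERAL_ELEMENT_COUNT:
        raise ValueError(f"Too many elements for table literal ({total_size})")
    rows = list(zip(*column_vectors))  # transpose: columns -> rows
    placeholders = ", ".join(
        "(" + ", ".join("?" for _ in row) + ")" for row in rows
    )
    params = [value for row in rows for value in row]
    return f'(VALUES {placeholders}) AS "{alias}"', params


def run_literal_query(column_vectors):
    """Select both columns of a two-column literal table from in-memory SQLite."""
    fragment, params = literal_values_sql(column_vectors, "lit")
    conn = sqlite3.connect(":memory:")
    try:
        query = f"SELECT column1, column2 FROM {fragment}"
        return conn.execute(query, params).fetchall()
    finally:
        conn.close()
```

For two columns of two values each, the fragment is `(VALUES (?, ?), (?, ?)) AS "lit"` with the parameters listed row by row. Binding the values instead of inlining them avoids quoting problems, but the query text still grows with every row, which is the size concern raised in the review.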
@@ -38,6 +38,17 @@ type From_Spec
the same table.
Query (raw_sql : Text) (alias : Text)

## PRIVATE

A query source consisting of a literal VALUES clause.

Arguments:
- column_vectors: the contents of the literal table's columns.
- column_names: the names of the literal table's columns.
- alias: the name by which the table can be referred to in other parts of
the query.
Literal_Values (column_vectors : Vector (Vector Any)) (column_names : Vector Text) (alias : Text)

## PRIVATE

A query source that performs a join operation on two sources.
68 changes: 40 additions & 28 deletions distribution/lib/Standard/Table/0.0.0-dev/src/Data/Table.enso
@@ -54,6 +54,7 @@ import project.Internal.Lookup_Helpers
import project.Internal.Lookup_Helpers.Lookup_Column
import project.Internal.Parse_Values_Helper
import project.Internal.Problem_Builder.Problem_Builder
import project.Internal.Replace_Helpers
import project.Internal.Split_Tokenize
import project.Internal.Table_Helpers
import project.Internal.Table_Helpers.Table_Column_Helper
@@ -1672,6 +1673,42 @@ type Table
on_problems.attach_problems_before problems <|
Warning.set result []

## PRIVATE
A helper that creates a two-column table from a Map.

The keys of the `Map` become the first column, with name
`key_column_name`, and the values of the `Map` become the second column,
with name `value_column_name`.

For the in-memory backend, the `Map` can be empty. For the database
backends, it must not be empty.

Arguments:
- map: The `Map` to create the table from.
- key_column_name: The name to use for the first column.
- value_column_name: The name to use for the second column.
make_table_from_map : Map Any Any -> Text -> Text -> Table
make_table_from_map self map key_column_name value_column_name =
keys_and_values = map.to_vector
self.make_table_from_vectors [keys_and_values.map .first, keys_and_values.map .second] [key_column_name, value_column_name]

## PRIVATE
A helper that creates a literal table from `Vector`s.

For the in-memory backend, the columns can be empty. For the database
backends, they must not be empty.

Arguments:
- column_vectors: A `Vector` of `Vector`s; each inner `Vector` becomes a
column of the table.
- column_names: The names of the columns of the new table.
make_table_from_vectors : Vector (Vector Any) -> Vector Text -> Table
make_table_from_vectors self column_vectors column_names =
# Assume the columns are all the same length; if not, it will be an error anyway.
if column_vectors.is_empty then Error.throw (Illegal_Argument.Error "Vectors cannot be empty") else
Runtime.assert (column_vectors.length == column_names.length) "column_vectors and column_names must have the same length"
Table.new (column_vectors.zip column_names (v-> n-> Column.from_vector n v))

## PRIVATE

Create a constant column from a value.
@@ -1908,7 +1945,7 @@ type Table
In the Database backend, there are no guarantees related to ordering of
results.

? Error Conditions
! Error Conditions

- If this table or the lookup table is lacking any of the columns
specified in `key_columns`, a `Missing_Input_Columns` error is raised.
@@ -1993,7 +2030,7 @@ type Table
In the Database backend, there are no guarantees related to ordering of
results.

? Error Conditions
! Error Conditions

- If this table or the lookup table is lacking any of the columns
specified by `from_column`, `to_column`, or `column`, a
@@ -2044,32 +2081,7 @@ type Table
@to_column Widget_Helpers.make_column_name_selector
replace : Table | Map -> (Text | Integer) -> (Text | Integer | Nothing) -> (Text | Integer | Nothing) -> Boolean -> Problem_Behavior -> Table ! Missing_Input_Columns | Non_Unique_Key | Unmatched_Rows_In_Lookup
replace self lookup_table:(Table | Map) column:(Text | Integer) from_column:(Text | Integer | Nothing)=Nothing to_column:(Text | Integer | Nothing)=Nothing allow_unmatched_rows:Boolean=True on_problems:Problem_Behavior=Problem_Behavior.Report_Warning =
case lookup_table of
_ : Map ->
if from_column.is_nothing.not || to_column.is_nothing.not then Error.throw (Illegal_Argument.Error "If a Map is provided as the lookup_table, then from_column and to_column should not also be specified.") else
self.replace (map_to_lookup_table lookup_table 'from' 'to') column 'from' 'to' allow_unmatched_rows=allow_unmatched_rows on_problems=on_problems
_ : Table ->
from_column_resolved = from_column.if_nothing 0
to_column_resolved = to_column.if_nothing 1
selected_lookup_columns = lookup_table.select_columns [from_column_resolved, to_column_resolved]
self.select_columns column . if_not_error <| selected_lookup_columns . if_not_error <|
unique = self.column_naming_helper.create_unique_name_strategy
unique.mark_used (self.column_names)

## We perform a `merge` into `column`, using a duplicate of `column`
as the key column to join with `from_column`.

duplicate_key_column_name = unique.make_unique "duplicate_key"
duplicate_key_column = self.at column . rename duplicate_key_column_name
self_with_duplicate = self.set duplicate_key_column set_mode=Set_Mode.Add

## Create a lookup table with just `to_column` and `from_column`,
renamed to match the base table's `column` and its duplicate,
respectively.
lookup_table_renamed = selected_lookup_columns . rename_columns (Map.from_vector [[from_column_resolved, duplicate_key_column_name], [to_column_resolved, column]])

merged = self_with_duplicate.merge lookup_table_renamed duplicate_key_column_name add_new_columns=False allow_unmatched_rows=allow_unmatched_rows on_problems=on_problems
merged.remove_columns duplicate_key_column_name
Replace_Helpers.replace self lookup_table column from_column to_column allow_unmatched_rows on_problems
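
The behaviour that the removed block implemented via `merge` (and that now lives in `Replace_Helpers.replace`) can be summarised with a plain-Python sketch over rows held as dictionaries. The function name `replace_column` is hypothetical, and two points are assumptions made for illustration rather than statements from this diff: that unmatched rows are left unchanged when `allow_unmatched_rows` is true, and that duplicate keys in the lookup are rejected (here with `ValueError`, standing in for `Non_Unique_Key`; `LookupError` stands in for `Unmatched_Rows_In_Lookup`).

```python
def replace_column(rows, column, lookup_pairs, allow_unmatched_rows=True):
    """Return new rows with `column` replaced according to (from, to) pairs."""
    mapping = {}
    for from_value, to_value in lookup_pairs:
        if from_value in mapping:
            # Assumed analogue of Non_Unique_Key: 'from' values must be unique.
            raise ValueError(f"Non-unique key in lookup: {from_value!r}")
        mapping[from_value] = to_value

    result = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows are not mutated
        key = row[column]
        if key in mapping:
            new_row[column] = mapping[key]
        elif not allow_unmatched_rows:
            # Assumed analogue of Unmatched_Rows_In_Lookup.
            raise LookupError(f"No lookup entry for {key!r}")
        result.append(new_row)
    return result
```

Passing a list of pairs corresponds to the `Table` form with `from_column` and `to_column`; a `Map` lookup would correspond to passing `some_dict.items()`.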

## ALIAS join by row position
GROUP Standard.Base.Calculations