feat: support non-struct type class structure #328

Merged
merged 2 commits into from
Sep 27, 2022
36 changes: 30 additions & 6 deletions site/docs/types/type_classes.md
@@ -44,18 +44,42 @@ Compound type classes are type classes that need to be configured by means of a

## User-Defined Types

User-defined type classes can be created using a combination of pre-defined types. User-defined types are defined as part of [simple extensions](../extensions/index.md#simple-extensions). An extension can declare an arbitrary number of user defined extension types.
User-defined type classes can be created using a combination of pre-defined types. User-defined types are defined as part of [simple extensions](../extensions/index.md#simple-extensions). An extension can declare an arbitrary number of user defined extension types. Once a type has been declared, it can be used in function declarations.

A YAML example of an extension type is below:

```yaml
name: point
structure:
  longitude: i32
  latitude: i32
```

This declares a new type (namespaced to the associated YAML file) called "point". This type is composed of two `i32` values named longitude and latitude.

### Structure and opaque types

The name-type object notation used above is syntactic sugar for `NSTRUCT<longitude: i32, latitude: i32>`. The following means the same thing:

```yaml
name: point
structure: "NSTRUCT<longitude: i32, latitude: i32>"
```
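
The equivalence between the two notations can be sketched in Python. This is a minimal illustration and not part of Substrait: the helper name `expand_structure` is hypothetical, and the sketch assumes every field type is already a plain type string.

```python
def expand_structure(structure):
    """Expand the name-type mapping sugar into an NSTRUCT type string.

    Hypothetical helper for illustration only. A structure that is
    already written as a type string is returned unchanged.
    """
    if isinstance(structure, str):
        return structure
    # Dicts preserve insertion order, so fields keep their declared order.
    fields = ", ".join(f"{name}: {typ}" for name, typ in structure.items())
    return f"NSTRUCT<{fields}>"


point = {"longitude": "i32", "latitude": "i32"}
expanded = expand_structure(point)
```

Under these assumptions, both spellings of `point` normalize to the same string, which is all a consumer would need in order to treat them identically.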

This declares a new type (namespaced to the associated YAML file) called "point". This type is composed of two `i32` values named longitude and latitude. Once a type has been declared, it can be used in function declarations. [TBD: should field references be allowed to dereference the components of a user defined type?]
The structure field of a type is only intended to inform systems that don't have built-in support for the type how they can transfer the data type from one point to another without unnecessarily serialization/deserialization *and* without loss of type safety. Note that it is currently not possible to "unpack" a user-defined type class into its structure type or components thereof without also defining a function to do that packing/unpacking.
Member

Suggested change
The structure field of a type is only intended to inform systems that don't have built-in support for the type how they can transfer the data type from one point to another without unnecessarily serialization/deserialization *and* without loss of type safety. Note that it is currently not possible to "unpack" a user-defined type class into its structure type or components thereof without also defining a function to do that packing/unpacking.
The structure field of a type is only intended to inform systems that don't have built-in support for the type how they can transfer the data type from one point to another without unnecessary serialization/deserialization *and* without loss of type safety. Note that it is currently not possible to "unpack" a user-defined type class into its structure type or components thereof without also defining a function to do that packing/unpacking.

Small typo but also, I don't really understand this last sentence. Can you expand on what you mean here?

Contributor Author

Fixed typo and reworded sentence in ad75c2a. Does it make more sense when written that way? The point is to get rid of the TBD w.r.t. whether FieldRefs can index into non-opaque user-defined types.

Member

Yes, thank you. The new wording is clear.


The structure field is optional. If not specified, the type class is considered to be fully opaque. This implies that systems without built-in support for the type cannot manipulate values in any way, including moving and cloning. This may be useful for exotic, context-sensitive types, such as raw pointers or identifiers that cannot be cloned.

Note, however, that the vast majority of types can be trivially moved and copied, even if they cannot be precisely represented using Substrait's built-in types. In this case, it is recommended to use `binary` or `FIXEDBINARY<n>` (where n is the size of the type in bytes) as the structure type. For example, an unsigned 32-bit integer type could be defined as follows:

```yaml
name: u32
structure: "FIXEDBINARY<4>"
```

In this case, `i32` might also be used.
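
The distinction between opaque and structured type classes can be summarized in a short Python sketch. It is only an illustration of the rule described above: the function names are hypothetical, type declarations are modeled as plain dicts such as a YAML parser might produce, and `raw_pointer` is a made-up example type.

```python
def is_opaque(type_decl):
    """A type class without a `structure` field is fully opaque."""
    return type_decl.get("structure") is None


def generic_system_can_move(type_decl):
    """Whether a system lacking built-in support may move or clone values.

    Following the text above, this is only possible when the type
    declares a structure that tells the system how to carry the data.
    """
    return not is_opaque(type_decl)


u32 = {"name": "u32", "structure": "FIXEDBINARY<4>"}
raw_pointer = {"name": "raw_pointer"}  # hypothetical opaque type
```

Here `u32` can be transported as four bytes by any system, while `raw_pointer` can only be handled by systems that know the type natively.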

### Literals

Literals for user-defined types are represented using protobuf [Any](https://developers.google.com/protocol-buffers/docs/proto3#any) messages.

8 changes: 5 additions & 3 deletions text/simple_extensions_schema.yaml
@@ -13,9 +13,11 @@ properties:
   name:
     type: string
   structure:
-    type: object
-    additionalProperties:
-      $ref: "#/$defs/type"
+    oneOf:
+      - type: string # any data type
+      - type: object # syntactic sugar for a non-nullable named struct
+        additionalProperties:
+          $ref: "#/$defs/type"
   parameters: # parameter list for compound types
     $ref: "#/$defs/type_param_defs"
   variadic: # when set, last parameter may be specified one or more times
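
The `oneOf` for `structure` admits two shapes: a type string or a name-to-type mapping. A rough Python sketch of that check follows. The function name `is_valid_structure` is hypothetical, and because the definition of `#/$defs/type` is not shown in this hunk, the sketch assumes each field type is itself a type string, which is a simplification.

```python
def is_valid_structure(structure):
    """Check a `structure` value against the two shapes allowed by `oneOf`."""
    if isinstance(structure, str):
        # Any data type, written as a type string.
        return True
    if isinstance(structure, dict):
        # Sugar for a non-nullable named struct.
        # Assumption: field names and field types are both strings.
        return all(
            isinstance(name, str) and isinstance(typ, str)
            for name, typ in structure.items()
        )
    return False
```

A real validator would defer to the schema itself; this only mirrors the string-or-object split that the change introduces.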