Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removes generating a separate class/struct for nested sequence and scalar type #111

Merged
merged 8 commits into from
Jun 10, 2024

Conversation

desaikd
Copy link
Contributor

@desaikd desaikd commented May 30, 2024

Issue #110:

Description of changes:

This PR works on generating nested sequence and scalar types in the same class/struct.

What is the change?

Previously, code generator would generate nested sequence or scalar types as a separate nested class or struct in a nested module. But this new change does not create a new class rather maps these nested sequence or scalar type in the same class/struct that uses it.

e.g. For a schema as below:

type::{
   name: struct_type,
   fields:  {
      foo: { type: list, element: int }, // nested sequence type
      bar: { type: string } // nested scalar type
   }
}

Previously generated code:

class StructType {
  private NestedType1 foo;
  private NestedType2 bar; 
   ...

   class NestedType1 {
      private ArrayList<Integer> value;
      ...
   }

   class NestedType1 {
      private String value;
      ...
   }
}

Code generated with this PR changes:

class StructType {
  private ArrayList<Integer> foo;
  private String bar; 
   ...
}

(Note: For example purpose I have only used java code but same is the case with Rust. Also, top level sequence/scalar top would still generate a separate class/struct. For more information on complete newly generated code see here)

Note for reviewer:

This PR also adds a new data model that can be used by code generator to render templates. The current approach can not generate nested sequence types like {type: list, element: {type: sexp, element: int} }. This will be resolved with the new data model as well as the new data model makes it easier to store all the information used by templates in a single place. This Pr just introduces the model but doesn't use it in implementation yet. The change for usage will be added in a separate PR.

Generated code:

Generated code for Java can be found here.
Generated code for Rust can be found here.
(You can find corresponding ISL files and Ion files in code-gen-projects directory)

List of changes:

Test:

Added new test schema and input files for this change.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

* Adds changes to `Field` to store `AbstractDataType` information of the
field value
* Modified to `map_constraint_to_abstract_data_type` to consider nested
sequence/scalar type and change `AsbtractDataType` and `tera_fields`
accordingly
* Modified template to call `util_macros` for reading or writing a field
as sequence
* Modified Rust templates to consider `util_macros` when reading or
writing a nested sequence type
@desaikd desaikd marked this pull request as ready for review May 30, 2024 16:09
@desaikd desaikd requested a review from popematt May 30, 2024 16:12
@@ -56,70 +58,36 @@ public class {{ target_kind_name }} {
null
{% endif %};
{% endfor %}
{% if abstract_data_type == "Value"%}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) This template used to contain all abstract data types templates but now it separated into 3 templates.

  • class.templ (For struct type)
  • scalar.templ
  • sequence.templ

Comment on lines 77 to 78
{% if field.value_type is containing("ArrayList") %}
{{ util_macros::read_as_sequence(field=field) }}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) This statement is modified to read fields that are of type ArrayList using util_macros write_as_sequence. Previously it was not possible for a field in struct to be of sequence type(e.g. It would be inside some nested type NestedtypeX). But with this change it is possible to have fields with sequence type (e.g. ArrayList<Integer>).

Comment on lines +110 to +112
{% if field.value_type | is_built_in_type == false %}
{% if field.value_type is containing("ArrayList") %}
{{ util_macros::write_as_sequence(field=field) }}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) This statement is modified to write fields that are of type ArrayList using util_macros write_as_sequence. Previously it was not possible for a field in struct to be of sequence type(e.g. It would be inside some nested type NestedtypeX). But with this change it is possible to have fields with sequence type (e.g. ArrayList<Inetger>).

@@ -0,0 +1,69 @@
package {{ namespace }};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) Contents of files java/scalar.templ, java/sequence.templ, rust/scalar.templ, rust/sequence.templ are copied from java/class.templ and rust/struct.templ respectively.

@@ -29,106 +31,52 @@ pub mod {{ target_kind_name | snake }} {

pub fn read_from(reader: &mut Reader) -> SerdeResult<Self> {
let mut abstract_data_type = {{ target_kind_name }}::default();
{% if abstract_data_type == "Value"%}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) This template used to contain all abstract data types templates but now it separated into 3 templates.

  • struct.templ (For struct type)
  • scalar.templ
  • sequence.templ

Comment on lines +40 to +42
{% if field.value_type | is_built_in_type == false %}
{% if field.value_type is containing("Vec") %}
"{{ field.name }}" => { {{ util_macros::read_as_sequence(field=field) }} }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ PR tour) This statement is modified to read fields that are of type Vec using util_macros write_as_sequence. Previously it was not possible for a field in struct to be of sequence type(e.g. It would be inside some nested type NestedtypeX). But with this change it is possible to have fields with sequence type (e.g. Vec<i64>).

Comment on lines +68 to +70
{% if field.value_type | is_built_in_type == false %}
{% if field.value_type is containing("Vec") %}
{{ util_macros::write_as_sequence(field=field) }}
Copy link
Contributor Author

@desaikd desaikd May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) This statement is modified to write fields that are of type Vec using util_macros write_as_sequence. Previously it was not possible for a field in struct to be of sequence type(e.g. It would be inside some nested type NestedtypeX). But with this change it is possible to have fields with sequence type (e.g. Vec<i64>).

Comment on lines +230 to +232
Struct, // Represents a template for a Rust struct or Java class with Ion struct value
Sequence, // Represents a template for a Rust struct or Java class with Ion sequence value
Scalar, // Represents a template for a Rust struct or Java class with Ion scalar value
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) Add new templates names here to use template based on abstract data type.

tests/cli.rs Outdated
@@ -279,8 +279,8 @@ fn test_write_all_values(#[case] number: i32, #[case] expected_output: &str) ->
}
}
"#,
&["nested_type: NestedType1"],
&["pub fn nested_type(&self) -> &NestedType1 {"]
&["nested_type: i64"],
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(🗺️ PR tour) This file contains changes for having nested sequence type/scalar type in the same struct/class without generating separate structure.

let mut isl_type_name = value.type_reference().name();

if let Some(type_reference_name) = &type_name {
if type_reference_name.contains("NestedType") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a short-term hack, it's okay. It's a problem if this is here long term because there's nothing stopping a user from creating a type named NestedType.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, with the model change(Likely next PR) I think we can add a field in the model suggesting whether this is an inline type or not. And that should help with these kind of checks.

/// A target-language-agnostic data type that determines which template(s) to use for code generation.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Serialize)]
pub enum CodeGenType {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion—if you create types for each of the variants, then you can require specific types in the signatures of various functions.

pub enum CodeGenType {
    Value(Value),
    Sequence(Sequence),
    // etc.
}    

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! Will change that.

// value: i64
// }
// ```
Value {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion—Scalar?

// }
// ```
Value {
isl_type_name: String,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is actually the name from ISL, then it should probably be Option<String>. If this is guaranteed to always have a name, then perhaps it should just be type_name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I will update this to be Option<String> instead.

@@ -74,7 +76,11 @@
reader.{{ field.value_type | camel }}Value();
{% endif %}
{% else %}
{{ field.value_type }}.readFrom(reader);
{% if field.value_type is containing("ArrayList") %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this break if someone creates an ISL type like MyArrayListener?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will. I will be changing the data model defined as per the current PR in model.rs. That has a reference to CodeGenType and based on this enum variant these checks can be defined properly.

{% endif %}
{% endfor %}
{% elif abstract_data_type is object and abstract_data_type is containing("Structure") %}
writer.step_in(IonType::Struct)?;
{% for field in fields %}
writer.set_field_name("{{ field.name }}");
{% if field.value_type | is_built_in_type == false %}
self.{{ field.name | snake }}.write_to(writer)?;
{% if field.value_type is containing("Vec") %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I define an ISL type such as FeatureVector or FlowVector, this is probably going to do surprising things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. which is why with the new model we can just check the data model type.

@desaikd
Copy link
Contributor Author

desaikd commented Jun 3, 2024

As per offline discussion, I am removing model.rs from this PR and will be added as a separate PR.

@desaikd desaikd merged commit 4959fdf into amazon-ion:master Jun 10, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants