Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option to have null defaults for avro sink schema #21842

Conversation

moulimukherjee
Copy link
Contributor

@moulimukherjee moulimukherjee commented Sep 20, 2023

This adds a way to specify an option NULL DEFAULTS in the CREATE SINK sql. If this is set, then the generated avro schema will have nullable fields with default value of null.

Motivation

Fixes https://github.com/MaterializeInc/database-issues/issues/6549

Tips for reviewer

I added testdrive tests to check some nesting and various combination of how that flag could be specified. Please let me know what else will be valuable to test.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • This PR includes the following user-facing behavior changes:

chaas pushed a commit to chaas/materialize that referenced this pull request Sep 20, 2023
chaas pushed a commit to chaas/materialize that referenced this pull request Sep 20, 2023
@moulimukherjee moulimukherjee force-pushed the support-automatic-nullable-default branch 2 times, most recently from 42f7e04 to 86c615c Compare September 21, 2023 22:08
@moulimukherjee moulimukherjee changed the title WIP: Default nulls for avro sink schema Option to have null defaults for avro sink schema Sep 21, 2023
@moulimukherjee moulimukherjee marked this pull request as ready for review September 21, 2023 22:25
@moulimukherjee moulimukherjee requested a review from a team September 21, 2023 22:25
@moulimukherjee moulimukherjee requested a review from a team as a code owner September 21, 2023 22:25
Copy link
Contributor

@maddyblue maddyblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parser/plan parts lgtm.

@moulimukherjee moulimukherjee changed the title Option to have null defaults for avro sink schema WIP: Option to have null defaults for avro sink schema Sep 22, 2023
@moulimukherjee moulimukherjee marked this pull request as draft September 22, 2023 04:27
@moulimukherjee
Copy link
Contributor Author

Moving to draft, looking into ci failures.

@benesch benesch removed their request for review September 22, 2023 06:32
@benesch
Copy link
Member

benesch commented Sep 22, 2023

I like where we landed with NULL DEFAULTS. Feels right to me. Don't wait for my further review—out of bandwidth.

src/sql/src/plan/statement/ddl.rs Outdated Show resolved Hide resolved
src/sql/src/plan/statement/ddl.rs Outdated Show resolved Hide resolved
misc/python/materialize/checks/temporal_types.py Outdated Show resolved Hide resolved
src/interchange/src/avro.rs Outdated Show resolved Hide resolved
@@ -77,7 +77,7 @@ impl fmt::Debug for JsonEncoder {
"schema",
&format!(
"{:?}",
build_row_schema_json(&self.value_columns, "schema", &BTreeMap::new())
build_row_schema_json(&self.value_columns, "schema", &BTreeMap::new(), false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto here––using boolean params in the outermost API isn't ideal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am leaving this one in. In the following avro updates, there will be more configs like user defined comments and eventually user defined default values which should live in a struct along with this flag.

src/interchange/src/json.rs Outdated Show resolved Hide resolved
Mouli Mukherjee and others added 5 commits September 22, 2023 12:54
@moulimukherjee moulimukherjee force-pushed the support-automatic-nullable-default branch from e347730 to 86fcbdc Compare September 22, 2023 19:54
@moulimukherjee
Copy link
Contributor Author

moulimukherjee commented Sep 22, 2023

Seeing a "the in-memory state of the catalog does not match its on-disk state" error. Not happening locally though. Checking what's causing it. https://buildkite.com/materialize/tests/builds/64317#018abe9b-c962-4aea-b8d2-b87209dfdee5/1438-1440

src/interchange/src/json.rs Outdated Show resolved Hide resolved
@moulimukherjee moulimukherjee force-pushed the support-automatic-nullable-default branch from 5e632a9 to ec7a329 Compare September 25, 2023 18:54
Copy link
Contributor

@umanwizard umanwizard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

Feel free to re-request review if you feel it's necessary after you mark this as non-draft.

src/interchange/src/avro/encode.rs Outdated Show resolved Hide resolved
src/interchange/src/json.rs Show resolved Hide resolved
@moulimukherjee moulimukherjee marked this pull request as ready for review September 26, 2023 00:20
@moulimukherjee moulimukherjee changed the title WIP: Option to have null defaults for avro sink schema Option to have null defaults for avro sink schema Sep 26, 2023
@moulimukherjee moulimukherjee enabled auto-merge (squash) September 26, 2023 00:24
@moulimukherjee
Copy link
Contributor Author

Addressed the feedback, set this to auto-merge.

@moulimukherjee moulimukherjee merged commit e20ae46 into MaterializeInc:main Sep 26, 2023
def- added a commit to def-/materialize that referenced this pull request Sep 27, 2023
Also added 2 small parsing checks

Follow-up to MaterializeInc#21842 which I missed
def- added a commit to def-/materialize that referenced this pull request Sep 27, 2023
Also added 2 small parsing checks

Follow-up to MaterializeInc#21842 which I missed
def- added a commit to def-/materialize that referenced this pull request Sep 27, 2023
Also added 2 small parsing checks

Follow-up to MaterializeInc#21842 which I missed
def- added a commit to def-/materialize that referenced this pull request Sep 27, 2023
Also added 2 small parsing checks

Follow-up to MaterializeInc#21842 which I missed
def- added a commit to def-/materialize that referenced this pull request Sep 27, 2023
Also added 2 small parsing checks

Follow-up to MaterializeInc#21842 which I missed
mgree pushed a commit to mgree/materialize that referenced this pull request Nov 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants