Suppose the user issues a series of statements like:
CREATE STREAM IF NOT EXISTS FOO AS <QUERY 1>;
CREATE STREAM IF NOT EXISTS FOO AS <QUERY 2>;
CREATE STREAM IF NOT EXISTS FOO AS <QUERY 3>;
Then, ksql will write a series of entries to the command topic that look like:
CREATE STREAM IF NOT EXISTS FOO w/ DDL 1 and Query Plan 1
CREATE STREAM IF NOT EXISTS FOO w/ DDL 2 and Query Plan 2
CREATE STREAM IF NOT EXISTS FOO w/ DDL 3 and Query Plan 3
...
The second and third statements should not actually execute because the stream FOO already exists, so we should never be writing them to the command topic.
When the node is restarted, the command runner picks these entries up and compacts them down to:
CREATE STREAM IF NOT EXISTS FOO w/ DDL 1 and no query plan
CREATE STREAM IF NOT EXISTS FOO w/ DDL 2 and no query plan
CREATE STREAM IF NOT EXISTS FOO w/ DDL 3 and Query Plan 3
This is not intentional - the command runner's replay compaction logic isn’t programmed to handle the IF NOT EXISTS case - it assumes that if another CSAS for the same stream is encountered, it’s a CREATE OR REPLACE.
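The compaction behaviour described above can be sketched roughly as follows. This is a simplified model, not the actual ksql command runner code: the `Command` type, its fields, and the `compact` function are hypothetical names introduced only for illustration. It assumes compaction keeps the query plan only on the last command seen for a given stream and never looks at the IF NOT EXISTS flag.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Command:
    # Hypothetical, simplified stand-in for a command topic entry.
    source: str                 # stream being created, e.g. "FOO"
    ddl: str                    # label for the DDL part, e.g. "DDL 1"
    query_plan: Optional[str]   # label for the query plan, None once stripped
    if_not_exists: bool = False


def compact(commands: List[Command]) -> List[Command]:
    """Keep the query plan only on the last command for each source.

    Assumption being modelled: a later CSAS for the same stream is treated
    as a replacement, so if_not_exists is never consulted.
    """
    last_index = {cmd.source: i for i, cmd in enumerate(commands)}
    return [
        cmd if last_index[cmd.source] == i else replace(cmd, query_plan=None)
        for i, cmd in enumerate(commands)
    ]


log = [
    Command("FOO", "DDL 1", "Query Plan 1", if_not_exists=True),
    Command("FOO", "DDL 2", "Query Plan 2", if_not_exists=True),
    Command("FOO", "DDL 3", "Query Plan 3", if_not_exists=True),
]
compacted = compact(log)
for cmd in compacted:
    print(cmd.ddl, "->", cmd.query_plan)
```

In this model only the entry for DDL 3 keeps its query plan, which matches the compacted log shown above.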
Then, when the engine goes to execute this, DDL 1 succeeds. DDL 2 and DDL 3 are not applied because FOO already exists. This is another bug: the implementation of NOT EXISTS (fix: CREATE IF NOT EXISTS does not work at all by hemantgs · Pull Request #6073 · confluentinc/ksql) changed the handling of DDL execution to silently return, rather than throw an error, when trying to create an existing stream/table. So now if we accidentally generate a bad series of commands in the log, we’ll just silently ignore it.
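Finally, because we silently ignore the above case, the command carrying Query Plan 3 is skipped along with DDL 3, and Query Plan 1 was already stripped during compaction. So we never actually start any query, and the user has no query running after the restart.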
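To make the end result concrete, here is a minimal sketch of replaying the compacted log with the silent-return behaviour. The `replay` function and its tuple-based command format are hypothetical simplifications, not the engine's real API. The sketch assumes that a command whose stream already exists is skipped entirely, including its query plan.

```python
from typing import Dict, List, Optional, Tuple

# (DDL label, query plan label or None), as left behind by replay compaction.
CompactedCommand = Tuple[str, Optional[str]]


def replay(commands: List[CompactedCommand],
           source: str = "FOO") -> Tuple[Dict[str, str], List[str]]:
    metastore: Dict[str, str] = {}   # source name -> DDL that created it
    running: List[str] = []          # query plans that were actually started
    for ddl, plan in commands:
        if source in metastore:
            # Silent return: the source exists, so the whole command is
            # skipped, including any query plan attached to it.
            continue
        metastore[source] = ddl
        if plan is not None:
            running.append(plan)
    return metastore, running


metastore, running = replay([
    ("DDL 1", None),
    ("DDL 2", None),
    ("DDL 3", "Query Plan 3"),
])
print(metastore, running)
```

FOO ends up registered via DDL 1, but the list of running queries is empty.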
So there are two bugs here:
1. NOT EXISTS is not implemented correctly - those statements should never get appended to the log.
2. We shouldn’t silently ignore DDL execution errors - that’s a hack we shoehorned in to make the bad NOT EXISTS implementation work.
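One possible shape for the two fixes is sketched below. Everything here is illustrative: `submit`, `execute_ddl`, and `DuplicateSourceError` are hypothetical names, and the real fix in ksql may look quite different. The idea is that the IF NOT EXISTS check happens before anything is written to the command log, while DDL execution goes back to failing loudly on a duplicate source.

```python
from typing import List, Set


class DuplicateSourceError(Exception):
    """Raised when DDL tries to create a source that already exists."""


def submit(source: str, if_not_exists: bool,
           existing: Set[str], command_log: List[str]) -> bool:
    """Validate a CREATE statement; return True if it was appended to the log."""
    if source in existing:
        if if_not_exists:
            return False  # no-op: nothing is written to the log
        raise DuplicateSourceError(source)
    existing.add(source)
    command_log.append(f"CREATE STREAM {source}")
    return True


def execute_ddl(source: str, metastore: Set[str]) -> None:
    """Apply DDL during replay; a duplicate is an error, never ignored."""
    if source in metastore:
        raise DuplicateSourceError(source)
    metastore.add(source)


existing: Set[str] = set()
command_log: List[str] = []
results = [submit("FOO", True, existing, command_log) for _ in range(3)]
print(results, command_log)
```

With the check done at submission time, the three statements from the example produce a single log entry, so replay never sees a duplicate and has no reason to swallow the error.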