Debug log when type_code fails to convert to a data_type #8957

Closed
wants to merge 9 commits into from
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20231031-161922.yaml
@@ -0,0 +1,6 @@
+kind: Features
+body: Debug log when `type_code` fails to convert to a `data_type`
+time: 2023-10-31T16:19:22.226267-06:00
+custom:
+  Author: dbeatty10
+  Issue: "8912"
1 change: 1 addition & 0 deletions .gitattributes
@@ -4,3 +4,4 @@ core/dbt/docs/build/html/searchindex.js binary
 core/dbt/docs/build/html/index.html binary
 performance/runner/Cargo.lock binary
 core/dbt/events/types_pb2.py binary
+core/dbt/docs/build/html/_static/jquery.js binary
10 changes: 7 additions & 3 deletions core/dbt/events/README.md
@@ -8,9 +8,10 @@ The event module provides types that represent what is happening in dbt in `even
When events are processed via `fire_event`, nearly everything is logged. Whether or not the user has enabled the debug flag, all debug messages are still logged to the file. However, some events are particularly time-consuming to construct because they return a huge amount of data. Today, the only messages in this category are cache events, which are only logged if the `--log-cache-events` flag is on. This is important because these messages should not be created unless they are going to be logged, because they cause a noticeable performance degradation. These events use the "fire_event_if" function.

# Adding a New Event
-* Add a new message in types.proto, and a second message with the same name + "Msg". The "Msg" message should have two fields, an "info" field of EventInfo, and a "data" field referring to the message name without "Msg"
-* run the protoc compiler to update types_pb2.py: make proto_types
-* Add a wrapping class in core/dbt/event/types.py with a Level superclass plus code and message methods
+* Install the [`protoc`](https://grpc.io/docs/protoc-installation/) protocol buffer compiler
+* Add a new message in core/dbt/events/types.proto, and a second message with the same name + "Msg". The "Msg" message should have two fields, an "info" field of EventInfo, and a "data" field referring to the message name without "Msg"
+* run the protoc compiler to update types_pb2.py: `make proto_types`
+* Add a wrapping class in core/dbt/events/types.py with a Level mixin superclass (e.g. `DebugLevel`) plus code and message methods
* Add the class to tests/unit/test_events.py

We have switched from using betterproto to using google protobuf, because of a lack of support for Struct fields in betterproto.
@@ -50,6 +51,9 @@ logger = AdapterLogger("<database name>")

 ## Compiling types.proto

+Install the `protoc` protocol buffer compiler:
+- https://grpc.io/docs/protoc-installation/
+
 After adding a new message in `types.proto`, either:
 - In the repository root directory: `make proto_types`
 - In the `core/dbt/events` directory: `protoc -I=. --python_out=. types.proto`
10 changes: 10 additions & 0 deletions core/dbt/events/types.proto
@@ -861,6 +861,16 @@ message ConstraintNotSupportedMsg {
     ConstraintNotSupported data = 2;
 }

+// E050
+message TypeCodeNotFound {
+    int32 type_code = 1;
+}
+
+message TypeCodeNotFoundMsg {
+    EventInfo info = 1;
+    TypeCodeNotFound data = 2;
+}
+
 // I - Project parsing

 // I001
9 changes: 9 additions & 0 deletions core/dbt/events/types.py
@@ -796,6 +796,15 @@ def message(self) -> str:
         return line_wrap_message(warning_tag(msg))


+class TypeCodeNotFound(DebugLevel):
+    def code(self) -> str:
+        return "E050"
+
+    def message(self) -> str:
+        msg = f"The `type_code` {self.type_code} was not recognized, which may affect error messages for enforced contracts that fail as well as `Column.data_type` values returned by `get_column_schema_from_query`"
+        return msg
+
+
 # =======================================================
 # I - Project parsing
 # =======================================================
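A minimal sketch of the wrapper-class pattern that the README steps and the `types.py` hunk describe: a level mixin plus `code` and `message` methods. It does not import dbt. `DebugLevel` here is a hypothetical stand-in, and the dataclass field is an assumption standing in for the field that the generated protobuf message would supply. The message text is shortened to the prefix that the functional tests check.

```python
from dataclasses import dataclass


class DebugLevel:
    """Hypothetical stand-in for dbt's level mixin; it only reports a level name."""

    def level_tag(self) -> str:
        return "debug"


@dataclass
class TypeCodeNotFound(DebugLevel):
    # Mirrors the single `type_code` field of the proto message in this PR.
    # Assumption: a dataclass field replaces the protobuf-backed attribute.
    type_code: int

    def code(self) -> str:
        # Event code assigned to this event in the PR.
        return "E050"

    def message(self) -> str:
        # Shortened version of the message defined in the PR.
        return f"The `type_code` {self.type_code} was not recognized"


event = TypeCodeNotFound(type_code=790)
print(event.level_tag(), event.code(), event.message())
```

The unit test entry `types.TypeCodeNotFound(type_code=0)` constructs the event the same way, with a keyword argument per proto field.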
1,838 changes: 921 additions & 917 deletions core/dbt/events/types_pb2.py

Large diffs are not rendered by default.

3 changes: 3 additions & 0 deletions plugins/postgres/dbt/adapters/postgres/connections.py
@@ -8,6 +8,8 @@
 from dbt.adapters.sql import SQLConnectionManager
 from dbt.contracts.connection import AdapterResponse
 from dbt.events import AdapterLogger
+from dbt.events.functions import fire_event
+from dbt.events.types import TypeCodeNotFound

 from dbt.helper_types import Port
 from dataclasses import dataclass
@@ -207,4 +209,5 @@ def data_type_code_to_name(cls, type_code: int) -> str:
         if type_code in string_types:
             return string_types[type_code].name
         else:
+            fire_event(TypeCodeNotFound(type_code=type_code))
             return f"unknown type_code {type_code}"
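The change to `data_type_code_to_name` can be sketched in isolation as a lookup that reports a miss before returning the fallback string. This is an illustration under assumptions, not the adapter's code: `KNOWN_TYPES` is a hypothetical three-entry table, and the `on_unknown` callback stands in for `fire_event(TypeCodeNotFound(...))`.

```python
from typing import Callable, Dict, List

# Hypothetical, illustrative mapping; the real adapter takes its table from the driver.
KNOWN_TYPES: Dict[int, str] = {16: "BOOLEAN", 23: "INTEGER", 1700: "DECIMAL"}


def data_type_code_to_name(type_code: int, on_unknown: Callable[[int], None]) -> str:
    """Return the type name, or report the miss and return a placeholder."""
    if type_code in KNOWN_TYPES:
        return KNOWN_TYPES[type_code]
    # Report the unrecognized code before falling back, as the PR does with a debug event.
    on_unknown(type_code)
    return f"unknown type_code {type_code}"


# Collect unrecognized codes in a list instead of firing a real event.
unrecognized: List[int] = []
names = [data_type_code_to_name(code, unrecognized.append) for code in (23, 790)]
print(names, unrecognized)
```

The return value is unchanged for callers; the only new behavior is the side effect on the miss path.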
15 changes: 10 additions & 5 deletions tests/functional/contracts/test_nonstandard_data_type.py
@@ -1,5 +1,5 @@
 import pytest
-from dbt.tests.util import run_dbt, run_dbt_and_capture
+from dbt.tests.util import run_dbt_and_capture


 my_numeric_model_sql = """
@@ -44,7 +44,9 @@ def models(self):
         }

     def test_nonstandard_data_type(self, project):
-        run_dbt(["run"], expect_pass=True)
+        expected_debug_msg = "The `type_code` 790 was not recognized"
+        _, logs = run_dbt_and_capture(["--debug", "run"], expect_pass=True)
+        assert expected_debug_msg in logs


 class TestModelContractUnrecognizedTypeCodeActualMismatch:
@@ -57,8 +59,10 @@ def models(self):

     def test_nonstandard_data_type(self, project):
         expected_msg = "unknown type_code 790 | DECIMAL | data type mismatch"
-        _, logs = run_dbt_and_capture(["run"], expect_pass=False)
+        expected_debug_msg = "The `type_code` 790 was not recognized"
+        _, logs = run_dbt_and_capture(["--debug", "run"], expect_pass=False)
         assert expected_msg in logs
+        assert expected_debug_msg in logs


 class TestModelContractUnrecognizedTypeCodeExpectedMismatch:
@@ -71,6 +75,7 @@ def models(self):

     def test_nonstandard_data_type(self, project):
         expected_msg = "DECIMAL | unknown type_code 790 | data type mismatch"
-        _, logs = run_dbt_and_capture(["run"], expect_pass=False)
-        print(logs)
+        expected_debug_msg = "The `type_code` 790 was not recognized"
+        _, logs = run_dbt_and_capture(["--debug", "run"], expect_pass=False)
         assert expected_msg in logs
+        assert expected_debug_msg in logs
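The tests above add `--debug` to the invocation, presumably because a debug-level event only reaches the captured output when debug logging is enabled. The sketch below shows that general behavior with Python's standard `logging` module; it does not use dbt's event system, and `run_and_capture` and the logger name are hypothetical.

```python
import io
import logging


def run_and_capture(debug: bool) -> str:
    """Emit one debug and one info record, and return what the handler captured."""
    stream = io.StringIO()
    logger = logging.getLogger("type_code_sketch")  # hypothetical logger name
    logger.handlers = [logging.StreamHandler(stream)]
    logger.propagate = False
    # The debug record is dropped unless the logger level allows it.
    logger.setLevel(logging.DEBUG if debug else logging.INFO)
    logger.debug("The `type_code` %s was not recognized", 790)
    logger.info("run finished")
    return stream.getvalue()


expected_debug_msg = "The `type_code` 790 was not recognized"
with_debug = run_and_capture(debug=True)
without_debug = run_and_capture(debug=False)
print(expected_debug_msg in with_debug, expected_debug_msg in without_debug)
```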
1 change: 1 addition & 0 deletions tests/unit/test_events.py
@@ -193,6 +193,7 @@ def test_event_codes(self):
     types.FinishedRunningStats(stat_line="", execution="", execution_time=0),
     types.ConstraintNotEnforced(constraint="", adapter=""),
     types.ConstraintNotSupported(constraint="", adapter=""),
+    types.TypeCodeNotFound(type_code=0),
     # I - Project parsing ======================
     types.InputFileDiffError(category="testing", file_id="my_file"),
     types.InvalidValueForField(field_name="test", field_value="test"),