Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add time/timestamp 6 connector smoke test coverage #12577

Conversation

findepi
Copy link
Member

@findepi findepi commented May 27, 2022

No description provided.

@findepi findepi added test maintenance Project maintenance task labels May 27, 2022
@findepi findepi requested review from wendigo, losipiuk, ebyhr and hashhar May 27, 2022 13:32
@cla-bot cla-bot bot added the cla-signed label May 27, 2022
@findepi findepi force-pushed the findepi/add-time-timestamp-6-connector-smoke-test-coverage-8b0ce1 branch from 9aec51a to 32ede8a Compare May 27, 2022 14:57
Copy link
Member

@ebyhr ebyhr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good except for Phoenix failure:

Error:  io.trino.plugin.phoenix5.TestPhoenixConnectorTest.testDataMappingSmokeTest[time!](12)  Time elapsed: 1.698 s  <<< FAILURE!
java.lang.AssertionError: 

Expecting code to raise a throwable.
	at io.trino.testing.BaseConnectorTest.testDataMapping(BaseConnectorTest.java:3284)
	at io.trino.testing.BaseConnectorTest.testDataMappingSmokeTest(BaseConnectorTest.java:3258)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:104)
	at org.testng.internal.Invoker.invokeMethod(Invoker.java:645)
	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:851)
	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1177)
	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:129)
	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:112)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)

@findepi findepi force-pushed the findepi/add-time-timestamp-6-connector-smoke-test-coverage-8b0ce1 branch from 32ede8a to 5b270e4 Compare May 30, 2022 09:03
@@ -304,7 +304,10 @@ protected Optional<DataMappingTestSetup> filterDataMappingSmokeTestData(DataMapp
{
String typeName = dataMappingTestSetup.getTrinoTypeName();
if (typeName.startsWith("decimal(")
|| typeName.equals("time(6)")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: Would EnumSet be a better "logical" way to group these ? (this would be a change to how we usually do this - so this is just a question, not specifically about this particular PR/change)

private static final EnumSet UNSUPPORTED_TYPES = EnumSet.of(TIME_6, TIMESTAMP_6, TIMESTAMP_WITH_TZ_3, TIMESTAMP_WITH_TZ_6).addAll(EnumSet.range(DECIMAL_3_0, DECIMAL_38_0));
[...]
TrinoType type = TrinoType.fromName(typeName);
if (UNSUPPORTED_TYPES.contains(type) {
    return Optional.of(dataMappingTestSetup.asUnsupported());
}

Pros:

  • faster and less error prone (they work like a bitset / it's not a string comparison)
  • logic of what's supported is defined at the field level (here it's in a Test class, but still useful to have it grouped/defined in one place, rather than littered about in test - how do we know if we've changed all the if methods to include the new type?,etc,
    Cons:
  • harder to do "fuzzy" matching (e.g., startsWith("decimal")), but it's still possible to have a "fromName(String typeName)" which does that matching.
  • might be difficult to maintain across different DB/drivers (and a lot of boilerplate for types if we duplicate it for each driver).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With parametric types, there could be plenty of enums like this.

Copy link
Member

@leveyja leveyja Jun 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but the set is finite / reflects the types we support / do not support. It's the same as the number of String representations of types, except the metadata about the precision, etc, can be stored with the enum type, and there's a centralised place to see information about the types. (Find Usages can also be used in the IDE, instead of full text searching for logic related to a particular type, etc).

public enum TrinoType {
    [...]
    TEXT(Integer.MAX_VALUE/2, "Text type", ""),
    [...]
    TIME(?, "description", "other properties),
    TIME3(3, "description", "other properties),
    TIME6(6, "description", "other properties),
    [...]
    ;
    int precision;
    String description;
    Function<T, R> converter;
    public TrinoType(int precision, String description, Function<T, R> converter(?)) {
        this.precision = precision; this.description = description; // etc
   }
}

Some benefits:

  • A developer can create a set containing all time types using EnumSet.range(TIME3, TIMESTAMP6_TZ),
  • The set of supported / unsupported types can be declared in an EnumSet for each connector (there's also an EnumSet.noneOf(...) which creates the compliment of a set), allowing for declarative type support (not having to check if statements in multiple places for each connector (e.g., filterDataMappingSmokeTestData, etc).
  • Adding a "new" type would simply be a matter of adding it to "supported" or "unsupported" EnumSets.
  • Right now, in test code we have if (typeName.startsWith(...)) or .contains(...), which means adding a new type is not always safe / will match new types accidentally. It also means modifying if else logic in several places to define whether a type is supported/tested or not, and it's hard to spot "errors of omission" ( e.g., the test returns "Optional.empty()" if the type isn't recognised, and we just don't "see" that a type is missing in the review, etc.)

(Final random example from searching the code)

    @DataProvider
    public static Object[][] timestampTypes()
    {
        // Timestamp with timezone is not supported by the SqlServer connector
        return new Object[][] {
                {"timestamp"},
                {"timestamp(3)"},
                {"timestamp(6)"},
                {"timestamp(9)"},
                {"timestamp(12)"}
        };
    }

I'd prefer to see

public class SqlServerClient
{
...
public static final SUPPORTED_TIMESTAMP_TYPES = EnumSet.of(....);         // Timestamp with timezone is not supported by the SqlServer connector

}

public class TestSqlServerConnectorTest
{

    @DataProvider
    public static Object[][] timestampTypes()
    {
//        return asObjectArray(SqlServerClient.SUPPORTED_TIMESTAMP_TYPES);
        return SUPPORTED_TIMESTAMP_TYPES.stream()
                .map(trinoTypes -> new Object[]{trinoTypes.name()})
                .collect(Collectors.toList()).toArray(Object[][]::new); // could probably use a utility method for this :-)
    }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but the set is finite / reflects the types we support / do not support

The number of (parameterized) types we do support is more than 2 billion.

Of course, we don't want to test every each of them separately, so we're narrowing the list to types tested here.
Since it's single use, string matching isn't so bad an idea -- because, in the worst case, the test will test too much (not too little).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right - but in code I see "time(6)" - I don't see "time(5)" - so these can be enumerated and be < 1000.
It's String Literal versus "Type" - I'm not suggesting we move to Scala, I'm suggesting we enumerate the types we support in a single place (again, this is a high-level question revolving around "is there a better way?", not directly related to this PR, etc). I'll investigate a bit more and if I come up with a way to consolidate this (as we have with BCT feature support) I'll propose something more concrete than these "review comment" suggestions 👍

@findepi findepi merged commit e23e1c4 into master Jun 1, 2022
@findepi findepi deleted the findepi/add-time-timestamp-6-connector-smoke-test-coverage-8b0ce1 branch June 1, 2022 12:33
@github-actions github-actions bot added this to the 383 milestone Jun 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed maintenance Project maintenance task test
Development

Successfully merging this pull request may close these issues.

4 participants