-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add recipes to migrate from TestNG to AssertJ #520
Conversation
c08c364
to
9ed3703
Compare
9ed3703
to
cf77667
Compare
Changed automatically by openrewrite recipe openrewrite/rewrite-testing-frameworks#520 ``` <plugin> <groupId>org.openrewrite.maven</groupId> <artifactId>rewrite-maven-plugin</artifactId> <version>5.32.0</version> <configuration> <activeRecipes> <recipe>org.openrewrite.java.testing.assertj.TestNgToAssertj</recipe> </activeRecipes> </configuration> <dependencies> <dependency> <groupId>org.openrewrite.recipe</groupId> <artifactId>rewrite-testing-frameworks</artifactId> <version>2.11.0-SNAPSHOT</version> </dependency> </dependencies> </plugin> ``` run ``` ./mvnw rewrite:run -Dmaven.javadoc.skip=true -DskipTests=true -Dmaven.site.skip=true -Dmaven.artifact.threads=16 -T 1C -e -Dair.check.skip-all=true --no-snapshot-updates -pl '!:trino-server-rpm' ``` Some formatting was fixed manually for readability. Then, rewrite remaining TestNG Assertions to AssertJ manually, because some entries were not migrated by `./mvnw rewrite:run`.
Changed automatically by openrewrite recipe openrewrite/rewrite-testing-frameworks#520 ``` <plugin> <groupId>org.openrewrite.maven</groupId> <artifactId>rewrite-maven-plugin</artifactId> <version>5.32.0</version> <configuration> <activeRecipes> <recipe>org.openrewrite.java.testing.assertj.TestNgToAssertj</recipe> </activeRecipes> </configuration> <dependencies> <dependency> <groupId>org.openrewrite.recipe</groupId> <artifactId>rewrite-testing-frameworks</artifactId> <version>2.11.0-SNAPSHOT</version> </dependency> </dependencies> </plugin> ``` run ``` ./mvnw rewrite:run -Dmaven.javadoc.skip=true -DskipTests=true -Dmaven.site.skip=true -Dmaven.artifact.threads=16 -T 1C -e -Dair.check.skip-all=true --no-snapshot-updates -pl '!:trino-server-rpm' ``` Some formatting was fixed manually for readability. Then, rewrite remaining TestNG Assertions to AssertJ manually, because some entries were not migrated by `./mvnw rewrite:run`.
Awesome work here @ssheikin ! Only saw your PR just now as drafts don't immediately pop up as a notification for me; is there anything you're looking to add before this is ready to review? |
src/main/java/org/openrewrite/java/testing/assertj/testng/TestNgAssertEqualsToAssertThat.java
Outdated
Show resolved
Hide resolved
@timtebeek This solution is a straight-forward copy-paste, which is kind of a 'works for me'. |
Thanks for the response and great start @ssheikin ; reading between the lines I get the sense that this is donation of recipes that you'd be glad for us to take over, but not immediately planning any future work on this yourself; is that correct? If so that'd be perfectly fine(!); just wondering what kind of collaboration to expect on this going forward. Would you be ok for us to rework this into something we can merge more easily? Indeed my thoughts would be to do what we can with Refaster recipes, as some of those are already available, and they greatly simplify the recipes that we'd have to maintain. Adding some tests to this PR (which I'm open to doing) should help test out that theory. The case you listed about assertions in Lambdas is interesting; We'd need to add some unit tests for those specifically, as I suspect there might be an issue with how tree elements are visited that incorrectly skips lambda blocks within other assertion method arguments. Enough fun areas to explore I guess, and thanks again for kicking this off with a first PR! :) |
Thank you for kind words. I'm very happy that during the research to migrate large codebase I've stumbled upon openrewrite and used it successfully. I'd be glad to know more about internal kitchen, but I've limited free cycles now to move this PR over the line. Absolutely! Please take over this effort.
It's not exactly like that. Sometimes lambda blocks are visited and even some occurrences inside are migrated, but not necessarily all. Unfortunately, I have not noticed a pattern, and not sure if such cases are present in |
and improve existing ones Changed automatically by openrewrite recipe openrewrite/rewrite-testing-frameworks#520 ``` <plugins> <plugin> <groupId>org.openrewrite.maven</groupId> <artifactId>rewrite-maven-plugin</artifactId> <version>5.32.0</version> <configuration> <activeRecipes> <recipe>org.openrewrite.java.testing.assertj.TestNgToAssertj</recipe> </activeRecipes> </configuration> <dependencies> <dependency> <groupId>org.openrewrite.recipe</groupId> <artifactId>rewrite-testing-frameworks</artifactId> <version>2.11.0-SNAPSHOT</version> </dependency> </dependencies> </plugin> </plugins> ``` run ``` ./mvnw rewrite:run -Dmaven.javadoc.skip=true -DskipTests=true -Dmaven .site.skip=true -Dmaven.artifact.threads=16 -T 1C -e -Dair.check.skip-all=true --no-snapshot-updates ``` AssertJ dependency to `pom.xml` added manually. Also, some minor compilation/compatibility errors fixed manually. Occurrences, not rewritten by recipe were rewrote manually. Co-authored-by: Mateusz Gajewski <[email protected]> Co-authored-by: Sasha Sheikin <[email protected]>
and improve existing ones Changed automatically by openrewrite recipe openrewrite/rewrite-testing-frameworks#520 ``` <plugins> <plugin> <groupId>org.openrewrite.maven</groupId> <artifactId>rewrite-maven-plugin</artifactId> <version>5.32.0</version> <configuration> <activeRecipes> <recipe>org.openrewrite.java.testing.assertj.TestNgToAssertj</recipe> </activeRecipes> </configuration> <dependencies> <dependency> <groupId>org.openrewrite.recipe</groupId> <artifactId>rewrite-testing-frameworks</artifactId> <version>2.11.0-SNAPSHOT</version> </dependency> </dependencies> </plugin> </plugins> ``` run ``` ./mvnw rewrite:run -Dmaven.javadoc.skip=true -DskipTests=true -Dmaven .site.skip=true -Dmaven.artifact.threads=16 -T 1C -e -Dair.check.skip-all=true --no-snapshot-updates ``` AssertJ dependency to `pom.xml` added manually. Also, some minor compilation/compatibility errors fixed manually. Occurrences, not rewritten by recipe were rewrote manually. Co-authored-by: Mateusz Gajewski <[email protected]> Co-authored-by: Sasha Sheikin <[email protected]>
and improve existing ones Changed automatically by openrewrite recipe openrewrite/rewrite-testing-frameworks#520 ``` <plugins> <plugin> <groupId>org.openrewrite.maven</groupId> <artifactId>rewrite-maven-plugin</artifactId> <version>5.32.0</version> <configuration> <activeRecipes> <recipe>org.openrewrite.java.testing.assertj.TestNgToAssertj</recipe> </activeRecipes> </configuration> <dependencies> <dependency> <groupId>org.openrewrite.recipe</groupId> <artifactId>rewrite-testing-frameworks</artifactId> <version>2.11.0-SNAPSHOT</version> </dependency> </dependencies> </plugin> </plugins> ``` run ``` ./mvnw rewrite:run -Dmaven.javadoc.skip=true -DskipTests=true -Dmaven .site.skip=true -Dmaven.artifact.threads=16 -T 1C -e -Dair.check.skip-all=true --no-snapshot-updates ``` AssertJ dependency to `pom.xml` added manually. Also, some minor compilation/compatibility errors fixed manually. Occurrences, not rewritten by recipe were rewrote manually. Co-authored-by: Mateusz Gajewski <[email protected]> Co-authored-by: Sasha Sheikin <[email protected]>
Unfortunately it seems that for now we can't leverage the Picnic recipes, as those introduce a Java 17 dependency which we don't want to enforce upon our users just yet. 🥲 That means we can likely proceed with the recipe produced here, or reimplement some of those Refaster recipes here, but compile them with Java 8 instead. |
There is a complete checklist of all TestNG assertions here: Philzen/rewrite-TestNG-to-JUnit5#3 (comment) which may be re-used to track feature-completeness of this recipe. Have just implemented migrations to JUnit5 assertions for nearly all of them (with some rather hideously complex stream-based replacements for those that had no direct equivalents, where i often thought to myself „this could be so much prettier using assertJ") – therefore i'd be up for contributing to this recipe soon. |
* Add additional test coverage for integer number to varchar coercer Co-authored-by: Yiqun Zhang <[email protected]> * Allow integer number to varchar coercion in ORC unpartitioned table Co-authored-by: Yiqun Zhang <[email protected]> * Allow integer number to varchar coercion in Parquet unpartitioned table * Quote current test param when test fails * Use enhanced switch statement * Remove redundant explicit generics * Extract a dedicated Double to Varchar Coercer for ORC * Allow double to varchar coercion in Parquet unpartitioned table * Merge Hudi query runners into one * Add support for case insensitive name cache in BigQuery Co-authored-by: regadas <[email protected]> * Fix failure when reading signed timestamp with time zone stats in Delta Lake * Avoid org.locationtech.jts:jts-core exclusion in Pinot * Improve 449 release notes * Add docs for Hive metadata caching config * Remove exceptions related to older version of Hive * Simplify BaseTestHiveCoercion Older version of Hive doesn't support Float/Real type for parquet table format, but Hive 3.0+ doesn't have such restriction. * Allow float to varchar coercion in hive table This coercion would work for partitioned tables for all formats and for unpartitioned tables it would work for ORC and Parquet table format. * Improve error message for resource groups * Gracefully handle missing spooling output stats in FTE scheduler It is rare but possible to get empty spooling output stats for task which completed successfully. This may happen if we observe FINISHED task state based on received TaskStatus but are later on unable to successfully retrieve TaskInfo. In such case we are building final TaskInfo based on last known taskInfo, just updating the taskState field. The spooling output stats will not be present. As we need this information in FTE mode we need to fail such task artificially * Add JDBC property to use current catalog in metadata if none provided Some BI tools don't pass a `catalog` when calling the `DatabaseMetaData` `getTables`, `getColumns` and `getSchemas` methods. This makes the JDBC driver search across all catalogs which can be expensive. This commit introduces a new boolean connection property `assumeNullCatalogMeansCurrentCatalog` (disabled by default) to be used with such BI tools. If enabled the driver will try to use current `catalog` of the JDBC connection when fetching Trino metadata like `getTables`, `getColumns`, `getSchemas` if the `catalog` argument to those methods is passed as `null`. Co-authored-by: Rafał Połomka <[email protected]> Co-authored-by: Ashhar Hasan <[email protected]> * Make sealed version of Avro Logical Types supported natively * Refactor block building decoder creation to connector specific class * Remove unnecessary exclusions from Pinot * Remove unnecessary, commented code * Rewrite JUnit Assertions to AssertJ Changed automatically by openrewrite recipe ``` <plugin> <groupId>org.openrewrite.maven</groupId> <artifactId>rewrite-maven-plugin</artifactId> <version>5.32.0</version> <configuration> <activeRecipes> <recipe>org.openrewrite.java.testing.assertj.JUnitToAssertj</recipe> </activeRecipes> </configuration> <dependencies> <dependency> <groupId>org.openrewrite.recipe</groupId> <artifactId>rewrite-testing-frameworks</artifactId> <version>2.10.1</version> </dependency> </dependencies> </plugin> ``` run ``` ./mvnw rewrite:run -Dmaven.javadoc.skip=true -DskipTests=true -Dmaven.site.skip=true -Dmaven.artifact.threads=16 -T 1C -e -Dair.check.skip-all=true --no-snapshot-updates -pl '!:trino-product-tests,!:trino-product-tests-launcher,!:trino-server-rpm' ``` Some formatting was fixed manually for readability. Then, rewrite remaining JUnit Assertions to AssertJ manually, because some entries were not migrated by `./mvnw rewrite:run`. * Rewrite TestNG Assertions to AssertJ Changed automatically by openrewrite recipe openrewrite/rewrite-testing-frameworks#520 ``` <plugin> <groupId>org.openrewrite.maven</groupId> <artifactId>rewrite-maven-plugin</artifactId> <version>5.32.0</version> <configuration> <activeRecipes> <recipe>org.openrewrite.java.testing.assertj.TestNgToAssertj</recipe> </activeRecipes> </configuration> <dependencies> <dependency> <groupId>org.openrewrite.recipe</groupId> <artifactId>rewrite-testing-frameworks</artifactId> <version>2.11.0-SNAPSHOT</version> </dependency> </dependencies> </plugin> ``` run ``` ./mvnw rewrite:run -Dmaven.javadoc.skip=true -DskipTests=true -Dmaven.site.skip=true -Dmaven.artifact.threads=16 -T 1C -e -Dair.check.skip-all=true --no-snapshot-updates -pl '!:trino-server-rpm' ``` Some formatting was fixed manually for readability. Then, rewrite remaining TestNG Assertions to AssertJ manually, because some entries were not migrated by `./mvnw rewrite:run`. * Prefer AssertJ assertions over TestNG, JUnit * Update airbase to 157 * Update exec-maven-plugin to 3.3.0 * Pass additional type information in OrcColumn OrcTypeKind captures only data types details while other information like precision and scale are not a part of these information. * Fix unsupported reads of decimals as varchar in parquet * Allow decimal to varchar coercion for unpartitioned tables * Fix grammar error in error message * Simplify boolean comparison in getColumnMappingMode() * Allow char to varchar coercion for hive tables * Improve docs for HTTP server config * Verify spark compatibility in native-fs special chars test * Use Identifier.getValue when analyzing table function arguments Otherwise, getPartitionBy and getOrderBy methods in TableArgument SPI returns quoted column names. * Avoid redundant null check before sizeOf() * Remove unused method from BigQueryClient * Convert DeleteFile to record in Iceberg * Convert IcebergMaterializedViewDefinition to record * Convert IcebergPartitioningHandle to record * Convert IcebergWritableTableHandle to record * Convert TrinoSortField to record in Iceberg * Convert MemoryInsertTableHandle to record * Convert MemoryDataFragment to record * Test Memory connector config class * Test reading liquid clustering tables in Delta Lake * Add support for reading UniForm with Iceberg in Delta Lake * Skip unsupported variant type in Delta Lake * Reduce number of row groups in tests using small row groups Avoids creating too many splits when split offsets are populated * Fix detecting start of row group in parquet reader * Populate split offsets when writing orc/parquet files in iceberg Offsets can be used by readers to align splits with row-group/stripe boundaries * Use timestampColumnMapping in ClickHouse * Avoid using deprecated ClickHouseContainer class * Set air.compiler.fail-warnings as true in ClickHouse * Refactor TestMultipleDistinctAggregationToMarkDistinct Remove deprecated symbol usage. * Extract DistinctAggregationStrategyChooser Extract the logic to determine whether the direct distinct aggregation applicability, which can be reused in multiple optimiser rules. * Use maximum NDV symbol to choose distinct aggregation strategy * Move OptimizeMixedDistinctAggregations Move class before refactoring to preserve history * Support for multiple aggregations in OptimizeMixedDistinctAggregations Adds support for multiple distinct aggregations in OptimizeMixedDistinctAggregations`. * Replace optimizer.optimize-mixed-distinct-aggregations Replace optimizer.optimize-mixed-distinct-aggregations with a new optimizer.distinct-aggregations-strategy `pre_aggregate`. Also rename corresponding config property optimizer.mark-distinct-strategy to optimizer.distinct-aggregations-strategy and values to NONE -> SINGLE_STEP and ALWAYS -> MARK_DISTINCT * Enable OptimizeMixedDistinctAggregations automatically Use estimated aggregation source NDV and the number of grouping keys to decide if pre-aggregate strategy should be used for a given aggregation * Remove unnecessary override method isRemotelyAccessible in IcebergSplit * Add non-null check in MemoryConnector * Fix example in test guideline * Add support for TRUNCATE statement in memory connector * Remove unsupported version check and usage in Cassandra * Add support for truncate statement in Iceberg * Enable TopN for non-textual types for clickhouse-connector * Prevent OOM when expanding projection over BigQuery no columns scan * Improve ternary operator formatting in BigQuerySplitManager * Produce single Big Query split when row count is all that matters Similar to what we do in other connectors. There is no reason to create multiple splits just to return row count information. * Use safe idiom of getting the only element * Automatically configure BigQuery scan parallelism * Cleanup in OrcTypeTranslator Use enhanced switch statements and re-arrange code based on the source data type * Fix coercion gaps for Hive tables in ORC format There were a few difference on the coercion supported between partitioned and un-partitioned table for ORC format. * Make scheduling of remote accessible splits with addresses more strict UniformNodeSelector or FTE scheduler will schedule remote accessible splits on selected nodes if such nodes are available and only fallback to other nodes if preferred nodes are no longer part of cluster. Connector might have stale node information when creating splits which could result in choosing offline nodes. Additionally, in FTE mode nodes can go down so split addresses could no longer be valid then task is restarted. Additionally, this commit simplifies UniformNodeSelector optimizedLocalScheduling which was hard to reason about and was not taking advantages of recent improvements like adaptive split queue length. Co-authored-by: Karol Sobczak <[email protected]> * Enable query and task retries in MariaDbClient Copy test overrides from 7b80852, as MariaDB also needs them. * Remove unused function parameter * Keep separate top-level classes in separate compilation units It's considered anti-pattern to have multiple non-nested classes in one .java file. We even have a checker for that, but it was suppressed in this case. * Remove redundant cast * Use anonymous variables * Cleanup warnings on top-level `@SuppressWarnings` should be applied selectively, not for the whole compilation unit. * Make helper methods static * Remove redundant throws declaration * Allow pinning to WebIdentityTokenCredentialsProvider Allow users to only use the WebIdentityTokenCredentialsProvider instead of the default credentials provider chain. * Extract STS client getter * Allow pinning to WebIdentityTokenCredentialsProvider Allow users to only use the WebIdentityTokenCredentialsProvider instead of the default credentials provider chain. * Extract constant to represent partition for non-partitioned Hudi table Co-Authored-By: Yuya Ebihara <[email protected]> * Small code fixes * remove dead code * fix comment typo * Improve performance of json functions Avoid allocating heap ByteBuffer used by InputStreamReader. * Fix broken testTruncateTable in Iceberg Snowflake catalog * Explicitly configure executorService for s3 multipartuploads Previously used forkjoin common pool meant for cpu bound operations * Increase s3.max-connections from AWS default of 50 to 500 Aligns native S3FileSystem with legacy TrinoS3FileSystem * Verify body in doScheduleAsyncCleanupRequest We observed rare cases when we got non-JSON result in doScheduleAsyncCleanupRequest and current logging does not provide enough information to understand the root cause. * Fix uncheckedCacheGet javadoc The `uncheckedCacheGet` will propagate `ExecutionError` if `Cache.get` throws it. This reverts commit 3d0632d. * Add Iceberg query runner with TASK retries enabled * Add `exchange.azure.endpoint` configuration option This option can be used instead of `exchange.azure.connection-string` and will use the default azure credentials when set. * Fix failure when reading non-primitive Parquet fields in Iceberg * Skip getting stats on varbinary type in Delta * Rename tests in TestDeltaLakeParquetStatisticsUtils * Add support for stats on timestamp type in Delta Lake * Fix format of `@DefunctConfig` values in Cassandra * Add support for typeWidening(-preview) reader feature in Delta Lake * Add test for HudiPlugin * Run TestPhoenixConnectorTest in parallel * Nessie: Support APIv2 client By default, Nessie API version will be inferred from Nessie URI. If the User has a custom URI which doesn't have version info in the URI, user can configure `iceberg.nessie-catalog.client-api-version` * Update netty to 4.1.111.Final * Update AWS SDK v2 to 2.26.0 * Update nessie to 0.90.2 * Update AWS SDK v1 to 1.12.741 * Update reactor-core to 3.6.7 * Update checker-qual to 3.44.0 * Update flyway to 10.15.0 * Update MongoDB to 5.1.1 * Update lucene-analysis-common to 9.11.0 * Update hudi to 0.15.0 * Update freemarker to 2.3.33 * Fix evictable cache invalidation race condition Before the change the following race was possible between threads A, B, C: - A calls invalidate(K) - A increments: invalidations++ - B changes state to be cached, and therefore calls invalidate(K) too - B increments: invalidations++ - C calls get(K) - C reads invalidations counter - C retrieves current token T for key K - C reads value V for T from cache - A reads and removes token T (same) for key K - B attempts to read and remove token for key K, not found - B exits invalidate(K) - C checks invalidations counter (didn't check) - C revives, i.e. re-inserts token T for key K - B calls get(K) - B retrieves token T (same) for key K - B reads value V for T from cache -- despite B having called invalidate(K) At least in this situation the problem is transient. Thread A will momentarily invalidate dataCache for token T, completing the invalidation. This commit fixes this. The bug was introduced by token reviving (commit 17faae3). This commit reverts that one and provides a different solution to the problem that commit was solving. * Pin 3rd party CI action versions GitHub allows to delete and re-publish a tag, so referencing 3rd party action by tag name should be discouraged. * Add encoding to error code in OAuth2 callback handler * Rearchitect IR optimizer and evaluator Model it as a collection of independent transformations, similar to the plan optimizer. * Distribute comparisons over case and switch Transforms expressions such as: a = case when c1 then r1 when c2 then r2 else r3 end into case when c1 then a = r1 when c2 then a = r2 else a = r3 end Additionally, simplifies expressions such as: case when c then true else false end * Enable Dependabot for GitHub Actions Dependabot will automatically open pull requests to update GitHub Actions. The main benefit of this is knowing that new versions are available, especially if third-party GitHub Actions are pinned to specific versions. Running it weekly should give enough time to review updates, and also react switfly to potential security updates. * Avoid abort in testing helper method or for-loop * Convert ResourceGroupInfo to a record class * Make sure expression type does not change after processing with rule * modify alluxio version to stable 2.9.4 * Include detailed error in S3 batch delete message * Clarify the default value of iceberg.catalog.type in Iceberg docs * Remove unused argument from getSparkTableStatistics * Fix description in Redis plugin * Optimized DATE_TRUNC function * Update libraries-bom to 26.41.0 * Update clickhouse-jdbc to 0.6.1 * Update nimbus-jose-jwt to 9.40.0 * Update AWS SDK v2 to 2.26.1 * Update GCS connector to hadoop3-2.2.23 * Update AWS SDK v1 to 1.12.742 * Update nessie to 0.90.4 * Update elasticsearch to 7.17.22 * Set hive.metastore.catalog.dir in Iceberg and Delta query runners * Remove redundant annotation * Use takari-smart-builder It's a default builder for mvnd * Replace air.main.basedir with session.rootDirectory * Update maven to 3.9.8 This version is aligned with mvnd-1.0.0 * Migrate iceberg caching product test to native filesystem * Fix argument order in ST_Point() function docs * Update docker-image version to 97 * Test allowColumnDefaults writer feature in Delta Lake * Check duplicated field names in Iceberg * Update jline to 3.26.2 * Update parquet to 1.14.1 * Improve code comment * Expose driver execution stats via JMX * Clean up code comments in JdbcJoinPushdownUtil * Clarify required configuration for fs caching * Track rate of pending acquires processing in BinPackingNodeAllocatorService * Add reason to pending acquires counters in BinPackingNodeAllocatorService * Install plugins in testing server before startup completes When startup completes (`StartupStatus.startupComplete()`), a worker will accept tasks (`TaskResource.failRequestIfInvalid`). The plugins and functions need to be installed before that happens. * Migrate unsupported writer version to integration test in Delta * Migrate unsupported writer feature to integration test in Delta * Remove testTrinoAlterTablePreservesChangeDataFeed testTrinoAlterTablePreservesTableMetadata verifies that table configuration is preserved. * Migrate append-only to integration test in Delta * Use correct isolation level in writing the commit info for DELETE * Add Trino 450 release notes * Document S3 max connections default change * [maven-release-plugin] prepare release 450 * [maven-release-plugin] prepare for next development iteration * Make AsyncResponse aware of the client disconnection With the current Jersey and Jetty implementations @suspended AsyncResponse can generate a lot of warnings that are caused by the fact that Jetty eagerly recycles request/response objects when the HTTP/2 RST_STREAM frame is sent by the client during request aborting. We abort in-flight requests for different reasons: either timeout is reached or when the client is closed during cleanup. For HTTP/1 that doesn't matter, but for HTTP/2 this will cause the AsyncResponse.resume to log an error, since the underlying request/response objects are already recycled and there is no client listening for the response. This change also cancels Future<?> bound to the AsyncContext since there is no point in processing it anymore. * Retain authorization during client redirects for basic auth * Create tables under volatile directory in TestDeltaLakeBasic catalogDir will be deleted after the test execution. * Fix wrong entry in 450 release note * Disallow row type without field names in Delta Lake * Check duplicated field names in Delta Lake Also, include exception message in DeltaLakeMetadata.addColumn. * Implement incremental refresh for single-table, predicate-only MVs (trinodb#20959) Add incremental refresh support for simple Iceberg MVs When MV refresh is executing, the planner suggests either incremental or full refresh to the connector. In this first phase, incremental refresh is suggested only when Scan/Filter/Project nodes are present in the plan tree - otherwise full refresh is performed. The Iceberg connector can act on this signal and use IncrementalAppendScan to scan the 'delta' records only and append them to the MV storage table (without truncation). Co-authored-by: Karol Sobczak <[email protected]> * Add Expression to Result in ExpressionAssertProvider * Add transform function const * Add JsonStringArrayExtract function Optimzie common case of transform(cast(json_parse(varchar_col) as array<json>), json -> json_extract_scalar(json, jsonPath)) case to be single and more performant call of JsonStringArrayExtract. * Expose lease distribution stats from BinPackingNodeAllocatorService * Update airlift to 249 * Fix product tests previously migrated to native fs to not use legacy fs * Add non-null check in SortingFileWriter * Allow adding and droping fields to records wrapped in array * Allow changing field type in records wrapped in an array * Refactor TestJsonStringArrayExtractScalar * Use List/Set injection into configuration This makes configuration more explicit in regard to expected types * Fix reading empty files when using native S3 FS * Fix configuration mistake * Allow reading empty files in native GCS FS * Workaround for reading empty files using native S3 FS When reading an empty file from S3 as a stream, using the AWS SDK v2, it incorrectly keeps reading the MD5 checksum included at the end of the response. This has been reported upstream as aws/aws-sdk-java-v2#3538 * Optimize IntegerNumberToDoubleCoercer#applyCoercedValue to avoid repeated reading of Block values * Add support for prefix configuration in Iceberg REST catalog REST catalog supports a `prefix` in the resource path as per the spec. But it is not exposed from Trino `IcebergRestCatalogConfig`. In Spark, this works by default because they accept generic property bag. * Rename getColumns to getTopLevelColumns in iceberg connector * Support partitioning on nested ROW fields in Iceberg Co-authored-by: Victoria Bukta <[email protected]> Co-authored-by: Yuya Ebihara <[email protected]> * Add Parquet Bloom filter write support to Iceberg connector * List unhealthy containers when product tests health check fails * Reschedule replacement tasks in case of artificial failure due to missing spooling output size The code introduced in trinodb#22298 was lacking rescheduling replacement tasks. As a result query execution got stuck after we observed missing spooling output stats for task. * Convert inner classes in ConstantPopulatingPageSource to record * Convert Iceberg PartitionColumn to record * Convert IcebergInputInfo to record * Convert IcebergPageSink.PartitionColumn to record * Convert Iceberg ReaderPageSourceWithRowPositions to record * Convert IcebergParquetColumnIOConverter.FieldContext to record * Convert IcebergPartitionColumn to record * Convert PartitionTransforms.ColumnTransform to record * Use field id in freeCurrentRowGroupBuffers for consistency * Remove interning from ColumnChunkProperties Avoids having unbounded and unaccounted memory usage. Also avoids the overheads associated with concurrent map look-up * Convert ColumnChunkProperties into a record * Removed unused code from ColumnChunkMetadata * Optimize ParquetReader#getColumnChunkMetaData Avoid unnecessary conversion to ColumnPath Compare arrays in reverse to find mismatch quicker * Optimize ParquetTypeUtils#getDescriptors Existing logic was complex due to performing case-insensitive matching. This was unnecesary because fileSchema and requestedSchema already contain lower cased names. Also, since requestedSchema is derived from fileSchema, we can build descriptors map directly from result of getColumns instead of repeating look-ups in fileSchema. * Update jjwt to 0.12.6 * Update minio to 8.5.11 * Update RoaringBitmap to 1.1.0 * Update AWS SDK v1 to 1.12.749 * Update AWS SDK v2 to 2.26.8 * Update oauth2-oidc-sdk to 11.13 * Update metrics-core to 4.2.26 * Update Nessie to 0.91.2 * Update openlineage to 1.17.1 * Update Azure SDK Azure SDK version 1.2.24 includes Azure Storage Blob 12.26, which introduces a breaking change, where blob names used to create a new blob client should not be URL-encoded anymore. Blob URLs in the storage don't change, so after adjusting the API usage, data written by older Trino versions should be readable by new Trino versions. * Disable caching of /ui responses This ensures that any sensitive information is not cached in the browser * Allow adding non-temurin JDK distributions This is a preparation for testing additional builds like Loom EA * Add Loom EA build * Verify ppc64le docker image * Add clarifying javadoc * Add more http proxy configs for native S3 filesystem implementation To make the http proxy config options in parity with legacy implementation * Push filter through TopNRankingNode * Fix initial value of diagnostics counter * Fix handling RESERVED status when updating stats * Allow uppercase characters in Iceberg quoted identifiers partition * Add special characters missing test coverage for native-fs s3 path * Use enhanced instanceof in KuduClientSession * Fix failure when filter operation is applied on varbinary column in Kudu * Remove enable-coordinator-dynamic-filters-distribution config This has been enabled by default for a long time without issues. It is no longer necessary to keep it as a configurable. * Migrate Hive view in Delta to integration tests * Migrate unregistering non-delta table to integration tests * Throw TrinoException for Kudu failures Inserting/Merging/Updating invalid data in Kudu doesn't throw a Kudu specific exception so we wrap all the exceptions as Trino specific exceptions * Add date data type support for Kudu Co-authored-by: Pascal Gasp <[email protected]> * Add t_cdf and t_pdf scalar functions * Fix spelling errors in documentation * Fix incorrect case optimization Expressions of the form: CASE WHEN <cond> THEN true ELSE false END were being simplified to <cond>, which is incorrect due to null handling. The original form would return false, while the simplified form returns null. * Allow pre-allocating memory per catalog on initialization * Add Trino 451 release notes * [maven-release-plugin] prepare release 451 * Merge fixes for Trino 451 (#35) --------- Co-authored-by: praveenkrishna.d <[email protected]> Co-authored-by: Yiqun Zhang <[email protected]> Co-authored-by: Piotr Findeisen <[email protected]> Co-authored-by: Yuya Ebihara <[email protected]> Co-authored-by: regadas <[email protected]> Co-authored-by: Manfred Moser <[email protected]> Co-authored-by: Colebow <[email protected]> Co-authored-by: uditkumar <[email protected]> Co-authored-by: Łukasz Osipiuk <[email protected]> Co-authored-by: Rafał Połomka <[email protected]> Co-authored-by: Ashhar Hasan <[email protected]> Co-authored-by: Jack Klamer <[email protected]> Co-authored-by: Sasha Sheikin <[email protected]> Co-authored-by: Mateusz "Serafin" Gajewski <[email protected]> Co-authored-by: Raunaq Morarka <[email protected]> Co-authored-by: Anu Sudarsan <[email protected]> Co-authored-by: takezoe <[email protected]> Co-authored-by: Kamil Endruszkiewicz <[email protected]> Co-authored-by: Lukasz Stec <[email protected]> Co-authored-by: chenjian2664 <[email protected]> Co-authored-by: Mayank Vadariya <[email protected]> Co-authored-by: Alexey Pavlenko <[email protected]> Co-authored-by: Karol Sobczak <[email protected]> Co-authored-by: Dejan Mircevski <[email protected]> Co-authored-by: Jan Waś <[email protected]> Co-authored-by: Grant Nicholas <[email protected]> Co-authored-by: Keith Whitley <[email protected]> Co-authored-by: ajantha-bhat <[email protected]> Co-authored-by: Grzegorz Nowak <[email protected]> Co-authored-by: Martin Traverso <[email protected]> Co-authored-by: Jianjian <[email protected]> Co-authored-by: senlizishi <[email protected]> Co-authored-by: Alexander Kolesnikov <[email protected]> Co-authored-by: Gontzal Monasterio <[email protected]> Co-authored-by: Marius Grama <[email protected]> Co-authored-by: Star Poon <[email protected]> Co-authored-by: Marton Bod <[email protected]> Co-authored-by: Brad <[email protected]> Co-authored-by: Vikash Kumar <[email protected]> Co-authored-by: Victoria Bukta <[email protected]> Co-authored-by: Jonas Irgens Kylling <[email protected]> Co-authored-by: kasiafi <[email protected]> Co-authored-by: Pascal Gasp <[email protected]> Co-authored-by: Emily Sunaryo <[email protected]> Co-authored-by: neriya shulman <[email protected]>
Pretty soon we'll be able to leverage the Picnic recipes again, following the work done in
That should help along the work done here to get a first version merged. |
That time has arrived 🎉. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see the work done here @ssheikin ! I've polished things up to make better use of the Picnic provided recipes, such that there's less to maintain here. 🎉
What tripped up the previous test runs was that the testng
classpath from resources dependency could not be found, and did not throw an exception. That meant we could not match the recipes provided by Picnic, and as such were not able to see the full effect of having those recipes included here.
I've discovered an issue with our refaster template annotation processor that'll take a little while longer to resolve:
Until then we're all set to merge these in and make the recipes available to folks. Thanks a lot all!
@timtebeek Thank you for going the extra mile with this PR. |
What's changed?
Add TestNG -> AssertJ rules
(applying minimal changes to JUnit -> AssertJ)
What's your motivation?
Anything in particular you'd like reviewers to focus on?
Anyone you would like to review specifically?
Have you considered any alternatives or workarounds?
I didn't find any better alternative.
Any additional context
This solution was good enough, maybe it will be helpful for someone.
If someone willing to polish it to production-ready state - please be welcomed to take over this PR.
Checklist