Merge remote-tracking branch 'upstream/master'
* upstream/master: (124 commits)
  [GOBBLIN-1699] Log progress of reducer task for visibility with slow compaction jobs apache#3552
  fix helix job wait completion bug when job goes to STOPPING state (apache#3556)
  [GOBBLIN-1695] Fix: Failure to add spec executors doesn't block deployment (apache#3551)
  [GOBBLIN-1701] Replace jcenter with either maven central or gradle plugin portal (apache#3554)
  [GOBBLIN-1700] Remove unused coveralls-gradle-plugin dependency
  add MysqlUserQuotaManager (apache#3545)
  [GOBBLIN-1689] Decouple compiler from scheduler in warm standby mode (apache#3544)
  Add GMCE topic explicitly to hive commit event (apache#3547)
  [GOBBLIN-1678] Refactor git flowgraph component to be extensible (apache#3536)
  [GOBBLIN-1690] Added logging to ORC writer
  Allow all iceberg exceptions to be fault tolerant (apache#3541)
  Guard against exists fs call as well (apache#3538)
  Add error handling for timeaware finder to handle scenarios where fil… (apache#3537)
  [GOBBLIN-1675] Add pagination for GaaS on server side (apache#3533)
  [GOBBLIN-1672] Refactor metrics from DagManager into its own class, add metrics per … (apache#3532)
  [GOBBLIN-1677] Fix timezone property to read from key correctly (apache#3535)
  [Gobblin-931] Fix typo in gobblin CLI usage (apache#3530)
  [GOBBLIN-1671] : Fix script to add external jars as colon separated to HADOOP_CLASSPATH (apache#3531)
  [GOBBLIN-1656] Return a http status 503 on GaaS when quota is exceeded for user or flowgroup (apache#3516)
  [GOBBLIN-1669] Clean up TimeAwareRecursiveCopyableDataset to support seconds in time… (apache#3528)
  [GOBBLIN-1670] Remove rat tasks and unneeded checkstyles blocking build pipeline (apache#3529)
  [GOBBLIN-1668] Add audit counts for iceberg registration (apache#3527)
  [GOBBLIN-1667] Create new predicate - ExistingPartitionSkipPredicate (apache#3526)
  Calculate requested container count based on adding allocated count and outstanding ContainerRequests in Yarn (apache#3524)
  make the requestedContainerCountMap correctly update the container count (apache#3523)
  Fix running counts for retried flows (apache#3520)
  Allow table to flush after write failure (apache#3522)
  [GOBBLIN-1652]Add more log in the KafkaJobStatusMonitor in case it fails to process one GobblinTrackingEvent (apache#3513)
  Make Yarn container and helix instance allocation group by tag (apache#3519)
  [GOBBLIN-1657] Update completion watermark on change_property in IcebergMetadataWriter (apache#3517)
  [GOBBLIN-1654] Add capacity floor to avoid aggressively requesting resource and small files. (apache#3515)
  [GOBBLIN-1653] Shorten job name length if it exceeds 255 characters (apache#3514)
  [GOBBLIN-1650] Implement flowGroup quotas for the DagManager (apache#3511)
  [GOBBLIN-1648] Complete use of JDBC `DataSource` 'read-only' validation query by incorporating where previously omitted (apache#3509)
  Add config to set close timeout in HiveRegister (apache#3512)
  add an API in AbstractBaseKafkaConsumerClient to list selected topics (apache#3501)
  [GOBBLIN-1649] Revert gobblin-1633 (apache#3510)
  [GOBBLIN-1639] Prevent metrics reporting if configured, clean up workunit count metric (apache#3500)
  [GOBBLIN-1647] Add hive commit GTE to HiveMetadataWriter (apache#3508)
  [GOBBLIN-1633] Fix compaction actions on job failure not retried if compaction succeeds (apache#3494)
  [GOBBLIN-1646] Revert yarn container / helix tag group changes (apache#3507)
  [GOBBLIN-1641] Add meter for sla exceeded flows (apache#3502)
  GOBBLIN-1644 (apache#3506)
  [GOBBLIN-1645]Change the prefix of dagManager heartbeat to make it consistent with other metrics (apache#3505)
  Fix bug when shrinking the container in Yarn service (apache#3504)
  [GOBBLIN-1637] Add writer, operation, and partition info to failed metadata writer events (apache#3498)
  [GOBBLIN-1638] Fix unbalanced running count metrics due to Azkaban failures (apache#3499)
  [GOBBLIN-1634] Add retries on flow sla kills (apache#3495)
  [GOBBLIN-1620]Make yarn container allocation group by helix tag (apache#3487)
  [GOBBLIN-1636] Close DatasetCleaner after clean task (apache#3497)
  [GOBBLIN-1635] Avoid loading env configuration when using config store to improve the performance (apache#3496)
  use user supplied props to create FileSystem in DatasetCleanerTask (apache#3483)
  [GOBBLIN-1619] WriterUtils.mkdirsWithRecursivePermission contains race condition and puts unnecessary load on filesystem (apache#3477)
  use data node aliases to figure out data node names before using DMAS (apache#3493)
  [GOBBLIN-1630] Remove flow level metrics for adhoc flows (apache#3491)
  [GOBBLIN-1631]Emit heartbeat for dagManagerThread (apache#3492)
  [GOBBLIN-1624] Refactor quota management, fix various bugs in accounting of running … (apache#3481)
  [GOBBLIN-1613] Add metadata writers field to GMCE schema (apache#3490)
  Update [GOBBLIN-1629] Make GobblinMCEWriter be able to catch error when calculating hive specs (apache#3489)
  Add/fix some fields of MetadataWriterFailureEvent (apache#3485)
  [GOBBLIN-1627] provide option to convert datanodes names (apache#3484)
  Add coverage for edge cases when table paths do not exist, check parents (apache#3482)
  [GOBBLIN-1616] Add close connection logic in salseforceSource (apache#3486)
  [GOBBLIN-1621] Make HelixRetriggeringJobCallable emit job skip event when job is dropped due to previous job is running (apache#3478)
  [GOBBLIN-1623] Fix NPE when try to close RestApiConnector (apache#3480)
  Clear bad mysql packages from cache in CI/CD machines (apache#3479)
  [GOBBLIN-1617] pass configurations to some HadoopUtils APIs (apache#3475)
  [GOBBLIN-1616] Make RestApiConnector be able to close the connection finally (apache#3474)
  add config to set log level for any class (apache#3473)
  Fix bug where partitioned tables would always return the wrong equality in paths (apache#3472)
  [GOBBLIN-1602] Change hive table location and partition check to validate using FS r… (apache#3459)
  Don't flush on change_property operation (apache#3467)
  Fix case where error GTE is incorrectly sent from MCE writer (apache#3466)
  partial rollback of PR 3464 (apache#3465)
  [GOBBLIN-1604] Throw exception if there are no allocated requests due to lack of res… (apache#3461)
  [GOBBLIN-1603] Throws error if configured when encountering an IO exception while co… (apache#3460)
  [GOBBLIN-1606] change DEFAULT_GOBBLIN_COPY_CHECK_FILESIZE value (apache#3464)
  Upgraded dropwizard metrics library version from 3.2.3 -> 4.1.2 and added a new wrapper class on dropwizard Timer.Context class to handle the code compatibility as the newer version of this class implements AutoClosable instead of Closable. (apache#3463)
  [GOBBLIN-1605] Fix mysql ubuntu download 404 not found for Github Actions CI/CD (apache#3462)
  [GOBBLIN-1601] implement ChangePermissionCommitStep (apache#3457)
  [GOBBLIN-1598]Fix metrics already exist issue in dag manager (apache#3454)
  [GOBBLIN-1597] Add error handling in dagmanager to continue if dag fails to process,… (apache#3452)
  GOBBLIN-1579 Fail job on hive existing target table location mismatch (apache#3433)
  [GOBBLIN-1596] Ignore already exists exception if the table has already been created… (apache#3451)
  [GOBBLIn-1595]Fix the dead lock during hive registration (apache#3450)
  Add guard in DagManager for improperly formed SLA (apache#3449)
  [GOBBLIN-1588] Send failure events for write failures when watermark is advanced in MCE writer (apache#3441)
  [GOBBLIN-1593] Fix bugs in dag manager about metric reporting and job status monitor (apache#3448)
  Fix bug in `JobSpecSerializer` of inadequately preventing access errors (within `MysqlJobCatalog`) (apache#3447)
  [GOBBLIN-1583] Add System level job start SLA (apache#3437)
  [GOBBLIN-1592] Make hive copy be able to apply filter on directory (apache#3446)
  [GOBBLIN-1585]GaaS (DagManager) keep retrying a failed job beyond max attempt number (apache#3439)
  [GOBBLIN-1590] Add low/high watermark information in event emitted by Gobblin cluster (apache#3443)
  [HotFix]Try to fix the mysql dependency issue in Github action (apache#3445)
  Lazily initialize FileContext and do not store a handle of it so it can be GC'ed when required (apache#3444)
  [GOBBLIN-1584] Add replace record logic for Mysql writer (apache#3438)
  Bump up code cov version (apache#3440)
  [GOBBLIN-1581] Iterate over Sql ResultSet in Only the Forward Direction (apache#3435)
  [GOBBLIN-1575] use reference count in helix manager, so that connect/disconnect are called once and at the right time (apache#3427)
  ...