forked from apache/gobblin
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge remote-tracking branch 'upstream/master'
* upstream/master:
  Refactor `MysqlSpecStore` into a generalization, `MysqlNonFlowSpecStore` (not limited to `FlowSpec`s), also useable for `TopologySpec`s (apache#3414)
  [GOBBLIN-1563]Collect more information to analyze the RC for some job cannot emit kafka events to update job status (apache#3416)
  [GOBBLIN-1521] Create local mode of streaming kafka job to help user quickly onboard (apache#3372)
  [GOBBLIN-1559] Support wildcard for input paths (apache#3410)
  [GOBBLIN-1561]Improve error message when flow compilation fails (apache#3412)
  [GOBBLIN-1556]Add shutdown logic in FsJobConfigurationManager (apache#3407)
  [GOBBLIN-1542] Integrate with Helix API to add/remove task from a running helix job (apache#3393)
Showing 40 changed files with 1,403 additions and 536 deletions.
40 changes: 40 additions & 0 deletions
gobblin-api/src/main/java/org/apache/gobblin/source/InfiniteSource.java
@@ -0,0 +1,40 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.gobblin.source;

import com.google.common.eventbus.EventBus;
import org.apache.gobblin.annotation.Alpha;


/**
 * An interface for infinite source, where source should be able to detect the work unit change
 * and post the change through eventBus
 *
 * @author Zihan Li
 *
 * @param <S> output schema type
 * @param <D> output record type
 */
@Alpha
public interface InfiniteSource<S, D> extends Source<S, D>{

  /**
   * Return the eventBus where it will post {@link org.apache.gobblin.stream.WorkUnitChangeEvent} when workUnit change
   */
  EventBus getEventBus();

}
37 changes: 37 additions & 0 deletions
gobblin-api/src/main/java/org/apache/gobblin/stream/WorkUnitChangeEvent.java
@@ -0,0 +1,37 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.gobblin.stream;

import java.util.List;
import lombok.Getter;
import org.apache.gobblin.source.workunit.WorkUnit;

/**
 * The event for {@link org.apache.gobblin.source.InfiniteSource} to indicate there is a change in work units
 * Job launcher should then be able to handle this event
 */
public class WorkUnitChangeEvent {
  @Getter
  private final List<String> oldTaskIds;
  @Getter
  private final List<WorkUnit> newWorkUnits;
  public WorkUnitChangeEvent(List<String> oldTaskIds, List<WorkUnit> newWorkUnits) {
    this.oldTaskIds = oldTaskIds;
    this.newWorkUnits = newWorkUnits;
  }
}
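To illustrate how the two new types might fit together, here is a minimal, JDK-only sketch of the flow the Javadoc describes: a source posts a work-unit change, and a launcher-side subscriber swaps tasks. `SimpleBus`, `ChangeEvent`, and `Launcher` are hypothetical stand-ins (the real interface returns a Guava `EventBus`, and work units are `WorkUnit` objects rather than strings). The remove-then-add policy and the `task-` naming are assumptions for illustration, not the logic in `GobblinHelixJobLauncher`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

public class WorkUnitChangeSketch {

  /** Stand-in for WorkUnitChangeEvent; work units are plain strings here (assumption). */
  static final class ChangeEvent {
    final List<String> oldTaskIds;
    final List<String> newWorkUnits;

    ChangeEvent(List<String> oldTaskIds, List<String> newWorkUnits) {
      this.oldTaskIds = oldTaskIds;
      this.newWorkUnits = newWorkUnits;
    }
  }

  /** Hypothetical minimal bus; InfiniteSource#getEventBus returns a Guava EventBus instead. */
  static final class SimpleBus {
    private final List<Consumer<ChangeEvent>> subscribers = new ArrayList<>();

    void register(Consumer<ChangeEvent> subscriber) {
      subscribers.add(subscriber);
    }

    void post(ChangeEvent event) {
      // Deliver the event synchronously to every registered subscriber.
      for (Consumer<ChangeEvent> subscriber : subscribers) {
        subscriber.accept(event);
      }
    }
  }

  /** Hypothetical launcher-side handler: drop the old tasks, then add one task per new work unit. */
  static final class Launcher {
    final List<String> runningTasks = new ArrayList<>();

    void onWorkUnitChange(ChangeEvent event) {
      runningTasks.removeAll(event.oldTaskIds);
      for (String workUnit : event.newWorkUnits) {
        runningTasks.add("task-" + workUnit);
      }
    }
  }

  /** Wires a launcher to the bus, posts one change, and returns the resulting task list. */
  static List<String> demo() {
    SimpleBus bus = new SimpleBus();
    Launcher launcher = new Launcher();
    launcher.runningTasks.addAll(Arrays.asList("task-a", "task-b"));
    bus.register(launcher::onWorkUnitChange);

    // The source detected that task-a's work unit was replaced by work units c and d.
    bus.post(new ChangeEvent(Collections.singletonList("task-a"), Arrays.asList("c", "d")));
    return launcher.runningTasks;
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints [task-b, task-c, task-d]
  }
}
```

Carrying task IDs for removal and full work units for addition, as `WorkUnitChangeEvent` does, lets the subscriber apply the change without asking the source for anything further.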
257 changes: 171 additions & 86 deletions
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixJobLauncher.java
Large diffs are not rendered by default.
31 changes: 31 additions & 0 deletions
gobblin-docs/user-guide/Run-Gobblin-Streaming-kafka-hdfs-Locally.md
@@ -0,0 +1,31 @@
# Table of Contents

[TOC]

# Introduction

Gobblin supports a streaming mode that continuously ingests data from Kafka to HDFS. The streaming mode has been deployed in production at LinkedIn as a Gobblin cluster that uses Yarn for container allocation and Helix for task coordination.

Here, we describe how to set up a Kafka -> HDFS pipeline in local mode so that users can easily start and test a streaming ingestion pipeline.


# Set up a local Kafka cluster

Follow the [Kafka quick start](https://kafka.apache.org/quickstart) to set up your Kafka cluster, and create a test topic named "testEvents".

# Run EmbeddedGobblin to start the job

We use the configuration /gobblin-modules/gobblin-kafka-09/src/test/resources/kafkaHDFSStreaming.conf to execute the job.

To run the job from IntelliJ, run the test in /gobblin-modules/gobblin-kafka-09/src/test/java/org/apache/gobblin/kafka/KafkaStreamingLocalTest
after removing '(enabled=false)', which otherwise disables it. To run the test in the IDE, you may need to manually remove the log4j-over-slf4j jar in the IDE.

From your Kafka directory, run the following command to produce data into your Kafka topic:

`bin/kafka-console-producer.sh --topic testEvents --bootstrap-server localhost:9092`

The job continually consumes from testEvents and writes the data as txt files to your local file system (/tmp/gobblin/kafka/publish). It writes out data every 60 seconds and runs until
you manually kill it.

If you want the job to ingest data as Avro/ORC, you need a schema registry as the schema source and must change the job configuration accordingly. A sample configuration can be found [here](https://github.com/apache/gobblin/blob/master/gobblin-modules/gobblin-azkaban/src/main/resources/conf/gobblin_jobs/kafka-hdfs-streaming-avro.conf).

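One way to confirm that the local job is publishing is to list the files under the publish directory named in the guide. The sketch below is a hypothetical helper, not part of Gobblin. It assumes only the path /tmp/gobblin/kafka/publish stated above and makes no assumption about the directory layout beneath it, so it simply walks the tree.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PublishDirCheck {

  /** Returns every regular file under dir (recursively), sorted by path; empty if dir is missing. */
  static List<Path> listPublished(Path dir) {
    if (!Files.isDirectory(dir)) {
      return Collections.emptyList();
    }
    try (Stream<Path> paths = Files.walk(dir)) {
      return paths.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    // Default to the publish path from the guide; pass another path as the first argument to override.
    Path publishDir = Paths.get(args.length > 0 ? args[0] : "/tmp/gobblin/kafka/publish");
    List<Path> files = listPublished(publishDir);
    if (files.isEmpty()) {
      System.out.println("No published files yet under " + publishDir);
    }
    for (Path file : files) {
      System.out.println(file + " (" + Files.size(file) + " bytes)");
    }
  }
}
```

Because the guide says data is written out every 60 seconds, an empty listing shortly after startup does not necessarily indicate a problem; rerunning the check after a minute or two of producing messages is a more useful signal.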