Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* document structure, minor fixes
* add more docs
* even more docs
* Update docs/introduction/architecture.mdx
  Co-authored-by: Maha Hajja <[email protected]>
* Update docs/introduction/getting-started.mdx
  Co-authored-by: Maha Hajja <[email protected]>
* fix review comments
* add doc about referencing connectors
* simplify connector introduction

Co-authored-by: Maha Hajja <[email protected]>
1 parent b027c21, commit 3433960
Showing 26 changed files with 729 additions and 84 deletions.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"label": "Configuration", | ||
"position": 1 | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,99 @@ | ||
--- | ||
title: 'Pipeline Configuration Files' | ||
slug: 'pipeline-configuration-files' | ||
--- | ||
|
||
Pipeline configuration files give you the ability to define pipelines that are provisioned by Conduit at startup. | ||
It's as simple as creating a YAML file that defines pipelines, connectors, processors, and their corresponding configurations. | ||
|
||
## Getting started | ||
|
||
Create a folder called `pipelines` at the same level as your Conduit binary file, add all your YAML files | ||
there, then run Conduit using the command: | ||
``` | ||
./conduit | ||
``` | ||
Conduit will only search for files with `.yml` or `.yaml` extensions, recursively in all sub-folders. | ||
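
The discovery rule above can be illustrated with a short Python sketch. It is not Conduit's implementation (Conduit is written in Go), and the helper name `find_pipeline_files` is hypothetical; the sketch only mirrors the rule as described: search all sub-folders recursively and keep files ending in `.yml` or `.yaml`.

```python
from pathlib import Path

# Extensions that count as pipeline configuration files, per the rule above.
PIPELINE_EXTENSIONS = {".yml", ".yaml"}


def find_pipeline_files(root):
    """Return every .yml/.yaml file under root, including all sub-folders.

    Hypothetical helper for illustration only; not part of Conduit.
    """
    root = Path(root)
    return sorted(
        path
        for path in root.rglob("*")  # rglob walks the tree recursively
        if path.is_file() and path.suffix in PIPELINE_EXTENSIONS
    )
```

Files with any other extension are ignored by this sketch, matching the behavior described for Conduit.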
|
||
If you have your YAML files in a different directory, or want to provision only one file, run Conduit with
the CLI flag `pipelines.path` and point it to your file or directory:
```
./conduit -pipelines.path ../my-directory
```
If your directory does not exist, Conduit will fail with an error: `"pipelines.path" config value is invalid`
|
||
### YAML Schema | ||
|
||
A configuration file has two root keys: `version` and `pipelines`. The `pipelines` key is a map of pipeline IDs, and
each pipeline entry contains fields such as `status` and `name`, which are configurations for the pipeline itself.
|
||
To create connectors in that pipeline, simply add another map under the pipeline map, and call it `connectors`. | ||
|
||
To create processors, add a `processors` map either under a pipeline ID or under a connector ID, depending on the
processor's parent. The following YAML example explains each field:
|
||
``` yaml
version: 1.0 # parser version, the only supported version for now is 1.0 [mandatory]

pipelines: # a map of pipeline IDs and their configurations.
  pipeline1: # pipeline ID, has to be unique.
    status: running # pipeline status at startup, either running or stopped. [mandatory]
    name: pipeline1 # pipeline name, if not specified, pipeline ID will be used as name. [optional]
    description: desc # pipeline description. [optional]
    connectors: # a map of connector IDs and their configurations.
      con1: # connector ID, has to be unique per pipeline.
        type: source # connector type, either "source" or "destination". [mandatory]
        plugin: builtin:file # connector plugin. [mandatory]
        name: con3 # connector name, if not specified, connector ID will be used as name. [optional]
        settings: # map of configuration keys and their values.
          path: ./file1.txt # for this example, the plugin "builtin:file" has only one configuration, which is path.
      con2:
        type: destination
        plugin: builtin:file
        name: file-dest
        settings:
          path: ./file2.txt
        processors: # a map of processor IDs and their configurations, "con2" is the processor parent.
          proc1: # processor ID, has to be unique for each parent
            type: js # processor type. [mandatory]
            settings: # map of processor configurations and values
              Prop1: string
    processors: # processor IDs that have the pipeline "pipeline1" as a parent.
      proc2:
        type: js
        settings:
          prop1: ${ENV_VAR} # you can use environment variables by wrapping them in a dollar sign and curly braces ${}.
```
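
The `${ENV_VAR}` syntax from the example can be illustrated in Python. How Conduit performs the substitution internally is not described here, so this is only a sketch under the assumption that each `${NAME}` is replaced by the value of the environment variable `NAME`. The helper `expand_env` and the choice to leave references to unset variables untouched are both hypothetical.

```python
import os
import re

# Matches ${NAME}, where NAME looks like a typical environment variable name.
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(text, env=None):
    """Replace ${NAME} references in text with values taken from env.

    Hypothetical helper for illustration only; not part of Conduit.
    References to variables missing from env are left unchanged
    (an assumption, Conduit may behave differently).
    """
    env = os.environ if env is None else env
    return _ENV_REF.sub(lambda m: env.get(m.group(1), m.group(0)), text)
```

Passing a plain dictionary as `env` makes the behavior easy to try without touching the real process environment.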
If a pipeline in the file is invalid (a mandatory field is missing, or a configuration value is invalid), then that
pipeline will be skipped, with an error message logged.
If two pipelines in one file have the same ID, or the `version` field is not specified, then the file cannot be
parsed and the whole file will be skipped, with an error message logged.
|
||
If two pipelines from different files have the same ID, the second pipeline will be skipped, with an error message | ||
specifying which pipeline was not provisioned. | ||
|
||
**_Note_**: Connector IDs and processor IDs will get their parent ID prefixed, so if you specify a connector ID as `con1`
and its parent is `pipeline1`, then the provisioned connector will have the ID `pipeline1:con1`. The same goes for processors:
if a processor has a pipeline parent, then the processor ID will be `pipelineID:processorID`, and if a processor
has a connector parent, then the processor ID will be `pipelineID:connectorID:processorID`.
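
The prefixing scheme can be sketched in a few lines of Python. The function name `provisioned_id` is hypothetical, and the sketch assumes that a processor attached directly to a pipeline is prefixed with the pipeline ID only, in line with the rule that every ID gets its parent ID prefixed.

```python
def provisioned_id(*ids):
    """Join a chain of IDs, outermost parent first, using ':' as separator.

    Hypothetical helper for illustration only; not part of Conduit.
    """
    if not ids or any(not part for part in ids):
        raise ValueError("every ID in the chain must be a non-empty string")
    return ":".join(ids)


# IDs taken from the YAML example above.
print(provisioned_id("pipeline1", "con1"))           # connector -> pipeline1:con1
print(provisioned_id("pipeline1", "proc2"))          # processor with a pipeline parent -> pipeline1:proc2
print(provisioned_id("pipeline1", "con2", "proc1"))  # processor with a connector parent -> pipeline1:con2:proc1
```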
|
||
## Pipelines Immutability | ||
|
||
Pipelines provisioned by configuration files are **immutable**: any updates needed on a provisioned pipeline have to be
done through the configuration file. Through the UI or API, you can only stop and start such a pipeline.
|
||
### Updates and Deletes | ||
|
||
Updates and deletes for a pipeline provisioned by configuration files can only be done through the configuration files. | ||
Changes should be made to the files, then Conduit has to be restarted to reload the changes. Any updates or deletes done | ||
through the API or UI will be prohibited. | ||
|
||
* To delete a pipeline: delete it from the `pipelines` map in the configuration file, then run Conduit again.
* To update a pipeline: change the field values in the configuration file, then run Conduit again to apply the updates.
|
||
Updates preserve the status of the pipeline, and the pipeline continues working from where it stopped. However, if any
of the following values is updated, the pipeline will start from the beginning of the source instead of continuing
from where it stopped: `pipeline ID`, `connector ID`, `connector plugin`, `connector type`.
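
The rule about which updates reset a pipeline can be written down as a small Python check. This is a sketch only: the dictionary keys and the function name `restarts_from_beginning` are hypothetical, and the sketch compares one connector's old and new configuration rather than reproducing how Conduit tracks positions.

```python
# Fields whose change makes the pipeline start from the beginning of the source.
# The key names are assumptions made for this illustration.
RESET_FIELDS = ("pipeline_id", "connector_id", "plugin", "type")


def restarts_from_beginning(old, new):
    """Return True if the update changes any field listed in RESET_FIELDS.

    Hypothetical helper for illustration only; not part of Conduit.
    """
    return any(old.get(field) != new.get(field) for field in RESET_FIELDS)


before = {"pipeline_id": "pipeline1", "connector_id": "con1",
          "plugin": "builtin:file", "type": "source", "name": "con3"}

# Renaming the connector keeps the position; changing the plugin does not.
print(restarts_from_beginning(before, {**before, "name": "renamed"}))       # False
print(restarts_from_beginning(before, {**before, "plugin": "builtin:s3"}))  # True
```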
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"label": "Connectors", | ||
"position": 4 | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
--- | ||
title: "Connector Behavior" | ||
sidebar_label: "Behavior" | ||
slug: "behavior" | ||
sidebar_position: 2 | ||
--- | ||
|
||
This document provides insights on how Conduit communicates with a connector. | ||
|
||
## Conduit Connector Protocol | ||
|
||
Conduit expects all connectors to follow the | ||
[Conduit Connector Protocol](https://github.com/ConduitIO/conduit-connector-protocol). | ||
The connector protocol is a set of protobuf files describing | ||
the [interface](#protocol-grpc-interface) | ||
between Conduit and the connector in the form of gRPC services. This approach | ||
allows connectors to be written in any language with support for gRPC. | ||
|
||
The connector protocol splits the connector interface into three gRPC services: one
for the source, another for the destination, and a third one for the connector
specifications. A connector needs to implement the specifications service and at
least one of the source or destination services.
|
||
Note that you don't need to use the connector protocol directly - we provide a | ||
[Go connector SDK](https://github.com/ConduitIO/conduit-connector-sdk) that | ||
hides the complexity of the protocol and simplifies the implementation of a | ||
connector. | ||
|
||
### Standalone vs built-in connectors | ||
|
||
While the Conduit Connector Protocol decouples Conduit from its connectors by | ||
using gRPC, it also provides a thin Go layer that allows any Go connector to be | ||
compiled into the Conduit binary as a built-in connector. The following diagram | ||
shows how Conduit communicates with a standalone connector and a built-in | ||
connector. | ||
|
||
![Standalone vs built-in connectors](/images/standalone-vs-builtin.svg) | ||
|
||
**Standalone connectors** run as processes separate from the Conduit process.
They need to have an entrypoint (binary or script) which runs the connector and
starts the gRPC server responsible for communicating with Conduit. Conduit
starts and stops a standalone connector process on demand, and starts one
connector process for every pipeline connector.
|
||
**Built-in connectors** on the other hand are executed in the same process as | ||
Conduit and communicate with Conduit through Go channels instead of gRPC. Any | ||
connector written in Go can be compiled into the Conduit binary and used as a | ||
built-in connector. | ||
|
||
Find out more about the [Conduit connector plugin architecture](https://github.com/ConduitIO/conduit/blob/main/docs/architecture-decision-records/20220121-conduit-plugin-architecture.md). | ||
|
||
## Protocol gRPC Interface | ||
|
||
The protocol interface is hosted on the | ||
[Buf schema registry](https://buf.build/conduitio/conduit-connector-protocol/docs/main:connector.v1). | ||
Use it as a starting point when implementing a connector in a language other | ||
than Go. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
--- | ||
title: "Building Connectors" | ||
slug: "building-connectors" | ||
sidebar_position: 3 | ||
--- | ||
|
||
Conduit connectors can be built in any programming language that supports gRPC. | ||
To make it easier to write connectors we provide | ||
a [Connector SDK](https://github.com/ConduitIO/conduit-connector-sdk) written in | ||
Go. Using the SDK is the recommended way of writing a Conduit connector. | ||
|
||
## Conduit connector template | ||
|
||
The easiest way to start implementing your own Conduit connector is by using the | ||
[Conduit connector template](https://github.com/ConduitIO/conduit-connector-template). | ||
It contains the basic project structure as well as some additional utilities | ||
like GitHub actions and a Makefile. | ||
|
||
Find out more about the template and how to use it in the readme. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
--- | ||
title: "Installing Connectors" | ||
slug: "installing-connectors" | ||
sidebar_position: 0 | ||
--- | ||
|
||
Conduit ships with a number of built-in connectors: | ||
|
||
- [File connector](https://github.com/ConduitIO/conduit-connector-file) provides | ||
a source/destination to read/write a local file (useful for quickly trying out | ||
Conduit without additional setup). | ||
- [Kafka connector](https://github.com/ConduitIO/conduit-connector-kafka) | ||
provides a source/destination for Apache Kafka. | ||
- [Postgres connector](https://github.com/ConduitIO/conduit-connector-postgres) | ||
provides a source/destination for PostgreSQL. | ||
- [S3 connector](https://github.com/ConduitIO/conduit-connector-s3) provides a | ||
source/destination for AWS S3. | ||
- [Generator connector](https://github.com/ConduitIO/conduit-connector-generator) | ||
provides a source which generates random data (useful for testing). | ||
|
||
Besides these connectors, there are a number of standalone connectors that can be
added to Conduit as plugins (see the
[complete list](https://github.com/ConduitIO/conduit/blob/main/docs/connectors.md)).
|
||
### Standalone Connector Binary | ||
|
||
To install a standalone connector you first need the compiled connector binary. | ||
A binary can normally be downloaded from the latest release in the connector's | ||
GitHub repository (this may vary in 3rd party connectors not developed by the | ||
Conduit team). Make sure to download the binary that matches your operating | ||
system and architecture. | ||
|
||
Alternatively you can build the binary yourself (for instructions on building a | ||
connector please refer to the readme of that specific connector). | ||
|
||
## Installing a Connector in Conduit | ||
|
||
Conduit loads standalone connectors at startup. The connector binaries need to | ||
be placed in the `connectors` directory relative to the Conduit binary so | ||
Conduit can find them. Alternatively, the path to the standalone connectors can | ||
be adjusted using the CLI flag `-connectors.path`, for example: | ||
|
||
```shell | ||
./conduit -connectors.path=/path/to/connectors/ | ||
``` | ||
|
||
The names of the connector binaries are not important, since Conduit gets the
information about connectors from the connectors themselves (using their gRPC API).
|
||
Find out how to [reference your connector](/docs/connectors/referencing-connectors). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
--- | ||
title: "Referencing Connectors" | ||
slug: "referencing-connectors" | ||
sidebar_position: 1 | ||
--- | ||
|
||
The name used to reference a connector in API requests (e.g. to create a new | ||
connector) comes in the following format: | ||
|
||
`[PLUGIN-TYPE:]PLUGIN-NAME[@VERSION]` | ||
|
||
- `PLUGIN-TYPE` (`builtin`, `standalone` or `any`)
  - Defines if the specified plugin should be builtin or standalone.
  - If `any`, Conduit will use a standalone plugin if it exists and fall back to
    a builtin plugin.
  - Default is `any`.
- `PLUGIN-NAME`
  - Defines the name of the plugin as specified in the plugin specifications; it
    has to be an exact match.
- `VERSION`
  - Defines the plugin version as specified in the plugin specifications; it has
    to be an exact match.
  - If `latest`, Conduit will use the latest semantic version.
  - Default is `latest`.
|
||
Examples: | ||
|
||
- `postgres`
  - will use the **latest** **standalone** **postgres** plugin
  - will fall back to the **latest** **builtin** **postgres** plugin if
    standalone wasn't found
- `postgres@v0.2.0`
  - will use the **standalone** **postgres** plugin with version **v0.2.0**
  - will fall back to a **builtin** **postgres** plugin with version **v0.2.0**
    if standalone wasn't found
- `builtin:postgres`
  - will use the **latest** **builtin** **postgres** plugin
- `standalone:postgres@v0.3.0`
  - will use the **standalone** **postgres** plugin with version **v0.3.0** (no
    fallback to builtin)
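
The reference format `[PLUGIN-TYPE:]PLUGIN-NAME[@VERSION]` and its defaults can be illustrated with a small Python parser. This is a sketch of the format as documented above, not Conduit's own parsing code; the names `PluginRef` and `parse_plugin_ref` are hypothetical, and the sketch does not look up plugins or resolve `latest` to a concrete version.

```python
from typing import NamedTuple

PLUGIN_TYPES = ("builtin", "standalone", "any")


class PluginRef(NamedTuple):
    plugin_type: str  # "builtin", "standalone" or "any"
    name: str         # plugin name, matched exactly
    version: str      # exact version, or "latest"


def parse_plugin_ref(ref):
    """Split a reference like 'standalone:postgres@v0.3.0' into its parts.

    Hypothetical helper for illustration only; not part of Conduit.
    Missing parts get the documented defaults: type 'any', version 'latest'.
    """
    rest, has_version, version = ref.partition("@")
    if not has_version:
        version = "latest"
    elif not version:
        raise ValueError(f"empty version in {ref!r}")

    plugin_type, has_type, name = rest.partition(":")
    if not has_type:
        plugin_type, name = "any", rest
    if plugin_type not in PLUGIN_TYPES:
        raise ValueError(f"unknown plugin type {plugin_type!r} in {ref!r}")
    if not name:
        raise ValueError(f"empty plugin name in {ref!r}")
    return PluginRef(plugin_type, name, version)
```

For example, `parse_plugin_ref("builtin:postgres")` yields the type `builtin`, the name `postgres`, and the default version `latest`.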
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"label": "Deploy", | ||
"position": 3 | ||
} |
File renamed without changes.
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
--- | ||
title: 'Connectors' | ||
slug: 'connectors' | ||
sidebar_position: 4 | ||
--- | ||
|
||
A connector knows how to read/write records from/to a data source/destination | ||
(e.g. a database). | ||
|
||
When thinking about connectors for Conduit, our goals were to:
- provide a good development experience to connector developers,
- ship Conduit with real built-in connectors (compiled into the Conduit binary),
- make it as easy as possible to write plugins in _any_ programming language,
- keep the [Connector SDK](https://github.com/conduitio/conduit-connector-sdk)
  decoupled from Conduit, so that it can change without changing Conduit itself.
|
||
Have a look at our [connector docs](/docs/connectors/installing-connectors) to | ||
find out more! |