forked from airbytehq/airbyte
🎉 New source: Fauna (airbytehq#15274)
* Add fauna source
* Update changelog to include the correct PR
* Improve docs (airbytehq#1)
* Applied suggestions to improve docs (airbytehq#2)
* Applied suggestions to improve docs
* Cleaned up the docs
* Apply suggestions from code review
* Update airbyte-integrations/connectors/source-fauna/source_fauna/spec.yaml
* Flake Checker (airbytehq#3)
* Run ./gradlew :airbyte-integrations:connectors:source-fauna:flakeCheck
* Fix all the warnings
* Set additionalProperties to true to adhere to acceptance tests
* Remove custom fields (airbytehq#4)
* Remove custom fields from source.py
* Remove custom fields from spec.yaml
* Collections that support incremental sync are found correctly
* Run formatter
* Index values and terms are verified
* Stripped additional_columns from collection config and check()
* We now search for an index at the start of each sync
* Add default for missing data in collection
* Add a log message about the index chosen to sync an incremental stream
* Add an example for a configured incremental catalog
* Check test now validates the simplified check function
* Remove collection name from spec.yaml and CollectionConfig
* Update test_util.py to adhere to the new config
* Update the first discover test to validate that we can find indexes correctly
* Remove other discover tests, as they no longer apply
* Full refresh test now works with simplified expanded columns
* Remove unused imports
* Incremental test now adheres to the find_index_for_stream system
* Database test passes, so now all unit tests pass again
* Remove extra fields from required section
* ttl is nullable
* Data defaults to an empty object
* Update tests to reflect ttl and data select changes
* Fix expected records. All unit tests and acceptance tests pass
* Cleanup docs for find_index_for_stream
* Update setup guide to reflect multiple collections
* Add docs to install the fauna shell
* Update examples and README to conform to the removal of additional columns

Co-authored-by: Ewan Edwards <[email protected]>
1 parent 44dc0ce, commit 521f2a4. Showing 49 changed files with 4,232 additions and 0 deletions.
airbyte-integrations/connectors/source-fauna/.dockerignore (6 additions)

```
*
!Dockerfile
!main.py
!source_fauna
!setup.py
!secrets
```
airbyte-integrations/connectors/source-fauna/.gitignore (6 additions)

```
# Python version tools
.tool-versions
../../../.tool-versions
# emacs auto-save files
*~
*#
```
airbyte-integrations/connectors/source-fauna/Dockerfile (38 additions)

```dockerfile
FROM python:3.9.11-alpine3.15 as base

# build and load all requirements
FROM base as builder
WORKDIR /airbyte/integration_code

# upgrade pip to the latest version
RUN apk --no-cache upgrade \
    && pip install --upgrade pip \
    && apk --no-cache add tzdata build-base

COPY setup.py ./
# install necessary packages to a temporary folder
RUN pip install --prefix=/install .

# build a clean environment
FROM base
WORKDIR /airbyte/integration_code

# copy all loaded and built libraries to a pure basic image
COPY --from=builder /install /usr/local
# add default timezone settings
COPY --from=builder /usr/share/zoneinfo/Etc/UTC /etc/localtime
RUN echo "Etc/UTC" > /etc/timezone

# bash is installed for more convenient debugging.
RUN apk --no-cache add bash

# copy payload code only
COPY main.py ./
COPY source_fauna ./source_fauna

ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]

LABEL io.airbyte.version=dev
LABEL io.airbyte.name=airbyte/source-fauna
```
airbyte-integrations/connectors/source-fauna/README.md (188 additions)
# New Readers

If you know how Airbyte works, read [bootstrap.md](bootstrap.md) for a quick introduction to this source. If you haven't
used Airbyte before, read [overview.md](overview.md) for a longer overview of what this connector is and how to use it.

# For Fauna Developers

## Running locally

First, start a local fauna container:
```
docker run --rm --name faunadb -p 8443:8443 fauna/faunadb
```

In another terminal, cd into the connector directory:
```
cd airbyte-integrations/connectors/source-fauna
```

Once the container is up, set up the database:
```
fauna eval "$(cat examples/setup_database.fql)" --domain localhost --port 8443 --scheme http --secret secret
```

Finally, run the connector:
```
python main.py spec
python main.py check --config examples/config_localhost.json
python main.py discover --config examples/config_localhost.json
python main.py read --config examples/config_localhost.json --catalog examples/configured_catalog.json
```

To resume after a partial failure, you need to pass in a state file. To test this by example, induce a crash with bad data (e.g. a missing required field), update `examples/sample_state_full_sync.json` to contain your emitted state, and then run:

```
python main.py read --config examples/config_localhost.json --catalog examples/configured_catalog.json --state examples/sample_state_full_sync.json
```

## Running the integration tests

First, cd into the connector directory:
```
cd airbyte-integrations/connectors/source-fauna
```

The integration tests require a secret config.json. Ping me on Slack to get this file.
Once you have this file, put it in `secrets/config.json`. A sample of this file can be
found at `examples/secret_config.json`. Once the file is created, build the connector:
```
docker build . -t airbyte/source-fauna:dev
```

Now, run the integration tests:
```
python -m pytest -p integration_tests.acceptance
```
# Fauna Source

This is the repository for the Fauna source connector, written in Python.
For information about how to use this connector within Airbyte, see [the documentation](https://docs.airbyte.io/integrations/sources/fauna).

## Local development

### Prerequisites
**To iterate on this connector, make sure to complete these prerequisites.**

#### Minimum Python version required `= 3.9.0`

#### Build & Activate Virtual Environment and install dependencies
From this connector directory, create a virtual environment:
```
python -m venv .venv
```

This will generate a virtualenv for this module in `.venv/`. Make sure this venv is active in your
development environment of choice. To activate it from the terminal, run:
```
source .venv/bin/activate
pip install -r requirements.txt
```
If you are in an IDE, follow your IDE's instructions to activate the virtualenv.

Note that while we are installing dependencies from `requirements.txt`, you should only edit `setup.py` for your dependencies. `requirements.txt` is
used for editable installs (`pip install -e`) to pull in Python dependencies from the monorepo and will call `setup.py`.
If this is mumbo jumbo to you, don't worry about it: just put your deps in `setup.py` but install using `pip install -r requirements.txt`, and everything
should work as you expect.

#### Building via Gradle
From the Airbyte repository root, run:
```
./gradlew :airbyte-integrations:connectors:source-fauna:build
```

#### Create credentials
**If you are a community contributor**, follow the instructions in the [documentation](https://docs.airbyte.io/integrations/sources/fauna)
to generate the necessary credentials. Then create a file `secrets/config.json` conforming to the `source_fauna/spec.yaml` file.
Note that the `secrets` directory is gitignored by default, so there is no danger of accidentally checking in sensitive information.
See `examples/secret_config.json` for a sample config file.

**If you are an Airbyte core member**, copy the credentials in Lastpass under the secret name `source fauna test creds`
and place them into `secrets/config.json`.

### Locally running the connector
```
python main.py spec
python main.py check --config secrets/config.json
python main.py discover --config secrets/config.json
python main.py read --config secrets/config.json --catalog integration_tests/configured_catalog.json
```

### Locally running the connector docker image

#### Build
First, make sure you build the latest Docker image:
```
docker build . -t airbyte/source-fauna:dev
```

You can also build the connector image via Gradle:
```
./gradlew :airbyte-integrations:connectors:source-fauna:airbyteDocker
```
When building via Gradle, the docker image name and tag, respectively, are the values of the `io.airbyte.name` and `io.airbyte.version` `LABEL`s in
the Dockerfile.

#### Run
Then run any of the connector commands as follows:
```
docker run --rm airbyte/source-fauna:dev spec
docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-fauna:dev check --config /secrets/config.json
docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-fauna:dev discover --config /secrets/config.json
docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/integration_tests:/integration_tests airbyte/source-fauna:dev read --config /secrets/config.json --catalog /integration_tests/configured_catalog.json
```

## Testing
Make sure to familiarize yourself with [pytest test discovery](https://docs.pytest.org/en/latest/goodpractices.html#test-discovery) to know how your test files and methods should be named.
First install test dependencies into your virtual environment:
```
pip install .[tests]
```

### Unit Tests
To run unit tests locally, from the connector directory run:
```
python -m pytest unit_tests
```

### Integration Tests
There are two types of integration tests: Acceptance Tests (Airbyte's test suite for all source connectors) and custom integration tests (which are specific to this connector).

#### Custom Integration tests
Place custom tests inside the `integration_tests/` folder, then, from the connector root, run
```
python -m pytest integration_tests
```

#### Acceptance Tests
Customize the `acceptance-test-config.yml` file to configure the tests. See [Source Acceptance Tests](https://docs.airbyte.io/connector-development/testing-connectors/source-acceptance-tests-reference) for more information.
If your connector requires creating or destroying resources for use during acceptance tests, create fixtures for them and place them inside `integration_tests/acceptance.py`.
To run your integration tests together with the acceptance tests, from the connector root, run
```
python -m pytest integration_tests -p integration_tests.acceptance
```
To run your integration tests with Docker, use the `acceptance-test-docker.sh` script.

### Using gradle to run tests
All commands should be run from the airbyte project root.
To run unit tests:
```
./gradlew :airbyte-integrations:connectors:source-fauna:unitTest
```
To run acceptance and custom integration tests:
```
./gradlew :airbyte-integrations:connectors:source-fauna:integrationTest
```

## Dependency Management
All of your dependencies should go in `setup.py`, NOT `requirements.txt`. The requirements file is only used to connect internal Airbyte dependencies in the monorepo for local development.
We split dependencies between two groups:
* dependencies required for your connector to work go in the `MAIN_REQUIREMENTS` list.
* dependencies required for testing go in the `TEST_REQUIREMENTS` list.
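As a sketch of how that split can look in a connector's `setup.py`, under the assumption that the connector depends on the Airbyte CDK and the Fauna driver (the concrete package names and version pins below are illustrative, not this connector's actual lists):

```python
# Hypothetical setup.py skeleton showing the two dependency groups.
# Package names and versions here are assumptions for illustration.
MAIN_REQUIREMENTS = [
    "airbyte-cdk~=0.1",  # assumed: Airbyte's connector framework
    "faunadb",           # assumed: the Fauna Python driver
]

TEST_REQUIREMENTS = [
    "pytest~=6.1",
    "source-acceptance-test",
]

# setuptools.setup(...) would then wire these up, roughly:
#   install_requires=MAIN_REQUIREMENTS,
#   extras_require={"tests": TEST_REQUIREMENTS},
# which is what makes `pip install .[tests]` pull in the test group.
```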

### Publishing a new version of the connector
You've checked out the repo, implemented a million dollar feature, and you're ready to share your changes with the world. Now what?
1. Make sure your changes are passing unit and integration tests.
1. Bump the connector version in `Dockerfile` -- just increment the value of the `LABEL io.airbyte.version` appropriately (we use [SemVer](https://semver.org/)).
1. Create a Pull Request.
1. Pat yourself on the back for being an awesome contributor.
1. Someone from Airbyte will take a look at your PR and iterate with you to merge it into master.
airbyte-integrations/connectors/source-fauna/acceptance-test-config.yml (45 additions, 0 deletions)

```yaml
# See [Source Acceptance Tests](https://docs.airbyte.io/connector-development/testing-connectors/source-acceptance-tests-reference)
# for more information about how to configure these tests
connector_image: airbyte/source-fauna:dev
tests:
  spec:
    - spec_path: "source_fauna/spec.yaml"
  connection:
    - config_path: "secrets/config.json"
      status: "succeed"
    - config_path: "secrets/config-deletions.json"
      status: "succeed"
    - config_path: "integration_tests/config/invalid.json"
      status: "failed"
  discovery:
    - config_path: "secrets/config.json"
    - config_path: "secrets/config-deletions.json"
  basic_read:
    - config_path: "secrets/config.json"
      configured_catalog_path: "integration_tests/configured_catalog.json"
      empty_streams: []
      expect_records:
        path: "integration_tests/expected_records.txt"
        extra_fields: no
        exact_order: yes
        extra_records: no
    - config_path: "secrets/config-deletions.json"
      configured_catalog_path: "integration_tests/configured_catalog_incremental.json"
      empty_streams: []
      expect_records:
        path: "integration_tests/expected_deletions_records.txt"
        extra_fields: no
        exact_order: yes
        extra_records: no
  incremental:
    - config_path: "secrets/config.json"
      configured_catalog_path: "integration_tests/configured_catalog.json"
      # Note that the time in this file was generated with this fql:
      # ToMicros(ToTime(Date("9999-01-01")))
      future_state_path: "integration_tests/abnormal_state.json"
    - config_path: "secrets/config-deletions.json"
      configured_catalog_path: "integration_tests/configured_catalog_incremental.json"
      future_state_path: "integration_tests/abnormal_deletions_state.json"
  full_refresh:
    - config_path: "secrets/config.json"
      configured_catalog_path: "integration_tests/configured_catalog.json"
```
airbyte-integrations/connectors/source-fauna/acceptance-test-docker.sh (16 additions, 0 deletions)

```sh
#!/usr/bin/env sh

# Build latest connector image
docker build . -t $(cat acceptance-test-config.yml | grep "connector_image" | head -n 1 | cut -d: -f2-)

# Pull latest acctest image
docker pull airbyte/source-acceptance-test:latest

# Run
docker run --rm -it \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /tmp:/tmp \
    -v $(pwd):/test_input \
    airbyte/source-acceptance-test \
    --acceptance-test-config /test_input
```
airbyte-integrations/connectors/source-fauna/bootstrap.md (56 additions)

# Fauna Source

[Fauna](https://fauna.com/) is a serverless "document-relational" database that users interact with via APIs. This connector delivers Fauna as an Airbyte source.

This source is implemented using the [Airbyte CDK](https://docs.airbyte.io/connector-development/cdk-python).
It also uses the [Fauna Python Driver](https://docs.fauna.com/fauna/current/drivers/python), which
allows the connector to build FQL queries in Python. This driver is what queries the Fauna database.

Fauna has collections (similar to tables) and documents (similar to rows).

Every document has at least 3 fields: `ref`, `ts` and `data`. The `ref` is a unique string identifier
for every document. The `ts` is a timestamp, which is the time that the document was last modified.
The `data` is arbitrary JSON data. Because there is no fixed shape to this data, we also allow users of
Airbyte to specify which fields of the document they want to export as top-level columns.
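A minimal sketch of what one such document might look like once serialized; the collection contents and the exact serialized form of `ref` are made up for illustration:

```python
# A sketch of a Fauna document as a connector might serialize it.
# The data payload and the string form of the ref are hypothetical.
document = {
    # unique identifier of the document within its collection
    "ref": "336896027938521619",
    # last-modified time, in microseconds since the Unix epoch
    "ts": 1659602102610000,
    # arbitrary user-defined JSON data
    "data": {"name": "Ada", "email": "ada@example.com"},
}

# Every document carries at least these three fields.
assert set(document) == {"ref", "ts", "data"}
```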
Users can also choose to export the raw `data` field itself and, in the case of incremental syncs, metadata about when a document was deleted.

We currently provide only a single stream, which is the collection the user has chosen. This is
because supporting incremental syncs requires an index for every collection, so it ends up being easier to have the user
set up the index and tell us the collection and index name they wish to use.
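For concreteness, the kind of index such a setup needs could be created in the fauna shell with something like the following; the index name here is hypothetical, and the definition the connector actually expects may differ. The key point is that the index returns `ts` and `ref` as its values:

```
CreateIndex({
  name: "incremental-sync-index",
  source: Collection("collection-name"),
  values: [{ field: ["ts"] }, { field: ["ref"] }]
})
```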

## Full sync

This source will simply call the following [FQL](https://docs.fauna.com/fauna/current/api/fql/): `Paginate(Documents(Collection("collection-name")))`.
This queries all documents in the database in a paginated manner. The source then iterates over all the results from that query to export data from the connector.

Docs:
[Paginate](https://docs.fauna.com/fauna/current/api/fql/functions/paginate?lang=python).
[Documents](https://docs.fauna.com/fauna/current/api/fql/functions/documents?lang=python).
[Collection](https://docs.fauna.com/fauna/current/api/fql/functions/collection?lang=python).
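The full-sync read boils down to a cursor-driven pagination loop. As a sketch under stated assumptions: `FakeClient` below stands in for the real Fauna driver (which would run `Paginate(Documents(Collection(name)))` and return pages with an `after` cursor), and the page size and documents are made up:

```python
# Sketch of the full-sync pagination loop. A real implementation issues
# the Paginate query through the Fauna Python driver; FakeClient merely
# emulates the paged responses so the loop itself can be shown.

class FakeClient:
    def __init__(self, docs, page_size=2):
        self.docs = docs
        self.page_size = page_size

    def paginate(self, after=0):
        """Return one page: {"data": [...], "after": next-cursor-or-None}."""
        page = self.docs[after:after + self.page_size]
        next_cursor = after + self.page_size
        return {
            "data": page,
            "after": next_cursor if next_cursor < len(self.docs) else None,
        }

def read_all(client):
    """Request pages until the server stops returning an `after` cursor."""
    records = []
    cursor = 0
    while cursor is not None:
        page = client.paginate(after=cursor)
        records.extend(page["data"])
        cursor = page["after"]
    return records

docs = [{"ref": str(i), "ts": i * 1000, "data": {}} for i in range(5)]
all_docs = read_all(FakeClient(docs))
assert len(all_docs) == 5  # every document is exported exactly once
```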

## Incremental sync

### Updates (uses an index over ts)

The source will call FQL similar to this: `Paginate(Range(Match(Index("index-name")), <last-sync-ts>, []))`.
The index we match against has the values `ts` and `ref`, so it sorts by the time at which each document
was last modified. The Range() limits the query to just the documents that have been modified
since the last query.

Docs:
[Range](https://docs.fauna.com/fauna/current/api/fql/functions/range?lang=python).
[Match](https://docs.fauna.com/fauna/current/api/fql/functions/match?lang=python).
[Index](https://docs.fauna.com/fauna/current/api/fql/functions/index?lang=python).
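The effect of that query can be sketched without a live database. The index is emulated here by sorting on `(ts, ref)` and dropping everything at or before the saved cursor; the cursor handling is an illustrative simplification, not the connector's exact state logic:

```python
# Sketch of the incremental "updates" read. The real connector issues
# Paginate(Range(Match(Index(name)), last_sync_ts, [])) against an index
# whose values are (ts, ref); this emulates that in plain Python.

def read_updates(documents, last_sync_ts):
    """Return documents modified after last_sync_ts, oldest first,
    plus the new cursor value to persist in the sync state."""
    changed = sorted(
        (d for d in documents if d["ts"] > last_sync_ts),
        key=lambda d: (d["ts"], d["ref"]),
    )
    new_cursor = changed[-1]["ts"] if changed else last_sync_ts
    return changed, new_cursor

docs = [
    {"ref": "1", "ts": 100, "data": {}},
    {"ref": "2", "ts": 300, "data": {}},
    {"ref": "3", "ts": 200, "data": {}},
]
changed, cursor = read_updates(docs, last_sync_ts=150)
assert [d["ref"] for d in changed] == ["3", "2"]  # ts order: 200, then 300
assert cursor == 300
```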

### Deletes (uses the events API)

If the user wants deletes, we have a separate query for that:
`Paginate(Events(Documents(Collection("collection-name"))))`. This paginates over all the events
of the documents in the collection. We also filter this to only give us the events since the last
sync. Using these events, we can produce a record with the "deleted at" field set, so
that users know the document has been deleted.

Docs:
[Events](https://docs.fauna.com/fauna/current/api/fql/functions/events?lang=python).
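The deletion handling can be sketched the same way; the event feed is a plain list here, and the event shape and field names are illustrative rather than the connector's exact schema:

```python
# Sketch of the "deletes" read. The real connector paginates
# Events(Documents(Collection(name))) and keeps only events newer than
# the last sync; only "remove" events become deletion records.

def read_deletions(events, last_sync_ts):
    """Turn 'remove' events since the last sync into records whose
    deleted_at field tells the destination the document is gone."""
    records = []
    for event in events:
        if event["action"] == "remove" and event["ts"] > last_sync_ts:
            records.append({
                "ref": event["ref"],
                "ts": event["ts"],
                "deleted_at": event["ts"],  # microseconds, like Fauna's ts
            })
    return records

events = [
    {"action": "add", "ref": "1", "ts": 100},
    {"action": "remove", "ref": "1", "ts": 250},
    {"action": "remove", "ref": "2", "ts": 50},  # before the last sync
]
deleted = read_deletions(events, last_sync_ts=150)
assert [r["ref"] for r in deleted] == ["1"]
```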
airbyte-integrations/connectors/source-fauna/build.gradle (9 additions)

```groovy
plugins {
    id 'airbyte-python'
    id 'airbyte-docker'
    id 'airbyte-source-acceptance-test'
}

airbytePython {
    moduleDirectory 'source_fauna_singer'
}
```