test(metadata-io): Run metadata-io tests in parallel #3358

Merged

@EnricoMi EnricoMi commented Oct 11, 2021

The metadata-io project has a lot of tests that spin up a Docker container, which takes 10-12s before the actual tests can run. There are currently 5 Elasticsearch containers. Soon there will be a Dgraph (#3261) and a Cassandra (#3286) container as well.

The test task alone takes around 2 minutes:

./gradlew :metadata-io:cleanTest :metadata-io:test

Gradle seems to instantiate all test classes in the project first and call their @BeforeTest methods, which start the required containers. All these containers are started sequentially, before the first test in metadata-io runs. After that, only a few tests run in parallel. See the CREATED column:

$ docker container ls
CONTAINER ID   IMAGE                                                 COMMAND                  CREATED          STATUS
3437222a4acc   docker.elastic.co/elasticsearch/elasticsearch:7.9.3   "/tini -- /usr/local…"   7 seconds ago    Up 5 seconds
3c6d34138356   docker.elastic.co/elasticsearch/elasticsearch:7.9.3   "/tini -- /usr/local…"   17 seconds ago   Up 16 seconds
ac5b82875a52   docker.elastic.co/elasticsearch/elasticsearch:7.9.3   "/tini -- /usr/local…"   27 seconds ago   Up 26 seconds
a825c27add3a   docker.elastic.co/elasticsearch/elasticsearch:7.9.3   "/tini -- /usr/local…"   37 seconds ago   Up 36 seconds
a9fdf4248f4b   docker.elastic.co/elasticsearch/elasticsearch:7.9.3   "/tini -- /usr/local…"   47 seconds ago   Up 46 seconds
468ae10eea30   testcontainers/ryuk:0.3.1                             "/app"                   48 seconds ago   Up 47 seconds      

This hurts especially when you run only a specific test class, e.g. ElasticSearchGraphServiceTest: Gradle still starts all containers sequentially, only to run that one test class:

./gradlew :metadata-io:cleanTest :metadata-io:test --tests "*.ElasticSearchGraphServiceTest"

This change makes Gradle run the tests inside metadata-io in parallel, so the containers start at roughly the same time (see the CREATED column):

CONTAINER ID   IMAGE                                                 COMMAND                  CREATED          STATUS
a118bad2d9cd   docker.elastic.co/elasticsearch/elasticsearch:7.9.3   "/tini -- /usr/local…"   18 seconds ago   Up 17 seconds
3286a2ce7fe0   testcontainers/ryuk:0.3.1                             "/app"                   20 seconds ago   Up 18 seconds
c2899fefd923   docker.elastic.co/elasticsearch/elasticsearch:7.9.3   "/tini -- /usr/local…"   21 seconds ago   Up 19 seconds
3d8d7a23b222   docker.elastic.co/elasticsearch/elasticsearch:7.9.3   "/tini -- /usr/local…"   21 seconds ago   Up 19 seconds
e0494658647e   docker.elastic.co/elasticsearch/elasticsearch:7.9.3   "/tini -- /usr/local…"   21 seconds ago   Up 20 seconds
feb7a2bb8b00   docker.elastic.co/elasticsearch/elasticsearch:7.9.3   "/tini -- /usr/local…"   22 seconds ago   Up 20 seconds
be21d511314d   testcontainers/ryuk:0.3.1                             "/app"                   22 seconds ago   Up 20 seconds
2c873a246b6a   testcontainers/ryuk:0.3.1                             "/app"                   22 seconds ago   Up 20 seconds
d753564f9c0c   testcontainers/ryuk:0.3.1                             "/app"                   23 seconds ago   Up 21 seconds
c1da864f75cb   testcontainers/ryuk:0.3.1                             "/app"                   23 seconds ago   Up 21 seconds

This cuts test time almost in half (down to 1m 20s). For ElasticSearchGraphServiceTest alone, for instance, test time drops from 1m 10s to 48s. The benefit will be even larger once the Dgraph and Cassandra containers are used for testing.
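
For readers who want a picture of the kind of setting involved, here is a minimal sketch. It is an assumption, not a copy of this PR's diff: it uses the `maxParallelForks` property of Gradle's `Test` task, which lets test classes run in several forked JVMs at once, so each fork can start its own container concurrently. The file path and the choice of half the available cores are illustrative only.

```groovy
// metadata-io/build.gradle (hypothetical excerpt, not the literal diff of this PR)
test {
  // Allow test classes to run in up to N forked JVMs at the same time.
  // Half of the available cores is an illustrative choice; "?: 1" falls back
  // to a single fork when the division yields 0 (single-core machines).
  maxParallelForks = Runtime.runtime.availableProcessors().intdiv(2) ?: 1
}
```

With a setting like this, the commands shown earlier stay unchanged. The several testcontainers/ryuk containers in the second listing would be consistent with multiple test JVMs running side by side, though that is an inference from the output rather than something stated here.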

@EnricoMi EnricoMi changed the title from "test(mtadata-io): Run metadata-io tests in parallel" to "test(metadata-io): Run metadata-io tests in parallel" Oct 11, 2021
@EnricoMi

Not sure how the metadata-ingestion failures are related to changes to the metadata-io Gradle config.

@shirshanka

@EnricoMi: a rebase might fix your branch. An upstream Airflow release broke our CI earlier today.

@EnricoMi EnricoMi force-pushed the branch-metadata-io-parallel-tests branch from c726db7 to 8a64f32 October 12, 2021 06:32
@EnricoMi

@gabe-lyons rebased, everything is green

@shirshanka shirshanka left a comment

LGTM!

@shirshanka shirshanka merged commit 38328d6 into datahub-project:master Oct 12, 2021
@shirshanka shirshanka added the hacktoberfest-accepted Acceptance for hacktoberfest https://hacktoberfest.com/participation/ label Oct 12, 2021