
Flaky postgresql test #1027

Open
ndr-brt opened this issue Feb 5, 2024 · 22 comments · Fixed by #1574

@ndr-brt
Contributor

ndr-brt commented Feb 5, 2024

WHAT

There's a flaky test in the "postgresql" test cluster; it fails from time to time, e.g.:
https://github.com/eclipse-tractusx/tractusx-edc/actions/runs/7785340855/job/21227835752


@ndr-brt ndr-brt added the enhancement and triage labels Feb 5, 2024
@github-project-automation github-project-automation bot moved this to Open in EDC Board Feb 5, 2024
Contributor

github-actions bot commented Mar 9, 2024

This issue is stale because it has been open for 4 weeks with no activity.

@github-actions github-actions bot added the stale label Mar 9, 2024
@wolf4ood wolf4ood removed the stale label Mar 10, 2024
@wolf4ood wolf4ood added this to the Backlog milestone Mar 10, 2024
@wolf4ood
Contributor

After the refactoring and the parallelization of tests done here, the test suite has been stable for the past week. I would close this and, if some flaky tests emerge again, re-open it.

@github-project-automation github-project-automation bot moved this from Open to Done in EDC Board Mar 29, 2024
@wolf4ood wolf4ood removed the triage label Mar 29, 2024
@wolf4ood
Contributor

Seems that it's still valid; reopening for investigation:

https://github.com/eclipse-tractusx/tractusx-edc/actions/runs/9268208526

@wolf4ood wolf4ood reopened this May 28, 2024
@ndr-brt
Contributor Author

ndr-brt commented May 28, 2024

Looks like a new runtime with Jetty is started while another one is still running, and the ports are defined statically, so they are the same for every runtime (and also across different tests).
One solution could be to generate new ports for every test; another would be to use the same runtime for all the tests (which would also make them run significantly faster).
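A minimal sketch of the first option, assuming the runtime ports are injected through a per-test configuration map (the property names and helper are illustrative, not the actual test fixtures):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Map;

public final class TestPorts {

    private TestPorts() {
    }

    // Ask the OS for an ephemeral port: bind on port 0, read the assigned
    // port, then release the socket so the runtime under test can bind it.
    public static int freePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            socket.setReuseAddress(true);
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new RuntimeException("Could not allocate a free port", e);
        }
    }

    // Build a fresh configuration for every runtime so two runtimes started
    // in the same run never share statically defined ports.
    public static Map<String, String> runtimeConfig() {
        return Map.of(
                "web.http.port", String.valueOf(freePort()),
                "web.http.management.port", String.valueOf(freePort())
        );
    }
}
```

Note that a port obtained this way can still be taken by another process between release and rebind, which is the kind of race discussed further down in this thread.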

@wolf4ood
Contributor

It seems strange that a runtime is started while another one is still running; we don't run them in parallel, AFAIK.


This issue is stale because it has been open for 2 weeks with no activity.

@github-actions github-actions bot added the stale label Jun 12, 2024

This issue was closed because it has been inactive for 7 days since being marked as stale.

@github-actions github-actions bot closed this as not planned Jun 19, 2024
@wolf4ood wolf4ood reopened this Jun 19, 2024
@wolf4ood wolf4ood removed the stale label Jun 19, 2024
@ndr-brt
Contributor Author

ndr-brt commented Jun 27, 2024

Looks like the last time they broke on main was 3 weeks ago; maybe we fixed them unintentionally (perhaps with the upstream e2e test runtime refactoring?).

@wolf4ood
Contributor

It could be, but I think I saw some failures on Dependabot PRs. I would leave this open for now; it probably needs further investigation.

@wolf4ood
Contributor

wolf4ood commented Jul 1, 2024

It seems that a similar failure also happens upstream, though less frequently:

https://github.com/eclipse-edc/Connector/actions/runs/9743379424/job/26886697525?pr=4312


This issue is stale because it has been open for 2 weeks with no activity.

@github-actions github-actions bot added the stale label Jul 16, 2024
@ndr-brt
Contributor Author

ndr-brt commented Jul 17, 2024

A possibility could be that the Participant object (which is instantiated statically) gets a free random port, but that port then gets used by the postgresql container as its host port. The probability is quite low, to be honest, but it could happen anyway. I will refactor it a little; then let's see if that fixes the issue.
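A minimal sketch of one way to sidestep that collision, assuming the tests start postgres via Testcontainers (image tag and helper names are illustrative): let the container map 5432 to a random ephemeral host port instead of binding a host port that was pre-allocated with the same mechanism the runtimes use.

```java
import org.testcontainers.containers.PostgreSQLContainer;

public final class PostgresTestSupport {

    private PostgresTestSupport() {
    }

    // Do NOT pre-allocate a host port and bind it to 5432: that value could
    // equally well be handed out to a runtime. Let Testcontainers map 5432
    // to a random ephemeral host port instead.
    public static PostgreSQLContainer<?> startPostgres() {
        PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:16-alpine");
        postgres.start();
        return postgres;
    }

    public static String jdbcUrl(PostgreSQLContainer<?> postgres) {
        // The URL already contains the randomly mapped host port,
        // i.e. postgres.getMappedPort(5432).
        return postgres.getJdbcUrl();
    }
}
```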

@ndr-brt ndr-brt self-assigned this Jul 17, 2024
@ndr-brt ndr-brt removed the stale label Jul 17, 2024
@wolf4ood
Contributor

We can try, but the linked upstream failure uses a global service from Actions, not a containerized postgres.

I also saw failures on e2e tests without postgres.

@wolf4ood
Contributor

@ndr-brt
Contributor Author

ndr-brt commented Jul 17, 2024

The upstream error is more specific because it says:
A binding for port 32762 already exists
which means that another binding with the same port is defined in the same runtime (maybe because different calls to getFreePort returned the same value),

while this one says:
Address already in use
which means that an external service is using the same port; it could be either postgres or mockserver (some tests use it).

In any case, I think it's something related to getFreePort; maybe we could add a memory to it to avoid returning the same value twice in the same execution.
I'll open an issue upstream.
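A minimal sketch of that idea, not the actual upstream getFreePort implementation: remember every port handed out in this JVM and retry when the OS offers a duplicate.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public final class Ports {

    // Every port already returned in this execution. Guarantees that two
    // calls in the same JVM never return the same value.
    private static final Set<Integer> HANDED_OUT = ConcurrentHashMap.newKeySet();

    private Ports() {
    }

    public static int getFreePort() {
        for (int attempt = 0; attempt < 100; attempt++) {
            int candidate = askOsForEphemeralPort();
            if (HANDED_OUT.add(candidate)) {
                return candidate;
            }
            // already handed out earlier in this run: ask again
        }
        throw new IllegalStateException("Could not find an unused free port");
    }

    private static int askOsForEphemeralPort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            socket.setReuseAddress(true);
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new RuntimeException("Unable to open a socket on a free port", e);
        }
    }
}
```

This only protects against duplicates within the same execution; an external process (postgres, mockserver) can still grab the port between allocation and use, which would still show up as "Address already in use".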

@ndr-brt
Contributor Author

ndr-brt commented Jul 29, 2024

My previous theory has been debunked; tests are still failing with the same issue 🤷
https://github.com/eclipse-tractusx/tractusx-edc/actions/runs/10141302101/job/28038270072


This issue is stale because it has been open for 4 weeks with no activity.

@github-actions github-actions bot added the stale label Aug 31, 2024
Contributor

github-actions bot commented Sep 8, 2024

This issue was closed because it has been inactive for 7 days since being marked as stale.

@github-actions github-actions bot closed this as not planned Sep 8, 2024
@ndr-brt
Contributor Author

ndr-brt commented Sep 23, 2024


This issue is stale because it has been open for 4 weeks with no activity.

@github-actions github-actions bot added the stale label Oct 27, 2024
@ndr-brt ndr-brt removed the stale label Oct 28, 2024

This issue is stale because it has been open for 4 weeks with no activity.

@github-actions github-actions bot added the stale label Nov 30, 2024
@ndr-brt ndr-brt removed the stale label Dec 2, 2024