🐛 Destination Redshift: integration tests fail due to OOM #13375
Comments
@alexandr-shegeda is this happening every time you run tests, or only sometimes? I also saw this in base-normalization #12846 (comment) (https://github.com/airbytehq/airbyte/runs/6702093751?check_suite_focus=true#step:11:12870)
@edgao about this "Out Of Memory" error:
The Redshift server can send us back an "Out of memory" error from the remote server (it's not a memory issue on our side). I don't think we can really understand why this happens, because the error comes from Redshift. To avoid this problem I usually re-create ... I think we need to improve our python ...
For the Java side I have tried different scenarios:
Neither of them produced an OOM error. It looks like an intermittent issue related to Redshift itself. CC: @edgao, @alexandr-shegeda, @grubberr
I ran into this last night on a local test run, so it's probably still an issue. If the cause is that tests are polluting the integrationtests schema, could we update the tests to clean up their tables after completion? This was base-normalization's ...
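Not the actual Airbyte test code, but a minimal sketch of the kind of per-test teardown being suggested here, assuming a plain JDBC connection to the test cluster and a per-test schema name (the class and method names below are hypothetical):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Hypothetical helper illustrating per-test cleanup: drop the schema a test wrote into,
// so repeated CI runs don't accumulate tables in the integration-test schema space.
public class RedshiftTestCleanup {

  public static void dropTestSchema(final String jdbcUrl,
                                    final String user,
                                    final String password,
                                    final String schemaName) throws Exception {
    try (final Connection connection = DriverManager.getConnection(jdbcUrl, user, password);
         final Statement statement = connection.createStatement()) {
      // CASCADE also removes the tables the test created inside the schema.
      statement.execute("DROP SCHEMA IF EXISTS " + schemaName + " CASCADE");
    }
  }
}
```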
@edgao @grubberr @alexandr-shegeda @alexandertsukanov Looks like we would need to enable WLM functionality in our Redshift cluster config and probably restart the cluster (if needed). Does anyone know who has the permissions to do this?
Are you referring to automatic WLM? From https://docs.aws.amazon.com/redshift/latest/dg/automatic-wlm.html#wlm-monitoring-automatic-wlm I think we're already using it; when I ran the ... Also, https://us-east-2.console.aws.amazon.com/redshiftv2/home?region=us-east-2#workload-management?parameter-group=default.redshift-1.0 says we're using automatic WLM. Regardless, I have (or can find someone who has) permissions to modify and restart the cluster, so if there are instructions I can follow, just LMK!
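For reference, the AWS doc linked above checks the WLM mode through the stv_wlm_service_class_config system view (rows with service_class >= 100 appear only when automatic WLM is enabled, and num_query_tasks usually reads -1, i.e. "auto"). A rough sketch of running that check over JDBC; the endpoint and credentials are placeholders, not the CI cluster's:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Sketch of the automatic-WLM check described in the AWS docs, run over JDBC.
public class WlmModeCheck {

  public static void main(final String[] args) throws Exception {
    final String jdbcUrl = "jdbc:redshift://<cluster-endpoint>:5439/<database>"; // placeholder
    try (final Connection connection = DriverManager.getConnection(jdbcUrl, "<user>", "<password>");
         final Statement statement = connection.createStatement();
         final ResultSet resultSet = statement.executeQuery(
             "SELECT service_class, num_query_tasks FROM stv_wlm_service_class_config WHERE service_class >= 100")) {
      // Any rows here mean automatic WLM is enabled.
      while (resultSet.next()) {
        System.out.println("service_class=" + resultSet.getInt("service_class")
            + ", num_query_tasks=" + resultSet.getInt("num_query_tasks"));
      }
    }
  }
}
```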
This is the link to the successfully passed SAT run: ... The conditions should be:
Ideally, if we want this build to pass all the time, we should have separate accounts/clusters/instances for each destination, rather than using the same ones for all integration tests. But that's for an ideal world :) In our case we can drop all the schemas related to the normalization tests before the tests begin, using a dbt macro; I'll make a PR for this.
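The change proposed above is a dbt macro; purely as an illustration of the same pre-suite cleanup idea (not the actual PR), here is a JDBC sketch that enumerates schemas matching a normalization-test prefix and drops them. The prefix is an assumption, not the real naming scheme:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Illustrative pre-suite cleanup: find every schema whose name starts with the
// (assumed) normalization-test prefix and drop it, so old runs can't fill the disk.
public class NormalizationSchemaCleanup {

  public static void dropSchemasWithPrefix(final Connection connection, final String prefix) throws Exception {
    final List<String> schemas = new ArrayList<>();
    try (final Statement statement = connection.createStatement();
         final ResultSet resultSet = statement.executeQuery(
             "SELECT nspname FROM pg_namespace WHERE nspname LIKE '" + prefix + "%'")) {
      while (resultSet.next()) {
        schemas.add(resultSet.getString("nspname"));
      }
    }
    try (final Statement statement = connection.createStatement()) {
      for (final String schema : schemas) {
        statement.execute("DROP SCHEMA IF EXISTS \"" + schema + "\" CASCADE");
      }
    }
  }
}
```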
Hm. Do you know if there are any issues with going from a 1-node cluster to multi-node? AFAICT that's the easiest and cheapest way to increase disk space; I couldn't find an option to just give the node more disk. We could probably also try deleting stuff from the cluster to free up some space? Not sure how difficult that would be.
No idea at the moment, but sounds like a plan.
I can see only the ...
@edgao Also, I've created another clean DB: ... This should be enough for all tests running on the Redshift cluster.
Two PRs running normalization tests can still conflict with each other on Redshift.
Yes, therefore the normalization tests should be modified to use a suffix, something like ... Or, as far as I can see, we still need to clean up the target tables after each test, since the schemas for the Redshift tests are the same, but the target tables already have unique names (suffix applied).
The Redshift environment is looking good! https://github.com/airbytehq/airbyte/runs/6922267528?check_suite_focus=true ran successfully.
Environment
Integration tests currently fail on CI due to OOM.
Logs
2022-05-31 11:48:44 destination > Details: -----------------------------------------------
2022-05-31 11:48:44 destination > error: Out Of Memory:
2022-05-31 11:48:44 destination > code: 1004
2022-05-31 11:48:44 destination > context: alloc(9992,MtPlan)
2022-05-31 11:48:44 destination > query: 7120875
2022-05-31 11:48:44 destination > location: alloc.cpp:493
2022-05-31 11:48:44 destination > process: query0_118_7120875 [pid=27884]
Additional context
https://github.com/airbytehq/airbyte/runs/6669394169?check_suite_focus=true