
Data too large error from very large data products #133

Closed
jordanpadams opened this issue Sep 27, 2023 · 6 comments
Assignees
Labels
B14.1 bug Something isn't working duplicate This issue or pull request already exists s.medium

Comments

@jordanpadams
Member

jordanpadams commented Sep 27, 2023

Checked for duplicates

Yes - I've already checked

πŸ› Describe the bug

When I harvested a data set containing some very large data products, I got a "data too large" error and the data was not loaded into the Registry.

πŸ•΅οΈ Expected behavior

I expected the data to be loaded into the Registry nominally.

πŸ“œ To Reproduce

  1. Download TBD data product
  2. Attempt to harvest the product
  3. Note the error
[ERROR] LIDVID = urn:esa:psa:em16_tgo_acs:data_raw:acs_raw_hk_nir_20170907t000000-20170907t055959::3.0, 
Message = [parent] Data too large, data for [indices:data/write/bulk[s]] would be [16591820628/15.4gb], 
which exceeds the limit of [16287753830/15.1gb]. Current usage: [16591415264/15.4gb], new bytes reserved: [405364/395.8kb], 
usages [request=0/0b, fielddata=0/0b, in_flight_requests=405364/395.8kb, accounting=613644/599.2kb]
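A quick sanity check on the numbers in the error message (a sketch, assuming a node with roughly 16 GB of RAM as suggested later in this thread, and OpenSearch's default parent circuit breaker of 95% of the JVM heap):

```python
# Numbers copied from the error message above.
limit_bytes = 16_287_753_830    # "exceeds the limit of [16287753830/15.1gb]"
current_usage = 16_591_415_264  # "Current usage: [16591415264/15.4gb]"
heap_16gib = 16 * 1024**3       # 17,179,869,184 bytes

# The breaker limit is ~95% of a 16 GiB heap, consistent with the default
# parent circuit breaker rather than a per-request size cap.
print(f"limit as fraction of a 16 GiB heap: {limit_bytes / heap_16gib:.3f}")
print(f"usage over the limit by: {current_usage - limit_bytes:,} bytes")
```

This supports the diagnosis below: heap pressure on the node, not an oversized individual product.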

πŸ–₯ Environment Info

Linux

πŸ“š Version of Software Used

3.7.6

🩺 Test Data / Additional context

TBD

πŸ¦„ Related requirements

No response

βš™οΈ Engineering Details

No response

@alexdunnjpl
Contributor

@jordanpadams what's the best way to get a copy of the label for this product?

@jordanpadams
Member Author

@alexdunnjpl a ping is out to the user.

@alexdunnjpl
Contributor

@jordanpadams looking deeper into this error, it appears to be due to imminent exhaustion of the JVM heap on OpenSearch, rather than any one request/product being too large. (Presumably RAM allocation is currently 16GB on that node)

The fix here is to bump up the instance size to cope with peak throughput, and/or incorporate pause/retry behaviour in harvest.

Closing as a duplicate of #125 on that basis, since the fix for that is a fix for this.
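The pause/retry behaviour suggested above could look something like the following. This is a minimal sketch, not the actual Harvest implementation: `bulk_index` is a hypothetical callable standing in for the bulk-write call, and the exception handling assumes the caller raises on the circuit-breaker / HTTP 429 response OpenSearch returns when the parent breaker trips.

```python
import time

def bulk_with_backoff(bulk_index, batch, max_retries=5, base_delay=2.0):
    """Retry a bulk write with exponential backoff when the cluster
    reports memory pressure (circuit_breaking_exception / HTTP 429)."""
    for attempt in range(max_retries):
        try:
            return bulk_index(batch)
        except RuntimeError:  # stand-in for a breaker/429 error type
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            # Back off 2s, 4s, 8s, ... to let the heap drain before retrying.
            time.sleep(base_delay * 2 ** attempt)
```

Backing off gives the node time to finish in-flight bulk requests and release heap, which is usually enough when the breaker trips only at peak throughput.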

@github-project-automation github-project-automation bot moved this from Release Backlog to 🏁 Done in B14.1 Oct 3, 2023
@jordanpadams jordanpadams added the duplicate This issue or pull request already exists label Oct 4, 2023
@jordanpadams
Member Author

@alexdunnjpl nice sleuthing. πŸŽ‰

@alexdunnjpl
Contributor

alexdunnjpl commented Oct 4, 2023

@sjoshi-jpl I see that psa is currently r5.4xlarge.search (128GB RAM) - did this get bumped up from r5.xlarge.search (16GB RAM) at some point recently?

@sjoshi-jpl

@alexdunnjpl yes, this was recently bumped up after our last conversation with @jordanpadams and @tloubrieu-jpl, where we discussed how PSA could be as large / resource-intensive as GEO.

3 participants