Limit processing of queued files to 500 per job #8

Open
observingClouds opened this issue Dec 7, 2022 · 7 comments
Comments

@observingClouds
Owner

observingClouds commented Dec 7, 2022

On the compute, shared and interactive partitions, slk retrieve is allowed to retrieve 500 files at once. Thus, if more than 500 files are requested here, the request should be split into several retrievals. @antarcticrainforest Or would you split the file list into chunks of fewer than 501 files anyway before calling this function? I'll try to add this feature ("group_files_by_tape") to the slk_helpers as soon as possible. But currently, I am mainly bound to slk testing and user support. So, let's see ;-)

Originally posted by @neumannd in #3 (comment)
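
A minimal sketch of such a split, assuming a hypothetical retrieve callable that wraps the actual slk retrieve invocation and a per-call limit of 500 files:

from typing import Callable, Sequence

RETRIEVE_FILE_LIMIT = 500  # per-call limit on compute/interactive/shared partitions

def retrieve_in_batches(
    files: Sequence[str],
    retrieve: Callable[[Sequence[str]], None],
    limit: int = RETRIEVE_FILE_LIMIT,
) -> None:
    """Issue one retrieval per batch of at most `limit` files."""
    for start in range(0, len(files), limit):
        retrieve(files[start:start + limit])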

@observingClouds
Owner Author

On the login nodes we allow slk retrieve to retrieve only one file per call. There is a StrongLink config file, /etc/stronglink.conf, which is JSON and contains an attribute "retrieve_file_limit": 1 (on login nodes) or "retrieve_file_limit": 500 (on other nodes). This file could be read to find out how many files are allowed to be retrieved per call. This number may be changed in the future if needed.

Originally posted by @neumannd in #3 (comment)

@florianziemen

I think that in the case of a retrieval of more than 500 files (or maybe rather more than 10/... tapes) we should assume the user has made a mistake, cancel the whole request, and throw an error. Otherwise we run into the problem that users might accidentally trigger loading half the HSM into the cache...
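
A minimal sketch of such a guard; the cutoff and the function name are placeholders, and whether to count files or tapes remains open:

MAX_FILES_PER_REQUEST = 500  # hypothetical cutoff, not an slk setting

def check_request_size(files: list) -> None:
    """Cancel oversized requests with an error instead of silently splitting them."""
    if len(files) > MAX_FILES_PER_REQUEST:
        raise ValueError(
            f"Refusing to retrieve {len(files)} files "
            f"(limit: {MAX_FILES_PER_REQUEST}); this looks like a mistake."
        )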

@observingClouds
Owner Author

I see that 500 files can be a massive request and may well be a mistake. However, I would argue that this limitation should be enforced at the lowest level, i.e. in slk or at least pyslk. This would ensure that the behaviour is the same across all access methods and that slkspec remains more general, so that it could also be used for a tape archive at a different institution that may have different resources. Instead of limiting by number of files, one could also think of restricting a retrieval by total size.
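
For illustration, a size-based restriction might look like the following sketch; the byte limit is an assumption, and file_sizes is a hypothetical {path: size_in_bytes} mapping whose values would have to be queried through slk/pyslk:

MAX_RETRIEVAL_BYTES = 10 * 1024**4  # assumed cutoff of 10 TiB, for illustration only

def check_total_size(file_sizes: dict) -> None:
    """Reject a retrieval whose total size exceeds the limit."""
    total = sum(file_sizes.values())
    if total > MAX_RETRIEVAL_BYTES:
        raise ValueError(
            f"Requested {total / 1024**4:.2f} TiB exceeds the "
            f"{MAX_RETRIEVAL_BYTES / 1024**4:.2f} TiB retrieval limit."
        )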

@florianziemen

Yeah, just saying that we should not try to bypass such limitations, because they are there for a reason.

@observingClouds
Owner Author

observingClouds commented Feb 9, 2023

I see where you are coming from. Retrievals are now for the most part combined into a single slk retrieve call. If slk has limitations in place, these will also apply to slkspec retrievals.

@neumannd
Collaborator

neumannd commented Feb 9, 2023

> Yeah, just saying that we should not try to bypass such limitations, because they are there for a reason.

Yes. We feared slk retrieve -R /arch . ;-)

It would be safest to read the retrieve_file_limit from this /etc/stronglink.conf file.

@neumannd
Collaborator

@observingClouds It would be reasonable to read /etc/stronglink.conf (example content):

{"host":"archive.dkrz.de","domain":"ldap","logSize":"10MB","retrieve_file_limit":500}

Then extract the value of retrieve_file_limit. Currently, it is 1 on levante login nodes and 500 on levante compute/interactive/shared nodes. This limit might be changed in the future or differ on individual nodes (e.g. a "mass-data-retrieval node" where it is set to 5000).

import json
import os

slk_conf_global = "/etc/stronglink.conf"

# `-1` == no limit
retrieve_file_limit = -1
if os.path.exists(slk_conf_global):
    with open(slk_conf_global, "r") as f:
        data = json.load(f)
    retrieve_file_limit = data.get("retrieve_file_limit", -1)
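
The parsed limit could then feed a split-or-submit decision. A minimal sketch, where files and submit_retrieval are hypothetical placeholders for the request at hand and the actual retrieval call:

if retrieve_file_limit == -1 or len(files) <= retrieve_file_limit:
    # within the limit (or no limit): a single call suffices
    submit_retrieval(files)
else:
    # split into chunks of at most `retrieve_file_limit` files
    for i in range(0, len(files), retrieve_file_limit):
        submit_retrieval(files[i:i + retrieve_file_limit])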
