Enhance provenance sweeper with more graceful failure when registry contains zero documents #25

Closed
alexdunnjpl opened this issue Jun 15, 2023 · 10 comments
Assignees
Labels
B14.0, B14.1, i&t.skip (Skip I&T of this task/ticket), invalid (This doesn't seem right), task

Comments

@alexdunnjpl
Contributor

alexdunnjpl commented Jun 15, 2023

Update to safeguard against DivideByZero
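The ticket does not say where the division happens, so the following is only a minimal sketch of the kind of guard meant here, assuming the zero comes from a document count used as a denominator (for example in a progress figure). `percent_complete` is a hypothetical helper, not a function from registry-sweepers.

```python
def percent_complete(processed: int, total: int) -> float:
    """Return progress as a percentage.

    Hypothetical helper: an empty registry (total == 0) is treated as
    fully processed instead of raising ZeroDivisionError.
    """
    if total == 0:
        # Nothing to sweep, so there is nothing left to do.
        return 100.0
    return 100.0 * processed / total
```

Whether an empty registry should report 100% or be logged and skipped is a design choice; the point is only that the zero case is handled explicitly before dividing.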

@alexdunnjpl alexdunnjpl added B14.0 i&t.skip Skip I&T of this task/ticket task labels Jun 15, 2023
@jordanpadams jordanpadams changed the title Provenance fails ungracefully when registry contains zero documents Enhance provenance sweeper with more graceful failure when registry contains zero documents Nov 6, 2023
@github-project-automation github-project-automation bot moved this to Release Backlog in B14.1 Nov 6, 2023
@al-niessner
Contributor

@alexdunnjpl @jordanpadams

Is there a stack trace or something to go with this? I am not seeing any division problems.

@alexdunnjpl
Contributor Author

@al-niessner if you've run sweepers against a completely-empty registry and aren't able to replicate, possibly this was fixed and the issue not closed, or fixed as a side-effect of something else, in which case this issue can be closed without further action.

@al-niessner
Contributor

@alexdunnjpl @jordanpadams

On a completely empty OpenSearch (no registry index), I get a bunch of SSL warnings because of the self-signed cert (who cares, it is local testing) and then a good error telling you that there is no registry:

$ PYTHONPATH=/home/niessner/Projects/PDS/registry-sweepers/src python3 src/pds/registrysweepers/provenance/__init__.py -b https://localhost:9200 -p admin -u admin --insecure
/home/niessner/.venv/pds/lib/python3.10/site-packages/opensearchpy/connection/http_urllib3.py:199: UserWarning: Connecting to https://localhost:9200 using SSL with verify_certs=False is insecure.
  warnings.warn(
/home/niessner/.venv/pds/lib/python3.10/site-packages/urllib3/connectionpool.py:1095: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
/home/niessner/.venv/pds/lib/python3.10/site-packages/urllib3/connectionpool.py:1095: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
/home/niessner/.venv/pds/lib/python3.10/site-packages/urllib3/connectionpool.py:1095: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
/home/niessner/.venv/pds/lib/python3.10/site-packages/urllib3/connectionpool.py:1095: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
/home/niessner/.venv/pds/lib/python3.10/site-packages/urllib3/connectionpool.py:1095: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
/home/niessner/.venv/pds/lib/python3.10/site-packages/urllib3/connectionpool.py:1095: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
Traceback (most recent call last):
  File "/home/niessner/Projects/PDS/registry-sweepers/src/pds/registrysweepers/provenance/__init__.py", line 149, in <module>
    run(
  File "/home/niessner/Projects/PDS/registry-sweepers/src/pds/registrysweepers/provenance/__init__.py", line 75, in run
    successors = get_successors_by_lidvid(extant_lidvids)
  File "/home/niessner/Projects/PDS/registry-sweepers/src/pds/registrysweepers/provenance/__init__.py", line 90, in get_successors_by_lidvid
    extant_lidvids = list(extant_lidvids)  # ensure against consumable iterator
  File "/home/niessner/Projects/PDS/registry-sweepers/src/pds/registrysweepers/utils/db/__init__.py", line 67, in query_registry_db
    results = retry_call(
  File "/home/niessner/.venv/pds/lib/python3.10/site-packages/retry/api.py", line 101, in retry_call
    return __retry_internal(partial(f, *args, **kwargs), exceptions, tries, delay, max_delay, backoff, jitter, logger)
  File "/home/niessner/.venv/pds/lib/python3.10/site-packages/retry/api.py", line 33, in __retry_internal
    return f()
  File "/home/niessner/Projects/PDS/registry-sweepers/src/pds/registrysweepers/utils/db/__init__.py", line 52, in fetch_func
    return client.search(
  File "/home/niessner/.venv/pds/lib/python3.10/site-packages/opensearchpy/client/utils.py", line 179, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/home/niessner/.venv/pds/lib/python3.10/site-packages/opensearchpy/client/__init__.py", line 1553, in search
    return self.transport.perform_request(
  File "/home/niessner/.venv/pds/lib/python3.10/site-packages/opensearchpy/transport.py", line 409, in perform_request
    raise e
  File "/home/niessner/.venv/pds/lib/python3.10/site-packages/opensearchpy/transport.py", line 370, in perform_request
    status, headers_response, data = connection.perform_request(
  File "/home/niessner/.venv/pds/lib/python3.10/site-packages/opensearchpy/connection/http_urllib3.py", line 266, in perform_request
    self._raise_error(
  File "/home/niessner/.venv/pds/lib/python3.10/site-packages/opensearchpy/connection/base.py", line 301, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
opensearchpy.exceptions.NotFoundError: NotFoundError(404, 'index_not_found_exception', 'no such index [registry]', registry, index_or_alias)

Did we want to make that error more ambiguous somehow?
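If a friendlier failure were wanted for the missing-index case, one option is to catch the exception at the entry point and log a single line instead of a traceback. This is only a sketch under assumptions: `run_sweeper_gracefully` and its `sweeper` argument are hypothetical names, not part of registry-sweepers, and the only real API used is `opensearchpy.exceptions.NotFoundError`, the exception type shown in the traceback above.

```python
import logging

try:
    from opensearchpy.exceptions import NotFoundError
except ImportError:
    # Stand-in so the sketch also runs where opensearch-py is not installed.
    class NotFoundError(Exception):
        pass

log = logging.getLogger(__name__)


def run_sweeper_gracefully(sweeper) -> int:
    """Call a zero-argument sweeper; return 0 on success, 1 if the index is missing.

    Hypothetical wrapper: any other exception still propagates unchanged.
    """
    try:
        sweeper()
    except NotFoundError as err:
        log.error("Registry index not found; has the registry been created? (%s)", err)
        return 1
    return 0
```

The return value could be passed to `sys.exit` by the caller, so that a missing registry gives a non-zero exit code and one log line.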

@al-niessner
Contributor

Then with an opensearch that has no data (created with registry/docker docker compose --profile=dev-api up):

$ PYTHONPATH=/home/niessner/Projects/PDS/registry-sweepers/src python3 src/pds/registrysweepers/provenance/__init__.py -b https://localhost:9200 -p admin -u admin --insecure
/home/niessner/.venv/pds/lib/python3.10/site-packages/opensearchpy/connection/http_urllib3.py:199: UserWarning: Connecting to https://localhost:9200 using SSL with verify_certs=False is insecure.
  warnings.warn(
/home/niessner/.venv/pds/lib/python3.10/site-packages/urllib3/connectionpool.py:1095: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
/home/niessner/.venv/pds/lib/python3.10/site-packages/urllib3/connectionpool.py:1095: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
/home/niessner/.venv/pds/lib/python3.10/site-packages/urllib3/connectionpool.py:1095: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
/home/niessner/.venv/pds/lib/python3.10/site-packages/urllib3/connectionpool.py:1095: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
/home/niessner/.venv/pds/lib/python3.10/site-packages/urllib3/connectionpool.py:1095: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
/home/niessner/.venv/pds/lib/python3.10/site-packages/urllib3/connectionpool.py:1095: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
/home/niessner/.venv/pds/lib/python3.10/site-packages/urllib3/connectionpool.py:1095: InsecureRequestWarning: Unverified HTTPS request is being made to host 'localhost'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
(pds) niessner@elysium:~/Projects/PDS/registry-sweepers$ 

Same unimportant self-signed warnings but no other errors. If this is sufficient for you two, then I will let you close it without further ado.

@alexdunnjpl
Contributor Author

@al-niessner looks like you're just running provenance there, not the entire set of sweepers?

Should be this'n: https://github.com/NASA-PDS/registry-sweepers/blob/main/docker/sweepers_driver.py

@al-niessner
Contributor

@alexdunnjpl

Yes, because the subject specifically says provenance sweeper. No need to run/test others.

@alexdunnjpl
Contributor Author

@al-niessner my mistake - I'm so used to mentally translating between the two because people still refer to the sweepers suite as "provenance". Probably I wasn't doing that in the ticket title, but I can't be absolutely certain, so it may be best to run the full suite for completeness' sake given that it's not replicable via just the provenance script. Your call though.

@al-niessner
Contributor

@alexdunnjpl

Not my call. I am just working on what was stated in the ticket. If the requirements need to be changed, then change them (fix the subject). You can also just state that the subject is wrong and they all need to be done ("may be best for completeness' sake" is not stating it). I am used to requirements changing, but it costs money when they change, so whoever changes them has to take clear responsibility for the added costs.

@alexdunnjpl
Contributor Author

@al-niessner this ticket was basically a quick note-to-self from a few months ago that I'd forgotten the empty-registry corner case, so that I'd remember to loop back to it (hence the flippant original text). Something something good intentions...

Because of that, I can only speculate whether I made a mistake in the initial ticket subject. I'll take a look now and see whether I can replicate it.

@alexdunnjpl
Contributor Author

I couldn't replicate it with the full-suite driver, so safe to say it's no longer a valid issue.

@github-project-automation github-project-automation bot moved this from Release Backlog to 🏁 Done in B14.1 Nov 8, 2023
@jordanpadams jordanpadams added the invalid This doesn't seem right label Nov 9, 2023
Projects
No open projects
Status: 🏁 Done
Development

No branches or pull requests

3 participants