"Task exception was never retrieved" in ChainlitDataLayer.update_step() #1232

Closed
oshoma opened this issue Aug 18, 2024 · 7 comments · Fixed by #1248
Labels: backend (pertains to the Python backend), data layer (pertains to data layers), needs-triage

Comments

@oshoma
Contributor

oshoma commented Aug 18, 2024

Describe the bug
When running my Chainlit app I sometimes see a series of exception messages and tracebacks in the log that begin with "Task exception was never retrieved". As best I can tell, these come from unhandled HTTP exceptions such as httpx.ConnectTimeout, httpx.RemoteProtocolError ("Server disconnected without sending a response.") and the like. I believe the source is update_step() in chainlit/data/__init__.py, which calls create_step() and encounters an HTTP error that nothing catches.

To Reproduce

  1. Be on a WiFi network with an unstable network connection. (In my case, I am sitting 30 feet away from the nearest access point, so I get occasional connection errors.)
  2. Run the Chainlit app.
  3. Watch the log for exception messages and stack traces beginning with the phrase "Task exception was never retrieved".

Expected behavior
The app should catch HTTP exceptions and either retry or fail gracefully.

The app should produce debug log statements to indicate the exceptions are happening.

Desktop:

  • OS: macOS 12.7
  • Browser: Chrome Version 127.0.6533.120 (Official Build) (x86_64)

Additional context
Sample stack trace:

2024-08-18 12:05:10 - Task exception was never retrieved
future: <Task finished name='Task-94' coro=<ChainlitDataLayer.update_step() done, defined at /Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/chainlit/data/__init__.py:45> exception=ConnectTimeout('')>
Traceback (most recent call last):
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
    yield
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 373, in handle_async_request
    resp = await self._pool.handle_async_request(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 216, in handle_async_request
    raise exc from None
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 196, in handle_async_request
    response = await connection.handle_async_request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpcore/_async/connection.py", line 99, in handle_async_request
    raise exc
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpcore/_async/connection.py", line 76, in handle_async_request
    stream = await self._connect(request)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpcore/_async/connection.py", line 122, in _connect
    stream = await self._network_backend.connect_tcp(**kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpcore/_backends/auto.py", line 30, in connect_tcp
    return await self._backend.connect_tcp(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpcore/_backends/anyio.py", line 114, in connect_tcp
    with map_exceptions(exc_map):
  File "/Users/oshoma/miniconda3/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectTimeout

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/oshoma/miniconda3/lib/python3.11/asyncio/tasks.py", line 269, in __step
    result = coro.throw(exc)
             ^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/chainlit/data/__init__.py", line 60, in wrapper
    return await method(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/chainlit/data/__init__.py", line 401, in update_step
    await self.create_step(step_dict)
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/chainlit/data/__init__.py", line 60, in wrapper
    return await method(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/chainlit/data/__init__.py", line 389, in create_step
    await self.client.api.send_steps([step])
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/literalai/api/__init__.py", line 2181, in send_steps
    return await self.gql_helper(*send_steps_helper(steps=steps))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/literalai/api/__init__.py", line 1464, in gql_helper
    response = await self.make_gql_call(description, query, variables)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/literalai/api/__init__.py", line 1376, in make_gql_call
    response = await client.post(
               ^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpx/_client.py", line 1892, in post
    return await self.request(
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpx/_client.py", line 1574, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpx/_client.py", line 1661, in send
    response = await self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpx/_client.py", line 1689, in _send_handling_auth
    response = await self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpx/_client.py", line 1726, in _send_handling_redirects
    response = await self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpx/_client.py", line 1763, in _send_single_request
    response = await transport.handle_async_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 372, in handle_async_request
    with map_httpcore_exceptions():
  File "/Users/oshoma/miniconda3/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/oshoma/dev/.../.venv/lib/python3.11/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectTimeout

...and this continues with repeated tracebacks like the one above...
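
For readers unfamiliar with the message: asyncio reports "Task exception was never retrieved" when a task that finished with an exception is garbage collected without anything having awaited it or read its exception. The sketch below reproduces that with the standard library only. The names `update_step`, `fire_and_forget` and `run_demo` are illustrative stand-ins, not Chainlit code, and the `TimeoutError` merely simulates the network failure.

```python
import asyncio
import gc


async def update_step():
    # Stand-in for a data-layer call that fails with a network error.
    raise TimeoutError("simulated connect timeout")


async def fire_and_forget():
    # The task is scheduled, but nothing awaits it or reads its exception.
    task = asyncio.create_task(update_step())
    await asyncio.sleep(0)  # yield once so the task can run and fail
    del task  # drop the only strong reference to the failed task


def run_demo():
    messages = []
    loop = asyncio.new_event_loop()
    # Capture what asyncio would otherwise write to its logger.
    loop.set_exception_handler(lambda _loop, ctx: messages.append(ctx["message"]))
    try:
        loop.run_until_complete(fire_and_forget())
        gc.collect()  # make sure the abandoned task has been finalized
    finally:
        loop.close()
    return messages


if __name__ == "__main__":
    print(run_demo())
```

Awaiting the task, or catching the exception inside the coroutine it runs, prevents the report.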
@oshoma
Contributor Author

oshoma commented Aug 18, 2024

I'll produce a PR with a candidate fix. I'm wondering whether this is best handled in the chainlit library or in the literalai client.

@dokterbob
Collaborator

dokterbob commented Aug 19, 2024

@oshoma Thanks! Looking forward to that!

It seems to me that the cleanest developer experience with a library is one where it only produces well-defined exceptions, so underlying errors such as I/O failures should be wrapped. Retry logic can then be implemented at the library level as well.

So, to answer your question: yes, the LiteralAI client library seems to be the right place.

When wrapping, take care to distinguish:

  1. (Possibly) intermittent errors (server errors, too many requests etc.)
  2. Permanent errors (bad request etc.)

Ref: https://roman.pt/posts/temporary-vs-permanent-errors/

This is relevant not only for retry logic but also for UX: there is no point in the user retrying if they will just trigger the same bug again. It is better to show them that there is a bug and that what they tried won't work until it's fixed.
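
A minimal sketch of that distinction, using only the standard library. Everything here is hypothetical: `TransientAPIError`, `PermanentAPIError`, `classify`, `call_with_retry` and the set of retryable status codes are illustrative assumptions, not part of the LiteralAI client or Chainlit.

```python
import asyncio


class TransientAPIError(Exception):
    """Hypothetical wrapper: the call may succeed if retried."""


class PermanentAPIError(Exception):
    """Hypothetical wrapper: retrying the same request will not help."""


# Assumed classification; a real client would decide this per API.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def classify(status_code):
    """Map an HTTP error status code to one of the two wrapper exceptions."""
    if status_code in RETRYABLE_STATUS:
        return TransientAPIError(f"HTTP {status_code}")
    return PermanentAPIError(f"HTTP {status_code}")


async def call_with_retry(send, attempts=3, base_delay=0.5):
    """Await send() and retry only when the failure looks transient.

    `send` is any zero-argument coroutine function that returns an HTTP
    status code, or raises ConnectionError/TimeoutError on network failure.
    """
    for attempt in range(1, attempts + 1):
        try:
            status = await send()
            if status < 400:
                return status
            raise classify(status)  # permanent errors propagate immediately
        except (TransientAPIError, ConnectionError, TimeoutError) as exc:
            if attempt == attempts:
                raise TransientAPIError(f"gave up after {attempts} attempts") from exc
            # Exponential backoff between attempts.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

With this shape, callers only need to handle two exception types, and a `PermanentAPIError` can be surfaced to the user straight away instead of being retried.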

@dokterbob dokterbob added data layer Pertains to data layers. backend Pertains to the Python backend. labels Aug 19, 2024
@oshoma
Contributor Author

oshoma commented Aug 20, 2024

@dokterbob Yes, I agree it would be better to make the changes in the LiteralAI library rather than in Chainlit. Is LiteralAI development happening in a git repo somewhere that I can contribute to? I see I can download files from https://pypi.org/project/literalai/#files but there is no link to a repo.

@oshoma
Contributor Author

oshoma commented Aug 20, 2024

@dokterbob Looking further at AsyncLiteralAPI in LiteralAI's api/__init__.py, I see that the code there tries the HTTP POST with a timeout and then re-raises any error it encounters. I assume the strategy is that API-calling clients must know about and handle those errors themselves, e.g. by retrying, ignoring, or whatever makes most sense for the particular client.

I don't want to disturb that strategy as doing so could break API clients.

So I'm going to submit a PR for Chainlit instead of a patch for LiteralAI.

oshoma added a commit to oshoma/chainlit that referenced this issue Aug 20, 2024
Fixes Chainlit#1232

Modify Chainlit so that HTTP errors which occur while sending steps
to LiteralAI are caught and logged.

Prior to this change, HTTP errors such as timeouts result in a series
of cascading exceptions and tracebacks that begin with "Task exception
was never retrieved" and continue for several traceback iterations.
The result is a very verbose log which is hard to understand.

With this change we will see one-line error messages in the log
rather than a series of tracebacks.

In the future we might want to improve this further by retrying
(sending the steps again) when the HTTP error is temporary.
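
The catch-and-log idea described in this commit message can be sketched roughly as below. This is not the code merged in #1248; `log_and_swallow`, `FakeDataLayer` and the logger name are hypothetical, and the simulated `TimeoutError` stands in for an HTTP error from the client.

```python
import asyncio
import functools
import logging

logger = logging.getLogger("data_layer_sketch")


def log_and_swallow(method):
    """Wrap an async method so that a failure becomes a single log line."""

    @functools.wraps(method)
    async def wrapper(*args, **kwargs):
        try:
            return await method(*args, **kwargs)
        except Exception as exc:  # a real fix would likely catch narrower HTTP error types
            logger.error("%s failed: %r", method.__name__, exc)
            return None

    return wrapper


class FakeDataLayer:
    @log_and_swallow
    async def create_step(self, step):
        # Simulate the network failure from the traceback above.
        raise TimeoutError("simulated connect timeout")


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    asyncio.run(FakeDataLayer().create_step({"id": "step-1"}))
```

Because the exception is handled inside the coroutine, a task running it finishes normally and asyncio has nothing to report as "never retrieved". The trade-off is that the failed step is dropped unless retry logic is added.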
@dokterbob
Collaborator

I think it might be wise to take this up with LiteralAI. @willydouhard What's your stance?

@dokterbob
Collaborator

I've just discussed it with @willydouhard, LiteralAI's CTO:

  1. He agrees that client layers should abstract away lower-level protocol errors, but wants to be cautious not to break older clients.
  2. On that note, Chainlit is currently using an older version of LiteralAI's client, so an upgrade requires some mild refactoring.
  3. The LiteralAI repo is here: https://github.com/Chainlit/literalai-python
  4. Until that upgrade happens, your patch, Gracefully handle HTTP errors when sending steps #1248, seems good enough. I'll still ask Willy for a quick review (while I'm getting more familiar with the code base). If that takes too long (he's quite occupied), I'll merge some time next week.

@dokterbob
Collaborator

Closing this as I feel your PR addresses the issue sufficiently (for now).

dokterbob pushed a commit that referenced this issue Aug 22, 2024
Fixes #1232