Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backoff.expo v2.0.0 breaking change #2829

Closed
jamesboehmer opened this issue Jul 18, 2022 · 9 comments · Fixed by #2980
Closed

backoff.expo v2.0.0 breaking change #2829

jamesboehmer opened this issue Jul 18, 2022 · 9 comments · Fixed by #2980
Labels
bug Something isn't working

Comments

@jamesboehmer
Copy link

Describe your environment
pyproject.toml dependencies installed via poetry:

python = "3.8.11"
opentelemetry-instrumentation = "^0.32b0"
opentelemetry-instrumentation-fastapi = "^0.32b0"
opentelemetry-instrumentation-sqlalchemy = "^0.32b0"
opentelemetry-instrumentation-requests = "^0.32b0"
opentelemetry-instrumentation-httpx = "^0.32b0"
opentelemetry-instrumentation-logging = "^0.32b0"
opentelemetry-distro = {extras = ["otlp"], version = "^0.32b0"}
opentelemetry-instrumentation-aiohttp-client = "^0.32b0"
opentelemetry-api = "^1.12.0rc2"

Steps to reproduce

from fastapi import FastAPI, HTTPException, status
import uvicorn
from opentelemetry import metrics, trace
from opentelemetry.sdk.resources import SERVICE_NAME, SERVICE_VERSION, Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader

app = FastAPI(
    title="OTLP backoff dependency test",
    description="OTLP backoff dependency test",
    version="0.0.1",
)

@app.get("/", responses={200: {"description": "ok"}}, include_in_schema=False)
async def root() -> None:
    """Root"""
    return status.HTTP_200_OK, "ok"

resource = Resource(
    attributes={
        SERVICE_NAME: "otlp-backoff-test",
        SERVICE_VERSION: "0.0.1",
    }
)

tracer_provider = TracerProvider(resource=resource)

processor = BatchSpanProcessor(
    OTLPSpanExporter(endpoint="http://localhost:4317/v1/traces"),
)
tracer_provider.add_span_processor(processor)
trace.set_tracer_provider(tracer_provider)

metric_exporter = OTLPMetricExporter(
    endpoint="http://localhost:4317/v1/metrics"
)

meter_reader = PeriodicExportingMetricReader(
    exporter=metric_exporter,
    export_interval_millis=5000,
)

meter_provider = MeterProvider(
    resource=resource,
    metric_readers=[meter_reader],
)
metrics.set_meter_provider(meter_provider)

def main():

    uvicorn.run(
        "main:app",
    )

if __name__ == "__main__":
    main()

What is the expected behavior?
Periodic backoff behavior in the logs

Transient error StatusCode.UNAVAILABLE encountered while exporting span batch, retrying in 1s.
Transient error StatusCode.UNAVAILABLE encountered while exporting span batch, retrying in 2s.
Transient error StatusCode.UNAVAILABLE encountered while exporting span batch, retrying in 4s.
Transient error StatusCode.UNAVAILABLE encountered while exporting span batch, retrying in 8s.
Transient error StatusCode.UNAVAILABLE encountered while exporting span batch, retrying in 16s.
Transient error StatusCode.UNAVAILABLE encountered while exporting span batch, retrying in 32s.

What is the actual behavior?
After each transient error reported in the log, an exception is raised

Traceback (most recent call last):
  File "/venv/lib/python3.8/site-packages/opentelemetry/sdk/metrics/_internal/export/__init__.py", line 412, in _receive_metrics
    self._exporter.export(metrics_data, timeout_millis=timeout_millis)
  File "/venv/lib/python3.8/site-packages/opentelemetry/exporter/otlp/proto/grpc/metric_exporter/__init__.py", line 205, in export
    return self._export(metrics_data)
  File "/venv/lib/python3.8/site-packages/opentelemetry/exporter/otlp/proto/grpc/exporter.py", line 328, in _export
    sleep(delay)
TypeError: an integer is required (got type NoneType)

Additional context
I believe the problem is related to changed behavior in backoff 2.0.0. See backoff/_wait_gen.py at https://github.com/litl/backoff/compare/v1.11.1..v2.0.0 . The expo() function now yields once because it's called elsewhere as a coroutine, which returns a None where opentelemetry expects an int. It's not clear to me that expo() is meant for direct usage given that backoff docs show decorators, and the function is in a dunder module. That said, the function is also exported in backoff's __all__, so perhaps they meant to make a breaking change?

Given opentelemetry's wide version range for the dependency (

backoff >= 1.10.0, < 2.0.0; python_version<'3.7'
), recent installs will include this newer version of backoff with the breaking change in functionality.

I was able to fix it in my app by pinning backoff = "1.11.1"

@jamesboehmer jamesboehmer added the bug Something isn't working label Jul 18, 2022
@Yamakaky
Copy link

Is no one else having this issue? Without the version pin, this completely breaks the grpc for me. As a quick fix, is there a downside to changing the version requirement to < 2.0.0?

@adriangb
Copy link
Contributor

I am also experiencing this. I just pinned the version and haven't had any issues, but it's not great. I think this library should either handle all of the versions specified or narrow it's version constraints.

Yamakaky added a commit to Yamakaky/opentelemetry-python that referenced this issue Aug 31, 2022
Yamakaky added a commit to Yamakaky/opentelemetry-python that referenced this issue Aug 31, 2022
Yamakaky added a commit to Yamakaky/opentelemetry-python that referenced this issue Sep 1, 2022
Yamakaky added a commit to Yamakaky/opentelemetry-python that referenced this issue Sep 1, 2022
@agrapsas
Copy link

agrapsas commented Sep 6, 2022

Just adding that this is also a show stopper for me.

@aabmass
Copy link
Member

aabmass commented Sep 8, 2022

Are folks ok with changing the dependency to backoff ~= 2.0 (major version 2 compat)? Based on https://pypi.org/project/backoff/#history, 1.X release of backoff is from July 2021. Anyone still stuck on 1.X for any reason?

@jamesboehmer
Copy link
Author

@aabmass My case was simple with no dependencies on backoff at all, so pinning to 1.11.1 will be easy to back out, and I'd be supportive of a refactor to make use of ~2.0.

For other folks in this thread - See the comment on @Yamakaky 's PR.

@adriangb
Copy link
Contributor

adriangb commented Sep 9, 2022

Same here: I pinned 1.11.1 but would be in favor of supporting only ~=2.0, I'll just have to relook my deps which is pretty easy.

@jstlaurent
Copy link

I would prefer ~=2.0, since I have other dependencies that need it (schemathesis=~3.17). As with the others, I pinned 1.11.1 to fix the issue.

@cgspeck-aai
Copy link

Adding my +1 to fixing the library. This is definitely a widespread problem and is affecting me too.

@nickstenning
Copy link
Contributor

nickstenning commented Oct 18, 2022

The problem here is the direct use of the generator returned by backoff.expo without an initialising .send(None) as done by the code in backoff's "porcelain" APIs (backoff.on_exception and backoff.on_predicate).

This behavior changed in backoff==2.0.0 and the exporter retries are fundamentally broken with backoff>=2.0.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
8 participants