Investigate replacing feature flag service #1343
Comments
From some research, I think our best bet might be flagd. It's maintained by OpenFeature, and it seems to be fairly lightweight and straightforward for our purposes. Even better, it's OpenTelemetry compatible, so we can get independent instrumentation out of it. It supports more evaluation styles than we currently do as well (such as targeting, multi-variable experiments, etc.) and can be configured with JSON files rather than requiring a separate database. |
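To make the "configured with JSON files" point concrete, here is a minimal sketch of a flag definition in the shape flagd reads (`state`, `variants`, `defaultVariant`), together with a tiny stand-in resolver. The flag name `recommendationCache` is borrowed from later in this thread; `resolve_boolean` is a hypothetical helper written for illustration, not flagd's or OpenFeature's API, and it ignores targeting rules entirely.

```python
import json

# Flag definition in the JSON shape flagd reads (state / variants / defaultVariant).
# The flag name comes from later in this thread; the values are illustrative.
FLAGS_JSON = """
{
  "flags": {
    "recommendationCache": {
      "state": "ENABLED",
      "variants": {"on": true, "off": false},
      "defaultVariant": "off"
    }
  }
}
"""


def resolve_boolean(flags: dict, key: str, fallback: bool) -> bool:
    """Hypothetical stand-in for a provider: return the default variant's value.

    Targeting rules are ignored; a missing or disabled flag yields the fallback.
    """
    flag = flags.get("flags", {}).get(key)
    if flag is None or flag.get("state") != "ENABLED":
        return fallback
    return bool(flag["variants"][flag["defaultVariant"]])


flags = json.loads(FLAGS_JSON)
cache_enabled = resolve_boolean(flags, "recommendationCache", False)
print(cache_enabled)
```

In a real service the lookup would go through an OpenFeature client backed by a flagd provider; the point here is only that flipping `defaultVariant` in a file replaces a database update.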
I vote we replace the product catalog service with an Erlang/Elixir equivalent. This service is part of the minimal install and is one of 3 Go-based services in the demo. |
That should be vaguely easier to maintain since there's no UI component at least? |
ProductCatalog sounds like a good candidate to replace. I'm just worried about when we start adding Swift and Android to the demo. |
Went ahead and made a PR swapping things out so we can get an idea of what this all would look like. There's a few gaps (no Python provider for flagd, which I'll have to write or find) but overall it seems ok. ed: looks like there is a flagd provider for python, it's just not released yet. hopefully they'll get that sorted soon. |
@julianocosta89 @puckpuck After spending a bit of time trying to rewrite the product catalog in Erlang, I'm not so sure about it. Frankly, 3/4ths of this is that I simply don't grok Erlang well enough to be productive in it (and ChatGPT only gets you so far). I think our best bet would be to create a new Erlang service that reads off Kafka and try to make it as minimal as possible to avoid the need to touch it in the future outside of OpenTelemetry updates. I feel like making Product Catalog an Erlang service just creates another load-bearing dependency that'll be hard for us to work with going forward. |
Also worth noting there's no OpenFeature libraries for Erlang at this point in time, so we'd have to write that as well. |
Yeah, I've figured that would be tricky! |
Let me bring into the discussion @joshleecreates, who has some experience in Erlang/Elixir, and also @tsloughter, who was the one that initially implemented the FeatureFlagService. |
The service doesn't need to consume feature flags in its initial version.
I am inclined to agree with this. I like the idea of a simple stateless service, but to @julianocosta89's point, there would be no auto-instrumentation unless we also included a downstream database or some kind of HTTP server. But maybe that isn't a bad thing? This is a pretty realistic use-case and would be a good demo of combining auto- and manual-instrumentation to complete a trace. |
Just saw this issue after i saw the other. I can do the product catalog. Also can do a kafka service, but seems easier to just translate an existing service that seems to just read in json and serve it over grpc? |
Personally I lean away from "batteries-included" frameworks like Phoenix in the demo since the frameworks' conventions are often at odds with the development and build needs of the demo, so removing Phoenix gets a +1 from me. |
@tsloughter @joshleecreates I think the ideal replacement Erlang service would be something that meets the criteria of #912 . Reads orders off the Kafka queue, then persists them to a DB. If we wanted to swap out a JSON service, then we could... but I'd be more in favor of doing something net-new, just to make sure it wasn't a load-bearing part of the architecture. |
Replying to #1388 (comment) The biggest advantage I can see to simplifying the build pipeline is that if we move Erlang off into its own little corner, then we don't have to deal with the weird QEMU build issues any more. FF Service was load-bearing to the entire demo and wouldn't run on AS/ARM without cross-compilation. By making a service that's kind of just off to the side, if we can't build cross-platform then that's fine, we can just publish the x86 version and have it not run without impacting the overall demo. |
I knew about the Windows 11 issue and I vaguely recall an ARM build discussion at one point, but is there a GitHub issue for that? |
@tsloughter I'm not sure there's an exhaustive writeup but it's this issue: #956 The short version is that some weird combination of issues with QEMU causes segfaults when trying to update past OTP 23. The actual notes are spread out over a combination of github issues, PRs, and Slack threads. I think the WSL issue is actually independent of it. |
I had another idea, let me know what you all think. The easiest service to replace we have in the demo today is the We could replace the It will be a bit more work, but I think it is a good idea. WDYT? |
We currently have 3 services in Go, which feels like quite a lot. We could replace |
yeah, that would be okay to me as well. The only problem is that there is no auto-instrumentation for Kafka in Erlang/Elixir, so we need to manually do it. |
This doesn't seem like a big problem if we're just consuming a single topic, since we also want to show one example of manual instrumentation in each service anyways |
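As a rough illustration of what manual instrumentation for a single-topic consumer involves, the core step is pulling the parent trace context out of each message's headers so the consumer span joins the producer's trace. The sketch below is in Python rather than Erlang/Elixir and uses only the standard library; it assumes the producer propagates a W3C `traceparent` header, and `extract_parent` and `ParentContext` are hypothetical names, not part of any OpenTelemetry SDK.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class ParentContext(NamedTuple):
    trace_id: str
    span_id: str
    sampled: bool


def extract_parent(headers: List[Tuple[str, bytes]]) -> Optional[ParentContext]:
    """Return the parent context carried in Kafka-style (key, bytes) headers.

    Returns None when the header is absent or malformed, in which case the
    consumer would start a new trace instead of continuing one.
    """
    for key, value in headers:
        if key.lower() != "traceparent":
            continue
        match = TRACEPARENT_RE.match(value.decode("ascii", errors="replace"))
        if match is None:
            return None
        version, trace_id, span_id, flags = match.groups()
        # Version ff and all-zero ids are invalid per the specification.
        if version == "ff" or set(trace_id) == {"0"} or set(span_id) == {"0"}:
            return None
        return ParentContext(trace_id, span_id, bool(int(flags, 16) & 0x01))
    return None


headers = [("traceparent", b"00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")]
parent = extract_parent(headers)
```

With an SDK available, the same step would normally be delegated to its propagator API; the manual part is mostly wiring the extracted context into the span started for each consumed message.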
while i'm in favor of reducing go services, i'd prefer we replace them with something like Java or .NET (probably .NET) -- while I love each and every otel language equally, the world does not, and we should probably err on the side of language popularity when it comes to these decisions rather than trying to have a 1:1 mapping of services to languages. |
in terms of demo size, I feel like having the erlang service be net-new as a kafka consumer is actually better than replacing an existing one. kafka consumers can all be dropped from the minimal compose file, and since there's no UI, overall requirements to run it should be less. |
Per SIG meeting -
|
@austinlparker why FraudDetection? It is Kotlin and the only Kotlin service in the demo. AccountingService is Go and we have 3 services written in Go. |
sure, accountingservice is fine |
👋 Hey there. I've been messing around with the otel demo a bit recently and it is pretty great, thanks! I'm not sure if this is the best place for this, but since this issue is specifically about using But this does not work in K8s. I installed the demo via the Helm chart and everything starts up fine. However, the question is: how do I trigger the memory leak? I looked at the I tried port-forwarding But after poking around in the
kubectl exec my-otel-demo-ffspostgres-58df9cd6bc-5s77z -- psql -U ffs -d ffs -c \
"UPDATE featureflags SET enabled=1 WHERE name='recommendationCache'"
UPDATE 1
After changing the flag directly in It's obviously working to trigger the memory leak because the pod is getting
kubectl get pods | grep recommendationservice
my-otel-demo-recommendationservice-756854bf-r22p8 0/1 OOMKilled 2 (89s ago) 52m
All this to say, I think the docs could use an update to clarify that updating the feature flags in the |
@cthain part of the reason it doesn't work is because the helm charts haven't been updated yet. generally we don't keep them up to date with main; we usually update them before a release goes out (which should happen soon!) |
Update post SIG meeting: I started to build out a new Elixir service downstream from Kafka. In my initial experiments I found that the Kafka client library I had chosen (https://github.com/kafkaex/kafka_ex) does not seem to support the version of Kafka used with the demo, or I am misunderstanding how to form the connection. In either case, it seems apparent that Kafka + Elixir is an unusual combination and not necessarily the best way to demo Erlang/Elixir observability. In the SIG we discussed some potential alternatives:
That chat app seems to be a good path forward because:
|
If you'd like to discuss with the SIG we'll be meeting tomorrow (9 PST) in the ErlEF Observability WG zoom instead of the SIG zoom. See https://docs.google.com/document/d/18JVh6ICLyRCJBpRVIXwcR1fb-K4qRX3WHYgtM7mFx2U/edit for the agenda and zoom link. |
I see we are still discussing what to do with an Elixir service. We should take that discussion to a new issue since I'm closing this one. We now have flagd rolled out with the 1.9 release for feature flagging, which covers this issue's original request. |
Capturing a SIG meeting discussion today...
The feature flag service is kind of a frustrating dependency for us currently. It's very important, but it's also hard to maintain as we don't have a ton of expertise in Erlang. In addition, due to limitations of our build pipeline it's very out of date.
Our general idea is to investigate replacing the feature flag service with some other off-the-shelf solution, then refactoring everything to use OpenFeature. Erlang support would come back into the demo as part of a smaller, more self-contained service (possibly some sort of consumer off Kafka) that should be more amenable to the build pipeline.