Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get rid of broadway #149

Merged
merged 12 commits into from
Feb 9, 2022
Merged

Get rid of broadway #149

merged 12 commits into from
Feb 9, 2022

Conversation

nickolaich
Copy link
Collaborator

removed broadway from dependencies.
It looks like PushPipeline isn't required at all.
It adds complexity with a Broadway (GenStage underneath) only for pushing jobs to Faktory.
I am not sure if I understood everything correctly but it seems we can just push job to Faktory. If assumption is correct I can clean-up code at job.ex ->perform_async function.

@swelham
Copy link
Collaborator

swelham commented Jan 17, 2022

Hey, I can provide some context around this.

One of our original goals for building this library was to provide a way to reliably push a job or batch of jobs to Faktory without the caller having to be concerned about the outcome. We took a fire and forget approach and allowed another part of the library to deal with any failures. Broadway allowed us to do this very quickly because it already had a mechanism for dealing with and retrying failed push attempts.

We also deal with large spikes in our app that have in the past required us to schedule tens/hundreds of thousands of jobs in batch from a single run of a cron job. Without the above feature, the cron job would have to deal with parallelising the push to Faktory, then tracking and retrying any potential failures.

However, Faktory now has support for batching jobs that might solve the same issue without the need for a custom implementation (I haven't looked at the feature in-depth). In which case simplifying the library is always a win 😃

@nickolaich
Copy link
Collaborator Author

nickolaich commented Jan 17, 2022

@swelham thank you for explanation. I thought the same. Since Faktory has batch operations and there is module FaktoryWorker.Batch to support creating batches and handle callbacks - broadway looks like an overhead.
Disadvantage of getting rid of broadway and concurrency scheduling jobs using elixir is that Batch operations are supported only by Enterprise version of Faktory

@swelham
Copy link
Collaborator

swelham commented Jan 18, 2022

Yeah, it would be a disadvantage to some and a breaking change. However, I think for maintainability and parity with Faktory this tradeoff makes sense in the context of this library. After all, this only existed to plug a gap in Faktory where a desired feature previously didn't exist.

In addition, any users of this library that are using the OSS version of Faktory could implement their own solution (with or without Broadway). Maybe a blog post detailing the original implementation would be useful, this could even be linked from the changelog to help any users migrate.

@Ch4s3
Copy link
Collaborator

Ch4s3 commented Jan 18, 2022

Yeah, it would be a disadvantage to some and a breaking change. However, I think for maintainability and parity with Faktory this tradeoff makes sense in the context of this library. After all, this only existed to plug a gap in Faktory where a desired feature previously didn't exist.

Our plan is to more closely mirror the official client libraries. We'd be doing a major version bump and we'll write an upgrade guide.The right solution might be for people to implement their own batching if they're using OSS Faktory.

@Ch4s3
Copy link
Collaborator

Ch4s3 commented Jan 18, 2022

@swelham are you folks still using this or have you fully transitioned off?

Copy link
Collaborator

@jeremyowensboggs jeremyowensboggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the pipeline name can be removed entirely?

|> format_pipeline_name()
end

defp format_pipeline_name(name) when is_atom(name) do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still needed? Looks like the pipeline name may be able to be completely eliminated?

# But Faktory is a job server and do we want to care and build one more not needed layer???
opts
|> faktory_name()
|> push(payload)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switching from Broadway.push_messages to push is going to change the return of this function. From the PR discussion, I gather we are planning on doing a version change.

  1. Are there additional changes needed to this code base to signal that change?
  2. Does module/package documentation need to be updated?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this PR, just documenting a thought - but if we can return a Job with the faktory job id here, that would be a great gateway into being able to check on the status of a job.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broadway.push_messages and Sandbox.enqueue_job both always return :ok, so I think it should be OK to have this return {:ok, job_id} instead.

+1 for updating documentation as well

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like we don't need perform_async/3 at all and we can return same result for any call. private function handle_push_result prepares result and it returns {:ok, job}.
I am not sure is it better to change to {:ok, job_id} or leave it as is. job contains a bit more info, see sample below:

{:ok,
 %{
   args: [%{hey: "there!"}],
   jid: "cc7cb6ec723b1e1f35815ba1",
   jobtype: "FaktoryWorker.DefaultWorker"
 }}

Copy link
Contributor

@superhawk610 superhawk610 Jan 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be OK to return it as-is. Can you also update Sandbox.enqueue_job to return a similar value? You can just generate a random jid.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@superhawk610 we have changed return to {:ok, job}, but I checked our app and I see it always expect :ok.
Question is -> revert changes and return :ok as before, change app and it's tests to have support of {:ok, job). or 3rd option: add configuration option and return optionally :ok to keep back compatibilities and return by default {:ok, job}.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change our app to accept {:ok, job}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure that this is mentioned in the upgrade guide as well, as it's a breaking change.

@superhawk610
Copy link
Contributor

This change means that MyApp.Job.perform_async makes a synchronous HTTP call, instead of just sending an Erlang message; that should be fine, so long as it's taken into account when writing/calling job modules. It should definitely be the top callout of the release notes.

For this PR, I think a couple more things are required:

  • update documentation to remove references to :skip_pipeline
  • update Job.perform_async/2 (since :skip_pipeline is no longer supported)

@swelham
Copy link
Collaborator

swelham commented Jan 19, 2022

are you folks still using this or have you fully transitioned off?

@Ch4s3 We are no longer using Faktory and have completely transitioned off.

def perform_async(_, {:error, _} = error, _), do: error

def perform_async(pipeline_name, payload, opts) do
if Sandbox.active?() do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to keep the sandbox, it is what we use to simulate the server during testing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeremyowensboggs makes sense, I overdid it with simplifications

Copy link
Contributor

@superhawk610 superhawk610 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I'd just like to see Sandbox.enqueue_job updated to return the same shape as perform_async. We'll also want to draft a migration guide, but that can be done as part of the release notes.

@nickolaich nickolaich changed the title WIP Get rid of broadway Get rid of broadway Jan 20, 2022
@nickolaich
Copy link
Collaborator Author

nickolaich commented Jan 20, 2022

I've added couple of fixes:

  • put :console to backends of logger. there were an error with heartbeat: "END" message wasn't expected at one test. I don't know exactly a reason to suppress all errors in the tests or not. Though it could be configured at test.secrets.exs
  • added tag :enterprise to batch testing. Anybody who doesn't have license can run mix test --exclude enterprise to exclude that tests from running.

@jeremyowensboggs
Copy link
Collaborator

It would be a good idea to add the bit about the tags and the enterprise tag to the README.md.

Copy link
Contributor

@superhawk610 superhawk610 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to Jeremy's suggestion to add a note to the README about the enterprise tag

@Ch4s3 Ch4s3 merged commit e1061ab into master Feb 9, 2022
@Ch4s3 Ch4s3 deleted the get-rid-of-broadway branch February 9, 2022 22:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants