[feature] finish_in_order - process finish hook in the original order #338

Closed
shaicoleman opened this issue Oct 24, 2023 · 7 comments
Contributor

shaicoleman commented Oct 24, 2023

Would it be possible to add the functionality to execute things in parallel, while printing the results in the original order?

Proof of concept example:

require 'parallel'

ITEMS = ('A'..'Z').to_a

def perform_work(item)
  sleep(rand * 3) # simulate up to 3 seconds of work
end

def print_item(item)
  puts item
end

results_queue = Queue.new

# Printing thread
next_index = 0
printer_thread = Thread.new do
  while next_index < ITEMS.length
    index, result = results_queue.pop

    # If not in order, push it back for rechecking later
    next results_queue << [index, result] if index != next_index

    print_item(result)
    next_index += 1
  end
end

Parallel.each_with_index(ITEMS, in_threads: 6) do |item, index|
  perform_work(item)
  results_queue << [index, item]
end

printer_thread.join
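
The printer thread above re-queues any result that arrives early, so it spins until the expected index shows up. A minimal sketch of a variant that parks early results in a hash instead is shown below. It assumes plain `Thread`s as a stand-in for `Parallel.each_with_index` so that it runs without the gem; the names `pending` and `printed` are illustrative only.

```ruby
items = ('A'..'E').to_a
results_queue = Queue.new
printed = []

# Consumer: buffer early arrivals in a hash instead of pushing them back.
printer_thread = Thread.new do
  pending = {}      # index => result, for results that arrived too early
  next_index = 0
  while next_index < items.length
    index, result = results_queue.pop # blocks until a worker pushes
    pending[index] = result
    # Release every consecutive result starting at the one we are waiting for.
    while pending.key?(next_index)
      item = pending.delete(next_index)
      puts item
      printed << item
      next_index += 1
    end
  end
end

# Plain threads stand in for the Parallel workers (an assumption of this sketch).
workers = items.each_with_index.map do |item, index|
  Thread.new do
    sleep(rand * 0.05) # simulate work of varying length
    results_queue << [index, item]
  end
end

workers.each(&:join)
printer_thread.join
# prints A through E, one per line, in the original order
```

Each result is popped from the queue exactly once, and the printer only wakes up when a worker actually finishes.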
Owner

grosser commented Oct 25, 2023

the results that Parallel.map returns are sorted, so you could do:

out = Parallel.map_with_index(ITEMS, in_threads: 6) { |item, index| "my output" }
puts out

capturing stdout would not work since that relies on replacing $stdout and that would not be thread local
if you want to get super hacky you could replace $stdout with a thread-aware StringIO that stores each thread's output separately and then prints it in order when all processes have finished

grosser closed this as completed Oct 25, 2023
Contributor Author

shaicoleman commented Oct 25, 2023

That doesn't do the same thing as the example I've provided.
I'm interested in the ability to have a callback to print things (or save to disk, etc.) in order as they finish executing, not at the end

Owner

grosser commented Oct 25, 2023

if you want to print as things are finishing you can use a finish hook (see the docs)

in that case it would just print the result

Contributor Author

shaicoleman commented Oct 25, 2023

The finish hook with the example from the docs will be out of order.
Parallel.map('A'..'Z', in_threads: 6, finish: -> (item, i, result) { puts item }) { sleep rand }

I want the finish hook to be triggered in order. Try running the example I gave if I'm unclear.

Owner

grosser commented Oct 25, 2023

require 'parallel'

want = 0
stack = []
finish = -> (item, i, result) do
  if i == want
    puts item # print current
    want += 1
    
    # print things already ready
    stack[(i + 1)...].to_a.each do |old|
      break unless old
      puts old
      want += 1
    end
  else
    stack[i] = item # store for later
  end
end

Parallel.map('A'..'Z', in_threads: 6, finish: finish) { sleep rand }

Contributor Author

Yes, that's a more elegant and efficient solution.

I'm still hoping this would be built as a reusable solution into the parallel gem
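
As a rough idea of what a reusable version could look like, here is a sketch of a wrapper that turns any callback into an in-order finish hook. `ordered_finish` is a hypothetical helper name, not something the parallel gem provides; it only assumes the `(item, index, result)` hook signature used in the examples above. The `Mutex` is a defensive assumption in case the hook is ever invoked concurrently.

```ruby
# Hypothetical helper (not part of the parallel gem): wraps a callback so it
# fires in index order, buffering results that arrive early.
def ordered_finish(&callback)
  want = 0          # next index allowed to fire
  pending = {}      # index => [item, result] for results that arrived early
  lock = Mutex.new  # guards the state if the hook is called from several threads

  lambda do |item, index, result|
    lock.synchronize do
      pending[index] = [item, result]
      # Flush every consecutive entry starting at the index we are waiting for.
      while pending.key?(want)
        ready_item, ready_result = pending.delete(want)
        callback.call(ready_item, want, ready_result)
        want += 1
      end
    end
  end
end

# Simulate completions arriving out of order.
printed = []
finish = ordered_finish { |item, _index, _result| printed << item }
[[2, 'C'], [0, 'A'], [1, 'B'], [3, 'D']].each do |index, item|
  finish.call(item, index, nil)
end
puts printed.join # prints "ABCD"

# With the gem installed, usage would presumably look like:
#   Parallel.map('A'..'Z', in_threads: 6,
#                finish: ordered_finish { |item, _i, _r| puts item }) { sleep rand }
```

Keeping the buffering inside the returned lambda means the calling code does not need its own `want` counter or storage array.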

Owner

grosser commented Oct 25, 2023

PR welcome :)

finish_in_order: true or so 🤷

shaicoleman changed the title from "[feature] Parallel processing with ordered output" to "[feature] finish_in_order - process finish hook in the original order" Oct 25, 2023