From 85fa80dc639f68de2b921a605bc4a35a937099b6 Mon Sep 17 00:00:00 2001
From: Sylvain Joyeux
Date: Thu, 5 Oct 2017 15:01:43 -0300
Subject: [PATCH] fix quadratic performance in FileTask#out_of_date?

FileTask#out_of_date? was changed in 462e403a to call #needed?, which
in turn calls #out_of_date? recursively. In some cases, where the graph
of dependencies is fairly dense, this leads to quadratic performance
where it was linear before.

Use #all_prerequisite_tasks to avoid the problem. This also saves a
File.exist? test, as #timestamp already takes it into account.

I made a benchmark that measures the duration of #out_of_date? on 12.0, 12.1, and 12.1 with this commit applied.

The benchmark creates 5 layers of FileTask, where every task of a parent layer depends on every task of its child layer. A root task is added at the top and #out_of_date? is called on it. The benchmark varies the number of tasks per layer.

**12.0**

| Tasks per layer | Duration (s) | Standard deviation (s) |
|-----------------|--------------|------------------------|
| 1               | 1.32e-05     | 0.96e-05               |
| 2               | 8.69e-05     | 1.11e-06               |
| 5               | 1.84e-05     | 2.43e-06               |
| 8               | 2.89e-05     | 1.05e-05               |
| 10              | 3.35e-05     | 4.12e-06               |
| 15              | 4.97e-05     | 6.74e-06               |
| 20              | 6.19e-05     | 6.23e-06               |

**12.1**

| Tasks per layer | Duration (s) | Standard deviation (s) |
|-----------------|--------------|------------------------|
| 1               | 7.00e-05     | 5.62e-05               |
| 2               | 3.98e-04     | 7.38e-05               |
| 5               | 2.32e-02     | 1.02e-03               |
| 8               | 0.22         | 0.006                  |
| 10              | 0.65         | 0.006                  |
| 15              | 4.78         | 0.048                  |
| 20              | 20           | 0.49                   |

**PR 224**

| Tasks per layer | Duration (s) | Standard deviation (s) |
|-----------------|--------------|------------------------|
| 1               | 4.47e-05     | 2.68e-05               |
| 2               | 7.56e-05     | 2.92e-05               |
| 5               | 2.42e-03     | 4.16e-05               |
| 8               | 0.51e-03     | 7.21e-05               |
| 10              | 0.77e-03     | 0.13e-03               |
| 15              | 14.2e-03     | 0.11e-03               |
| 20              | 24.2e-03     | 0.16e-03               |

Benchmarking code:

~~~ ruby
require 'rake'

LAYER_MAX_SIZE = 20
LAYER_COUNT = 5

# Build LAYER_COUNT fully connected layers of `size` file tasks plus a root
# task, then time a single call to #out_of_date? on the root.
def measure(size)
  app = Rake::Application.new
  layers = (0...LAYER_COUNT).map do |layer_i|
    (0...size).map do |i|
      app.define_task(Rake::FileTask, "#{layer_i}_#{i}")
    end
  end

  # Every task of a layer depends on every task of the next layer.
  layers.each_cons(2) do |parent, child|
    child_names = child.map(&:name)
    parent.each { |t| t.enhance(child_names) }
  end

  root = app.define_task(Rake::FileTask, "root")
  root.enhance(layers[0].map(&:name))

  tic = Time.now
  root.send(:out_of_date?, tic)
  Time.now - tic
end

# Create the files on disk: root first, then each layer from the deepest one
# up, pausing between layers so that they get distinct modification times.
FileUtils.touch "root"
sleep 0.1
LAYER_COUNT.times do |layer_i|
  LAYER_MAX_SIZE.times do |i|
    FileUtils.touch "#{LAYER_COUNT - layer_i - 1}_#{i}"
  end
  sleep 0.1
end

COUNT = 100
[1, 2, 5, 8, 10, 15, 20].each do |size|
  # Running mean and sum of squared deviations, updated one sample at a time.
  mean = 0
  sum_squared_deviations = 0
  COUNT.times do |count|
    duration = measure(size)
    old_mean = mean
    mean = old_mean + (duration - old_mean) / (count + 1)
    sum_squared_deviations = sum_squared_deviations + (duration - old_mean) * (duration - mean)
  end
  # Output columns: size, mean duration, sample standard deviation.
  puts "#{size} #{mean} #{Math.sqrt(sum_squared_deviations / (COUNT - 1))}"
end
~~~
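
To give an intuition for the numbers above, here is a minimal sketch that counts prerequisite visits on the same layered graph shape, without touching Rake or the file system. It is a simplified model and not Rake's implementation: `Node`, `build_layered_graph`, `count_recursive_visits` and `count_flattened_visits` are hypothetical names made up for this illustration. The recursive count mimics a check that re-examines each prerequisite's own prerequisites, while the flattened count mimics collecting every transitive prerequisite once before looking at it.

```ruby
require 'set'

# Hypothetical stand-in for a task: a name and a list of prerequisite nodes.
class Node
  attr_reader :name, :prereqs

  def initialize(name)
    @name = name
    @prereqs = []
  end
end

# Build `layer_count` layers of `size` nodes. Every node of a layer depends on
# every node of the next layer, and a root node depends on the first layer.
def build_layered_graph(layer_count, size)
  layers = Array.new(layer_count) do |layer_i|
    Array.new(size) { |i| Node.new("#{layer_i}_#{i}") }
  end
  layers.each_cons(2) do |parent, child|
    parent.each { |node| node.prereqs.concat(child) }
  end
  root = Node.new("root")
  root.prereqs.concat(layers.first)
  root
end

# Recursive style: each prerequisite checks its own prerequisites again, so a
# node is visited once per path that leads to it from the root.
def count_recursive_visits(node)
  node.prereqs.sum { |prereq| 1 + count_recursive_visits(prereq) }
end

# Flattened style: walk the graph once, remembering nodes already seen, so
# each transitive prerequisite is visited a single time.
def count_flattened_visits(node, seen = Set.new)
  node.prereqs.each do |prereq|
    count_flattened_visits(prereq, seen) if seen.add?(prereq)
  end
  seen.size
end

root = build_layered_graph(5, 2)
puts "recursive: #{count_recursive_visits(root)}"
puts "flattened: #{count_flattened_visits(root)}"
```

With 5 layers of 2 nodes, the recursive count is 62 (2 + 4 + 8 + 16 + 32, one visit per path) while the flattened count is 10 (one visit per node). The gap widens quickly as the layers get larger, which matches the trend in the 12.1 and PR 224 tables.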
---
 lib/rake/file_task.rb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/rake/file_task.rb b/lib/rake/file_task.rb
index 474b7bd93..364d8e395 100644
--- a/lib/rake/file_task.rb
+++ b/lib/rake/file_task.rb
@@ -30,10 +30,10 @@ def timestamp
 
     # Are there any prerequisites with a later time than the given time stamp?
     def out_of_date?(stamp)
-      @prerequisites.any? { |prereq|
+      all_prerequisite_tasks.any? { |prereq|
         prereq_task = application[prereq, @scope]
         if prereq_task.instance_of?(Rake::FileTask)
-          prereq_task.timestamp > stamp || prereq_task.needed?
+          prereq_task.timestamp > stamp || @application.options.build_all
         else
           prereq_task.timestamp > stamp
         end