Memory leak in Ethon::Easy #198

Open
itsrainy opened this issue Jul 1, 2021 · 14 comments

itsrainy commented Jul 1, 2021

I've tracked down a memory leak in one of our applications that uses Ethon. The simplest reproduction I've been able to find is by running the following (in an irb or pry shell):

require "ethon"
100000.times {
  Ethon::Easy.new
}

This will cause the resident set size of the process to grow and never get cleaned up. This also seems to be consistently reproducible by running the following in the project root (taken from #45):

RUBY_HEAP_SLOTS_GROWTH_FACTOR=1 RUBY_GC_HEAP_INIT_SLOTS=1 RUBY_HEAP_SLOTS_INCREMENT=10 RUBY_HEAP_SLOTS_GROWTH_FACTOR=.1 ITERATIONS=10000 rspec -I lib -I profile profile/memory_leaks.rb

Note: commenting out everything in memory_leaks.rb except this specific test makes it much quicker as it will only run the relevant memory leak test.

It seems like the source of the leak is this specific FFI::AutoPointer, which gets created when the Easy initializer calls set_callbacks. The second argument to the AutoPointer constructor is the Curl easy_cleanup method, which does seem to get called correctly.
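
For context, a minimal sketch of the AutoPointer pattern described above, with libc's malloc/free standing in for the libcurl handle and easy_cleanup (the LibC names here are illustrative, not Ethon's):

require "ffi"

# Stand-in for the libcurl handle: allocate raw memory via libc.
module LibC
  extend FFI::Library
  ffi_lib FFI::Library::LIBC
  attach_function :malloc, [:size_t], :pointer
  attach_function :free, [:pointer], :void
end

raw = LibC.malloc(1024)
# The second argument is the releaser: FFI is expected to call it when the
# AutoPointer becomes unreachable and is garbage-collected.
handle = FFI::AutoPointer.new(raw, LibC.method(:free))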

I'm unsure if this is an Ethon, FFI, or libcurl issue, so I figured I would start here. One thing I noticed is that the Easy#cleanup method never gets called anywhere. Adding a call to cleanup in the calling code seems to fix the leak:

require "ethon"
100000.times {
  e = Ethon::Easy.new
  e.cleanup
}

This fixes the leak (and may be how we patch our applications for the time being). Is that method supposed to be called automatically through some other mechanism, or are consumers expected to call it?
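
For reference, one way a consumer can make the teardown deterministic today, given that Easy#cleanup frees the handle (url and perform are ordinary Ethon::Easy methods):

easy = Ethon::Easy.new(url: "https://example.com")
begin
  easy.perform
ensure
  easy.cleanup # free the libcurl handle immediately instead of waiting for GC
end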

I was also under the impression that FFI::AutoPointer handles should automatically get GC'd and shouldn't require a call to free. If that's true, this seems like it might be an FFI issue (unless there's some subtle reference to that handle floating around Ethon somewhere).

This reproduces on macOS Big Sur 11.0.3 and on Ubuntu 16.04.6 (ruby 2.7.3, ffi 1.15.3, ethon 0.14.0).


gyfis commented Jul 29, 2021

We're seeing the same memory leak after upgrading ethon (0.12.0 -> 0.14.0) and ffi (1.13.1 -> 1.15.3).

Some minor info: I tried using ethon v0.12.0 with ffi 1.14.0, and I see the same memory leak from the specific test mentioned in the previous comment, so I believe this to be an issue with ffi (or something lower in the stack).
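
For anyone repeating that bisection, a sketch of the Gemfile pins (versions taken from this thread):

# Gemfile
source "https://rubygems.org"

gem "ethon", "0.12.0"
gem "ffi", "1.14.0" # still leaks with the old ethon, pointing at ffi or lower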


tagliala commented Aug 4, 2021

Hi, is this by any chance related to #194, fixed in #195? Could you try with the master branch and provide feedback?


gyfis commented Aug 4, 2021

@tagliala Hey, thanks for the reply! I tested the memory_leaks.rb init test on the ethon master branch with ffi 1.15.3 and I'm still getting the memory leak, so I don't think that's related.

@tagliala

@gyfis thanks

Did you get a failing test on 0.14.0 by running the following?

RUBY_HEAP_SLOTS_GROWTH_FACTOR=1 RUBY_GC_HEAP_INIT_SLOTS=1 RUBY_HEAP_SLOTS_INCREMENT=10 RUBY_HEAP_SLOTS_GROWTH_FACTOR=.1 ITERATIONS=10000 rspec -I lib -I profile profile/memory_leaks.rb

The init test does not fail on my machine (x64 / 16GB / macOS 11.5.1 / ruby 2.7.3 / ffi 1.15.3 / ethon 0.14.0 / curl 7.64.1).

Anyway, I've tried to adapt my standalone test for #194 to check this one:

#!/usr/bin/env ruby

# frozen_string_literal: true

ETHON_VERSION = '0.14.0' # { git: 'https://github.com/typhoeus/ethon.git' }

require 'bundler/inline'

begin
  puts "Ruby version: #{RUBY_VERSION}\n\n"

  gemfile(true) do
    source 'https://rubygems.org'

    gem 'ethon', ETHON_VERSION
  end
rescue Gem::LoadError => e
  puts "\nMissing Dependency:\n#{e.backtrace.first} #{e.message}"
rescue LoadError => e
  puts "\nError:\n#{e.backtrace.first} #{e.message}"
  puts DATA.read
  exit 1
end

require 'ethon'

$cleanup_counter = 0

# Monkey patching to provide debugging information
module Ethon
  module Curl
    module_function

    def easy_cleanup(pointer)
      $cleanup_counter += 1
      puts "***** Curl.easy_cleanup called with #{pointer}"
      puts "Cleanup counter: #{$cleanup_counter}"
    end
  end
end

10000.times do
  Ethon::Easy.new
end

Run it with:

RUBY_HEAP_SLOTS_GROWTH_FACTOR=1 RUBY_GC_HEAP_INIT_SLOTS=1 RUBY_HEAP_SLOTS_INCREMENT=10 RUBY_HEAP_SLOTS_GROWTH_FACTOR=.1 ./test.rb

If I run it against the master branch (change ETHON_VERSION to the commented value), cleanup is called the expected number of times (10,000); on 0.14.0, cleanup is called a random number of times fewer than 10,000.

So maybe it does not fix the memory leak completely, but it should help.
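
To complement the cleanup counter, the resident set size can be watched directly; a minimal sketch (rss_kb is a hypothetical helper shelling out to ps, so Linux/macOS only):

require "ethon"

def rss_kb
  `ps -o rss= -p #{Process.pid}`.to_i # resident set size in kilobytes
end

before = rss_kb
10_000.times { Ethon::Easy.new }
GC.start
puts "RSS grew by #{rss_kb - before} KB"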


jarthod commented Aug 17, 2021

Hi all 👋

I also encountered this leak in my product, right after upgrading from ethon 0.12.0 to 0.14.0:
[memory usage graph]

I tried 0.13.0 too and noticed the problem was also present. @tagliala IMO this is indeed related to #194, because I noticed various errors caused by the lack of available file descriptors, in addition to the memory problems (e.g. Errno::EMFILE: Too many open files, Resource temporarily unavailable - getaddrinfo). When trying to reproduce manually using Typhoeus with verbose mode I caught this error message:

Couldn't resolve host 'xxxx'
Closing connection 0
Dgetaddrinfo() thread failed to start

Couldn't resolve host 'yyyyy'
Closing connection 0
Dgetaddrinfo() thread failed to start

Couldn't resolve host 'zzzz'
Closing connection 0
Dgetaddrinfo() thread failed to start

The "thread failed to start" error is likely due to FD saturation and it turns into a "Couldn't resolve host" error in libcurl or typhoeus error handling code.

I tried the master branch locally and didn't reproduce the problem, so it looks better in master at least (I couldn't confirm the memory behavior, though, because I hit the FD limits too early to reproduce the leak locally). I'm a bit hesitant to try this in production though.

Edit: I managed to reproduce the leak locally by raising my FD limits. After running my script (which makes 10k Typhoeus requests in a 60-thread pool): with 0.12.0 it uses 1.7GB of RAM at the end, with 0.14.0 it uses ⚠️ 5.7GB of RAM, and with master it uses 1.7GB. So the revert in master looks good to me.
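
For anyone reproducing this the same way, a sketch of inspecting and raising the file descriptor limit from Ruby (the soft limit can usually be raised up to the hard limit without extra privileges):

soft, hard = Process.getrlimit(:NOFILE)
puts "FD soft limit: #{soft}, hard limit: #{hard}"
Process.setrlimit(:NOFILE, hard, hard) # raise the soft limit to the hard limit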

jarthod added a commit to jarthod/ruby-mem-advisory-db that referenced this issue Aug 17, 2021
See typhoeus/ethon#194 and typhoeus/ethon#198 for a bit more detail.
The revert is in master but no gem has been released yet (`0.14.0` is the latest at the time of writing, and it's affected).

FTR I tried submitting this through the form (https://www.rubymem.com/advisories/preview) but I get a 500 every time.

BuonOmo commented Sep 28, 2021

We're having the same problem at my company. I've tested the current master version (b593a9b) against 0.14.0 and 0.12.0. Locally, it seems that only 0.12.0 does not leak. On Heroku, however, we keep having a leak. I'll hopefully reduce that to a minimal example soon.

EDIT: We solved our issue by using a smaller batch size in Active Record, so this may very well have been a false lead; sorry for that.

On Heroku (0.14.0 then b593a9b then 0.12.0)

[memory usage graph]

0.14.0

Retained STRING 15 objects of size 600/2280 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/gems/ethon-0.14.0/lib/ethon/easy/callbacks.rb:26
Retained STRING 11 objects of size 440/2280 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/gems/ethon-0.14.0/lib/ethon/easy/callbacks.rb:27
Retained OBJECT 11 objects of size 440/2280 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/gems/ethon-0.14.0/lib/ethon/easy/callbacks.rb:29
Retained ARRAY 10 objects of size 400/2280 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/gems/ethon-0.14.0/lib/ethon/easy/debug_info.rb:23
Retained DATA 4 objects of size 240/2280 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/gems/ethon-0.14.0/lib/ethon/easy/header.rb:32
Retained ARRAY 2 objects of size 80/2280 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/gems/ffi-1.15.3/lib/ffi/autopointer.rb:96
Retained STRING 1 objects of size 40/2280 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/gems/activerecord-import-1.0.1/lib/activerecord-import/import.rb:579
Retained IMEMO 1 objects of size 40/2280 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/2.7.0/open-uri.rb:16

b593a9b

Retained ARRAY 17 objects of size 680/12096 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/gems/ffi-1.15.4/lib/ffi/autopointer.rb:96
Retained DATA 6 objects of size 240/12096 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/bundler/gems/ethon-b593a9bdc770/lib/ethon/easy/header.rb:28
Retained DATA 5 objects of size 360/12096 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/bundler/gems/ethon-b593a9bdc770/lib/ethon/easy/header.rb:32
Retained OBJECT 4 objects of size 160/12096 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/gems/ffi-1.15.4/lib/ffi/autopointer.rb:87
Retained ARRAY 3 objects of size 120/12096 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/2.7.0/open-uri.rb:21
Retained HASH 3 objects of size 376/12096 (in bytes) at: /Users/ulysse/Dev/klaxit/klaxit-otp-master/app/services/find_new_itineraries_available_iterator.rb:64
Retained HASH 2 objects of size 80/12096 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/2.7.0/open-uri.rb:21
Retained DATA 2 objects of size 144/12096 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/bundler/gems/ethon-b593a9bdc770/lib/ethon/multi/operations.rb:17
Retained STRING 2 objects of size 80/12096 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/bundler/gems/ethon-b593a9bdc770/lib/ethon/easy/callbacks.rb:27
Retained IMEMO 1 objects of size 40/12096 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/2.7.0/open-uri.rb:21
Retained IMEMO 1 objects of size 40/12096 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/2.7.0/open-uri.rb:16
Retained STRING 1 objects of size 40/12096 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/bundler/gems/ethon-b593a9bdc770/lib/ethon/easy/callbacks.rb:26
Retained ARRAY 1 objects of size 40/12096 (in bytes) at: /Users/ulysse/Dev/klaxit/klaxit-otp-master/app/services/find_new_itineraries_available_iterator.rb:64
Retained OBJECT 1 objects of size 40/12096 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/bundler/gems/ethon-b593a9bdc770/lib/ethon/easy/callbacks.rb:29

0.12.0

Retained DATA 3 objects of size 200/400 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/gems/ethon-0.12.0/lib/ethon/easy/header.rb:31
Retained DATA 2 objects of size 80/400 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/gems/ethon-0.12.0/lib/ethon/easy/header.rb:27
Retained ARRAY 2 objects of size 80/400 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/gems/ffi-1.15.4/lib/ffi/autopointer.rb:96
Retained OBJECT 1 objects of size 40/400 (in bytes) at: /Users/ulysse/.rbenv/versions/2.7.4/lib/ruby/gems/2.7.0/gems/ffi-1.15.4/lib/ffi/autopointer.rb:87
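
The retained-object listings above come from a heap profiling tool (the exact one isn't stated); as one option, a minimal sketch of producing a comparable report with the memory_profiler gem:

require "memory_profiler"
require "ethon"

report = MemoryProfiler.report do
  1_000.times { Ethon::Easy.new }
end
report.pretty_print # groups allocated/retained objects by file and line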

@maxvidenin

We encountered the same issue recently. Our memory usage graph confirmed that there is a memory leak: memory usage grows constantly and never goes down on the graph below:

[memory usage graph, screenshot 2021-11-12]

The problem appeared when we updated the gem version from 0.12.0 to 0.14.0.

Downgrading the gem back to 0.12.0 or updating to 0.15.0 fixes the issue. See the memory usage graph after updating from 0.14.0 to 0.15.0 below:

[memory usage graph, screenshot 2021-11-15]
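
In Gemfile terms, a sketch of a constraint that avoids the affected releases (0.13.x and 0.14.x per this thread):

# Gemfile
gem "ethon", ">= 0.15.0" # or pin to "0.12.0" if you can't upgrade yet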

@tagliala

So I think this could be closed as a duplicate of #194, solved in #195, and fixed in 0.15.0.


jarthod commented Nov 15, 2021

Yes. Maybe 0.13 and 0.14 should be yanked though, to avoid people installing them?

@tagliala

> Yes. Maybe 0.13 and 0.14 should be yanked though, to avoid people installing them?

Hi, I used to yank gems in case of major issues or unwanted breaking changes, but I've stopped because yanking may break production deploys of third-party applications. For example, you need an urgent hotfix for your application, but the deploy fails because the locked ethon version has been yanked. Fixing ethon is something you would address, but not while you are fixing your own application.

Based on this experience, I would not yank.


jarthod commented Nov 16, 2021

Ah yes indeed, the sad world of production deploys re-downloading everything ;) Makes sense, fine by me.
I've submitted the report to rubymem/ruby-mem-advisory-db#31 (like bundle audit, but for memory leaks) to at least warn the people using this, but that's not a lot of people ^^

@tagliala

Hi, do you think this is still an issue?

The memory leak reported here, confirmed in versions 0.13.0 and 0.14.0, should have been fixed in 0.15.0.


gyfis commented Jan 25, 2022

> Hi, do you think this is still an issue?
>
> The memory leak reported here, confirmed in versions 0.13.0 and 0.14.0, should have been fixed in 0.15.0.

We're running ethon 0.15.0 with no more memory leaks present; +1 on this not being an issue anymore.


jarthod commented Jan 25, 2022

+1 for me too, no memory problems since 0.15.0.
