ClientConnectorError: "Too many open files" with ClientSession #2094

Closed
samuelcolvin opened this issue Jul 14, 2017 · 10 comments

Comments

@samuelcolvin
Member

Long story short

I'm getting ClientConnectorError: [Errno 24] Cannot connect to host <domain>:80 ssl:False [Too many open files] when downloading from lots of different urls:

  1. Is this an error with aiohttp, asyncio, python or me?
  2. It would be great if ClientConnectorError could include attributes giving details of the error rather than just formatting them into the message.

Possibly related to or a duplicate of #1821 #1907 #764 #1667

Expected (hoped for) behaviour

ClientSession to be able to make 100k's of requests to 100k's of hosts without intermittent errors occurring.

Steps to reproduce

Download file from here and extract to top-1m.csv, run

import asyncio
import re
from itertools import groupby
from operator import itemgetter
from pathlib import Path
from statistics import mean

from aiohttp import ClientSession, TCPConnector, ThreadedResolver


class ResponseTimes:
    def __init__(self):
        self.loop = asyncio.get_event_loop()
        connector = TCPConnector(
            verify_ssl=False,
            limit=200,
            resolver=ThreadedResolver(loop=self.loop),
        )
        self.session = ClientSession(loop=self.loop, connector=connector)
        self.results = []
        self.sem = asyncio.Semaphore(value=100, loop=self.loop)

    def run(self, sites):
        self.loop.run_until_complete(self._run(sites))
        self.session.close()

    async def _run(self, sites):
        coros = [self.get(url) for url in sites]
        start = self.loop.time()
        await asyncio.gather(*coros, loop=self.loop)
        print(f'total time taken: {self.loop.time() - start:0.3f}s')
        print(f'mean download length: {mean([r[1] for r in self.results]):0.4f}s')
        self.results.sort(key=itemgetter(0))
        print('   time   count   status')
        print('========================')
        for k, g in groupby(self.results, key=itemgetter(0)):
            g = list(g)
            print(f'{mean([r[1] for r in g]):0.4f}s    {len(g):4}      {k}')

    async def get(self, domain):
        url = 'http://' + domain
        async with self.sem:
            start = self.loop.time()
            try:
                async with self.session.get(url, timeout=5) as r:
                    status = r.status
                    # content_len = len(await r.read())
            except Exception as e:
                status = f'{e.__class__.__name__}: {e}'
                # raw string: the pattern has no placeholders, and \- / \. are invalid escapes in an f-string
                status = re.sub(r'[a-z0-9\-\.]+:(80|443)', r'<domain>:\1', status)
            else:
                status = str(status)
            time_taken = self.loop.time() - start
        self.results.append((status, time_taken))
        # print(f'{url:>40}: {time_taken:0.4f}s {status}')


top_text = Path('top-1m.csv').read_text()
limit = 2000
sites = []
for v in top_text.split('\n'):
    if not v:
        continue
    if len(sites) >= limit:
        break
    sites.append(v.split(',', 1)[1])
print(f'sites to get: {len(sites)}')

ResponseTimes().run(sites)

Actual behaviour

Output of the script.

Note that of the 2000 requests, >800 returned a ClientConnectorError with too many open files

sites to get: 2000
Can not load response cookies: Illegal key '_csrf/link'
Can not load response cookies: Illegal key '_csrf/link'
total time taken: 30.159s
mean download length: 1.2121s
   time   count   status
========================
1.4887s     980      200
1.3857s       2      400
1.0834s      17      403
0.8338s      16      404
1.3580s       2      408
1.4277s       1      418
1.2045s       1      429
0.5417s       3      500
0.4710s       9      503
0.9899s       1      521
0.1615s      24      ClientConnectorError: [Errno -2] Cannot connect to host <domain>:80 ssl:False [Name or service not known]
0.1335s       1      ClientConnectorError: [Errno 111] Cannot connect to host <domain>:80 ssl:False [Can not connect to <domain>:80 [Connect call failed ('192.69.95.165', 80)]]
0.3042s       6      ClientConnectorError: [Errno 11] Cannot connect to host <domain>:80 ssl:False [Resource temporarily unavailable]
1.2159s       1      ClientConnectorError: [Errno 1] Cannot connect to host <domain>:443 ssl:True [Can not connect to <domain>:443 [[SSL: BAD_ECC_CERT] bad ecc cert (_ssl.c:749)]]
1.9735s      40      ClientConnectorError: [Errno 24] Cannot connect to host <domain>:443 ssl:True [Can not connect to <domain>:443 [Too many open files]]
1.9793s      18      ClientConnectorError: [Errno 24] Cannot connect to host <domain>:443 ssl:True [Too many open files]
0.6571s     296      ClientConnectorError: [Errno 24] Cannot connect to host <domain>:80 ssl:False [Can not connect to <domain>:80 [Too many open files]]
0.2817s     498      ClientConnectorError: [Errno 24] Cannot connect to host <domain>:80 ssl:False [Too many open files]
3.5596s       1      ClientConnectorError: [Errno None] Cannot connect to host <domain>:443 ssl:True [Can not connect to <domain>:443 [None]]
0.3141s       1      ServerDisconnectedError: None
5.5903s      82      TimeoutError: 

Your environment

Ubuntu 17.04, Python 3.6.1, aiohttp 2.2.3

@fafhrd91
Member

You have to increase the number of available file descriptors.
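
As an illustration of that suggestion (not part of the original comment), here is a minimal sketch of how a script might inspect and raise its own soft limit on open file descriptors before starting the event loop. It uses only the standard-library `resource` module, which is Unix-only. The helper name `raise_nofile_limit` and the target of 4096 are assumptions chosen for the example; the right value depends on the system and workload.

```python
import resource  # Unix-only standard-library module


def raise_nofile_limit(target: int = 4096) -> int:
    """Try to raise the soft RLIMIT_NOFILE towards `target`.

    Returns the soft limit in effect afterwards. `target` is an
    illustrative default, not a recommended value.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process cannot raise the soft limit above the hard limit.
    wanted = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if soft != resource.RLIM_INFINITY and wanted > soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
        except (ValueError, OSError):
            # Some platforms reject the request; keep the existing limit.
            pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft limit now:", raise_nofile_limit())
```

This only changes the limit for the current process and its children. Raising the hard limit itself is a matter of OS configuration and is outside what the script can do.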

@samuelcolvin
Member Author

samuelcolvin commented Jul 14, 2017 via email

@fafhrd91
Member

I don't think we have anything actionable in this issue; this is not aiohttp related.

@samuelcolvin
Member Author

I think you should include more attributes on ClientConnectorError.

@samuelcolvin
Member Author

Currently you have to parse the error message to find the reason for the ClientConnectorError.

@fafhrd91
Member

The OS API returns just an error code (a number), and you get it as the first parameter.
https://github.com/aio-libs/aiohttp/blob/master/aiohttp/connector.py#L734

Specifically, "errno 24" is not a runtime error; it is an OS configuration problem.
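
To make the point about the error code concrete (this is an illustration, not part of the original comment), here is a small sketch of reading the numeric code instead of parsing the message. It assumes the exception behaves like a standard `OSError` whose first argument is the errno, as described above, and uses a plain `OSError` as a stand-in so that it runs without aiohttp. The helper names `describe_os_error` and `is_fd_exhaustion` are hypothetical.

```python
import errno


def describe_os_error(exc: OSError) -> str:
    """Return the symbolic errno name (e.g. 'EMFILE'), or 'UNKNOWN'."""
    if exc.errno is None:
        return "UNKNOWN"
    # errno.errorcode maps numeric codes to their symbolic names.
    return errno.errorcode.get(exc.errno, "UNKNOWN")


def is_fd_exhaustion(exc: OSError) -> bool:
    """True when the error means a file-descriptor limit was hit."""
    # EMFILE: per-process limit reached; ENFILE: system-wide limit reached.
    return exc.errno in (errno.EMFILE, errno.ENFILE)


# Stand-in for the exception a client would catch (assumption: same errno layout).
err = OSError(errno.EMFILE, "Too many open files")
print(describe_os_error(err), is_fd_exhaustion(err))
```

Codes that are not in the `errno` table, such as the negative resolver code `-2` seen in the output above, fall through to `'UNKNOWN'` in this sketch.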

@fafhrd91
Member

Would you provide a PR?

@samuelcolvin
Member Author

samuelcolvin commented Jul 17, 2017 via email

@asvetlov
Member

Fixed by #2122

@lock

lock bot commented Oct 28, 2019

This thread has been automatically locked since there has not been
any recent activity after it was closed. Please open a new issue for
related bugs.

If you feel like there are important points made in this discussion,
please include those excerpts in that new issue.

@lock lock bot added the outdated label Oct 28, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 28, 2019