Keeps requesting even when the account is rate-limited #27

Closed
TNumFive opened this issue Jul 5, 2023 · 5 comments

TNumFive commented Jul 5, 2023

QueueClient switches context regardless of whether the current account is rate-limited, and the context switch itself unlocks that account. This causes the script to constantly cycle through all loaded accounts, rate-limited or not.

How about keeping the lock in place when switching context, to prevent picking up an account that is already rate-limited? For example:

# The imports below reference twscrape internals; the module paths are my guess
# and may differ between versions.
from twscrape.account import Account
from twscrape.accounts_pool import AccountsPool
from twscrape.db import execute, fetchone
from twscrape.utils import utc_ts


class RemainLocked(AccountsPool):
    async def unlock(self, username: str, queue: str, req_count=0):
        # Update the request stats and last_used, but keep the lock in place,
        # so a rate-limited account is not handed out again right away.
        qs = f"""
        UPDATE accounts SET
            stats = json_set(stats, '$.{queue}', COALESCE(json_extract(stats, '$.{queue}'), 0) + {req_count}),
            last_used = datetime({utc_ts()}, 'unixepoch')
        WHERE username = :username
        """
        await execute(self._db_file, qs, {"username": username})

    async def get_for_queue(self, queue: str):
        # Only return accounts whose lock for this queue is absent or expired.
        qs = f"""
        SELECT * FROM accounts
        WHERE active = true AND (
            locks IS NULL
            OR json_extract(locks, '$.{queue}') IS NULL
            OR json_extract(locks, '$.{queue}') < datetime('now')
        )
        ORDER BY RANDOM()
        LIMIT 1
        """
        rs = await fetchone(self._db_file, qs)
        return Account.from_rs(rs) if rs else None
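
For illustration, the subclass could be wired in roughly like this. Both the API(pool) constructor argument and the "accounts.db" path are assumptions for the sake of the example, not taken from the thread; adjust to however your twscrape version builds its pool.

import asyncio

from twscrape import API


async def main():
    # Hypothetical wiring: use the stricter pool instead of the default one.
    pool = RemainLocked("accounts.db")
    api = API(pool)

    # Accounts that hit a rate limit for a queue now stay locked until the
    # limit window expires instead of being picked up again immediately.
    async for tweet in api.search("python"):
        print(tweet.id)


if __name__ == "__main__":
    asyncio.run(main())
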
vladkens (Owner) commented Jul 5, 2023

Hi, @TNumFive.

Thanks for the issue. That was a bug. Fixed in 0.4.1.

TNumFive (Author) commented Jul 6, 2023

After the update, the ctx never gets closed:

async def _get_ctx(self) -> Ctx:
    if self.ctx:
        return self.ctx
    # **this if-statement will never be executed**
    if self.ctx is not None:
        await self._close_ctx()

    acc = await self.pool.get_for_queue_or_wait(self.queue)
    clt = acc.make_client()
    self.ctx = Ctx(acc, clt)
    return self.ctx
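
A minimal fix could look like this (just a sketch of what I mean; the actual change may differ):

async def _get_ctx(self) -> Ctx:
    # Reuse the existing context if there is one; otherwise acquire a fresh
    # account and build a new client. The unreachable branch is dropped.
    if self.ctx:
        return self.ctx

    acc = await self.pool.get_for_queue_or_wait(self.queue)
    clt = acc.make_client()
    self.ctx = Ctx(acc, clt)
    return self.ctx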

And there is an exception like this when I break out of the async for loop (this problem existed before the update):

async for tweet in api.user_tweets(user.id):
    if tweet.id < last_tweet_id:
        break
Task was destroyed but it is pending!
task: <Task pending name='Task-7' coro=<<async_generator_athrow without __name__>()>>
Exception ignored in: <coroutine object QueueClient.__aexit__ at 0x7f6e7930bc30>
Traceback (most recent call last):
  File "/home/ubuntu/Documents/twscrape/twscrape/queue_client.py", line 41, in __aexit__
    await self._close_ctx()
  File "/home/ubuntu/Documents/twscrape/twscrape/queue_client.py", line 46, in _close_ctx
    await self.ctx.clt.aclose()
  File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/httpx/_client.py", line 1974, in aclose
    await proxy.aclose()
  File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/httpx/_transports/default.py", line 365, in aclose
    await self._pool.aclose()
  File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 311, in aclose
    async with self._pool_lock:
  File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/httpcore/_synchronization.py", line 66, in __aexit__
    self._anyio_lock.release()
  File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/anyio/_core/_synchronization.py", line 165, in release
    if self._owner_task != get_current_task():
  File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/anyio/_core/_testing.py", line 64, in get_current_task
    return get_asynclib().get_current_task()
  File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/anyio/_core/_eventloop.py", line 149, in get_asynclib
    asynclib_name = sniffio.current_async_library()
  File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/sniffio/_impl.py", line 93, in current_async_library
    raise AsyncLibraryNotFoundError(
sniffio._impl.AsyncLibraryNotFoundError: unknown async library, or not in async context

Could it be that this is due to the ctx not being closed properly?

vladkens (Owner) commented Jul 6, 2023

Hi again, @TNumFive.

That's just how Python works. Here's a simple example:

import asyncio

class Ctx:
    async def __aenter__(self):
        print("entered")
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        print("exited")

async def gen():
    async with Ctx() as ctx:
        for x in range(1, 5):
            await asyncio.sleep(0.1)
            yield x

async def main():
    async for i in gen():
        print(i)
        if i == 3:
            break

    print("STOP")

if __name__ == "__main__":
    asyncio.run(main())

Output:

entered
1
2
3
STOP
exited

So __aexit__ is only called at some later point, when the abandoned generator gets finalized, not at the moment of the break.

If you want to stop iteration with break and have the generator cleaned up immediately, you need to wrap it in an async with block using contextlib.aclosing, like:

async def main():
    from contextlib import aclosing

    async with aclosing(gen()) as g:
        async for i in g:
            print(i)
            if i == 3:
                break

    print("STOP")

With this, the generator is closed right at the break:

entered
1
2
3
exited
STOP

In your case it would look like this:

from contextlib import aclosing

async with aclosing(api.user_tweets(user.id)) as gen:
    async for tweet in gen:
        if tweet.id < last_tweet_id:
            break
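
Note that contextlib.aclosing was only added in Python 3.10. On older versions you can get the same effect by closing the generator manually, for example:

gen = api.user_tweets(user.id)
try:
    async for tweet in gen:
        if tweet.id < last_tweet_id:
            break
finally:
    # Closing the async generator runs the cleanup inside it (the __aexit__
    # of its async with block) immediately instead of at some later point.
    await gen.aclose()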

vladkens (Owner) commented Jul 6, 2023

TNumFive (Author) commented Jul 6, 2023

That's very nice of you!

I didn't know about the contextlib stuff, and I just found that you fixed the _get_ctx() problem as well.
