Introduce ConnectionPool with master discovery #207

Merged: 3 commits merged on Apr 20, 2022


DifferentialOrange
Member

@DifferentialOrange DifferentialOrange commented Jan 20, 2022

python: drop Python 2 support

Python 2.7 reached the end of its life on January 1st, 2020 [1]. Since
it would be a waste to ignore several Python 3.x features in master
discovery implementation, we decided to drop Python 2 support here.

Cleanup of the remaining Python 2 workarounds is expected to be done as
part of #212.

  1. https://www.python.org/doc/sunset-python-2/

connection: introduce common interface

Introduce a connection interface to be used in the connection pool
implementation. The interface requires only the CRUD and basic
connect/close API.
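
To illustrate what such an interface could look like, here is a minimal sketch built on Python's `abc` module. The names `ConnectionInterface` and `InMemoryConnection`, the method signatures, and the choice of a CRUD subset (update and upsert are left out for brevity) are assumptions made for this example, not the code introduced by this PR.

```python
from abc import ABC, abstractmethod


class ConnectionInterface(ABC):
    """Hypothetical interface: base connect/close plus a CRUD subset."""

    @abstractmethod
    def connect(self):
        """Open the connection."""

    @abstractmethod
    def close(self):
        """Close the connection."""

    @abstractmethod
    def insert(self, space, values):
        """Insert a tuple; fail if the key already exists."""

    @abstractmethod
    def select(self, space, key):
        """Return a list of tuples matching the key."""

    @abstractmethod
    def replace(self, space, values):
        """Insert a tuple or overwrite the one with the same key."""

    @abstractmethod
    def delete(self, space, key):
        """Delete the tuple with the given key."""


class InMemoryConnection(ConnectionInterface):
    """Toy stand-in (not a Tarantool client); the first field is the key."""

    def __init__(self):
        self.connected = False
        self._spaces = {}

    def connect(self):
        self.connected = True

    def close(self):
        self.connected = False

    def insert(self, space, values):
        rows = self._spaces.setdefault(space, {})
        if values[0] in rows:
            raise KeyError(f'duplicate key {values[0]!r}')
        rows[values[0]] = tuple(values)
        return [rows[values[0]]]

    def select(self, space, key):
        rows = self._spaces.get(space, {})
        return [rows[key]] if key in rows else []

    def replace(self, space, values):
        self._spaces.setdefault(space, {})[values[0]] = tuple(values)
        return [tuple(values)]

    def delete(self, space, key):
        row = self._spaces.get(space, {}).pop(key, None)
        return [row] if row is not None else []
```

Any class that implements all abstract methods can then be used wherever the pool expects a connection, while instantiating the interface itself raises `TypeError`.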

Part of #196

connection_pool: introduce connection pool

Introduce the ConnectionPool class to work with a cluster of Tarantool
instances. ConnectionPool supports master discovery and ro/rw-based
requests, so it is most useful when working with a single replicaset of
instances. ConnectionPool is supported only for Python 3.7 or newer.
The authenticated user must be able to call box.info on instances.

ConnectionPool updates information about each server's state (RO/RW)
on initial connect and then asynchronously in separate threads.
Application retries must be written with the asynchronous nature of
cluster state refresh in mind. The user does not need any synchronization
mechanisms in requests: ConnectionPool methods handle all of it.
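
Because the state refresh is asynchronous, a request may reach an instance whose role has just changed. A minimal sketch of an application-side retry helper follows. The name `retry_request` and its `attempts` and `delay` parameters are hypothetical, and catching bare `Exception` is a simplification: real code should catch only the connector errors it knows how to handle.

```python
import time


def retry_request(request, attempts=3, delay=0.1):
    """Call request() up to `attempts` times, sleeping `delay` seconds between tries."""
    if attempts < 1:
        raise ValueError('attempts must be at least 1')
    for attempt in range(attempts):
        try:
            return request()
        except Exception:
            # Re-raise on the last attempt; otherwise wait so that the
            # background refresh has a chance to pick up the new roles.
            if attempt == attempts - 1:
                raise
            time.sleep(delay)
```

With the pool from this PR, `request` could be something like `lambda: pool.call('some_write_procedure', arg, mode=tarantool.Mode.RW)`. A `delay` comparable to the pool's refresh period is probably a sensible starting point.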

The ConnectionPool API is the same as the plain Connection API.
On each request, a connection is chosen to execute this request.
The connection is selected based on the request mode:

  • Mode.ANY chooses any instance.
  • Mode.RW chooses an RW instance.
  • Mode.RO chooses an RO instance.
  • Mode.PREFER_RW chooses an RW instance if possible, an RO instance
    otherwise.
  • Mode.PREFER_RO chooses an RO instance if possible, an RW instance
    otherwise.

All requests that are guaranteed to write (insert, replace, delete,
upsert, update) use RW mode by default. select uses ANY by default. You
can set the mode explicitly. call, eval, execute and ping requests
require the mode to be set explicitly.

Example:

pool.call('some_write_procedure', arg, mode=tarantool.Mode.RW)
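
The mode semantics listed above can be sketched as a small selection function. This is only an illustration: the local `Mode` enum and `choose_instance` are hypothetical stand-ins rather than the connector's code, and always returning the first candidate is an assumption (a real pool would likely balance between candidates).

```python
from enum import Enum


class Mode(Enum):
    """Local stand-in for the request modes described above."""
    ANY = 1
    RW = 2
    RO = 3
    PREFER_RW = 4
    PREFER_RO = 5


def choose_instance(instances, mode):
    """Pick an address from {address: is_ro} according to the request mode."""
    rw = [addr for addr, is_ro in instances.items() if not is_ro]
    ro = [addr for addr, is_ro in instances.items() if is_ro]
    candidates = {
        Mode.ANY: list(instances),
        Mode.RW: rw,
        Mode.RO: ro,
        Mode.PREFER_RW: rw or ro,   # fall back to RO if no RW instance is known
        Mode.PREFER_RO: ro or rw,   # fall back to RW if no RO instance is known
    }[mode]
    if not candidates:
        raise LookupError(f'no instance satisfies {mode.name}')
    return candidates[0]
```

The strict modes fail when no matching instance is known, while the PREFER_* modes degrade to the other role.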

Closes #196


@DifferentialOrange DifferentialOrange changed the title from "Differential orange/gh 196 master discovery" to "Master discovery" Jan 20, 2022
@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-196-master-discovery branch from 67f85d1 to c960ade Compare February 21, 2022 12:05
@Mons

Mons commented Feb 24, 2022

Never look at box.cfg.read_only. Look only at box.info.ro

@DifferentialOrange
Member Author

Never look at box.cfg.read_only. Look only at box.info.ro

It was used only in tests; reworked.
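
For reference, a sketch of reading the role from `box.info.ro` rather than from `box.cfg.read_only`. The helper accepts any object with a `call` method, and the response is modeled as a sequence whose first element is the `box.info` table. That response shape, `is_read_only` and `FakeConnection` are assumptions for this example, so the real connector's response type should be checked before reusing it.

```python
def is_read_only(conn):
    """Return the instance's effective read-only state as reported by box.info.ro."""
    response = conn.call('box.info')
    info = response[0]  # assumed: the first returned value is the box.info table
    # box.info.ro reflects the actual state (for example, an orphan instance
    # is read-only), which box.cfg.read_only alone does not show.
    return bool(info['ro'])


class FakeConnection:
    """Test double that answers box.info calls without a server."""

    def __init__(self, ro):
        self._ro = ro

    def call(self, func_name, *args):
        if func_name != 'box.info':
            raise ValueError(f'unexpected call: {func_name}')
        return [{'ro': self._ro, 'status': 'running'}]
```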

@DifferentialOrange
Member Author

DifferentialOrange commented Feb 25, 2022

Alternative approaches

Synchronous on request (on errors)

Solution idea: refresh schema (rw/ro info, replication state) on connect, on an RO error, or on a network error.

Pros

  • Since the connector itself is synchronous, no new complicated mechanisms need to be introduced.
  • Retries due to schema change are easy to implement and effective.

Cons

  • If there are no errors (for example, if we use only selects), the schema will never be refreshed.
  • Reloads block requests, so latency for some of them will be high.
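
A minimal sketch of this approach, assuming the cluster state can be obtained from a `probe` callable that returns `{address: is_ro}`. `RefreshOnErrorRouter`, `ReadOnlyError` and `probe` are hypothetical names, and the single retry after refresh is an illustration choice.

```python
class ReadOnlyError(Exception):
    """Hypothetical error raised when a write reaches an RO instance."""


class RefreshOnErrorRouter:
    """Refresh the cluster state on connect and whenever a request fails."""

    def __init__(self, probe):
        self._probe = probe        # callable returning {address: is_ro}
        self.state = probe()       # refresh on connect
        self.refresh_count = 1

    def _refresh(self):
        self.state = self._probe()
        self.refresh_count += 1

    def _rw_instance(self):
        for addr, is_ro in self.state.items():
            if not is_ro:
                return addr
        raise LookupError('no RW instance known')

    def write(self, request):
        """Run request(address) on an RW instance, retrying once after a refresh."""
        try:
            return request(self._rw_instance())
        except (ReadOnlyError, ConnectionError, LookupError):
            # The refresh blocks this request, which is the latency cost
            # listed in the cons above.
            self._refresh()
            return request(self._rw_instance())
```

As long as requests succeed, `refresh_count` never grows, which is exactly the first drawback: a select-only workload would keep using stale state.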

Synchronous on request (with timeout)

Solution idea: refresh schema (rw/ro info, replication state) before a request if X milliseconds have passed since the last refresh.

Pros

  • Since the connector itself is synchronous, no new complicated mechanisms need to be introduced.
  • The schema will be refreshed even if requests produce no errors (for example, if we use only selects).

Cons

  • No native retries on ro/rw or network errors. If we send an RW request to an RO instance, the request must be retried for a full timeout before it succeeds.
  • Reloads block requests, so latency for some of them (one request every X milliseconds) will be high.
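
The timeout variant could be sketched as follows, again with a hypothetical `probe` callable returning `{address: is_ro}`. `TimedRefreshRouter`, `refresh_interval` (in seconds) and the injectable `clock` are assumptions made so the example can be exercised without waiting.

```python
import time


class TimedRefreshRouter:
    """Refresh the cluster state before a request once the interval has elapsed."""

    def __init__(self, probe, refresh_interval, clock=time.monotonic):
        self._probe = probe                # callable returning {address: is_ro}
        self._interval = refresh_interval  # seconds between refreshes
        self._clock = clock
        self.state = probe()               # refresh on connect
        self._last_refresh = clock()
        self.refresh_count = 1

    def _refresh_if_stale(self):
        now = self._clock()
        if now - self._last_refresh >= self._interval:
            # This reload happens inside the request path and blocks it.
            self.state = self._probe()
            self._last_refresh = now
            self.refresh_count += 1

    def rw_instance(self):
        """Return the address of an RW instance according to the cached state."""
        self._refresh_if_stale()
        for addr, is_ro in self.state.items():
            if not is_ro:
                return addr
        raise LookupError('no RW instance known')
```

Until the interval elapses, the router keeps returning the old master, which is why a failed RW request may have to be retried for up to a full timeout.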

Asynchronous

Solution idea: refresh schema (rw/ro info, replication state) in a separate thread every X milliseconds.

Pros

  • Reloads are non-blocking for requests.
  • The schema will be refreshed even if requests produce no errors (for example, if we use only selects).

Cons

  • Since the connector itself is synchronous, we will need to introduce async mechanisms. Schema refresh will require twice as many connections (x2); otherwise the single set of connections (x1) needs synchronization primitives, which leads to blocked requests.
  • No native retries on ro/rw or network errors. If we send an RW request to an RO instance, the request must be retried for a full timeout before it succeeds.
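
A minimal sketch of the asynchronous variant using a single background thread from the standard `threading` module. `BackgroundRefresher`, `probe` and `refresh_interval` (in seconds) are hypothetical, and one thread for the whole cluster is a simplification of whatever the real implementation does.

```python
import threading


class BackgroundRefresher:
    """Keep a cached {address: is_ro} state fresh from a background thread."""

    def __init__(self, probe, refresh_interval):
        self._probe = probe                # callable returning {address: is_ro}
        self._interval = refresh_interval  # seconds between refreshes
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._state = probe()              # initial refresh on connect
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Event.wait returns False on timeout and True once close() is called.
        while not self._stop.wait(self._interval):
            fresh = self._probe()          # probing happens outside the lock
            with self._lock:
                self._state = fresh

    def snapshot(self):
        """Return a copy of the latest known state without waiting for a probe."""
        with self._lock:
            return dict(self._state)

    def close(self):
        self._stop.set()
        self._thread.join()
```

Requests only take the lock for the duration of a dictionary copy, so reloads do not block them. The cost is that a request may still see a state up to one interval old.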

What solution should we choose?

Solutions may be hybrid (1+2 or 1+3), and I think a hybrid would be the best way to cover more cases. Personally, I prefer the synchronous on-request approach with refresh on error plus timeout: the only drawback is increased latency for some requests, but it is rather simple to implement compared to introducing async. It is much harder to combine synchronous refresh on error with the async approach, and a pure async approach would pass the responsibility for implementing non-trivial retry logic to app developers.

@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-196-master-discovery branch 2 times, most recently from 60df77b to ff68de5 Compare March 17, 2022 12:37
@DifferentialOrange DifferentialOrange changed the title from "Master discovery" to "Introduce CnnectionPool with master discovery" Mar 17, 2022
@DifferentialOrange DifferentialOrange changed the title from "Introduce CnnectionPool with master discovery" to "Introduce ConnectionPool with master discovery" Mar 17, 2022
@DifferentialOrange DifferentialOrange changed the base branch from master to DifferentialOrange/gh-105-unpack-binary March 17, 2022 14:59
@DifferentialOrange DifferentialOrange marked this pull request as ready for review March 17, 2022 14:59
@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-196-master-discovery branch from 59d1095 to d16c16d Compare March 23, 2022 06:44
@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-105-unpack-binary branch from 6aa89a6 to 29c4ba2 Compare March 23, 2022 06:45
@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-196-master-discovery branch from d16c16d to 32190f3 Compare March 23, 2022 06:45
@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-105-unpack-binary branch 2 times, most recently from 760e1cf to 8772bc7 Compare March 23, 2022 07:58
@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-105-unpack-binary branch from 8772bc7 to 155b1d7 Compare March 23, 2022 08:12
@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-196-master-discovery branch 5 times, most recently from 0bd9fb5 to 6c2bac7 Compare March 23, 2022 11:43
@Totktonada
Member

All requests that are guaranteed to write (insert, replace, delete,
upsert, update) use RW mode.

An RO instance can write to a replica-local or a temporary space. Admittedly, it is strange to write to a replica-local space on some RO instance. However, there may be a use case: say, registering a task to be processed in the background. I think we should have good defaults, but still allow the user to choose.

@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-105-unpack-binary branch 3 times, most recently from 4ec6748 to 88fa990 Compare March 31, 2022 16:00
Member

@Totktonada Totktonada left a comment

LGTM aside from several minor comments.

Please finish the review with Anastasia and proceed.

@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-196-master-discovery branch 8 times, most recently from 39f5a6b to aa0b5b9 Compare April 20, 2022 11:58

@AnaNek AnaNek left a comment

LGTM.

@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-196-master-discovery branch 2 times, most recently from 150751a to f4087ad Compare April 20, 2022 12:47
Introduce the ConnectionPool class to work with a cluster of Tarantool
instances. ConnectionPool supports master discovery and ro/rw-based
requests, so it is most useful when working with a single replicaset of
instances. ConnectionPool is supported only for Python 3.7 or newer.
The authenticated user must be able to call `box.info` on instances.

ConnectionPool updates information about each server's state (RO/RW)
on initial connect and then asynchronously in separate threads.
Application retries must be written with the asynchronous nature of
cluster state refresh in mind. The user does not need any synchronization
mechanisms in requests: ConnectionPool methods handle all of it.

The ConnectionPool API is the same as the plain Connection API.
On each request, a connection is chosen to execute this request.
The connection is chosen based on the request mode:
* Mode.ANY chooses any instance.
* Mode.RW chooses an RW instance.
* Mode.RO chooses an RO instance.
* Mode.PREFER_RW chooses an RW instance if possible, an RO instance
  otherwise.
* Mode.PREFER_RO chooses an RO instance if possible, an RW instance
  otherwise.

All requests that are guaranteed to write (insert, replace, delete,
upsert, update) use RW mode by default. select uses ANY by default. You
can set the mode explicitly. call, eval, execute and ping requests
require the mode to be set explicitly.

Example:

  pool = tarantool.ConnectionPool(
      addrs=[
          {'host': '108.177.16.0', 'port': 3301},
          {'host': '108.177.16.0', 'port': 3302},
      ],
      user='test',
      password='test',
  )

  pool.call('some_write_procedure', arg, mode=tarantool.Mode.RW)

Closes #196
@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-196-master-discovery branch from f4087ad to 2546490 Compare April 20, 2022 12:53
@DifferentialOrange DifferentialOrange merged commit fb7f9a3 into master Apr 20, 2022
@DifferentialOrange DifferentialOrange deleted the DifferentialOrange/gh-196-master-discovery branch April 20, 2022 13:27
DifferentialOrange added a commit to tarantool/doc that referenced this pull request May 11, 2022
Since the release of tarantool/python 0.8.0 [1] several things have
changed.

* Issue #105 has been fixed [2].
* CI has been migrated to GitHub Actions [3].
* New connection pool (ConnectionPool) with master discovery
  was introduced [4].
* old connection pool (MeshConnection) with round-robin failover was
  deprecated [4].

These changes together with GitHub stars update are introduced with this
patch.

1. https://github.com/tarantool/tarantool-python/releases/tag/0.8.0
2. tarantool/tarantool-python#211
3. tarantool/tarantool-python#213
4. tarantool/tarantool-python#207
DifferentialOrange added a commit to tarantool/doc that referenced this pull request May 11, 2022
Since the release of tarantool-python 0.8.0 [1] several things have
changed.

* Issue tarantool/tarantool-python#105 has been fixed [2].
* CI has been migrated to GitHub Actions [3].
* New connection pool (ConnectionPool) with master discovery
  was introduced [4].
* old connection pool (MeshConnection) with round-robin failover was
  deprecated [4].

These changes together with GitHub stars update are introduced with this
patch.

1. https://github.com/tarantool/tarantool-python/releases/tag/0.8.0
2. tarantool/tarantool-python#211
3. tarantool/tarantool-python#213
4. tarantool/tarantool-python#207
patiencedaur pushed a commit to tarantool/doc that referenced this pull request May 16, 2022
patiencedaur added a commit to tarantool/doc that referenced this pull request May 16, 2022
* Update python connector comparison table

Since the release of tarantool-python 0.8.0 [1] several things have
changed.

* Issue tarantool/tarantool-python#105 has been fixed [2].
* CI has been migrated to GitHub Actions [3].
* New connection pool (ConnectionPool) with master discovery
  was introduced [4].
* old connection pool (MeshConnection) with round-robin failover was
  deprecated [4].

These changes together with GitHub stars update are introduced with this
patch.

1. https://github.com/tarantool/tarantool-python/releases/tag/0.8.0
2. tarantool/tarantool-python#211
3. tarantool/tarantool-python#213
4. tarantool/tarantool-python#207

* Update translation

Co-authored-by: Patience Daur <[email protected]>
Development

Successfully merging this pull request may close these issues.

Automatic master discovery
4 participants