types: support working with binary for Python 3
This is a breaking change.

Before this patch, both bytes and str were encoded as mp_str. It was
possible to work with utf and non-utf strings, but not with
varbinary (mp_bin) [1]. This patch adds varbinary support for Python 3
by default. Python 2 connector behavior remains the same.

Before this patch:

* encoding="utf-8" (default)

    Python 3 -> Tarantool          -> Python 3
    str      -> mp_str (string)    -> str
    bytes    -> mp_str (string)    -> str
                mp_bin (varbinary) -> bytes

* encoding=None

    Python 3 -> Tarantool          -> Python 3
    bytes    -> mp_str (string)    -> bytes
    str      -> mp_str (string)    -> bytes
                mp_bin (varbinary) -> bytes

Using bytes as a key was not supported by several methods (delete,
update, select).

After this patch:

* encoding="utf-8" (default)

    Python 3 -> Tarantool          -> Python 3
    str      -> mp_str (string)    -> str
    bytes    -> mp_bin (varbinary) -> bytes

* encoding=None

    Python 3 -> Tarantool          -> Python 3
    bytes    -> mp_str (string)    -> bytes
    str      -> mp_str (string)    -> bytes
                mp_bin (varbinary) -> bytes

Using bytes as a key is now supported by all methods.

Thus, an encoding="utf-8" connection may be used to work with
utf-8 strings and varbinary, and an encoding=None connection
may be used to work with non-utf-8 strings.
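
The post-patch mapping for Python 3 can be restated as a small sketch. The helper names `packed_as` and `unpacked_as` are hypothetical and are not part of the connector; they only encode the "After this patch" tables above.

```python
def packed_as(value, encoding="utf-8"):
    """Return the MessagePack type a Python 3 value is sent as (post-patch)."""
    if isinstance(value, str):
        # str is always sent as a string, in both modes.
        return "mp_str"
    if isinstance(value, bytes):
        # bytes become varbinary only when an encoding is configured.
        return "mp_str" if encoding is None else "mp_bin"
    raise TypeError("this sketch only covers str and bytes")


def unpacked_as(mp_type, encoding="utf-8"):
    """Return the Python 3 type a received MessagePack value decodes to."""
    if mp_type == "mp_bin":
        # varbinary is never decoded to text.
        return bytes
    if mp_type == "mp_str":
        # Without an encoding the connector cannot decode, so bytes come back.
        return bytes if encoding is None else str
    raise ValueError("this sketch only covers mp_str and mp_bin")


print(packed_as(b"\x00\x01"))                 # mp_bin
print(packed_as(b"\x00\x01", encoding=None))  # mp_str
```

In this reading, a value round-trips with its type intact in the default mode, while encoding=None returns bytes for everything.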

This patch adds no new restrictions (such as forbidding str in
encoding=None mode because the result may be confusing), so current
behavior is preserved (for example, passing a space name as str to
schema get_space).

1. tarantool/tarantool#4201

Closes #105
DifferentialOrange committed Apr 2, 2022
1 parent dd01017 commit 3d71614
Showing 6 changed files with 338 additions and 25 deletions.
48 changes: 48 additions & 0 deletions CHANGELOG.md
@@ -11,6 +11,54 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
(PR #192).

### Changed
- **Breaking**: change binary types encode/decode for Python 3
to support working with varbinary (PR #211, #105).
Python 2 connector behavior remains the same.

Before this patch:

* encoding="utf-8" (default)

| Python 3 | -> | Tarantool | -> | Python 3 |
|----------|----|--------------------|----|----------|
| str | -> | mp_str (string) | -> | str |
| bytes | -> | mp_str (string) | -> | str |
| | | mp_bin (varbinary) | -> | bytes |

* encoding=None

| Python 3 | -> | Tarantool | -> | Python 3 |
|----------|----|--------------------|----|----------|
| bytes | -> | mp_str (string) | -> | bytes |
| str | -> | mp_str (string) | -> | bytes |
| | | mp_bin (varbinary) | -> | bytes |

Using bytes as a key was not supported by several methods (delete,
update, select).

After this patch:

* encoding="utf-8" (default)

| Python 3 | -> | Tarantool | -> | Python 3 |
|----------|----|--------------------|----|----------|
| str | -> | mp_str (string) | -> | str |
| bytes | -> | mp_bin (varbinary) | -> | bytes |

* encoding=None

| Python 3 | -> | Tarantool | -> | Python 3 |
|----------|----|--------------------|----|----------|
| bytes | -> | mp_str (string) | -> | bytes |
| str | -> | mp_str (string) | -> | bytes |
| | | mp_bin (varbinary) | -> | bytes |

Using bytes as a key is now supported by all methods.

Thus, an encoding="utf-8" connection may be used to work with
utf-8 strings and varbinary, and an encoding=None connection
may be used to work with non-utf-8 strings.

- Clarify license of the project (BSD-2-Clause) (PR #210, #197).
- Migrate CI to GitHub Actions (PR #213, PR #216, #182).
- Various improvements and fixes in README (PR #210, PR #215).
28 changes: 25 additions & 3 deletions tarantool/request.py
@@ -4,6 +4,7 @@
Request types definitions
'''

import sys
import collections
import msgpack
import hashlib
@@ -84,8 +85,26 @@ def __init__(self, conn):
# The option controls whether to pack binary (non-unicode)
# string values as mp_bin or as mp_str.
#
# The default behaviour of the connector is to pack both
# bytes and Unicode strings as mp_str.
# The default behaviour of the Python 2 connector is to pack
# both bytes and Unicode strings as mp_str.
#
# The default behaviour of the Python 3 connector (since
# default encoding is "utf-8") is to pack bytes as mp_bin
# and Unicode strings as mp_str. encoding=None mode must
# be used to work with non-utf strings.
#
# encoding = 'utf-8'
#
# Python 3 -> Tarantool -> Python 3
# str -> mp_str (string) -> str
# bytes -> mp_bin (varbinary) -> bytes
#
# encoding = None
#
# Python 3 -> Tarantool -> Python 3
# bytes -> mp_str (string) -> bytes
# str -> mp_str (string) -> bytes
# mp_bin (varbinary) -> bytes
#
# msgpack-0.5.0 (and only this version) warns when the
# option is unset:
@@ -98,7 +117,10 @@ def __init__(self, conn):
# just always set it for all msgpack versions to get rid
# of the warning on msgpack-0.5.0 and to keep our
# behaviour on msgpack-1.0.0.
packer_kwargs['use_bin_type'] = False
if conn.encoding is None or sys.version_info.major == 2:
packer_kwargs['use_bin_type'] = False
else:
packer_kwargs['use_bin_type'] = True

self.packer = msgpack.Packer(**packer_kwargs)
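
To make the effect of `use_bin_type` concrete, here is a minimal stdlib-only sketch. `select_use_bin_type` mirrors the branch in the hunk above; `pack_short_bytes` is a hypothetical hand-rolled encoder (not msgpack's API) that follows the MessagePack format for payloads of at most 31 bytes, where the fixstr header is `0xa0 | length` and the bin 8 header is `0xc4` followed by a length byte.

```python
import sys


def select_use_bin_type(encoding, python_major=sys.version_info.major):
    """Mirror the diff: mp_bin only on Python 3 with an encoding set."""
    return not (encoding is None or python_major == 2)


def pack_short_bytes(payload, use_bin_type):
    """Encode a short bytes payload the way each mode would put it on the wire."""
    if not isinstance(payload, bytes) or len(payload) > 31:
        raise ValueError("sketch only handles bytes payloads of at most 31 bytes")
    if use_bin_type:
        # bin 8: 0xc4, one length byte, then the raw data (mp_bin).
        return bytes([0xC4, len(payload)]) + payload
    # fixstr: a single header byte 0xa0 | length, then the raw data (mp_str).
    return bytes([0xA0 | len(payload)]) + payload


print(pack_short_bytes(b"ab", select_use_bin_type("utf-8", 3)).hex())  # c4026162
print(pack_short_bytes(b"ab", select_use_bin_type(None, 3)).hex())     # a26162
```

The payload bytes are identical in both cases; only the type header differs, which is what lets Tarantool distinguish a string from a varbinary value.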

14 changes: 10 additions & 4 deletions tarantool/utils.py
@@ -6,7 +6,10 @@
if sys.version_info.major == 2:
string_types = (basestring, )
integer_types = (int, long)
supported_types = integer_types + string_types + (float,)

ENCODING_DEFAULT = None

if sys.version_info.minor < 6:
binary_types = (str, )
else:
@@ -17,10 +20,13 @@ def strxor(rhs, lhs):
return "".join(chr(ord(x) ^ ord(y)) for x, y in zip(rhs, lhs))

elif sys.version_info.major == 3:
binary_types = (bytes, )
string_types = (str, )
integer_types = (int, )
binary_types = (bytes, )
string_types = (str, )
integer_types = (int, )
supported_types = integer_types + string_types + binary_types + (float,)

ENCODING_DEFAULT = "utf-8"

from base64 import decodebytes as base64_decode

def strxor(rhs, lhs):
@@ -43,7 +49,7 @@ def check_key(*args, **kwargs):
elif args[0] is None and kwargs['select']:
return []
for key in args:
assert isinstance(key, integer_types + string_types + (float,))
assert isinstance(key, supported_types)
return list(args)
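
As a rough illustration of why bytes keys now pass the check on Python 3, the sketch below rebuilds the Python 3 `supported_types` tuple from the hunk above. `is_supported_key` is a hypothetical helper that returns a bool instead of asserting, and it does not reproduce the rest of `check_key`.

```python
# Python 3 type groups, as defined in the diff to tarantool/utils.py.
integer_types = (int,)
string_types = (str,)
binary_types = (bytes,)
supported_types = integer_types + string_types + binary_types + (float,)

# The tuple used by the assertion before the patch (no binary_types).
old_supported_types = integer_types + string_types + (float,)


def is_supported_key(key, types=supported_types):
    """Return True if a key part would satisfy the isinstance check."""
    return isinstance(key, types)


print(is_supported_key(b"\x00\x01"))                       # True
print(is_supported_key(b"\x00\x01", old_supported_types))  # False
```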


4 changes: 3 additions & 1 deletion test/suites/__init__.py
@@ -12,11 +12,13 @@
from .test_mesh import TestSuite_Mesh
from .test_execute import TestSuite_Execute
from .test_dbapi import TestSuite_DBAPI
from .test_encoding import TestSuite_Encoding

test_cases = (TestSuite_Schema_UnicodeConnection,
TestSuite_Schema_BinaryConnection,
TestSuite_Request, TestSuite_Protocol, TestSuite_Reconnect,
TestSuite_Mesh, TestSuite_Execute, TestSuite_DBAPI)
TestSuite_Mesh, TestSuite_Execute, TestSuite_DBAPI,
TestSuite_Encoding)

def load_tests(loader, tests, pattern):
suite = unittest.TestSuite()
82 changes: 65 additions & 17 deletions test/suites/lib/skip.py
@@ -1,20 +1,15 @@
import functools
import pkg_resources
import re
import sys

SQL_SUPPORT_TNT_VERSION = '2.0.0'


def skip_or_run_sql_test(func):
"""Decorator to skip or run SQL-related tests depending on the tarantool
def skip_or_run_test_tarantool(func, REQUIRED_TNT_VERSION, msg):
"""Decorator to skip or run tests depending on the tarantool
version.
Tarantool supports SQL-related stuff only since 2.0.0 version. So this
decorator should wrap every SQL-related test to skip it if the tarantool
version < 2.0.0 is used for testing.
Also, it can be used with the 'setUp' method for skipping the whole test
suite.
Also, it can be used with the 'setUp' method for skipping
the whole test suite.
"""

@functools.wraps(func)
@@ -28,16 +23,69 @@ def wrapper(self, *args, **kwargs):
).group()

tnt_version = pkg_resources.parse_version(self.tnt_version)
sql_support_tnt_version = pkg_resources.parse_version(
SQL_SUPPORT_TNT_VERSION
)
support_version = pkg_resources.parse_version(REQUIRED_TNT_VERSION)

if tnt_version < sql_support_tnt_version:
self.skipTest(
'Tarantool %s does not support SQL' % self.tnt_version
)
if tnt_version < support_version:
self.skipTest('Tarantool %s %s' % (self.tnt_version, msg))

if func.__name__ != 'setUp':
func(self, *args, **kwargs)

return wrapper


def skip_or_run_test_python_major(func, REQUIRED_PYTHON_MAJOR, msg):
"""Decorator to skip or run tests depending on the Python major
version.
Also, it can be used with the 'setUp' method for skipping
the whole test suite.
"""

@functools.wraps(func)
def wrapper(self, *args, **kwargs):
if func.__name__ == 'setUp':
func(self, *args, **kwargs)

major = sys.version_info.major
if major != REQUIRED_PYTHON_MAJOR:
self.skipTest('Python %s connector %s' % (major, msg))

if func.__name__ != 'setUp':
func(self, *args, **kwargs)

return wrapper


def skip_or_run_sql_test(func):
"""Decorator to skip or run SQL-related tests depending on the
tarantool version.
Tarantool supports SQL-related stuff only since 2.0.0 version.
So this decorator should wrap every SQL-related test to skip it if
the tarantool version < 2.0.0 is used for testing.
"""

return skip_or_run_test_tarantool(func, '2.0.0', 'does not support SQL')


def skip_or_run_varbinary_test(func):
"""Decorator to skip or run VARBINARY-related tests depending on
the tarantool version.
Tarantool supports VARBINARY type only since 2.2.1 version.
See https://github.com/tarantool/tarantool/issues/4201
"""

return skip_or_run_test_tarantool(func, '2.2.1',
'does not support VARBINARY type')


def skip_or_run_mp_bin_test(func):
"""Decorator to skip or run mp_bin-related tests depending on
the Python version.
The Python 2 connector does not support mp_bin.
"""

return skip_or_run_test_python_major(func, 3, 'does not support mp_bin')
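
The skip-by-Python-version idea can be tried in isolation with the sketch below. It is a simplified variant, not the code from this commit: `skip_unless_python_major` is a hypothetical decorator factory, it omits the special handling of `setUp`, and `ExampleTest` and `run_example` exist only for the demonstration.

```python
import functools
import sys
import unittest


def skip_unless_python_major(required_major, msg):
    """Build a decorator that skips a test on other Python major versions."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            major = sys.version_info.major
            if major != required_major:
                # skipTest raises unittest.SkipTest, so func never runs.
                self.skipTest('Python %s connector %s' % (major, msg))
            return func(self, *args, **kwargs)
        return wrapper
    return decorator


class ExampleTest(unittest.TestCase):
    @skip_unless_python_major(3, 'does not support mp_bin')
    def test_bytes_value(self):
        self.assertIsInstance(b"\x00", bytes)

    @skip_unless_python_major(2, 'is required for this legacy check')
    def test_python2_only(self):
        self.fail("should be skipped on Python 3")


def run_example():
    """Run ExampleTest and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

On a Python 3 interpreter, `run_example()` should report one executed test and one skipped test, with the skip reason built from the message passed to the decorator.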