pgtests test_emoticons_as_literal fails #1004

bkline · 2022-01-04T15:07:13Z

Environment

Python: 3.10.1
pyodbc: built from current HEAD of master branch
OS: macOS 11.6.2 (for client) Debian 10.2.1-6 (for server)
DB: PostgreSQL 14.1
driver: psqlodbcw.so 13.02.0000 (supports ODBC version 03.51)
unixODBC 2.3.9

Issue

The test_emoticons_as_literal() test fails for the pgtests.py suite.

======================================================================
FAIL: test_emoticons_as_literal (__main__.PGTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/bkline/repos/pyodbc/tests3/pgtests.py", line 625, in test_emoticons_as_literal
    self.assertEqual(result, v)
AssertionError: 'x � z' != 'x 🌜 z'
- x � z
?   ^
+ x 🌜 z
?   ^

This is similar to the failure reported for another test suite by #536. Interestingly, if I run the tests with the --ansi flag set, this test passes, but the test_output_conversion test fails, because apparently setting ansi=True in the pyodbc.connect() call results for some reason in the return value from the local convert() function not being used (or perhaps that local function isn't invoked at all)

======================================================================
FAIL: test_output_conversion (__main__.PGTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/bkline/repos/pyodbc/tests3/pgtests.py", line 671, in test_output_conversion
    self.assertEqual(value, 'X123.45X')
AssertionError: '123.45' != 'X123.45X'
- 123.45
+ X123.45X
? +      +

However, if I comment out the setencoding() call:

diff --git a/tests3/pgtests.py b/tests3/pgtests.py
index dcb03a9..5dfacc9 100755
--- a/tests3/pgtests.py
+++ b/tests3/pgtests.py
@@ -70,7 +70,7 @@ class PGTestCase(unittest.TestCase):
 
         # I've set my test database to use UTF-8 which seems most popular.
         self.cnxn.setdecoding(pyodbc.SQL_WCHAR, encoding='utf-8')
-        self.cnxn.setencoding(encoding='utf-8')
+        # self.cnxn.setencoding(encoding='utf-8')
 
         # As of psql 9.5.04 SQLGetTypeInfo returns absurdly small sizes leading
         # to slow writes.  Override them:

... and omit the --ansi flag in the test suite invocation, then all the tests pass (except for the error reported by #1003).

Library: /Users/bkline/repos/pyodbc/build/lib.macosx-11-x86_64-3.10/pyodbc.cpython-310-darwin.so
/Users/bkline/repos/pyodbc/tests3 --> /Users/bkline/repos/pyodbc
/Users/bkline/repos/pyodbc/tests3/pgtests.py:519: DeprecationWarning: PyUnicode_FromUnicode(NULL, size) is deprecated; use PyUnicode_New() instead
  result = str(row)
======================================================================
ERROR: test_columns (__main__.PGTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/bkline/repos/pyodbc/tests3/pgtests.py", line 575, in test_columns
    assert row.precision == 3, row.precision
AttributeError: 'pyodbc.Row' object has no attribute 'precision'

----------------------------------------------------------------------
Ran 79 tests in 4.181s

FAILED (errors=1)

The text was updated successfully, but these errors were encountered:

gordthompson · 2022-01-04T19:17:22Z

Try running the tests with the -v switch, e.g,

$ python3 tests3/pgtests.py -v

That will show details about the test environment, specifically the driver name and version

driver:  psqlodbcw.so 12.01.0000

What does it display for you?

bkline · 2022-01-04T20:12:46Z

Appreciate your following up on these tickets so promptly. Here the verbose output.

% python3 tests3/pgtests.py -t test_emoticons_as_literal -vv
Library: /Users/bkline/repos/pyodbc/build/lib.macosx-11-x86_64-3.10/pyodbc.cpython-310-darwin.so
/Users/bkline/repos/pyodbc/tests3 --> /Users/bkline/repos/pyodbc
python:  3.10.1 (main, Dec  6 2021, 23:20:29) [Clang 13.0.0 (clang-1300.0.29.3)]
pyodbc:  4.0.33b6 /Users/bkline/repos/pyodbc/build/lib.macosx-11-x86_64-3.10/pyodbc.cpython-310-darwin.so
odbc:    03.52
driver:  psqlodbcw.so 13.02.0000
         supports ODBC version 03.51
os:      Darwin
unicode: Py_Unicode=4 SQLWCHAR=2
Max VARCHAR = 255
Max WVARCHAR = 255
Max BINARY = (not supported)
test_emoticons_as_literal (__main__.PGTestCase) ... FAIL

======================================================================
FAIL: test_emoticons_as_literal (__main__.PGTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/bkline/repos/pyodbc/tests3/pgtests.py", line 625, in test_emoticons_as_literal
    self.assertEqual(result, v)
AssertionError: 'x � z' != 'x 🌜 z'
- x � z
?   ^
+ x 🌜 z
?   ^


----------------------------------------------------------------------
Ran 1 test in 0.066s

FAILED (failures=1)

gordthompson · 2022-01-04T20:30:14Z

Hmmm. As in the case of #1003 this one works for me with Python 3.10 as well.

$ python3.10 tests3/pgtests.py -t test_emoticons_as_literal -vv
Library: /home/gord/git/pyodbc/build/lib.linux-x86_64-3.10/pyodbc.cpython-310-x86_64-linux-gnu.so
/home/gord/git/pyodbc/tests3 --> /home/gord/git/pyodbc
python:  3.10.1 (main, Dec 21 2021, 17:46:38) [GCC 9.3.0]
pyodbc:  4.0.33b7+test_with_dsn /home/gord/git/pyodbc/build/lib.linux-x86_64-3.10/pyodbc.cpython-310-x86_64-linux-gnu.so
odbc:    03.52
driver:  psqlodbcw.so 12.01.0000
         supports ODBC version 03.51
os:      Linux
unicode: Py_Unicode=4 SQLWCHAR=2
Max VARCHAR = 255
Max WVARCHAR = 255
Max BINARY = (not supported)
test_emoticons_as_literal (__main__.PGTestCase) ... ok

----------------------------------------------------------------------
Ran 1 test in 1.813s

OK

bkline · 2022-01-04T20:38:52Z

On Windows, too, apparently (given that my recent PR passed all the pipeline tests). So the tests are just broken on macOS (at least with the versions of the drivers and unixODBC that I'm running).

bkline · 2022-01-04T21:33:07Z

Given the recent comments in #1003 it doesn't seem unreasonable to wonder if the fact that we're running different versions of the psqlodbc driver is a more likely explanation of the discrepancy between our results for this test, than the different operating systems on which we're running. I'll hop onto a Linux system and run some tests to see if I can confirm (or debunk) this theory.

bkline · 2022-01-05T13:48:56Z

Rolled back psqlodbc temporarily to 12.01.0000 on my macOS system, but the test still fails unless I comment out the line self.cnxn.setencoding(encoding='utf-8') or set the --ansi flag. I did find this bug report which @keitherskine filed with pgsql-bugs, but as far as I can tell his report was ignored.

Testing on a Linux system, also running psqlodbcw.so 12.01.0000, the test passes (matching your results). Also, all of the tests pass on that system with the setencoding() call commented out, and that's true of the macOS system running with the same driver version. I don't have access to a Windows test box, but it might be useful to know how the tests come out on windows without the call to setencoding(). If they pass, it might be tempting to fix the failing test by just letting the tests run with the pyodbc encoding defaults.

It would be even nicer if we understood what the underlying cause of the difference in behavior is when the encoding default is overridden. The only difference I can see (after pointing both systems to the same PostgreSQL server) is that the macOS client is using unixODBC 2.3.9 and the Linux system is running unixODBC 2.3.7. Unfortunately, attempts to build a version of unixODBC which matched the version installed on the other client ran into too much of a dependency swamp to be feasible on either system.

gordthompson · 2022-01-05T20:47:50Z

On Windows with 12.01.0000:

PS C:\Users\Gord\git\pyodbc> mypython3-64 tests3/pgtests.py -t test_emoticons_as_literal -vv
Library: C:\Users\Gord\git\pyodbc\build\lib.win-amd64-3.9\pyodbc.cp39-win_amd64.pyd
C:\Users\Gord\git\pyodbc\tests3 --> C:\Users\Gord\git\pyodbc
python:  3.9.7 (tags/v3.9.7:1016ef3, Aug 30 2021, 20:19:38) [MSC v.1929 64 bit (AMD64)]
pyodbc:  4.0.33b6 C:\Users\Gord\git\pyodbc\build\lib.win-amd64-3.9\pyodbc.cp39-win_amd64.pyd
odbc:    03.80.0000
driver:  PSQLODBC35W.DLL 12.01.0000
         supports ODBC version 03.51
os:      Windows
unicode: Py_Unicode=2 SQLWCHAR=2
Max VARCHAR = 255
Max WVARCHAR = 255
Max BINARY = (not supported)
         8.1 6.3.9600 SP0 Multiprocessor Free
test_emoticons_as_literal (__main__.PGTestCase) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.311s

OK

gordthompson · 2022-01-05T21:05:09Z

On Windows with 12.01.0000 and setencoding() disabled in pgtests.py:

PS C:\Users\Gord\git\pyodbc> mypython3-64 tests3/pgtests.py -t test_emoticons_as_literal -vv
Library: C:\Users\Gord\git\pyodbc\build\lib.win-amd64-3.9\pyodbc.cp39-win_amd64.pyd
C:\Users\Gord\git\pyodbc\tests3 --> C:\Users\Gord\git\pyodbc
python:  3.9.7 (tags/v3.9.7:1016ef3, Aug 30 2021, 20:19:38) [MSC v.1929 64 bit (AMD64)]
pyodbc:  4.0.33b6 C:\Users\Gord\git\pyodbc\build\lib.win-amd64-3.9\pyodbc.cp39-win_amd64.pyd
odbc:    03.80.0000
driver:  PSQLODBC35W.DLL 12.01.0000
         supports ODBC version 03.51
os:      Windows
unicode: Py_Unicode=2 SQLWCHAR=2
Max VARCHAR = 255
Max WVARCHAR = 255
Max BINARY = (not supported)
         8.1 6.3.9600 SP0 Multiprocessor Free
test_emoticons_as_literal (__main__.PGTestCase) ... FAIL

======================================================================
FAIL: test_emoticons_as_literal (__main__.PGTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\Gord\git\pyodbc\tests3\pgtests.py", line 625, in test_emoticons_as_literal
    self.assertEqual(result, v)
AssertionError: 'x ?? z' != 'x 🌜 z'
- x ?? z
?   ^^
+ x 🌜 z
?   ^


----------------------------------------------------------------------
Ran 1 test in 0.149s

FAILED (failures=1)

bkline · 2022-01-05T23:03:08Z

Well, that's a bummer. So with some stacks it fails if you leave the encoding at the default, and with others it fails if you override the default encoding. Makes for a fragile test suite. 😒

bkline · 2022-01-05T23:21:05Z

I suppose one possibility would be to make the overriding of the default encoding dynamically based on the OS.

bkline · 2022-01-06T03:47:19Z

Here's a repro of the test (pulled out into a separate script) failing on Linux with Python 3.9.7, pyodbc 4.0.32, and version 13.00.0000 of psqlodbcw.so.

pyodbc-issue1004.tar.gz

tar -xf pyodbc-issue1004.tar.gz
cd issue1004
docker compose build
docker compose up -d
docker logs issue1004-repro-1
docker compose down

Output:

============================================================
with default encoding
============================================================
python:  3.9.7 (default, Sep 10 2021, 14:59:43) [GCC 11.2.0]
pyodbc:  4.0.32
odbc:    03.52
driver:  psqlodbcw.so 13.00.0000
         supports ODBC version 03.51
os:      Linux
unicode: Py_Unicode=4 SQLWCHAR=2
------------------------------------------------------------
'x 🌜 z' matches the expected value 'x 🌜 z'

============================================================
with encoding overridden
============================================================
python:  3.9.7 (default, Sep 10 2021, 14:59:43) [GCC 11.2.0]
pyodbc:  4.0.32
odbc:    03.52
driver:  psqlodbcw.so 13.00.0000
         supports ODBC version 03.51
os:      Linux
unicode: Py_Unicode=4 SQLWCHAR=2
------------------------------------------------------------
'x ð\x9f\x8c\x9c z' does not match the expected value 'x 🌜 z'

bkline · 2022-01-06T14:30:46Z

I found a Windows machine to test on.

============================================================
with default encoding
============================================================
python:  3.9.0 (tags/v3.9.0:9cf6752, Oct 5 2020, 15:34:40) [MSC v.1927 64 bit (AMD64)]
pyodbc:  4.0.32
odbc:    03.80.0000
driver:  PSQLODBC35W.DLL 13.02.0000
         supports ODBC version 03.51
os:      Windows 10 10.0.19041 SP0 Multiprocessor Free
unicode: Py_Unicode=2 SQLWCHAR=2
------------------------------------------------------------
'x 🌜 z' matches the expected value 'x 🌜 z'

============================================================
with encoding overridden
============================================================
python:  3.9.0 (tags/v3.9.0:9cf6752, Oct 5 2020, 15:34:40) [MSC v.1927 64 bit (AMD64)]
pyodbc:  4.0.32
odbc:    03.80.0000
driver:  PSQLODBC35W.DLL 13.02.0000
         supports ODBC version 03.51
os:      Windows 10 10.0.19041 SP0 Multiprocessor Free
unicode: Py_Unicode=2 SQLWCHAR=2
------------------------------------------------------------
'x ðŸŒœ z' does not match the expected value 'x 🌜 z'

gordthompson · 2022-01-06T14:59:30Z

When you try to do the insert from your Mac with .setencoding() and .setdecoding(), e.g.,

cnxn.setdecoding(pyodbc.SQL_CHAR, encoding="utf-8")
cnxn.setdecoding(pyodbc.SQL_WCHAR, encoding="utf-8")
cnxn.setencoding(encoding="utf-8")

crsr = cnxn.cursor()

v = "x \U0001F31C z"

crsr.execute("CREATE TABLE t1 (s varchar(100))")
crsr.execute(f"insert into t1 values ('{v}')")

and then check the table from a separate tool like pgAdmin_4 does the inserted value look correct?

Edit:

Both pgAdmin_4 and DBeaver appear to render the column contents as x ðŸŒœ z, which are the correct bytes for UTF-8 encoding but not the Unicode character itself. 🙍‍♂️

bkline · 2022-01-06T16:31:29Z

The inserted character is not correct, so the problem happens going into the DB, not coming out. Sort of what I expected, given the semantics of "encode" and "decode" (commenting out just the setencoding() line was sufficient to get the value stored correctly).

#!/usr/bin/env python3

"""Store values with high Unicode characters in PostgreSQL database.
"""

import argparse
import pyodbc

VALUE = "x \U0001F31C z"

parser = argparse.ArgumentParser()
parser.add_argument("--connection-string", "-c", required=True)
opts = parser.parse_args()
conn = pyodbc.connect(opts.connection_string)
conn.setdecoding(pyodbc.SQL_CHAR, encoding="utf-8")
conn.setdecoding(pyodbc.SQL_WCHAR, encoding="utf-8")
conn.setencoding(encoding="utf-8")
cursor = conn.cursor()
for table_name in ("t1", "t2"):
    cursor.execute(f"CREATE TABLE {table_name} (s VARCHAR(100))")
cursor.execute("INSERT INTO t1 VALUES (?)", (VALUE,))
cursor.execute("INSERT INTO t2 VALUES ('{VALUE}')")
conn.commit()
print("done")

gordthompson · 2022-01-06T17:27:06Z

Okay, for me the test passes on both Windows and Linux if I comment out both the setencoding() and setdecoding() in tests3/pgtests.py:

pyodbc/tests3/pgtests.py

Lines 72 to 73 in b1cfb75

    
           self.cnxn.setdecoding(pyodbc.SQL_WCHAR, encoding='utf-8') 
        
           self.cnxn.setencoding(encoding='utf-8')

So if that also passes on Mac then we can just remove those lines.

bkline · 2022-01-06T18:13:46Z

The only failing test with that modification of the script on macOS is test_columns() (addressed in #1003). It's sort of a nice bonus that the correct behavior across the platforms is achieved using the package's defaults. That feels right. 👍

(I'm learning all sorts of new tricks with these recent tickets. TIL about extending a permalink in GitHub to a range. 😂 )

Fixes: mkleehammer#1004

Fixes: #1004

@hugovk

* Add support for Python 3.10, drop EOL 3.5 (mkleehammer#952) * Remove duplicate entry in pyi stub (mkleehammer#979) * Replace deprecated SafeConfigParser with ConfigParser (mkleehammer#953) * Designate connection string as optional (mkleehammer#987) * Fix spelling typos (mkleehammer#985) Co-authored-by: Gord Thompson <[email protected]> * Fix for DSN Names with non-ASCII chars (mkleehammer#951) * Fix for DSN Names with non-ASCII chars Fixes: mkleehammer#948 Co-authored-by: bamboo <[email protected]> Co-authored-by: Gord Thompson <[email protected]> * Added InterfaceError to pyodbc.pyi. (mkleehammer#1013) Co-authored-by: Benjamin Holder <[email protected]> * Upgrade deprecated unicode encoding calls (mkleehammer#792) * Do not include .pyc artifacts in source tarball mkleehammer#742 * Build wheels with cibuildwheels on GitHub Actions Fixes mkleehammer#175 Ref mkleehammer#688 Closes mkleehammer#668 Closes mkleehammer#685 Fixes mkleehammer#441 and pretty much most issues that mention ` sql.h: No such file or directory` This also need to setup some PyPI keys for automated uploads. * Install unixodbc-dev for Linux wheels * Enable GitHub Actions for pull requests * Use Debian based `manylinux_2_24` image * `apt-get` update before installing in wheel build * Use PEP 440 version name required for wheels * Skip building 32-bit wheels * 4.0.dev0 for default version, because test_version() wants 3 parts here Checked this won't shadow released minor version (credit goes to @hugovk) >>> from packaging.version import Version >>> Version("4.0.dev0") > Version("4.0.24") False * Had to use Debian image for PyPy too * Disable PyPy wheels https://cibuildwheel.readthedocs.io/en/stable/options/#build-selection PyPy is missing some C functions that `pyodbc` needs. * Update README.md * Avoid error when testing with DSN= connection Fixes: mkleehammer#1000 * Disable setencoding/setdecoding in tests3/pgtests.py Fixes: mkleehammer#1004 * Adjust test_columns() in tests3/pgtests.py for newer driver versions Fixes: mkleehammer#1003 * Move driver version check out of function * Add comment to _get_column_size() * Fix memory leak with decimal parameters Fixes: mkleehammer#1026 * Create codeql-analysis.yml * Bugfix/sql param data memory leak (mkleehammer#703) * Updated .gitignore * * Created a test file for the specific scenario * * Updated doc of test file for the specific SQLParamData scenario * * Fixed the test file for the specific SQLParamData scenario by Py_XDECREF the PyObject with 1 reference. * * Improved the test to close the cursor and set it to None, then forcing the gc * * Changed the fix of the memory leak and updated the test. * * Removed redundant empty line * * Converted tabs to spaces * * Moved variable out of conn's scope * Update gitignore, remove duplicated * Replace deprecated PyUnicode_FromUnicode(NULL, size) calls (mkleehammer#998) Current versions of Python write a deprecation warning message to stderr, which breaks CGI scripts running under web servers which fold stderr into stdout. Likely breaks other software. This change replaces the deprecated calls with PyUnicode_New(size, maxchar). The accompanying code to populate the new objects has also been rewritten to use the new PyUnicode APIs. * Making pyodbc compatible with PostgreSQL infinity dates, returning MINYEAR and MAXYEAR to python, instead of values out of python's limits * Removing autoformat from code * Removing autoformat from code * Add odbc_config support on mac and m1 homebrew dir * Note EOL of 2.7 support in README (mkleehammer#945) * Fix version of CI generated wheels The CI system is checking out exact tags like "git checkout 4.0.33", which results in a detached HEAD. The version calculation was adding the commit hash. * Fix for mkleehammer#1082 libraries in Linux wheels (mkleehammer#1084) * use argparse instead of optparse (mkleehammer#1089) Co-authored-by: Hugo van Kemenade <[email protected]> Co-authored-by: Alex Nelson <[email protected]> Co-authored-by: Kian Meng, Ang <[email protected]> Co-authored-by: Gord Thompson <[email protected]> Co-authored-by: bamboo <[email protected]> Co-authored-by: Gord Thompson <[email protected]> Co-authored-by: bdholder <[email protected]> Co-authored-by: Benjamin Holder <[email protected]> Co-authored-by: Inada Naoki <[email protected]> Co-authored-by: Michael Fladischer <[email protected]> Co-authored-by: Anatoli Babenia <[email protected]> Co-authored-by: Francisco Morales <[email protected]> Co-authored-by: Gord Thompson <[email protected]> Co-authored-by: Michael Kleehammer <[email protected]> Co-authored-by: Gilad Leifman <[email protected]> Co-authored-by: Bob Kline <[email protected]> Co-authored-by: Leandro Scott <[email protected]> Co-authored-by: Jordan Mendelson <[email protected]> Co-authored-by: Keith Erskine <[email protected]>

gordthompson added a commit to gordthompson/pyodbc that referenced this issue Jan 6, 2022

Disable setencoding/setdecoding in tests3/pgtests.py

2eece1f

Fixes: mkleehammer#1004

gordthompson mentioned this issue Jan 6, 2022

Disable setencoding/setdecoding in tests3/pgtests.py #1005

Merged

mkleehammer closed this as completed in #1005 Mar 27, 2022

mkleehammer pushed a commit that referenced this issue Mar 27, 2022

Disable setencoding/setdecoding in tests3/pgtests.py

b7cbf90

Fixes: #1004

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pgtests test_emoticons_as_literal fails #1004

pgtests test_emoticons_as_literal fails #1004

bkline commented Jan 4, 2022 •

edited

Loading

gordthompson commented Jan 4, 2022

bkline commented Jan 4, 2022

gordthompson commented Jan 4, 2022

bkline commented Jan 4, 2022

bkline commented Jan 4, 2022

bkline commented Jan 5, 2022

gordthompson commented Jan 5, 2022

gordthompson commented Jan 5, 2022

bkline commented Jan 5, 2022

bkline commented Jan 5, 2022

bkline commented Jan 6, 2022 •

edited

Loading

bkline commented Jan 6, 2022

gordthompson commented Jan 6, 2022 •

edited

Loading

bkline commented Jan 6, 2022 •

edited

Loading

gordthompson commented Jan 6, 2022

bkline commented Jan 6, 2022

pgtests test_emoticons_as_literal fails #1004

pgtests test_emoticons_as_literal fails #1004

Comments

bkline commented Jan 4, 2022 • edited Loading

Environment

Issue

gordthompson commented Jan 4, 2022

bkline commented Jan 4, 2022

gordthompson commented Jan 4, 2022

bkline commented Jan 4, 2022

bkline commented Jan 4, 2022

bkline commented Jan 5, 2022

gordthompson commented Jan 5, 2022

gordthompson commented Jan 5, 2022

bkline commented Jan 5, 2022

bkline commented Jan 5, 2022

bkline commented Jan 6, 2022 • edited Loading

bkline commented Jan 6, 2022

gordthompson commented Jan 6, 2022 • edited Loading

bkline commented Jan 6, 2022 • edited Loading

gordthompson commented Jan 6, 2022

bkline commented Jan 6, 2022

bkline commented Jan 4, 2022 •

edited

Loading

bkline commented Jan 6, 2022 •

edited

Loading

gordthompson commented Jan 6, 2022 •

edited

Loading

bkline commented Jan 6, 2022 •

edited

Loading