Unstable parser tests #29

erikogan · 2016-08-24T00:45:17Z

In my environment, the parser tests (specifically for parse_yaml & parse_json) are unstable. They fail most of the time, but on occasion will pass.

I thought this might be due to the deliberate randomization of hash keys in 5.18, and it seems (anecdotally) that setting PERL_PERTURB_KEYS=0 does appear to help some of the tests, it does not guarantee a successful run.

Am I right in assuming these tests are consistent for you, and there is something else in my environment causing this issue?

The text was updated successfully, but these errors were encountered:

erikogan · 2016-08-24T00:47:26Z

Output from a failed test:

% perl t/engine.t parse_json/00
# Subtest: Test config: /Users/erik/work/serge/t/data/engine/parse_json/00/translate.serge
    ok 1 - Config file read
WARNING: 'email_from' is not defined. Will skip sending any reports.
WARNING: 'email_to' is not defined. Will skip sending any reports.


*** [test_job] "Test Job" ***

Initializing the database...
Preloading properties...
*** OPTIMIZATIONS DISABLED: job definition has changed (or job is ran for the first time) ***
Source dir: [/Users/erik/work/serge/t/data/engine/parse_json/00/resources]
DB source: [DBI:SQLite:dbname=:memory:]
TS file path: [/Users/erik/work/serge/t/data/engine/parse_json/00/test-output/po/%LOCALE%/%FILE%.po]
Output path: [/Users/erik/work/serge/t/data/engine/parse_json/00/test-output/localized-resources/%LOCALE%/%FILE%]
Destination languages: [test]
Modified languages: [test]
Preloading cache for job 'test_job' in namespace 'test_namespace'...

Updating database from source files...

Scanning directory structure...
Scanned in 0.000226 sec, 2 files match the criteria
  relaxed.json
  strings.json
2 files are new, 0 were orphaned and 0 are no longer orphaned since last run
update_database_from_source_files() took 0.003282 seconds
Updating database from translation files for [test] language...
update_database_from_ts_files() took 0.000207 seconds

Generating translation files...

  relaxed.json (forced mode)
    [1/test] ţēšţ 1
    [2/test] ţēšţ 2
    [3/test] ţēšţ 3
    Saving /Users/erik/work/serge/t/data/engine/parse_json/00/test-output/po/test/relaxed.json.po (479 bytes)
  strings.json (forced mode)
    [4/test] ţēšţ 2
    [5/test] ţēšţ 1
    Saving /Users/erik/work/serge/t/data/engine/parse_json/00/test-output/po/test/strings.json.po (372 bytes)
generate_ts_files() took 0.016646 seconds

Generating localized files...

  relaxed.json (forced mode)
    [1/test] ţēšţ 1
    [2/test] ţēšţ 2
    [3/test] ţēšţ 3
    Saving /Users/erik/work/serge/t/data/engine/parse_json/00/test-output/localized-resources/test/relaxed.json, because forced mode is on (123 bytes)
  strings.json (forced mode)
    [4/test] ţēšţ 2
    [5/test] ţēšţ 1
    Saving /Users/erik/work/serge/t/data/engine/parse_json/00/test-output/localized-resources/test/strings.json, because forced mode is on (270 bytes)

generate_localized_files() took 0.017815 seconds
*** [end] ***
    ok 2 - Processing all jobs in config file
TEST: dumping the database to /Users/erik/work/serge/t/data/engine/parse_json/00/test-output/database...
    # Subtest: Compare directory 'test-output/database' with 'reference-output/database'
        ok 1 - Files 'test-output/database/files' and 'reference-output/database/files' should be equal
        not ok 2 - Files 'test-output/database/items' and 'reference-output/database/items' should be equal
        #   Failed test 'Files 'test-output/database/items' and 'reference-output/database/items' should be equal'
        #   at /Users/erik/work/serge/t/lib/Test/Diff.pm line 43.
        not ok 3 - Files 'test-output/database/properties' and 'reference-output/database/properties' should be equal
        #   Failed test 'Files 'test-output/database/properties' and 'reference-output/database/properties' should be equal'
        #   at /Users/erik/work/serge/t/lib/Test/Diff.pm line 43.
        ok 4 - Files 'test-output/database/strings' and 'reference-output/database/strings' should be equal
        not ok 5 - Files 'test-output/database/translations' and 'reference-output/database/translations' should be equal
        #   Failed test 'Files 'test-output/database/translations' and 'reference-output/database/translations' should be equal'
        #   at /Users/erik/work/serge/t/lib/Test/Diff.pm line 43.
        1..5
        # Looks like you failed 3 tests of 5.
    not ok 3 - Compare directory 'test-output/database' with 'reference-output/database'
    #   Failed test 'Compare directory 'test-output/database' with 'reference-output/database''
    #   at /Users/erik/work/serge/t/lib/Test/Diff.pm line 62.
    # Subtest: Compare directory 'test-output/po' with 'reference-output/po'
        ok 1 - Files 'test-output/po/test/relaxed.json.po' and 'reference-output/po/test/relaxed.json.po' should be equal
        not ok 2 - Files 'test-output/po/test/strings.json.po' and 'reference-output/po/test/strings.json.po' should be equal
        #   Failed test 'Files 'test-output/po/test/strings.json.po' and 'reference-output/po/test/strings.json.po' should be equal'
        #   at /Users/erik/work/serge/t/lib/Test/Diff.pm line 43.
        1..2
        # Looks like you failed 1 test of 2.
    not ok 4 - Compare directory 'test-output/po' with 'reference-output/po'
    #   Failed test 'Compare directory 'test-output/po' with 'reference-output/po''
    #   at /Users/erik/work/serge/t/lib/Test/Diff.pm line 62.
    1..4
    # Looks like you failed 2 tests of 4.
not ok 1 - Test config: /Users/erik/work/serge/t/data/engine/parse_json/00/translate.serge
#   Failed test 'Test config: /Users/erik/work/serge/t/data/engine/parse_json/00/translate.serge'
#   at t/engine.t line 132.
1..1

erikogan · 2016-08-24T00:49:16Z

Oh, and:

% perl -V
Summary of my perl5 (revision 5 version 18 subversion 2) configuration:

  Platform:
    osname=darwin, osvers=15.0, archname=darwin-thread-multi-2level
    uname='darwin osx219.apple.com 15.0 darwin kernel version 15.0.0: fri may 22 22:03:51 pdt 2015; root:xnu-3216.0.0.1.11~1development_x86_64 x86_64 '
    config_args='-ds -e -Dprefix=/usr -Dccflags=-g  -pipe  -Dldflags= -Dman3ext=3pm -Duseithreads -Duseshrplib -Dinc_version_list=none -Dcc=cc'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-arch i386 -arch x86_64 -g -pipe -fno-common -DPERL_DARWIN -fno-strict-aliasing -fstack-protector',
    optimize='-Os',
    cppflags='-g -pipe -fno-common -DPERL_DARWIN -fno-strict-aliasing -fstack-protector'
    ccversion='', gccversion='4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.1)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc -mmacosx-version-min=10.11.3', ldflags ='-arch i386 -arch x86_64 -fstack-protector'
    libpth=/usr/lib /usr/local/lib
    libs=
    perllibs=
    libc=, so=dylib, useshrplib=true, libperl=libperl.dylib
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags='-arch i386 -arch x86_64 -bundle -undefined dynamic_lookup -fstack-protector'


Characteristics of this binary (from libperl):
  Compile-time options: HAS_TIMES MULTIPLICITY PERLIO_LAYERS
                        PERL_DONT_CREATE_GVSV
                        PERL_HASH_FUNC_ONE_AT_A_TIME_HARD
                        PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP
                        PERL_PRESERVE_IVUV PERL_SAWAMPERSAND USE_64_BIT_ALL
                        USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES
                        USE_LOCALE USE_LOCALE_COLLATE USE_LOCALE_CTYPE
                        USE_LOCALE_NUMERIC USE_PERLIO USE_PERL_ATOF
                        USE_REENTRANT_API
  Locally applied patches:
    /Library/Perl/Updates/<version> comes before system perl directories
    installprivlib and installarchlib points to the Updates directory
  Built under darwin
  Compiled at Aug 11 2015 04:22:26
  @INC:
    /Library/Perl/5.18/darwin-thread-multi-2level
    /Library/Perl/5.18
    /Network/Library/Perl/5.18/darwin-thread-multi-2level
    /Network/Library/Perl/5.18
    /Library/Perl/Updates/5.18.2/darwin-thread-multi-2level
    /Library/Perl/Updates/5.18.2
    /System/Library/Perl/5.18/darwin-thread-multi-2level
    /System/Library/Perl/5.18
    /System/Library/Perl/Extras/5.18/darwin-thread-multi-2level
    /System/Library/Perl/Extras/5.18
    .

iafan · 2016-08-24T01:18:39Z

I admit I sit on Perl 5.16, and yes, tests run fine for me (no differences between runs).

Try running perl engine.t --init — it will create "reference" data snapshots of these tests, and then you can use git diff to see what are the differences between the committed reference versions of the data files, which might give a better idea of the cause.

I'll also upgrade to the newer version of Perl and see if I can reproduce this.

erikogan · 2016-08-24T21:14:49Z

I ran PERL_PERTURB_KEYS=0 perl t/engine.t --init and committed the results here.

The differences appear to mostly be in the ordering of items in the DB, however some of the changes appear to be ids (or other checksum values) which I would expect to be stable (but I have not looked at the inputs to generate the checksum).

After running this the tests continue to fail when run with PERL_PERTURB_KEYS=0 perl t/engine.t, but the number of failures is not stable, it’s either 2, 4, or 6. So I think there is some other issues that are not fixed by setting that particular environment variable. (However, I still strongly suspect the new hashing algorithm in 5.18 is the most likely cause)

erikogan · 2016-08-24T21:16:35Z

Oh, and I just ran PERL_PERTURB_KEYS=0 perl t/engine.t --init again on the branch referenced above and there are differences between what I checked in and what was just output. So yes, the data is no longer stable in 5.18.

erikogan · 2016-08-24T21:18:26Z

It just occurred to me that it’s probably pretty easy for me to spin up a Docker container with a pre 5.18 version of Perl and see if I can get stability on the original tests and the ones I hope to add.

erikogan · 2016-08-24T22:13:02Z

Ok, I just confirmed that running the tests in a container with 5.16.3 causes the tests to be stable. I will submit my PR with data it generated (once I fix the bug I discovered now that I can trust the tests I did not write)

But this further implicates the changes in 5.18, and the most likely culprit is the hashing algorithm overhaul.

iafan · 2016-08-25T21:02:12Z

Just a quick update: this is reproducible in Perl 5.22. This is a quick test:

Go to <serge_root>/t/data/engine/parse_json/00/resources folder and run:

$ serge test-parser parse_json strings.json

Every time you run it, the dumped data structure (the order of the strings if which they are reported by the parser callback) will be different.

iafan · 2016-08-25T21:51:55Z

I looked at the JSON/YAML parser documentation and the only easy way to solve this is to sort object keys at each level; this is guaranteed to stabilize the output. Unfortunately, there's no way to tell the parsers to use e.g. Tie::IxHash as a replacement class to preserve the order.

A more involving way to solve this would be to find (or write) streaming callback-based parsers that will report the structure as it is parsed, allowing us ultimately to preserve the original order.

Given the fact that both JSON and YAML specs state that the order of the keys in the object/dictionary is not important and is not guaranteed, I think it's safe to go with the first (easy) option.

Also: minor wording/formatting changes

iafan added the bug label Aug 24, 2016

iafan closed this as completed in a6c4d3f Aug 25, 2016

iafan pushed a commit that referenced this issue Aug 25, 2016

Add #29 fix info

bc0b2e4

Also: minor wording/formatting changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unstable parser tests #29

Unstable parser tests #29

erikogan commented Aug 24, 2016

erikogan commented Aug 24, 2016

erikogan commented Aug 24, 2016

iafan commented Aug 24, 2016

erikogan commented Aug 24, 2016 •

edited

Loading

erikogan commented Aug 24, 2016

erikogan commented Aug 24, 2016

erikogan commented Aug 24, 2016 •

edited

Loading

iafan commented Aug 25, 2016

iafan commented Aug 25, 2016

Unstable parser tests #29

Unstable parser tests #29

Comments

erikogan commented Aug 24, 2016

erikogan commented Aug 24, 2016

erikogan commented Aug 24, 2016

iafan commented Aug 24, 2016

erikogan commented Aug 24, 2016 • edited Loading

erikogan commented Aug 24, 2016

erikogan commented Aug 24, 2016

erikogan commented Aug 24, 2016 • edited Loading

iafan commented Aug 25, 2016

iafan commented Aug 25, 2016

erikogan commented Aug 24, 2016 •

edited

Loading

erikogan commented Aug 24, 2016 •

edited

Loading