format function for TypeDefinition seems to mess up some of the Lists #357

Closed
olzama opened this issue Nov 24, 2022 · 2 comments

olzama commented Nov 24, 2022

If I parse this TypeDefinition with iterparse:

main-vprn := basic-main-verb & norm-pronominal-verb &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ < #subj >, 
                           COMPS < #comps >,
                           CLTS #clt ],
    ARG-ST < #subj . < #comps . #clt > > ].

and then print it back out using the format function, I get this:

main-vprn := basic-main-verb & norm-pronominal-verb &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ < #subj >,
                           COMPS < #comps >,
                           CLTS #clt ],
    ARG-ST < #subj, #comps . < #comps . #clt > > ].     <----- Note the extra `, #comps`, which was not in the input

olzama commented Nov 24, 2022

Repro:

from delphin import tdl as pydelphin_tdl

for event, obj, lineno in pydelphin_tdl.iterparse('debug.txt'):
    print(pydelphin_tdl.format(obj))

debug.txt

goodmami added the bug label Nov 27, 2022

goodmami (Member) commented

Ok thanks, I've confirmed the bug. I'm trying to figure out what to do about it. In cons-lists, a dot (.) before the final item indicates that the item is the value of the final REST, which would otherwise be *null*. E.g.,

< a, b >   -->  [ FIRST a, REST [ FIRST b, REST *null* ]]
< a . b >  -->  [ FIRST a, REST b ]
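
The expansion can be sketched in plain Python. This is only a toy model: dicts stand in for AVMs, and `expand_cons_list` and `NULL` are hypothetical names for illustration, not part of PyDelphin.

```python
# Toy model of cons-list expansion; plain dicts stand in for AVMs.
# expand_cons_list and NULL are illustrative names, not PyDelphin API.
NULL = "*null*"


def expand_cons_list(items, tail=NULL):
    """Expand list members into nested FIRST/REST pairs.

    `items` are the comma-separated members; `tail` is the value
    written after a dot, or *null* when the list has no dot.
    """
    avm = tail
    # Build from the innermost REST outward.
    for item in reversed(items):
        avm = {"FIRST": item, "REST": avm}
    return avm


comma_form = expand_cons_list(["a", "b"])         # < a, b >
dotted_form = expand_cons_list(["a"], tail="b")   # < a . b >
print(comma_form)
print(dotted_form)
```

The only difference between the two forms is what ends up in the innermost REST.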

For debugging, I modified your debug.txt as follows. The second type defines the list with a comma instead of a dot followed by a nested list:

main-vprn := basic-main-verb & norm-pronominal-verb &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ < #subj >,
                           COMPS < #comps >,
                           CLTS #clt ],
    ARG-ST < #subj . < #comps . #clt > > ].

main-vprn2 := basic-main-verb & norm-pronominal-verb &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ < #subj >,
                           COMPS < #comps >,
                           CLTS #clt ],
    ARG-ST < #subj , #comps . #clt > ].

The first thing I notice is that the value of REST in the original version is another ConsList because it parsed the < character, but in the modified version it's just a plain AVM:

>>> from delphin import tdl
>>> orig, mod = [obj for _, obj, _ in tdl.iterparse('debug.txt')]
>>> orig['ARG-ST'].features()
[('FIRST', <Coreference object at 139968742236480>), ('REST', <ConsList object at 139968744179648>)]
>>> mod['ARG-ST'].features()
[('FIRST', <Coreference object at 139968742237344>), ('REST', <AVM object at 139968740017344>)]

The features of these AVMs are the same, though (I need to cast the Coreference objects to strings because equality is not defined for Coreference objects):

>>> str(orig['ARG-ST.FIRST']) == str(mod['ARG-ST.FIRST'])
True
>>> str(orig['ARG-ST.REST.FIRST']) == str(mod['ARG-ST.REST.FIRST'])
True
>>> str(orig['ARG-ST.REST.REST']) == str(mod['ARG-ST.REST.REST'])
True
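
To illustrate why the feature values match, here is a toy expansion of both spellings into FIRST/REST pairs. It assumes the expansion rule described earlier in this comment and uses plain dicts and a hypothetical `expand` helper rather than PyDelphin objects.

```python
# Toy check that both spellings describe the same FIRST/REST structure.
# expand is a hypothetical helper; dicts stand in for AVMs.
def expand(items, tail="*null*"):
    """Nest `items` as FIRST/REST pairs, ending in `tail`."""
    avm = tail
    for item in reversed(items):
        avm = {"FIRST": item, "REST": avm}
    return avm


# < #subj . < #comps . #clt > >  (dot followed by a nested list)
nested = expand(["#subj"], tail=expand(["#comps"], tail="#clt"))

# < #subj , #comps . #clt >      (comma, then a dotted tail)
flat = expand(["#subj", "#comps"], tail="#clt")

print(nested == flat)
print(nested["REST"]["FIRST"], nested["REST"]["REST"])
```

In this model the two structures are indistinguishable, so the original spelling can only be recovered if the parsed object records that REST was written as its own list.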

Also note that PyDelphin has no problem formatting the modified version:

>>> print(tdl.format(mod))
main-vprn2 := basic-main-verb & norm-pronominal-verb &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ < #subj >,
                           COMPS < #comps >,
                           CLTS #clt ],
    ARG-ST < #subj, #comps . #clt > ].

I think PyDelphin should be able to recreate the dotted form since it knows the value of REST is a ConsList.
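
As a rough sketch of that idea (not PyDelphin's actual implementation), a formatter can treat a nested list after the dot as a single value and format it recursively, instead of merging its members into the outer list. `ToyConsList` and `format_list` are hypothetical names used only for this illustration.

```python
# Toy formatter that keeps the dotted form when the tail is itself a list.
# ToyConsList and format_list are illustrative, not PyDelphin classes.
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class ToyConsList:
    items: List[str]
    # Value written after the dot; None means the list has no dot.
    tail: Optional[Union[str, "ToyConsList"]] = None


def format_list(lst: ToyConsList) -> str:
    """Format a list without flattening a nested list tail."""
    body = ", ".join(lst.items)
    if lst.tail is None:
        return f"< {body} >" if body else "< >"
    if isinstance(lst.tail, ToyConsList):
        tail = format_list(lst.tail)  # keep the nested brackets
    else:
        tail = lst.tail
    return f"< {body} . {tail} >"


nested = ToyConsList(["#subj"], ToyConsList(["#comps"], "#clt"))
flat = ToyConsList(["#subj", "#comps"], "#clt")
print(format_list(nested))
print(format_list(flat))
```

Here the outer list emits only its own members before the dot, so `#comps` appears once, inside the nested list.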

goodmami added a commit that referenced this issue Jan 3, 2023
This required a new (non-public) type: _ImplicitAVM. This is the AVM
constructed by list syntax (< ... > or <! ... !>).