invalid inflections #8

Open
jdee opened this issue Nov 5, 2010 · 8 comments

@jdee
Owner

jdee commented Nov 5, 2010

ActiveSupport::Inflector was used for regular plural nouns, and the WordNet(R) exception list provided irregular plurals. For any other one-word noun, the ActiveSupport String#pluralize method was used. This usually produced correct results, but not always. For example, the plural of Man (n.) (the island) is listed as Men, and the plural of shaman (n.) comes out as shamen.

Some compound verbs are not handled correctly, notably log-in (v.), which Dubsar currently conjugates as log-ins, log-ined, log-ining.

Dubsar provides no regular inflections for adjectives because rules for comparative and superlative degrees produce forms like sabbaticaller and sabbaticallest.

The :inflections table will simply grow to include more and more exceptions until all cases are listed and there is no longer any need to generate it from rules or WordNet exception files. At that point it will just be dumped out and then reloaded on each seed.

@jdee
Owner Author

jdee commented Nov 6, 2010

There are also some stragglers like "tiing" for tie (in addition to "tying").

@jdee
Owner Author

jdee commented Nov 6, 2010

The invalid -iing endings (tiing, diing instead of tying, dying) have been removed by a reseed. The code in the Word model that filters out duplicate inflections has been improved to work appropriately during creation, before the inflections have been saved. Now the de-dupe step at the end is no longer necessary.
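The idea of rejecting duplicates at creation time, before anything is saved, might look something like the following plain-Ruby sketch. `WordSketch` and `add_inflection` are hypothetical names used only for illustration; this is not Dubsar's Word model, which is a Rails model and is not shown in this thread.

```ruby
# Hypothetical stand-in for a model that collects inflections in memory.
class WordSketch
  attr_reader :name, :inflections

  def initialize(name)
    @name = name
    @inflections = []
  end

  # Adds a form unless it is already present in the unsaved, in-memory list.
  # Returns true if the form was added, false if it was a duplicate.
  def add_inflection(form)
    return false if @inflections.include?(form)

    @inflections << form
    true
  end
end
```

Because the check runs against the in-memory list, no separate de-dupe pass is needed after the records are written.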

@jdee
Owner Author

jdee commented Nov 9, 2010

A number of problems have been solved. In a reseed, no regular inflection will be attempted for any word that contains anything but lower-case letters (no capitals, digits, spaces, hyphens or other punctuation).

Meanwhile, there continue to be problems in general with verbs ending in -CVc (e.g., bivouac, picnic). These will be addressed soon.
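For reference, the usual English spelling rule for verbs ending in a vowel followed by c is to insert a k before -ed and -ing (picnicked, bivouacking). A minimal Ruby sketch of that rule follows. The method name `inflect_c_verb` is hypothetical, and this is not necessarily how Dubsar handles these verbs; the thread does not say.

```ruby
# Hypothetical helper: regular inflections for verbs ending in vowel + "c".
# Returns nil for any verb the rule does not apply to.
def inflect_c_verb(verb)
  return nil unless verb.match?(/[aeiou]c\z/)

  {
    third_person: "#{verb}s",          # picnic -> picnics
    past: "#{verb}ked",                # picnic -> picnicked
    present_participle: "#{verb}king"  # picnic -> picnicking
  }
end
```

Individual verbs may still need entries in an exception list, so a rule like this would only be a starting point.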

@jdee
Owner Author

jdee commented Nov 11, 2010

That last batch has been addressed. Dubsar no longer attempts to inflect anything that doesn't match /^[a-z]+$/, i.e., nothing capitalized, nothing containing spaces, hyphens or other punctuation. The main remaining issues are with verbs ending in a short syllable with -l or -s, where there are often ambiguities (like traveled and travelled). Dubsar usually provides both, sometimes erroneously.
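The filter described above could be sketched in Ruby roughly as follows. The method name `regularly_inflectable?` is made up for illustration and is not taken from Dubsar's code. The sketch anchors with `\A` and `\z` rather than `^` and `$`, since in Ruby `^` and `$` match at line boundaries and a multi-line string could otherwise slip through.

```ruby
# Hypothetical helper: true only for words made up entirely of lower-case
# ASCII letters, i.e. the only words eligible for rule-based inflection.
def regularly_inflectable?(word)
  word.match?(/\A[a-z]+\z/)
end

%w[visit log-in Man tie].each do |w|
  puts "#{w}: #{regularly_inflectable?(w)}"
end
```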

@jdee
Owner Author

jdee commented Nov 12, 2010

Visit is erroneously listed with inflections "visitting" and "visitted." Same problem for "audit."

@jdee
Owner Author

jdee commented Nov 12, 2010

The -it verbs have been fixed with a reseed.

@jdee
Owner Author

jdee commented Nov 21, 2010

A couple of recent problems to be addressed:

cattle pluralized as cattles: ActiveSupport::Inflector calls this sort of word "uncountable." In grammatical terms, such words are perhaps indeclinable. At any rate, legitimate plurals of this form include monies, peoples, waters, so the distinction has to be handled on a case-by-case basis.

hurted: WordNet does not treat hurt as an irregular verb, so I'll have to.

@jdee
Owner Author

jdee commented Nov 23, 2010

The problem with hurted has been corrected with a migration. The cattles problem is a more general issue with the ActiveSupport Inflector and needs a somewhat more general treatment.
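One way to picture the case-by-case handling of words like cattle is an explicit list that is consulted before any rule runs. The sketch below is plain Ruby and purely illustrative: `UNCHANGED_PLURALS`, `PLURAL_EXCEPTIONS`, and `plural_for` are hypothetical names, and the fallback of appending "s" is only a stand-in for a real rule engine such as ActiveSupport's.

```ruby
# Hypothetical list of nouns whose form is left unchanged, decided per word.
UNCHANGED_PLURALS = %w[cattle].freeze

# Hypothetical list of explicit plural exceptions.
PLURAL_EXCEPTIONS = { "money" => "monies" }.freeze

def plural_for(noun)
  return noun if UNCHANGED_PLURALS.include?(noun)

  # Stand-in fallback: a real implementation would apply proper rules here.
  PLURAL_EXCEPTIONS.fetch(noun) { "#{noun}s" }
end
```

With this arrangement, water and people still pluralize to waters and peoples because they are simply not on the unchanged list, while cattle stays as it is.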
