- Support for scikit-learn < 0.18 is dropped;
- Formasaurus is no longer tested with Python 3.3;
- tests are fixed to account for upstream changes; Python 3.6 build is enabled.
- more annotated data for captchas;
- formasaurus init command which trains & caches the model.
- pip bug with pip install formasaurus[with-deps] is worked around; it should work now as pip install formasaurus[with_deps].
- fixed API documentation at readthedocs.org
- more annotated data;
- new form_classes and field_classes attributes of FormFieldClassifier;
- more robust web page encoding detection in formasaurus.utils.download;
- bug fixes in annotation widgets;
- fields=False argument is supported in the formasaurus.extract_forms, formasaurus.classify, and formasaurus.classify_proba functions and in the related FormFieldClassifier methods. It skips form field type prediction when field types are not needed.
- formasaurus.classifiers.instance() is renamed to formasaurus.classifiers.get_instance().
- Bias is no longer regularized for form type classifier.
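
For code that has to run against releases from both before and after the rename, a small compatibility shim is one possible approach. This is a minimal sketch, not part of Formasaurus: the helper name get_classifier and the stand-in objects are hypothetical, and only the two accessor names (instance and get_instance) come from the notes above.

```python
from types import SimpleNamespace


def get_classifier(classifiers_module):
    """Return the classifier via get_instance(), falling back to instance()."""
    getter = getattr(classifiers_module, "get_instance", None)
    if getter is None:
        # Older releases exposed the same accessor under the name instance().
        getter = classifiers_module.instance
    return getter()


# Stand-in modules used only for illustration; they are not part of Formasaurus.
new_style = SimpleNamespace(get_instance=lambda: "new")
old_style = SimpleNamespace(instance=lambda: "old")

print(get_classifier(new_style))
print(get_classifier(old_style))
```

With the library installed, the same helper would presumably be called as get_classifier(formasaurus.classifiers) after importing formasaurus.classifiers.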
This is a major backwards-incompatible release.
- Formasaurus can now detect field types, not only form types;
- API is changed - check the updated documentation;
- there are more form types detected;
- evaluation setup is improved;
- annotation UI is rewritten using IPython widgets;
- more training data is added.
- Python 3 support;
- fixed model auto-creation.
Initial release.