John Snow Labs NLP Test 1.2.0: Announcing Support for Cohere, AI21, Azure OpenAI and Hugging Face Inference API

📢 Overview

NLP Test 1.2.0 🚀 adds support for testing Cohere, AI21, Hugging Face Inference API and Azure OpenAI LLMs. These models can now be evaluated with robustness, bias, accuracy and representation tests on the BoolQ and Natural Questions datasets. The release also includes several other enhancements and bug fixes!

A big thank you to our early-stage community for their contributions, feedback, questions, and feature requests 🎉

Make sure to give the project a star right here ⭐

🔥 New Features & Enhancements

Adding support for 4 new LLM APIs for the Question Answering task #388
Adding support for bias tests for testing LLMs on Question Answering #404
Adding support for representation tests for testing LLMs on Question Answering #405
Adding support for accuracy tests for testing LLMs on Question Answering #394
Adding a new robustness test called number_to_word #377
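
To give a feel for what a number-to-word perturbation does, here is a minimal, self-contained sketch. It is not the implementation from #377: the helper names `int_to_words` and `number_to_word` are hypothetical, and the sketch only handles standalone integers from 0 to 999, leaving decimals and longer numbers untouched. A robustness test of this kind would compare the model's answer on the original text with its answer on the perturbed text.

```python
import re

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def int_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999 in English (hypothetical helper)."""
    if not 0 <= n <= 999:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + int_to_words(rest) if rest else "")


# Match 1-3 digit integers that are not part of a longer number,
# a decimal such as 3.5, or an alphanumeric token.
_NUMBER = re.compile(r"(?<![\w.,])\d{1,3}(?!\w|[.,]\d)")


def number_to_word(text: str) -> str:
    """Return `text` with small standalone integers replaced by words."""
    return _NUMBER.sub(lambda m: int_to_words(int(m.group())), text)


if __name__ == "__main__":
    print(number_to_word("She bought 3 apples and 42 pears in 2023."))
```

Restricting the sketch to small standalone integers keeps the perturbation meaning-preserving; years, decimals and version numbers are deliberately left alone.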

🐛 Bug Fixes

Fixed bias tests to enable multi-token name replacements #400
Fixed issue in ethnicity/religion-names #393
Fixed issue in default HF text classification model #402
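
As an illustration of why multi-token names need care in bias tests, the sketch below swaps names in a sentence while treating a name such as "Mary Ann" as a single unit. It is an assumption-laden example, not the fix from #400: the function `replace_names` and the sample names are hypothetical.

```python
import re
from typing import Dict


def replace_names(text: str, replacements: Dict[str, str]) -> str:
    """Swap whole names in `text`, preferring longer (multi-token) names."""
    if not replacements:
        return text
    # Sort longest first so "Mary Ann" is tried before "Mary".
    names = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(name) for name in names) + r")\b"
    )
    # A single pass avoids re-replacing text that was just substituted.
    return pattern.sub(lambda m: replacements[m.group()], text)


if __name__ == "__main__":
    swaps = {"Mary": "Aisha", "Mary Ann": "Fatima Zahra"}
    print(replace_names("Mary Ann Smith met Mary yesterday.", swaps))
```

Matching the longest name first and substituting in one pass keeps a two-word name from being partially replaced by a shorter entry.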

❓ How to Use

Get started now! 👇

pip install nlptest

Create your test harness in 3 lines of code 🧪

# Set your OpenAI API key (replace the placeholder with your own key)
import os
os.environ['OPENAI_API_KEY'] = '<YOUR_OPENAI_API_KEY>'

# Import and create a Harness object
from nlptest import Harness
h = Harness(task='question-answering', model='gpt-3.5-turbo', hub='openai', data='BoolQ-test', config='config.yml')

# Generate test cases, run them and view a report
h.generate().run().report()

📖 Documentation

❤️ Community support

Slack: For live discussion with the NLP Test community, join the #nlptest channel
GitHub: For bug reports, feature requests, and contributions
Discussions: To engage with other community members, share ideas, and show off how you use NLP Test!

We would love to have you join the mission 👉 open an issue, a PR, or give us some feedback on features you'd like to see! 🙌

♻️ Changelog

What's Changed

fix/task test support check by @alytarik in #378
Add boolq dev dataset by @alytarik in #390
Issue 374 add representation tests by @ArshaanNazir in #381
Issue in ethnicity religion names by @ArshaanNazir in #393
Feature: Add representation tests for LLMs by @ArshaanNazir in #405
Fix: default HF text classification model issue by @chakravarthik27 in #402
Feature: Add support for bias tests for question answering by @ArshaanNazir in #404
Chore: Adding supported hubs as logos to landing page by @luca-martial in #403
Fix/bias_tests Enable multi-token name replacements by @ArshaanNazir in #400
Feature: Add support for number to words robustness test by @RakshitKhajuria in #377
Feature: Adding support for 4 new LLM APIs by @chakravarthik27 in #388
DRAFT: Feature/accuracy for qa task by @alytarik in #394
fix typo and order of columns by @alytarik in #406
Fix/llm accuracy bug fix by @alytarik in #407
Fix prompt template llm and transformer version by @ArshaanNazir in #408
added number_to_words test to robustness nb by @RakshitKhajuria in #410
notebooks and default_config paths updated. by @chakravarthik27 in #411
Fix: switch default HF classifier dataset from tweet to imdb by @luca-martial in #409
Chore: Website updates for new LLMs and pages by @luca-martial in #401
Release/1.2.0 by @ArshaanNazir in #415

New Contributors

@RakshitKhajuria made their first contribution in #377

Full Changelog: v1.1.0...v1.2.0
