feat(learning classifier) | make a learning classifier by itself #21

ammirsm · 2024-07-19T01:00:06Z

No description provided.

CyrusNuevoDia · 2024-07-19T23:51:40Z

py/src/zenbase/predefined/single_class_classifier/classifier.py

+    class_dict: Optional[Dict[str, str]] = field(default=None)
+    class_enum: Optional[Enum] = field(default=None)
+    prediction_class: Optional[Type[BaseModel]] = field(default=None)
+    model: str
+    zenbase_tracer: ZenbaseTracer
+    lm_function: Optional[LMFunction] = field(default=None)
+    training_set: List[DatasetItem]
+    test_set: List[DatasetItem]
+    validation_set: List[DatasetItem]
+    shots: int = 5
+    samples: int = 10
+    best_evaluation: Optional[CandidateEvalResult] = field(default=None)
+    base_evaluation: Optional[CandidateEvalResult] = field(default=None)


Are you able to do the following on Python 3.10?

use | None instead of Optional

use list instead of List

CyrusNuevoDia · 2024-07-19T23:52:40Z

py/src/zenbase/predefined/single_class_classifier/function_generator.py

+    A generator for creating single-class classifier language model functions.
+    """
+
+    instructor_client: Instructor | AsyncInstructor


Is there a way to remove dependency on the Instructor client? Ideally a user submits their own LM function with an initial prompt.

Would be cool to return the results in an OpenAI compatible kwargs form so the user can consume them however they want

we want to make sure we are getting the structured output.

CyrusNuevoDia · 2024-07-19T23:54:16Z

Lgtm for now, though tbh I think there's a lot of work we can do next week to refine the elegance and simplicity of this.

CyrusNuevoDia · 2024-07-19T23:55:21Z

I think a good way to understand the goal of this is for us to be able to get integrated into e.g. https://www.askmarvin.ai/welcome/what_is_marvin/

- Implement `news_dataset` fixture to load the 20 Newsgroups dataset. - Create tests for `SingleClassClassifierLMFunctionGenerator`, including initialization and prediction verification. - Ensure balanced dataset creation for training, validation, and test sets in `SingleClassClassifier`.

…tionGenerator Implement exponential backoff with logging for the classifier function to improve resilience during retries.

ammirsm · 2024-07-21T20:25:49Z

Lgtm for now, though tbh I think there's a lot of work we can do next week to refine the elegance and simplicity of this.

I think a good way to understand the goal of this is for us to be able to get integrated into e.g.
https://www.askmarvin.ai/welcome/what_is_marvin/

I think it's quite straightforward at the moment; we just need to create a classifier and optimize it.

Regarding the goal, it seems our code is already quite similar to the example you mentioned (e.g., https://www.askmarvin.ai/welcome/what_is_marvin/).

The only way I see to simplify it further is to:

Extract the model configurations and initialize them during the library import, then have the class retrieve them from the library. However, I prefer keeping the scope within the class instead of using a global approach.

Here’s an example of the current implementation:

classifier = SingleClassClassifier(
    # Model config --> This should be done in another part with Maven too.
    instructor_client=instructor_client,
    model="gpt-4o-mini",
    zenbase_tracer=zenbase_tracer,
    # Prompt definition --> Same as Maven.
    prompt=prompt_definition,
    class_dict=class_dict,
    # Optimization parameters.
    training_set=train_set,
    validation_set=validation_set,
    test_set=test_set,
)
best_fn, _, _ = classifier.perform()
output = best_fn(sample_input)

- Change type hints from `Optional` to union types for clarity. - Modify the `_create_evaluator` method to be static. - Enhance test assertions to validate the result object and its properties.

* Downgrade Faker to 24.2.0 and update lock files * Add single class classifier synthetic data generator This commit message succinctly describes the main addition in the diff, which is a new feature for generating synthetic data for single class classifiers. * Add instructor package and create synthetic data generator notebook

…earning-classifier-by-itself

ammirsm added 6 commits July 18, 2024 18:57

add: single class classifier generator.

0e9e45c

add: adaptors.

eac9206

add: predifined classifier.

66cdb4a

add: tests.

fb052fc

add: cookbook.

c3f13c8

update: readme.

277297f

CyrusNuevoDia reviewed Jul 19, 2024

View reviewed changes

ammirsm added 2 commits July 21, 2024 14:09

Add retry logic to classifier function in SingleClassClassifierLMFunc…

73159ab

…tionGenerator Implement exponential backoff with logging for the classifier function to improve resilience during retries.

ammirsm added 2 commits July 21, 2024 14:28

Refactor SingleClassClassifier attributes and update test cases

3137ea5

- Change type hints from `Optional` to union types for clarity. - Modify the `_create_evaluator` method to be static. - Enhance test assertions to validate the result object and its properties.

Add pytest mark for helpers in single class classifier test

2c5b7e4

ammirsm changed the title ~~WIP feat(learning classifier) | make a learning classifier by itself~~ feat(learning classifier) | make a learning classifier by itself Jul 21, 2024

ammirsm and others added 10 commits July 24, 2024 18:10

Convert dataset to LMDemo objects for dict and synthetic data types

0bac54d

Add 'datasets' to required packages in single_class_classifier notebook

ec90747

Add single class classifier notebook with synthetic data

48bc0d2

Add single class classifier notebook with synthetic data

7e40acc

Add environment setup and Zenbase import

0baa66b

Remove parea-ai package from installation list

69f3792

Bump version to 0.0.6

6140577

Update predefined prompts cookbooks section in README

6326144

Merge branch 'main' into amir/eng-32-featlearning-classifier-make-a-l…

d8b0647

…earning-classifier-by-itself

ammirsm merged commit cfae458 into main Jul 25, 2024
3 checks passed

ammirsm deleted the amir/eng-32-featlearning-classifier-make-a-learning-classifier-by-itself branch July 28, 2024 19:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(learning classifier) | make a learning classifier by itself #21

feat(learning classifier) | make a learning classifier by itself #21

ammirsm commented Jul 19, 2024

CyrusNuevoDia Jul 19, 2024

CyrusNuevoDia Jul 19, 2024

CyrusNuevoDia Jul 19, 2024

ammirsm Jul 21, 2024

CyrusNuevoDia commented Jul 19, 2024

CyrusNuevoDia commented Jul 19, 2024

ammirsm commented Jul 21, 2024 •

edited

Loading

feat(learning classifier) | make a learning classifier by itself #21

feat(learning classifier) | make a learning classifier by itself #21

Conversation

ammirsm commented Jul 19, 2024

CyrusNuevoDia Jul 19, 2024

Choose a reason for hiding this comment

CyrusNuevoDia Jul 19, 2024

Choose a reason for hiding this comment

CyrusNuevoDia Jul 19, 2024

Choose a reason for hiding this comment

ammirsm Jul 21, 2024

Choose a reason for hiding this comment

CyrusNuevoDia commented Jul 19, 2024

CyrusNuevoDia commented Jul 19, 2024

ammirsm commented Jul 21, 2024 • edited Loading

ammirsm commented Jul 21, 2024 •

edited

Loading