[fundamental] Recording support metrics #1762
# Description

This PR is focused on enabling the recording injection mode for CI/tests in the execution environment. The recording mode provides a mechanism to record and replay end-to-end tests, improving the reliability and efficiency of our testing process. For detailed information on the recording mode, please refer to our [documentation](https://github.com/microsoft/promptflow/blob/main/docs/dev/replay-e2e-test.md).

In this PR, we still keep the connection configuration in CI to preserve backward compatibility, while enabling the recording injection fixture to switch tests into recording mode.

## **Key Features of This PR:**

### **Generation of Test Records**

Generate records for execution tests when they are missing from the shelve database, so that the tests work under recording mode.

### **Multi-Process Compatibility**

Resolves issues with the recording injection mode in a multi-process environment, particularly with processes started via spawn/forkserver.

**_What is the issue with new processes (spawn/forkserver)?_**

Spawn/forkserver mode does not replicate the resources/setup of the main process, including the recording setup, so recording stops working in the child process.

![image](https://github.com/microsoft/promptflow/assets/95913588/4cc8f68a-f61f-49ad-b2bd-6717f4d304bf)

**_How was this issue resolved?_**

There are multiple ways to pass the recording setup to a new process, such as an environment variable or a serializable object passed as an argument, but these would either require interface changes or be too complex to squeeze into a simple state object. Instead, we re-import the state inside the new process.
1) Create a new target method for the process: the new target re-imports the needed state. Example: for the process target method `_process_wrapper`, define another wrapper method around it and inject the recording state.

![image](https://github.com/microsoft/promptflow/assets/95913588/426827a0-3c1e-426d-864f-ec85ad611416)

2) Define a customized Process class with the new target, and enable it whenever a new process is spawned/forkservered.

![image](https://github.com/microsoft/promptflow/assets/95913588/21756ecf-fbed-4d6e-a21a-79228988b407)

3) Override the `context.Process` or `multiprocessing.Process` class: `multiprocessing.get_context("spawn").Process = MockSpawnProcess` or `multiprocessing.Process = MockSpawnProcess`.

We have implemented the logic above and integrated it as part of the recording injection fixture for testing.

**_So do all the tests under executor/sdk intercept third-party calls like this by default in CI?_**

Yes. All CI runs with `PROMPT_FLOW_TEST_MODE=replay`, so all openai/aoai calls are mocked to retrieve recorded results rather than make real API requests.

**_Sometimes we need to customize the openai/AOAI call in a test. What should we do instead of using the default recording mock?_**

1) Create your own target with customized mocking of the openai/aoai call.
2) Override the default recording target via the context manager `override_process_target`.

Sample test: `test_executor_openai_telemetry`.

**_In which scope is the recording mode enabled?_**

The recording mode is enabled per requirement. If a test involves a third-party service with a connection (openai/aoai, etc.), it is required to enable recording mode; otherwise your PR will fail the CI tests, since the connection info is not configured. To enable recording mode, just make sure the fixture `recording_injection` is required at the test class or method level.
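That per-class opt-in might look like the following hypothetical sketch, with a stand-in `recording_injection` fixture (the real one lives in promptflow's test conftest and patches openai/aoai entry points):

```python
import pytest


@pytest.fixture
def recording_injection():
    """Hypothetical stand-in for promptflow's recording_injection fixture.

    The real fixture patches openai/aoai calls so they replay recorded
    responses instead of hitting the live service.
    """
    yield "replay"


# Opt in at class level: every test in the class runs under recording mode.
@pytest.mark.usefixtures("recording_injection")
class TestFlowWithLLM:
    def test_llm_node(self):
        # Would normally exercise an llm node; the fixture ensures the
        # underlying openai call is served from the record database.
        assert True
```

A single test can opt in the same way by applying `@pytest.mark.usefixtures("recording_injection")` at the method level, or by naming the fixture as a test parameter.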
_**Why not make recording mode a session-level fixture, or a required fixture for all tests?**_

1) The recording setup is complicated; it might introduce unexpected behavior if abused.
2) Some tests, like **test_executor_openai_telemetry**, serve special test purposes. With recording enabled, their customized mocking would not be easy to configure.

**_Note:_** _The logic above resolves this issue for one layer of new processes. If you have nested process creation, you need to repeat it at each layer._

## **Todos and Ongoing Work [updated]:**

### **[Done] Metrics Mocks**

Currently, the openai_injector is skipped, as it does not involve actual openai/aoai requests. This omission leads to potential inaccuracies in metrics calculation, especially token usage, in recording mode. Future work will focus on integrating openai_injector.py into the recording mode. Fixed by [[fundamental] Recording support metrics by crazygao · Pull Request #1762 · microsoft/promptflow (github.com)](#1762).

### **[Todo] Consolidation of Configuration**

Efforts are underway to consolidate the configuration settings for recording injection in both the execution environment and the SDK, aiming for a more unified and streamlined setup.

### **[Done] Record error info**

Record not only the regular tool execution result but also the error response if an exception occurs. Fixed in this PR; sample test: `test_executor_node_overrides`.

### **[Done] Test langchain**

Support langchain tests with an agent. Fixed by [[fundamental] Recording support metrics by crazygao · Pull Request #1762 · microsoft/promptflow (github.com)](#1762).

### **[Todo] Create a CI pipeline without connections to monitor**

We need a CI without connections to identify tests that need the recording setup.

### **[Todo] Make a migration plan to adopt the non-connection CI as the main CI**

We need a plan to safely switch to the new record mode.
Make sure there is fallback logic in case the record flow design hinders new feature delivery.

### **[Todo] Enable more APIs with recording mode**

For example, assistant APIs and their tests.

# All Promptflow Contribution checklist:
- [x] **The pull request does not introduce [breaking changes].**
- [ ] **CHANGELOG is updated for new features, bug fixes or other significant changes.**
- [x] **I have read the [contribution guidelines](../CONTRIBUTING.md).**
- [x] **Create an issue and link to the pull request to get dedicated review from the promptflow team. Learn more: [suggested workflow](../CONTRIBUTING.md#suggested-workflow).**

## General Guidelines and Best Practices
- [x] Title of the pull request is clear and informative.
- [ ] There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, [see this page](https://github.com/Azure/azure-powershell/blob/master/documentation/development-docs/cleaning-up-commits.md).

### Testing Guidelines
- [x] Pull request includes test coverage for the included changes.

---------

Co-authored-by: Philip Gao <[email protected]>
Co-authored-by: chjinche <[email protected]>
Co-authored-by: Cheng Liu <[email protected]>
Co-authored-by: Han Wang <[email protected]>
Co-authored-by: chenslucky <[email protected]>
Co-authored-by: cs_lucky <[email protected]>
Co-authored-by: Ying Chen <[email protected]>
Co-authored-by: Ying Chen <[email protected]>
Co-authored-by: chenyang <[email protected]>
Co-authored-by: Peiwen Gao <[email protected]>
# Description
How to support metrics:
The old injection injects `mock_tool`, and all logic below `mock_tool` is dropped, especially metric collection.
Good point: `mock_tool` can also record items with no LLM calls.
New way:
Add a new layer at `openai_inject`.
This way the logic below `mock_tool` is not affected, and usage for openai calls is easily collected.
Bad point: it knows nothing about anything besides the openai call.
So the new recording is a combination of the two:
Inject openai calls for LLM nodes, and inject the tool for Python nodes.
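The two injection layers can be sketched roughly as below. `inject_openai_usage`, `inject_tool`, and the stand-in calls are illustrative names, not promptflow's actual API; the only assumption carried over from real openai responses is the `usage` dict with token counts.

```python
import functools

TOKEN_METRICS = []  # usage records harvested from wrapped openai-style calls
TOOL_RESULTS = []   # results recorded at the tool layer for python nodes


def inject_openai_usage(func):
    """Record token usage from an openai-style response (llm nodes)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        response = func(*args, **kwargs)
        usage = response.get("usage")
        if usage:
            TOKEN_METRICS.append(usage)
        return response
    return wrapper


def inject_tool(func):
    """Record the tool result itself (python nodes, no llm call to tap)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        TOOL_RESULTS.append(result)
        return result
    return wrapper


@inject_openai_usage
def fake_completion(prompt):
    # Stand-in for a recorded/replayed openai call.
    return {
        "choices": [{"text": f"echo: {prompt}"}],
        "usage": {"prompt_tokens": 3, "completion_tokens": 5, "total_tokens": 8},
    }


@inject_tool
def python_tool(x):
    return x.upper()
```

Wrapping at the openai layer keeps everything below `mock_tool` intact, while the tool layer still catches nodes that never touch openai.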
Token counts are also uploaded in live mode. Here is an example pipeline run:
https://github.com/microsoft/promptflow/actions/runs/7705912004
The uploaded artifact can then be retrieved from that run.