
Add support for context_size and include 'interaction_id' in SearchResponse #1385

Merged
merged 2 commits into 2.x on Oct 3, 2023

Conversation

austintlee (Collaborator)

Description

Adds support for context_size and includes 'interaction_id' in SearchResponse.

Issues Resolved

#1372

Check List

  • [x] New functionality includes testing.
    • [x] All tests pass
  • [ ] New functionality has been documented.
    • [ ] New functionality has javadoc added
  • [x] Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@austintlee (Collaborator, Author)

Tests with failures:
 - org.opensearch.ml.action.prediction.PredictionITTests.testPredictionWithSearchInput_LogisticRegression

Seems like a flaky test.

@austintlee (Collaborator, Author)

@dhrubo-os @jngz-es @ylwu-amzn can you please take a look at this PR? Thanks.

@@ -22,6 +22,8 @@
import lombok.Getter;
import lombok.NoArgsConstructor;
import lombok.Setter;
import org.opensearch.common.recycler.Recycler;

Collaborator:

Didn't see where to use it.

Collaborator (Author):

Removed. I added "spotless" so I don't miss stuff like this going forward.

Comment on lines 98 to 102
/*
* error={message=This model's maximum context length is 4097 tokens. However, your messages resulted in 4456 tokens.
* Please reduce the length of the messages.,
* type=invalid_request_error, param=messages, code=context_length_exceeded}
*/

Collaborator:

Remove it?

Collaborator (Author):

Removed.

Comment on lines +96 to +98
if (Strings.isNullOrEmpty(systemPrompt) && Strings.isNullOrEmpty(userInstructions)) {
    systemPrompt = DEFAULT_SYSTEM_PROMPT;
}

Collaborator:

If only userInstructions exists, we don't set the default system prompt for the chat use case?

Collaborator (Author):

Yes, that's right. You can actually pass a system prompt as part of the user instructions. We are getting some mixed results, so we are keeping the behavior this way for now to allow for more experiments.
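
For reference, a minimal sketch of the defaulting rule under discussion; this is an illustration only, not the PR's code (isNullOrEmpty stands in for the Strings utility used in the diff, and DEFAULT_SYSTEM_PROMPT is the constant from the surrounding class):

// Illustration: the default system prompt is applied only when BOTH
// systemPrompt and userInstructions are absent, because a system prompt
// can also be passed via userInstructions.
private static boolean isNullOrEmpty(String s) {
    return s == null || s.isEmpty();
}

private String resolveSystemPrompt(String systemPrompt, String userInstructions) {
    if (isNullOrEmpty(systemPrompt) && isNullOrEmpty(userInstructions)) {
        return DEFAULT_SYSTEM_PROMPT;
    }
    return systemPrompt; // may be null/empty when only userInstructions is set
}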

austintlee temporarily deployed to ml-commons-cicd-env September 29, 2023 20:58 — with GitHub Actions
codecov bot commented Sep 29, 2023

Codecov Report

Merging #1385 (857f436) into 2.x (2c8cc02) will increase coverage by 0.14%.
Report is 9 commits behind head on 2.x.
The diff coverage is 88.88%.

@@             Coverage Diff              @@
##                2.x    #1385      +/-   ##
============================================
+ Coverage     78.35%   78.49%   +0.14%     
- Complexity     2275     2296      +21     
============================================
  Files           190      190              
  Lines          9286     9418     +132     
  Branches        910      934      +24     
============================================
+ Hits           7276     7393     +117     
- Misses         1599     1602       +3     
- Partials        411      423      +12     
Flag         Coverage Δ
ml-commons   78.49% <88.88%> (+0.14%) ⬆️

Flags with carried forward coverage won't be shown.

Files                                                     Coverage Δ
...ing/generative/GenerativeQAProcessorConstants.java     75.00% <ø> (ø)
...ering/generative/GenerativeQARequestProcessor.java     100.00% <100.00%> (ø)
.../generative/client/ConversationalMemoryClient.java     97.22% <100.00%> (+0.34%) ⬆️
...ng/generative/ext/GenerativeQAParamExtBuilder.java     100.00% <ø> (ø)
...nswering/generative/ext/GenerativeQAParamUtil.java     92.85% <100.00%> (+3.96%) ⬆️
...nanswering/generative/llm/ChatCompletionInput.java     100.00% <ø> (ø)
...estionanswering/generative/llm/DefaultLlmImpl.java     100.00% <100.00%> (ø)
...es/questionanswering/generative/llm/LlmIOUtil.java     100.00% <100.00%> (ø)
...questionanswering/generative/llm/ModelLocator.java     100.00% <ø> (ø)
...answering/generative/GenerativeSearchResponse.java     94.73% <87.50%> (-5.27%) ⬇️
... and 5 more

@@ -58,11 +60,16 @@ public class GenerativeQAResponseProcessor extends AbstractProcessor implements

private static final int DEFAULT_CHAT_HISTORY_WINDOW = 10;

private static final int MAX_PROCESSOR_TIME_IN_SECONDS = 60;

Collaborator:

I feel like DEFAULT makes more sense here.

Collaborator (Author):

Hmm, maybe a "max" value doesn't make sense here, although I do think 60 seconds is a long time. I will change this to a 30-second default timeout.

private static final ParseField INTERACTION_SIZE = new ParseField("interaction_size");
private static final ParseField TIMEOUT = new ParseField("timeout");

public static final int SIZE_NULL_VALUE = -1;

Collaborator:

Can we please add a comment for this variable? The variable name isn't quite self-explanatory.

Collaborator (Author):

Done.
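
The comment added in the PR isn't quoted in this thread; a plausible form, based on how the sentinel is used elsewhere in the diff, would be:

// Sentinel indicating that the client did not supply a value for an
// optional size-type parameter (e.g. context_size, interaction_size, timeout).
public static final int SIZE_NULL_VALUE = -1;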

}
log.info("Using interaction size of {}", interactionSize);
List<Interaction> chatHistory = (conversationId == null) ? Collections.emptyList() : memoryClient.getInteractions(conversationId, interactionSize);
log.info("Retrieved chat history. ({})", getDuration(start));

Collaborator:

Just curious, what's the goal of adding getDuration(start) to the log?

Collaborator (Author):

It logs the elapsed time for client calls. I can use debug for the log level.
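
The getDuration implementation isn't shown in this thread; a plausible sketch, assuming it measures wall-clock time from a start timestamp captured before the client call:

// Hypothetical helper: elapsed time since 'start' (epoch millis), used to
// log the latency of memory and LLM client calls.
private static String getDuration(long start) {
    return (System.currentTimeMillis() - start) + " ms";
}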

@@ -37,6 +42,7 @@

import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

Collaborator:

Did we use these newly imported libraries?

Collaborator (Author):

Removed.

@@ -93,45 +102,94 @@ public SearchResponse processResponse(SearchRequest request, SearchResponse resp
}

GenerativeQAParameters params = GenerativeQAParamUtil.getGenerativeQAParameters(request);

Integer timeout = params.getTimeout();

Collaborator:

one liner?

Integer timeout = (params.getTimeout() != null && params.getTimeout() != GenerativeQAParameters.SIZE_NULL_VALUE) ? params.getTimeout() : MAX_PROCESSOR_TIME_IN_SECONDS;

Collaborator (Author):

Well, that would involve calling .getTimeout() up to three times. Does the Java compiler do something clever to avoid that?

Collaborator:

That's a good point. We could do:

Integer timeoutValue = params.getTimeout();
Integer timeout = (timeoutValue != null && timeoutValue != GenerativeQAParameters.SIZE_NULL_VALUE) ? timeoutValue : MAX_PROCESSOR_TIME_IN_SECONDS;

But it's up to you; keep the code as it is if you want.
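
A third option, not from the PR, is to factor the null/sentinel handling into a small helper so the getter is evaluated exactly once and the same logic can be reused for other optional parameters:

// Hypothetical helper: resolves a nullable Integer, treating both null and
// the SIZE_NULL_VALUE sentinel (-1) as "unset".
private static int resolveOrDefault(Integer value, int defaultValue) {
    return (value == null || value == GenerativeQAParameters.SIZE_NULL_VALUE) ? defaultValue : value;
}

// e.g. int timeout = resolveOrDefault(params.getTimeout(), DEFAULT_PROCESSOR_TIME_IN_SECONDS);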

List<Interaction> chatHistory = (conversationId == null) ? Collections.emptyList() : memoryClient.getInteractions(conversationId, interactionSize);
log.info("Retrieved chat history. ({})", getDuration(start));

Integer topN = params.getContextSize();

Collaborator:

one liner?

Integer topN = params.getContextSize() != null ? params.getContextSize() : GenerativeQAParameters.SIZE_NULL_VALUE;

builder.field(GENERATIVE_QA_ERROR_FIELD_NAME, this.errorMessage);
} else {
/* body of our stuff */
builder.field(GENERATIVE_QA_ANSWER_FIELD_NAME, this.answer);

Collaborator:

Apologies if I missed it, but don't we need a corresponding parser?

Collaborator (Author):

There is a TODO at the top of the file to address your question.
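
For reference, a minimal sketch of what such a parser might eventually look like; this is an assumption, not code from the PR, and the constructor shape is invented for illustration (XContentParser lives in org.opensearch.core.xcontent in the 2.x line):

// Hypothetical fromXContent sketch mirroring the toXContent fields above.
// Assumes the parser is positioned inside the object being read.
public static GenerativeSearchResponse fromXContent(XContentParser parser) throws IOException {
    String answer = null;
    String errorMessage = null;
    String fieldName = null;
    while (parser.nextToken() != XContentParser.Token.END_OBJECT) {
        if (parser.currentToken() == XContentParser.Token.FIELD_NAME) {
            fieldName = parser.currentName();
        } else if (GENERATIVE_QA_ANSWER_FIELD_NAME.equals(fieldName)) {
            answer = parser.text();
        } else if (GENERATIVE_QA_ERROR_FIELD_NAME.equals(fieldName)) {
            errorMessage = parser.text();
        }
    }
    return new GenerativeSearchResponse(answer, errorMessage); // constructor assumed
}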

for (String result : contexts) {
    messageArray.add(new Message(ChatRole.USER, "SEARCH RESULT: " + result).toJson());

/*

Collaborator:

Are we keeping this?

Collaborator (Author):

No.

austintlee temporarily deployed to ml-commons-cicd-env October 2, 2023 17:15 — with GitHub Actions
@austintlee (Collaborator, Author)

@dhrubo-os @jngz-es anything else?

@HenryL27 (Collaborator) left a comment:

lgtm

private static final int DEFAULT_PROCESSOR_TIME_IN_SECONDS = 30;

// TODO Add "interaction_count". This is how far back in chat history we want to go back when calling LLM.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this be a hard add?

dhrubo-os merged commit ae6995a into opensearch-project:2.x on Oct 3, 2023
7 of 9 checks passed
HenryL27 pushed a commit to HenryL27/ml-commons that referenced this pull request Oct 3, 2023
Add support for context_size and include 'interaction_id' in SearchResponse (opensearch-project#1385)

* Add support for context_size and include 'interaction_id' in SearchResponse. [Issue opensearch-project#1372]

Signed-off-by: Austin Lee <[email protected]>

* Added spotless, removed unused code, added more comments.

Signed-off-by: Austin Lee <[email protected]>

---------

Signed-off-by: Austin Lee <[email protected]>
Signed-off-by: HenryL27 <[email protected]>
HenryL27 pushed a commit to HenryL27/ml-commons that referenced this pull request Oct 3, 2023
Add support for context_size and include 'interaction_id' in SearchResponse (opensearch-project#1385)

* Add support for context_size and include 'interaction_id' in SearchResponse. [Issue opensearch-project#1372]

Signed-off-by: Austin Lee <[email protected]>

* Added spotless, removed unused code, added more comments.

Signed-off-by: Austin Lee <[email protected]>

---------

Signed-off-by: Austin Lee <[email protected]>
(cherry picked from commit ae6995a)
Signed-off-by: HenryL27 <[email protected]>
HenryL27 pushed a commit to HenryL27/ml-commons that referenced this pull request Oct 4, 2023
Add support for context_size and include 'interaction_id' in SearchResponse (opensearch-project#1385)

* Add support for context_size and include 'interaction_id' in SearchResponse. [Issue opensearch-project#1372]

Signed-off-by: Austin Lee <[email protected]>

* Added spotless, removed unused code, added more comments.

Signed-off-by: Austin Lee <[email protected]>

---------

Signed-off-by: Austin Lee <[email protected]>
(cherry picked from commit ae6995a)
Signed-off-by: HenryL27 <[email protected]>
jngz-es pushed a commit that referenced this pull request Oct 4, 2023
Add support for context_size and include 'interaction_id' in SearchResponse (#1385) (#1433)

* Add support for context_size and include 'interaction_id' in SearchResponse. [Issue #1372]

* Added spotless, removed unused code, added more comments.

---------

(cherry picked from commit ae6995a)

Signed-off-by: Austin Lee <[email protected]>
Signed-off-by: HenryL27 <[email protected]>
Co-authored-by: Austin Lee <[email protected]>
TrungBui59 pushed a commit to TrungBui59/ml-commons that referenced this pull request Nov 21, 2023
Add support for context_size and include 'interaction_id' in SearchResponse (opensearch-project#1385) (opensearch-project#1433)

* Add support for context_size and include 'interaction_id' in SearchResponse. [Issue opensearch-project#1372]

* Added spotless, removed unused code, added more comments.

---------

(cherry picked from commit ae6995a)

Signed-off-by: Austin Lee <[email protected]>
Signed-off-by: HenryL27 <[email protected]>
Co-authored-by: Austin Lee <[email protected]>
Signed-off-by: TrungBui59 <[email protected]>