Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core] Add task_name, task_function_name and actor_name in Structured Logging #48703

Merged
merged 19 commits into from
Nov 26, 2024

Conversation

MengjinYan
Copy link
Collaborator

@MengjinYan MengjinYan commented Nov 12, 2024

Why are these changes needed?

This PR added the task_name, task_function_name and actor_name to the structured logging.

To obtain the task_name and task_function, corresponding run_context methods & tests are added as well.

Related issue number

Close https://github.com/anyscale/rayturbo/issues/857

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@MengjinYan MengjinYan marked this pull request as ready for review November 13, 2024 20:51
@MengjinYan MengjinYan added the go add ONLY when ready to merge, run all tests label Nov 13, 2024
Copy link
Contributor

@alanwguo alanwguo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! lgtm but I'm not the best person to review this logic.

@kevin85421 should review I think.

Copy link
Member

@kevin85421 kevin85421 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Left some nit comments and questions.


Example:

.. testcode::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add an example for the async case?


Example:

.. testcode::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add an example for the async case?

MengjinYan and others added 4 commits November 18, 2024 15:05
Co-authored-by: Kai-Hsun Chen <[email protected]>
Signed-off-by: Mengjin Yan <[email protected]>
Co-authored-by: Kai-Hsun Chen <[email protected]>
Signed-off-by: Mengjin Yan <[email protected]>
Signed-off-by: Mengjin Yan <[email protected]>
@@ -41,6 +41,9 @@ class LogKey(str, Enum):
NODE_ID = "node_id"
ACTOR_ID = "actor_id"
TASK_ID = "task_id"
ACTOR_NAME = "actor_name"
TASK_NAME = "task_name"
TASK_FUNCTION = "task_function"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need task_function? I thought we only need task_name which should be function name by default?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. By default, in general, we the task name as the function name.

I still see 2 differences between task_name & task_function. That's why I kept them both.

  1. It is possible to set a customized task name by specifying name in options, but task function will always be the function call name
  2. The specific string in task_function will contain the python module name comparing to default task_name will only contain the function itself.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alanwguo do we need function name or task name is enough for now. I feel having task_name is enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen task name can explode in cardinality but function name is a bit more stable. Having both could be useful where we can decide which to use depending on performance characteristics. For example if task_name is too high cardinality, we can choose to use func name exclusively.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@jjyao jjyao changed the title [Core] Add task_name, task_function and actor_id in Structured Logging [Core] Add task_name, task_function and actor_name in Structured Logging Nov 25, 2024
Copy link
Collaborator

@jjyao jjyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late review. LG, just some naming comments.

@@ -41,6 +41,9 @@ class LogKey(str, Enum):
NODE_ID = "node_id"
ACTOR_ID = "actor_id"
TASK_ID = "task_id"
ACTOR_NAME = "actor_name"
TASK_NAME = "task_name"
TASK_FUNCTION = "task_function"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: task_function_name. If you think it's too long, we can also do task_func_name

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Synced offline. Changing to function name will implicitly indicate the parameter type and will be helpful to understand the field.

task_name = runtime_context.get_task_name()
if task_name is not None:
setattr(record, LogKey.TASK_NAME.value, task_name)
task_function = runtime_context.get_task_function()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_task_function_name

get_task_function makes me feel it might return the FunctionDescriptor

@@ -261,6 +261,8 @@ cdef optional[ObjectIDIndexType] NULL_PUT_INDEX = nullopt
# https://docs.python.org/3/library/contextvars.html#contextvars.ContextVar
# It is thread-safe.
async_task_id = contextvars.ContextVar('async_task_id', default=None)
async_task_name = contextvars.ContextVar('async_task_name', default=None)
async_task_function = contextvars.ContextVar('async_task_function', default=None)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

task_function_name

@jjyao jjyao changed the title [Core] Add task_name, task_function and actor_name in Structured Logging [Core] Add task_name, task_function_name and actor_name in Structured Logging Nov 25, 2024
Signed-off-by: Mengjin Yan <[email protected]>
@jjyao jjyao merged commit 58425b9 into master Nov 26, 2024
5 checks passed
@jjyao jjyao deleted the issue-857 branch November 26, 2024 05:36
jecsand838 pushed a commit to jecsand838/ray that referenced this pull request Dec 4, 2024
dentiny pushed a commit to dentiny/ray that referenced this pull request Dec 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go add ONLY when ready to merge, run all tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants