feat: add scrape state metrics #1900

Open
wants to merge 23 commits into
base: main

catdogpandas
Contributor

No description provided.

@catdogpandas
Contributor Author

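Return libcurl's raw status code instead of converting it to a string, then map it to the self-monitoring metric. This keeps the set of values controlled and enumerable.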

@xdatcloud

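The Reason values need to be narrowed down. For example, these are the errors we run into most often:

● 401 / 403: the scrape job needs to be configured with the correct authentication credentials
● Connection Refused: nothing is listening on the target port; this is usually a service discovery misconfiguration, and excluding those targets resolves it
● Connection Timeout: usually a firewall policy is denying the probe access; the probe needs to be allowed through
● Request Timeout: the scrape timed out; increase the scrape interval (scrape_interval) and scrape timeout (scrape_timeout) parameters

In the normal case status=OK. Errors can be organized into enum values such as ERR_CODE_401, ERR_CODE_403, ERR_CONN_REFUSED, and so on. Anything that matches none of the current enum values should be reported uniformly as status=ERR_UNKNOWN.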
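
A rough sketch of what that mapping could look like, for discussion only. It is written in Python with plain integers rather than against the actual code in this PR. The three `CURLE_*` constants mirror values from libcurl's `CURLcode` enum; `ScrapeStatus`, `classify_scrape`, and the `ERR_TIMEOUT` value are hypothetical names, not something this PR defines. Two simplifications are assumed: `CURLE_COULDNT_CONNECT` is treated as "connection refused" even though libcurl also returns it for other connect failures, and connection timeouts and request timeouts are folded into one value because libcurl reports both as `CURLE_OPERATION_TIMEDOUT`.

```python
from enum import Enum

# Values mirrored from libcurl's CURLcode enum (curl/curl.h).
CURLE_OK = 0
CURLE_COULDNT_CONNECT = 7
CURLE_OPERATION_TIMEDOUT = 28


class ScrapeStatus(str, Enum):
    """Closed set of values for the status label (names partly hypothetical)."""
    OK = "OK"
    ERR_CODE_401 = "ERR_CODE_401"
    ERR_CODE_403 = "ERR_CODE_403"
    ERR_CONN_REFUSED = "ERR_CONN_REFUSED"
    ERR_TIMEOUT = "ERR_TIMEOUT"  # hypothetical: not named in this thread
    ERR_UNKNOWN = "ERR_UNKNOWN"


# Transport-level failures, keyed by the raw libcurl code.
_CURL_TO_STATUS = {
    CURLE_COULDNT_CONNECT: ScrapeStatus.ERR_CONN_REFUSED,  # simplification
    CURLE_OPERATION_TIMEDOUT: ScrapeStatus.ERR_TIMEOUT,
}

# HTTP-level failures, keyed by the response status code.
_HTTP_TO_STATUS = {
    401: ScrapeStatus.ERR_CODE_401,
    403: ScrapeStatus.ERR_CODE_403,
}


def classify_scrape(curl_code: int, http_status: int) -> ScrapeStatus:
    """Map a raw libcurl code plus HTTP status to one enumerated status."""
    if curl_code != CURLE_OK:
        # The transfer itself failed, so the HTTP status is not meaningful.
        return _CURL_TO_STATUS.get(curl_code, ScrapeStatus.ERR_UNKNOWN)
    if 200 <= http_status < 300:
        return ScrapeStatus.OK
    # Any HTTP error outside the enumerated set collapses to ERR_UNKNOWN.
    return _HTTP_TO_STATUS.get(http_status, ScrapeStatus.ERR_UNKNOWN)
```

Because every unlisted code falls back to `ERR_UNKNOWN`, the label cardinality stays bounded by the size of the enum, and adding a new reason later only means adding one enum member and one table entry.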

@catdogpandas
Contributor Author

In 4.x, this corresponds to the aliyun_prometheus_agent_scrape_custom_error metric.

5 participants