IO hang goes undetected #154
Comments
ACK
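Was this the approach you used: https://chaos-mesh.org/website-zh/docs/simulate-io-chaos-on-kubernetes/ ? In addition, in our private cloud we reproduced the IO hang that was previously triggered by a dropped disk, and tested the case where the Xenon files (logs, raft files, etc.) and the mysqld data files are on the same NeonSAN disk. The observations are as follows:
Do you have the xenon.log?
After the hang ended, the xenon log contained only these three consecutive ERROR entries (leader xenon)
A question about "disk reads and writes hang without reporting an error (commands such as ls and touch block)": was this done with the pod mounting NeonSAN directly?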
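Failure symptom: When the data disk used by the primary node becomes unreadable and unwritable, or read-only, every request that involves disk IO hangs without returning an error, while the mysqld process stays alive. When this failure occurs on the primary node, Xenon does not trigger a leader election or failover.
Experimental method: We reproduced the symptom with chaos testing, injecting a 100s latency into all IO operations on the data disk to simulate an IO hang.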
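As a rough illustration of the injection described above, the sketch below builds a Chaos Mesh IOChaos manifest that delays every IO call on a volume by 100s. It assumes Chaos Mesh is the injection tool, which the thread asks about but does not confirm. The resource name, namespace, pod labels, volume path, and duration are hypothetical placeholders, not values from this report.

```python
import json


def build_io_latency_manifest(name, namespace, pod_labels, volume_path,
                              delay="100s", duration="10m"):
    """Return a Chaos Mesh IOChaos manifest (as a dict) that delays all IO
    under volume_path. All argument values are supplied by the caller; none
    of them come from the original report except the 100s default delay."""
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "IOChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "action": "latency",            # delay IO calls instead of failing them
            "mode": "one",                  # act on one pod matching the selector
            "selector": {"labelSelectors": pod_labels},
            "volumePath": volume_path,      # mount point of the data volume
            "path": volume_path.rstrip("/") + "/**/*",  # every file under it
            "delay": delay,
            "percent": 100,                 # affect all IO operations
            "duration": duration,
        },
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    manifest = build_io_latency_manifest(
        name="mysql-io-hang",
        namespace="default",
        pod_labels={"role": "leader"},
        volume_path="/var/lib/mysql",
    )
    print(json.dumps(manifest, indent=2))
```

The JSON output can be saved to a file and applied with kubectl apply -f, since Kubernetes accepts JSON manifests as well as YAML.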
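Observations:
0. With sysbench applying continuous load, the IO latency fault was injected once the load was stable and the cluster state was confirmed to be healthy;
1. The disk became unreadable and unwritable, and QPS on the sysbench side dropped to 0;
2. Checking the cluster state with xenoncli cluster status showed the Mysql field empty and the IO/SQL field changed to false:
3. The xenon log showed two ERROR-level entries being reported continuously:
4. Logging in to the MySQL primary succeeded, but executing an insert statement hung;
[Note] In this test case the xenon-related files and the mysqld data files were not on the same data disk; the IO latency fault was injected only into the data disk used by mysqld;
The case where xenon and mysqld share the same disk has not been tested yet;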
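The observations above suggest that checking whether the process is alive or whether a login succeeds is not enough, because the hang only shows up on operations that touch the disk. One possible way to surface it is a write probe with a deadline. The following is a minimal sketch of that idea, not Xenon's implementation; the function names, the probe file name, and the default timeout are all hypothetical.

```python
import os
import threading
import uuid


def _write_and_sync(directory):
    """Create, fsync, and remove a small file so the call really hits the disk."""
    path = os.path.join(directory, ".io-probe-" + uuid.uuid4().hex)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.write(fd, b"probe")
        os.fsync(fd)
    finally:
        os.close(fd)
    os.unlink(path)


def probe_disk(directory, timeout_s=5.0, io_fn=_write_and_sync):
    """Return True only if io_fn(directory) completes without an OSError
    within timeout_s seconds; a call that is still blocked counts as a hang."""
    outcome = {}

    def worker():
        try:
            io_fn(directory)
            outcome["ok"] = True
        except OSError:
            outcome["ok"] = False

    # Daemon thread: a blocked probe must not keep the process from exiting.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    if thread.is_alive():
        return False  # no error was raised, but the IO never returned
    return outcome.get("ok", False)
```

A caller could run probe_disk against the mysqld data directory on a fixed interval and treat repeated False results as a signal to demote the node. Note that a thread blocked in a hung system call cannot be cancelled, so a long hang leaks one thread per probe; a real check would need to cap the number of outstanding probes.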