[Bug] [service down] service down and stop when use bash bin/start-all.sh
a few seconds
#16174
Closed
2 of 3 tasks
Labels
Search before asking
What happened
I use cluster deploy to deploy dolphinscheduler service to 2 server.
server and service running env below:
os: centos7
db: postgresql-15.7
zookeeper: 3.7.2
python: python3.6
java: java-11
I deploy the service dolphinscheduler to 2 server and make them both be master-server and worker-server.
when use
bash bin/start-all.sh
, every service work well and run perfect, but a few second later, some master-server and worker-server went down, but some workflow still can run.at the first time every server work well.
![image](https://private-user-images.githubusercontent.com/56484166/340623844-f78a3e87-2671-482c-90a0-c27bc1c1ee3e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzNzA0MTUsIm5iZiI6MTczOTM3MDExNSwicGF0aCI6Ii81NjQ4NDE2Ni8zNDA2MjM4NDQtZjc4YTNlODctMjY3MS00ODJjLTkwYTAtYzI3YmMxYzFlZTNlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEyVDE0MjE1NVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTRmZjY0ZDNjZjJlOGUxNTA3MjZjOTQ1NWVmYzJiODMzMGNhMDU2N2M2OTliMjkzYzZmNGEwOGQ2Y2EzMWZlYzAmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.p0YfTclq1Zvxyc5KuRK05sHSlGRYqUMb2UiQ8vNZeDs)
few seconds later, some server down but some workflow can still run.
![image](https://private-user-images.githubusercontent.com/56484166/340624181-d4dd5a38-f24b-41a5-aca7-1e404bbff89f.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzNzA0MTUsIm5iZiI6MTczOTM3MDExNSwicGF0aCI6Ii81NjQ4NDE2Ni8zNDA2MjQxODEtZDRkZDVhMzgtZjI0Yi00MWE1LWFjYTctMWU0MDRiYmZmODlmLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTIlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEyVDE0MjE1NVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTgzNTNlNzk2YTc3OTRkNTMxOTgzMjgyMmQ3MGE0NmZjMjk2N2QzNGNmZDJhOWE5M2ZjNGVmN2NhZTk3Mjk2N2MmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.eSMTno7zowHLPAz6CvdKjYtS65KhHVlDsJFZH6HWQsc)
how to solve it? here is two server's master-server log file.
dolphinscheduler-master.log
dolphinscheduler-master_2.log
What you expected to happen
service work well and not down.
dolphinscheduler-master_2.log
dolphinscheduler-master.log
How to reproduce
use cluster deploy and make them all master-server and worker-server.then start the service.
Anything else
No response
Version
3.2.x
Are you willing to submit PR?
Code of Conduct
The text was updated successfully, but these errors were encountered: