All storages are offline after restarting nebula services #5398
Comments
Errors in storage, after increasing the logging level:
Another question: is there any way to speed up the loading of parts after restarting Nebula storage? Maybe some configuration parameter controls this. Currently it takes about 3 hours to load the parts.
How many replicas do you set for each part? From the log, it looks like 2 instead of 3?
We should consider rejecting even numbers as the replication factor.
Every space has 16 partitions and replication factor 2; the disks are SSDs.
The replication factor should not be configured as an even number; maybe we should have banned such a configuration when creating spaces. Could you wipe the cluster and recreate the space with replication factor 1 (non-HA) or 3 (HA)?
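To illustrate why an even replication factor is discouraged: NebulaGraph storage replicates each part with Raft, which needs a majority of replicas to elect a leader and commit writes. The sketch below assumes the usual majority rule (more than half of the replicas); the function names are made up for this example and are not part of any Nebula API.

```python
def quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority (assumed rule)."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority is still reachable."""
    return replicas - quorum(replicas)


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(f"replicas={n} quorum={quorum(n)} tolerated_failures={tolerated_failures(n)}")
```

Under this rule, 2 replicas tolerate no failures (same as 1 replica), while 3 replicas tolerate one. With replication factor 2, a part has no leader whenever either of its two replicas is unreachable.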
I can wipe the cluster, but I have no backups. Is there any other way to recover my data?
@wenhaocs @pengweisong I think copying data from some of the storaged instances to the others will do the job, right?
Have you executed the balance data command?
Is the network stable? What about the I/O and CPU load of the storage servers?
No, but all storages are OFFLINE. Will that help?
Yes, the network is stable, and the other resources are fine too.
No, do not execute any balance data command; it will be a disaster when you only have 2 copies.
@mxsavchenko Hi, I have noticed that the issue you created hasn’t been updated for nearly a month, so I have to close it for now. If you have any new updates, you are welcome to reopen this issue anytime. Thanks a lot for your contribution anyway 😊
Docker
AlmaLinux 8.5
Intel xeon 4116
db3c1b3
~400 GB
default
Hi, I have a Nebula cluster on 3 nodes (graph/meta/storage), which was installed at version v3.2.1.
A few days ago I wanted to upgrade to version 3.4.0. I stopped all services (graph/meta/storage) on all nodes, updated the Docker image version to 3.4.0, and started the services again, but the storage services do not reach the ONLINE state after loading parts. Switching back to version 3.2.1 gives the same problem. In the storage logs, the leader is constantly being re-elected, and each node seems to take the leader role at random. The console keeps switching the storages between OFFLINE and ONLINE, and once all 3 storages have loaded their parts, they go OFFLINE.
logs from storaged0/storaged1 in zip archive:
logs.zip