Same issues with me.
I am getting GPU crashes on 3080 and 3060 cards, and they happen far too often.
I lowered the memory overclock by 100, but nothing changed.
The driver version is 512.15.
The error that t-rex miner showed me was 'Can't stop device xxx, cuda exception: CUDA_ERROR_UNKNOWN'.
sublimeBradley changed the title from "trex 0.26.0 - GPU Stability Issues - Linux (Hive)" to "[RESOLVED in 0.26.1] trex 0.26.0 - GPU Stability Issues - Linux (Hive)" on May 14, 2022
##################################################
Update 14 May 2022
Issue appears to be resolved in trex 0.26.1 release using HiveOS 0.6-217@220513 as well as @220511
##################################################
Original Ticket
Using trex miner v0.26.0 on a rig with two LHR cards and two non-LHR cards, I am seeing stability issues that severely reduce the LHR cards' mining performance.
Initially, mining performance is much higher than on v0.25.15 and appears stable. Typically after 15 to 20 minutes of uptime, what appears to be a CUDA/driver crash occurs on a per-GPU basis, reducing the affected card's hashrate by approximately 60% until the rig is rebooted. On this rig, the longest observed time before the first crash is 25 minutes, with the second LHR card crashing at the 40-minute mark.
This issue has been observed on HiveOS with Nvidia driver v510.60.02 and also with driver v510.68.02.
Reverting the driver to 510.60.02 did not help. Adding some Nvidia-specific tweaks to the initramfs options and rebuilding the initramfs was also unsuccessful. Overclock settings for the LHR cards have been greatly reduced as well, with no effect.
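For anyone trying to confirm the same symptom, the per-GPU hashrate drop described above can be flagged automatically. The following is only a minimal sketch: the function name `find_degraded_gpus`, the 50% threshold, and the three-sample baseline are all assumptions chosen for illustration, and collecting the hashrate samples from the miner is left out entirely.

```python
from typing import Dict, List


def find_degraded_gpus(
    samples: Dict[int, List[float]],
    drop_fraction: float = 0.5,   # assumed threshold; the ticket reports roughly a 60% drop
    baseline_window: int = 3,     # assumed number of early samples used as the baseline
) -> List[int]:
    """Return GPU indices whose latest hashrate fell well below their early baseline.

    `samples` maps a GPU index to its hashrate readings in chronological order.
    """
    degraded = []
    for gpu, rates in sorted(samples.items()):
        if len(rates) <= baseline_window:
            continue  # need at least one reading after the baseline window
        baseline = sum(rates[:baseline_window]) / baseline_window
        if baseline <= 0:
            continue  # a GPU that never hashed has no meaningful baseline
        # Flag the GPU if its most recent reading lost more than drop_fraction of the baseline.
        if rates[-1] < baseline * (1 - drop_fraction):
            degraded.append(gpu)
    return degraded
```

A threshold below the reported 60% is used so that a card is flagged even if its drop is slightly smaller than the one seen on this rig; adjust it to taste.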
Attached to this ticket are:
screenfetch log: screenfetch.log
dmesg log after crash (noted as the bottom-most lines): dmesg_aftercrash.log
nvidia-smi log prior to crash: nvidia_smi_ok.log
nvidia-smi log after crash (crash noted on GPU #1): nvidia_smi_err.log