Wrong CUDA Crash GPU Number #732

Closed
megalolka opened this issue Oct 14, 2021 · 3 comments

@megalolka

Hi!
I found a bug: the miner first reports the crashed GPU as #6:
20211014 11:36:56 TREX: Can't stop device [ID=6, GPU #6], cuda exception: CUDA_ERROR_UNKNOWN
But then the watchdog crash list shows GPU #5:
WD: ======== GPU CRASH LIST ========
WD: GPU#5: 1


Here is the full log:

20211014 11:36:47 [ OK ] 69/69 - 320.66 MH/s, 24ms ... GPU #6 | 48.93 G
20211014 11:36:47 ethash epoch: 447, block: 13415309, diff: 4.29 G
20211014 11:36:56 TREX: Can't stop device [ID=6, GPU #6], cuda exception: CUDA_ERROR_UNKNOWN
20211014 11:36:56 WARN: Miner is going to shutdown...
20211014 11:36:56 Main loop finished. Cleaning up resources...
20211014 11:36:56 ApiServer: stopped listening on 127.0.0.1:4067
20211014 11:36:57 T-Rex finished.
20211014 11:36:57 WARN: WATCHDOG: T-Rex does not exist anymore, restarting...
20211014 11:36:57 WATCHDOG: 2 miner restarts till 'exit'
20211014 11:36:59 T-Rex NVIDIA GPU miner v0.24.2 - [Windows]
20211014 11:36:59 r.66f25c328d28
20211014 11:36:59
20211014 11:36:59
20211014 11:36:59 NVIDIA Driver v471.96
20211014 11:36:59
20211014 11:36:59 + GPU #0: [00:01.0] MSI GeForce RTX 3060 Ti, 8192 MB
20211014 11:36:59 + GPU #1: [00:03.0] MSI GeForce RTX 3060 Ti, 8192 MB
20211014 11:36:59 + GPU #2: [00:08.0] MSI GeForce RTX 3060 Ti, 8192 MB
20211014 11:36:59 + GPU #3: [00:09.0] MSI GeForce RTX 3060 Ti, 8192 MB
20211014 11:36:59 + GPU #4: [00:0a.0] MSI GeForce RTX 3060 Ti, 8192 MB
20211014 11:36:59 + GPU #5: [00:0b.0] MSI GeForce RTX 3060 Ti, 8192 MB
20211014 11:36:59 + GPU #6: [00:0c.0] MSI GeForce RTX 3060 Ti, 8192 MB
20211014 11:36:59 + GPU #7: [00:0d.0] MSI GeForce RTX 3060 Ti, 8192 MB
20211014 11:36:59
20211014 11:36:59 WARN: DevFee 1% (ethash)
20211014 11:36:59
20211014 11:36:59 === MAIN POOL === | back to main in 10 mins |
20211014 11:36:59 URL : stratum+tcp://eu1.ethermine.org:4444
20211014 11:36:59 USER: 0x4e24eec709e942cdde1e1b4d65f56d44ebe13b25.RobRig01-Trex
20211014 11:36:59 PASS:
20211014 11:36:59
20211014 11:36:59 URL : stratum+tcp://us1.ethermine.org:4444
20211014 11:36:59 USER: 0x4e24eec709e942cdde1e1b4d65f56d44ebe13b25.RobRig01-Trex
20211014 11:36:59 PASS:
20211014 11:36:59
20211014 11:36:59 WARN: Temperature limit is activated. Limit is set to 68C.
20211014 11:36:59
20211014 11:36:59 Starting on: eu1.ethermine.org:4444
20211014 11:36:59 ApiServer: HTTP server started on 127.0.0.1:4067
20211014 11:36:59 ---------------------------------------------------
20211014 11:36:59 For control navigate to: http://127.0.0.1:4067/trex
20211014 11:36:59 ---------------------------------------------------
20211014 11:36:59 Using protocol: stratum1.
20211014 11:36:59 Authorizing...
20211014 11:36:59 Authorized successfully.
20211014 11:36:59 ethash epoch: 447, block: 13415309, diff: 4.29 G
20211014 11:37:00 GPU #0: [LHR 72<>] intensity 21.3, mclock 800MHz, core clock locked 1320MHz
20211014 11:37:00 GPU #2: [LHR 71<>] intensity 21.3, mclock 900MHz, core clock locked 1320MHz
20211014 11:37:00 GPU #4: [LHR 71<>] intensity 21.3, mclock 800MHz, core clock locked 1320MHz
20211014 11:37:00 GPU #5: [LHR 72<>] intensity 21.3, mclock 800MHz, core clock locked 1320MHz
20211014 11:37:01 GPU #6: [LHR 72<>] intensity 21.3, mclock 840MHz, core clock locked 1320MHz
20211014 11:37:01 GPU #7: [LHR 71<>] intensity 21.3, mclock 720MHz, core clock locked 1320MHz
20211014 11:37:01 GPU #1: [LHR 71<>] intensity 21.3, mclock 840MHz, core clock locked 1320MHz
20211014 11:37:01 GPU #3: [LHR 75<>] intensity 21.3, mclock 800MHz, core clock locked 1320MHz
20211014 11:37:05 GPU #5: generating DAG 4.49 GB for epoch 447 ...
20211014 11:37:05 GPU #4: generating DAG 4.49 GB for epoch 447 ...
20211014 11:37:05 GPU #3: generating DAG 4.49 GB for epoch 447 ...
20211014 11:37:05 GPU #6: generating DAG 4.49 GB for epoch 447 ...
20211014 11:37:05 GPU #2: generating DAG 4.49 GB for epoch 447 ...
20211014 11:37:05 GPU #1: generating DAG 4.49 GB for epoch 447 ...
20211014 11:37:05 GPU #0: generating DAG 4.49 GB for epoch 447 ...
20211014 11:37:05 GPU #7: generating DAG 4.49 GB for epoch 447 ...
20211014 11:37:23 GPU #0: DAG generated [crc: 1e3734c7, time: 18226 ms], memory left: 2.50 GB
20211014 11:37:23 GPU #0: using kernel #4
20211014 11:37:23 GPU #1: DAG generated [crc: 1e3734c7, time: 18594 ms], memory left: 2.50 GB
20211014 11:37:23 GPU #1: using kernel #4
20211014 11:37:25 GPU #0: target hashrate for unlocker - 41.88 MH/s
20211014 11:37:25 GPU #5: DAG generated [crc: 1e3734c7, time: 20605 ms], memory left: 2.50 GB
20211014 11:37:25 GPU #5: using kernel #4
20211014 11:37:25 GPU #3: DAG generated [crc: 1e3734c7, time: 20670 ms], memory left: 2.50 GB
20211014 11:37:25 GPU #3: using kernel #4
20211014 11:37:25 GPU #1: target hashrate for unlocker - 41.33 MH/s
20211014 11:37:27 GPU #2: DAG generated [crc: 1e3734c7, time: 22683 ms], memory left: 2.50 GB
20211014 11:37:27 GPU #5: target hashrate for unlocker - 38.66 MH/s
20211014 11:37:27 GPU #2: using kernel #4
20211014 11:37:27 GPU #4: DAG generated [crc: 1e3734c7, time: 22779 ms], memory left: 2.50 GB
20211014 11:37:27 GPU #4: using kernel #4
20211014 11:37:27 GPU #3: target hashrate for unlocker - 37.81 MH/s
20211014 11:37:28 ethash epoch: 447, block: 13415310, diff: 4.29 G
20211014 11:37:28 GPU #7: DAG generated [crc: 1e3734c7, time: 23662 ms], memory left: 2.50 GB
20211014 11:37:28 GPU #7: using kernel #4
20211014 11:37:28 GPU #6: DAG generated [crc: 1e3734c7, time: 23951 ms], memory left: 2.50 GB
20211014 11:37:28 GPU #6: using kernel #4
20211014 11:37:30 [ OK ] 1/1 - 385.67 MH/s, 24ms ... GPU #3 | 33.94 G
20211014 11:37:30 GPU #4: target hashrate for unlocker - 41.69 MH/s
20211014 11:37:30 GPU #2: target hashrate for unlocker - 41.33 MH/s
20211014 11:37:30 GPU #7: target hashrate for unlocker - 40.86 MH/s
20211014 11:37:30 GPU #6: target hashrate for unlocker - 39.38 MH/s
20211014 11:37:34 [ OK ] 2/2 - 316.57 MH/s, 24ms ... GPU #5 | 39.54 G
20211014 11:37:45 [ OK ] 3/3 - 319.56 MH/s, 24ms ... GPU #3 | 8.17 G
20211014 11:37:56 [ OK ] 4/4 - 322.89 MH/s, 24ms ... GPU #1 | 5.11 G
20211014 11:37:59 ethash epoch: 447, block: 13415311, diff: 4.29 G

--------------20211014 11:38:00 --------------
Mining at eu1.ethermine.org:4444, diff: 4.29 G
GPU #0: MSI RTX 3060 Ti - 41.77 MH/s, [LHR 72<>] [T:58C, P:140W, F:81%, E:390kH/W]
GPU #1: MSI RTX 3060 Ti - 41.15 MH/s, [LHR 71<>] [T:56C, P:138W, F:64%, E:399kH/W], 8/8 R:0% I:0%
GPU #2: MSI RTX 3060 Ti - 41.26 MH/s, [LHR 71<>] [T:51C, P:132W, F:47%, E:425kH/W]
GPU #3: MSI RTX 3060 Ti - 36.95 MH/s, [LHR 75<>] [T:46C, P:128W, F:41%, E:370kH/W], 15/15 R:0% I:0%
GPU #4: MSI RTX 3060 Ti - 41.40 MH/s, [LHR 71<>] [T:50C, P:127W, F:48%, E:440kH/W]
GPU #5: MSI RTX 3060 Ti - 38.57 MH/s, [LHR 72<>] [T:55C, P:121W, F:30%, E:406kH/W], 9/9 R:0% I:0%
GPU #6: MSI RTX 3060 Ti - 38.92 MH/s, [LHR 72<>] [T:42C, P:122W, F:40%, E:453kH/W]
GPU #7: MSI RTX 3060 Ti - 40.69 MH/s, [LHR 71<>] [T:52C, P:132W, F:50%, E:407kH/W]
Hashrate: 320.71 MH/s, Shares/min: 7.011 (Avg. 4.286), Avg.P: 782W, Avg.E: 410kH/W
Max diff share was found by GPU #5, diff: 39.54 G
Uptime: 1 min | Algo: ethash | T-Rex v0.24.2
WD: 18 mins 6 secs, shares: 73/73 , restarts 1
WD: ======== GPU CRASH LIST ========
WD: GPU#5: 1
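
The mismatch described above can be checked mechanically. The following is a minimal sketch, not part of T-Rex: it assumes the two line formats quoted in this log (the `Can't stop device [ID=…, GPU #…]` message and the `WD: GPU#N: count` entries) and compares the per-GPU crash counts from each source. The function names are hypothetical, and other miner versions may format these lines differently. A log file that covers only part of a watchdog session can also differ for legitimate reasons.

```python
import re
from collections import Counter

# Patterns copied from the log lines quoted in this issue (T-Rex v0.24.2).
# Assumption: other versions may format these lines differently.
CRASH_RE = re.compile(
    r"Can['’]t stop device \[ID=(\d+), GPU #(\d+)\], cuda exception: (\w+)"
)
WD_HEADER = "WD: ======== GPU CRASH LIST"
WD_ENTRY_RE = re.compile(r"^WD: GPU#(\d+): (\d+)$")


def crash_counts(lines):
    """Count crash exceptions per GPU number, as reported by the miner itself."""
    counts = Counter()
    for line in lines:
        m = CRASH_RE.search(line)
        if m:
            counts[int(m.group(2))] += 1
    return counts


def watchdog_counts(lines):
    """Return the most recent watchdog crash list as {gpu_number: count}."""
    latest = {}
    in_list = False
    for raw in lines:
        line = raw.strip()
        if line.startswith(WD_HEADER):
            latest = {}          # a newer list replaces the previous one
            in_list = True
            continue
        if in_list:
            m = WD_ENTRY_RE.match(line)
            if m:
                latest[int(m.group(1))] = int(m.group(2))
            else:
                in_list = False  # first non-entry line ends the list
    return latest


def mismatches(lines):
    """GPU numbers whose crash count differs between the two sources."""
    lines = list(lines)
    miner = crash_counts(lines)
    wd = watchdog_counts(lines)
    return sorted(g for g in set(miner) | set(wd) if miner.get(g, 0) != wd.get(g, 0))


sample = [
    "20211014 11:36:56 TREX: Can't stop device [ID=6, GPU #6], cuda exception: CUDA_ERROR_UNKNOWN",
    "WD: ======== GPU CRASH LIST ========",
    "WD: GPU#5: 1",
]
print(mismatches(sample))  # [5, 6]
```

On the lines from this report, both GPU #5 and GPU #6 are flagged: the miner blames #6 once and the watchdog blames #5 once.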

@Crisis83

I have a similar issue: the wrong GPU is reported as crashed. For example, GPU#0 is reported as crashed, but I know it was GPU#3 because I had just changed GPU#3’s OC settings. I guess it could be the same bug. Reverting the settings fixes the crash. This is frustrating when going through logs to find which card crashed the rig overnight, because you can’t always trust the log.
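
When tracking down a card from a log, it can help to tie each GPU number to a physical slot. The sketch below is a hypothetical helper, not a T-Rex feature. It reads startup banner lines of the form shown in the first post (`+ GPU #6: [00:0c.0] MSI GeForce RTX 3060 Ti, 8192 MB`) and builds a lookup table. It assumes the bracketed field is a PCI bus:device.function address, which the thread does not confirm. It also does not correct a mislabeled crash report; it only makes it easier to inspect the candidate cards.

```python
import re

# Format taken from the startup banner quoted in the first post.
# Assumption: the bracketed field is a PCI bus:device.function address.
GPU_LINE_RE = re.compile(r"\+ GPU #(\d+): \[([0-9a-fA-F:.]+)\] (.+?), (\d+) MB")


def gpu_inventory(lines):
    """Map GPU number -> (bracketed address, card name) from banner lines."""
    inventory = {}
    for line in lines:
        m = GPU_LINE_RE.search(line)
        if m:
            inventory[int(m.group(1))] = (m.group(2), m.group(3))
    return inventory


banner = [
    "20211014 11:36:59 + GPU #5: [00:0b.0] MSI GeForce RTX 3060 Ti, 8192 MB",
    "20211014 11:36:59 + GPU #6: [00:0c.0] MSI GeForce RTX 3060 Ti, 8192 MB",
]
inventory = gpu_inventory(banner)
print(inventory[6])  # ('00:0c.0', 'MSI GeForce RTX 3060 Ti')
```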

@trexminer
Owner

Here is a build with a possible fix. Please try it:
https://www.dropbox.com/s/svmnpjcw3e5syi0/t-rex-0.24.2-win.zip?dl=1

@trexminer
Owner

Should be fixed in 0.24.4
