MIOPEN_FIND_ENFORCE=3 picked failing solution #51

Closed
jeffdaily opened this issue Aug 29, 2018 · 10 comments

@jeffdaily

$ apt list | grep miopen
miopen-hip/Ubuntu 16.04,now 1.4.2-0258028 amd64 [installed]
miopengemm/Ubuntu 16.04,now 1.1.5-9547fb9 amd64 [installed]

Running tf_cnn_benchmarks.py as follows with the tensorflow-upstream r1.8-rocm branch:

MIOPEN_FIND_ENFORCE=3 python tf_cnn_benchmarks.py --model=vgg16 --batch_size=64 --num_gpus=1 --num_batches=1 --num_warmup_batches=0

Possibly a separate bug, but I saw this output:

2018-08-29 09:42:38.991402: I tensorflow/core/kernels/conv_grad_filter_ops.cc:959] running auto-tune for Backward-Filter
miopenFindConvolutionBackwardWeightsAlgorithm: ./bin/MIOpenDriver conv -n 64 -c 128 -H 112 -W 112 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -t 1
MIOpen(HIP): Error [FindSolutionImpl] Search failed for: ConvAsmBwdWrW3x3: /data/repo/MIOpen/src/hip/handlehip.cpp:70: Memory not available to allocate buffer: 411041792

So I switched to using MIOpenDriver without all of TF. Without MIOPEN_FIND_ENFORCE=3 this was the output:

./bin/MIOpenDriver conv -n 64 -c 128 -H 112 -W 112 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -t 1
MIOpenDriver: conv -n 64 -c 128 -H 112 -W 112 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -t 1
MIOpen Forward Conv. Algorithm: 3
GPU Kernel Time Forward Conv. Elapsed: 18.477495 ms
Forward Convolution Verifies on CPU and GPU (5.68537e-08)
MIOpen Backward Data Conv. Algorithm: 3
GPU Kernel Time Backward Data Conv. Elapsed: 17.817320 ms
MIOpen Backward Weights Conv. Algorithm: 0
GPU Kernel Time Backward Weights Conv. Elapsed: 61.924541 ms
Backward Convolution Data Verifies on CPU and GPU (6.28033e-08)
Backward Convolution Weights Verifies on CPU and GPU (3.45614e-07)

Using MIOPEN_FIND_ENFORCE=3 selected a failing kernel:

MIOPEN_FIND_ENFORCE=3 ./bin/MIOpenDriver conv -n 64 -c 128 -H 112 -W 112 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -t 1
MIOpenDriver: conv -n 64 -c 128 -H 112 -W 112 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -t 1
MIOpen Forward Conv. Algorithm: 3
GPU Kernel Time Forward Conv. Elapsed: 18.349342 ms
Forward Convolution Verifies on CPU and GPU (5.68537e-08)
MIOpen Backward Data Conv. Algorithm: 3
GPU Kernel Time Backward Data Conv. Elapsed: 18.237505 ms
MIOpen(HIP): Warning [GenericSearch] ConvAsmBwdWrW3x3: Searching the best solution among 5416...
MIOpen(HIP): Warning [Monitor] 10/0/5416 76.4986, best within recent 11: 76.4986 #2 2,0,8,1,1,1, ETA:1829.38 sec.
MIOpen(HIP): Warning [Monitor] 18/0/5416 76.0711, best within recent 8: 76.0711 #12 2,1,8,1,1,1, ETA:2048.48 sec.
MIOpen(HIP): Warning [Monitor] 32/0/5416 53.4908, best within recent 14: 53.4908 #23 3,0,16,1,1,1, ETA:1662.38 sec.
MIOpen(HIP): Warning [Monitor] 45/0/5416 45.7382, best within recent 13: 45.7382 #45 5,0,8,2,1,1, ETA:1559.3 sec.
MIOpen(HIP): Warning [Monitor] 55/0/5416 45.3974, best within recent 10: 45.3974 #52 2,1,8,2,1,1, ETA:1579.78 sec.
MIOpen(HIP): Warning [Monitor] 72/0/5416 45.3929, best within recent 17: 45.3929 #57 7,1,8,2,1,1, ETA:1428.14 sec.
MIOpen(HIP): Warning [Monitor] 86/0/5416 39.2118, best within recent 14: 39.2118 #82 2,0,8,4,1,1, ETA:1388.66 sec.
MIOpen(HIP): Warning [Monitor] 95/0/5416 38.3107, best within recent 9: 38.3107 #92 2,1,8,4,1,1, ETA:1423.62 sec.
MIOpen(HIP): Warning [Monitor] 108/0/5416 37.8397, best within recent 13: 37.8397 #102 2,0,16,4,1,1, ETA:1397.33 sec.
MIOpen(HIP): Warning [Monitor] 124/0/5416 37.8397, best within recent 16: 40.2554 #109 9,0,16,4,1,1, ETA:1349.06 sec.
MIOpen(HIP): Warning [Monitor] 135/0/5416 37.8397, best within recent 11: 44.4192 #134 4,1,8,8,1,1, ETA:1354 sec.
MIOpen(HIP): Warning [Monitor] 148/0/5416 37.8397, best within recent 13: 44.4951 #137 7,1,8,8,1,1, ETA:1340.02 sec.
MIOpen(HIP): Warning [Monitor] 164/0/5416 37.8397, best within recent 16: 70.7134 #161 1,0,16,1,2,1, ETA:1308.35 sec.
MIOpen(HIP): Warning [Monitor] 180/0/5416 37.8397, best within recent 16: 51.9839 #180 0,0,8,2,2,1, ETA:1279.06 sec.
MIOpen(HIP): Warning [Monitor] 194/0/5416 37.8397, best within recent 14: 42.9275 #181 1,0,8,2,2,1, ETA:1265.97 sec.
MIOpen(HIP): Warning [Monitor] 210/0/5416 37.8397, best within recent 16: 40.292 #201 1,0,16,2,2,1, ETA:1243.87 sec.
MIOpen(HIP): Warning [Monitor] 223/0/5416 37.8397, best within recent 13: 42.4137 #211 1,1,16,2,2,1, ETA:1239.31 sec.
MIOpen(HIP): Warning [Monitor] 233/0/5416 37.8397, best within recent 10: 43.6917 #229 9,0,8,4,2,1, ETA:1252.31 sec.
MIOpen(HIP): Warning [Monitor] 244/0/5416 32.7761, best within recent 11: 32.7761 #242 2,0,16,4,2,1, ETA:1263.28 sec.
MIOpen(HIP): Warning [Monitor] 253/0/5416 32.7761, best within recent 9: 36.442 #251 1,1,16,4,2,1, ETA:1279.37 sec.
MIOpen(HIP): Warning [Monitor] 261/0/5416 32.7761, best within recent 8: 43.0791 #256 6,1,16,4,2,1, ETA:1301.46 sec.
MIOpen(HIP): Warning [Monitor] 267/0/5416 32.7761, best within recent 6: 45.4455 #263 3,0,8,8,2,1, ETA:1329.88 sec.
MIOpen(HIP): Warning [Monitor] 273/0/5416 32.7761, best within recent 6: 45.748 #272 2,1,8,8,2,1, ETA:1356.6 sec.
MIOpen(HIP): Warning [Monitor] 279/0/5416 32.7761, best within recent 6: 45.9136 #277 7,1,8,8,2,1, ETA:1383.04 sec.
MIOpen(HIP): Warning [Monitor] 294/0/5416 32.7761, best within recent 15: 77.2678 #285 5,0,8,1,3,1, ETA:1364.04 sec.
MIOpen(HIP): Warning [Monitor] 308/0/5416 32.7761, best within recent 14: 77.6754 #296 6,1,8,1,3,1, ETA:1349.86 sec.
MIOpen(HIP): Warning [Monitor] 323/0/5416 32.7761, best within recent 15: 43.2013 #321 1,0,8,2,3,1, ETA:1333.66 sec.
MIOpen(HIP): Warning [Monitor] 336/0/5416 32.7761, best within recent 13: 43.2116 #329 9,0,8,2,3,1, ETA:1326.67 sec.
MIOpen(HIP): Warning [Monitor] 350/0/5416 32.7761, best within recent 14: 43.2379 #337 7,1,8,2,3,1, ETA:1314.85 sec.
MIOpen(HIP): Warning [Monitor] 362/0/5416 32.7761, best within recent 12: 38.196 #351 1,1,16,2,3,1, ETA:1310.16 sec.
MIOpen(HIP): Warning [Monitor] 370/0/5416 32.7761, best within recent 8: 45.2074 #368 8,0,8,4,3,1, ETA:1321.09 sec.
MIOpen(HIP): Warning [Monitor] 378/0/5416 32.7761, best within recent 8: 45.657 #376 6,1,8,4,3,1, ETA:1333.9 sec.
MIOpen(HIP): Warning [Monitor] 387/0/5416 32.3893, best within recent 9: 32.3893 #380 0,0,16,4,3,1, ETA:1342.44 sec.
MIOpen(HIP): Warning [Monitor] 396/0/5416 32.3893, best within recent 9: 32.4194 #389 9,0,16,4,3,1, ETA:1347.62 sec.
MIOpen(HIP): Warning [Monitor] 406/0/5416 32.386, best within recent 10: 32.386 #397 7,1,16,4,3,1, ETA:1349.91 sec.
MIOpen(HIP): Warning [Monitor] 417/0/5416 32.386, best within recent 11: 78.2849 #412 2,1,8,1,4,1, ETA:1348.55 sec.
--snip long output--
MIOpen(HIP): Warning [Monitor] 5407/0/5416 32.386, best within recent 8: 64.2473 #5407 0,1,16,1,7,8, ETA:2.76698 sec.
MIOpen(HIP): Warning [GenericSearch] Done: 5416/0/5416, best #397 32.386 7,1,16,4,3,1
MIOpen(HIP): Warning [GenericSearch] ...Score: 1.51211 (default time 48.9711)
MIOpen Backward Weights Conv. Algorithm: 1
GPU Kernel Time Backward Weights Conv. Elapsed: 32.823360 ms
Backward Convolution Data Verifies on CPU and GPU (6.28033e-08)
Backward Convolution Weights Failed: 1.08136e-06
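
For reference, the Score printed by GenericSearch above appears to be the default-configuration time divided by the best tuned time. That reading is inferred from the numbers in this log, not taken from MIOpen documentation. A minimal sketch that recomputes it (the `score` helper is a hypothetical name):

```shell
#!/usr/bin/env bash
# Recompute the GenericSearch "Score" from the two times printed in the log.
# Assumption: Score = default time / best time (inferred from the log above).
score() {
    local default_ms="$1" best_ms="$2"
    awk -v d="$default_ms" -v b="$best_ms" 'BEGIN { printf "%.5f\n", d / b }'
}

# "default time 48.9711" and "best #397 32.386" from the log
score 48.9711 32.386
```

With the values from the log this prints 1.51211, which matches the reported Score.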

Now when I run the same MIOpenDriver config, it's using the failing solution:

./bin/MIOpenDriver conv -n 64 -c 128 -H 112 -W 112 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -t 1
MIOpenDriver: conv -n 64 -c 128 -H 112 -W 112 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -t 1
MIOpen Forward Conv. Algorithm: 3
GPU Kernel Time Forward Conv. Elapsed: 18.376612 ms
Forward Convolution Verifies on CPU and GPU (5.68537e-08)
MIOpen Backward Data Conv. Algorithm: 3
GPU Kernel Time Backward Data Conv. Elapsed: 18.370230 ms
MIOpen Backward Weights Conv. Algorithm: 1
GPU Kernel Time Backward Weights Conv. Elapsed: 48.794930 ms
Backward Convolution Data Verifies on CPU and GPU (6.28033e-08)
Backward Convolution Weights Failed: 1.08136e-06
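
To compare runs like the ones above more easily, the verification lines can be pulled out of a saved driver log. This is a rough sketch that assumes the log lines look exactly like the ones in this report; `summarize_verification` is a hypothetical helper name, not part of MIOpen.

```shell
#!/usr/bin/env bash
# Print one line per verification result: status, reported error, and pass name.
# Assumption: result lines have one of these two shapes, as seen in the logs above:
#   "<what> Verifies on CPU and GPU (<err>)"
#   "<what> Failed: <err>"
summarize_verification() {
    awk '
        / Verifies on CPU and GPU / {
            err = $NF                      # last field, e.g. "(5.68537e-08)"
            gsub(/[()]/, "", err)          # strip the parentheses
            sub(/ Verifies on CPU and GPU.*/, "")
            printf "PASS %s %s\n", err, $0
        }
        / Failed: / {
            err = $NF                      # last field, e.g. "1.08136e-06"
            sub(/ Failed:.*/, "")
            printf "FAIL %s %s\n", err, $0
        }
    ' "$@"
}
```

Usage: `summarize_verification driver.log`, or pipe the driver output into it. The tolerance the driver applies is not shown in this thread, so the sketch only reports the status the driver printed and does not re-check the error against a threshold.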
@asroy
Contributor

asroy commented Aug 30, 2018

@daniellowell @zjing14

It seems the ASM kernel is selected for backward-weights, but the error is very small, so this may just be an accuracy issue.

2018-08-29 09:42:38.991402: I tensorflow/core/kernels/conv_grad_filter_ops.cc:959] running auto-tune for Backward-Filter
miopenFindConvolutionBackwardWeightsAlgorithm: ./bin/MIOpenDriver conv -n 64 -c 128 -H 112 -W 112 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -t 1
MIOpen(HIP): Error [FindSolutionImpl] Search failed for: ConvAsmBwdWrW3x3: /data/repo/MIOpen/src/hip/handlehip.cpp:70: Memory not available to allocate buffer: 411041792

I tried the direct OCL backward-weights kernel (private MLOpen repo), and it passed.

@atamazov
Contributor

atamazov commented Aug 31, 2018

There are TWO separate issues:

  • (1) Lack of available workspace when MIOpen is used within the TF execution environment. Currently, MIOpen's auto-tune process needs free workspace (enough to allocate a temporary output buffer).
    • I highly recommend moving this topic into a separate ticket.
  • (2) A verification error related to some WrW kernel. However, as @daniellowell said, the error is very small and most likely related to accuracy. What we need to know to be able to investigate and fix this:
    • Which kernel is failing.
      • Please re-run the driver with MIOPEN_LOG_LEVEL=5 and attach the resulting log.
    • GPU model, version of ROCm, version of MIOpen.
      • Please run /opt/rocm/opencl/bin/x86_64/clinfo >~/clinfo.txt and the following script and attach the resulting ~/clinfo.txt and ~/rocm-version-info.txt to the reply.
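
As a sanity check on point (1), the size in the "Memory not available to allocate buffer: 411041792" message matches the size of the convolution output tensor for this config. The sketch below assumes FP32 (4-byte) elements, reads -p/-q as padding and -u/-v as stride, and uses the usual output-size formula; `conv_output_bytes` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Size in bytes of the convolution output tensor (N x K x Hout x Wout).
# Assumptions: 4-byte (FP32) elements, dilation 1, and
#   out = (in + 2*pad - filter) / stride + 1
conv_output_bytes() {
    local n="$1" k="$2" h="$3" w="$4" y="$5" x="$6"
    local pad_h="$7" pad_w="$8" stride_h="$9" stride_w="${10}"
    local h_out=$(( (h + 2 * pad_h - y) / stride_h + 1 ))
    local w_out=$(( (w + 2 * pad_w - x) / stride_w + 1 ))
    echo $(( n * k * h_out * w_out * 4 ))
}

# Values from: conv -n 64 -c 128 -H 112 -W 112 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1
conv_output_bytes 64 128 112 112 3 3 1 1 1 1
```

This prints 411041792 (392 MiB), the same number as in the error message, which is consistent with the auto-tune step needing room for a temporary output buffer.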

Note that MIOPEN_FIND_ENFORCE=search can be used instead of ...=3 (this env var supports symbolic identifiers).
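
A minimal sketch of the requested re-run, combining the symbolic MIOPEN_FIND_ENFORCE=search value with MIOPEN_LOG_LEVEL=5 and saving the output to a file. The DRIVER, LOG and DRY_RUN variables and the `run_search` function are illustrative names, not part of MIOpen.

```shell
#!/usr/bin/env bash
# Re-run the driver with a forced search and verbose logging, saving all output.
# Assumptions: DRIVER and LOG are illustrative defaults; set DRY_RUN=1 to print
# the command instead of running it.
DRIVER="${DRIVER:-./bin/MIOpenDriver}"
LOG="${LOG:-miopen.txt}"

run_search() {
    local cmd=(env MIOPEN_FIND_ENFORCE=search MIOPEN_LOG_LEVEL=5
               "$DRIVER" conv -n 64 -c 128 -H 112 -W 112 -k 128 -y 3 -x 3
               -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -t 1)
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "${cmd[*]}"                   # show what would be executed
    else
        "${cmd[@]}" 2>&1 | tee "$LOG"      # capture stdout and stderr together
    fi
}
```

Call `run_search` to execute, or `DRY_RUN=1 run_search` to inspect the command first. The full search over 5416 configurations took roughly half an hour in the log above, so expect a long run.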

@jeffdaily
Author

clinfo.txt
miopen.txt
rocm-version-info.txt

@atamazov
Contributor

atamazov commented Aug 31, 2018

Thanks. Now we are able to reproduce the problem. How important is this accuracy issue for you?

Note that you can use export MIOPEN_DEBUG_GCN_ASM_DIRECT_3X3WRW_SEARCH_QUICK=1, which speeds up the 3x3 WrW auto-tune process by ~10x at the cost of a very small perf drop (<1%) in the tuned kernel.

@atamazov
Contributor

@jeffdaily Please try re-tuning the kernel with the QUICK setting:

export MIOPEN_DEBUG_GCN_ASM_DIRECT_3X3WRW_SEARCH_QUICK=1
export MIOPEN_FIND_ENFORCE=search_db_update

This may fix the accuracy problem.

Again, please let us know how important the accuracy issue is for you.

@jeffdaily
Author

@atamazov Sorry for the delayed response. I reported this only because it looked odd seeing the "Failed" status and having a "Failed" solution selected. I haven't yet performed any accuracy-related evaluations that would strictly depend on this result. This issue was raised for informational purposes only at this time.

I tried your suggestion above with the quick search. Similar result? Attached.
miopen_quick.txt

@atamazov
Contributor

@jeffdaily No problem, thanks. We see that searching affects accuracy. The jitter of accuracy is a sign of a bug in the kernel.

@dagamayank
Contributor

Does this specific issue still persist? I see some other improvements will help this.

@atamazov
Contributor

atamazov commented Mar 13, 2019

@dagamayank

Does this specific issue still persist? I see some other improvements will help this.

(1) Lack of available workspace when miopen is used within TF execution environment...

This is in progress: https://github.com/AMDComputeLibraries/MLOpen/pull/1111.

(2) Verification error related to some WrW kernel.

This is a false positive with the asm 3x3 WrW kernel, which can occur randomly depending on tuning. More info at https://github.com/AMDComputeLibraries/MLOpen/issues/1124.

The recently added Winograd WrW is slow on this config.

@atamazov
Contributor

@dagamayank

Does this specific issue still persist? I see some other improvements will help this.

(1) Lack of available workspace when miopen is used within TF execution environment...
This is in progress: AMDComputeLibraries/MLOpen#1111.

Now done. I think this ticket can be closed together with the next release.
