bugfixing the AVX2 Extract8+16 codes, where there's lines like [...] #4

GerHobbelt · 2021-07-13T08:55:22Z

Tfloat patch 4: bugfixes for AVX2 FAST_FLOAT Extract8+16 implementations tesseract-ocr#3494

Extracted from tesseract-ocr#3490: bugfixes for the AVX2 Extract8+16 code, which contains lines like `__m256d scale01234567 = _mm256_loadu_ps(scales)`, i.e. it loads float vectors into double vector types.
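To make the type mismatch concrete, here is a minimal sketch of a float-only scaling step, not the actual code from this patch. The helper name `ScaleResults8Sketch` and the `SKETCH_AVX` macro are hypothetical, and the sketch assumes an x86-64 build with GCC or Clang (or another compiler with AVX enabled). The point is that `_mm256_loadu_ps` yields a `__m256` (8 floats), so every register, conversion, multiply and store in the float path has to use the `_ps` / `__m256` family rather than `__m256d`:

```cpp
#include <immintrin.h>
#include <cassert>
#include <cstdint>

// Assumption: GCC/Clang on x86-64. The attribute enables AVX for this one
// function so the sketch also compiles without -mavx on the command line.
#if defined(__GNUC__)
#define SKETCH_AVX __attribute__((target("avx")))
#else
#define SKETCH_AVX
#endif

// A 256-bit register holds 8 floats or 4 doubles, never 8 doubles.
static_assert(sizeof(__m256) == 8 * sizeof(float), "8 float lanes");
static_assert(sizeof(__m256d) == 4 * sizeof(double), "4 double lanes");

// Hypothetical helper: converts 8 int32 accumulators to float and
// multiplies each by its per-output scale.
SKETCH_AVX static void ScaleResults8Sketch(const int32_t *acc, const float *scales,
                                           float *v) {
  assert(acc != nullptr && scales != nullptr && v != nullptr);
  __m256i result = _mm256_loadu_si256(reinterpret_cast<const __m256i *>(acc));
  // Float data goes into a float vector type: __m256, not __m256d.
  __m256 scale01234567 = _mm256_loadu_ps(scales);
  __m256 res01234567 = _mm256_cvtepi32_ps(result);  // int32 -> float, 8 lanes
  res01234567 = _mm256_mul_ps(res01234567, scale01234567);
  _mm256_storeu_ps(v, res01234567);
}
```

A double build would instead need two `__m256d` registers per 8 outputs, loaded with `_mm256_loadu_pd(scales)` and `_mm256_loadu_pd(scales + 4)`.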

Note: the next pull request is a reduced version of this one, with less code duplication, for the bleeding-edge tfloat branch.

Note: tesseract-ocr#3495 is this one (tesseract-ocr#3494) PLUS the FAST_FLOAT condition applied only to the ExtractXYZ calls, as the other functions are good to go with only their prototype adjusted from double --> TFloat. Hence tesseract-ocr#3495 only moves code compared to this one; there is no code change. (I don't know what diff tools you use, but this one (tesseract-ocr#3494) should be easier to diff/review. You can then verify that tesseract-ocr#3495 is only copy/cut/paste work, which results in a much larger diff.)
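As a rough illustration of that split (a sketch under assumptions, not the code in either pull request): functions that only pass values through can share one body once their prototype uses `TFloat`, while anything that depends on the lane width needs a `FAST_FLOAT` condition. The alias below assumes `TFloat` is `float` when `FAST_FLOAT` is defined and `double` otherwise; `DotScaledSketch` and `kLanesPerRegister` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: FAST_FLOAT selects the floating point type for the whole build.
#ifdef FAST_FLOAT
using TFloat = float;
#else
using TFloat = double;
#endif

// Type-agnostic part: only the prototype mentions TFloat, so a single copy
// of the body serves both builds (no duplication needed).
static TFloat DotScaledSketch(const int8_t *w, const int8_t *u, std::size_t n,
                              TFloat scale) {
  assert(w != nullptr && u != nullptr);
  int32_t total = 0;
  for (std::size_t i = 0; i < n; ++i) {
    total += static_cast<int32_t>(w[i]) * static_cast<int32_t>(u[i]);
  }
  return static_cast<TFloat>(total) * scale;
}

// Type-specific part: stands in for the ExtractXYZ helpers, where the number
// of values per 256-bit register depends on the element type.
#ifdef FAST_FLOAT
constexpr int kLanesPerRegister = 8;  // 256 bits / 32-bit float
#else
constexpr int kLanesPerRegister = 4;  // 256 bits / 64-bit double
#endif
```

Only the second kind of code has to sit behind the `FAST_FLOAT` condition, which is why moving that condition around changes the diff size but not the behavior.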

[Edit: here's how this one's diff looks over at my place with BeyondCompare as diff visualizer; at least for me, it is much easier to 'read' than GitHub's web view. (Screenshot not preserved.)]

GerHobbelt · 2021-07-13T12:22:37Z

Will adjust this one to match #2 (comment) and formatting style as well.


GerHobbelt · 2021-07-13T12:32:56Z

@stweil: code reformatted according to editor spec file.

This was referenced Jul 13, 2021

Tfloat patch 4: bugfixes for AVX2 FAST_FLOAT Extract8+16 implementations tesseract-ocr/tesseract#3494

Closed

Improved #4 / 3494: AVX2 bugfixes + no code duplication for the integer workhorses in there #5

Closed

bugfixing the AVX2 Extract8+16 codes, where there's lines like `__m256d scale01234567 = _mm256_loadu_ps(scales)`, i.e. loading float vectors into double vector types. Extract from tesseract-ocr#3490.

81b69b0

GerHobbelt force-pushed the tfloat-patch-4 branch from ba85ac4 to 81b69b0 Compare July 13, 2021 12:32

stweil merged commit 4e3c112 into stweil:tfloat Jul 13, 2021
