Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Litecoin: enable support for SSE2 instructions in scrypt via configure --enable-sse2 #316

Closed
wants to merge 455 commits into from

Conversation

romanornr
Copy link

Hi,

I think SSE2 is missing for Litecoin.
Right now I have this implementation working for Viacoin.
The implementation works fine for my Arch Linux installation. (Not tested on windows or Mac)
https://github.com/romanornr/viacoin/tree/0.13-dev

I think this works fine for Litecoin too but the Travis tests will break ! So that needs to be modified.
So if someone could review this, that would be great.

harding and others added 30 commits October 17, 2016 13:54
…my soft forks

bf86073 Release notes: correct segwit signalling period start conditions (David A. Harding)
2de93f0 Relase notes: correct segwit activation point (David A. Harding)
5f9c7b0 Release notes: add info about segwit and null dummy soft forks (David A. Harding)
c9ffe90 Add historical release notes for v0.13.0 (Micha)
Base64 contains '/', and the '/' character in credentials is problematic
for AuthServiceProxy which represents the RPC endpoint as an URI with
user and password embedded.

Closes bitcoin#8399.

Github-Pull: bitcoin#8858
Rebased-From: 1c80386
This value can be significantly higher if the users uses addnode

Github-Pull: bitcoin#8944
Rebased-From: 1ab21cf
… builds

d179eed doc: update 0.13.1 release note info on linux arm builds [skip ci] (mruddy)
Only allow skipping relevant services until there are four outbound
 connections up.

This avoids quickly filling up with peers lacking the relevant
 services when addrman has few or none of them.

Github-Pull: bitcoin#8949
Rebased-From: 9583477
We normally prefer to connect to peers offering the relevant services.

If we're not connected to enough peers with relevant services, we
 probably don't know about them and could use dnsseed's help.

Github-Pull: bitcoin#8949
Rebased-From: 4630479
(Eric participated in Segwit work but has no direct commits, so should
be mentioned)
Minor version, not major version.
99f5cf1 release-notes: Update from blog draft (Luke Dashjr)
nMaxInbound might very well be 0 or -1, if the user prefers to keep
a small number of maxconnections.

Note: nMaxInbound of -1 means that the user set maxconnections
to 8 or less, but we still want to keep an additional slot for
the feeler connection.

Github-Pull: bitcoin#9008
Rebased-From: fa1c3c2
libc++ on 10.7 causes too many issues.

See bitcoin#8577 for discussion/details.

Github-Pull: bitcoin#9015
Rebased-From: 339c4b6
…7 support

1d12463 Update release notes for dropping osx 10.7 support (Michael Ford)
Note that this is not a major issue as, in order for the missing
lock to cause issues, you have to receive a GETBLOCKTXN message
while reindexing, adding a block header via RPC, etc, which results
in either a table rehash or an insert into the bucket which you are
currently looking at.

Github-Pull: bitcoin#8995
Rebased-From: dfe7906
@shaolinfry
Copy link
Member

shaolinfry commented May 13, 2017

SSE2 support was in 0.8 but for some reason was missed in 0.10+

@shaolinfry
Copy link
Member

@romanornr Travis is unhappy.

@romanornr
Copy link
Author

@shaolinfry I'm aware that Travis is unhappy.
Already pointed out
screenshot from 2017-05-13 20-13-32

I haven't figured out yet on how to fix the travis tests.
I will try to fix the travis tests but let me know if you guys figured it out.

@shaolinfry
Copy link
Member

shaolinfry commented May 13, 2017

@MarvinOl
Copy link

MarvinOl commented May 25, 2017

There's a typo in src/crypto/scrypt.h change:

 #if defined(USE_SSE2)
+#inlude <string>

@romanornr
Copy link
Author

@MarvinOl thanks. Just made a commit to fix the typo.

@shaolinfry
Copy link
Member

Same travis problems though.... any luck solving those?

@romanornr
Copy link
Author

No I didn't @shaolinfry I tried llibgcc-4.8-dev and other versions in different variations.
This is something I tried: viacoin/viacoin@71a0f31

@MarvinOl
Copy link

MarvinOl commented Jun 1, 2017

A comment about speed improvements this brings. This is by no means pointless to implement today (some might say that modern CPUs are fast and compilers optimized).

I modified scrypt_test.ccp to only run either SSE2 or non-SSE2 hashing 10000 times. On my low power Intel Atom J2900 CPU the SSE2 version is over 2x faster (42 vs 18 seconds, just pasting one, the results are stable over multiple runs):
`# time ./test_litecoin --run_test=scrypt_tests
Running 1 test case...

*** No errors detected
41.942u 0.002s 0:42.06 99.7% 9063+838k 54+0io 532pf+0w

SSE2 enabled:
18.880u 0.008s 0:18.91 99.8% 9054+837k 2+0io 389pf+0w`

@romanornr
Copy link
Author

@MarvinOl do you have a link to your fork or modification ? Sounds intresting.

@MarvinOl
Copy link

MarvinOl commented Jun 4, 2017

I just used your code to enable SSE2 and then did 2 compiles, one with it defined and one without.

And modified test_scrypt.cpp to be only one or the other, and run it 10k times:

#if defined(USE_SSE2)
    (void) scrypt_detect_sse2();
#endif
    uint256 scrypthash;
    std::vector<unsigned char> inputbytes;
    char scratchpad[SCRYPT_SCRATCHPAD_SIZE];
    for (int j = 0; j < 10000; j++) {
    for (int i = 0; i < HASHCOUNT; i++) {
        inputbytes = ParseHex(inputhex[i]);
#if defined(USE_SSE2)
        // Test SSE2 scrypt
        scrypt_1024_1_1_256_sp_sse2((const char*)&inputbytes[0], BEGIN(scrypthash), scratchpad);
        BOOST_CHECK_EQUAL(scrypthash.ToString().c_str(), expected[i]);
#else
        // Test generic scrypt
        scrypt_1024_1_1_256_sp_generic((const char*)&inputbytes[0], BEGIN(scrypthash), scratchpad);
        BOOST_CHECK_EQUAL(scrypthash.ToString().c_str(), expected[i]);
#endif
    }
    }

@romanornr
Copy link
Author

Ooh I misunderstood. Thought you had an improvement for SSE2.

In that case, yeah I was aware. It's indeed 2x faster with SSE2. So this is definitely great.

@shaolinfry
Copy link
Member

Any more progress with fixing Travis here?

@shaolinfry
Copy link
Member

Needs rebase onto master.

@romanornr
Copy link
Author

Ok, I will do it.

@shaolinfry
Copy link
Member

bump

@romanornr
Copy link
Author

Please close this one. Here is the follow-up #362

@thrasher-
Copy link
Member

Closed in favor of #362

@thrasher- thrasher- closed this Jul 27, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.