Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reduce memory access in linear solve loop #30

Merged
merged 1 commit into from
Jul 29, 2020

Conversation

ali5h
Copy link
Contributor

@ali5h ali5h commented Mar 8, 2020

By defining this temp variable we use that fact that i is never equal to Li[j]. Compiler does not have this information.

binary code for the loop changes from

        movsx   rdi, DWORD PTR [rdx+rax*4]
        vmovss  xmm0, DWORD PTR [rcx+rax*4]
        add     rax, 1
        lea     rdi, [r8+rdi*4]
        vmovss  xmm1, DWORD PTR [rdi]
        vfnmadd132ss    xmm0, xmm1, DWORD PTR [r9]
        vmovss  DWORD PTR [rdi], xmm0

to

        movsx   rdi, DWORD PTR [rdx+rax*4]
        vmovss  xmm0, DWORD PTR [rcx+rax*4]
        add     rax, 1
        lea     rdi, [r8+rdi*4]
        vfnmadd213ss    xmm0, xmm1, DWORD PTR [rdi]
        vmovss  DWORD PTR [rdi], xmm0

notice the drop of the first vmovss by the compiler.


This change is Reviewable

By defining this temp variable we use that fact that `i` is never
equal to `Li[j]`. Compiler does not have this information.

binary code for the loop changes from
```
        movsx   rdi, DWORD PTR [rdx+rax*4]
        vmovss  xmm0, DWORD PTR [rcx+rax*4]
        add     rax, 1
        lea     rdi, [r8+rdi*4]
        vmovss  xmm1, DWORD PTR [rdi]
        vfnmadd132ss    xmm0, xmm1, DWORD PTR [r9]
        vmovss  DWORD PTR [rdi], xmm0
```

to

```
        movsx   rdi, DWORD PTR [rdx+rax*4]
        vmovss  xmm0, DWORD PTR [rcx+rax*4]
        add     rax, 1
        lea     rdi, [r8+rdi*4]
        vfnmadd213ss    xmm0, xmm1, DWORD PTR [rdi]
        vmovss  DWORD PTR [rdi], xmm0
```

notice the drop of the first `vmovss` by the compiler.
@CLAassistant
Copy link

CLAassistant commented Mar 8, 2020

CLA assistant check
All committers have signed the CLA.

@coveralls
Copy link

coveralls commented Mar 8, 2020

Pull Request Test Coverage Report for Build 161

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 100.0%

Totals Coverage Status
Change from base Build 157: 0.0%
Covered Lines: 87
Relevant Lines: 87

💛 - Coveralls

@ali5h ali5h closed this Jun 18, 2020
@bstellato
Copy link
Collaborator

bstellato commented Jun 29, 2020

Apologies for the slow response. @ali5h is there a reason why this PR was closed?

@ali5h ali5h reopened this Jun 29, 2020
@ali5h
Copy link
Contributor Author

ali5h commented Jun 29, 2020

reopened

@bstellato bstellato closed this Jul 29, 2020
@bstellato bstellato reopened this Jul 29, 2020
Copy link
Collaborator

@bstellato bstellato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@bstellato bstellato merged commit 7d16b70 into osqp:master Jul 29, 2020
@ali5h ali5h deleted the lsolve-perf branch July 29, 2020 21:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants