Chapter 3 - ISA: X86 duality #44
Not sure what value it provides to the readers. If there is some implicit point that you're trying to make, then I suggest making it explicit.
Hi Denis, thank you for the review. The point I'm trying to make is that the initial section states that modern architectures are load-store. X86 is one of the most widely used ISAs, and it is not a load-store architecture but a register-memory one. As a consequence, a reader of the book could falsely assume that the X86 ISA is a load-store architecture. If you think this brings no value, I'll close the PR. If you think it does, I'm happy to rewrite it. Perhaps add it as a footnote? Regards, Peter.
Ok, I changed the original paragraph:
Hi Denis, the original text was correct. Modern ISAs like RISC-V and ARM are load-store architectures. The X86 ISA is register-memory, but after uops conversion, the X86 microarchitecture effectively behaves as a load-store architecture as well.
Ooops, of course, you're right. I was implicitly thinking about x86 again. :)
Please check now.
What is missing is that the X86 microarchitecture becomes a load/store architecture as a result of the uops conversion. Given the following code:
After uops conversion it could look like this:
I find it very helpful because I need to think a lot less about the complex addressing modes, and it helps me understand the performance opportunities. In the first example, it isn't immediately clear that the loads of [A] and [B] can be performed out of order, but in the uops version, it is much more obvious. It could also help to prevent people from manually 'optimizing' code like this (C-example):
This is written by a 'smart' engineer who wants to help the CPU by giving it more opportunities for out-of-order execution because he doesn't understand the uops version of
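A minimal sketch of that kind of manual 'optimization', for illustration only. The helper names `sum_direct` and `sum_hoisted` are hypothetical and this is not the snippet originally posted in this comment. It assumes two `int` values read through pointers:

```c
/* Hypothetical helper: adds the values behind two pointers in one expression.
 * On x86 a compiler may emit a register-memory add here; conceptually the
 * core still splits that instruction into a load uop and an add uop. */
int sum_direct(const int *a, const int *b) {
    return *a + *b;
}

/* Hypothetical "hand-optimized" variant: both loads are hoisted into locals
 * in the hope of giving the CPU more out-of-order opportunities. */
int sum_hoisted(const int *a, const int *b) {
    int tmp_a = *a; /* explicit load of A */
    int tmp_b = *b; /* explicit load of B */
    return tmp_a + tmp_b;
}
```

Both functions return the same result. Since the loads are already separate operations at the uops level, the hoisted version is unlikely to expose any parallelism that the direct version lacks, and an optimizing compiler will typically generate very similar code for the two.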
Ok, I agree, but I think this section is not the best place to discuss uops. I have a section for this:
Thank you, @pveentjer!
Added note about the duality of load/store and register/memory behavior of the X86.