Move string representation entirely to ustringhash -- phase I #1603
Conversation
LGTM!

As a data point: today, in the span of about 2 hours of real work, I was able to fully convert our renderer to accept this Phase I change and pass its whole testsuite. So I don't think this step is going to be a big burden for the renderer authors out there. (The NEXT step, where a bunch of other data needs to get passed out of band of the RendererServices calls, might be hairier.)
```cpp
static constexpr int width = WidthT;
typedef ustringhash ValueType;

// To enable vectorization, use uintptr_t to store the ustringhash (const char *)
```
The comment says "use uintptr_t to store the ustringhash", but the code has size_t. Make them match -- or do you really want a uint64_t, or whatever is inside ustringhash? OK, I see ustringhash is using size_t, so the similar question: would using a fixed-size uint64_t be a more portable choice?
I thought about that, too, and wasn't sure.
Both the hash of ustring and the data item inside ustringhash are size_t. Also, if anybody does try to build this for a 32 bit architecture, ustring (a char* inside) will also be 32 bits, so I think there's some use in having ustring and ustringhash be the same size, which wouldn't be the case if ustringhash were uint64 on a platform with 32 bit pointers.
So which is better: ustringhash the same absolute size on all architectures (but possibly differing from the pointer size), or ustringhash the same size as ustring on each architecture (but not necessarily the same from one architecture to another -- though how much are we really concerned with 32 bit)?
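One way to pin down that second option would be a compile-time check that the two types stay the same size on whatever platform we build for. A minimal sketch (my illustration, not code from this PR):

```cpp
#include <OpenImageIO/ustring.h>

// Enforce the "ustringhash mirrors ustring's size on every platform"
// choice at compile time (illustrative only).
static_assert(sizeof(OIIO::ustringhash) == sizeof(OIIO::ustring),
              "ustringhash and ustring should be the same size");
```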
I don't know about "better", but don't we need all 64 bits to avoid collisions? Meaning, is a ustringhash with only 32 bits inside not useful for our purposes?
I feel safer with a 64 bit hash. But to make that the case on 32 bit platforms, we'd need to change the OIIO definition of ustringhash and the return value of ustring::hash, which are currently size_t.
Do we care / will anybody use OSL on 32 bit platforms?
If we care, do we consider 32 bit platforms "supported", or just "maybe it works, your problem"? Because if it's the latter, then I won't feel guilty about an ABI break on a platform we don't officially support.
I would be ok if we mark OSL as only supporting 64-bit platforms for now (we can static assert that size_t is what we expect). But I also think that at the next opportunity to change the API in OIIO, we should "fix" ustring to return a `uint64_t` instead of a `size_t` for the hash, since going forward we are going to be making assumptions about the rarity of collisions in the 64 bit hash.
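A minimal sketch of that static assert, assuming we simply refuse to build where size_t is narrower than 64 bits:

```cpp
#include <climits>
#include <cstddef>

// Reject builds on platforms where size_t -- and therefore the ustring
// hash and the payload of ustringhash -- is narrower than 64 bits.
static_assert(sizeof(size_t) * CHAR_BIT >= 64,
              "OSL assumes a 64-bit size_t for its string hashes");
```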
OK, so let's try on this plan, then:
- Proceed as we have been for now.
- Modify OIIO-next (in-progress 2.5, which is allowed to change ABI) to have ustring::hash and the data inside ustringhash change from size_t to uint64_t. Note that this is no actual change on 64 bit machines.
- Our position when we release this code (next summer) will be that if you want to be guaranteed to have 64 bit hashes even on 32 bit machines, you should be using OIIO 2.5+.
- We're still not really making any strong promises about how well OSL works on any 32 bit platform; we certainly don't test it in CI, and probably none of the regular developers have access to such a machine to reproduce and fix problems particular to 32 bit.
Wait, are we overthinking this?
There are certainly no studio customers who expect 32 bit. Do any of the commercial renderers like PRMan or Arnold even sell 32 bit versions any more? The only thing I could think of is Blender -- do they purport to support 32 bit platforms for Cycles? Can you imagine anybody rendering with < 4GB of addressable RAM who also expects a full OSL implementation that never hits a hash collision? If you completely fill 4GB with ustrings (counting allocation overhead, etc.), do you get even close to a reasonable probability of a 32 bit hash collision, or would a 32 bit collision be absurdly unlikely unless you had so many strings that you'd need much more than 4GB?
How many people are we realistically leaving in the dark if we just declare OSL to require a 64 bit CPU architecture?
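For a rough sense of scale (my back-of-envelope numbers, not from this thread): by the birthday bound, the probability of at least one collision among n uniform b-bit hashes is about 1 - exp(-n^2 / 2^(b+1)). Even a generous ~1 KB per interned string puts roughly 4 million ustrings in 4 GB, which makes a 32-bit collision essentially certain while a 64-bit collision stays around one in a few million:

```cpp
#include <cmath>
#include <cstdio>

// Birthday-bound approximation: P(collision) ~= 1 - exp(-n^2 / 2^(bits+1)).
static double collision_probability(double n, int bits)
{
    return -std::expm1(-(n * n) / std::ldexp(2.0, bits));  // 2.0 * 2^bits
}

int main()
{
    double n = 4e6;  // ~4 GB of ustrings at an assumed ~1 KB apiece
    std::printf("32-bit: %.3g\n", collision_probability(n, 32));  // ~1.0
    std::printf("64-bit: %.3g\n", collision_probability(n, 64));  // ~4e-7
}
```

So a 32-bit hash would collide long before 4 GB is exhausted, which argues for 64-bit hashes regardless of pointer size.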
We can take a poll at this week's TSC meeting.
Totally agree about OSL -- I was just thinking it would still be good to have a consistent behavior in OIIO for other applications that might want to do similar things with ustring.
FWIW, it appears that Blender does not officially support 32-bit builds anymore:
https://developer.blender.org/T67184
(dates back to 2019)
OK, in that case, let's just declare OSL unsupported on 32 bits unless somebody pipes up with a complaint AND a plan for who's going to take responsibility to fix and test it.
I'll fix up the OIIO side to make the hashes a guaranteed 64 bit type, but in master/2.5 only, with no plan to backport or take any chances of an ABI break in a release branch.
A couple of comments to resolve; overall LGTM.
The first step of the great ustringhash conversion: This changes all RendererServices methods (i.e., callbacks to the renderer from the shader JITed code) that used to take ustring parameters, switching them to take ustringhash instead.
The second step -- which I will tackle after this part is merged -- will be to fully switch the in-memory representation of strings during shading execution to ustringhash on the CPU (thus matching how we've been doing it on GPU, and making a number of data structures and representations identical/shared for CPU and GPU).
But in the meantime, this lets renderers get started changing their RendererServices specializations. Just as with OSL's internal implementation, it may be easier for renderers to break the job into phase I (RendererServices) and phase II (in-memory rep).
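To make the shape of the change concrete, here is a hedged sketch of a renderer-side override after this PR (parameter list paraphrased from memory of the RendererServices API; consult rendererservices.h for the exact signatures):

```cpp
#include <OSL/rendererservices.h>

// Phase I: string parameters now arrive as ustringhash instead of ustring.
class MyRendererServices : public OSL::RendererServices {
public:
    bool get_attribute(OSL::ShaderGlobals* sg, bool derivatives,
                       OSL::ustringhash object, OIIO::TypeDesc type,
                       OSL::ustringhash name, void* val) override
    {
        // A renderer can compare the incoming hashes against precomputed
        // hashes of the names it knows, or map them back to ustrings.
        return false;  // placeholder: this sketch provides no attributes
    }
};
```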
Some notes and guideposts:
* In oslconfig.h, define ustringrep to alias to either ustring or ustringhash, depending on a new (but temporary) CMake variable `OSL_USTRINGREP_IS_HASH`, which is OFF for now, meaning that ustringrep is still ustring (see the sketch after this list).
* Also in oslconfig.h, several helper conversion functions among ustring, ustringhash, and ustringrep.
* RendererServices and BatchedRendererServices (and their various subclasses in testshade and testrender): change the method signatures from ustring to ustringhash for all methods that are reachable from shader code.
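A hypothetical sketch of what that aliasing could look like, assuming the CMake variable surfaces as a same-named preprocessor define (the real oslconfig.h may differ):

```cpp
#include <OpenImageIO/ustring.h>

#ifdef OSL_USTRINGREP_IS_HASH
using ustringrep = OIIO::ustringhash;  // in-memory rep is the 64-bit hash
#else
using ustringrep = OIIO::ustring;      // in-memory rep is the interned pointer
#endif
```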
Signed-off-by: Larry Gritz [email protected]