[BUG] Linear regression on a view can cause segfaults depending on the offsets #4199
Comments
Looks like only
Even when y is a DataFrame, it doesn't need to be copied. So it's probably a problem coming from Series?
Update: I'm now getting a cudaErrorMisalignedAddress. Stacktrace:
I can't immediately think of why this would be happening, other than the underlying pointer that the cudf view returns somehow not being aligned to the proper power of 2. In this case, the memory should be aligned to 8 bytes, since we're using a double-precision array.
>>> import cudf
>>> import cuml
>>> import numpy as np
>>> start = 1
>>> end = start + 10
>>> df = cudf.DataFrame(np.random.normal(size=(15,3)), columns=["x","y","z"])
>>> y = df['z'].iloc[start:end]
>>> y
1 0.588237
2 1.994371
3 -1.165460
4 1.157915
5 -1.263956
6 1.094313
7 -0.713109
8 -0.351764
9 -1.126963
10 -0.189863
Name: z, dtype: float64
Taking cuml out of the equation, one thing I notice from the start is that the pointer (in the
>>> y2 = df['z'].iloc[0:end]
>>> y2.__cuda_array_interface__
{'shape': (15,), 'strides': (8,), 'typestr': '<f8', 'data': (140614112904704, False), 'version': 1}
>>> 140614112904704 % 8
0
And slicing element 1 does increment that pointer by 8 bytes.
>>> y.__cuda_array_interface__
{'shape': (10,), 'strides': (8,), 'typestr': '<f8', 'data': (140614112904712, False), 'version': 1}
Here's an example of a double precision cupy array, which is doing identical arithmetic:
>>> import cupy as cp
>>> a = cp.array([4.0, 5.0, 6.0], dtype='float64')
>>> a.__cuda_array_interface__
{'shape': (3,), 'typestr': '<f8', 'descr': [('', '<f8')], 'stream': 1, 'version': 3, 'strides': None, 'data': (140614112903168, False)}
>>> a[1:].__cuda_array_interface__
{'shape': (2,), 'typestr': '<f8', 'descr': [('', '<f8')], 'stream': 1, 'version': 3, 'strides': None, 'data': (140614112903176, False)}
At face value, it seems like cudf is doing what it's supposed to do. Perhaps there's something weird going on here with the conversions / memory reuse / array creation inside cuml before the data makes it to the C++ layer? It would probably be beneficial to do a similar analysis of the pointers inside the
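The manual pointer check in the comment above can be scripted. The following is a minimal sketch in plain Python; `pointer_alignment` and `describe_pointer` are hypothetical helper names, not part of cudf, cupy, or cuml. It only reads the `data` and `typestr` fields that appear in the interface dicts shown above and reports the largest power of two that divides the address. Whether any kernel in the fit path needs more than 8-byte alignment is not established in this thread, so treat the output as a diagnostic, not a root cause.

```python
def pointer_alignment(address: int) -> int:
    """Return the largest power of two that divides a positive address."""
    if address <= 0:
        raise ValueError("address must be a positive integer")
    return address & -address  # isolates the lowest set bit


def describe_pointer(interface: dict) -> dict:
    """Summarize alignment for a __cuda_array_interface__-style dict.

    Hypothetical helper: it assumes a typestr such as '<f8', where the
    digits after the type character give the item size in bytes.
    """
    address = interface["data"][0]
    itemsize = int(interface["typestr"][2:])  # '<f8' -> 8
    return {
        "address": address,
        "itemsize": itemsize,
        "alignment": pointer_alignment(address),
        "aligned_to_itemsize": address % itemsize == 0,
    }


# Addresses copied from the session above (y2 is the base, y the offset view).
base = {"typestr": "<f8", "data": (140614112904704, False)}
view = {"typestr": "<f8", "data": (140614112904712, False)}

print(describe_pointer(base)["alignment"])  # 512
print(describe_pointer(view)["alignment"])  # 8
```

Both pointers pass the 8-byte check used in the session, but the view starting at element 1 is no longer aligned to any larger power of two, while the base allocation is. That difference may be worth comparing against whatever alignment the code inside the fit call assumes.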
Thank you, that's really helpful. Let me dig into what's going on with the data within the fit call!
Linear regression on a view can cause segfaults depending on the offsets. This could cause issues when a user is trying to do something like rolling regressions using a sliding window in a loop.
I haven't root caused this issue, but it does not occur with all "window" sizes for the view.
If we explicitly copy the data before fitting the model, there is no issue.
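As an illustration of the copy workaround in a sliding-window loop, here is a hedged sketch. `window_bounds` and `rolling_fit` are hypothetical names introduced for this example; the sketch assumes `cuml.linear_model.LinearRegression` and cudf's `.iloc` slicing and `.copy()`, and it has not been run against the versions discussed in this issue. The cuml import is deferred into the function so the window arithmetic can be checked without a GPU.

```python
def window_bounds(n_rows: int, window: int):
    """Yield (start, end) row offsets for every full sliding window."""
    for start in range(n_rows - window + 1):
        yield start, start + window


def rolling_fit(df, feature_cols, target_col, window):
    """Fit one linear regression per window, copying each slice first.

    Assumption: df is a cudf.DataFrame. The explicit .copy() gives each
    fit its own freshly allocated buffers instead of offset views into
    the parent columns, which is the workaround reported in this issue.
    """
    from cuml.linear_model import LinearRegression  # needs a CUDA GPU

    coefs = []
    for start, end in window_bounds(len(df), window):
        X = df[feature_cols].iloc[start:end].copy()
        y = df[target_col].iloc[start:end].copy()
        model = LinearRegression()
        model.fit(X, y)
        coefs.append(model.coef_)
    return coefs


# Window offsets for a 15-row frame and a 10-row window, as in the thread.
print(list(window_bounds(15, 10)))
```

The copy costs one extra allocation per window, which is usually small next to the fit itself, but it hides the misalignment rather than fixing it.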