[feature request] Support int64_t data_size_t #2818
Comments
I have a branch (https://github.com/microsoft/LightGBM/tree/sparse_bin_int64) working on this and will revisit it when I have time.
Hey, can I help with this one?
@Adam1679 Yes, you are very welcome.
Sorry, this is my first public contribution. Any suggestions on how to begin? Should I fork your branch and get familiar with the code and your modifications?
Yes, you can fork the repository, work in your clone first, and create a PR here when it is almost ready.
OK, I will try my best.
Hey, I read commit "5a5b26d" on your branch and saw that you changed data_size_t from int32_t to int64_t directly. You also changed some "int" to "data_size_t" and some "data_size_t" to "int". I understand casting some "data_size_t" to "int" now that "data_size_t" is bigger, but why change some "int" to "data_size_t"? I don't understand this part. Also, I would highly appreciate it if you could elaborate on "the overhead of int64_t as data_size_t will be smaller." Is it because "int32_t" is not big enough in some cases, but you have to compromise and use it because of the overhead of using int64 in the
@Adam1679 some of For the overheads:
Sorry that I was away for a long time. I have been trying to find a solution for this kind of template. After a lot of searching on Stack Overflow, I found that there is no way to select a template at runtime, since templates are resolved at compile time. So I think what we need is polymorphism: for example, construct a base class and subclass it for the different data sizes. But this may cause other problems, such as the base class not being able to have any fields because the type is undetermined. Am I on the right track?
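One way to picture the polymorphism idea from the comment above is a base class that exposes only type-independent operations, with the typed storage kept in a templated subclass and a small factory that picks the width at runtime. This is only a sketch under stated assumptions: `IndexStoreBase`, `IndexStore`, `MakeIndexStore` and `CountBelow` are hypothetical names invented for illustration, not LightGBM classes, and nothing here reflects how the linked branch actually handles it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <memory>
#include <vector>

using std::int32_t;
using std::int64_t;

// Hypothetical interface: it declares only operations whose signatures do
// not depend on the index width, so it needs no width-dependent fields.
class IndexStoreBase {
 public:
  virtual ~IndexStoreBase() = default;
  virtual std::size_t BytesPerIndex() const = 0;
  virtual int64_t Size() const = 0;
  // Coarse-grained operation: the whole loop runs inside the typed subclass,
  // so the virtual call is paid once per call, not once per element.
  virtual int64_t CountBelow(int64_t threshold) const = 0;
};

// The width-dependent field lives here, where IndexT is known at compile time.
template <typename IndexT>
class IndexStore final : public IndexStoreBase {
 public:
  explicit IndexStore(int64_t num_data) {
    assert(num_data >= 0);
    indices_.reserve(static_cast<std::size_t>(num_data));
    for (int64_t i = 0; i < num_data; ++i) {
      indices_.push_back(static_cast<IndexT>(i));
    }
  }

  std::size_t BytesPerIndex() const override { return sizeof(IndexT); }

  int64_t Size() const override {
    return static_cast<int64_t>(indices_.size());
  }

  int64_t CountBelow(int64_t threshold) const override {
    int64_t count = 0;
    for (IndexT idx : indices_) {
      if (static_cast<int64_t>(idx) < threshold) ++count;
    }
    return count;
  }

 private:
  std::vector<IndexT> indices_;
};

// Runtime choice between the two compile-time instantiations.
inline std::unique_ptr<IndexStoreBase> MakeIndexStore(int64_t num_data) {
  if (num_data <= std::numeric_limits<int32_t>::max()) {
    return std::unique_ptr<IndexStoreBase>(new IndexStore<int32_t>(num_data));
  }
  return std::unique_ptr<IndexStoreBase>(new IndexStore<int64_t>(num_data));
}
```

With this shape, `MakeIndexStore(1000)` returns the 32-bit variant and callers only ever hold an `IndexStoreBase` pointer. The trade-off is that any per-element access exposed through the base class would cost a virtual call, which is why the sketch keeps the loop inside the subclass.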
Closed in favor of being in #2302. We decided to keep all feature requests in one place. Welcome to contribute this feature! Please re-open this issue (or post a comment if you are not a topic starter) if you are actively working on implementing this feature.
As the `cnt` in the histogram is removed, the overhead of `int64_t` as `data_size_t` will be smaller.

There are several todo items:
- [...] `data_size_t` completely in code, directly use `int64_t`
- [...] `indices` in `DataPartition`, to achieve the same performance as before when the data_size_t is smaller than int32_t.
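To make the trade-off behind the second todo item concrete, the sketch below compares the size of a row-index array at the two widths and checks whether a row count still fits a given index type. It assumes one index per row, which is a simplification for illustration. `FitsIn` and `IndexArrayBytes` are hypothetical helper names, not LightGBM functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

using std::int32_t;
using std::int64_t;

// True when a row count of num_data is representable in IndexT.
template <typename IndexT>
constexpr bool FitsIn(int64_t num_data) {
  return num_data >= 0 &&
         num_data <= static_cast<int64_t>(std::numeric_limits<IndexT>::max());
}

// Bytes needed for an array holding one IndexT per row (assumed layout).
template <typename IndexT>
constexpr int64_t IndexArrayBytes(int64_t num_data) {
  return num_data * static_cast<int64_t>(sizeof(IndexT));
}

// Compile-time checks of the arithmetic.
static_assert(FitsIn<int32_t>(2147483647LL), "INT32_MAX rows fit in int32_t");
static_assert(!FitsIn<int32_t>(2147483648LL), "one more row does not fit");
static_assert(IndexArrayBytes<int64_t>(1000) == 2 * IndexArrayBytes<int32_t>(1000),
              "64-bit indices take twice the memory of 32-bit ones");
```

For a dataset of 3,000,000,000 rows, `FitsIn<int32_t>` is false and the 64-bit index array takes 24,000,000,000 bytes. For anything at or below 2,147,483,647 rows the 32-bit array is half the size, which is one plausible reason to keep a narrow path for smaller datasets.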