What file formats should be supported for data and models? #30
Comments
Libsvm file format has been requested here:
ARFF, and possibly unlabeled CSV, as commonly used by machine learning repos.
Basic ARFF support is in, and CSV is supported now, but only if you use it as a library, since you need to define feature types. I'm wondering whether sparse ARFF and libsvm should be included, and whether a sparse feature representation is needed to do them well.
Basic libsvm support is in.
How can I grow a cloudRF with a libsvm file? (I don't know which target to declare.)
-target 0 should do it, since the target is in the first column and there aren't column names.
I received some errors, as below: goroutine 1 [running]:
You need to rename usps to usps.libsvm so that growforest knows how to parse it.
Also do an update if you haven't, as I recently fixed some small bugs in libsvm support.
Great! It is running.
ryanbressler commented "-target 0 should do it, since the target is in the first column and there aren't column names"
It checks to see if the first entry is an int or a float. Ints are handled
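The int-versus-float check described above can be illustrated with a minimal Python sketch. The function name `label_kind` is hypothetical and this is not CloudForest's code; the thread also does not say what happens to each kind afterwards, so the sketch only classifies the token.

```python
def label_kind(token):
    """Classify a label token as "int" or "float" (illustrative only)."""
    try:
        int(token)  # succeeds for tokens such as "3" or "-1"
        return "int"
    except ValueError:
        float(token)  # raises ValueError if the token is not numeric at all
        return "float"


for token in ["3", "-1", "0.75"]:
    print(token, label_kind(token))
```

A token such as "1.0" is classified as a float here, because Python's `int()` rejects strings containing a decimal point.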
OK, thanks! That is a good way.
Yes, all unspecified features will be assumed to be zero.
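To make the zero-fill behaviour concrete, here is a rough Python sketch of reading one libsvm-style line into a dense row. It assumes the conventional layout (label first, then 1-based `index:value` pairs); `parse_libsvm_line` and `n_features` are hypothetical names, not part of CloudForest.

```python
def parse_libsvm_line(line, n_features):
    """Parse one libsvm-style line into (label, dense feature list).

    Assumes 1-based feature indices; any index not listed on the
    line keeps its default value of 0.0.
    """
    parts = line.split()
    label = parts[0]  # the target is the first column
    features = [0.0] * n_features
    for item in parts[1:]:
        index, value = item.split(":", 1)
        features[int(index) - 1] = float(value)
    return label, features


label, row = parse_libsvm_line("3 1:0.5 4:2.0", n_features=5)
print(label, row)  # features 2, 3 and 5 were not listed, so they stay 0.0
```

A dense list is used only for readability; for very wide data a sparse structure (for example a dict from index to value) would avoid storing the zeros.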
In a LIBSVM file containing lots of records (e.g. 60,000,000), how can I build trees in cloudRF? I tried setting a portion of the total records using the "nSamples=0.1" option. Does that mean cloudRF works on only 10% of the total samples?
Random forest bags samples independently for each tree so I think it is
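As an illustration of bagging independently per tree, the sketch below draws a separate random sample of case indices for each tree. Sampling with replacement and the names `bag_indices`, `n_cases`, `n_samples`, and `n_trees` are assumptions made for this example; the thread does not confirm how CloudForest interprets a fractional nSamples value.

```python
import random


def bag_indices(n_cases, n_samples, n_trees, seed=0):
    """Return one independently drawn bag of case indices per tree.

    Each bag holds n_samples indices sampled with replacement from
    range(n_cases), so different trees see different subsets.
    """
    rng = random.Random(seed)
    return [
        [rng.randrange(n_cases) for _ in range(n_samples)]
        for _ in range(n_trees)
    ]


bags = bag_indices(n_cases=1000, n_samples=100, n_trees=5)
seen = set().union(*bags)
print(len(bags), len(bags[0]), len(seen))
```

Because every tree draws its own bag, the forest as a whole usually touches more distinct cases than any single tree does, even when each bag is small.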
I mean that RF struggles to build trees from a large sample size because each tree becomes large.