NaN value of features #7

akharroubi · 2023-06-17T22:49:58Z

For NaN values generated by CloudCompare (when choosing a fixed radius), I see 2 possible solutions:

Filter these values before reading the file or interpolate them from neighboring points; otherwise, run the classification without them and interpolate the class labels afterward.
Or, if there are no points within radius r, fall back to computing the features from the nearest neighbors.
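
A minimal sketch of the "classify without them and interpolate afterward" idea, using NumPy only. The function name `classify_with_nan_fallback`, the `predict` callable, and the toy arrays are hypothetical and not part of the project. It assumes at least one point has complete features, and it uses a brute-force neighbor search that only suits small inputs; a KD-tree would be the usual choice for a real point cloud.

```python
import numpy as np

def classify_with_nan_fallback(xyz, features, predict):
    """Classify points with complete features, then give every remaining
    point the label of its nearest classified neighbor in 3D."""
    valid = ~np.isnan(features).any(axis=1)  # rows without any NaN feature
    labels = np.empty(len(xyz), dtype=int)
    labels[valid] = predict(features[valid])
    if (~valid).any():
        # Pairwise squared distances between unclassified and classified points.
        diff = xyz[~valid][:, None, :] - xyz[valid][None, :, :]
        nearest = np.argmin((diff ** 2).sum(axis=2), axis=1)
        labels[~valid] = labels[valid][nearest]
    return labels

# Toy data: points 1 and 3 have a NaN feature (e.g. an empty search radius).
xyz = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [5.0, 5.0, 5.0], [5.1, 5.0, 5.0]])
features = np.array([[0.2], [np.nan], [0.9], [np.nan]])

# Stand-in for a trained classifier's predict method (assumption).
def threshold_predict(f):
    return (f[:, 0] > 0.5).astype(int)

labels = classify_with_nan_fallback(xyz, features, threshold_predict)
print(labels)  # [0 0 1 1]
```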

Yarroudh · 2023-06-19T18:38:29Z

I'll be working on that this week. Thanks @akharroubi.

Yarroudh · 2023-07-21T12:50:13Z

I've been exploring how to handle missing values with the RF classifier, and I think there are a few options:

Completely drop NaN values and train the model (not recommended).
Fill in the missing values with the median, mean, or mode.
Estimate missing features using the nearest samples.

In scikit-learn, there is a class sklearn.impute.SimpleImputer that replaces missing values using a descriptive statistic (e.g. mean, median, or most frequent) along each column, or using a constant value. There is also sklearn.impute.KNNImputer, which completes missing values using k-Nearest Neighbors.
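
For illustration, here is a small sketch of both imputers on a made-up feature matrix (the values and the choice of `n_neighbors=1` are assumptions, not settings from the project). In a real pipeline the imputer would normally be fitted on the training features and then reused with `transform` on the data to classify.

```python
import numpy as np
from sklearn.impute import KNNImputer, SimpleImputer

# Toy feature matrix: rows are points, columns are features, NaN marks a gap.
X = np.array([
    [1.0, 2.0, 3.0],
    [np.nan, 2.1, 3.1],
    [10.0, 20.0, 30.0],
    [10.5, np.nan, 31.0],
])

# Column-wise median: every gap in a column receives the same value.
X_median = SimpleImputer(strategy="median").fit_transform(X)

# k-NN: each gap is filled from the most similar rows, where similarity is
# measured on the features that both rows have.
X_knn = KNNImputer(n_neighbors=1).fit_transform(X)

print(X_median)
print(X_knn)
```

The median fill ignores how similar the rows are, whereas the k-NN fill borrows from rows with similar remaining features, which may matter when features differ strongly between classes.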

I'm also working on resolving memory saturation with large datasets. For reading the data, I'm now using chunked reading as implemented in laspy. For training the model, I think batch learning can be useful. As explained here, the RandomForestClassifier has a parameter warm_start that "if it's set to True, the classifier reuses the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new forest".
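
A rough sketch of how warm_start could be used for batch training. The synthetic `make_batch` helper and `trees_per_batch` are hypothetical; in practice each batch would come from the chunked reader. Two caveats worth keeping in mind: `n_estimators` has to be increased before each additional call to `fit`, otherwise no new trees are grown, and each tree only ever sees the batch it was trained on. The sketch also assumes every batch contains all classes.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def make_batch(n=200):
    """Hypothetical stand-in for one chunk of points with features and labels."""
    X = rng.normal(size=(n, 4))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    return X, y

trees_per_batch = 10
clf = RandomForestClassifier(
    n_estimators=trees_per_batch, warm_start=True, random_state=0
)

for i in range(3):
    X_batch, y_batch = make_batch()
    if i > 0:
        # Ask for additional trees; the existing ones are kept unchanged.
        clf.n_estimators += trees_per_batch
    clf.fit(X_batch, y_batch)

print(len(clf.estimators_))  # 30
```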

Yarroudh added the enhancement (New feature or request) label Jun 19, 2023

Yarroudh self-assigned this Jun 19, 2023
