Get_dummies trap: Issue with collinearity #17

traveling-desi · 2017-06-26T01:13:36Z

Hello!

I know that the finding donors project uses get_dummies(). I don't remember whether any other projects do.

Please refer to this: pandas-dev/pandas#12042

As you can see, if you run get_dummies on any feature, the result is a one-hot encoding, so the last column can be fully predicted from the rest of the columns: it is 1 exactly when all the other columns are 0 (with three categories this reduces to an XNOR of the other two). So the correct way to use get_dummies is to pass drop_first=True.
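A minimal sketch of the redundancy described above, assuming a recent pandas (the dtype argument to get_dummies needs 0.23 or later). The toy DataFrame, the "workclass" column name, and its values are hypothetical and are not taken from the notebook.

```python
import pandas as pd

# Hypothetical toy data; the column name and values are illustrative only.
df = pd.DataFrame({"workclass": ["Private", "State-gov", "Self-emp", "Private"]})

# Full one-hot encoding: one indicator column per category.
full = pd.get_dummies(df["workclass"], dtype=int)

# Every row contains exactly one 1, so the indicator columns always sum to 1.
assert (full.sum(axis=1) == 1).all()

# Consequently the last column can be rebuilt from the others.
last = full.columns[-1]
reconstructed = 1 - full.drop(columns=last).sum(axis=1)
assert (reconstructed == full[last]).all()

# drop_first=True removes the first category's column, which breaks the
# exact linear dependence among the indicator columns.
reduced = pd.get_dummies(df["workclass"], drop_first=True, dtype=int)

print(list(full.columns))
print(list(reduced.columns))
```

With drop_first=True the dropped category becomes the implicit baseline: a row with all zeros in the remaining columns belongs to it.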

It's, of course, left to the user to write the get_dummies command, but the notebook does not mention this issue. If you agree that this is a valid issue and the notebook needs to be changed, please update the instructions so that students add the drop_first argument.

If you do end up making this change, please acknowledge Nupur (https://discussions.udacity.com/t/how-to-avoid-collinearity-problem-with-pd-getdummies/284692) who pointed this out.
