Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update feature_subset_selection.ipynb #404

Merged
merged 2 commits into from
May 23, 2020

Conversation

diegoroman17
Copy link
Contributor

Wrong objective function

Description

Let the minimization of the objective function be the default behavior of the library. Then, If the objective is maximizing the model performance and minimize the number of features, 1-Nf/Nt is incorrect on the right term. Also, the function is wrong in the paper.

Related Issue

Motivation and Context

How Has This Been Tested?

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

Let the minimization of the objective function be the default behavior of the library. Then, If the objective is maximizing the model performance and minimize the number of features, 1-Nf/Nt is incorrect on the right term. Also, the function is wrong in the paper.
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

Review Jupyter notebook visual diffs & provide feedback on notebooks.


Powered by ReviewNB

@ljvmiranda921
Copy link
Owner

ljvmiranda921 commented May 21, 2020

Hi @diegoroman17 ! Thanks for this and welcome to Pyswarms! Let me review this over the weekend :) Looping in @stevenbw too

Copy link
Owner

@ljvmiranda921 ljvmiranda921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @diegoroman17 ! I'm wondering what prompted this change?

I'm currently basing it on this paper and I think it does check-out.

If you're suggesting a better objective function then please feel free to comment!

@diegoroman17
Copy link
Contributor Author

Hi, @ljvmiranda921
The paper states

...The goals are the maximization of model performance and the minimization of the number of used features. The objective function is defined as a fitness function (as in EA), being our goal its maximization...

f(u_i)=\alpha(1-P)+(1-\alpha)(1-\frac{N_f}{N_t})

where P is the performance of the model and N_f is the size of the tested feature subset

The previous function is maximized with P \rightarrow 0 and N_f \rightarrow 0. However, P \rightarrow 0 means a low performance of the model, but this is not according to the main goal. So, in my perspective the paper has an errata.

In other hand, pyswarms minimize the objective function. It is performed with P \rightarrow 1 and N_f \rightarrow N_t. Although the main goal is correct (maximize the model performance), the size of the feature subset is also maximized, and this is again incorrect. Therefore, in my understanding the correct objective function for feature selecture using pyswarms should be:
\alpha(1-P)+(1-\alpha)(\frac{N_f}{N_t})

@ljvmiranda921
Copy link
Owner

@diegoroman17 Gotcha! Would you mind updating the notebook as well, i.e., the implemented objective function, into the correct one? And rerun all cells so that the results would be consistent to the stated objective function

@diegoroman17
Copy link
Contributor Author

@ljvmiranda921 , Yes, the entire notebook should be rerun with the modified objective function

@ljvmiranda921
Copy link
Owner

@diegoroman17 can you please do the rerun for us in order to complete this PR? Thanks!

- Objective function corrected for a minimization problem.
- Rerun of the entire notebook.
@diegoroman17
Copy link
Contributor Author

@ljvmiranda921 sure! I submitted a second commit with the entire notebook updated in this PR. Also, I attached the notebook below.
feature_subset_selection.zip

@ljvmiranda921 ljvmiranda921 merged commit 9ef1a8f into ljvmiranda921:master May 23, 2020
@ljvmiranda921
Copy link
Owner

Thanks for this PR @diegoroman17 ! Congrats on your first contribution!
@all-contributors please add @diegoroman17 for docs

@allcontributors
Copy link
Contributor

@ljvmiranda921

I've put up a pull request to add @diegoroman17! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants