This repository has been archived by the owner on Nov 18, 2023. It is now read-only.

Feature normaliser by attribute type #20

Closed
jmsfltchr opened this issue Oct 15, 2018 · 1 comment

Comments

@jmsfltchr
Contributor

Once encoded, feature values need to be normalised relative to the other values of the same attribute type. This is necessary because different attribute types, even those sharing a datatype, can be expected to have wildly different distributions.
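
As a rough illustration of the idea, the sketch below standardises each encoded value against the mean and standard deviation of its own attribute type. The issue does not say which normalisation scheme is intended, so the z-score approach, the function name `normalise_by_type`, and the `(attribute_type, value)` input format are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def normalise_by_type(features):
    """Standardise values per attribute type (hypothetical helper).

    `features` is a list of (attribute_type, encoded_value) pairs.
    Returns a list of the same length and order with each value
    replaced by its z-score within its own attribute type.
    """
    # Group the encoded values by attribute type.
    grouped = defaultdict(list)
    for attr_type, value in features:
        grouped[attr_type].append(value)

    # Compute mean and (population) standard deviation per type.
    stats = {}
    for attr_type, values in grouped.items():
        mean = fmean(values)
        std = pstdev(values)
        # A constant attribute has zero spread; avoid dividing by zero.
        stats[attr_type] = (mean, std if std > 0 else 1.0)

    return [
        (attr_type, (value - stats[attr_type][0]) / stats[attr_type][1])
        for attr_type, value in features
    ]


if __name__ == "__main__":
    # Two attribute types of the same datatype but very different scales.
    data = [("age", 20.0), ("age", 40.0), ("salary", 1000.0), ("salary", 3000.0)]
    for attr_type, value in normalise_by_type(data):
        print(attr_type, value)
```

After this step, both hypothetical attribute types occupy the same range, so neither dominates downstream purely because of its scale.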

Needed by #13

@jmsfltchr jmsfltchr self-assigned this Oct 15, 2018
@jmsfltchr jmsfltchr added this to the v1.5 milestone Oct 15, 2018
@grabl grabl added the v1.5 label Oct 15, 2018
@grabl grabl removed this from the v1.5 milestone Dec 14, 2018
jmsfltchr referenced this issue Dec 17, 2019

## What is the goal of this PR?

Enable ingesting numerical attributes with continuous values.

The aim is to add to the diagnosis example a continuous numerical attribute that carries no additional information. With such an attribute present, the model should stably reach the same performance as it does without it. Empirically it does, though it appears to take longer to converge: a minimum of about 500 training iterations, compared with a minimum of 250 before.

Closes #99 

## What are the changes implemented in this PR?

- Introduce a continuous numerical attribute into the diagnosis example (`severity`)
- Create a `ContinuousAttribute` model, consisting of an MLP followed by layer normalisation
- For debugging and monitoring purposes, add histograms at strategic points in the model, plus the relevant code to execute and store these summaries
- Normalise common (type) embeddings for improved stability, so that attribute and type embeddings have similar magnitudes
- Add gradient clipping, which very significantly improves model convergence and stability
- Add dropout to the continuous attribute MLP to combat overfitting
- Improve the flow of embedder construction
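
Two of the techniques listed above, layer normalisation and gradient clipping, can be sketched in plain Python as follows. This is not the PR's implementation: the function names are hypothetical, the layer normalisation omits the learned gain and bias that a real layer would typically carry, and clipping by global norm is assumed here because the PR does not state which clipping variant it uses.

```python
import math
from statistics import fmean


def layer_norm(values, eps=1e-6):
    """Normalise one feature vector to zero mean and (near) unit variance.

    Simplified sketch: no learned gain or bias parameters.
    """
    mean = fmean(values)
    variance = fmean([(v - mean) ** 2 for v in values])
    scale = math.sqrt(variance + eps)  # eps guards against zero variance
    return [(v - mean) / scale for v in values]


def clip_by_global_norm(gradients, clip_norm):
    """Rescale a list of gradient vectors so their joint L2 norm
    does not exceed `clip_norm`. Returns (clipped_gradients, global_norm).
    """
    global_norm = math.sqrt(sum(g * g for grad in gradients for g in grad))
    if global_norm <= clip_norm:
        # Already within bounds: return copies unchanged.
        return [list(grad) for grad in gradients], global_norm
    factor = clip_norm / global_norm
    return [[g * factor for g in grad] for grad in gradients], global_norm


if __name__ == "__main__":
    print(layer_norm([1.0, 2.0, 3.0]))
    clipped, norm = clip_by_global_norm([[3.0], [4.0]], clip_norm=1.0)
    print(norm, clipped)
```

Clipping by the joint norm keeps the direction of the overall update while bounding its size, which is one common way to tame occasional very large gradients.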
@jmsfltchr
Contributor Author

Normalisation of attributes with continuous values is now included, since continuous attributes are embedded as of #102; see the comment above.
