Feature normaliser by attribute type #20

jmsfltchr · 2018-10-15T15:15:00Z

Once encoded, feature values need to be normalised relative to the other values of the same attribute type. This is necessary because different attribute types (even those of the same datatype) can be expected to have wildly different distributions.
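As a rough illustration of what per-type normalisation could look like, here is a minimal NumPy sketch. It assumes z-score standardisation, which this issue does not actually specify, and the function name `normalise_by_type` and the `age` attribute type are hypothetical (only `severity` appears in this thread).

```python
import numpy as np


def normalise_by_type(values, type_labels, eps=1e-8):
    """Standardise each value against the other values of its attribute type.

    Assumption: z-score normalisation (zero mean, unit variance per type).
    """
    values = np.asarray(values, dtype=float)
    type_labels = np.asarray(type_labels)
    normalised = np.empty_like(values)
    for attribute_type in np.unique(type_labels):
        mask = type_labels == attribute_type
        group = values[mask]
        # eps guards against division by zero when a type has constant values
        normalised[mask] = (group - group.mean()) / (group.std() + eps)
    return normalised


# Two attribute types of the same datatype, on very different scales
values = [1.0, 2.0, 3.0, 100.0, 200.0, 300.0]
types = ["severity", "severity", "severity", "age", "age", "age"]
result = normalise_by_type(values, types)
```

After normalisation, both groups end up on a comparable scale even though the raw values differ by two orders of magnitude.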

Needed by #13

## What is the goal of this PR?

Enable ingesting numerical attributes with continuous values. The aim has been to add a continuous numerical attribute to the diagnosis example which adds no additional information. In this case, the model should be able to stably achieve the same performance as without this attribute. Empirically, this has been achieved, perhaps taking longer to converge (about 500 training iterations minimum, compared to a minimum of 250 iterations prior).

Closes #99

## What are the changes implemented in this PR?

- Introduce a continuous numerical attribute into the diagnosis example (`severity`)
- Create a `ContinuousAttribute` model, consisting of an MLP followed by layer normalisation
- For debugging and monitoring purposes, add histograms at strategic points in the model, plus the relevant code to execute and store these summaries
- Normalisation of common (type) embeddings for improved stability, such that attribute and type embeddings have similar magnitudes
- Gradient clipping for a very significant improvement in model convergence and stability
- Dropout for continuous attribute MLP to combat overfitting
- Improved the flow of embedder construction
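The two stabilising ideas in the list above, an MLP followed by layer normalisation and gradient clipping, can be sketched in plain NumPy as follows. This is not the PR's implementation: the function names, layer sizes, ReLU activation, and clipping by global norm are all assumptions made for illustration, and dropout and the learnable scale/offset of layer normalisation are omitted for brevity.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    """Normalise each row to zero mean and unit variance over its features."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def continuous_attribute_embed(x, w1, b1, w2, b2):
    """Hypothetical stand-in for a ContinuousAttribute model:
    a two-layer MLP (ReLU hidden layer) followed by layer normalisation."""
    hidden = np.maximum(0.0, x @ w1 + b1)
    return layer_norm(hidden @ w2 + b2)


def clip_by_global_norm(grads, max_norm):
    """Rescale all gradients together so their joint L2 norm is at most max_norm."""
    global_norm = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    scale = min(1.0, max_norm / (global_norm + 1e-12))
    return [g * scale for g in grads], global_norm


rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(4, 1))          # four scalar `severity` values
w1, b1 = rng.normal(size=(1, 8)), rng.normal(size=8)
w2, b2 = rng.normal(size=(8, 3)), rng.normal(size=3)
embeddings = continuous_attribute_embed(x, w1, b1, w2, b2)

# A gradient with L2 norm 5 is scaled down to norm 1
clipped, norm_before = clip_by_global_norm([np.array([3.0, 4.0])], max_norm=1.0)
```

Layer normalisation keeps the continuous-attribute embeddings at a magnitude comparable to the other embeddings, while clipping bounds the size of any single update step.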

jmsfltchr · 2019-12-17T13:51:35Z

Normalising attributes with continuous values is now included, since continuous attributes are embedded as of #102; see the comment above.

jmsfltchr self-assigned this Oct 15, 2018

jmsfltchr added the type: feature label Oct 15, 2018

jmsfltchr added this to the v1.5 milestone Oct 15, 2018

grabl added the v1.5 label Oct 15, 2018

jmsfltchr added the planned label Oct 30, 2018

jmsfltchr removed planned labels Dec 14, 2018

grabl removed this from the v1.5 milestone Dec 14, 2018

jmsfltchr added the in progress label Feb 4, 2019

haikalpribadi removed the in progress label Mar 28, 2019

jmsfltchr closed this as completed Dec 17, 2019

grabl added the status: solved label Dec 17, 2019
