v6.8.0: SELU layer, attention, improved GPU/CPU compatibility
✨ Major features and improvements
- Add SELU layer, from Klambauer et al. (2017) (sketched below).
- Add parametric soft attention layer, as in Yang et al. (2016) (sketched below).
- New higher-order function `uniqued`, which wraps layers to give them a per-batch cache (sketched below).
- Improve batch normalization by tracking moving averages of the activations (sketched below).
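
For reference, here is what the SELU activation computes. This is a minimal NumPy sketch of the formula from Klambauer et al. (2017), using the constants from the paper; it illustrates the math, not Thinc's internal implementation:

```python
import numpy as np

# Constants from Klambauer et al. (2017), "Self-Normalizing Neural Networks"
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    # scale * x for x > 0, scale * alpha * (exp(x) - 1) otherwise.
    # exp() is taken on the clipped input to avoid overflow warnings
    # in the branch that np.where discards.
    neg = ALPHA * (np.exp(np.minimum(x, 0.0)) - 1.0)
    return SCALE * np.where(x > 0.0, x, neg)
```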
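The attention layer follows the parametric soft attention of Yang et al. (2016): each hidden state is projected through a learned transform, scored against a learned context vector, and the states are summed under the resulting softmax weights. A NumPy sketch of that mechanism (Thinc's layer may parameterize it differently; the names `W`, `b`, and `u` are illustrative):

```python
import numpy as np

def soft_attention(H, W, b, u):
    # H: (n, d) hidden states; W: (d, d), b: (d,), u: (d,) learned parameters
    U = np.tanh(H @ W + b)                 # project each position
    scores = U @ u                         # similarity to the learned context vector
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                   # softmax over the n positions
    return alpha @ H                       # attention-weighted sum, shape (d,)
```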
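The idea behind `uniqued` can be sketched in a few lines. Within a batch, the wrapped layer runs only on the distinct input rows, and the results are scattered back into batch order. This is a forward-pass illustration of the caching strategy, not Thinc's implementation (the real wrapper must also sum gradients over repeated rows in the backward pass):

```python
import numpy as np

def uniqued(layer):
    # Run `layer` once per distinct input row, then reuse the cached results
    def wrapped(X):
        unique_rows, inverse = np.unique(X, axis=0, return_inverse=True)
        Y = layer(unique_rows)             # forward pass on unique rows only
        return Y[inverse.ravel()]          # scatter results back to batch order
    return wrapped
```

This pays off when a batch repeats many inputs, e.g. word IDs feeding an expensive embedding subnetwork.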
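Tracking moving averages lets batch normalization use batch statistics during training while falling back to accumulated statistics at prediction time, when batch statistics are unreliable. A minimal sketch of that bookkeeping (the class name, momentum default, and API here are illustrative, not Thinc's):

```python
import numpy as np

class BatchNormSketch:
    def __init__(self, d, momentum=0.9, eps=1e-5):
        self.gamma, self.beta = np.ones(d), np.zeros(d)
        self.running_mean, self.running_var = np.zeros(d), np.ones(d)
        self.momentum, self.eps = momentum, eps

    def __call__(self, X, train=True):
        if train:
            mean, var = X.mean(axis=0), X.var(axis=0)
            # Update exponential moving averages of the activation statistics
            m = self.momentum
            self.running_mean = m * self.running_mean + (1 - m) * mean
            self.running_var = m * self.running_var + (1 - m) * var
        else:
            # At prediction time, use the averages tracked during training
            mean, var = self.running_mean, self.running_var
        return self.gamma * (X - mean) / np.sqrt(var + self.eps) + self.beta
```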
🔴 Bug fixes
- Fix GPU usage in pooling operations.
- Add optimized code for extracting n-gram features.
- Improve CPU/GPU compatibility.
- Improve compatibility of the `LinearModel` class.
👥 Contributors
Thanks to @tammoippen for the pull request!