Supplementary code.
Abstract
We consider the use of neural networks to learn an optimal kernel for nonparametric regression directly from the data. By imposing additional structure on the model, we show that the estimator becomes a linear smoother, so that fitting it is equivalent to learning an optimal smoothing matrix from the data. In experiments, we explore settings where kernel methods might outperform neural networks and vice versa. The proposed neural kernels share properties of both methods and could be useful for nonparametric regression in settings with smooth functions.
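To make the term "linear smoother" concrete, the following is a minimal sketch of a classical Nadaraya-Watson estimator written as a smoothing matrix applied to the responses. It is not the neural kernel proposed here: the fixed Gaussian kernel, the `bandwidth` value, the synthetic sine data, and the helper names `gaussian_kernel` and `smoothing_matrix` are all assumptions chosen for illustration. A learned kernel would presumably replace `gaussian_kernel` while keeping the same form, fitted values equal to a matrix times the observed responses.

```python
import numpy as np


def gaussian_kernel(x, z, bandwidth=0.5):
    """Fixed Gaussian kernel (illustrative stand-in for a learned kernel)."""
    # Pairwise squared Euclidean distances between rows of x and rows of z.
    d2 = ((x[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def smoothing_matrix(x_query, x_train, bandwidth=0.5):
    """Row-normalized kernel weights: each row sums to one."""
    k = gaussian_kernel(x_query, x_train, bandwidth)
    return k / k.sum(axis=1, keepdims=True)


# Synthetic smooth regression problem (assumed data, not from the paper).
rng = np.random.default_rng(0)
x_train = rng.uniform(-3.0, 3.0, size=(50, 1))
y_train = np.sin(x_train[:, 0]) + 0.1 * rng.standard_normal(50)

# Linear smoother: fitted values are a linear map of the responses.
S = smoothing_matrix(x_train, x_train)
y_hat = S @ y_train
```

Because `S` depends only on the inputs and not on `y_train`, the fitted values are linear in the responses, which is the defining property of a linear smoother.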