The goal of this project is to understand the recovery of a low-rank matrix from a specifically constrained set of data measurements. First, a neural network with a specific underlying data model is shown to be recoverable using a non-convex iterative algorithm. Second, a convex relaxation of a similar problem is shown to recover a rank-r matrix with high probability via constrained minimization. This project helps us understand how a highly non-convex problem can be solved by an iterative algorithm that recovers the matrix with high probability, provided the data has a specific structure. It also shows how to relax a non-convex problem into a convex one that standard computational solvers can handle, and which preconditions on the data must be fulfilled for such a system to perform well.
The methods in this project are twofold. On the one hand, we have a non-convex learning algorithm that provides an interpretable recovery of a matrix. Several variants are available with different computational efficiencies; we use the less computationally efficient one, whose interpretation is the most intuitive. This method estimates a matrix L iteratively: over batches of data, it averages the data feature products weighted by the difference between the prediction and the true label (the error term). The error term uses the current matrix L_t to predict from the data samples and compares that prediction with the ground-truth label, which closely resembles the loss functions regularly used to train neural networks. The new 'weights' matrix L_{t+1} is then obtained by projecting onto the set of all rank-r matrices the difference of L_t and this average, together with a weighted bias term; the bias term is the error of the current prediction minus 0.5 times the average ground-truth value. Instead of an error threshold, the paper stops after a fixed number of iterations. A simplified sketch of this update is given below.

On the other hand, we have a convex relaxation of the problem, which can be expressed in several ways. The broad idea of the relaxation is to minimize the nuclear norm of the variable matrix L subject to the constraint that the operator norm (largest singular value) of the adjoint operator applied to the difference between the measurements and the predictions is at most a constant lambda. How to choose lambda has been explored in related work, where a lambda with a special structure is proved to perform well for the given convex minimization problem. A sketch of this formulation appears after the iterative one.
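To make the iterative method concrete, here is a minimal sketch assuming linear measurements y_i = <X_i, L*> of a rank-r ground truth L* with Gaussian measurement matrices X_i. The step size eta, the fixed iteration count T, and the omission of the paper's bias term are simplifications for illustration, not the paper's exact update.

```python
import numpy as np

def project_rank_r(M, r):
    """Project M onto the set of rank-r matrices via truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

def recover(Xs, y, r, eta=0.5, T=100):
    """Projected gradient recovery: average the error-weighted data terms,
    take a gradient step, then project back onto the rank-r set."""
    m, n, _ = Xs.shape
    L = np.zeros((n, n))
    for _ in range(T):                                    # fixed iteration count
        preds = np.einsum('ijk,jk->i', Xs, L)             # <X_i, L_t> for each sample
        grad = np.einsum('i,ijk->jk', preds - y, Xs) / m  # average of error * X_i
        L = project_rank_r(L - eta * grad, r)             # step, then rank-r projection
    return L

# Toy usage: recover a random rank-r matrix from Gaussian measurements.
rng = np.random.default_rng(0)
n, r, m = 10, 2, 400
L_star = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
Xs = rng.standard_normal((m, n, n))
y = np.einsum('ijk,jk->i', Xs, L_star)
L_hat = recover(Xs, y, r)
print(np.linalg.norm(L_hat - L_star) / np.linalg.norm(L_star))
```

Note that the exact measurements make L* a fixed point of the update (its error term vanishes), which is what lets the iteration converge to it despite the non-convex rank constraint.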
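The convex relaxation can be sketched in the same setting. The formulation below minimizes the nuclear norm subject to an operator-norm bound on the adjoint of the residual, using cvxpy's normNuc and sigma_max atoms; the value of lam is a placeholder (the related work derives a principled choice), and the data generation mirrors the sketch above.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, r, m = 6, 2, 120
L_star = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-r ground truth
Xs = rng.standard_normal((m, n, n))                                 # measurement matrices
y = np.array([np.sum(X * L_star) for X in Xs])                      # y_i = <X_i, L*>

L = cp.Variable((n, n))
# Adjoint of the residual: A*(y - A(L)) = sum_i (y_i - <X_i, L>) X_i
adjoint = sum((y[i] - cp.trace(Xs[i].T @ L)) * Xs[i] for i in range(m))
lam = 0.1 * np.sqrt(n * m)  # placeholder; not the principled choice from related work

problem = cp.Problem(cp.Minimize(cp.normNuc(L)),        # nuclear-norm objective
                     [cp.sigma_max(adjoint) <= lam])    # operator-norm constraint
problem.solve()
print(np.linalg.norm(L.value - L_star) / np.linalg.norm(L_star))
```

Because the adjoint expression is affine in L, the sigma_max constraint is convex, so the whole problem is solvable by off-the-shelf conic solvers.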