Extracting meaningful representations of data is a fundamental problem in machine learning. Those representations can be viewed from two different perspectives. First, there is the representation of data in terms of the number of data points. Representative subsets that compactly summarize the data without superfluous redundancies help to reduce the data size. Those subsets allow for scaling up existing learning algorithms without approximating their solution. Second, there is the representation of every individual data point in terms of its dimensions. Often, not all dimensions carry meaningful information for the learning task, or the information is implicitly embedded in a low-dimensional subspace. A change of representation can also simplify important learning tasks such as density estimation and data generation. This thesis contributes to both of these views on data representation. We first focus on computing representative subsets for a matrix factorization technique called archetypal analysis and for the setting of optimal experimental design. For these problems, we motivate and investigate the usability of the data boundary as a representative subset. We also present novel methods to efficiently compute the data boundary, even in kernel-induced feature spaces. Based on the coreset principle, we derive another representative subset for archetypal analysis, which provides additional theoretical guarantees on the approximation error. Empirical results confirm that all compact representations of data derived in this thesis perform significantly better than uniformly sampled subsets of the data. In the second part of the thesis, we are concerned with efficient data representations for density estimation. We analyze spatio-temporal problems, which arise, for example, in sports analytics, and demonstrate how to learn (contextual) probabilistic movement models of objects using trajectory data.
Furthermore, we highlight issues that arise when interpolating data in normalizing flows, a technique that changes the representation of data to follow a specific distribution. We show how to solve these issues and obtain more natural transitions, using image data as an example.