In the past six years, machine learning has transformed computer vision. This learning-based approach treats computer vision as the problem of finding a function that maps, approximately and directly, from images to the desired output. The success of this approach has been driven by the availability of very large labelled datasets, advances in the architecture and training of deep neural networks, and developments in GPU hardware. Recent work, particularly using generative networks, suggests that learning-based approaches could have a similar impact in computer graphics. This talk will argue that learning alone, specifically supervised learning, cannot be used to solve all problems in visual computing. Using example problems in which training data is either very limited or does not exist at all, I will show how models borrowed from physics (with a little statistics and geometry thrown in) can be used to supervise learning; that is, the learning of a task can be "self-supervised" by explicit models. Specifically, I will present recent results in nonlinear 3D shape modelling, inverse rendering in the wild, and biophysical face image interpretation that apply this philosophy.
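To make the idea of model-based self-supervision concrete, here is a minimal sketch, not drawn from the talk itself: every name and number in it (the `render` function, the Lambertian shading model, the toy data) is an illustrative assumption. The point is the pattern of analysis-by-synthesis: an explicit physics model re-renders the observation, so its parameters can be fitted to unlabelled pixels by minimising reconstruction error, with no ground-truth labels involved.

```python
import numpy as np

# Hypothetical physics model (an assumption, not the speaker's actual model):
# Lambertian shading, with the scalar albedo folded into the light vector b,
# so a pixel with unit normal n has intensity max(0, n . b).
def render(b, normals):
    return np.clip(normals @ b, 0.0, None)

# Toy "scene": 100 random unit normals and an unknown albedo-scaled light.
rng = np.random.default_rng(0)
normals = rng.normal(size=(100, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)

b_true = np.array([0.0, 0.0, 0.8])   # ground truth, used only to synthesise data
image = render(b_true, normals)       # observed pixels -- no labels anywhere

# Self-supervision signal: how well the explicit model explains the image.
def loss(b):
    return np.mean((render(b, normals) - image) ** 2)

# Fit the model parameters by plain gradient descent
# (finite-difference gradients keep the sketch dependency-free).
b = np.array([0.3, 0.3, 0.3])
lr, eps = 0.5, 1e-5
for _ in range(2000):
    grad = np.array([(loss(b + eps * np.eye(3)[i]) - loss(b)) / eps
                     for i in range(3)])
    b -= lr * grad

print(b)  # recovered parameters, close to b_true
```

In a real system the hand-written `render` would be a differentiable renderer or biophysical skin model and `b` a neural network's output, but the supervisory structure is the same: the explicit model, not a labelled dataset, defines the training signal.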