The computational functions of the brain are largely implemented in spiking neural networks. However, how neurobiological spiking circuits develop their functionality, and how to instantiate similar capabilities in silico, remain elusive. In my talk I will introduce the emerging class of surrogate gradient methods, which aim to tackle this problem. Characteristically, surrogate gradients allow robust training of recurrent and multi-layer spiking neural networks through the minimization of cost functions. Where standard gradient-based methods fail because of the discontinuous, non-differentiable activation function of spiking neurons, sensible choices of surrogate gradients restore trainability and open new vistas onto plausible three-factor plasticity rules. Specifically, I will illustrate the effectiveness of surrogate gradient learning on several problems that require nonlinear computations in the temporal domain. Additionally, I will show that the performance of spiking networks trained with surrogate gradients is comparable to that of conventional multi-layer networks with graded activation functions. Moreover, suitable regularization yields sparse spiking activity, as often observed in neurobiology, with negligible impact on performance. Finally, I will discuss the biological plausibility of the learning rules underlying surrogate gradient learning.
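The core idea can be sketched in a few lines of plain Python: the derivative of the spiking nonlinearity is zero almost everywhere, so the backward pass substitutes a smooth surrogate. The fast-sigmoid surrogate and the toy single-neuron task below are illustrative assumptions for this sketch, not the specific models discussed in the talk.

```python
# Toy illustration of surrogate gradient learning (plain Python, no framework).
# A neuron emits a spike when its membrane potential v = w*x - theta crosses zero.
# The true derivative of the spike function H(v) is zero almost everywhere,
# so standard gradient descent cannot adjust w. A surrogate derivative
# (here the fast-sigmoid form, 1 / (beta*|v| + 1)^2) restores a learning signal.

def spike(v):
    """Heaviside step: emits a spike (1.0) when v > 0."""
    return 1.0 if v > 0 else 0.0

def surrogate_grad(v, beta=1.0):
    """Smooth stand-in for dH/dv, used only in the backward pass."""
    return 1.0 / (beta * abs(v) + 1.0) ** 2

x, theta, target = 1.0, 1.0, 1.0   # input, threshold, desired output spike
w, lr = 0.0, 1.0                   # initial weight and learning rate

for step in range(20):
    v = w * x - theta              # membrane potential
    s = spike(v)                   # forward pass uses the hard threshold
    loss = (s - target) ** 2
    # Backward pass: chain rule with the surrogate in place of dH/dv.
    grad_w = 2.0 * (s - target) * surrogate_grad(v) * x
    w -= lr * grad_w

final_loss = (spike(w * x - theta) - target) ** 2
```

With the true (almost-everywhere zero) derivative, `w` would never move; with the surrogate, the toy neuron learns to fire after a handful of updates.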