Abstract: Activation functions provide deep neural networks with the non-linearity necessary to learn complex distributions. It remains unclear what the optimal shape is for the activation ...