Perceptual learning is often considered one of the simplest and most basic forms of learning. Accordingly, it is usually modeled with simple, basic neural networks that capture the empirical data well. Simple meets simple. Complex forms of perception and learning are then thought to rely on these simple networks. Here, we argue that this simplicity is in fact the Achilles heel of models of perceptual learning. We propose, instead, that perceptual learning of simple stimuli cannot be modeled with simple networks. We review some of the empirical results leading to this conclusion.