We can usually recognize novel instances of familiar faces with little difficulty, yet recognition of unfamiliar faces can be dramatically impaired by natural within-person variability in appearance. In a card-sorting task for facial identity, different photos of the same unfamiliar face are often seen as different people. Here we report two card-sorting experiments in which we manipulate whether participants know the number of identities present. Without this constraint, participants sort the photos into many separate identities. However, when told the number of identities present, they are highly accurate. This minimal contextual information appears to support viewers in “telling faces together”. In Experiment 2, we show that exposure to within-person variability in the sorting task improves performance in a subsequent face-matching task. Such exposure appears to offer a fast route to learning generalizable representations of new faces.