When sounds are “different”, such that swapping one sound for the other changes a word’s meaning (for example, “pat” vs. “bat”), this difference is usually *binary*. In other words, the sounds can easily be classified into *two* distinct categories, rather than belonging to a continuum from one sound to the other.
In the case of “pat” and “bat”, the first consonants of the two words differ in their “voicing”: whether or not the vocal cords are vibrating. (Try it: if you put your hand on your throat, you can feel your vocal cords vibrate when you say “zzzzz”, but not when you say “ssssss”.) The vocal cords vibrate during the <b> in “bat”, but not during the <p> in “pat”.
Interestingly, no language in the world makes distinctions based on the *degree* of vocal cord vibration. Languages care whether the vocal cords are vibrating or not, but never base distinctions on whether the vocal cords are vibrating *slightly* vs. *medium* vs. *vigorously* vs. *extremely vigorously*. In technical terms, the voicing distinction, like almost all phonological distinctions, is binary.
However, when we look at actual language use, the theoretical ideal of binarity breaks down. In English, for example, we can say: (a) “thank you so much”, or (b) “thank you sooo much” or even (c) “thank you sooooo much”. The length of the vowel in “so” determines the degree to which we express our gratitude—(c) expresses a greater degree of gratitude than (a) or (b).
This study investigates a similar phenomenon of consonant lengthening found in Japanese, and shows that Japanese speakers can distinguish up to 6 different levels of consonant duration to express emphasis. For example, “katai” means “hard”, “katttai” means “very hard”, and “katttttai” means “extremely hard”.
This result suggests that Japanese speakers do, in fact, have the articulatory ability to make many fine-grained distinctions along a continuum of duration, which goes against the hypothesis that all distinctions should be binary.