something really interesting about linguistic phenomena is that they often seem to operate on principles of equivalence that are completely contrary to how typical machine learning functions
in phonology, for instance, we form equivalence classes of sounds on the grounds not of phonetic similarity but of contextual non-overlap (what linguists call complementary distribution). a kind of anti-similarity
one of the core criteria for two sounds to count as equivalent in a language is that they have strictly non-overlapping distributions, i.e. they never show up in the same context
the more overlap they have, the LESS equivalent they are. similarity = difference
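to make the non-overlap criterion concrete, here's a minimal python sketch. everything in it is an assumption for illustration: the toy corpus, the simplified transcriptions, the helper names `contexts` and `non_overlapping`, and the choice to define a "context" as just the immediately preceding and following sound. real phonological analysis generalizes over contexts rather than comparing them literally.

```python
def contexts(phone, corpus):
    """Collect every (previous, next) environment in which `phone` occurs."""
    found = set()
    for word in corpus:
        padded = ["#"] + word + ["#"]  # "#" marks a word boundary
        for i in range(1, len(padded) - 1):
            if padded[i] == phone:
                found.add((padded[i - 1], padded[i + 1]))
    return found


def non_overlapping(a, b, corpus):
    """True if the two phones never share an environment in the corpus."""
    return contexts(a, corpus).isdisjoint(contexts(b, corpus))


# Hypothetical, heavily simplified transcriptions (illustration only).
TOY_CORPUS = [
    ["tʰ", "ɑ", "p"],      # top
    ["s", "t", "ɑ", "p"],  # stop
    ["tʰ", "ʌ", "b"],      # tub
    ["s", "t", "ʌ", "b"],  # stub
    ["tʰ", "æ", "p"],      # tap
    ["tʰ", "æ", "b"],      # tab
]

print(non_overlapping("t", "tʰ", TOY_CORPUS))  # True: candidates for equivalence
print(non_overlapping("p", "b", TOY_CORPUS))   # False: both occur in æ_# (tap/tab)
```

in this toy data, the two t-sounds never share a context, so they pass the criterion, while the sounds at the end of "tap" and "tab" do share one, so they fail it.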
ok, so some folx wanted to know more about this in linguistics/phonology, so here goes
when we study how the sounds of different languages work, we find that a language doesn't just have a set of sounds that are used in the language
we also find that those sounds relate to one another in various ways
for example, if we consider the <t> in the word <top> and in the word <stop>, we can determine acoustically (via the waveforms/spectrograms) and physically by feel, that these are different sounds
but you, as an english speaker, perceive them and treat them as the same sound
to see that they're different, place your hand in front of your mouth and say "top top top top top" then say "stop stop stop stop stop"
you should feel a stronger puff of air for "top" than "stop"
if you find that you puff air out when saying the <p> as well, say the words "tub" and "stub", to reduce the puff from the <p>
English treats these two sounds as effectively the same, and if you edit a recording so that "stop" has the <t> of "top" instead, people will still perceive it as "stop", but just a little funny. it's the same word, with a weirdness to it, breathy and airy but still "stop"
Korean, on the other hand, treats these two sounds as different, and so if you play the same trick with a Korean word, you might end up with a totally different word that means something different!
these two sounds, btw, are written phonetically as [t] and [tʰ]
so languages don't just have sounds, they have equivalences. in English, [t] and [tʰ] are equivalent, while in Korean, they're not equivalent
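one rough way to picture "languages have equivalences" is as a per-language grouping of sounds, sketched below in python. the `EQUIVALENCES` table, the function names, and the simplified transcriptions are all assumptions made up for this illustration, and each table covers only the two sounds discussed here. the Korean pair is meant as 달 "moon" vs 탈 "mask".

```python
# Each language groups phones into classes it treats as "the same sound".
# Hypothetical tables covering only the two phones under discussion.
EQUIVALENCES = {
    "english": [{"t", "tʰ"}],
    "korean": [{"t"}, {"tʰ"}],
}


def normalize(word, language):
    """Replace each phone with a fixed representative of its class."""
    out = []
    for phone in word:
        label = phone  # phones outside every class stand for themselves
        for cls in EQUIVALENCES[language]:
            if phone in cls:
                label = min(cls)  # any fixed representative would do
        out.append(label)
    return out


def same_word(a, b, language):
    """True if the two pronunciations are equivalent in this language."""
    return normalize(a, language) == normalize(b, language)


# English: swapping the two sounds in "stop" keeps it the same word.
print(same_word(["s", "t", "ɑ", "p"], ["s", "tʰ", "ɑ", "p"], "english"))  # True
# Korean: the same swap distinguishes two words.
print(same_word(["t", "a", "l"], ["tʰ", "a", "l"], "korean"))  # False
```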
but it's more than mere equivalence, because we also find that the language preferentially uses equivalent sounds in different contexts
sure, English treats [t] and [tʰ] as equivalent, and if you swap the one for the other, it doesn't make it a different word
but ALSO, English really wants you to use [t] after an <s>, and using [tʰ] there sounds peculiar. not enough to be a different word! but peculiar
and there are some really interesting examples where changing the surroundings of a sound, for instance with suffixes, prefixes, or other words, can actually force you to change which of the two sounds you say
so generally, we describe this phenomenon as "allophony"
the sounds [t] and [tʰ] are called "phones", the basic sounds of the phonetics of a language, and they are "equivalent" in some sense, which we describe by saying they are part of, or come from, the same "phoneme"
that phoneme being called /t/ (notice we use the brackets [] for phones and // for phonemes, and also <> for spelling)
and because they are both phones of the same phoneme, we call them allophones (of that phoneme)
so the allophones of /t/ are [t] and [tʰ]
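as a rough sketch of the phoneme/allophone relationship, here's some python that picks an allophone of /t/ from its context. the function name `realize` and the rule itself are assumptions for illustration: it only models the two environments from the examples (word-initial as in "top", after /s/ as in "stop") and leaves every other position as plain [t] without claiming that's what English actually does there.

```python
# The phoneme /t/ and the two allophones discussed here.
ALLOPHONES = {"t": ["t", "tʰ"]}


def realize(phonemes):
    """Turn a list of phonemes into phones, choosing an allophone of /t/.

    Toy rule: word-initial /t/ comes out as [tʰ]; everywhere else,
    including after /s/, it comes out as [t]. Other contexts are not
    really modeled.
    """
    phones = []
    for i, segment in enumerate(phonemes):
        if segment == "t" and i == 0:
            phones.append("tʰ")   # "top": aspirated at the start of the word
        else:
            phones.append(segment)  # "stop": plain [t] after /s/
    return phones


print(realize(["t", "ɑ", "p"]))       # ['tʰ', 'ɑ', 'p']
print(realize(["s", "t", "ɑ", "p"]))  # ['s', 't', 'ɑ', 'p']
```

both words contain the same phoneme /t/ going in, and the context decides which phone comes out.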