something really interesting about linguistic phenomena is that they often seem to operate on principles of equivalence that are completely contrary to how typical machine learning functions
in phonology, for instance, we form equivalence classes of sounds on the grounds of not phonetic similarity but rather contextual non-overlappingness. a kind of anti-similarity
one of the core criteria for two phonetic sounds to be equivalent in a language is that they have strictly non-overlapping distributions
the more overlap they have, the LESS equivalent they are. similarity = difference
ok, so some folx wanted to know more about this in linguistics/phonology, so here goes
when we study how the sounds of different languages work, we find that a language doesn't just have a set of sounds that are used in the language
we also find that those sounds relate to one another in various ways
for example, if we consider the <t> in the word <top> and in the word <stop>, we can determine acoustically (via the waveforms/spectrograms) and physically by feel, that these are different sounds
but you, as an english speaker, perceive them and treat them as the same sound
to see that they're different, place your hand in front of your mouth and say "top top top top top" then say "stop stop stop stop stop"
you should feel a stronger puff of air for "top" than "stop"
if you find that you puff air out when saying the <p> as well, say the words "tub" and "stub", to reduce the puff from the <p>
English treats these two sounds as effectively the same, and if you edit a recording so that "stop" has the <t> of "top" instead, people will still perceive it as "stop", but just a little funny. it's the same word, with a weirdness to it, breathy and airy but still "stop"
Korean, on the other hand, treats these two sounds as different, and so if you play the same trick with a Korean word, you might end up with a totally different word that means something different!
these two sounds, btw, are written phonetically as [t] and [tʰ]
so languages don't just have sounds, they have equivalences. in English, [t] and [tʰ] are equivalent, while in Korean, they're not equivalent
but it's more than mere equivalence, because we also find that the language preferentially uses equivalent sounds in different contexts
sure, English treats [t] and [tʰ] as equivalent, and if you swap the one for the other, it doesn't make it a different word...
...but ALSO, English really wants you to use [t] after an <s>, and using [tʰ] there still sounds peculiar. not enough to be a different word! but peculiar
and there are some really interesting examples where changing the surroundings of a sound, for instance with suffixes, prefixes, or other words, can actually force you to change which of the two sounds you say
so generally, we describe this phenomenon as "allophony"
the sounds [t] and [tʰ] are called "phones", the basic sounds of the phonetics of a language, and they are "equivalent" in some sense, which we describe by saying they are part of, or come from, the same "phoneme"
that phoneme being called /t/ (notice we use the brackets [] for phones and // for phonemes, and also <> for spelling)
and because they are both phones of the same phoneme, we call them allophones (of that phoneme)
so the allophones of /t/ are [t] and [tʰ]
and we say that we use the [t] allophone in some contexts, and the [tʰ] allophone in other contexts, and it's the job of the phonologist -- a person who does phonology (ie the study of the relationships between sounds in a language) -- to describe and explain these patterns
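to make "different contexts" a bit more concrete, here's a rough python sketch. it's only an illustration, not a real phonological analysis: the transcriptions are simplified toy versions of the four words from earlier, and the names `contexts`, `overlap`, and `WORDS` are made up for this example. a "context" here is assumed to be just the sound before and the sound after, with `#` marking a word edge

```python
# toy transcriptions (simplified, for illustration only); each word is a list of phones
WORDS = [
    ["tʰ", "ɑ", "p"],       # top
    ["s", "t", "ɑ", "p"],   # stop
    ["tʰ", "ʌ", "b"],       # tub
    ["s", "t", "ʌ", "b"],   # stub
]


def contexts(phone, words):
    """collect the (previous, next) neighbors of `phone`; '#' marks a word edge."""
    found = set()
    for word in words:
        padded = ["#"] + list(word) + ["#"]
        for i in range(1, len(padded) - 1):
            if padded[i] == phone:
                found.add((padded[i - 1], padded[i + 1]))
    return found


def overlap(a, b, words):
    """contexts in which both phones show up."""
    return contexts(a, words) & contexts(b, words)


print(overlap("t", "tʰ", WORDS))  # → set()
```

in this tiny word list, [tʰ] only ever shows up at the start of a word and [t] only after [s], so the two context sets share nothing. with a real lexicon you'd want a much bigger word list and probably a richer notion of context than "the two neighboring sounds"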
and there's a methodology we use to do this. it's hinted at above actually
the standard methodology is basically as follows:
if you want to know if two phones [X] and [Y] are allophones of a common phoneme /Z/, you can prove they are NOT, by simply finding two words...
...that differ ONLY by [X] and [Y]
by different words, what i mean is each pronunciation is associated with a specific, different meaning
for example, let's show that [t] and [p] are different sounds in English
can we find two words, with different meanings, that are pronounced the same, except for swapping [t] for [p]? yes!
<stud> pronounced [stʌd]
and
<spud> pronounced [spʌd]
the context is [s_ʌd]
put [t] in for [_], you get one meaning
put [p] in, you get a different one
so we _differentiate_ sounds by showing that swapping one for the other is sufficient to force a different meaning
we call such a pair of words a "minimal pair", that is to say, it's a pair of words (ie different meanings) with a minimal sound difference
if we can find a minimal pair for two sounds, they're not allophones of the same phoneme
if we can NOT find a minimal pair for two sounds, then they MIGHT be allophones of the same phoneme, but it's not guaranteed
for instance, we'll never find a minimal pair for [t] and [pʰ], or for [tʰ] and [p]
so what we know is merely that [t] and [p] can't be allophones of the same phoneme, nor can [tʰ] and [pʰ]
we also never find minimal pairs for [t] and [tʰ], nor for [p] and [pʰ]
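the minimal pair test described above can be sketched in python too. again, this is a toy under stated assumptions: `LEXICON`, `is_minimal_pair`, and `find_minimal_pairs` are names invented for this example, the transcriptions are simplified, and each spelling is assumed to stand for a distinct meaning

```python
# hypothetical toy lexicon: spelling -> simplified phone sequence
# (assumption: each spelling here is a word with its own distinct meaning)
LEXICON = {
    "top": ("tʰ", "ɑ", "p"),
    "stop": ("s", "t", "ɑ", "p"),
    "stud": ("s", "t", "ʌ", "d"),
    "spud": ("s", "p", "ʌ", "d"),
}


def is_minimal_pair(word_a, word_b, x, y):
    """true if the two words differ in exactly one slot, with x in one and y in the other."""
    if len(word_a) != len(word_b):
        return False
    diffs = [(a, b) for a, b in zip(word_a, word_b) if a != b]
    return diffs == [(x, y)] or diffs == [(y, x)]


def find_minimal_pairs(x, y, lexicon):
    """list every pair of spellings whose pronunciations differ only by x vs y."""
    spellings = list(lexicon)
    pairs = []
    for i, first in enumerate(spellings):
        for second in spellings[i + 1:]:
            if is_minimal_pair(lexicon[first], lexicon[second], x, y):
                pairs.append((first, second))
    return pairs


print(find_minimal_pairs("t", "p", LEXICON))   # → [('stud', 'spud')]
print(find_minimal_pairs("t", "tʰ", LEXICON))  # → []
```

note the asymmetry: a non-empty result shows the two phones are NOT allophones of one phoneme, while an empty result only means they MIGHT be, and with a four-word toy lexicon an empty result says very little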