something really interesting about linguistic phenomena is that they often seem to operate on principles of equivalence that are completely contrary to how typical machine learning functions
in phonology, for instance, we form equivalence classes of sounds not on the grounds of phonetic similarity but rather on the grounds of contextual non-overlap. a kind of anti-similarity
one of the core criteria for two phonetic sounds to be equivalent in a language is that those two sounds have strictly non-overlapping distributions, i.e. they never show up in the same contexts
the more their distributions overlap, the LESS equivalent they are. similarity = difference
ok, so some folx wanted to know more about this in linguistics/phonology, so here goes
when we study how the sounds of different languages work, we find that a language doesn't just have a set of sounds that are used in the language
we also find that those sounds relate to one another in various ways
for example, if we consider the <t> in the word <top> and in the word <stop>, we can determine acoustically (via the waveforms/spectrograms) and physically (by feel) that these are different sounds
but you, as an english speaker, perceive them and treat them as the same sound
to see that they're different, place your hand in front of your mouth and say "top top top top top" then say "stop stop stop stop stop"
you should feel a stronger puff of air for "top" than for "stop"
if you find that you puff air out when saying the <p> as well, say the words "tub" and "stub", to reduce the puff from the <p>
English treats these two sounds as effectively the same: if you edit a recording so that "stop" has the <t> of "top" instead, people will still perceive it as "stop", just a little funny. it's the same word with a weirdness to it, breathy and airy, but still "stop"
Korean, on the other hand, treats these two sounds as different, and so if you play the same trick with a Korean word, you might end up with a totally different word that means something different!
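if it helps to see the distribution criterion as code, here's a rough python sketch. it assumes each word is already a list of phone symbols, takes a sound's "context" to be just its immediate left and right neighbors (with "#" standing for a word edge), and uses tiny toy word lists with rough transcriptions: top/stop/tub/stub for english, and 탈 "mask" vs 달 "moon" for korean. [tʰ] is the puffy <t>, [t] is the non-puffy one. the function names are made up for illustration, and a real analysis would look at far more data and more criteria than this one

```python
def environments(corpus, phone):
    """Collect the (left neighbor, right neighbor) contexts where `phone` occurs.

    Each word in `corpus` is a list of phone symbols; "#" marks a word edge.
    """
    envs = set()
    for word in corpus:
        padded = ["#"] + list(word) + ["#"]
        for i in range(1, len(padded) - 1):
            if padded[i] == phone:
                envs.add((padded[i - 1], padded[i + 1]))
    return envs


def overlap(corpus, a, b):
    """Contexts shared by phones `a` and `b`; an empty set means no overlap."""
    return environments(corpus, a) & environments(corpus, b)


# Toy data only (an assumption for illustration), in rough transcription.
english = [
    ["tʰ", "ɑ", "p"],       # top
    ["s", "t", "ɑ", "p"],   # stop
    ["tʰ", "ʌ", "b"],       # tub
    ["s", "t", "ʌ", "b"],   # stub
]
korean = [
    ["tʰ", "a", "l"],       # 탈 "mask"
    ["t", "a", "l"],        # 달 "moon"
]

print(overlap(english, "tʰ", "t"))  # set()
print(overlap(korean, "tʰ", "t"))   # {('#', 'a')}
```

in the english toy data the two sounds never compete for the same slot, so the overlap is empty and they can be grouped as one sound. in the korean toy data they share a context, and swapping one for the other there gives you a different word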