Important discovery: The Colbert Report is virtually impossible to speechread.
Oh, Colbert faces the camera straight on, and enunciates pretty well, probably a side-effect of un-learning his Southern accent as a young adult. (He gets it back a wee little bit when campaigning for the President of the United States of South Carolina. I've seen him mention he learned to quash it at college when he realized it made people assume he was less than brilliant, but environment is hard to resist.) It's just that lip reading is mainly guessing what fits the visemes you can see, based on body language and context, and "Colbert" runs almost entirely on sarcasm and insane troll logic. It leads to a lot of moments where you have a graphic of, say, a convicted money launderer up on the left side of the screen, but find yourself watching the lip flaps and going, "...wait, did he just say 'banana'? Angrily?" We shall not even speak of "The Wørd", where the snark-titles on the right are based as much on his bizarre tangential turns of phrase as they are the actual Wørd in question.
I lip read better than I probably ought to, for someone whose hearing is perfectly good. Despite years upon years of abusing the volume control on all of my music players, I've always come out fine on pure tone tests. I used to hear monitor whine, back when we all used CRTs, and I still hear borderline-ultrasonic feeps and whiffles from the rats, if I get close enough. I think everyone else has the TV up hellaciously loud, and when left to my own devices, I crank the sound down by at least 50%. I'm not sure my hearing is stupendous so much as I've spent enough time picking apart synaesthesiae that I just pay way too much attention to it all. I usually classify things like diesel rumbles as infrasound, even though the primary way I pick them up is that they make the room feel oppressive and my head feel stuffed full. Other people think I am puzzlingly insane until the truck finally leaves and they realize the low-level hum is gone.
The reason I speechread so well, I suspect, is that I have a sporadic central auditory process something something. Sometimes the parser just kind of fall down go boom. I have an unfortunate tendency to say, "What?" when this happens, which makes people repeat the same string of syllable mush word-for-word in the exact same cadence, when what I really need before it will coagulate into words is for them to rephrase it. I guess a lot from attention and gestures. This fails utterly on the phone, obviously, and is one of the reasons I hate talking on the damnable things. If I really try, I can get about 50% of what you're saying, especially on a cell phone, especially on a crap cell phone, especially on a crap cell phone while on a train or standing in the middle of the sidewalk with random noises coming in my other ear, and you know what? Just fucking text me. You already have the number. PLEASE.
There are a couple of interesting corollaries to this. One is that I'm largely immune to the McGurk effect. I won't swear it never works on me, but at least when I'm watching videos that purport to demonstrate it, I automatically hear the spoken syllable, and with a tick of attention I can generally figure out what syllable is being depicted. Depending on the demo, I sometimes have no idea what combined syllable I'm supposed to be hearing, although I know enough about linguistics and speech physiology that I can make an educated guess.
Another is that I've gotten kind of indifferent to dealing with crap internet video where the sound and picture are out of sync. I find it a great deal more annoying when the audio and video are matched, but the subtitles are out of whack. I'm a great deal more reliant on the relationship between mots and paroles than I am between phonemes and visemes, to the point where the only two languages I have had classes in and failed to retain any of are the two where the instructor refused to give me a large wadge of written materials. The "textbook" did not as such exist for Navajo when I tried to take that, and the instructor was unable to give me a grammar to go with the packet. The Arabic teacher just refused to write down anything for which we hadn't learned all the letters in the Arabic abjad. The coursebook tried to bang everything into our heads with raw verbal repetition, without giving me any written source to match it to. I remember about three words, all of which commonly occur on takeout menus.
I have no idea how most people process lip-reading. The Wiki article inconveniently doesn't say. I do it through a sort of reverse subvocalization process: I look at the mouth position, imagine my own mouth in that position, and take a stab at what noises I could be making if that were the case. Some of the lesson pages seem to indicate that you're supposed to learn it by directly linking the visual "mouth that looks like this" to "noise that sounds like this" in the same sort of symbolic way that printed letters symbolize sounds, but I would guess that the pages are trying to teach lip-reading to people who aren't d/Deaf -- that method wouldn't work if you'd never heard any of the noises, and thus don't have an instinctive idea of what sequence of noises make up what words.
My method, conversely, wouldn't work if you'd never known how to speak those words yourself. There are a lot of near-minimal pairs among English phonemes, particularly voiced/voiceless pairs like t/d and f/v, and nasal/oral pairs like m/b. Having to learn to speak these from watching mouth movements, but without being able to audibly distinguish between positions of speech organs that can't be seen from the outside, results in the characteristic "accent" that hearing people perceive in Deaf speech. Think about it. It's crazy difficult to figure out how to manipulate your soft palate and throat muscles to control the flow of air through your nose if you can't hear the difference. It's at least as difficult to change as it is to smash your native accent in an adult-acquired second language. Even Marlee Matlin has it, and she spends so much of her time working and performing outside of the Deaf community that I can lip read most of her English while she signs.