In this bonus track for episode 37 with Michael Erard, we hear his story about the piano in the MPI basement, and people working at the institute tell us about their research and themselves. Featuring: Hans Rutger Bosker (https://www.mpi.nl/people/bosker-hans-rutger), Mark Dingemanse (https://www.mpi.nl/people/dingemanse-mark/), and Charlotte Horn.
Hey, thanks for tuning in to this LangFM bonus track. As I mentioned in the main episode with Michael Erard, he was kind enough to introduce me to several researchers at the Max Planck Institute in Nijmegen. But before we listen to what they have to tell us about their research, how about we start with a little story? The story of the piano in the basement.
One day in late November I was at lunch again with Charlotte, Mark, and his officemate, Edwin, who works on chimp behavior. “Sometimes I hear music being played, piano music,” Edwin said, “but only after work hours.” “Where is there a piano?” I asked. In the basement, they said. There’s a basement? I hadn’t known. Two days later, my oldest son came to the institute with me; there was a teacher’s strike, and he had no school. On such days, usually we sat for a while in my office, the door closed, while I wrote; he watched videos or looked at photos of trains on my desktop computer to sketch. Then at 10:30 we wandered down to the canteen for a muffin. That day I suggested we explore the basement, and he agreed, but as soon as we started down the dark staircase, he balked. He’s been like this, hesitant to enter spaces that might look forbidden. There’s no sign here, I said, no reason to avoid this. Plus there’s the piano they were talking about. I have to see the piano. I persuaded him further down. As soon as we entered the hallway, a light snapped on. We crept along the corridor and found an entrance to the library’s lowest stack, which we’d explored from above during a previous visit. “It’s not the library,” he insisted. “It is! It’s the same,” I said. Magic bunny holes existed all over the institute. Do people map out routes using the basement to avoid others they don’t want to see? If they don’t, they could. We rounded the corner, looked into an office, and saw Jan, the main janitor. He’s jolly, bald, plain-talking and friendly. Booming voice: “Hi Michael!” He’s the one responsible for all the physical systems. If it can be moved or can break, Jan is in charge of it. Don’t take Jan for granted. The man has keys. “We’re checking out the basement,” I said. “I’d never been down here before.” “Well, let me show you some other things,” he said.
Well, now it’s my turn to show you some other things that happen at MPI. Let’s listen to Hans Rutger Bosker, Mark Dingemanse and Charlotte Horn. First up, Hans:
Nice to meet you! - It’s much cooler down here, isn’t it? - It’s quite pleasant, yeah! Especially compared to two floors up.
I have always been interested in speech and not necessarily language per se, but rather the speech signal, the spoken signal and most of my work is on speech perception so how come… I mean you can understand me right now perfectly well because this is a quiet surrounding right now. But once we get into a situation with this loud traffic or competing talk within the bar, music… - It’s the worst! - Or even just a non-native speaker or someone who produces noise inside the signal, inside the compound communicative signal itself, still we somehow cope, right? We don't have that much struggle in a bar. So somehow we cope and we do that much better than Google does, right? And all this speech recognition software.
I'm interested in this speech perception especially when it gets bad. Right. And it gets difficult. So for instance, one of the topics is very fast speech. At some point it breaks down, right? We can compress speech to a certain degree, make it really fast. But at some point it breaks down. Why does it break down at a particular rate? What is it in the brain that fails then?
I mean, I can produce this simple sentence and speed it up by two, you have no problem, right? But acoustically, it’s very, very different. If you just look at the signal, it's completely different. But we understand the same words. If we compress it by three, you can still kind of somehow pick up the words but somewhere along, you know, when you compress by four or five, we lose that ability and it becomes too fast. Now why is that? Right? So we look at how our brain processes fast speech, for instance.
Another topic is disfluency. Communication can also get difficult when there's lots of ums or pauses or breakdown. And somehow the brain has developed a clever way of dealing with those. So we also have done some research into, for instance, how the brain processes online, right? So we use eye tracking research where we show pictures and we then present disfluent speech and see how that influences how people interpret the speech. Because the eyes follow what you think is gonna come up. It's a picture of, say, a sewing machine, something really low-frequency that we don't see every day. A spinning wheel, you know, very low-frequency items, and high-frequency items: cars, bikes. You see two pics on the screen. That's it. And then you hear a sentence: now click on the… It's not gonna be the hand, right? It's not gonna be difficult to name. So somehow we use those ums to already predict that it's gonna be the difficult item. Well before we hear the S of sewing machine or we hear the S and P of spinning. So it's evidence of that, that we actively use these kind of meta-cognitive, meta-linguistic cues, these performance cues that don't have really meaning themselves. Um doesn't mean anything, but nevertheless… - Maybe it does? - Yeah. Listeners can cleverly use those symptoms of speech production in perception.
For the speech rate thing, we also use neuro-biological methods, we’ve used EEG and MEG, which is measuring brain activity at a given moment in time and that has a very, very fine temporal resolution. Very good in terms of the temporal resolution so you can measure every millisecond what's going on in the brain. And what we've been showing is that people actually track the rate of speech, so if we take a normal speech signal that has, I don't know, I’ll say five syllables every second, then we actually see in the brain a kind of a five-beats-a-second response as well. So it's like the brain adjusts to what you hear. Now if you compress that by two, you end up with 10 beats a second. You again see the ten beats a second back in the brain. Once it gets too fast, so when compressed by three or four or five, we lose that correlation between brain and speech. So it's like the brain tries to keep up with the signal as it comes in. But somewhere it breaks down. And the moment it breaks down is said to be relevant for the kind of brain waves that are in the brain. The edge is around 10 Hertz, right? Everything below 10 Hertz is trackable, syllable rates below 10. And you'll find that languages all around the world will have syllable rates below 10 Hertz. - They do? - Yeah. One reason might be, given all this neuro stuff, that the brain has been designed to track rhythms within this range and not outside. And therefore, languages have evolved to be within that range. It's like, you can fool a brain as long as you follow its rules. You can help it to understand speech and ideally that's where you want to go. You know, people with hearing aids, elderly people lose their hearing acuity, and cochlear implants. That's ideally where you want to go. Understanding how speech perception works. What are the constraints that the brain imposes over the years or whatever. If we know, if we can understand that, then we can apply it. Right?
So it's in the end, of course, it's fundamental research but it's driven by this idea of wanting to understand in order to be able to then tweak perception.
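The speech-rate arithmetic Hans describes can be sketched in a few lines of Python. This is only an illustration of the numbers quoted in the conversation (roughly five syllables per second, and an edge of about 10 Hertz), not the lab's analysis. The function names and the choice to treat exactly 10 Hertz as still trackable are assumptions made here for the example.

```python
# Illustrative sketch only: figures are the round numbers from the interview.
BASE_RATE_HZ = 5.0        # "say five syllables every second"
TRACKING_LIMIT_HZ = 10.0  # "the edge is around 10 Hertz"


def compressed_rate(base_rate_hz: float, factor: float) -> float:
    """Syllable rate after time-compressing speech by the given factor."""
    return base_rate_hz * factor


def is_trackable(rate_hz: float, limit_hz: float = TRACKING_LIMIT_HZ) -> bool:
    """Assumption: rates up to and including the limit can still be tracked."""
    return rate_hz <= limit_hz


if __name__ == "__main__":
    for factor in (1, 2, 3, 4, 5):
        rate = compressed_rate(BASE_RATE_HZ, factor)
        status = "trackable" if is_trackable(rate) else "too fast"
        print(f"compressed x{factor}: {rate:.0f} syllables/s ({status})")
```

With these numbers, compression by two lands right at the edge, and compression by three or more moves past it, which matches the point in the conversation where the correlation between brain and speech is lost.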
All these languages, I mean they're all spoken languages. We all speak with our mouths, and our mouths constrain how we speak. We can only speak so fast. We can't speak at 10 Hertz. Right? So therefore we also don't find that that much. -- But it looks fairly consistent. Is that true or is that just the layman…? That's my point. Yeah. Across all languages. This is just Dutch, but the theory is, the suggestion is that this is general across the board simply because this is how our jaws work. And maybe even how our brains work. Our brains can’t keep up with 10 Hertz, so why would you speak above 10 Hertz? -- Because sometimes, there's this perception at least, that some languages are quicker than others. Like Spanish would be the typical example. And they sound so quick, but apparently… That may just be individual perception. That's clearly a perception issue because if you do the math, if you look at the acoustics, the signals yourself, they're very comparable. -- It checks out. There's also stuff on speech rate where it's all within the same range. But of course, perception deviates from merely objectively observing the world. If you listen to Spanish, we can’t pick up the sounds, right, because they might have sounds that we don't use. We don't hear the word boundaries. I mean if you look at the speech signal, there's no pauses. -- If you don't know Spanish. Yeah. So it's very difficult to pick out the words simply because we don't know where they end. We just get a blur of sound and it’s just like, what? So perceptually, that sounds a lot faster and it's been called the gabbling foreigner illusion, right? -- Oh yeah, sounds very familiar. And everybody has this intuition. But of course that can't hold because a Spanish person will think Dutch is fast. -- Because it's a foreign language. Exactly. It can't be in the acoustics. It's clearly a perception.
Michael: What that speech modulation curve also tells me is that it's probably very, very unlikely that someone who's an interpreter would have to be dealing with or resolving differences in modulation rates between the two languages.
Yeah. I think so. But that's mere acoustics. That's just listening to sound. Of course, there's much more when it comes to interpreting because you also have to understand the sound, pick out the words, make sentences in your mind, understand them and then go all the way back from that meaning, construct a sentence in a different language.
-- It looks, from your research, like there's probably almost no difference between a professional language user like an interpreter, let's say, and a normal person because there are some kind of physical limitations to the rate that you can sort of tolerate. Yeah, of course there's a range, right? -- Because I would have said maybe, you know, a professional language user who has more training, maybe that makes a difference? Maybe it doesn’t, I don't know. Yeah. No, certainly. I mean it's only a range. I'm just saying that extremely fast or extremely slow won't happen. And the reason might be the jaw, it might be the brain. But within that, there's still considerable variation of both. If we are proficient speakers of the language, we're much faster than non-native speakers, and there's individual variation in people who are just generally slow. So there's still of course considerable variation. Nevertheless, that variation is constrained. It doesn't go up to 15 Hertz. There are boundaries and that's what we try to pinpoint, to limit the variability because there is plenty of variation.
A few floors up, there’s Mark Dingemanse:
[Introductions]
Michael: I’m gonna step out and get some coffee.
So there are two main strands to my work. And the one that Michael [Erard] mentioned that I think is the one to do with… he mentioned this little word “huh”. That's the word that you use when you didn't quite catch what somebody said - or at least one of the ways in which you can do that. It's the most informal way. And the most widely used way. But the larger research program is more broadly about how we manage to communicate at all, you know, against overwhelming odds. So much can go wrong and goes wrong all the time that it's kind of a miracle that we manage to be able to talk somehow. -- Even if you speak the same language. Yes. Even without me thinking about all the formidable issues of different languages. But this is just about using a word that the other doesn't quite know or using a name that has multiple possible references but also communicating in noise, of course, you know: traffic passing by, all sorts of other trouble, overlapping. So you know there are very many ways in which things can go wrong. And what's so interesting about human language as opposed to many other animal communication systems is we actually have pretty good ways to deal with that. We can do some of the same things that other animals also do. This is one of my interests, actually. You know, animal communication and how language relates to that. But animals’ options are limited. What they can do is basically just keep repeating what they did until they get some form of success. Or give up or not. That's it. I mean we can do that, too, but fortunately we can do more. And so our research program into what we call repair was precisely aimed at trying to find out how exactly we solved these kinds of breakdowns. And so in the course of that we discovered a couple interesting things, one of which was this little word “huh” which we found to be basically universal in spoken language. And that was totally unexpected for us as linguists. Doesn't make sense at all.
You know, in unrelated languages, you would have a word with a similar kind of function that also has a similar kind of sound. It doesn't work like that normally. As you as an interpreter well know. -- Yeah, exactly. I mean, even words like the ones you use when you hurt yourself, even those are a little bit different. Or sometimes very different. -- Yeah.
This word is, I mean, it's a nuanced story in the sense that languages do have their own version of this word. But I can put it like this: every spoken language that we've been able to check and that anybody else has been able to check has a simple mono-syllable with questioning intonation being used to ask the other to repeat what they just said. And so in some languages it might sound like “he?” - that's Dutch, my native language. I think German has... -- German is very similar. -- Yeah. Now in English, of course, it sounds a little more nasalized and with a less clear fricative at the beginning, so you don't quite go “he” but more like “uh”. In Spanish, the vowel is slightly different, so you go “eh”. But still, it's this mono-syllable with questioning intonation in all of those languages, so it's never in another part of the vowel space. You know, if you think about it as, you know, you can go “e” or “oo” or “ah”, and then in this region, there would be “eh”, “uh”… And so it's always in that region. It's never “e” or “oo”. And there's no particular reason that it couldn’t be, but it so happens never to be like that. So that was our first discovery. One of the most interesting ones to do with, you know, how people solve this universal problem. -- And the differences that you've seen between languages are linked to phonology, for example? So a Spaniard would be hard pressed to pronounce a “h” as we pronounce it. Those are the main differences? Yeah, the only ones as far as we can tell. So basically what you can say is this is a word that is always in that same part of the possibility space and it adjusts slightly within that space to the phonology of the language. That's it. It never, you know, there is no example that we know of where the vowel goes beyond that little “a” or “u” sound, for instance. And it so happens of course that most languages do have a vowel somewhere there and then you pick it, that’s basically how it works. 
The reason that we think this is, is what we call convergent cultural evolution. We think that across the world this is such a common communicative need that languages converge to find the same solution, essentially, to that need. And the need is not just “can I get you to say again what you just said” but we're under some pressure in conversation. You know, we take turns and we do so at quite an amazing pace. We do so very quickly, normally, and we also know that when we're a bit slower that that invites all sorts of implicatures. If you ask me what I'm doing tonight and I'm slow to reply, you already know that this is gonna be “I’m busy” or whatever. -- The pragmatics of... -- Yeah. So that pressure is always there. Because it's always there, when we didn't quite catch what the other one said in the first place, we need some quick way of indicating that. And that's precisely what this little word gives you. So it's basically the simplest possible question where it's really easy to plan. You basically just have to leave your mouth open, emit a little, you know, make it a syllable with the questioning intonation and off you go. -- So basically, what you're saying is that this was a conversion, I think you said, of languages. This is not just some leftover from an ancient proto-language? So that's the thing that we are at present of course not able to say. It's difficult to say. In fact, when the paper came out, some newspapers who were writing about it said: Oh, this is the oldest word, it's the most ancient word, the only word that we still have left from our... And it's just, we can't say anything about that. So it could be that, but it could also be that languages independently just converge on that same option. And one reason to think that it's at least partly that second possibility is that we can now see this happening.
Danish is an excellent example. In Danish, you do have something like “hm”. Danish is of course known as the language that minimizes everything, all the vowels smudged together. -- Oh yeah. -- So you know it makes kind of sense that what they have in the way of this interjection would be just a nasal like “hm”. But they also have another form which they write with four letters as H-V-A-D. So it's cognate with the Germanic “what”. And it sounds nothing like it, it sounds like this: “hve”. -- That’s so Danish! -- There's a tiny bit of labio-dental closure there which, you know, if you really want, you can hear it as “hvad”. But if you listen to Danes speaking casually, it just sounds like “hæ”. Now what's interesting there is that it seems that... the way I look at it is this is the case of the question word in Danish, you know, the one that we know from other Germanic languages to be like “vaht” or “vahs” - that question word getting caught up in this vortex of selective pressures wanting this interjection to be minimal. -- Super-efficient. -- And you know it’s being pulled into that same part of the possibility space where the other interjections of all the other languages already are. That's my reason for thinking it's not just that we are stuck with the oldest word, but even if we try to use a new word, it ends up getting pulled towards that same part of the space.
-- Transcribed text, is that what you use? Only transcribed audio recordings of informal conversations. So, you know, you can never ask somebody about these things because you can’t trust people's intuition. I can't trust my own intuition about this. Because these things are so much below our awareness. Ask anybody about whether they think “uh” or “um” is a word or whether they think “huh” is a word and they'll say, nah, these are just sounds and I don't use them anyway because they're impolite and so on. Whereas: record them in an everyday conversation with their friends and you see them or hear them using these words all the time. So that's what we did. We made recordings in field sites all over the world and traced. We didn't go listening for these sounds in particular. We instead looked for situations in which you can simply see things going wrong. So where, you know, you say something and then the other person does something, whatever it is. And then following that something you do your thing again. So you know you repeat yourself in response to something that the other person did. Yeah. Now having these two cases of you speaking and, in between, a person doing something else, we can look at all of the cases where that person does something and we can say these are all cases that somehow elicit repetition from you. So those are like what we call repair initiations. Those are cases where I ask you for clarification and then we can do a typology, we can compare them across languages, and that's how we found that one of the things you can reliably do in all those corpora of all those languages is use this simple syllable to elicit repetition.
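The search pattern Mark describes (you say something, the other person does something, you say your thing again) can be sketched as a small Python function over a transcript. This is a toy illustration, not the researchers' actual method: the turn format, the function names, the similarity measure (`difflib.SequenceMatcher` from the standard library) and the 0.8 threshold are all assumptions made here for the example.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, collapse whitespace, and drop surrounding punctuation."""
    return " ".join(text.lower().split()).strip(" .?!,")


def is_repetition(first: str, second: str, threshold: float = 0.8) -> bool:
    """Assumption: two utterances count as a repeat if they are near-identical."""
    ratio = SequenceMatcher(None, normalize(first), normalize(second)).ratio()
    return ratio >= threshold


def find_repair_initiations(turns):
    """Return the middle turns of speaker A / speaker B / speaker A repeats.

    `turns` is a list of (speaker, utterance) pairs in conversational order.
    """
    candidates = []
    for (s1, u1), (s2, u2), (s3, u3) in zip(turns, turns[1:], turns[2:]):
        if s1 == s3 and s1 != s2 and is_repetition(u1, u3):
            candidates.append(u2)
    return candidates


# Hypothetical toy transcript, invented for illustration.
toy_transcript = [
    ("A", "Are you coming tonight?"),
    ("B", "Huh?"),
    ("A", "Are you coming tonight?"),
    ("B", "Oh, yes."),
    ("A", "Great."),
]

if __name__ == "__main__":
    print(find_repair_initiations(toy_transcript))
```

The point of searching for the repetition rather than for a particular sound is the one made in the conversation: whatever sits in the middle slot is a candidate repair initiation, so the forms can be collected first and compared across languages afterwards.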
I would say it’s speech, I would say even it's a word yeah. Some people think it's not. -- “It’s just a sound!” -- But it doesn't make sense to me to say it's just a sound because you have to learn it.
We found that it's not very common. -- You mean the polite forms are not very common? -- Yeah, exactly. So even in the most, you know, even in English conversations or you know the other Western languages that we had in our sample, we know that these forms exist. But in informal conversations, they are highly infrequent. Instead, what you find is that when there are larger social asymmetries or when there are special work situations or more formal situations, that's when these things get mobilized. But in informal interaction, in the form that we use language most, these polite forms are the least frequent. They're there, but not in all languages, interestingly.
What's so nice about “huh” is it's extremely efficient. So I put very low demands on your time by using it. And it's a very efficient way of solving that problem. So in a way it's a good use of our time if I use the most efficient way that's available. What you do when you're being polite is you choose to convert some of the time that you might want to pay to efficiency; you can choose to convert it to politeness to show that you are doing more work, essentially, to be more polite.
-- Um… Um! There you go. -- Here I go. Michael wrote a book about it. -- Yeah exactly.
There's another cool thing that we found which was that in all languages you can do something special with that word to signify that you're doing something special with the meaning. So in particular what you can do is you can make it more exaggerated, so to say, so you can go “huh???” to signify that now it's not just that I didn't hear you. In fact I heard you quite well. Perfectly fine. But I'm doing surprised in a way, I'm acting surprised, and so I invite you now not to repeat it but just to say, yeah, isn't that amazing? And so on. And you know we know that use from American English and from Dutch. It's been documented in a few Western languages but it's kind of cool that we've found that in all of the languages in our sample. So no matter where you are, you can do the same thing, exaggerate the form to, in a way, exaggerate the meaning. That same kind of operation on the form to get a particular meaning is available in all those languages.
So there are several cues in simple dyadic interactions that people use. I'm sure you know all of them. The most obvious one is syntax. Does it sound like it's syntactically complete? But that's never a single cue - the interesting thing is that these cues are always cumulative. Usually, a single cue isn't enough so complete syntax in itself is just one sort of point in favor of “this may be a possible place for a turn transition”. -- So for example when it's a question you would go up? Yeah. -- But there could be a second question after that! Exactly. Yeah. But that would be two cues already; in a way, you can tell something from the syntax, you know, the word order in an English question is different from that in a statement. So that's one of the cues. The other cue is intonation - does it indeed go up then. OK. More evidence it's a question and therefore now a response is required. And then there are things like - and I'm wondering whether this holds up in this scenario and I don't think fully - there are things like gaze patterns which are actually hugely important. So what people do in dyadic interactions is, when they are close to finishing their turn and it is something like a question, where, you know, the next speaker is obviously selected as next speaker, then they'll turn their gaze to that speaker. And then you know “this is my cue” and “now I go”. And if that lines up with the syntax and the intonation and other available cues, then clearly you can go. And I'm wondering how much of that is available in this because… essentially, all of the turns are pulled apart in this four-person scenario certainly. Let's keep with Trump and Putin just because it’s easy. Is Trump actually, when asking his question and nearing the end of it, is he gonna look at Putin? And if he is, then that is the cue for the interpreter to say “okay, now I know it's basically the end of the turn and so I can start translating.”
Well, this was you know a great sort of conversation. -- Yeah! -- About your work and a little bit about my work. And lots of follow-up questions. Michael: Well, both of you are remaining in Europe. So I encourage you to stay in touch. -- We might actually do that, yeah.
One person that Michael worked closely with was Charlotte Horn, then the Public Outreach Officer at MPI and a dyed-in-the-wool Dutch woman:
Michael: A couple of years ago, she spent 7 months living in New York City; she worked as a bike courier, a job she got as soon as she arrived. “How did you think you could work as a courier if you didn’t know the city?” I asked. She shrugged. “I’m Dutch. Dutch people figure, if it involves a bike, they can do it.”
Michael: Charlotte and I spent a lot of time together. Actually, the first day that I was here, it’s 9:00 in the morning: [knocks three times] “You're here!” [laughter] Charlotte: I am doing PR and communications for the institute. So, yeah, I'm not a researcher. Alex: And were you the one who brought Michael on board? Charlotte: Well, that was Simon Fisher, who is a director here. Michael: Just say yes! Charlotte: Yes, I did that! Yes/no! [laughter] Yeah. And obviously we worked together. Just back and forth a little bit with some outreach things or press releases or stories or that kind of thing. Alex: And your background is in communication or in science? Charlotte: Neurobiology! Alex: And you then decided at some point that you're more interested in the communication aspect of the work? Charlotte: Yeah, the translating of that work. Alex: Translating as in… Charlotte: … making it understandable. Not a different language. Alex: It is a way of translation. Michael: You do work in both English and in Dutch. Charlotte: I do. I work also in German because the Max Planck Gesellschaft obviously is German and they communicate all their work in German. Alex: Only in German? Charlotte: Mostly in German. Alex: Interesting.
Charlotte: I started working here and there was no one before me that did communication. So I was phoned a lot, which was really great, by a lot of people. It was a very busy job. So I just got all of these requests like: We have this, this is cool, this is cool. And so I went in a lot and talked to people and I saw the experiment and at some point you know - and this is what we spoke about as well - what is the institute actually doing. That it's actually the only place in the world where we do only language research all in one building on so many levels. And that is sort of your selling point. There's a lot of really interesting work here, obviously. Alex: Yeah, I was just saying to Michael, you could pop into any office and you could have a three-hour conversation which would be fascinating. Charlotte: Everyone’s so driven and open. Michael: I should’ve done that actually, just have, like, a random number generator. And just spun a wheel and then every day: Well, let's see what's happening in 17. Alex: Because even if you're here for a year, you don't get to meet everyone necessarily. Michael: No. I mean there's just so much. You realize at the beginning, like, oh my God, I’m in a candy store. And by November, you go: I’m so tired of candy. I can’t eat any more candy. So you gotta step back and digest it a little bit.