
    What happens when you re-cycle automatically transcribed text…10 times?

I recently discovered that videos uploaded to YouTube are automatically transcribed (if they’re in English).  As you might guess, the transcriptions are not perfect, so there will be a discrepancy between what the speaker actually said and what is transcribed.  This is essentially all you need to run an iterated learning experiment (e.g. Kirby, Cornish & Smith, 2008).  Iterated learning is a process of repeatedly transmitting a signal through a bottleneck.  For instance, language is transmitted from adults to children, who learn its rules.  These children then go on to transmit this language to their own children.[…]
Here’s how iterated learning with YouTube works:
1. Record yourself saying something.
2. Upload the video to YouTube.
3. Let it be automatically transcribed (usually takes about 10 minutes for a short video).
4. Record yourself saying the text from the automatic transcription.
5. Go to 2.
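
The loop above can be roughed out in code. This is only a toy sketch: it does not talk to YouTube at all. The `noisy_transcribe` function, the `VOCAB` list and the `error_rate` parameter are all made-up stand-ins for the "speak it, let it be transcribed" step, assuming each word has some independent chance of being misheard.

```python
import random

# Hypothetical word list the toy "transcriber" mishears words as.
VOCAB = ["the", "a", "language", "learning", "video", "cat", "signal", "ten"]


def noisy_transcribe(text: str, error_rate: float, rng: random.Random) -> str:
    """Toy stand-in for automatic captioning (not YouTube's real behaviour).

    Each word is independently swapped for a random VOCAB word with
    probability `error_rate`; otherwise it passes through unchanged.
    """
    words = []
    for word in text.split():
        if rng.random() < error_rate:
            words.append(rng.choice(VOCAB))
        else:
            words.append(word)
    return " ".join(words)


def iterate(text: str, generations: int, error_rate: float = 0.2, seed: int = 0) -> list[str]:
    """Feed each transcription back in as the next input (steps 2 to 5)."""
    rng = random.Random(seed)
    chain = [text]  # generation 0 is what was originally said
    for _ in range(generations):
        chain.append(noisy_transcribe(chain[-1], error_rate, rng))
    return chain


def words_changed(original: str, later: str) -> int:
    """Count positions where the word differs from the original.

    Works here because the toy channel never adds or drops words.
    """
    return sum(a != b for a, b in zip(original.split(), later.split()))


if __name__ == "__main__":
    chain = iterate("iterated learning transmits a signal through a bottleneck", 10)
    for generation, text in enumerate(chain):
        print(generation, words_changed(chain[0], text), text)
```

Running it prints each generation next to the number of words that no longer match the original. Real captioning errors are not independent word swaps, so the numbers mean nothing beyond showing how the errors pile up from one round to the next.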

YouTube video of the results: 

Read more about this experiment from Replicated Typo. 
This reminds me of sites like MultiBabel or Bad Translator that feed a text through multiple machine translators to see how much gibberish results. Some interesting modifications of the YouTube captions experiment would be to use speakers with different accents or various text-to-speech programs. Or maybe combine translation and transcription.
Edit: I should perhaps clarify that the video and image are also by Replicated Typo. 


     