
The BrailleSC project currently has almost 30 oral histories on video, some of them already transcribed but most of them not. For transcription work, we’ve been using Amazon’s Mechanical Turk service, which is certainly reliable, affordable and efficient (as I wrote here), but it’s not exactly ethical (as I wrote here). Is there another solution? What about crowdsourcing? Could we, in other words, make the task of transcription available to an army of potential volunteers from across the Internet and get good results?
Plenty of online projects rely on crowdsourced work with (mostly) good results: see, for instance, LibriVox, Project Gutenberg, and Wikipedia. And recently, George Mason’s Center for History and New Media was awarded an NEH Digital Humanities Start-Up Grant “to support the design and development of a tool for crowdsourcing documentary transcription”:
The $49,215 award will enable CHNM’s dev team to build an open source tool to enable researchers to contribute document transcriptions and research notes to digital archival projects, thus harnessing the power of the community of users to improve the discoverability and usefulness of the archive.
That’s a very exciting project, and I look forward to seeing the results. But what about crowdsourcing audio transcription?
Consider this an invitation to participate in an informal experiment in volunteer transcription work. Here’s the question we’d like to answer:
Can a project involving audio or video recordings of spoken words rely on volunteers for transcription of interviews broken up across short clips?
Transcriptions will allow for various forms of textual analysis and re-use of the interviews, and transcriptions will also aid in creating captions to accompany the videos, which will make them accessible to users with hearing impairment.
What I’ve done is take one interview and break it up into 2-minute clips. I chose that length somewhat (but not completely) arbitrarily. Each clip is hosted on YouTube, where you may view it while transcribing. And while it’s true that YouTube has added automatic captioning to its videos, these captions don’t always work, and when they do, their accuracy leaves something to be desired.
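For anyone curious how an interview might be divided this way, here is a minimal sketch in Python that works out where each clip would start and end. It is only an illustration, not the process we actually used: the function names and the five-minute example recording are hypothetical, and the clip length defaults to the 2 minutes mentioned above.

```python
def clip_boundaries(total_seconds, clip_seconds=120):
    """Return (start, end) pairs, in seconds, that cover the whole recording.

    clip_seconds defaults to 120 (2 minutes); the final clip may be shorter.
    """
    if total_seconds < 0 or clip_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and clip_seconds > 0")
    boundaries = []
    start = 0
    while start < total_seconds:
        end = min(start + clip_seconds, total_seconds)
        boundaries.append((start, end))
        start = end
    return boundaries


def format_timestamp(seconds):
    """Format a number of seconds as HH:MM:SS for labeling clips."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    # Hypothetical five-minute recording (300 seconds).
    for number, (start, end) in enumerate(clip_boundaries(300), start=1):
        print(f"Clip {number}: {format_timestamp(start)} to {format_timestamp(end)}")
```

The start and end times could then be handed to whatever video editor does the actual cutting. Keeping the clip length as a parameter makes it easy to try other lengths if 2 minutes turns out to be too long or too short for volunteers.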
Here are the details:
- Unless you instruct us otherwise, we will credit you by name on the web site for your work.
- All of the materials we produce at BrailleSC will be published with a Creative Commons license allowing others to make use of them under certain conditions (Attribution-Noncommercial-Share Alike), so your work could potentially benefit many projects (if any other projects take our materials and work with them, that is).
- To volunteer, go to this page and follow the directions.
Any questions or comments about this process (or about the challenge of transcribing audio)? Please leave them below.
Thanks!
[Creative Commons-licensed flickr photo by Beverly & Pack]