VoxDB Contributors' Guide
Photographs are generally how we recognize and connect with culturally significant individuals. Finding pictures of people online is easy, and the personal and pedagogical value of these representations is enormous.
It is also possible, of course, to make these same connections auditorily: by listening to a person's voice. Unfortunately, this important means of identifying people and forming connections with them plays at best a secondary role for most people. Distracted by the easy availability of visual images, we underestimate the tremendous emotional and informational power of listening to the sound of another human being's voice.
The goal of VoxDB is to re-prioritize this resource. Wherever an image exists of a culturally significant figure born after the advent of recording technology, VoxDB seeks to provide a clear, universally accessible recording of that person speaking.
We are creating a searchable online database containing short samples of voices of current and historical figures such as scientists, politicians, authors, actors, religious leaders, artists, musicians, and elected officials. It will be an "auditory photo gallery", providing the acoustic equivalent of the thumbnail visual image which is so omnipresent online.
We are at the very beginning stage of this project, gathering the voice clips which will populate the database. The VoxDB collection will naturally reflect the tastes, interests and abilities of those who contribute to it. You have the chance to help shape this collection, adding your touch to this rapidly emerging internet resource.
Note to Students
If you are a student participating in the VoxDB project as part of a service learning course requirement, your instructor will determine at which stage you will participate, which clips you may contribute and how many you will contribute. Although we will assist in any way we can, your instructor is the sole determiner of your course grade. VoxDB reserves the right to decline the offer of particular clips or groups of clips for any reason; this is independent of any grade you may receive in any course at Bowling Green State University or other institution. Your instructor may have added requirements for you, but what follows is the basic process for contributing to VoxDB.
There are three stages to the clip contribution process: (1) location, (2) collection and editing, and (3) verification.
Stage 1: Location
A. Choosing your Contribution
Ensure that the person you are considering is appropriate and workable.
- Verify that your candidates were alive during a time when recording technology existed: no historical reenactments, role plays, or similar theatrical endeavors are useful here.
- VoxDB does not discriminate on the basis of ability, political affiliation, ethnicity, race, gender, national origin, sexual identity or preference, religious belief or affiliation, age or reason for which the individual came to be in the public arena. However, we can only accept voice clips from reasonably well-known figures, and out of consideration for the rights of private citizens we reserve the right to decline to include particular individuals who do not meet this requirement.
- Verify that no one else is planning to collect (or has already collected) samples from your candidate(s). Visit the wiki at:
If your choices are not already taken, log in and put them on the list along with your name and the course (if any) you are in.
- Once that is done, you are responsible for that voice and you may begin.
B. Finding the Best Source
Use your favorite search engine to locate Internet sites that contain the voice you are looking for.
To find the best source, you will have to listen to many possibilities and discard most of them. Do not be satisfied with the first one you find. We are looking for the highest quality recording from the most reliable source, and this will take some searching.
Here are things for you to consider as you search:
It is essential that your sampled voice belongs to the person you think it belongs to. To ensure this:
- You may not use previously clipped files such as samples from "Entertonement" or "The Daily WAV".
- Consider only reputable news sites such as NPR or the BBC.
- Locate the date each recording took place and as much information as you can about the circumstances under which the recording occurred, e.g., place, interviewer, occasion for the interview. This information may be on the website or may be spoken in the recording itself.
- Listen for the person's name in the recording. The interviewer may address the person by name or introduce him or her by name to the listening audience. If you are looking for a clip of a newscaster's voice, check the beginning or end of the recording where the newscaster introduces himself or herself.
- Email firstname.lastname@example.org if you have questions about reliability.
Just as you would not want to look at a blurry picture, VoxDB cannot use muffled, noisy or otherwise distorted sound where better samples exist. For very well-known people in the 21st or late 20th century, high-quality sound recordings will be readily available. For less well-known figures or those from the early or middle 20th century, there will be fewer choices and thus we may have to be satisfied with lower sound quality.
Listening to the examples provided at http://www.VoxDB.org/training.html (preferably several times on different occasions) will help you to develop a sense of the sound quality you should be aiming for.
- Avoid muffled, echoey, fuzzy or otherwise distorted speech.
- Avoid samples that are recorded too loudly or too quietly.
- Avoid samples with background noise.
Be aware that the sound quality of even well-recorded press conferences is often compromised by the sounds of many cameras being used simultaneously.
- Unless there is no other option, do not choose recordings made over the telephone.
- Email email@example.com if you have questions about sound quality, and remember the samples available at http://www.VoxDB.org/training.html.
C. Selecting Excerpts from within the Interview: content and other considerations
Once you have chosen a reliable source with good sound quality, you will locate three usable portions of that source.
For each voice, you need three separate clips: one which is 8-10 seconds in length and two that are 3-4 seconds long.
It is not essential that the lengths of the clips are exact. We are aiming for a group of clips that are subjectively (not precisely) "equivalent" in length.
You will identify usable portions by recording the time each begins and ends. For example, you might find a usable 10-second portion which begins at 4 minutes 13 seconds from the beginning of the interview and lasts until 4 minutes 23 seconds.
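The timestamp bookkeeping described above can be sketched in Python. This is only an illustration under stated assumptions: the function names, the "M:SS" timestamp notation, and the one-second tolerance are hypothetical and are not part of any VoxDB tool.

```python
def to_seconds(timestamp: str) -> int:
    """Convert 'M:SS' or 'H:MM:SS' (assumed notation) into seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def clip_length(start: str, end: str) -> int:
    """Length in seconds of the passage between two timestamps."""
    length = to_seconds(end) - to_seconds(start)
    if length <= 0:
        raise ValueError("end must come after start")
    return length


def clip_kind(length: float, tolerance: float = 1.0) -> str:
    """Classify a passage as 'long' (8-10 s) or 'short' (3-4 s).

    The tolerance is a hypothetical allowance reflecting that clip
    lengths do not have to be exact; anything else is 'unsuitable'.
    """
    if 8 - tolerance <= length <= 10 + tolerance:
        return "long"
    if 3 - tolerance <= length <= 4 + tolerance:
        return "short"
    return "unsuitable"
```

For the passage in the example above, clip_length("4:13", "4:23") gives 10 seconds, which clip_kind classifies as "long".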
- Listen to the entire interview and keep track of the elapsed time as you go.
- The content of the clips (what the individual is saying) should be characteristic of that person's speech so that it does not distract from the sound of the voice. Just as a driver's license or passport photo is intended to capture the subject "looking natural," the VoxDB voice collector listens for a passage that is generally characteristic of that person's usual manner of speaking. Do not deliberately seek out startling or embarrassing utterances.
- Look for speech that does not contain "adult" language or deliberately inflammatory material.
- The speech sample should consist primarily of declarative sentences with normal intonation (rather than yes-no questions, for example), and should be spontaneous -- or at least sound spontaneous and natural, as opposed to "lines" obviously being read or recited. You might make an exception to this rule if the person is a newscaster or someone else who is usually heard reading.
- Do not choose a sample in which two parties' voices (interviewer and candidate for example) are audible.
- Do not choose a passage in which the person mentions his or her own name.
- The beginning and end of your clip should not draw attention to themselves: Begin your sample at the start of a sentence and end at a syntactic boundary (the end of a phrase or sentence). The speaker should not sound as though he or she has been cut off or interrupted. You are looking for a complete thought.
- Email firstname.lastname@example.org if you have questions about clip selection, and remember the samples available at http://www.VoxDB.org/training.html.
Summary of stage 1:
For each person, you should have:
- The URL of the high-quality, reliable source.
- The start and finish times of a coherent 8-10 second passage of speech.
- The start and finish times of two additional coherent 3-4 second passages of speech.
With this information in hand, you have completed the Location Stage. If you are not collecting the sample yourself, skip to the last part of this document: Turning in your Work.
Stage 2: Clip Collection
If you are on BGSU's campus, the best place to work is most likely the Language Learning Center (LLC), 303 University Hall. Computers at the LLC have all the software you need already installed (including Audacity and the LAME mp3 encoder), and LLC personnel can lend you the headphones and patch cord you need. The following instructions assume that you are working at the LLC, but you may also choose to gather your own headphones and patch cord, install the software on your own computer, and work from there. However, the techniques described below may not work on all computers. (Important: the LLC computers are equipped with headphones with a USB plug. To make this procedure work, however, you must disconnect those headphones, and instead use headphones with a standard 3.5mm (1/8-inch) plug.)
You will need two kinds of software: a sound editor and an application (or browser plug-in) to play audio.
We recommend the use of Audacity, a free, open source application for recording and editing sounds. It is available for Mac OS X, Microsoft Windows, GNU/Linux, and other operating systems. You can install the latest version from http://www.audacity.sourceforge.net.
Most computers should already have the software to play sound files installed. If you have trouble with other computers, work at the LLC.
A. Step by Step Instructions for Recording Audio
We will shortly be including here keyboard equivalents to Audacity commands and other tips for users of screen readers.
- Plug one end of the patch cord into the microphone jack, and plug the "1/8-inch" headphones into the headphone jack. Go to the Apple symbol in the top left corner and select "System Preferences". From there choose the "Sound" option. Under "Output" select "Built-in output" and then under "Input" select "Audio line-in port".
- Open the Audacity program, go to the "Audacity" menu, and select "Preferences". Make sure that the "Audio I/O" tab is selected, and from there make sure the Playback Device is set to "Built-in Output" and the Recording Device is set to "Built-in Line Input".
- Go to the URL that has the audio that you want to record. In the application that will play the sound, cue the audio to a point slightly (about three seconds) before the content that you need.
- Remove the headphone cord from the headphone jack and insert the free end of the patch cord into the headphone jack. (With this arrangement, the playback will be recorded directly into Audacity, bypassing the microphone.)
- In the Audacity program, re-verify that the preferences are set as mentioned above ("Built-in Output" and "Built-in Line Input").
- Press "record" in Audacity. You should see a flat line begin moving across the screen. This means Audacity is recording. If the line is not flat, it means that external noises are being recorded. To fix that problem check the input settings again.
- After you verify that Audacity is recording properly, begin playing the audio sample. Since you won't be able to listen as you record, pay attention to the playback time to ensure that you've recorded enough audio. While the audio is playing, you should see a sound wave appear in the Audacity window. Ideally, the peaks of the sound wave should almost fill the entire height of the window. If they don't, you might need to re-record the sample at a higher playback level. (If the playback level is already at maximum, you may also need to increase Audacity's sound input level.) However, if the sound wave touches the top or bottom of the window, you might need to re-record at a lower input level.
- When the desired amount has been recorded, stop the playback and stop recording in Audacity. Unplug the patch cord from the headphone jack and plug in the headphones once again. You are now ready to edit the voice clip.
- Before you begin editing the clip, save it in its original form. If you have recorded one 30-second clip to start with, it may have all the content you need for all three clips. In this case, you save the original unedited version once. All clips must be saved in mp3 format (44100 Hz sample rate, 128kbps bit rate, which should be Audacity's default settings - if in doubt, check the "quality" and "file format" tabs in the Audacity Preferences menu).
To save, go to Audacity's "file" menu and select "Export as MP3". To avoid future confusion, title the file with the person's name, length of the clip and an indication that this is your first version, for example
B. Step by Step Instructions for Editing Audio
Although the maximum sound quality of your clip is determined by the original source (i.e., it is not likely that you can improve on the sound quality), it is quite possible to diminish the quality with sloppy editing. Listen carefully at each stage, and don't be afraid to start over.
- Once you have the audio recorded into Audacity, a ten-second or a three-to-four-second chunk needs to be excerpted. Highlight unwanted portions of the sound wave and use the "cut" command or press the backspace key.
- Remember that the clip should end at a syntactically permissible point such as the end of a sentence or clause. The end of the clip should not seem startling.
- After you have a clip of the appropriate length, highlight the entire clip, go to the "Effect" menu, and click "Normalize". This ensures that the loudest parts of all the clips in the collection will be approximately equal in volume.
- To avoid an overly loud beginning to the clip and an abrupt finish, use the fade-in and fade-out effects. Highlight the first 0.1 (one-tenth) second of the clip and under "Effect" select "Fade In". Similarly, highlight the last 0.1 second, and select "Fade Out".
- Select "Export as MP3", and save your edited file under a new name such as
- IMPORTANT: Remember to document the date the candidate was recorded (not the date the recording was placed online or made available for sale on a CD). You also need to document the URL the audio came from and where in that URL's audio you took your sample, by noting the minute and second mark at which you began recording.
- Play back your edited clip, and listen carefully and critically. Again, sound quality is vitally important. For examples of unacceptable clips go to http://www.VoxDB.org/training.html. If your clip sounds like one of the unacceptable examples, you will (at least) need to re-edit your sample; you will probably also have to re-record it, and you may even need to find a better source recording.
When you have finished three clips per person (two short ones and one long one), and are sure that they meet the sound quality and other criteria, you are ready to submit your work. Someone else will verify your work (Stage 3).
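To make the "Normalize" and fade steps in the editing instructions above more concrete, here is a minimal Python sketch of what such operations do to a list of audio samples. It is an illustration only, not Audacity's actual implementation: the function names, the full-scale peak of 1.0, and the linear fade shape are assumptions made for this example.

```python
def normalize(samples, peak=1.0):
    """Scale samples so the loudest one reaches `peak` (assumed full scale)."""
    loudest = max((abs(s) for s in samples), default=0.0)
    if loudest == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = peak / loudest
    return [s * gain for s in samples]


def fade(samples, sample_rate=44100, fade_seconds=0.1):
    """Apply a linear fade-in and fade-out of `fade_seconds` each.

    The linear ramp is an assumption for illustration; the fade never
    covers more than half of the clip on either side.
    """
    out = list(samples)
    n = min(round(sample_rate * fade_seconds), len(out) // 2)
    for i in range(n):
        gain = i / n
        out[i] *= gain        # ramp up from silence at the start
        out[-1 - i] *= gain   # ramp down to silence at the end
    return out
```

Normalizing every clip to the same peak is what makes the loudest parts of different clips comparable, and the short fades keep the first and last samples at zero so the clip does not start or stop with a click.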
Stage 3: Verification
In addition to ensuring that the submitted clips meet VoxDB standards for content, quality, and reliability, a third party will verify that the URL is correct for the given clips, that the dates are correctly notated and that the speaker is correctly identified. The clip will additionally be evaluated for sound quality. Clips may be rejected at this point or recommendations made for re-editing or re-recording.
Go to www.VoxDB.org/submit/ [not yet available] and fill in the form provided. We will keep your personal information confidential, and we will use it only to assist your instructor in record keeping or to contact you should there be problems with your submission.
With each set of clips you submit, VoxDB requires:
- your name (e.g. Robert P. Mueller)
- your email address (e.g. email@example.com)
- the class in which you participated if any (e.g. ENG 290)
- the date of submission of the clip (e.g. February 10, 2009)
- the name of the candidate (e.g. Bob Newhart)
- his/her place of birth (e.g. Oak Park, Illinois, USA)
- his/her year of birth (e.g. 1929)
- the source of the clip, such as the URL where the online interview is archived (e.g. www.npr.org/bob/)
- the date the candidate was recorded (not the date the recording was placed online or made available for sale on a CD)
- a brief description of the speaker's accomplishments or status which makes him or her a candidate for inclusion in VoxDB, e.g., book(s) written, office(s) held, membership(s) on sports team(s), or other accomplishments
- We will also ask you to certify that the mp3s you are submitting are, to the best of your knowledge, truly recordings of the people you say they are, and that you are contributing them in good faith to be used as part of the VoxDB project.
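The required items listed above can be pictured as a single record. The following Python sketch is hypothetical: the class name, field names, and the missing_fields helper are assumptions for illustration and do not describe the actual VoxDB submission form.

```python
from dataclasses import dataclass, fields


@dataclass
class ClipSubmission:
    """One set of clips, mirroring the list of required items (hypothetical)."""
    contributor_name: str
    contributor_email: str
    course: str              # optional: may be left empty
    submission_date: str
    candidate_name: str
    birthplace: str
    birth_year: int
    source_url: str
    recording_date: str      # when the candidate was recorded
    description: str         # why the speaker qualifies for inclusion
    certified: bool = False  # good-faith certification


def missing_fields(sub: ClipSubmission) -> list:
    """Return the names of required items that are still empty."""
    optional = {"course"}
    missing = [
        f.name
        for f in fields(sub)
        if f.name not in optional and getattr(sub, f.name) in ("", None)
    ]
    if not sub.certified:
        missing.append("certified")
    return missing


# Example values taken from the list above; the recording date and
# description are left blank because the guide gives no example for them.
draft = ClipSubmission(
    contributor_name="Robert P. Mueller",
    contributor_email="email@example.com",
    course="ENG 290",
    submission_date="February 10, 2009",
    candidate_name="Bob Newhart",
    birthplace="Oak Park, Illinois, USA",
    birth_year=1929,
    source_url="www.npr.org/bob/",
    recording_date="",
    description="",
)
print(missing_fields(draft))
```

Running a check like this before submitting makes it easy to spot an item that has not been documented yet, such as the recording date.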
Thank you for your participation. We hope it will be interesting and rewarding. Please contact us at firstname.lastname@example.org if you have questions or if we can be of help to you in any way.