Please return any job that has blank audio, is mostly unintelligible, has only music, and/or is 100% foreign language and send an email to worker@speechpad.com.
Contacting any Speechpad customer or anyone referenced in an audio/video file under any circumstance is a violation of your Confidentiality Qualification and will result in immediate termination of your account.
Please contact worker@speechpad.com with any questions or concerns about your account, your ratings, or files. We're always here to help.
This is the Web Captions style guide. Note the highlighted sections that indicate major differences between captions and the general style guide.
You are expected to produce a highly accurate transcription which is client ready.
Transcripts should be non-verbatim (clean copy) with all "umm"s, "ahh"s, false starts, and stutters omitted unless requested by the customer.
Do not paraphrase or rewrite what is said.
Do not leave out conversations or other material unless requested by the customer. If there is a video or audio presentation that is audible, please type it.
Promotional ads and commercials should be transcribed unless they are not relevant to the overall file. If in doubt, include the promotional ads and/or commercials.
Do not type descriptive tags or sound effects (e.g. [laughter], [cough], [doorbell rings]).
You can install Grammarly for your browser for free. It helps with spelling and punctuation. No program will detect all of your mistakes so it’s always a good idea to re-review your work before you submit it.
Here's a great resource for fine-tuning your grammar with practice exercises.
Always use U.S. English spelling and grammar in all transcripts unless requested specifically in the special instructions.
Slang terms should not be used (e.g. "gonna," "wanna," "shoulda"). Type "going to," "want to," "should have," etc.
You may start a sentence with a conjunction (e.g. And, But, Or, Yet, So, etc.) including commas.
Look up all names and acronyms to make sure they are correct. Reference the video (on-screen text) to confirm spelling/capitalization if applicable.
"Because" should always be typed out. Never use 'cuz, 'coz, or 'cause.
"All right" should always be used. Never use "alright."
"etc." should always be used. Never use "et cetera."
In ranges and ratios, always type out "to." Never use a "-" (hyphen) or “:” (colon).
[SP] should be used after words, places, and names that you are not sure how to spell. Use [SP] only after the first occurrence of the word, name, or place in the transcript. [SP] is not an appropriate substitute for [inaudible].
Use all lowercase letters for websites and type the website as you would when searching for it.
Example:
If the speaker says "S-P-E-E-C-H-P-A-D-DOT-COM" type it as speechpad.com
When someone spells a word, type the letters in uppercase, separated by dashes. Separate the complete word from the spell-out with a comma.
Example:
- [Joe] His name was Bobby, B-O-B-B-Y.
When a hyphenated name is the subject of a spell-out, the words "dash," or "hyphen," should be typed out for clarity.
Example:
- [Joe] S-M-I-T-H-dash-J-O-N-E-S
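The spell-out rules above can be sketched in a few lines of Python. This is only an illustration, not a Speechpad tool: the function names `spell_out` and `word_with_spell_out` are hypothetical, and the sketch assumes the word contains only letters, digits, and hyphens.

```python
def spell_out(word: str) -> str:
    """Render a word the way a spoken spell-out is typed."""
    parts = []
    for ch in word:
        if ch == "-":
            # A hyphen inside the name is typed out as the word "dash".
            parts.append("dash")
        elif ch.isalnum():
            # Every spelled letter is uppercase.
            parts.append(ch.upper())
    # Letters are separated by dashes.
    return "-".join(parts)


def word_with_spell_out(word: str) -> str:
    """Complete word, a comma, then the uppercase spell-out."""
    return f"{word}, {spell_out(word)}"


print(word_with_spell_out("Bobby"))  # Bobby, B-O-B-B-Y
print(spell_out("Smith-Jones"))      # S-M-I-T-H-dash-J-O-N-E-S
```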
Filler words such as "like, you know, kind of, sort of, I mean" must be included unless they are used three or more times per sentence and they are not necessary to the meaning of the sentence.
Note that not all instances of "like," "you know," "kind of," etc. are, in fact, fillers.
Examples where fillers should be removed:
- [Student 1] Like, I'm gonna, like, get my, like, hair done, like, tomorrow.
- [Student 2] Oh my gosh, you know, I shouldn't have, like, you know, said that, you know?
Examples where fillers should not be removed:
- [Student 1] Like, you know, it’s sort of cloudy out.
- [Student 2] "I mean, like, sometimes fillers just shouldn’t be removed, you know?"
Examples of fillers and not fillers.
The story kind of went like that, but not exactly.
Like, you know, it’s sort of cloudy out.
Do not transcribe interjections such as "okay," "yeah," "uh-huh," etc. if they are not answers to questions and are only interrupting the previous speaker.
Example incorrect:
- [Joe] I was walking down the road…
- [Mike] Uh-huh.
- ...and saw a cat.
Example correct:
- [Joe] I was walking down the road and saw a cat.
Use a single space after end punctuation.
As per U.S. grammar, all punctuation happens inside quotation marks.
Transcripts should abide by U.S. English grammar, spelling and punctuation rules unless requested otherwise.
Useful guide for brushing up on comma rules.
http://grammar.ccc.commnet.edu/grammar/commas.htm
Some commas are optional and including or removing them is subjective. Adding or removing optional commas by the reviewer has a negligible effect on your overall rating for a file.
You may start a sentence with a conjunction (e.g. And, But, Or, Yet, So, etc.) including commas.
Oxford/serial comma is allowed. (Grammarly’s explanation of the Oxford comma below)
Colons: should only be used when you are typing a list of items within the sentence or you’re separating two clauses of which the second expands or illustrates the first.
Semicolons; should not be used in Speechpad transcripts (due to frequency of misuse vs. correct use).
Use ellipses ... (no space before or after) when there is a long thinking pause, interruption in the sentence, or if a speaker trails off and doesn’t complete a sentence.
Example:
Then we decided to go see Harry and that was…
Is this where we look at his...his illustration about the talking fish?
Use ellipses ... when there is a change of thought in mid-sentence.
Do not put a space before or after the ellipses if the speaker continues the same thought and no new sentence is started.
Example:
He was going to...well, maybe not.
Put a space after the ellipses if a new thought and a new sentence are started.
Example:
He was going to... Hey, get that camera out of my face.
Microsoft Word auto-formats to “curled” or “smart” quotes by default. To prevent our system marking them wrong, turn them off. Here’s how in the most recent versions of Word:
Common abbreviations should be typed as such:
Examples:
United States = U.S. not US
United Kingdom = UK
United Nations = UN
Do not abbreviate state names when said in full.
If someone says "Mass," meaning Massachusetts, typing Mass is correct.
Type out “versus.” Do not use “vs.”
In legal files, use v. if “v” is spoken instead of “versus”
Example:
Roe v. Wade
Type “etc.” Do not use "et cetera."
Type "i.e." Do not use "id est."
Type junior or senior unless it’s part of a person’s name.
Examples:
He’s a junior member of the team.
His name is Robert Downey Jr., but please don’t call him just Junior.
In ranges and ratios, always type out "to." Never use a "-" (hyphen) or “:” (colon).
Type “Okay.” Do not use "Ok" or "OK."
**see exception for technical files**
Do not abbreviate measurements.
Examples:
Ounce, not oz.
Millimeters, not mm
Pound, not lb.
1920 by 1080, not 1920 x 1080
Only use the pound sign ( # ) when someone says "hashtag" and is referring to a specific hashtag.
Examples:
#speechpad
#isapoundsign
#lookslikeasharp
Contractions should be typed as spoken.
Be sure to use the following pairings correctly when transcribing or reviewing. If they are typed or reviewed incorrectly three times, your qualifications will be revoked:
"Your" (possessive) vs. "You're" (short for you are)
"Its" (possessive) vs. "It's" (short for it is)
"There" (directional) vs. "Their" (possessive) vs. "They're" (short for they are)
"We’re" (short for we are) vs. "were" (past/plural)
"We’ll" (short for we will) vs. "will" (verb)
You must include timestamps every 30 seconds. For foreign language transcription or translation, timestamps should be included every 15 seconds.
Timestamps must coincide with the running length of the video. Do not use burned-in timecodes if they are on the video.
Timestamps can appear in the middle of sentences.
Example:
Today I'm talking with Sean Platt. I think you [00:02:00] were in your late 20s when you decided to pursue writing, is that right? [00:02:32]
You may find it useful to use timestamps as "paragraph breaks." Paragraph breaks are not needed in captions transcriptions unless otherwise specified, but may help with readability for proofreading.
Do not put punctuation immediately before or after timestamps.
Incorrect Example:
[00:00:30].
Never round timestamps up or down. If the timer says [00:01:58], do not write [00:02:00].
Never include a [00:00:00] timestamp.
Never include an end of file timestamp.
Use format [hh:mm:ss] or the Insert Timestamp button on the Speechpad console.
Files must pass validation in order to submit. Follow any error message instructions to resolve.
A common validation error is timestamps that are out of order (e.g. 5:00 before 4:30).
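If you want to check a draft yourself before submitting, several of the timestamp rules above can be sketched as a small script. This is not the Speechpad validator, and the console may check more or different things. The function name `timestamp_problems` and the messages are hypothetical, and the sketch only covers three rules: no [00:00:00] timestamp, timestamps in order, and no punctuation touching a timestamp.

```python
import re

# Matches a timestamp written as [hh:mm:ss].
TIMESTAMP = re.compile(r"\[(\d{2}):(\d{2}):(\d{2})\]")
# A set, so the empty string at either end of the text never matches.
PUNCTUATION = set(".,!?;:")


def timestamp_problems(transcript: str) -> list[str]:
    """Return a description of every timestamp rule the text breaks."""
    problems = []
    latest = -1  # largest timestamp seen so far, in seconds
    for match in TIMESTAMP.finditer(transcript):
        stamp = match.group(0)
        hours, minutes, seconds = (int(part) for part in match.groups())
        total = hours * 3600 + minutes * 60 + seconds
        if total == 0:
            problems.append(f"{stamp} is never included")
        if latest > total:
            problems.append(f"{stamp} is out of order")
        # Look at the characters directly next to the timestamp.
        before = transcript[match.start() - 1] if match.start() > 0 else ""
        after = transcript[match.end()] if match.end() != len(transcript) else ""
        if before in PUNCTUATION or after in PUNCTUATION:
            problems.append(f"{stamp} touches punctuation")
        latest = max(latest, total)
    return problems


print(timestamp_problems("I think you [00:02:00] were in your late 20s."))
print(timestamp_problems("First [00:05:00] then [00:04:30] and last [00:05:30]."))
```

The first call reports nothing. The second reports [00:04:30] as out of order and [00:05:30] as touching punctuation.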
All speakers/conversation must be identified/transcribed unless the special instructions indicate otherwise.
Use the format - [speaker name] (hyphen space open bracket...)
Identify speakers by name the first time only. After the first time, only use a - (hyphen) to indicate speaker change. (multiple speaker files only)
Do not add a - (hyphen) after ♪ [music] ♪ or [silence] if the speaker is the same.
Only use first names for speaker tokens. Exceptions are as follows:
If two speakers have the same first name, use first name and first initial of their last names, if known. For example, the speakers are Jennifer Garner and Jennifer Aniston. The speaker tags would be Jennifer G. and Jennifer A.
If a speaker is referred to by a title and last name, use their title and last name.
Do not use first names for speakers with titles.
Examples:
- [Dr. Jones]
- [Senator Smith]
- [President Carter]
- [Father Brown]
Use generic tags for unnamed speakers.
Examples - acceptable generic tokens:
- [Interviewer]
- [Interviewee]
- [Man] or - [Woman]
- [Male] or - [Female]
- [Boy] or - [Girl]
- [Facilitator] or - [Moderator]
- [Instructor] or - [Teacher]
- [Announcer] or - [Voiceover]
- [Student]
- [Audience Member]
Number generic speakers as - [Man 1] - [Man 2] etc., if possible.
- [Speaker] is not an acceptable generic tag.
Refer to any video (if given) for a speaker's name or title before using a generic tag.
Files with one speaker.
Your transcript starts with the speaker's first words. Do not include a - [speaker ID] or speaker - (hyphen).
Files with 2+ speakers.
Identify speakers by name the first time only. After the first time, only use a - (hyphen) to indicate speaker change.
Example correct:
- [Wendy] Hello.
- [Amy] Hey, Wendy. How are you?
- I'm great. How are you doing?
- Good. Just trying to think of something to write.
Example incorrect:
- [Wendy] Hello.
- [Amy] Hey, Wendy. How are you?
- [Wendy] I'm great. How are you doing?
- [Amy] Good. Just trying to think of something to write.
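The labeling pattern in the correct example can be sketched as follows. This is an illustration only: `format_dialogue` is a hypothetical helper, and it assumes you already know who speaks each turn.

```python
def format_dialogue(turns: list[tuple[str, str]]) -> list[str]:
    """Label each speaker by name once, then mark changes with a hyphen."""
    speakers = {speaker for speaker, _ in turns}
    if len(speakers) == 1:
        # One-speaker files get no speaker ID and no hyphen.
        return [" ".join(text for _, text in turns)]
    lines = []
    named = set()
    previous = None
    for speaker, text in turns:
        if speaker == previous:
            # Same speaker keeps talking: no new hyphen.
            lines[-1] += " " + text
        elif speaker in named:
            # Speaker change, name already given earlier.
            lines.append(f"- {text}")
        else:
            # First time this speaker talks.
            named.add(speaker)
            lines.append(f"- [{speaker}] {text}")
        previous = speaker
    return lines


for line in format_dialogue([
    ("Wendy", "Hello."),
    ("Amy", "Hey, Wendy. How are you?"),
    ("Wendy", "I'm great. How are you doing?"),
    ("Amy", "Good. Just trying to think of something to write."),
]):
    print(line)
```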
Short files (approx. 5 minutes or under) with multiple (approx. 4+) unidentified speakers.
Use only a - (hyphen) to indicate speaker change. Do not use - [Male/Female 1,2,3,4,5....]
Some files may have a participant who translates or interprets for another speaker. (Sign language, foreign language, etc.)
How you label this participant depends on their participation as an individual versus as a strict interpreter.
Interpreter strictly "speaks" for another participant and does not participate as an individual - does not have their own speaker identification.
Example - strict interpretation situation:
Interviewer asks, "How do you feel about that?"
Interpreter translates question to interviewee.
Interviewee answers in a [foreign language].
Interpreter translates for the interviewee, "Well, how I feel is..."
Example - strict interpretation transcript looks like:
- [Interviewer] ...How do you feel about that?
- [Interviewee] [foreign language] Well, how I feel is...
Interpreter speaks for themselves and for another participant - does have their own speaker identification (only when speaking for themselves).
Example - own participant:
- [Interviewer] ...How do you feel about that?
- [Interviewee] [foreign language] Well, how I feel is...
- [Interpreter] [foreign language]? (back and forth [foreign language] conversation with Interviewee)
All titles that would be in italics or quotation marks will use quotation marks.
If quotations need to be changed to italics, it will have a negligible effect on your file rating.
Microsoft Word auto-formats to “curled” or “smart” quotes by default. To prevent our system marking them wrong, turn them off.
Spell out all numbers from zero to nine. Use numerals for all numbers 10 and above.
**There may be exceptions explained below**
Spell out any number that begins a sentence.
Example:
Two hundred guests and 23 guides entered.
Ten percent of them were women.
Spell out casual, non-emphatic numbers.
Example:
He gave me hundreds of reasons.
Numerals with four digits can have (but do not need) a comma. Numerals with five or more digits must have a comma.
Example:
I started with 1000 tickets and somehow ended up with 20,000!
Use numerals for all numbers in the sentence if one or more of them must be a numeral.
Examples:
She has 21 books, 11 oranges, and 3 cats
I'm the father of 6 children and they're aged 17 through to late 20s now. But at one stage, we had 6 children under 12.
It added up to $2.6 million worth of free publicity in the first 4 days of launching.
And I believe that probably 7 or 8 out of 10 websites…
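The interaction between the zero-to-nine rule and the all-numerals rule can be sketched in Python. This is a rough illustration under stated assumptions, not a complete implementation of the number rules: `apply_number_rule` is a hypothetical name, the input is a single sentence with every number typed as digits, and the sketch ignores the separate rules for sentence-initial numbers, casual numbers, and sequences.

```python
import re

SMALL_NUMBER_WORDS = ["zero", "one", "two", "three", "four",
                      "five", "six", "seven", "eight", "nine"]

# A plain whole number: a digit run that is not part of a decimal, price,
# time, percentage, range, or a token such as "20s".
PLAIN_NUMBER = re.compile(r"(?<![\w$.,:/-])\d+(?![\w%]|[.,:/-]\d)")


def apply_number_rule(sentence: str) -> str:
    """Spell out zero to nine unless the sentence needs numerals elsewhere."""
    # Blank out the small plain numbers. Any digit left over belongs to a
    # number that must stay a numeral, so every number stays a numeral.
    leftover = PLAIN_NUMBER.sub(
        lambda m: m.group() if int(m.group()) >= 10 else "", sentence)
    if any(ch.isdigit() for ch in leftover):
        return sentence
    # Only small plain numbers are present: spell each one out.
    return PLAIN_NUMBER.sub(
        lambda m: SMALL_NUMBER_WORDS[int(m.group())], sentence)


print(apply_number_rule("I have 3 cats and 2 dogs."))
print(apply_number_rule("She has 21 books, 11 oranges, and 3 cats"))
```

The first sentence comes back with "three" and "two" spelled out. The second is returned unchanged because 21 and 11 must be numerals.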
When indicating sequence, capitalize the noun and use numerals. Exceptions are line, note, page, paragraph, size, step, or verse.
Examples:
Building 2, floor 31
Channel 5, channel 12
Chapter 1, page 3
Room 438, building 2
Session 2, module 4.
Episode 5
Ranges should be written in numerals. “To” must be spelled out.
Examples:
"three to four thousand dollars." should be "$3,000 to $4,000."
"five, ten, fifteen thousand." should be "5,000, 10,000, 15,000."
Always use numerals when the abbreviation "a.m." or "p.m." is present.
Double zeros (00) are not necessary to indicate minutes of the hour when a whole number is used with a.m. or p.m.
Examples:
Spoken - "I woke up at five." (or five o'clock)
Transcribed - "I woke up at 5:00."
Our hours are from 9 a.m. to 5 p.m.
If you wish to attend, you must arrive by 6:25 p.m.
We were expected to report no later than 1400 hours.
We’re leaving at 6 in the morning.
Dates should be written in numerals with any suffixes, as spoken.
Examples:
It's 11/20/2017. (Spoken eleven, twenty, two-thousand seventeen or "twenty seventeen")
Today is Monday, November 20th. (Spoken twentieth.)
Today is Monday, November the 20th.
Today is the 20th of November.
Today is November 20. (Spoken November twenty.)
When a year is the first word in a sentence, use numerals.
Example:
1986 was the year I was born.
Decades should be written without apostrophes.
Correct:
the 1980s
the '50s
Incorrect:
the 1980’s
the 50’s.
Use a currency sign only if you are sure of the correct currency being referenced.
All mentions of "dollars" use the U.S. dollar sign ($) unless a specific alternate currency is known.
Use $ plus numerals for dollar amounts under 1 million.
For whole dollar amounts of 1+ million, use numerals, but spell out "million," "billion," etc.
Examples:
Taxes will be reduced by a total of $13 million.
He owes $13,656,000.
Use the word “dollar(s)” and standard number rules for amounts under $10.
Examples:
I hope to find three or four dollars.
John brought only $11. (spoken "dollars" or "bucks")
For amounts under one dollar, use the numeral and the word "cents."
Example:
I need 15 cents.
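As a rough sketch, the dollar rules above can be written as one function. `format_dollars` is a hypothetical helper, not part of any Speechpad tool. Two choices are assumptions because the guide does not cover them: amounts with both dollars and cents are typed as $3.50, and only exact multiples of a million or billion are written with the word. Amounts spoken as "2.6 million" are outside this sketch.

```python
SMALL_NUMBER_WORDS = ["zero", "one", "two", "three", "four",
                      "five", "six", "seven", "eight", "nine"]
SCALES = [(1_000_000_000, "billion"), (1_000_000, "million")]


def format_dollars(dollars: int, cents: int = 0) -> str:
    """Type a U.S. dollar amount following the rules above."""
    if dollars < 0 or not 0 <= cents <= 99:
        raise ValueError("expected non-negative dollars and 0 to 99 cents")
    if dollars == 0:
        # Under one dollar: numeral plus the word "cents".
        return f"{cents} cent" if cents == 1 else f"{cents} cents"
    if cents:
        # Assumption: mixed amounts are typed with a decimal point.
        return f"${dollars:,}.{cents:02d}"
    if dollars < 10:
        # Under $10: standard number rules plus the word "dollar(s)".
        unit = "dollar" if dollars == 1 else "dollars"
        return f"{SMALL_NUMBER_WORDS[dollars]} {unit}"
    for size, name in SCALES:
        if dollars >= size:
            if dollars % size == 0:
                # Round millions or billions: numeral plus the word.
                return f"${dollars // size:,} {name}"
            break
    return f"${dollars:,}"


print(format_dollars(0, 15))       # 15 cents
print(format_dollars(3))           # three dollars
print(format_dollars(11))          # $11
print(format_dollars(13_000_000))  # $13 million
print(format_dollars(13_656_000))  # $13,656,000
```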
Currencies commonly mentioned and their symbols:
Examples:
British pound (£)
Euro (€)
Australian dollar ($ or AU$)
Canadian dollar ($ or CA$)
Spell out common fractions.
Examples:
One-half
One-third
Three-quarters
Use numerals for decimals and complex fractions, and for fractions when the sentence already contains numerals or the topic calls for them (e.g. recipes).
Examples:
$5.5 million, not $5 ½ million. (Spoken as five point five or five and a half.)
4 and ½ cups of flour and 12 eggs. (common recipe notation indicates using numeric fractions)
Percentages should be numerals with the % (percent) sign.
Spell out the number and “percent” if the percent is the first word in the sentence.
Example:
Fifty-one percent of the people voted, but only 6% of the votes were counted.
Ninety-nine percent of the population use 10% of their brains.
We have 10.5% participation. (Spoken ten point five percent)
That isn’t 2% milk, it’s 2.5% milk. (Spoken two and a half percent)
If you cannot understand something, listen to that section of audio again. The context often helps in figuring out the word.
Check if technical terms are spelled out in the video.
If you still can’t understand, tag [inaudible hh:mm:ss].
Captions reviewers will remove the timestamp if they cannot fill in what the inaudible should be.
Using an [inaudible] tag is always "more correct" than guessing and ending up with something nonsensical.
[inaudible] tags only count against your rating on a file if a reviewer or QA agent is able to decipher what was said.
Make sure you are using adequate equipment for the file you have selected and choose to commit only to files you are certain you are able to complete to your best quality.
Music that lasts by itself for 3+ seconds must be tagged and timestamped.
Formatted as:
[hh:mm:ss] (when music starts)
♪ [music] ♪
[hh:mm:ss] (when next speaking starts)
Use the ♪ [music] ♪ tag for instrumental sections.
Copy/paste ♪ sign or Alt+1+3 on keyboard number pad.
Music that plays behind speaking (background music) does not need to be tagged.
If music starts the video, do not include a starting timestamp.
Formatted as:
♪ [music] ♪
[hh:mm:ss] (when speaking starts)
If music ends the video, do not include an end timestamp.
Formatted as:
[hh:mm:ss]
♪ [music] ♪
When a speaker sings or speaks in a “singing” voice, use ♪ marks around the text.
Example:
- [Instructor] While we wait for this to load, ♪ here I am singing some words. ♪
If the file has music that is important to the file (concerts, band interviews with songs), you must transcribe the lyrics.
Do not transcribe lyrics for music mixing/production (sound engineer) files.
You still notate ♪ [music] ♪ when there are no lyrics being sung as per music tagging rules.
If you can find the lyrics online, you can copy-paste them into your transcript, but you must listen to the video and change any lyrics to match the video.
If you are unable to understand most of the lyrics, mark the whole song ♪ [music] ♪ and [inaudible] as appropriate.
Slang is acceptable in lyrics.
Do not type lyrics in paragraph form.
Block off the beginning and end of any "paragraphs" of lyrics with ♪ notes.
Example:
♪ Twinkle, twinkle, little star
How I wonder what you are
Up above the world so high
Like a diamond in the sky
Twinkle, twinkle, little star
How I wonder what you are ♪
♪ When the blazing sun is gone
When he nothing shines upon
Then you show your little light
Twinkle, twinkle, all the night
Twinkle, twinkle, little star
How I wonder what you are ♪
Reviewers will add music notes per subtitle group. This will have an effect on your score. Files that are known concert files are automatically exempted. If your score has been affected by lyric formatting, email us at worker@speechpad.com and we’ll check/exempt the rating.
Silence lasting for 5+ seconds must be tagged.
Formatted as:
[hh:mm:ss] (when silence starts)
[Silence]
[hh:mm:ss] (when next speaking starts)
If silence starts the video, do not include a starting timestamp.
Formatted as:
[Silence]
[hh:mm:ss] (when speaking starts)
If silence ends the video, do not tag it. Do not add a timestamp.
Silence tags are used for syncing purposes only. The [Silence] tag will be removed by the reviewer.
This has a negligible effect on your file rating. If you feel your rating has been severely affected by any captions formatting, email worker@speechpad.com and we’ll take a look.
When speakers overlap each other and each person cannot be distinctly understood, use a [crosstalk] tag.
Reviewers may change dialogue to [crosstalk] if both speakers’ text cannot appear in the captions at the same time.
[crosstalk] tags do not need timestamps.
The [crosstalk] tag can be in the middle of a sentence. It does not need to go on a separate line.
When a speaker uses anything non-English, use a [foreign language] tag.
You can use [speaking French] (with the appropriate language) if you are 100% certain.
If the phrase is short and well known, or easy to understand, please type it.
Example:
- [Interviewer] Thank you for joining us today.
- [Interviewee] Thank you. Gracias. I’m excited to be here.
For longer foreign language sections, if you understand the spoken language and want to transcribe it, mark [foreign language] and skip.
Email worker@speechpad.com to receive permission before transcribing in a foreign language.
The [foreign language] tag does not need timestamps.
The [foreign language] tag can be in the middle of a sentence. It does not need to be on a separate line.
If the audio is 100% foreign language, email worker@speechpad.com and return the file.
If the video has on-screen captions for foreign language, do not re-transcribe them.
If the video has on-screen captions in English for PART of the file, do not re-transcribe them.
If the video has on-screen captions in English for the WHOLE file, email worker@speechpad.com with the AID#.
Standard captions files do not include any [ambient noise] tags.
If speakers create their own noises, try to spell them as phonetically as possible or use a [vocalization] tag.
Some sound effects are easier to spell phonetically than others. Do your best, but no need to slave over trying to spell something phonetically. Just use a tag.
Example:
Go up to your Settings and you click on... let's see here... [vocalization]. Let's start that over.
(speaker blows raspberries where the [vocalization] tag is.)
Example:
How does a squid go into battle? Well armed. Ba-dum tss!
(speaker makes noise of hitting drums and a cymbal)
SDH captions-only
Files marked as SDH Captions in the job title require [ambient noise] tags.
[ambient noise] tags are in brackets.
Examples:
[doorbell ringing]
[laughter]
[children singing]
[ambient noise] tags do not need timestamps.
[ambient noise] tags can be in the middle of a sentence. They do not need to be on a separate line.
Computer programming files (Java, Node, AWS, etc.)
3D Image/Photo Editing (Photoshop, Cinema 4D, After Effects, etc.)
Anything requiring keyboard commands, links and code.
Navigation headings and tool names should be capitalized.
Write keyboard commands, keystrokes, and codes as they appear on your keyboard/screen.
Example:
Code as typed in the video: <Provider store={store}>
As spoken: "Provider store equals store"
As transcribed: Provider store={store}
Do not use all caps unless requested in special instructions.
Examples:
Ctrl+Alt+Del
Press the OK button
Get-ChildItem
Ctrl+- (not Ctrl+minus key)
Do not use < or > (Some special characters like these interfere with captions generation.)
Use these rules when any clear equations are being referenced. The whole file does not have to be related to math.
Unless requested by the client, do not write mathematical equations with symbols.
“-” can be used instead of writing out “negative” only when referring to a numerical value.
Examples:
Plus not +
Minus not -
Equals not =
-2.5 not negative 2.5
Our answer is negative.
Always use numerals in the body of the equation.
Examples:
1 plus 1 equals 2
2 times 10 equals 20
Coordinates/Intercepts may be typed with their parentheses, including negative.
Examples:
(0, 1)
(4, 5)
(-2, 10)
(5, -0.2)
Standard number rules apply if the sentence contains a number that is not part of an equation.
Example:
We have two answers: 5 and -10.
Variables should be in lowercase letters unless shown differently in the video.
Examples:
4x plus 3y equals 10
f of x equals 2/3y
1 plus 1.5 equals 2.5
-3 minus -4 equals 1
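The symbol-to-word rule can be sketched in Python. This is a minimal illustration: `spell_out_operators` is a hypothetical name, and the sketch assumes every operator in the input is a separate token with spaces around it, so a minus sign attached to a number (such as -4) is kept as a negative value.

```python
# Operators are typed as words. "times" follows the example "2 times 10 equals 20".
OPERATOR_WORDS = {"+": "plus", "-": "minus", "=": "equals", "*": "times", "×": "times"}


def spell_out_operators(equation: str) -> str:
    """Replace stand-alone operator symbols with words."""
    return " ".join(OPERATOR_WORDS.get(token, token) for token in equation.split())


print(spell_out_operators("1 + 1 = 2"))      # 1 plus 1 equals 2
print(spell_out_operators("4x + 3y = 10"))   # 4x plus 3y equals 10
print(spell_out_operators("-3 - -4 = 1"))    # -3 minus -4 equals 1
```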
Note that some rules in this section are contrary to the General Transcription Style Guide due to ADA compliance and accessibility for our captioning services.
Words that should be capitalized are:
Pronouns referring to God: He, Him, His, Your, You, etc.
Book (Bible)
Word (Bible)
Christ
Christian
Jesus/Muhammad, but not pronouns (he, his, him, your, you, etc.)
Holy Spirit
References to the prophet Muhammad will often be followed by the phrase "peace be upon him" or "PBUH"
References should always be written out in full, as-spoken.
Removing words from what is spoken can alter the timing of the captions. Our system compares the speech against the submitted text when generating the initial captions that reviewers receive.
Examples:
Revelation Chapter 4 verse 2 (spoken as such - typed as-spoken)
John 3:16 (Spoken John three sixteen.)
Always use numerals when referring to pages, verses, chapters.