SDH - Captioning Key


Quality captioning must be:

Accurate: Errorless transcription is the goal for each production.

Consistent: Uniformity in style and presentation of all captioning features is crucial for viewer understanding.

Clear: A complete textual representation of the audio, including speaker identification and non-speech information, provides clarity.

Readable: Captions are displayed with enough time to be read completely, are in synchronization with the audio, and are not obscured by (nor do they obscure) the visual content.

Equal: Equal access requires that the meaning and intention of the material are completely preserved.

Computer-Assisted SDH Captions

Speechpad’s computer-assisted SDH captions are transcribed by humans, but the line division and timing are automated. 


Text refers to the appearance and presentation of the letters and words. Text considerations include case, font, line division, and caption placement.


Captions should be in mixed case. However, all capital letters may be used for an individual word to denote shouting.

Caption Placement

Captions should not obscure any text or important visual information on the screen. When text is part of the video (this is called “burned-in”), captions should be placed elsewhere so they do not overlap. To do this in Speechpad’s Subtitle Editor, select “Format Subtitles” from the “Window” drop-down menu.  A pop-up box will appear that will let you change the position of the caption. Move each caption to the top of the screen during the period of the burned-in text. If there is text simultaneously at the top and bottom of the screen, place the caption in the middle.


Caption text should always be aligned to the left.

Language Mechanics

Language mechanics incorporate the proper use of spelling, grammar, punctuation, capitalization, and other factors deemed necessary for high-quality captioned media. Rules included in these guidelines are primarily those which are unique to captioning and speech-to-text.

Spelling and Capitalization

1. Be consistent in the spelling of words throughout the production, including vocabulary that can be spelled either as one or two words or in hyphenated form. For conventional words, dictionaries and style guides must be followed. Proper names, technical terms, and specialized language must be verified through specialty references or directly from an authoritative source. Remember that no single reference source can claim to be error free.

2. Do not use British spellings or punctuation.

3. Do not emphasize a word using all capital letters except to indicate screaming. 

4. Be consistent in the spelling of words throughout the media. This includes vocabulary that can be spelled either as one or two words or in hyphenated form.

5. Capitalize proper nouns for speaker identification. All other speaker identification should be lowercased unless this identification is being used as a proper noun. Examples:

 Incorrect          Correct
 (bobby)            (Bobby)
 (Male Narrator)    (male narrator)

6. Sound effects should be lowercase unless a proper noun is part of the description. Examples:
 Incorrect               Correct
 [Machine Gun Firing]    [machine gun firing]
 [Pinky Squealing]       [Pinky squealing]
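
The casing rules for speaker identification and sound effects (rules 5 and 6 above) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the list of proper nouns has to be supplied by the captioner, since code cannot know which words are names.

```python
def label_case(text, proper_nouns=()):
    """Lowercase every word except those listed as proper nouns."""
    # Map the lowercased form of each proper noun to its correct spelling.
    keep = {name.lower(): name for name in proper_nouns}
    return " ".join(keep.get(word.lower(), word.lower()) for word in text.split())


def sound_effect(description, proper_nouns=()):
    """Sound effects go in brackets."""
    return f"[{label_case(description, proper_nouns)}]"


def speaker_id(label, proper_nouns=()):
    """Speaker identification goes in parentheses."""
    return f"({label_case(label, proper_nouns)})"


print(sound_effect("Machine Gun Firing"))
print(sound_effect("pinky squealing", proper_nouns=["Pinky"]))
print(speaker_id("bobby", proper_nouns=["Bobby"]))
print(speaker_id("Male Narrator"))
```

The sketch only handles single-word proper nouns; multi-word names would need a different lookup.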

Punctuation and Grammar

Always follow conventional rules of Standard English to the greatest extent possible, utilizing style guides to reach sound decisions.

Captioning spontaneous speech can be very difficult, as real conversations often contain improper grammar or run-on sentences, dialect, and slang. Problems are compounded with restrictions of time and space. As punctuation cannot correct non-grammatical speech, its role in captioning is to facilitate clarity and ease of reading. 

As a general rule, written English depends largely on word order to make the relationships between words clear. When word order alone is not sufficient to establish these relationships, there is little choice but to resort to punctuation that is sometimes unique to the captioning process.

Hyphens and Dashes

1. Nonessential information that needs special emphasis should be conveyed by double hyphens with no space.

2. When a speaker is interrupted, the interruption should be conveyed by double hyphens with no space.

3. When a speaker stutters, caption what is said. Example:
 Incorrect    Correct
 book         b-b-b-ook

4. When captioning spelling (including finger spelling), separate capital letters with hyphens. Example: S-P-E-E-C-H-P-A-D


Ellipses

1. Use an ellipsis when there is a significant pause within a caption.

2. Do not use an ellipsis to indicate that the sentence continues into the next caption.

3. Use an ellipsis to lead into or out of audio relating to an onscreen graphic unless there is a complete sentence in the graphic that is more appropriately introduced by a colon.

Quotation Marks

1. Use quotation marks for onscreen readings from a poem, book, play, journal, or letter. However, use quotation marks and italics for offscreen readings or voice-overs.

2. Beginning quotation marks should be used for each caption of quoted material except for the last caption. The last caption should have only the ending quotation marks. Example:

Reading from a journal...

 Incorrect                          Correct
 "Mother knelt down                 "Mother knelt down
 and began thoughtfully fitting"    and began thoughtfully fitting

 "the ragged edges                  "the ragged edges
 of the paper together."            of the paper together.

 "The process was watched           The process was watched
 with spellbound interest."         with spellbound interest."
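
The quotation rule for multi-caption readings can be illustrated with a short Python sketch. The function name is hypothetical, and the handling of a quote that fits in a single caption (both marks on the same caption) is an assumption, since the rule above only describes the multi-caption case.

```python
def quote_captions(captions):
    """Add quotation marks to quoted material that spans several captions.

    Every caption except the last gets only an opening mark; the last
    caption gets only a closing mark.
    """
    if not captions:
        return []
    if len(captions) == 1:
        # Assumption: a one-caption quote takes both marks.
        return ['"' + captions[0] + '"']
    quoted = ['"' + text for text in captions[:-1]]
    quoted.append(captions[-1] + '"')
    return quoted


journal = [
    "Mother knelt down\nand began thoughtfully fitting",
    "the ragged edges\nof the paper together.",
    "The process was watched\nwith spellbound interest.",
]
for caption in quote_captions(journal):
    print(caption)
    print()
```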


Spacing

1. Spaces should not be inserted before ending punctuation, after opening and before closing parentheses and brackets, before and after double hyphens, or before/between/after the periods of an ellipsis. Examples:

 Incorrect                      Correct
 ( narrator )                   (narrator)
 I am happy . . . thank you.    I am happy...thank you.

2. A space should be inserted after the beginning music icon (♪) and before the ending music icon(s). Example:

 Incorrect                      Correct
 ♪There’s a bad moon rising♪    ♪ There’s a bad moon rising ♪
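
Because the spacing rules above are mechanical, they lend themselves to an automated check. The following Python sketch is one possible approach, not part of any Speechpad tool: the regular expressions and messages are illustrative assumptions and will not catch every case (for example, decimals written with a leading space).

```python
import re

# Each entry pairs a pattern for one spacing mistake with a short message.
# The patterns are illustrative assumptions, not an official rule set.
SPACING_RULES = [
    (re.compile(r"[ \t][!?]|[ \t]\.(?!\.)"), "space before ending punctuation"),
    (re.compile(r"[(\[][ \t]|[ \t][)\]]"), "space inside parentheses or brackets"),
    (re.compile(r"[ \t]--|--[ \t]"), "space next to a double hyphen"),
    (re.compile(r"[ \t]\.\.\.|\.\.\.[ \t]|\.[ \t]\."), "space around or inside an ellipsis"),
    (re.compile(r"♪[^\s♪]|[^\s♪]♪"), "missing space next to a music icon"),
]


def spacing_problems(caption):
    """Return the message of every spacing rule the caption breaks."""
    return [message for pattern, message in SPACING_RULES if pattern.search(caption)]


for text in ["( narrator )", "(narrator)", "I am happy...thank you.",
             "♪There’s a bad moon rising♪", "♪ There’s a bad moon rising ♪"]:
    print(repr(text), spacing_problems(text))
```

An empty list means the caption passed every check in the sketch.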


Italics

Use italics as follows:

1. A voice-over of a poem, book, play, journal, letter, etc. (This is also quoted material, so quotation marks are also needed.)

2. When a person is dreaming, thinking, or reminiscing.

3. When there is background audio that is essential to the plot, such as a PA system or TV.

4. The first time a new word is being defined, but do not italicize the word thereafter.

5. Offscreen dialogue, narrator (see Exception 2 below), sound effects, or music (this includes background music).

6. The offscreen narrator when there are multiple speakers onscreen or offscreen.

7. Speaker identification when the dialogue is in italics and speaker identification is necessary.

8. Foreign words and phrases, unless they are in an English dictionary.

9. When a particular word is heavily emphasized in speech. Example: You must go!

Exceptions to the use of italics include:

1. When an entire caption is already italicized, use Roman type to set off a word that would normally be italicized.

2. If there is only one person speaking and no other speakers, whether on- or offscreen, use Roman type with no italics.

3. Do not italicize an offscreen interpreter while translating for a person onscreen. 

Sound Effects

Sound effects are sounds other than music, narration, or dialogue. They should be captioned when they are necessary to the understanding and/or enjoyment of the media.

1. A description of sound effects, in brackets, should include the source of the sound. Example: [audience cheering]

2. Source description can be eliminated if the source of the sound can clearly be seen onscreen. Example: If a growling dog is the sole subject of a shot, [growling] can be used instead of [dog growling]

3. Offscreen sound effects should be italicized. This includes background music.

4. Sound effects must be lowercased unless a proper noun is part of the description.

5. If description is used for offscreen sound effects, it is not necessary to repeat the source of the sound if it is making the same sound a few captions later. Example:
 First Caption      Later Caption
 [pig squealing]    [squealing continues]

6. When describing a sustained sound, use the present participle form of the verb. When describing an abrupt sound, use the third person verb form. Examples:

 Sustained Sound       Abrupt Sound
 [dog barking]         [dog barks]
 [papers crinkling]    [papers crinkle]

7. Caption background sound effects only when they are essential to the plot.

8. Caption the audience response only when it is essential to a better understanding of onscreen or offscreen action. Example:
 Inappropriate  Appropriate 
 (John) Bring out the band! (John) Bring out the band!
 [audience cheering]

9. When possible, use specific rather than vague, general terms to describe sounds. Examples:

 Vague              Specific
 [horse running]    [horse galloping]
 [bird singing]     [falcon screeching]

10. Never use the past tense when describing sounds. Captions should be synchronized with the sound and are therefore in the present tense.

Speaker Identification

Establishing the identity of both onscreen and offscreen speakers is vital for clarity. When names are unknown, be as specific as possible in providing a label.

1. If offscreen speakers are speaking simultaneously, appropriate speaker identification must be added.

2. If the speaker’s name is known, the speaker’s name should be in parentheses. Example: 
 Incorrect                  Correct
 [Jack] Honey, I'm home!    (Jack) Honey, I'm home!

3. If the speaker’s name is unknown, identify the speaker using the same information the hearing viewer has. Examples: (female #1), (male narrator)

4. Do not identify the speaker by name until the speaker is introduced in the audio or by an onscreen graphic. Exception: Characters in episodes of a series with known regular characters may be identified by name from the start of the episode in all episodes except the very first. 

5. If there is only one narrator, identify as (male narrator) or (female narrator) at the beginning of the media. It is unnecessary to identify gender for each caption thereafter.

6. When an actor is portraying or imitating another person or character, identify the actor as the person being portrayed. Example: (as George Washington) If the freedom of speech is taken away, then dumb and silent we may be led, like sheep to the slaughter.

Special Considerations

Spoken language is rich and full of meaning. However, it also consists of oddly formed sentences and even word play. Accuracy, clarity, and readability are challenges for the captioner.

Intonation, Play on Words, and No Audio

1. If the speaker is not visible onscreen, or visual clues that denote the emotional state are not shown, indicate the speaker’s emotion. Example: 
 Incorrect          Correct
 Well, whatever!    [angrily] Well, whatever!

2. When a person is whispering, caption with [whispering]. Example: [whispering] Okay, you go first.

3. When feasible, describe puns. Example: 

        Why do they call her “Ouisy”? 


4. When people are seen talking, but there is no audio, caption as [no audio] or [silence]. 

Foreign Language, Dialect, Slang, and Phonetics

1. If possible, caption the actual foreign words in italics. If it is not possible to caption the words, use a description (e.g., [speaking French]). Never translate into English. 

2. Use accent marks, diacritical marks, and other indicators. 

3. Indicate regional accent at the beginning of the first caption. Example: 
 Incorrect               Correct
 If y'all want me to.    [Southern accent] If y'all want me to.

4. Keep the flavor of dialect. Example:

 Inappropriate                         Appropriate
 You are sure not from around here.    You sho' ain't from 'round here.

5. Keep the flavor of the speaker’s language when necessary to portray a character’s personality. This includes captioning profanity and slang. Examples:

 Incorrect                  Correct
 I'm not going anywhere.    I ain't going nowhere.
 [cursing]                  Damn!

6. When a word is spoken phonetically, caption it the way it is commonly written. Examples:

 Original Narration           Captioned As
 "N-double-A-C-P"             NAACP
 "www dot D-C-M-P dot org"    www.dcmp.org
 "eight or nine hundred"      800 or 900
 "a thousand"                 a thousand
 "one thousand"               1000


Music

1. When captioning music, use descriptions that indicate the mood. Be as objective as possible. Avoid subjective words, such as “delightful,” “beautiful,” or “melodic.”

2. If music contains lyrics, caption the lyrics verbatim. The lyrics should be introduced with the name of the vocalist/vocal group, and the title (in brackets) if known/significant.

3. Caption lyrics with music icons (♪). Use one music icon at the beginning and end of each caption within a song, but use two music icons at the end of the last line of a song.

4. A description (in brackets) should be used for instrumental/background music or when verbatim captioning would exceed the presentation rate. If known, the description should include the performer/composer and the title. Examples: 

        [Louis Armstrong plays “Hello Dolly”] 

        [lyrical flute solo]

        [pianist playing the national anthem]

5. Beware of misplaced modifiers in descriptions. Example: In [frantic piano playing], “frantic” describes the piano. [frantic piano music] is more appropriate because “frantic” then describes the music, not the instrument.

6. For background music that is not important to the content of the program, a single music icon may be used.
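
The music icon rule for lyrics (rule 3 above) can be sketched in Python as follows. The function name and the sample lyric lines are hypothetical, and writing the two closing icons with no space between them is an assumption, since the rule does not specify it.

```python
def format_lyrics(lines):
    """Wrap each lyric caption in music icons.

    Each caption starts and ends with one icon; the last caption of the
    song ends with two icons instead of one.
    """
    formatted = []
    for index, line in enumerate(lines):
        is_last = index == len(lines) - 1
        closing = "♪♪" if is_last else "♪"  # assumption: no space between the two icons
        formatted.append(f"♪ {line.strip()} {closing}")
    return formatted


# Placeholder lyric lines, not taken from a real song.
for caption in format_lyrics(["first line of the song", "last line of the song"]):
    print(caption)
```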


Numbers

Experts don’t always agree on rules for writing numbers or numerals. Captionists should follow a standard style manual and remember to be consistent.

Spelling Out

1. Unless otherwise specified below, spell out all numbers from one to ten, but use numerals for all numbers over ten. Examples: 
 Inappropriate                                           Appropriate
 The fifty-four DVDs need to be shelved.                 The 54 DVDs need to be shelved.
 He's at the thirty, the twenty, the ten, and scores!    He's at the 30, the 20, the 10, and scores!

2. Spell out any number that begins a sentence as well as any related numbers. Example: Two hundred guests and eleven guides entered.

3. Spell out casual, nonemphatic numbers. Example: He gave me hundreds of reasons. 

4. Numerals with four digits do not need a comma. Numerals containing five or more digits do need a comma. Example: I started with 1000 tickets and somehow ended up with 20,000!

5. Use numerals in a listing of numbers if one or more is above ten and these occur in one caption or one sentence. Example: 
 Incorrect                                         Correct
 She has 21 books, 11 oranges, and three cats.     She has 21 books, 11 oranges, and 3 cats.

6. Use numerals when referring to technical and athletic terms. Example: He scored 3 goals in today’s game!

7. When indicating sequence, capitalize the noun and use numerals. Exceptions are the indication of line, note, page, paragraph, size, step, or verse. Examples: 
 Building 2     page 31
 Channel 5      size 12
 Chapter III    step 3
 Room 438       paragraph 2
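
Rules 1, 4, and 5 above (spell out one through ten, commas only for five or more digits, numerals throughout a mixed listing) are mechanical enough to sketch in Python. The function names are hypothetical, and the sketch deliberately ignores the other rules in this section, such as sentence-initial numbers and technical or athletic terms.

```python
SMALL_NUMBERS = ["one", "two", "three", "four", "five",
                 "six", "seven", "eight", "nine", "ten"]


def as_numeral(value):
    """Four-digit numerals take no comma; five or more digits do."""
    return str(value) if value < 10000 else f"{value:,}"


def caption_number(value):
    """Spell out one through ten; use numerals above ten."""
    if value < 1:
        raise ValueError("this sketch only covers positive whole numbers")
    if value <= 10:
        return SMALL_NUMBERS[value - 1]
    return as_numeral(value)


def caption_number_list(values):
    """Use numerals for every item if any item in the listing is above ten."""
    if any(value > 10 for value in values):
        return [as_numeral(value) for value in values]
    return [caption_number(value) for value in values]


print(caption_number(3), caption_number(54), caption_number(1000), caption_number(20000))
print(caption_number_list([21, 11, 3]))
```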


Dates

1. Use the numeral plus the lowercase “th,” “st,” “nd,” or “rd” when a day of the month is mentioned by itself (no month is referred to). Example:

 Original Narration                Captioned As
 Bob went fishing on the ninth.    Bob went fishing on the 9th.

2. When the day precedes the month, use the numeral plus the lowercase “th,” “st,” “nd,” or “rd” if the ending is spoken. Example:

 Original Narration                            Captioned As
 My birthday is on the seventeenth of June.    My birthday is on the 17th of June.

3. Use the numeral alone when the day follows the month. Example:

 Original Narration              Captioned As
 I will meet you May ninth.      I will meet you on May 9.
 I will meet you on May nine.

4. When the month, day, and year are spoken, use the numeral alone for the day, even if an ending (“th,” “st,” “nd,” or “rd”) is spoken. Example:

 Original Narration                      Captioned As
 Paul will marry on July sixth, 1996.    Paul will marry on July 6, 1996.
 Paul will marry on July six, 1996.
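
The four date rules above can be summarized in a small Python sketch. The function names and the `day_first` parameter are hypothetical. The sketch assumes the ending is spoken in the day-first form, and it also produces “rd” (as in 3rd), which is standard English usage.

```python
def ordinal(day):
    """Return the day with its lowercase ending, e.g. 9 -> '9th'."""
    if 11 <= day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"


def caption_date(day, month=None, year=None, day_first=False):
    """Format a spoken date following the four rules above."""
    if month is None:
        return ordinal(day)                  # rule 1: day mentioned by itself
    if day_first:
        return f"{ordinal(day)} of {month}"  # rule 2: day precedes the month
    if year is None:
        return f"{month} {day}"              # rule 3: day follows the month
    return f"{month} {day}, {year}"          # rule 4: month, day, and year


print(caption_date(9))
print(caption_date(17, "June", day_first=True))
print(caption_date(9, "May"))
print(caption_date(6, "July", 1996))
```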


Time

1. Indicate time of day with numerals only. Examples:

        I awoke at 5:17. 

        If you wish to attend, you must arrive by 6:25 p.m. 

        We were expected to report no later than 1400 hours. 

2. Always use numerals when the abbreviation “a.m.” or “p.m.” is present. Double zeros are not necessary to indicate minutes of the hour when a whole number is used with a.m. or p.m. Examples: 

        She leaves at 3:20 p.m. for the airport. 

        Our hours are from 9 a.m. to 5 p.m. 

        We’re leaving at 6 in the morning. 

Periods of Time

1. A decade should be captioned as “the 1980s” (not “the 1980’s”) and “the ’50s” (not “the 50’s”). 

2. If a decade or century is in noun form, do not use hyphens. Example:  This vase is from the 17th century. 

3. If a period of time is used as an adjective, use a hyphen. Example: This 19th-century painting was done by Van Gogh. 


Fractions

1. Either spell out or use numerals for fractions, keeping this rule consistent throughout the media. If using numerals, insert a space between a whole number and its fraction. Example:

 Numeral                           Spelled Out
 Do you plan to eat 1 ½ pizzas?    Do you plan to eat one and one-half pizzas?

2. Do not mix numerals and spelled-out words within the same sentence. Example:

 Incorrect                          Correct
 She is 13 and a half years old.    She is 13 ½ years old.

3. If a fraction is used with “million,” “billion,” “trillion,” etc., spell out the fraction. Example: The population was over one-half million.

4. Fractions expressed in figures should not be followed by endings, such as “sts,” “rds,” “nds,” or “ths.” Example:

 Incorrect    Correct
 3/10ths      3/10


Percentages

Use numerals and the percent sign to indicate all percentages except at the beginning of a sentence. Examples:

 Middle of Sentence                    Beginning of Sentence
 Only 6% of the votes were counted.    Fifty-one percent of the people voted.
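
As a rough illustration of the percentage rule, the Python sketch below uses a numeral and the percent sign by default and spells the number out when it begins a sentence. The function names are hypothetical, and the spelled-out form only covers whole numbers from 0 to 99.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def spell_under_100(value):
    """Spell out a whole number from 0 to 99."""
    if not 0 <= value <= 99:
        raise ValueError("this sketch only spells out 0 through 99")
    if value < 20:
        return ONES[value]
    tens, ones = divmod(value, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def caption_percent(value, starts_sentence=False):
    """Numeral plus % sign, except at the beginning of a sentence."""
    if starts_sentence:
        return f"{spell_under_100(value).capitalize()} percent"
    return f"{value}%"


print(caption_percent(6))
print(caption_percent(51, starts_sentence=True))
```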

Dollar Amounts

1. Use the numeral plus “cents” or “¢” for amounts under one dollar. Examples: I need 15 cents. I owe you 32¢.

2. Use the dollar sign plus the numeral for dollar amounts under one million. For whole-dollar amounts of one million and greater, spell out “million,” “billion,” etc. Examples: 

        John brought only $11. 

        Bob brought $6.12.

        The budget of $13,000 will be sufficient. 

        Taxes will be reduced by a total of $13 million. 

        He owes $13,656,000. 

3. Use the word “dollar” only once for a range up to ten. Example: I hope I find three or four dollars.

4. Use the dollar sign and numerals when captioning a range of currency over ten dollars. Example: Alice expected a raise of $6000 to $7000.
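
The dollar-amount rules above can be approximated in Python. This is a sketch under stated assumptions: the function name is hypothetical, amounts are passed as whole cents to avoid floating-point rounding, “cents” is used rather than “¢,” and “million,” “billion,” or “trillion” is spelled out only when the amount is an exact multiple of that unit (matching the $13 million and $13,656,000 examples). Ranges (rules 3 and 4) are not covered.

```python
# Largest unit first, so the biggest unit that fits is tried first.
UNITS = [(10**12, "trillion"), (10**9, "billion"), (10**6, "million")]


def caption_dollars(cents):
    """Format a non-negative amount given in whole cents."""
    if cents < 100:
        return f"{cents} cents"
    dollars, remainder = divmod(cents, 100)
    if remainder == 0:
        for size, name in UNITS:
            if dollars >= size:
                if dollars % size == 0:
                    return f"${dollars // size} {name}"
                break  # not a round multiple: fall back to numerals
    # Four-digit numerals take no comma; five or more digits do.
    digits = str(dollars) if dollars < 10000 else f"{dollars:,}"
    return f"${digits}" if remainder == 0 else f"${digits}.{remainder:02d}"


for amount in [15, 1100, 612, 1_300_000, 1_300_000_000, 1_365_600_000]:
    print(caption_dollars(amount))
```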


Measurements

1. Spell out units of measurement, such as “inches,” “feet,” “yards,” “miles,” “ounces,” “pounds,” and “tablespoons.” However, if spoken in shortened form, symbols should be used. For example, if the original narration is “I’m five eight,” it should be captioned as: I’m 5'8".

2. For whole numbers, use numerals. For example, caption “3 cups of sugar” instead of “three cups of sugar.”