Transcription is completed following the standard grammar rules for the target language. However, there are certain exceptions to those rules, and these guidelines give a brief explanation for those exceptions.
Some examples are given in English, but those general principles should be applicable for most languages. Use this document as an opportunity to understand how guidelines affect how your language should be transcribed.
Transcription quality (Context): The transcription should be verbatim, without any typos, double blank spaces, or unnecessary punctuation. All repetitions should be transcribed. Context should be used whenever transcribing (“ate” vs. “eight”), and in cases of ambiguity or homophones words should be looked up.
You can activate a keyboard option in the settings to help you spot spelling mistakes, but do not rely on it alone, because the tool does not check grammar. If the speaker intentionally says something, the transcription should reflect that, and no corrections should be made to what the speaker said.
No words should be added or taken away from the transcription. Even if a speaker misses a word in a sentence, and you are certain what that word should be, don’t add the missing word.
Some exceptions exist depending on your transcription language. In most languages, for example, you have to transcribe implied times and currency.
If a word is said by mistake, unintended by the speaker, it should be transcribed.
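The "no double blank spaces" requirement above is easy to check automatically. The following is a minimal sketch, not part of any official tooling; the function name `find_double_spaces` is a hypothetical helper chosen for illustration.

```python
import re

def find_double_spaces(transcript: str) -> list:
    """Return the start index of every run of two or more spaces."""
    return [match.start() for match in re.finditer(r" {2,}", transcript)]

# "weather" occupies indices 0-6, so the doubled space starts at index 7.
print(find_double_spaces("weather  in Boston"))  # [7]
print(find_double_spaces("weather in Boston"))   # []
```

A check like this only catches spacing problems; typos and unnecessary punctuation still need to be reviewed by hand.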
Punctuation: Punctuation should be added where needed but should be kept to a minimum. One very important distinction should be made between full sentences and fragments. A full sentence contains a subject, a verb and conveys a statement, question, exclamation, or command.
A fragment is a part of a sentence, and a fragment shouldn’t be formatted as a full sentence. Usually this means that there should be no end punctuation. (A rule of thumb: if you want to use an ellipsis, leave out the punctuation)
Examples of English fragments:
weather in Boston
pictures of dogs
How was the (this sounds like a beginning of a sentence, but it’s not finished, hence it’s a fragment and no end punctuation is used)
Voice actions vs. Web searches: In the same way there are distinctions between full sentences and fragments, there is a distinction between voice actions and web searches.
Voice actions are queries where the user of a device requests a specific action from the device.
These do not need to be formulated as sentences to be punctuated.
Examples:
Dining room lights in blue.
Alarm at 7:00.
Timer 5 minutes.
Take On Me on Spotify.
88.9 on TuneIn.
Volume 70 %.
On the other hand, web searches are most often spoken as fragments:
Examples:
restaurants near me
directions to LA
furniture for sale
Commas: Commas should be used when needed. Don’t rely on intonation for comma placement. Even if the speaker pauses, a comma should be added only if required by the structure of the whole sentence.
Use a comma when a sentence starts with a discourse word, interjection, or yes/no word.
Examples in English:
Well, I thought you had company.
Yes, I can do that for you.
Use commas before tag questions and sentence-final "too", "also", "please", "however", "sorry", etc.
Example:
I want a coffee, too.
Commas should be used when listing things, and when signing off. However, end punctuation shouldn’t be added to the end of messages.
Example:
Sincerely, John
Quotation: There are always specific rules for reported speech. It usually takes the form of a comma or colon after a reported speech verb, followed by the speech itself in quotation marks. Most commonly the reported speech verb is “said”.
Example in English:
He said, “Let’s go!”
Example in French:
Il a dit : “Allons-y !”
In the above example you can see that the direct quotation is a full sentence. In such cases, the full sentence should be capitalized and punctuated as it would normally be. Don’t add end-punctuation after the quotation, if there is already end-punctuation inside the quotations. (Don’t do .”. )
If the direct quotation is not a full sentence, then treat it as a fragment.
Example:
Amy said “butterflies”. (note that here the punctuation is outside the quotation marks)
Other symbols: Not all symbols are allowed for your transcription. You might have to ignore accents or symbols from media titles for example.
Example:
Play Senorita on Spotify.
If you are not able to transcribe ”ñ” in your language, just transcribe ”n” instead.
Spoken punctuation: Sometimes a speaker can speak out punctuation. This happens most often when the speaker is dictating a message. In cases of spoken punctuation, it should be marked with {}.
Example:
Hello John {comma} how are you {question mark} I hear the weather is fine over there {period} {smiley face}
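To illustrate the {} convention, here is a small sketch that lists the spoken punctuation marked in a finished transcript. The function name `spoken_punctuation_tags` is hypothetical, and the sketch assumes that curly braces are used only for spoken punctuation.

```python
import re

def spoken_punctuation_tags(transcript: str) -> list:
    """Return the text inside every {...} tag, in order of appearance."""
    return re.findall(r"\{([^{}]+)\}", transcript)

example = (
    "Hello John {comma} how are you {question mark} "
    "I hear the weather is fine over there {period} {smiley face}"
)
print(spoken_punctuation_tags(example))
# ['comma', 'question mark', 'period', 'smiley face']
```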
Numbers : Numbers smaller than 10 (i.e. 0-9) are written with letters. Numbers greater than 9 are written with numbers.
This threshold of 10 applies to most languages, but some, like German, only start using numerals from 13 onward. Units of measurement and currency are an exception to this rule.
Example:
Cancel my two alarms for tomorrow.
There are 12 cans of soda in the package.
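The number rule can be sketched as a simple threshold check. This is an illustration only: the word list is English, the name `format_number` is hypothetical, and the `threshold` parameter is an assumption used to model languages that switch to numerals later (13 for German, as mentioned above). It does not cover the exceptions for units and currency.

```python
# English number words for 0-12 (enough to cover a threshold of 13).
NUMBER_WORDS = [
    "zero", "one", "two", "three", "four", "five", "six",
    "seven", "eight", "nine", "ten", "eleven", "twelve",
]

def format_number(n: int, threshold: int = 10) -> str:
    """Spell out n if it is below the threshold, otherwise use digits."""
    if 0 <= n < threshold and n < len(NUMBER_WORDS):
        return NUMBER_WORDS[n]
    return str(n)

print(format_number(2))                 # two
print(format_number(12))                # 12
print(format_number(12, threshold=13))  # twelve
```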
For longer numbers (4+ digits) some languages will ask for a separator every three digits. It can be a comma, a period, a space or even no separator. You must check your guidelines to make sure you are formatting numbers the right way.
Example:
7,855,967 (in English) / 7855967 (in French)
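The separator rule can be sketched as follows. The function name `group_digits` is hypothetical, and the separator passed in is an assumption to be replaced by whatever your language guidelines require.

```python
def group_digits(n: int, separator: str = ",") -> str:
    """Group the digits of n in threes using the given separator."""
    # Python's "," format option groups by thousands; swap in the wanted separator.
    return f"{n:,}".replace(",", separator)

print(group_digits(7855967))      # 7,855,967
print(group_digits(7855967, ""))  # 7855967
```

Passing an empty string reproduces the "no separator" case, and a space or period can be passed in the same way.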
Fractions should be transcribed with numerals and slashes.
Example:
2/3 * 5/16
When the fractions are referring to items, not measures or currency, they should be written with letters.
Example:
I will eat only a half of the portion.
Currency : For most languages, you will use symbols for currency amounts in very specific currencies. It usually includes the common currency in the main country the language is spoken in, but you must check in the guidelines what applies to your language. In case another currency is used, spell out the word.
When only cents are mentioned, transcribe the word “cent” or the equivalent for your language.
Example:
5 cents
Units : Whenever a numeric value is used in conjunction with a unit of measure, it should be written with numerals, even if less than 10. The unit of measure should be abbreviated.
Example:
How many miles are there in 5 km? (miles is not abbreviated because it is not used in conjunction with a number, unlike kilometers, which is therefore abbreviated.)
The apartment is 1,500 ft². (transcribe square and cube using the symbol)
I will be there in 2 minutes. (even though minute is a unit of measure of time, it’s not abbreviated since it would be unnatural to do so. The same goes for seconds, hours, years, etc.)
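The unit rule from the examples above can be sketched as a lookup. The table and the name `format_measure` are hypothetical; only "km" and "ft²" are taken from the examples, and the comma separator assumes English formatting. Time units are deliberately left out of the table so that they stay spelled out.

```python
# Hypothetical abbreviation table; extend it according to your language guidelines.
UNIT_ABBREVIATIONS = {
    "kilometers": "km",
    "square feet": "ft²",
}

def format_measure(value: int, unit: str) -> str:
    """Write the value with numerals and abbreviate the unit when an abbreviation is listed."""
    unit_text = UNIT_ABBREVIATIONS.get(unit, unit)
    return f"{value:,} {unit_text}"

print(format_measure(5, "kilometers"))      # 5 km
print(format_measure(1500, "square feet"))  # 1,500 ft²
print(format_measure(2, "minutes"))         # 2 minutes
```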
Date and time : Dates and time are transcribed using the natural form. This natural form varies depending on the language and you should therefore check in the guidelines how it is done in your language.
Example:
July 12th, 1964
7/12/2010
•Times are formatted too. It is extremely common to encounter this in transcription and it is impossible to have an exhaustive list here. Make sure you know how to handle this by heart for your language.
Example in English:
3:00
7:15 a.m. (a.m. and p.m. are used only if spoken)
Example in French:
3h
15h00 (only if the user specifies o’clock)
Example in German:
3 Uhr
Address : Addresses should be fully spelled, unless an abbreviation is explicitly used. Use commas between address, town, and state.
Example:
751 Jefferson Street, New York City
9 Boylston Street, Chestnut Hill, Mass
Use commas for ENTITY, LOCATION.
Example:
doctors, Toronto
French restaurants, Laval
Web: Write URLs, email addresses, and Twitter hashtags as they are spoken and don't capitalise them.
Example:
I love pizza. #hungry
name@email.com
amazon.com
Abbreviations : Do not abbreviate, unless the speaker explicitly uses the abbreviated form.
Some languages have exceptions to this rule. For example, “et cetera” should always be written as “etc.”, and titles preceding proper names are sometimes abbreviated, as in English.
Example:
I went to see Dr. Jones. (even if the speaker said “doctor Jones”)
In acronyms, do not use periods between letters.
Example:
AT&T, SUV
lol, jk, wtf, rofl
Agreed spelling : When a speaker spells out a word the spelled-out word should be written in lowercase with blank spaces between the letters.
Example:
c o m p u t e r (if the speaker spells out the word “computer”)
•Exceptions to this rule are initialisms (BBS), acronyms (NASA), and spelled-out email and web addresses.
•Single letters should be written in uppercase.
Example:
words that start with J
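The spelling rules in this section can be sketched as two small helpers. Both names are hypothetical, and the sketch does not handle the exceptions for initialisms, acronyms, and spelled-out email or web addresses.

```python
def format_spelled_word(word: str) -> str:
    """Write a spelled-out word in lowercase with a space between letters."""
    return " ".join(word.lower())

def format_single_letter(letter: str) -> str:
    """Write a single letter in uppercase."""
    return letter.upper()

print(format_spelled_word("Computer"))                     # c o m p u t e r
print("words that start with", format_single_letter("j"))  # words that start with J
```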
Proper Names : Use official spelling, capitalisation, and punctuation for proper names. Google them and pay attention to the correct format. Official format and spelling of a proper name may supersede the usual written transcription conventions detailed in this document.
When words are used as titles or names, they should be capitalised:
Example:
I'm going with Mom, Father, Grandpa, Grams, and Aunt Sue.
•If proper names include diacritics uncommon in your language (é, ü, ç, etc), include them in your spelling. If unsure, refer to news articles, official city or celebrity websites, IMDb, Wikipedia, Google Maps, or knowledge cards (the answer box above the list of Google Search results) in that order. When no other source can be used to decide between spellings, choose the spelling used in the first hit(s) on Google.
•Format proper names as they are most commonly formatted on the entity's website (especially in official documents).
•The phrase "Ok Google", as well as possible derivatives such as "Ok Google Now" and "Ok Glass", require their own spelling of "okay". This spelling is unique to these cases. In all other cases, use the correct way to spell it in your language (OK, Ok, Okay, etc.)
Example:
Ok Google, show me pictures of dogs.
Refer to the Google Play Store for official spellings of media titles. For film/television, IMDb is also available. If an utterance is ambiguous between a media title and a sentence or web search, use your judgement for which is more likely; if truly unclear, default to media title. Capitalise all title words except articles, conjunctions and prepositions unless they are the first word.
Do not use quotation marks for media titles.
Laughter : Actual laughter within speech should be ignored and not transcribed. If the speaker says “ha ha ha ha ha” as words, without actually laughing, it should be transcribed, but only up to a number of syllables that varies depending on your language.
Multiple spellings : When multiple spellings are attested, use the first spelling used in the reference dictionary for your language. Make sure with your trainer that you have access to the reference dictionary for your language.
Slang and contractions: Transcribe slang and colloquialisms as spoken. Do not alter non-standard speech that the speaker probably wouldn't want corrected.
Some languages have exceptions like “gonna” and “wanna”, which should be spelled as “going to” and “want to”. Make sure you know which ones function this way for your language.
When the speaker speaks with an accent, use the standard spelling. A native German speaker may say something that sounds like “zis” when they mean “this”. In such cases, transcribe the intended word.