Buckwalter transliteration: https://en.wikipedia.org/wiki/Buckwalter_transliteration
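Since several resources below are distributed in Buckwalter encoding, a minimal sketch of the scheme may help. It maps each Arabic character to one ASCII symbol, following the standard Buckwalter table. The names `AR2BW` and `to_buckwalter` are made up for this illustration and do not come from any of the tools listed here; variants such as "safe" or XML Buckwalter are not covered.

```python
# Minimal sketch of Arabic -> Buckwalter transliteration (standard scheme).
# AR2BW and to_buckwalter are illustrative names, not part of any library.

def _build_table():
    """Build the Arabic-to-Buckwalter table from contiguous Unicode blocks."""
    blocks = [
        # U+0621 (hamza) .. U+063A (ghain)
        (0x0621, "'|>&<}AbptvjHxd*rzs$SDTZEg"),
        # U+0640 (tatweel) .. U+0652 (sukun), including the short vowels
        (0x0640, "_fqklmnhwYyFNKaui~o"),
        # U+0670 (dagger alif), U+0671 (alif wasla)
        (0x0670, "`{"),
    ]
    table = {}
    for start, symbols in blocks:
        for offset, symbol in enumerate(symbols):
            table[chr(start + offset)] = symbol
    return table


AR2BW = _build_table()


def to_buckwalter(text):
    """Transliterate Arabic script; characters outside the table pass through."""
    return "".join(AR2BW.get(ch, ch) for ch in text)


if __name__ == "__main__":
    # U+0643 U+062A U+0627 U+0628 is the word for "book"
    print(to_buckwalter("\u0643\u062a\u0627\u0628"))  # prints ktAb
```

Because the mapping is one-to-one, it can be inverted to decode Buckwalter text, but only on input that contains no ordinary Latin text, since every ASCII letter in the table would be read as an Arabic character.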
Penn Arabic Treebank
Paper: here
Part 4 (to be added)
Parallel Arabic-English Treebank
English POS tagset (note: a more complete annotation manual covering only POS is also available; search online if you need it)
MADAR: English sentences from the travel domain translated into 25 city dialects
Web page: http://madar.camel-lab.com
Access: to be determined
Columbia 6-Dialect Corpus: data collected from social media and morphologically annotated (Taizi, Sanaani, Najdi, Jordanian, Syrian, Iraqi and Moroccan)
Levantine Treebank (LDC2005E78)
Use the MSA guidelines: Guidelines
CAMeL Tools is a suite of tools for Arabic developed at NYU Abu Dhabi. The main page is here. Below you will find a suggested sequence of steps for installing and using them.
Reference: Obeid, Ossama, et al. "CAMeL Tools: An Open Source Python Toolkit for Arabic Natural Language Processing." Proceedings of The 12th Language Resources and Evaluation Conference. 2020.
For my experience installing the tools, and a proposed way of using the analyzer, see this page.
A Buckwalter-encoded version of the main morphology database of the analyzer is here.
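To read Buckwalter-encoded entries in Arabic script, one option is the character mappers that ship with CAMeL Tools. The sketch below assumes the package is installed (for example with `pip install camel-tools`) and uses its built-in `bw2ar` mapper. The helper `decode_buckwalter` is a hypothetical name introduced here, and the sketch says nothing about the layout of the database file itself.

```python
# Sketch: decode Buckwalter-encoded text with CAMeL Tools' built-in mapper.
# decode_buckwalter is an illustrative helper, not part of the library.
try:
    from camel_tools.utils.charmap import CharMapper
except ImportError:  # camel-tools is not installed
    CharMapper = None


def decode_buckwalter(text):
    """Return Arabic-script text, or None when camel_tools is unavailable."""
    if CharMapper is None:
        return None
    bw2ar = CharMapper.builtin_mapper("bw2ar")
    return bw2ar.map_string(text)


if __name__ == "__main__":
    # "ktAb" is the Buckwalter form of the word for "book"
    print(decode_buckwalter("ktAb"))
```

The built-in mappers are bundled with the package, so this step should not need the separately downloaded morphology data that the analyzer requires.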
CODA (Conventional Orthography for Dialectal Arabic), a conventional spelling for dialects: http://coda.camel-lab.com/
Semitisches Tonarchiv, Heidelberg: recordings from many Arabic dialects and other Semitic languages. Some have transcriptions, typically in other documents that are referenced but not provided. Some of the metadata is available only in German (but it should be fairly clear, and I am happy to help if needed).