Controlled vocabularies organise knowledge that is classifiable by nature.
Their purpose is knowledge management and metadata interchange in a machine-readable environment: they improve the dissemination and discovery, repurposing and reuse, and collection and merging of data across the open, globally connected digital environment of the semantic web.
Standards, specifications and widely used classifications:
EU Vocabularies (European Union): languages, etc.
https://op.europa.eu/en/web/eu-vocabularies
IETF BCP 47 language tag (maintained by the IANA language subtag registry)
https://en.wikipedia.org/wiki/IETF_language_tag
https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
ISO 639-3 aims to cover all known natural languages, including living and extinct, ancient and constructed, major and minor, written and unwritten. It does not include reconstructed languages.
Usage: Intended for use as metadata codes. Used in computer and information systems, such as the Internet, and in cataloging systems such as archives and information storage.
>An SQL table definition and a tab-delimited file for populating it can be downloaded from https://iso639-3.sil.org/code_tables/download_tables#Complete%20Code%20Tables
>Java library (available in Maven repositories): https://github.com/mihxil/i18n-iso-639
Special codes:
iso639p3 name_original
und Undetermined
Macrolanguages:
iso639p3 name_original
zho Chinese (macrolanguage)
Eg: The field 'name_original' contains the endonym (the language's own name for itself) if known, or the exonym otherwise.
iso639p3  name_original          iso639p1
cat       català                 ca
cdo       Min Dong Chinese       zh
cjy       Jinyu Chinese          zh
cmn       Mandarin Chinese       zh
cnp       Northern Ping Chinese  zh
cpx       Pu-Xian Chinese        zh
csp       Southern Ping Chinese  zh
czh       Huizhou Chinese        zh
czo       Min Zhong Chinese      zh
eng       English                en
gan       Gan Chinese            zh
hak       Hakka Chinese          zh
hsn       Xiang Chinese          zh
lzh       Literary Chinese       zh
mnp       Min Bei Chinese        zh
nan       Min Nan Chinese        zh
pol       polski                 pl
spa       español                es
wuu       Wu Chinese             zh
yue       Yue Chinese            zh
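
The table above can be queried to find every ISO 639-3 entry that shares one ISO 639-1 code, for example all the Chinese varieties filed under 'zh'. The following is only a minimal sketch: it assumes the table is stored as tab-delimited text with the same three column names, and the names SAMPLE, load_languages and group_by_part1 are hypothetical helpers made up for this illustration.

```python
import csv
import io
from collections import defaultdict

# A few rows copied from the table above.
# Assumption: the data is stored tab-delimited with these column names.
SAMPLE = (
    "iso639p3\tname_original\tiso639p1\n"
    "cat\tcatalà\tca\n"
    "cmn\tMandarin Chinese\tzh\n"
    "eng\tEnglish\ten\n"
    "yue\tYue Chinese\tzh\n"
)


def load_languages(text):
    """Parse tab-delimited text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def group_by_part1(rows):
    """Map each ISO 639-1 code to the ISO 639-3 codes that share it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["iso639p1"]].append(row["iso639p3"])
    return dict(groups)


rows = load_languages(SAMPLE)
by_part1 = group_by_part1(rows)
print(by_part1["zh"])  # ['cmn', 'yue']
```

Keeping the three-letter code as the row identity and treating the two-letter code as a grouping key mirrors how the table is laid out.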
--------------------------------------------------------------------------------
The IETF BCP 47 language tag identifies human languages on the Internet; its subtags are maintained in the IANA language subtag registry.
Usage: Used by computing standards such as HTTP (eg: the 'Accept-Language' header), HTML, XML and PNG.
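
As an illustration of the HTTP usage, the sketch below reads an 'Accept-Language' header value and orders its language tags by their q weight. It is a simplified reading of the header syntax, not a complete parser: it assumes at most one 'q=' parameter per entry, and the function name parse_accept_language is hypothetical.

```python
def parse_accept_language(header):
    """Return (tag, q) pairs sorted by descending quality value."""
    result = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        q = 1.0  # default weight when no q parameter is given
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # assumption: treat a malformed weight as lowest priority
        result.append((tag.strip(), q))
    # sorted() is stable, so tags with equal weight keep their header order
    return sorted(result, key=lambda item: item[1], reverse=True)


prefs = parse_accept_language("ca, es;q=0.8, en;q=0.5")
print(prefs)  # [('ca', 1.0), ('es', 0.8), ('en', 0.5)]
```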
The EU maintains a controlled vocabulary of languages at https://op.europa.eu/en/web/eu-vocabularies/dataset/-/resource?uri=http://publications.europa.eu/resource/dataset/language
Usage: Its main purpose is to support activities associated with the publication process.
Eg:
id       authority_code  iso639p1  iso639p2b  iso639p2t  iso639p3  name_original
LNG0001  ENG             en        eng        eng        eng       English
LNG1168  CAT             ca        cat        cat        cat       català
LNG5250  POL             pl        pol        pol        pol       polski
LNG5896  SPA             es        spa        spa        spa       español
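
Because each row carries several code schemes side by side, the table can serve as a crosswalk between them. The sketch below indexes the example rows by one column so that another can be looked up. The in-memory tuple layout and the names FIELDS, EU_LANGUAGES and build_index are assumptions made for illustration; they are not part of the EU dataset.

```python
# Column order taken from the example table above.
FIELDS = (
    "id", "authority_code", "iso639p1",
    "iso639p2b", "iso639p2t", "iso639p3", "name_original",
)

# Rows copied from the example table; storing them as tuples is an assumption.
EU_LANGUAGES = [
    ("LNG0001", "ENG", "en", "eng", "eng", "eng", "English"),
    ("LNG1168", "CAT", "ca", "cat", "cat", "cat", "català"),
    ("LNG5250", "POL", "pl", "pol", "pol", "pol", "polski"),
    ("LNG5896", "SPA", "es", "spa", "spa", "spa", "español"),
]


def build_index(rows, key):
    """Index rows by the given column, returning {value: row-as-dict}."""
    pos = FIELDS.index(key)
    return {row[pos]: dict(zip(FIELDS, row)) for row in rows}


by_part1 = build_index(EU_LANGUAGES, "iso639p1")
print(by_part1["pl"]["iso639p3"])       # pol
print(by_part1["ca"]["authority_code"])  # CAT
```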
--------------------------------------------------------------------------------