We want to avoid cherry-picking features
More data-driven
Make use of as much (survey/atlas) data as we can
Comparing non-neighbouring dialects
See the overall patterns
We can then feed our data to various visualisation tools
Can always update our analysis with more data
Data digitisation
Distance calculation
Visualisation
Data digitisation is a crucial step, because without transferring the data from paper to a machine-readable file, none of the following steps are possible.
Distance calculation converts qualitative data (e.g. IPA transcriptions, dialect features) into quantitative data (distances). Once we have dialect distances, we can use tools such as cluster analysis and multidimensional scaling to visualise the relationships between the dialects in our dataset.
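To make this concrete, here is a minimal sketch of what a distance calculation could look like, using plain Levenshtein (edit) distance between transcriptions. The site names and word forms are invented for illustration, and real dialectometric work typically uses length-normalised or phonetically weighted alignment costs.

```python
# Minimal sketch: aggregate edit distance between sites (toy data).
from itertools import combinations

def levenshtein(a: str, b: str) -> int:
    """Classic edit distance between two transcriptions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

# Toy data: one transcription per concept per site (hypothetical forms).
data = {
    "Site A": ["hus", "mɛlk", "brot"],
    "Site B": ["huːs", "melk", "broːt"],
    "Site C": ["haus", "mɪlç", "broːt"],
}

# Aggregate distance between two sites = mean edit distance over all concepts.
sites = list(data)
for s1, s2 in combinations(sites, 2):
    d = sum(levenshtein(a, b) for a, b in zip(data[s1], data[s2])) / len(data[s1])
    print(f"{s1} vs {s2}: {d:.2f}")
```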
By visualising the dialect distances, we can explore the patterns hidden in our data. Existing methods from machine learning, combined with maps, allow us to interpret our data.
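As an illustration, the sketch below assumes a site-by-site distance matrix has already been computed (the values here are invented) and shows two standard views of it: an average-linkage dendrogram from cluster analysis and a two-dimensional MDS projection.

```python
# Minimal sketch: two common visualisations of a precomputed distance matrix.
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform
from sklearn.manifold import MDS

sites = ["Site A", "Site B", "Site C"]
D = np.array([[0.0, 0.8, 2.1],
              [0.8, 0.0, 1.9],
              [2.1, 1.9, 0.0]])

# Cluster analysis: average-linkage dendrogram over the condensed distances.
Z = linkage(squareform(D), method="average")
plt.figure()
dendrogram(Z, labels=sites)
plt.title("Cluster analysis of dialect distances")

# MDS: project the distances into two dimensions for a map-like scatter plot.
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(D)
plt.figure()
plt.scatter(coords[:, 0], coords[:, 1])
for (x, y), name in zip(coords, sites):
    plt.annotate(name, (x, y))
plt.title("MDS projection of dialect distances")
plt.show()
```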
In addition to understanding the global dialectal landscape, further analyses can be carried out, such as automatic feature extraction and correspondence detection. These tools help us explain the patterns we observe in the dialect distances.
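As a rough illustration of the idea behind correspondence detection, the hypothetical sketch below aligns pairs of cognate transcriptions with plain edit-distance alignment and counts the recurring segment pairs; actual correspondence detection relies on phonetically informed alignment and far larger datasets.

```python
# Minimal sketch: counting recurring segment correspondences (toy data).
from collections import Counter

def align(a: str, b: str):
    """Return aligned segment pairs (with '-' for gaps) from an edit-distance backtrace."""
    n, m = len(a), len(b)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        cost[i][0] = i
    for j in range(m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(cost[i - 1][j] + 1,
                             cost[i][j - 1] + 1,
                             cost[i - 1][j - 1] + (a[i - 1] != b[j - 1]))
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            pairs.append((a[i - 1], b[j - 1])); i -= 1; j -= 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((a[i - 1], "-")); i -= 1
        else:
            pairs.append(("-", b[j - 1])); j -= 1
    return pairs[::-1]

# Toy cognate pairs between two hypothetical sites.
cognates = [("hus", "huːs"), ("mɛlk", "melk"), ("brot", "broːt")]

# Count the mismatching segment pairs across all aligned cognates.
correspondences = Counter()
for a, b in cognates:
    for x, y in align(a, b):
        if x != y:
            correspondences[(x, y)] += 1

for (x, y), count in correspondences.most_common():
    print(f"{x} ~ {y}: {count}")
```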
I have been giving workshops and summer school courses on traditional dialectology/dialect geography and dialectometry for the past two years. You can find the syllabi of these workshops here.
If you are interested in organising a workshop on dialectometry/dialectology for your students, please contact me at h.w.m.sung[at]hum.leidenuniv[dot]nl.