Digitized Vietnamese Historical Texts?

From: Vsg <vsg-bounces@mailman11.u.washington.edu> On Behalf Of Hieu Phung

Sent: Friday, June 12, 2020 3:34 PM

To: VSG <vsg@u.washington.edu>

Subject: Re: [Vsg] Digitized Vietnamese Historical Texts?

Dear George and colleagues,

Thank you so much for raising this topic. Someone has retyped the entire DVSKTT and part of the “messy” text of the DVSK tục biên and uploaded them to Wikisource: https://zh.wikisource.org/w/index.php?title=Special:搜索&search=大越史記全書 They have done the same for some other Han-Chinese texts as well. While they did a great job of making searchable texts available to the public, these online texts need to be used with great caution. My guess is that the person (or group of people) who typed them is Chinese, and they did make quite a lot of typos.

While we shouldn’t be concerned with the different texts of the DVSKTT (so whichever text served as the base for this Wikisource version doesn’t matter), we do not know which text the online part of the DVSK tục biên (i.e., the dynastic history of the Later Lê) was based on.

To do OCR for the DVSKTT, we might want to use the Chen Ching-ho version or the recent DVSKTT version reproduced by some Chinese scholars. However, based on my experience with ctext.org, the quality of Vietnamese texts might end up requiring us to heavily edit the resultant texts.

George, I don’t know of (and don’t think there are) any digital versions of the Chu Han originals of Phan Boi Chau’s work (if there were, someone would have announced them across social media), but I will send you an additional email on digitized Vietnamese historical texts.

Best regards,

Hieu Phung

From: "Dutton, George" <dutton@humnet.ucla.edu>

Date: Thursday, June 11, 2020 at 21:09

To: VSG <vsg@u.washington.edu>

Subject: [Vsg] Digitized Vietnamese Historical Texts?

Dear VSG,

I’m using some of this quarantine-induced downtime (such as it is) to work on the digitization of some Vietnamese historical texts in Chu Han. Slow work, obviously. Anyway, the idea is that this digitization work (currently through scanning, OCR, and proofreading) might eventually contribute to the creation of a larger-scale corpus of Vietnamese Chu Han texts that could be accessed and searched online.

My question to the group is: which original historical texts (not the modern quoc ngu translations) might any of you know about that have already been digitized and made available? I know of the Nom Foundation’s _Dai Viet Su Ky Toan Thu_ (though it is not searchable by Han character), and I know that Liam Kelley has created an online version of the _Linh Nam Chich Quai_. I just want to avoid duplicating work that has already been done. I’d be curious, for example, whether there are digital versions of the Chu Han originals of Phan Boi Chau’s work.

Thanks in advance for any information that people might offer. I do hope that in the longer term, we can create a systematic effort (perhaps even with funding) to do some of this work. We’re light years behind similar projects for Chinese and Japanese textual corpuses, but we need to start somewhere . . .

Best,

George

_______________________________________________

George Dutton

Director, UCLA Center for Southeast Asian Studies

Professor, UCLA Department of Asian Languages and Cultures

290 Royce Hall

Box 951540

Los Angeles, CA 90095-1540

Pronouns: He/Him/His

tel: (310) 825-0523

fax: (310) 825-8808

http://www.alc.ucla.edu/person/george-e-dutton/