It occurs to me that the files containing the Western Oceanic data I collected in the late 70s/early 80s for my PhD might be useful to someone. In any case, it is right that they be made publicly available, something that wasn't so easy back then. Most of the material is from wordlists that I collected during fieldwork in Papua New Guinea from around 1978 to 1982. The SE Solomonic data are from Tryon & Hackman 1983 (this is outside Western Oceanic).
I keyed the data into text files in a format such that each line was the entry for a single word, and each field within an entry was marked by a backslash code (I adapted this format from SIL's conventions at the time). I then arranged the entries in cognate sets, each set separated from the next by an empty line. This work was done between 1983 and 1985, when text files were the best way to store data. The files were entered on a terminal connected to a mainframe computer at the ANU. I have converted the ASCII symbols used in the original files into UTF-8.
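For anyone wanting to read the files programmatically, the layout just described (one entry per line, backslash-coded fields, cognate sets separated by empty lines) could be parsed along the following lines. This is only a minimal sketch: the field codes `\lg`, `\fm` and `\gl`, the sample forms, and the assumption that each code is followed by a space and then its value are all invented for illustration, so check them against the actual files and the two key files before relying on it.

```python
import re

# Assumed field syntax (hypothetical): a backslash, a short code, optional
# whitespace, then the value, which runs up to the next backslash or end of line.
FIELD_RE = re.compile(r"\\(\w+)\s*([^\\]*)")


def parse_entry(line):
    """Turn one entry line into a dict mapping field code -> value."""
    return {code: value.strip() for code, value in FIELD_RE.findall(line)}


def parse_cognate_sets(text):
    """Split the text into cognate sets (blocks separated by empty lines)."""
    sets, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(parse_entry(line))
        elif current:
            # An empty line closes the current cognate set.
            sets.append(current)
            current = []
    if current:
        sets.append(current)
    return sets


# Invented sample data; the codes and forms are placeholders, not real entries.
SAMPLE = (
    "\\lg AAA \\fm kapa \\gl gloss1\n"
    "\\lg BBB \\fm kaba \\gl gloss1\n"
    "\n"
    "\\lg AAA \\fm tilu \\gl gloss2\n"
)

cognate_sets = parse_cognate_sets(SAMPLE)
```

With the invented sample, `cognate_sets` holds two sets, the first with two entries and the second with one. To read a real file, something like `parse_cognate_sets(open(path, encoding="utf-8").read())` should work, given that the files are now in UTF-8.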
Each file contains languages from one region, as listed below; the regions sometimes cut across subgroups determined by the comparative method. Three-letter abbreviations are used for language names, and two key files are also provided. The files are now located on Zenodo at https://doi.org/10.5281/zenodo.7878855.
Files are: 1-3 New Ireland; 4 Willaumez Peninsula (New Britain) area; 5 NW Solomonic; 6 ; 7+8 Papuan Tip; 9 Vitiaz Strait area and NG north coast; 10 Huon Gulf and Markham Valley; 11 South and west New Britain. Files 7 and 8 are partial only. When I keyed the files, I had to rely on the mainframe's daily back-up onto tape spools. One night the system failed, and so did the restore, and I lost some data.
The formatting of these files is a little odd, since they served as input to routines I wrote to pull out sound correspondences. Anything after '%' is the elicited form; what immediately precedes '%' is the same form with some change 'undone', e.g. metathesis.
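As a rough illustration of the '%' convention, a form value could be separated into its two parts as below. This is a sketch under assumptions rather than a description of my original routines: it assumes that a value contains at most one '%' and that everything before it is the adjusted form; the function name `split_form` and the sample forms are invented.

```python
def split_form(value):
    """Split a form value on '%' into the adjusted and the elicited form.

    Assumption (hypothetical): at most one '%' per value, with the adjusted
    form before it and the elicited form after it.
    """
    before, sep, after = value.partition("%")
    if not sep:
        # No '%': the value is simply the elicited form, with nothing undone.
        return {"elicited": value.strip(), "adjusted": None}
    return {"elicited": after.strip(), "adjusted": before.strip()}


# Invented example: 'kapa' stands for a form with a metathesis undone,
# 'kpaa' for the form as elicited.
with_marker = split_form("kapa%kpaa")
without_marker = split_form("kapa")
```

Here `with_marker["elicited"]` is `"kpaa"` and `with_marker["adjusted"]` is `"kapa"`, while `without_marker["adjusted"]` is `None`.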
The orthography of the files is phonemic and largely obvious. The conventions are set out in the introductions to the volumes of The lexicon of Proto Oceanic.
Finally, the files also contain reconstructions at various interstages at the top of a cognate set. These were inserted for heuristic reasons during my research. Many of them did not survive into my PhD thesis, and they should preferably be ignored. The reader who is interested in Oceanic reconstructions should turn to the volumes of The lexicon of Proto Oceanic.