Details about using the runAdjHm.sh script
Check homograph numbers with the runAdjHm.sh script from the SIL dictionary-lexical-services AdjustHomographs repository on GitHub.
This script reports if there are duplicate or missing homograph numbers.
It adds homograph numbers to forms that need them (starting just after the highest number that exists for that form, even if the existing numbers are not contiguous).
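The numbering rule can be illustrated with a small shell sketch. The function name next_hm is hypothetical and is not taken from AdjstHm.pl; it only shows the "one past the highest existing number" behaviour described above, under the assumption that the existing numbers for one form are passed as arguments.

```shell
# Hypothetical helper, not part of AdjstHm.pl: given the homograph numbers
# already used for one form, print the next number to assign.
next_hm() {
  max=0
  for n in "$@"; do
    if [ "$n" -gt "$max" ]; then
      max=$n
    fi
  done
  echo $((max + 1))
}

next_hm 1 2 5   # prints 6; the gap at 3 and 4 is not filled
```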
If a record has both an \lx and an \lc field, only the \lc field is checked for matching homographs.
It flags subentry fields that are reference fields. A flagged subentry is ignored when the homographs are renumbered.
A reference subentry is one that is immediately followed by:
another \se marker
a date marker
the end of the record
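The three conditions above can be sketched as a small awk filter wrapped in a shell function. This is only an illustration of the rule, not the code in FlagseReF.pl. It assumes that a single record arrives on standard input, that subentries use the \se marker, and that the date marker is \dt (an assumption based on common MDF usage). The function name flag_ref_se is hypothetical.

```shell
# Hypothetical sketch: rewrite \se as \REFse when the subentry is immediately
# followed by another \se, by a \dt (assumed date marker), or by the end of
# the record. Reads one record on stdin and writes it to stdout.
flag_ref_se() {
  awk '
    NF { line[++n] = $0 }          # keep the non-blank lines of the record
    END {
      for (i = 1; i <= n; i++) {
        if (line[i] ~ /^\\se( |$)/ &&
            (i == n || line[i+1] ~ /^\\(se|dt)( |$)/))
          line[i] = "\\REFse" substr(line[i], 4)   # replace the 3-char \se
        print line[i]
      }
    }
  '
}

# "run away" has its own \ps field, so it stays \se;
# "run off" is followed directly by \dt, so it is flagged.
printf '%s\n' '\lx run' '\se run away' '\ps v' '\se run off' '\dt 01/Jan/2020' |
  flag_ref_se
```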
The script requires the following files and scripts in the current directory:
AdjstHm.ini from https://github.com/sil-dictionary-lexical-services/AdjustHomographs
AdjstHm.pl from https://github.com/sil-dictionary-lexical-services/AdjustHomographs
FlagseReF.pl from https://github.com/sil-dictionary-lexical-services/AdjustHomographs
de_opl.pl from https://github.com/sil-dictionary-lexical-services/Opl_DeOplStub/tree/master/Piped
opl.pl from https://github.com/sil-dictionary-lexical-services/Opl_DeOplStub/tree/master/Piped
runAdjHm.sh from https://github.com/sil-dictionary-lexical-services/AdjustHomographs
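Before running, it can help to confirm that all six files are present. The following is a minimal sketch; the function name check_required is hypothetical and is not part of either repository.

```shell
# Hypothetical pre-flight check: report which required files are absent
# from the given directory (default: the current directory).
check_required() {
  dir=${1:-.}
  missing=0
  for f in AdjstHm.ini AdjstHm.pl FlagseReF.pl de_opl.pl opl.pl runAdjHm.sh; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $f"
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]   # succeed only when nothing is missing
}

check_required . || echo "Fetch the missing files before running runAdjHm.sh"
```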
Before running, make sure all the scripts and control files have Linux (LF) line endings. The easiest way to do that is:
dos2unix *
By default, runAdjHm.sh operates on a file named adjust.sfm. You can temporarily rename your file to that name, or you can set the filename in a variable named dbname like this:
dbname=myfile.sfm ./runAdjHm.sh
The process creates two files named after the original input file, with the extension changed. The error file (.err) tells you which items have an invalid homograph number. The log file (.log) indicates which items have been assigned a new homograph number.
Your original file is retained with a .bkp extension. A new file with a .ref extension has the reference subentries marked with a marker like \REFse. The file with the original name has the subentry reference flags and the new homograph numbers.
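As a quick reference, the file names to expect can be derived from the input name. This sketch assumes that each extension replaces the input file's own extension (for example myfile.sfm gives myfile.err). The function name list_outputs is hypothetical.

```shell
# Hypothetical helper: list the files expected next to the input file,
# assuming each extension replaces the input file's own extension.
list_outputs() {
  input=${1:-adjust.sfm}
  base=${input%.*}            # strip the last extension, e.g. .sfm
  # err: invalid homograph numbers   log: newly assigned numbers
  # bkp: copy of the original input  ref: reference subentries flagged
  for ext in err log bkp ref; do
    echo "$base.$ext"
  done
}

list_outputs "${dbname:-adjust.sfm}"
```

With dbname unset, this lists adjust.err, adjust.log, adjust.bkp and adjust.ref.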
You can customize the behaviour of the script by adjusting the .ini file. Details are in the README file.