Forced alignment, a technique for automatically time-aligning segment-level annotations with audio recordings, is a valuable tool for phonetic analysis. While forced alignment holds great promise for phonetic fieldwork and language documentation, training a functional, custom forced alignment model requires at least several hours of accurately transcribed audio in the target language, a resource that is often unavailable in language documentation contexts. We explore a model training technique which sidesteps this limitation by pooling smaller quantities of data from genetically related languages to train a forced aligner. Using data from two Mayan languages, we show that this technique produces an effective forced alignment system even with relatively small amounts of data. We also discuss factors which affect the accuracy of training on mixed data sets of this type, and provide recommendations on how to balance data from the pooled languages.
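As a concrete illustration of the pooling step described above, the sketch below combines paired audio and transcription files from two related-language corpora into a single training corpus at a configurable ratio. This is a minimal, hypothetical example: the corpus paths, the `ratio_a` parameter, the flat directory layout, and the use of a Montreal Forced Aligner-style training command are all assumptions for illustration, not the authors' actual pipeline or toolkit.

```python
# Hypothetical sketch of pooling transcribed data from two related
# languages into one training corpus for a forced aligner. All names
# and paths are illustrative assumptions, not from the paper.
import random
import shutil
from pathlib import Path


def pool_corpora(corpus_a: Path, corpus_b: Path, out_dir: Path,
                 ratio_a: float = 0.5, seed: int = 0) -> None:
    """Copy paired .wav/.TextGrid files from two corpora into `out_dir`,
    with corpus A contributing `ratio_a` of the pooled total (0 < ratio_a < 1)."""
    rng = random.Random(seed)
    out_dir.mkdir(parents=True, exist_ok=True)

    def paired_files(corpus: Path) -> list[Path]:
        # Keep only recordings that have a matching transcription.
        return [w for w in sorted(corpus.glob("*.wav"))
                if w.with_suffix(".TextGrid").exists()]

    files_a, files_b = paired_files(corpus_a), paired_files(corpus_b)
    # Size the pooled set so the requested ratio is achievable from both sides.
    n_total = min(int(len(files_a) / ratio_a),
                  int(len(files_b) / (1.0 - ratio_a)))
    n_a = int(n_total * ratio_a)
    sample = rng.sample(files_a, n_a) + rng.sample(files_b, n_total - n_a)

    for wav in sample:
        # Prefix filenames with the source corpus name to avoid collisions.
        stem = f"{wav.parent.name}_{wav.stem}"
        shutil.copy(wav, out_dir / f"{stem}.wav")
        shutil.copy(wav.with_suffix(".TextGrid"), out_dir / f"{stem}.TextGrid")


if __name__ == "__main__":
    # Hypothetical corpus locations for the two pooled languages.
    pool_corpora(Path("corpora/lang_a"), Path("corpora/lang_b"),
                 Path("corpora/pooled"), ratio_a=0.5)
    # The pooled directory could then feed an acoustic-model trainer,
    # e.g. with the Montreal Forced Aligner (assumed here, not specified
    # by the paper): mfa train corpora/pooled shared.dict pooled_model.zip
```

Varying `ratio_a` in a sketch like this is one way to probe the balancing question the abstract raises, i.e., how much each language should contribute to the pooled training set.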