We measured the evolutionary distance between species as the cophenetic distance derived from the Zoonomia Consortium 240-way Cactus alignment. We then ranked the species from least to greatest evolutionary divergence from human and used this ranking to place transposition events in time.
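The ranking step can be illustrated with a minimal sketch. It assumes the cophenetic distance between two species is the summed branch length along the tree path joining them. The tree, its internal node names, and its branch lengths are toy values chosen for illustration; they are not taken from the Zoonomia alignment, and the function names are hypothetical.

```python
# Toy tree stored as child -> (parent, branch_length).
# Branch lengths and internal node names are illustrative assumptions only.
TOY_TREE = {
    "Human": ("HC", 1.0),
    "Chimp": ("HC", 1.0),
    "HC": ("HCO", 2.0),
    "Orangutan": ("HCO", 3.0),
    "HCO": ("root", 4.0),
    "Mouse": ("root", 7.0),
}


def path_to_root(tree, node):
    """Map each ancestor of `node` (and `node` itself) to its distance from `node`."""
    distances = {node: 0.0}
    total = 0.0
    while node in tree:
        parent, length = tree[node]
        total += length
        distances[parent] = total
        node = parent
    return distances


def cophenetic_distance(tree, a, b):
    """Summed branch length on the path between leaves `a` and `b`."""
    from_a = path_to_root(tree, a)
    from_b = path_to_root(tree, b)
    # The most recent common ancestor is the shared node with the smallest total.
    return min(from_a[n] + from_b[n] for n in from_a if n in from_b)


def rank_by_divergence(tree, reference, species):
    """Order `species` from least to greatest distance from `reference`."""
    return sorted(species, key=lambda s: cophenetic_distance(tree, reference, s))


if __name__ == "__main__":
    for name in rank_by_divergence(TOY_TREE, "Human", ["Mouse", "Orangutan", "Chimp"]):
        print(name, cophenetic_distance(TOY_TREE, "Human", name))
```

In practice the distances would come from the tree associated with the alignment rather than from a hand-written dictionary; the sketch only shows how a ranked species list follows from pairwise distances.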
The UCSC LiftOver tool translates genome coordinates and annotation files from one assembly to another based on a reference alignment. In this project, we used LiftOver to lift human repeats over to the target species and then lifted them back to the human genome to confirm the accuracy of the mapped loci. We quantified the loci that were conserved through both the forward and the reverse lift.
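The reciprocal check can be sketched as follows. The sketch assumes the forward and reverse LiftOver outputs have already been parsed into dictionaries keyed by a repeat identifier, with `None` marking a locus that failed to map. The function names, the data layout, and the `tolerance` parameter are hypothetical and introduced only for illustration; the text does not state how much coordinate drift, if any, was accepted.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end)


def reciprocal_pass(
    original: Dict[str, Interval],
    forward: Dict[str, Optional[Interval]],
    reverse: Dict[str, Optional[Interval]],
    tolerance: int = 0,
) -> List[str]:
    """Return ids of repeats whose round trip lands back on the original human locus.

    `tolerance` (an assumption) is the maximum allowed shift, in bases,
    of each boundary after the reverse lift.
    """
    kept = []
    for repeat_id, (chrom, start, end) in original.items():
        if forward.get(repeat_id) is None:
            continue  # did not map to the target species
        back = reverse.get(repeat_id)
        if back is None:
            continue  # did not map back to human
        back_chrom, back_start, back_end = back
        if (
            back_chrom == chrom
            and abs(back_start - start) <= tolerance
            and abs(back_end - end) <= tolerance
        ):
            kept.append(repeat_id)
    return kept


def retention_fraction(original: Dict[str, Interval], kept: List[str]) -> float:
    """Fraction of the original repeats that survived the round trip."""
    return len(kept) / len(original) if original else 0.0
```

Reporting the retained fraction per target species gives one number for how much of the human repeat set survives both lifts.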
We assessed the biological relevance of our data by analyzing evolutionary trends. For instance, if a TE appeared in Bornean orangutan, was absent in intermediate lineages, and then reappeared in human, we excluded it because such a re-emergence is improbable. In contrast, TEs that appeared consistently across the lineages leading to human, or that were specific to a single species, were retained for further analysis. Using the data points that passed this filter, we assigned each TE to an age class: human-specific elements were classified as new; TEs with a consistent score, indicating that they have been continually passed down to human, were classified as old; and all remaining elements were classified as medium.
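The filter and the age classes can be sketched under explicit assumptions. Each TE is represented by a presence vector ordered by the divergence ranking, with human first. A TE is taken to pass when it never reappears in a more distant species after an absence in a closer one. "Consistent score" is interpreted here as presence in every species of the ranked set. These interpretations and the function names are illustrative assumptions, not the exact criteria used in the analysis.

```python
from typing import Optional, Sequence


def passes_trend(presence: Sequence[bool]) -> bool:
    """True if the TE is present in human and shows no gap along the ranking.

    presence[0] is human; later entries are increasingly divergent species.
    """
    if not presence or not presence[0]:
        return False
    seen_absence = False
    for present in presence[1:]:
        if not present:
            seen_absence = True
        elif seen_absence:
            return False  # reappears after an absence: improbable re-emergence
    return True


def classify_age(presence: Sequence[bool]) -> Optional[str]:
    """Return 'new', 'medium', or 'old', or None if the TE fails the filter."""
    if not passes_trend(presence):
        return None
    others = presence[1:]
    if not any(others):
        return "new"  # human-specific
    if all(others):
        return "old"  # present in every ranked species (assumed meaning of a consistent score)
    return "medium"


if __name__ == "__main__":
    # Columns: human, then three hypothetical species in order of divergence.
    for vector in ([True, False, False, False],
                   [True, True, False, False],
                   [True, True, True, True],
                   [True, False, False, True]):
        print(vector, classify_age(vector))
```

A vector such as present, absent, absent, present corresponds to the orangutan example in the text and is rejected by the filter before any age class is assigned.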