As an example of how molecules can be used to infer relationships,
consider globin genes in vertebrates. The globin gene complex itself is
quite complicated in structure, consisting of several clusters or families
of duplicated genes and pseudogenes. However, if we restrict our
attention to one of those genes, for which there are identifiable homologs
(sharing a direct evolutionary lineage) in the taxa of interest, the story
simplifies considerably. Examination of the amino acid sequences of
-globin for human, rhesus monkey, cow, platypus, chicken, and shark
reveals that there are some differences in sequence among the taxa. It is
possible to draw a "family tree" for -globin in these taxa by
considering all pairwise comparisons between taxa, and making the
seemingly reasonable assumption that similarity of sequence implies
relatedness of sequence. Counting up pairwise differences in amino acid
sequence produces the following matrix:

corresponding to the tree

As required, pairs with fewer differences (such as human and rhesus monkey) are placed closer together on the tree than pairs with more differences. A convenient feature of this particular example is that the same logic can be applied to every pair of taxa and still yields a single, unambiguous tree topology. This is not always (or even usually) the case.
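
The procedure described above, counting pairwise differences and then grouping the most similar taxa first, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four ten-residue sequences are invented toy strings (not real globin data), the names `count_differences`, `difference_matrix`, and `cluster_tree` are hypothetical, and the grouping step uses simple average-linkage clustering (often called UPGMA), which is one common way to turn a distance matrix into a tree and not necessarily the method used to draw the tree in the text.

```python
from itertools import combinations


def count_differences(seq_a, seq_b):
    """Count aligned positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def difference_matrix(sequences):
    """Map each unordered pair of taxa to its number of differing positions."""
    return {
        frozenset((x, y)): count_differences(sequences[x], sequences[y])
        for x, y in combinations(sequences, 2)
    }


def cluster_tree(dist, taxa):
    """Join the two closest clusters repeatedly (average linkage).

    Returns a nested, Newick-like string such as "((a,b),c)".
    """
    dist = dict(dist)                    # work on a copy
    sizes = {t: 1 for t in taxa}         # number of taxa in each cluster
    while len(sizes) > 1:
        a, b = sorted(min(dist, key=dist.get))   # closest pair of clusters
        merged = f"({a},{b})"
        for other in sizes:
            if other in (a, b):
                continue
            d_a = dist.pop(frozenset((a, other)))
            d_b = dist.pop(frozenset((b, other)))
            # Size-weighted mean distance from the new cluster to `other`.
            dist[frozenset((merged, other))] = (
                (sizes[a] * d_a + sizes[b] * d_b) / (sizes[a] + sizes[b])
            )
        del dist[frozenset((a, b))]
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))


# Invented toy sequences, for illustration only (NOT real globin data).
toy_sequences = {
    "taxon_a": "MVLSPADKTN",
    "taxon_b": "MVLSPADKSN",
    "taxon_c": "MVLSAADKSQ",
    "taxon_d": "MALTAGDRSQ",
}

matrix = difference_matrix(toy_sequences)
tree = cluster_tree(matrix, toy_sequences)
print(tree)  # (((taxon_a,taxon_b),taxon_c),taxon_d)
```

With these toy sequences the closest pair (one difference) is joined first, and each remaining taxon attaches in order of increasing average distance. Real data often contain ties or conflicting signals, in which case different clustering rules can produce different topologies.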