Multigene families are groups of genes within an organism that have identical or similar sequences. The similarity may extend over the entire sequence or be limited to specific domains. For a multigene family, both alleles must be copied and be present in subsequent generations. Gene duplication is thought to be the major mechanism behind the origin of such multigene families.
The individual genes of a multigene family usually have different functions. In fact, the occurrence of multigene families is considered evidence that new genes form through gene duplication and divergence.
Multigene families differ from sets of related genes that code for similar proteins or peptide chains. Such identical or similar genes code for isoforms of the same protein or enzyme, whereas the members of a multigene family code for functionally different proteins even though they share some sequence similarity.
Types of multigene families
Apart from simple multigene families, there are complex multigene families whose individual genes yield distinct products although they are similar in sequence. A common example is the globin genes in humans. The alpha and beta chains of the mammalian hemoglobin molecule are encoded by multigene families on chromosomes 16 and 11, respectively. Although the sequences are similar, the chains they encode are different.
The sequence similarity between the different chains has been reported to be 79.1 percent, a level at which the two gene products might be expected to be nearly identical. Yet the chains are different and have distinct biochemical characteristics. The difference is accounted for by the differential expression of the genes at different stages of development. These biochemical differences play a significant role in how the hemoglobin molecule functions at each developmental stage.
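A similarity figure of this kind is usually a percent identity measured over an alignment. As a minimal sketch, assuming the two sequences are already aligned to equal length, the calculation might look like the following. The function name `percent_identity` and the toy sequences are invented for illustration and are not real globin data.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of compared alignment columns in which both sequences match.

    Assumes seq_a and seq_b are already aligned (equal length).
    Columns where either sequence has a gap character are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned sequences, made up for this example (not real globin data).
toy_a = "ATGGTGCACCTGACT"
toy_b = "ATGGTGCTCCTGTCT"
print(f"{percent_identity(toy_a, toy_b):.1f}% identical")
```

Real comparisons would first align the sequences with a dedicated alignment tool. How gaps are counted is a choice that changes the resulting percentage, so published figures are comparable only when the same convention is used.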
The globin genes are clustered, whereas the members of some multigene families are dispersed throughout the genome. An example is the human genes coding for the enzyme aldolase. Even so, sequence similarity is a strong reason to treat the individual dispersed genes as members of the same multigene family. The presence of such similar genes across species points to a common ancestor.
Comparing sequences within and between gene families can provide substantial information on evolutionary relatedness. Hence, phylogenetic trees are derived from these multigene families.
Gene superfamilies are groups of two or more multigene families. The globin genes can be considered one such superfamily: the alpha and beta globins form two different multigene families within the same group. They are clearly distinguishable from one another while sharing considerable sequence similarity.
A clan is a group of protein families that share the same function but have no established phylogenetic relationship to one another. Unlike the usual multigene families, clans are the result of convergent evolution.
Similar genes within multigene families can be classified as orthologs or paralogs according to the nature of the similarity. Genes that occupy the same locus in different species and have similar functions are called orthologs; divergence between them is considered the result of species divergence. Similar genes at different loci within the same species are called paralogs. These occur within the same individual.
The beta globin genes of humans and chickens are orthologs, whereas the alpha and beta globin genes within individuals of the same taxon are paralogs. In this case, gene duplication must have occurred before gene divergence. Duplication followed by gene loss and amplification results in the diverse gene sets of different individuals.
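The ortholog and paralog distinction described here can be written as a small rule. The sketch below is a deliberate simplification that follows only the definitions given in this article (same locus in different species versus different loci in the same species). The `Gene` class, its fields and the `classify_pair` function are hypothetical names introduced for illustration. Real orthology assignment relies on phylogenetic analysis, not on labels alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    species: str  # e.g. "human"
    locus: str    # e.g. "beta-globin"
    family: str   # family or superfamily label, e.g. "globin"


def classify_pair(g1: Gene, g2: Gene) -> str:
    """Label a gene pair using the simplified definitions from the text."""
    if g1.family != g2.family:
        return "unrelated"
    if g1 == g2:
        return "same gene"
    if g1.species != g2.species and g1.locus == g2.locus:
        return "orthologs"   # same locus, different species
    if g1.species == g2.species and g1.locus != g2.locus:
        return "paralogs"    # different loci, same species
    return "other homologs"  # different locus AND different species


human_beta = Gene("human", "beta-globin", "globin")
chicken_beta = Gene("chicken", "beta-globin", "globin")
human_alpha = Gene("human", "alpha-globin", "globin")

print(classify_pair(human_beta, chicken_beta))  # orthologs
print(classify_pair(human_alpha, human_beta))   # paralogs
```

The "other homologs" branch covers pairs such as human alpha globin and chicken beta globin, which the simple two-way definition does not name.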
Multigene families and evolution
Comparative genomic studies have revealed that ancestral eukaryotic genes are the precursors of currently functioning genes. New genes have evolved as a result of gene duplication and divergence. Since the individual genes within a species diverge faster, sequence similarity between species is greater than the similarity among gene sequences within a species. Speciation and divergence events can be estimated by analysing sequence similarity, and this forms the basis for using multigene families in the construction of phylogenetic trees.
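To illustrate how pairwise similarity can be turned into a tree, here is a minimal sketch of average-linkage clustering (UPGMA) over simple p-distances. UPGMA is chosen here only because it is short. The article does not say which tree-building method is used in practice. The function names and the four toy sequences are assumptions made up for this example.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs: dict):
    """Build a tree by repeatedly merging the two closest clusters.

    Returns a nested tuple of sequence names, or a single name if only
    one sequence is given.
    """
    if not seqs:
        raise ValueError("need at least one sequence")
    # Map each subtree (a name or a nested tuple) to its member names.
    clusters = {name: [name] for name in seqs}
    # Pairwise distances between the original sequences.
    dist = {frozenset(pair): p_distance(seqs[pair[0]], seqs[pair[1]])
            for pair in combinations(seqs, 2)}

    def average(c1, c2) -> float:
        # Mean distance over all member pairs of the two clusters.
        pairs = [(m, n) for m in clusters[c1] for n in clusters[c2]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: average(*p))
        merged = clusters.pop(c1) + clusters.pop(c2)
        clusters[(c1, c2)] = merged
    return next(iter(clusters))


# Toy aligned sequences, invented for illustration (not real globin data).
toy = {
    "human_beta": "ATGGTGCACC",
    "chicken_beta": "ATGGTGCACT",
    "human_alpha": "ATGCTGTCTC",
    "chicken_alpha": "ATGCTGTCTG",
}
print(upgma(toy))
```

With these toy sequences the two beta genes cluster together and the two alpha genes cluster together, which mirrors the pattern described above: copies of the same gene in different species are more alike than different family members within one species.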
Examples of multigene families
The most prominent examples include the genes for ribosomal RNAs in eukaryotes. The human genome alone has 2000 genes for 5S rRNA. These are considered simple or classical multigene families. The presence of multiple copies of these genes enables cells to synthesize ribosomal RNAs in abundance, especially during cell division. Multigene families of actins, immunoglobulins, interferons, tubulins, hemoglobins, histones and others have also been identified.