Bio-informatics is the application of information technology to molecular biology. It combines statistics and computer science to address problems in that field.
Paulien Hogeweg, together with Ben Hesper, coined the term bio-informatics in 1970. She wanted to study informatic processes in biotic systems. Since the late 1980s the field has been applied mainly in genetics and genomics, especially in the areas of genomics that involve large-scale DNA sequencing. Bio-informatics is now used to develop algorithms, to create and advance databases, and to apply statistical and computational techniques to problems that arise from biological data.
Mapping and analyzing DNA and protein sequences are the common activities of bio-informatics. The field is also concerned with aligning different DNA and protein sequences in order to compare them, and with creating and viewing 3D models of protein and DNA structures. The purpose of bio-informatics is to understand different biological processes. What sets bio-informatics apart from other approaches is its focus on developing and applying computationally intensive techniques. Most of the research is done in genome assembly, sequence alignment, gene finding, drug discovery, protein structure alignment and protein structure prediction.
Research Areas:-
Bio-informatics is a vast field with many research areas, including:
Sequence analysis:-
Since phage phi-X174 was sequenced in 1977, the DNA sequences of thousands of organisms have been decoded and stored in databases. This sequence information is analyzed to determine the genes that encode polypeptides, as well as RNA genes, regulatory motifs and repetitive sequences. Comparing sequences within a species or between species gives information about protein sequences and protein structures. Although scientists had all this DNA and protein information available as biological data, extracting knowledge from it by hand was very time consuming. Computer programs have now made it practical to analyze the data computationally. Today BLAST, a computer program, is used to search sequences from many organisms, containing billions of nucleotides, often within seconds. Programs like BLAST can compensate for mutations in DNA sequences and so identify sequences that are related to each other but not identical. Shotgun sequencing is a technique that was first used to sequence a bacterial genome. It does not produce entire chromosomes. Instead it produces the sequences of many small DNA fragments, which must then be assembled computationally.
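To make the idea of "related but not identical" sequences concrete, here is a minimal sketch of a local alignment score in the style of the Smith-Waterman algorithm. This is not how BLAST itself is implemented (BLAST uses faster heuristics); the function name and the scoring values (match, mismatch, gap) are assumptions chosen only for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not a standard matrix.
    """
    prev = [0] * (len(b) + 1)  # scores for the previous row of the table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the diagonal with a match or a mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


identical = local_alignment_score("ACGTACGT", "ACGTACGT")
related = local_alignment_score("ACGTACGT", "ACGAACGT")    # one substitution
unrelated = local_alignment_score("AAAAAAAA", "TTTTTTTT")
print(identical, related, unrelated)
```

A sequence with a single substitution still scores close to the identical case, while an unrelated sequence scores zero, which is the kind of signal a similarity search relies on.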
Genome Annotation:-
Another aspect of sequence analysis in bio-informatics is annotation. Annotation is the technique of marking genes within a DNA sequence, and it involves finding protein-coding genes and RNA genes computationally. Not all nucleotides within a genome are part of genes, and in the human genome large parts of the DNA have no known function. The Institute for Genomic Research was the first to decode and analyze the genome of a free-living organism, the bacterium Haemophilus influenzae. The programs used to analyze genome sequences are constantly being improved.
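As a rough illustration of computational gene finding, the sketch below scans the forward strand of a DNA string for open reading frames (a start codon ATG followed in the same frame by a stop codon). Real annotation programs use far more evidence than this; the function name and the min_codons threshold are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end) index pairs of forward-strand open reading frames.

    start points at the A of ATG; end is exclusive and includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):  # the three forward reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


print(find_orfs("CCATGAAATAGGG"))
```

Here the only open reading frame is ATG AAA TAG, found in the third reading frame. A fuller sketch would also scan the reverse complement strand.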
Analysis of Gene Expression:-
The expression of genes can be determined by measuring the levels of messenger RNA (mRNA) with techniques such as microarrays, serial analysis of gene expression (SAGE) and expressed cDNA sequence tag (EST) sequencing. These measurements are noisy, so a major use of computational biology is to develop statistical tools that separate signal from noise in gene expression data. Such studies are used to determine which genes are implicated in a disorder. For example, microarray data from cancerous cells can be compared with data from non-cancerous cells.
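One simple way to compare expression between cancerous and non-cancerous cells is a log2 fold change per gene. The sketch below assumes expression values are already normalized and stored as lists of replicate measurements keyed by gene name; the gene names, the pseudocount and the function name are hypothetical, and a real analysis would add normalization and a statistical test.

```python
import math
from statistics import mean


def log2_fold_changes(cancerous, normal, pseudocount=1.0):
    """Return log2(mean cancerous / mean normal) for genes present in both sets.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    changes = {}
    for gene in cancerous.keys() & normal.keys():
        c = mean(cancerous[gene]) + pseudocount
        n = mean(normal[gene]) + pseudocount
        changes[gene] = math.log2(c / n)
    return changes


# Hypothetical replicate measurements for two genes.
cancerous = {"GENE_A": [7, 7], "GENE_B": [3, 3]}
normal = {"GENE_A": [1, 1], "GENE_B": [3, 3]}
changes = log2_fold_changes(cancerous, normal)
print(changes)
```

A positive value suggests higher expression in the cancerous cells, a negative value lower expression, and a value near zero no change.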
Analysis of Mutations in Cancer:-
In cancer, the genomes of affected cells are rearranged in complex or unpredictable ways. Bio-informatics techniques make it possible to identify point mutations in a variety of genes in different cancers. They are also needed to manage the sheer volume of sequence data that is produced. Bio-informatics experts create new software and algorithms to compare sequencing results with the human genome and with the collection of known germline polymorphisms. Techniques such as oligonucleotide microarrays are used to identify chromosomal gains and losses. To detect known point mutations, single nucleotide polymorphism arrays are used. These methods measure several hundred thousand sites across the whole genome simultaneously.
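The comparison of sequencing results with a reference genome and with known germline polymorphisms can be sketched as follows. This assumes the sample and reference sequences are already aligned and of equal length, which real pipelines have to establish first; the function names and the example data are hypothetical.

```python
def find_point_mutations(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, sample_base)
        for pos, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


def remove_known_germline(mutations, known_germline):
    """Drop mutations that match a known germline polymorphism."""
    return [m for m in mutations if m not in known_germline]


reference = "ACGTACGT"
tumour = "ACCTACGA"
known_germline = {(2, "G", "C")}  # hypothetical catalogue entry

mutations = find_point_mutations(reference, tumour)
candidates = remove_known_germline(mutations, known_germline)
print(mutations, candidates)
```

What remains after filtering is a list of candidate mutations specific to the cancer sample, which would still need further validation.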
Analysis of Protein Expression:-
A snapshot of the proteins present in a biological sample can be obtained with protein microarrays and high-throughput mass spectrometry. Making sense of protein microarray and mass spectrometry data is a main concern of bio-informatics. Protein microarrays raise problems similar to those of microarrays targeted at the mRNA level. Mass spectrometry raises the problem of matching large amounts of mass data against databases of protein sequences.
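The matching of mass data against sequence databases can be illustrated with a small sketch that computes the monoisotopic mass of candidate peptides and keeps those within a tolerance of an observed mass. The residue masses are standard monoisotopic values; the function names, the tolerance and the candidate list are assumptions for this example, and real search engines also use fragment spectra, not just the total mass.

```python
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water molecule is added for the free peptide ends


def peptide_mass(peptide):
    """Return the monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_mass(observed, peptides, tolerance=0.01):
    """Return the candidate peptides whose mass is within the tolerance."""
    return [p for p in peptides if abs(peptide_mass(p) - observed) <= tolerance]


matches = match_mass(146.07, ["GA", "AG", "GG", "SV"])
print(matches)
```

Both GA and AG match because they have the same composition, which shows why total mass alone is not enough to identify a sequence.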
Bio-informatics has other research areas as well, such as analysis of regulation, prediction of protein structure, comparative genomics and modeling of biological systems.
Software and Tools of Bio-informatics:-
Software and tools of bio-informatics range from simple command-line tools to complex databases and programs. SOAP- and REST-based interfaces are used for a variety of bio-informatics purposes and applications. They allow an application on one computer in one part of the world to use the algorithms, computing resources and data on servers in other parts of the world.
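As a minimal sketch of what a REST-based call might look like from the client side, the code below builds an HTTP GET request for a sequence search. The endpoint URL and the parameter names (sequence, database) are hypothetical and do not belong to any real service; the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(base_url, sequence, database):
    """Build (but do not send) a GET request for a remote sequence search.

    The parameter names are assumptions for illustration only.
    """
    query = urlencode({"sequence": sequence, "database": database})
    return Request(
        f"{base_url}?{query}",
        headers={"Accept": "application/json"},
        method="GET",
    )


# example.org is a placeholder host, not a real bio-informatics server.
request = build_search_request("https://example.org/api/search", "ACGT", "nucleotide")
print(request.full_url)
```

Passing such a request to urllib.request.urlopen would send it to the server, which does the computation and returns the result, so the client needs neither the data nor the algorithm locally.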