Next Generation Sequencing Techniques: The future of DNA Sequencing
Author: Ashutosh
ABSTRACT
Next-generation sequencing (NGS) techniques are the future of DNA sequencing. Compared with the older methods of DNA sequencing (Maxam-Gilbert and Sanger), NGS techniques deliver fast and accurate results. They provide massive parallelization, improved automation and, most importantly, a greatly reduced price. The Human Genome Project took 13 years to sequence the whole genome; with NGS techniques, a whole genome can now be sequenced within a week, a few days or even a few hours. This article covers the applications of and approaches to NGS, the platforms and methods used for NGS, the limitations of NGS, and how NGS differs from first-generation sequencing (Sanger’s method).
INTRODUCTION
DNA sequencing is the process of determining the exact order of nucleotides in a DNA molecule. DNA sequences are used to study organisms and to illuminate the genetic information of biological systems. A large number of scientific disciplines benefit from these techniques, including archaeology, anthropology, forensic science, genetics, molecular biology and biotechnology, among others. The study of DNA sequences is necessary for almost all branches of the life sciences, and our understanding of them has grown exponentially in the last few decades (Khalid et al. 2016, Principle, Analysis, Applications and Challenges of NGS).
The first DNA sequencing method was Sanger’s ‘plus and minus’ method (Sanger and Coulson, 1975). It was followed by the chain-termination method, which is based on the selective incorporation of chain-terminating dideoxynucleotides by DNA polymerase during in vitro DNA replication (Sanger et al., 1977). Sanger’s sequencing method is also known as a “first-generation sequencing technique”. It remains in wide use, but mainly for smaller-scale projects.
Another method of DNA sequencing is the Maxam-Gilbert method, also known as the chemical method of DNA sequencing. It was developed by Allan Maxam and Walter Gilbert in 1976-1977. The method is based on nucleotide base-specific partial chemical modification of DNA and subsequent cleavage of the DNA backbone at sites adjacent to the modified nucleotides (Maxam, 1980).
NGS refers to technologies that do not rely on traditional dideoxynucleotide (Sanger) sequencing, in which labeled DNA fragments are physically resolved by electrophoresis (Paul et al. 2010, Special Issue: Next Generation DNA Sequencing). The key feature of NGS is the parallel sequencing of a very large number of fragments, which provides high speed and throughput. The main advantage of NGS is that sequence data are determined from amplified single DNA fragments, avoiding the need to clone DNA fragments. The DNA molecules are spatially separated in a flow cell.
NGS techniques were developed in the late 20th and early 21st century. The first NGS technology, known as massively parallel signature sequencing (MPSS), was launched by the US-based company Lynx Therapeutics in 2000. Many NGS platforms have been developed since, such as 454 pyrosequencing, Roche 454 GS-FLX, Genome Analyzer, Ion Torrent sequencers, the Complete Genomics platform and many more. Sequencing throughput ranges from 20 megabase pairs (Mbp) to 3000 Gbp per run, with a maximum read length of 70 bases (Sabahuddin et al., 2016).
What can NGS do?
Next-generation sequencing is a very powerful, flexible, indispensable and universal biological tool that is very useful in the life sciences. Some of the important features of NGS are:
NGS provides a much cheaper and faster alternative to traditional sequencing techniques. Researchers can now sequence a whole small genome in a day or within a few hours.
NGS offers high-throughput sequencing of the human genome, which helps to discover new genes and regulatory pathways associated with disease (Grada and Weinbrecht, 2013).
NGS helps to identify disease-causing mutations. It enables faster diagnosis and better decision-making for several genetic diseases, including many cancers (Grada and Weinbrecht, 2013).
NGS of RNA can provide the entire transcriptomic information of a sample without any prior knowledge of the genetic sequences (Grada and Weinbrecht, 2013).
NGS supports variant studies, which are common in medical genetics: DNA sequence data are compared with a reference sequence to catalogue the differences. These differences may range from SNPs to complex chromosomal rearrangements (Nekrutenko and Taylor, 2012).
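The comparison against a reference sequence described in the last point can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the sample has already been aligned to the reference at equal length and looks only at single-base substitutions. The function name catalogue_snps and the two example sequences are hypothetical, and real variant callers also handle insertions, deletions and sequencing errors.

```python
def catalogue_snps(reference, sample):
    """List single-base differences between two aligned, equal-length sequences.

    Returns (position, reference_base, sample_base) tuples with 1-based positions.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    variants = []
    for position, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1):
        if ref_base != alt_base:
            variants.append((position, ref_base, alt_base))
    return variants


# Hypothetical example sequences (not taken from any real genome).
reference = "ACGTACGTAC"
sample = "ACGTTCGTAA"
snps = catalogue_snps(reference, sample)
for position, ref_base, alt_base in snps:
    print(f"position {position}: {ref_base} -> {alt_base}")
```

Larger differences, such as the chromosomal rearrangements mentioned above, cannot be found by a position-by-position comparison like this one and need dedicated methods.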
Some next-generation sequencing platforms are discussed below:
Massively Parallel Signature Sequencing (MPSS) by Lynx Therapeutics
The first of the “next-generation” sequencing technologies, MPSS, was developed in the 1990s by Lynx Therapeutics, a company founded in 1992 by Sydney Brenner and Sam Eletr. MPSS is an ultra-high-throughput sequencing technology. When applied to expression profiling, it reveals almost every transcript in the sample and provides its accurate expression level. The technique was a bead-based method that used a complex approach of adapter ligation followed by adapter decoding, reading the sequence in increments of four nucleotides. This approach made it susceptible to sequence-specific bias or loss of specific sequences. However, the essential properties of the MPSS output were typical of later “next-gen” data types, including hundreds of thousands of short DNA sequences. The company later merged with Solexa, which was in turn acquired by Illumina (Anjana Munshi, 2012).
Polony Sequencing
Polony sequencing is an inexpensive but highly accurate multiplex sequencing technique that can be used to read millions of immobilized DNA sequences in parallel. It was first developed by Dr. George Church at Harvard Medical School. The technique combines an in vitro paired-tag library with emulsion PCR, an automated microscope and ligation-based sequencing chemistry (Anjana Munshi, 2012).
Pyrosequencing
A parallelized version of pyrosequencing was developed by 454 Life Sciences, which has since been acquired by Roche Diagnostics. The method amplifies DNA inside water droplets in an oil solution (emulsion PCR), with each droplet containing a single DNA template attached to a single primer-coated bead that then forms a clonal colony. The sequencing machine contains many picolitre-volume wells, each containing a single bead and sequencing enzymes. The technique uses luciferase to generate light for detection of the individual nucleotides added to the nascent DNA, and the combined data are used to generate sequence read-outs (Anjana Munshi, 2012). In the 454 pyrosequencing approach, libraries may be constructed by any method that gives rise to a mixture of short, adaptor-flanked fragments.
Sequencing is then performed by the pyrosequencing method. The amplicon-bearing beads are preincubated with Bacillus stearothermophilus (Bst) polymerase and single-stranded binding protein and then deposited onto a microfabricated array of picolitre-scale wells so that they can be sequenced in an array-based format. At each of several hundred cycles, a single species of unlabelled nucleotide is introduced. On templates where this results in an incorporation event, pyrophosphate is released.
The major limitation of this method relates to homopolymers (that is, consecutive instances of the same base, such as AAA or GGG). In 454 pyrosequencing there is no terminating moiety to prevent multiple consecutive incorporations in a given cycle, so the length of a homopolymer has to be inferred from the intensity of the signal.
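The homopolymer problem can be illustrated with a small, idealized simulation of the flow cycle. The sketch below is an assumption-laden toy model, not the 454 software: the function names, the fixed flow order "TACG" and the noise-free signal (exactly one unit of light per incorporated base) are all illustrative choices. It shows that a run such as TTT is incorporated within a single flow and is therefore seen only as one stronger signal.

```python
def simulate_flowgram(read, flow_order="TACG"):
    """Return a list of (nucleotide, incorporations) pairs, one pair per flow.

    `read` is the strand being synthesized. In each flow one nucleotide species
    is offered, and every consecutive matching base is incorporated at once
    because nothing terminates the extension.
    """
    unknown = set(read) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not covered by the flow order: {sorted(unknown)}")
    flows = []
    position = 0
    flow_index = 0
    while position < len(read):
        base = flow_order[flow_index % len(flow_order)]
        count = 0
        while position < len(read) and read[position] == base:
            count += 1
            position += 1
        flows.append((base, count))
        flow_index += 1
    return flows


def decode_flowgram(flows):
    """Rebuild the read from an ideal, noise-free flowgram."""
    return "".join(base * count for base, count in flows)


flows = simulate_flowgram("TTTAGC")
print(flows)  # the first flow carries a signal of 3 for the TTT run
print(decode_flowgram(flows))
```

In this ideal model the read is recovered exactly. In a real instrument the signal is noisy, so telling a run of, say, seven identical bases from a run of eight becomes increasingly uncertain as the run gets longer.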
Illumina (Solexa) Sequencing
Solexa (now part of Illumina) developed a sequencing technology based on reversible dye terminators. In this method, DNA molecules are first attached to primers on a slide and amplified; this is known as “bridge amplification”. In Illumina sequencing, the DNA can only be extended one nucleotide at a time. A camera takes images of the fluorescently labeled nucleotides, then the dye along with the terminal 3’ blocker is chemically removed from the DNA, allowing the next cycle (Anjana Munshi, 2012).
The Illumina Genome Analyzer is commonly referred to as 'the Solexa'. This platform has its origins in work by Turcatti and colleagues and the merger of four companies. Libraries can be constructed by any method that gives rise to a mixture of adaptor-flanked fragments up to several hundred base pairs in length. The amplified sequencing features are generated by bridge PCR (Volume 26, 2008, Nature Biotechnology). In this platform, the bridge PCR is unconventional in relying on alternating cycles of extension with Bacillus stearothermophilus (Bst) polymerase and denaturation with formamide. The resulting clusters each consist of ~1000 clonal amplicons.
SOLiD Sequencing
In SOLiD sequencing, a pool of all possible oligonucleotides of a fixed length is labeled according to the sequenced position. The technology used in ABI SOLiD sequencing is sequencing by oligonucleotide ligation and detection. SOLiD sequencing yields sequences of quantities and lengths comparable to Illumina sequencing.
DNA Nanoball Sequencing
This technique is a high-throughput sequencing technology that is used to determine the entire genomic sequence of an organism. The method uses rolling circle replication to amplify fragments of genomic DNA into DNA nanoballs. This allows a large number of DNA nanoballs to be sequenced per run, at low reagent cost compared with other NGS platforms. DNA nanoball sequencing has been used for multiple genome sequencing projects.
HeliScope (TM) Single Molecule Sequencing
This NGS platform uses DNA fragments with added poly-A tail adaptors, which are attached to the flow cell surface. The next steps involve extension-based sequencing with cyclic washes of the flow cell with fluorescently labeled nucleotides. The reads are performed by the HeliScope sequencer (Jay Shendure, 2008, Nature Biotechnology).
NGS Techniques vs Sanger sequencing
First-generation sequencing, i.e. the traditional automated Sanger method, played a very important role in the completion of the Human Genome Project (HGP) and of many other animal and plant genomes. However, because of its high cost and time-intensive workflow, new methods (NGS) were developed at the end of the 20th and the beginning of the 21st century to replace automated Sanger sequencing. The major advantages offered by NGS are:
The capability to generate a massive volume of data.
The ability to read more than one billion short reads in a single run.
Most importantly, NGS is a fast and inexpensive method of obtaining accurate genomic information.
Sample preparation for NGS is faster and more straightforward than for capillary electrophoresis (CE)-based Sanger sequencing. In NGS, preparation can start directly from a gDNA or cDNA library. The DNA fragments are ligated to platform-specific oligonucleotide adapters, which takes less than 1.5 hours to complete. In comparison, Sanger sequencing requires genomic DNA to be fragmented and cloned into either bacterial or yeast artificial chromosomes (BACs or YACs). Each BAC/YAC needs to be sub-cloned into a sequencing vector and transformed into an appropriate microbial host. Before sequencing, the template DNA is purified from individual colonies. These processes may take days or even weeks, depending on the size of the genome (Khalid Raza et al. 2016).
Limitations of NGS
Although NGS is cheaper and faster than Sanger sequencing, it is still too expensive to be affordable for small labs or individuals.
Analysis of the data generated by NGS is time-consuming and requires sufficient knowledge of bioinformatics to extract accurate information from the sequence data.
The National Human Genome Research Institute (NHGRI) posed an open goal of reducing the cost of sequencing a human genome to $1000 to fuel further development of NGS technologies, so that routine sequencing of human genomes can be used as a tool to diagnose various diseases. However, this target has not yet been reached (Hert et al., 2008).
The short read lengths supported by NGS are one of its major shortcomings and limit its application, especially in de novo sequencing (Hert et al., 2008).
Sequencing highly repetitive regions is difficult because of the short read lengths, which is another inherent limitation.
The major bottleneck for the implementation and capitalization of NGS technology is the data-processing steps of bioinformatics (Daber et al., 2014).
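The difficulty with repetitive regions mentioned in the list above can be made concrete with a toy example. The sketch below assumes exact string matching and an invented 28-base "genome" containing a CAG repeat; the function name find_placements and all sequences are hypothetical. A read that is shorter than the repeat fits in several places, while a read long enough to span the repeat and its flanks fits in only one.

```python
def find_placements(genome, read):
    """Return every 0-based start position at which `read` matches `genome` exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        # Search again one base further on so overlapping matches are found too.
        start = genome.find(read, start + 1)
    return positions


# Invented example: unique flanks around six copies of the repeat unit CAG.
genome = "GATTC" + "CAG" * 6 + "TTGCA"
short_read = "CAGCAG"                     # lies entirely inside the repeat
long_read = "TTC" + "CAG" * 6 + "TTG"     # spans the repeat and both flanks

short_hits = find_placements(genome, short_read)
long_hits = find_placements(genome, long_read)
print(short_hits)  # several equally good placements
print(long_hits)   # a single placement
```

Because the short read cannot be assigned to one position, neither the number of repeat copies nor the exact location of a variant inside the repeat can be determined from such reads alone.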
Main Areas of Application of Next Generation Sequencing Techniques
NGS is helpful in a range of areas, such as medicine, genome studies and biochemical research. Some of them are:
Whole Genome Sequencing
NGS performs massively parallel, high-throughput sequencing, which makes it possible to sequence an entire genome in less than a day at minimal cost (Grada et al., 2013).
Exome Sequencing
NGS helps to investigate the protein-coding regions of the genome. The exome makes up a very small portion of the whole genome. It comprises the gene-coding regions and is therefore an important site of concern for clinicians and researchers.
Targeted Sequencing
Targeted sequencing involves sequencing selected regions of the genome.
To analyse the structure and function of any protein, we must know its primary structure, which is determined by the DNA sequence.
A major use of NGS is in the field of DNA fingerprinting.
NGS also helps in kinship studies.
By sequencing DNA with the help of NGS, we can detect mutations at an early stage.
By studying DNA, we can understand the function of a specific sequence and identify the sequences responsible for a disease.
NGS helps in the study of early cancers, because it allows the relevant sequences to be sequenced quickly and accurately.
Through NGS, we can study the phylogenetic tree of any organism and compare it with its ancestors on a molecular basis.
NGS provides a powerful strategy for searching for Mendelian disease genes.
Although the majority of known disease-causing mutations affect highly conserved protein residues, other pathogenic mechanisms, such as synonymous changes of rare codons that affect the rate of cotranslational folding (Kimchi-Sarfaty et al., 2007) ... may be responsible but are not ascribed importance. This emphasizes the need for better functional assays of discovered variants.
Even when NGS provides adequate coverage, certain types of mutations (for example inversions, duplications and other structural aberrations) remain challenging to detect. Some causal mutations may also be present outside the regions targeted for exome sequencing (Renton et al., 2010).
The major impact of next-generation sequencing technologies on rare genetic diseases is further evidenced by the growth of the Online Mendelian Inheritance in Man (OMIM) database (McKusick, 2007), in which the number of inherited phenotypes for which the molecular basis is known has nearly doubled since 2006. The number of genes identified and studied in association with rare diseases has also grown at an impressive rate.
Among the rarest variants that can be identified through next-generation sequencing are de novo mutations: variants that arise for the first time in an individual. Identifying and characterizing these mutations also allows the baseline human mutation rate to be estimated, as well as its correlation with parental age (Abecasis et al., 2010; Kong et al., 2012). Although NGS is faster, its reads have a higher error rate than those of traditional sequencing methods.
Variant detection is another advance brought by NGS. NGS has enabled the interrogation of nearly every base in the genome, and techniques to reliably identify millions of variants are being developed. The advantage of NGS in this regard is that most variants, common and rare, can be discovered given appropriate sequencing read coverage, algorithmic methods to identify the variants, and sufficiently careful orthogonal validation to separate true from false positives.
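The notion of "appropriate sequencing read coverage" can be made concrete with the usual back-of-the-envelope estimate: mean coverage equals the number of reads times the read length divided by the genome size. The Python sketch below computes this estimate and also the depth at each position for a handful of reads. The function names and all numbers (600 million reads of 150 bases, a 3 Gbp genome, the tiny 10-base example) are illustrative assumptions, not values from any particular platform or study.

```python
def mean_coverage(read_count, read_length, genome_size):
    """Average number of reads covering each base: (reads x read length) / genome size."""
    return read_count * read_length / genome_size


def per_base_depth(genome_size, read_starts, read_length):
    """Count how many reads cover each position, given 0-based read start positions."""
    depth = [0] * genome_size
    for start in read_starts:
        # Clip reads that would run past the end of the genome.
        for position in range(start, min(start + read_length, genome_size)):
            depth[position] += 1
    return depth


# Illustrative run: 600 million reads of 150 bases over a 3 Gbp genome.
coverage = mean_coverage(600_000_000, 150, 3_000_000_000)
print(coverage)

# Tiny example: three 4-base reads placed on a 10-base genome.
depth = per_base_depth(10, [0, 2, 4], 4)
print(depth)
```

The per-base view shows why the mean alone is not enough: positions with a depth of zero or one cannot support a confident variant call even when the average coverage looks adequate.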
References:
Zhang, Y.; Jeltsch, A. The application of next generation sequencing. Genes 2010, 1, 85-101.
Anderson, M.W.; Schrijver, I. Next generation sequencing and future of genomic medicine. Genes 2010, 1, 38-69.
Maxam, A.M.; Gilbert, W. (February 1977). “A new method for sequencing DNA”. Proc. Natl. Acad. Sci. U.S.A. 74.
Gilbert, W. DNA sequencing and gene structure. Nobel lecture, 8 December 1980.
Gilbert, W.; Maxam, A. (December 1973). “The nucleotide sequence of the lac operator”. Proc. Natl. Acad. Sci. U.S.A. 70.
Sanger, F.; Coulson, A.R. (May 1975). “A rapid method for determining sequences in DNA by primed synthesis with DNA polymerase”. J. Mol. Biol. 94(3): 441-8. doi:10.1016/0022-2836(75)90213-2. PMID 1100841.
Sanger, F.; Nicklen, S.; Coulson, A.R. (December 1977). “DNA sequencing with chain-terminating inhibitors”. Proc. Natl. Acad. Sci. U.S.A. 74(12): 5463-5467. Bibcode:1977PNAS...74.5463S. doi:10.1073/pnas.74.12.5463. PMC 431765. PMID 271968.
Porreca, J.G. Genome sequencing on nanoballs. Nature Biotechnology 2010, 28, 43-44.
Shendure, J.; Mitra, R.D.; Varma, C.; Church, G.M. Advanced sequencing technologies: methods and goals. Nat. Rev. Genet. 5, 335-344 (2004).
Brenner, S. et al. Gene expression analysis by massively parallel signature sequencing (MPSS). Nat. Biotechnol. 18, 630-634 (2000).
Magi, A.; Benelli, M.; Gozzini, A.; Girolami, F.; Torricelli, F.; Brandi, M.L. Bioinformatics for next generation sequencing data. Genes 2010, 1, 294-307.
Adams, M.D.; Fields, C.; Venter, J.C. (1996). Automated DNA Sequencing and Analysis. San Diego: Academic Press.
Ansorge, W.; Sproat, B.; Stegemann, J.; Schwager, C.; Zenke, M. (1987). Automated DNA sequencing: ultrasensitive detection of fluorescent bands during electrophoresis. Nucleic Acids Res. 15, 4593-4602.
Shendure, J. et al. (2005). Science 309, 1728-1732.
Grada, A.; Weinbrecht, K. (2013). Next-generation sequencing: methodology and application. Journal of Investigative Dermatology 133(8), e11.
Thompson, J.F.; Steinmann, K.E. (2010). Single molecule sequencing with a HeliScope genetic analysis system. Current Protocols in Molecular Biology, 7-10.