By Parimala Nagaraja, Assistant Manager, NGS. Peer reviewers: Neha Verma, Research Scientist, NGS; Ekkirala Chaitanya, Lab Director, NGS Operations, MedGenome Inc., USA
With the advent of next-generation sequencing (NGS) platforms, DNA sequencing has taken a revolutionary leap in both cost and application in cutting-edge research. Today, an entire human genome can be sequenced in about a day, whereas conventional Sanger sequencing using capillary electrophoresis took far longer for the same task. Genetic variation can now be identified and tracked more efficiently and precisely, and thousands of variants can be analysed across a large population in a short span of time.
NGS is an umbrella term for several modern sequencing platforms. These technologies decode DNA and RNA quickly and cost-effectively, and they support a wide range of applications, from detecting a single nucleotide polymorphism (SNP) to constructing a whole genome via de novo assembly.
Ever since the double-helix model of DNA was proposed in the 1950s, researchers have invested their time and effort in decoding the sequences of many different genomes.
Before any attempt was made to sequence DNA, the American biochemist Robert Holley sequenced an alanine tRNA molecule in 1964 and determined its complete structure, which consists of 77 ribonucleotides. His work paved the way for other scientists to sequence other RNA and DNA molecules.
In 1972, Paul Berg developed the first technology that permitted the isolation of defined fragments of DNA, which led to the development of modern genetic engineering tools. Before this, only phage DNA was available for sequencing.
In 1973, Walter Gilbert and Allan Maxam published the first nucleotide sequence of the lac operator, which consists of 24 base pairs. In 1977, Frederick Sanger became the first to sequence the complete genome of a bacteriophage and developed the “chain termination” sequencing method. In the same year, Gilbert, an American biochemist, introduced DNA sequencing by chemical degradation. For developing methods to sequence long DNA molecules, Sanger and Gilbert shared one half of the 1980 Nobel Prize in Chemistry. The other half was awarded to Paul Berg “for his fundamental studies of the biochemistry of nucleic acids, with particular regard to recombinant DNA”.
In 1986, Leroy Hood of the California Institute of Technology developed the first semi-automated DNA sequencing machine. It automated the enzymatic chain termination method of Sanger sequencing and became a key tool in mapping and sequencing genetic material.
Applied Biosystems in the USA then marketed the first automated sequencing machine, the ABI 370.
Next came “pyrosequencing”, a method that did not require electrophoresis. It was developed by Pål Nyrén and Mostafa Ronaghi in Sweden in 1996 and relies on detecting DNA polymerase activity with an enzymatic luminometric inorganic pyrophosphate detection assay that Nyrén had developed in 1987.
Automated Sanger sequencing led to many significant accomplishments, including the completion of the Human Genome Project and of several plant and animal genomes. Its limitations, however, created a need for further advances before large numbers of genomes could be sequenced. Automated Sanger sequencing is therefore called “first-generation sequencing”, while the more advanced methods that followed are called “next-generation sequencing”.
Emergence of NGS platforms
NGS became available in the early 21st century. Its most significant improvement is the ability to produce a large amount of data in a very short time, at a cost and accuracy beyond the reach of traditional Sanger methods.
Lynx Therapeutics launched the first NGS technology, Massively Parallel Signature Sequencing (MPSS), in 2000. The company later merged with Solexa, which was in turn acquired by Illumina.
In 2004, 454 Life Sciences (Branford, CT, USA) marketed a parallelized version of pyrosequencing, which reduced the cost of sequencing six-fold compared with automated Sanger sequencing and was the second of the new generation of sequencing methods. The company was later acquired by Roche. Roche’s 454 GS 20 reshaped the sequencing landscape in 2005-2006, as it could produce about 20 million bases (20 Mbp). It was replaced in 2007 by the GS FLX model, which could produce over 100 Mbp of sequence in just four hours, a figure that rose to 400 Mbp in 2008. This model was then upgraded to the 454 GS FLX+ Titanium platform, which can produce over 600 Mbp of sequence data in a single run with Sanger-like read lengths of up to 1,000 bp.
In 2005, Solexa released its Sequencing by Synthesis (SBS) method; the company was purchased by Illumina in 2007. SBS has since become ubiquitous and is reported to account for about 90% of the sequencing data produced in biological research.
More recently, Illumina developed the NovaSeq 6000, a platform that can sequence several hundred libraries within 24-48 hours. Its output ranges from 400 Gb to 3,000 Gb (gigabases). Users can choose between four flow cell types (SP, S1, S2, S4) and sequence one or two flow cells at a time, which gives them the throughput, speed, and flexibility to complete projects faster and more economically than before. The platform provides simple, scalable, and reliable high-throughput sequencing with high accuracy.
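To put the output range quoted above into perspective, the short Python sketch below estimates how many human genomes a run of a given size could cover. It is only a back-of-the-envelope illustration: the function name `genomes_per_run`, the 3.1 Gb genome size, and the 30x target coverage are assumptions made here, not figures from Illumina, and the calculation ignores read duplication, quality filtering, and uneven coverage.

```python
# Rough estimate of how many genomes one sequencing run can cover.
# Assumptions (not from the article): a haploid human genome of about
# 3.1 gigabases and a target mean coverage of 30x.
HUMAN_GENOME_GB = 3.1


def genomes_per_run(run_output_gb: float,
                    target_coverage: float = 30.0,
                    genome_size_gb: float = HUMAN_GENOME_GB) -> int:
    """Return the number of whole genomes that fit in one run.

    Each genome needs (target_coverage * genome_size_gb) gigabases,
    so the run output is divided by that amount and rounded down.
    """
    if run_output_gb < 0:
        raise ValueError("run_output_gb must not be negative")
    if target_coverage <= 0 or genome_size_gb <= 0:
        raise ValueError("coverage and genome size must be positive")
    bases_needed_per_genome = target_coverage * genome_size_gb
    return int(run_output_gb // bases_needed_per_genome)


if __name__ == "__main__":
    # The two ends of the output range quoted in the text.
    for output_gb in (400, 3000):
        n = genomes_per_run(output_gb)
        print(f"{output_gb} Gb of output covers about {n} genomes at 30x")
```

Under these assumptions each genome needs roughly 93 Gb of sequence, so the low end of the range covers a handful of genomes and the high end a few dozen.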
Other novel sequencing technologies that emerged alongside SBS include True Single Molecule Sequencing, developed by Helicos Biosciences; Ion Torrent sequencing, developed by Life Technologies; the single molecule real-time (SMRT) sequencer, developed by Pacific Biosciences; and the Oxford Nanopore Technologies (Oxford, UK) single-molecule sequencer with ultra-long reads, which became available in 2012–2013.
Future of NGS
NGS will continue to revolutionize genomics research by becoming increasingly efficient and cost-effective. For now, all NGS platforms require the preparation of NGS libraries, that is, making the DNA or RNA compatible with sequencing. This involves fragmenting the DNA or cDNA molecules, attaching adapter molecules to both ends, and then amplifying the libraries by PCR. A newer class of sequencing, called “third-generation sequencing”, is under development; it sequences single DNA molecules without amplification and can produce longer reads than NGS platforms. The single molecule real-time (SMRT) sequencer developed by Pacific Biosciences and the nanopore sequencing offered by Oxford Nanopore Technologies already operate in this domain, rapidly generating up to 15,000 bases from a single DNA or RNA molecule. This allows smaller genomes to be sequenced completely, quickly and cost-efficiently, without introducing PCR bias.
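The three library preparation steps named above (fragmentation, adapter attachment, PCR amplification) can be pictured with a toy Python simulation. This is a conceptual sketch only: the adapter strings are hypothetical placeholders rather than sequences from any real kit, fragmentation is modelled as simple random cutting, and PCR is idealized as perfect doubling in every cycle, which real reactions do not achieve.

```python
import random
from collections import Counter

# Hypothetical placeholder adapters, not sequences from any real kit.
ADAPTER_5 = "AAAACCCC"
ADAPTER_3 = "GGGGTTTT"


def fragment(dna: str, min_len: int, max_len: int,
             rng: random.Random) -> list[str]:
    """Cut the molecule into consecutive pieces of random length.

    The final piece may be shorter than min_len, because it is
    whatever remains at the end of the molecule.
    """
    fragments = []
    pos = 0
    while pos < len(dna):
        size = rng.randint(min_len, max_len)
        fragments.append(dna[pos:pos + size])
        pos += size
    return fragments


def ligate_adapters(fragments: list[str]) -> list[str]:
    """Attach one adapter to each end of every fragment."""
    return [ADAPTER_5 + frag + ADAPTER_3 for frag in fragments]


def amplify(library: list[str], cycles: int) -> Counter:
    """Idealized PCR: every molecule doubles in each cycle."""
    copies = Counter()
    for molecule in library:
        copies[molecule] += 2 ** cycles
    return copies


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    dna = "".join(rng.choice("ACGT") for _ in range(200))
    library = ligate_adapters(fragment(dna, 20, 40, rng))
    amplified = amplify(library, cycles=3)
    print(f"{len(library)} adapter-ligated fragments")
    print(f"{sum(amplified.values())} molecules after 3 PCR cycles")
```

The amplification step is where PCR bias enters in practice, because real fragments do not all double at the same rate; this is the step that single-molecule platforms avoid.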