By Parimala Nagaraja, Assistant Manager-NGS, MedGenome Inc., USA
All of an organism’s hereditary information is contained within its genome. Owing to continuous global efforts, many new bioinformatics databases have emerged in recent years. This upward trend reflects how NGS data is shaping our understanding of life and our need to keep developing new methods to investigate and decode the information in and around DNA (or RNA for some viruses) and its nucleotide sequences.
Source: Next-Generation Sequencing: From Understanding Biology to Personalized Medicine – Scientific Figure on ResearchGate. Available from: https://www.researchgate.net/figure/Next-generation-sequencing-applications-Schematogram-depicting-the-different-methods-for_fig1_262384938 [accessed 15 Nov, 2021]
Genomics and exomics:
A comprehensive view of a full genome is now possible with de novo whole-genome shotgun sequencing and annotation. Thanks to technological developments in recent years and the availability of several public reference genomes that can be used for annotation, whole-genome sequencing (WGS) has become easier, faster, and cheaper. NGS plays a significant role in human genomics, ushering in a new era of personalized therapeutics aimed at a healthy, disease-free lifestyle; the broad term for this is “personomics”. SNPs, mutations, and other sequence variants such as insertions and deletions (InDels), copy number variations (CNVs), and structural variations (SVs) can be detected within or among different species through targeted sequencing such as AmpliSeq or whole exome sequencing (WES). Another widely used targeted method is HLA genotyping, of an entire gene or just its exonic regions, which identifies polymorphisms that are important for tissue or cell matching in transplantation. Targeted sequencing of only the coding regions, to detect exonic mutations responsible for rare Mendelian genetic disorders such as hearing loss, intellectual disabilities, and movement disorders, and to investigate common disorders such as heart disease, hypertension, diabetes, and cancer, can be termed “exomics”.
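To make the variant categories above concrete, here is a minimal Python sketch that labels a small variant from its reference and alternate alleles, written the way they appear in the REF and ALT columns of a VCF record. The function name `classify_variant` and its labels are illustrative assumptions, not part of any variant-calling tool. The sketch covers only simple sequence-level changes; CNVs and larger structural variations are usually inferred from read depth and read-pair evidence and are out of scope here.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Label a simple variant from its REF and ALT allele strings.

    Hypothetical helper for illustration only: it ignores symbolic
    alleles (such as <DEL>) and multi-allelic records.
    """
    ref, alt = ref.upper(), alt.upper()
    if not ref or not alt or ref == alt:
        raise ValueError("REF and ALT must be non-empty and different")
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"          # single-nucleotide change
    if len(ref) == len(alt):
        return "MNV"          # several adjacent bases substituted
    if len(ref) < len(alt):
        return "insertion"    # ALT carries extra bases
    return "deletion"         # ALT is missing bases


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("A", "ATG"), ("ATG", "A"), ("AT", "GC")]:
        print(ref, alt, classify_variant(ref, alt))
```

Insertions and deletions together are what the text calls InDels.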
Transcriptomics:
The complete set of RNA transcripts expressed by the genome in cells, tissues, and organs at different stages of an organism’s life cycle is termed its “transcriptome”. High-throughput RNA sequencing of complementary DNA (cDNA) molecules helps us understand complex and intricate genome functions in biological systems. It also allows us to measure, in a highly sensitive and accurate manner, the expression levels of genes, tissue-specific transcript variants and isoforms, and small and large non-coding RNAs that regulate gene expression or are associated with various types of cancer.
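One common way to express quantitative expression levels from RNA-seq read counts is transcripts per million (TPM), which corrects for gene length and sequencing depth. The article does not name a specific normalization, so the following is only an illustrative sketch; the function name `tpm` and the example gene names are assumptions.

```python
def tpm(counts: dict, lengths_bp: dict) -> dict:
    """Convert raw read counts to transcripts per million (TPM).

    counts     -- reads assigned to each gene (hypothetical input)
    lengths_bp -- gene or transcript length in base pairs
    """
    # Step 1: reads per kilobase, which removes the gene-length bias.
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no reads assigned to any gene")
    # Step 2: scale so that the values of one sample sum to one million.
    return {gene: rate / total * 1_000_000 for gene, rate in rates.items()}


if __name__ == "__main__":
    example = tpm({"geneA": 100, "geneB": 100}, {"geneA": 1000, "geneB": 2000})
    for gene, value in example.items():
        print(gene, round(value, 2))
```

In this example both genes receive the same number of reads, but geneB is twice as long, so its TPM is half that of geneA.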
Methylomics and epigenomics:
Epigenomics is the study of the complete set of epigenetic modifications: DNA nucleotide methylation, post-translational modifications of histones, interactions between transcription factors and their targets, and nucleosome positioning. Methylomics is the genome-wide analysis of DNA methylation and its effects on gene expression and heredity. Bisulfite DNA sequencing (Methyl-seq) maps DNA cytosine methylation at single-base resolution. Methyl-seq is a well-established method for DNA methylation profiling in humans and various other organisms, and for evaluating pathogenic variants of genes.
ChIP-seq (chromatin immunoprecipitation followed by sequencing) allows genome-wide profiling of DNA-binding proteins and of histone and nucleosome modifications. It is the most widely used method for detecting and analysing transcription factor binding sites and histone modifications in a variety of organisms. Another NGS method commonly used in epigenomics is Hi-C, which identifies DNA regions such as promoters, enhancers, and insulators that come together to mediate their regulatory activities.
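The single-base resolution of bisulfite sequencing comes from a simple chemical fact: bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines are still read as C. The Python sketch below estimates the methylation level at each reference cytosine as C reads / (C reads + T reads). It is a simplified illustration that assumes ungapped, forward-strand reads with known start positions; the function name `methylation_levels` and the input layout are assumptions, not the interface of any real methylation caller.

```python
def methylation_levels(reference: str, reads: list) -> dict:
    """Estimate per-cytosine methylation from bisulfite-converted reads.

    reference -- unconverted reference sequence (forward strand)
    reads     -- list of (start, sequence) tuples, 0-based start,
                 aligned without gaps (a simplifying assumption)
    """
    # One [C count, T count] pair for every cytosine in the reference.
    counts = {i: [0, 0] for i, base in enumerate(reference.upper()) if base == "C"}
    for start, seq in reads:
        for offset, base in enumerate(seq.upper()):
            pos = start + offset
            if pos in counts:
                if base == "C":      # stayed C: methylated
                    counts[pos][0] += 1
                elif base == "T":    # read as T: unmethylated
                    counts[pos][1] += 1
    # Report only positions covered by at least one informative read.
    return {pos: c / (c + t) for pos, (c, t) in counts.items() if c + t > 0}


if __name__ == "__main__":
    ref = "ACGTCG"
    reads = [(0, "ACGTTG"), (0, "ACGTCG"), (0, "ATGTTG")]
    print(methylation_levels(ref, reads))
```

In the example, the cytosine at position 1 is read as C in two of three reads and the cytosine at position 4 in one of three, giving methylation levels of about 0.67 and 0.33.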
Proteomics, metabolomics, and systeomics:
Proteomics is the study of the structure, function, and characterization of peptides and proteins. Sequencing the open reading frames (ORFs) of genomic regions, exonic regions, and transcripts helps construct proteomic profiles from NGS data, although this is not the only way to build proteomics data. A variety of other hardware and software tools are used to build up an organism’s peptide and protein profiles, including 2D-PAGE, liquid chromatography coupled with tandem mass spectrometry, affinity-tagged proteins, and yeast two-hybrid assays.
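As an illustration of how ORFs can be located in sequence data, the following Python sketch scans the three forward reading frames of a DNA string for stretches that begin with ATG and end at the first in-frame stop codon. It is a deliberately minimal sketch: the function name `find_orfs` and the `min_codons` threshold are assumptions, and it ignores the reverse strand, introns, and alternative start codons that real gene-prediction tools have to handle.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list:
    """Return (start, end) coordinates of forward-strand ORFs.

    Coordinates are 0-based and end-exclusive, and the end includes the
    stop codon. min_codons counts the codons before the stop, including
    the ATG start codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # look for the next ORF
    return orfs


if __name__ == "__main__":
    dna = "CCATGAAATAGCC"
    for start, end in find_orfs(dna):
        print(start, end, dna[start:end])
```

Each reported stretch could then be translated with a codon table to obtain a candidate peptide sequence.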
Metabolomics is the study of an organism’s total metabolic response to an environmental stimulus or a genetic modification. An organism’s metabolomic profile is mainly inferred from the known functions of enzymes and proteins involved in metabolic and biochemical pathways. This field forms an integral part of functional genomics, helping to determine the phenotypic effects of genetic modifications such as gene deletions, insertions, and other mutations.
The integration of genomics, proteomics, and metabolomics into a single network system is termed systeomics. This interdisciplinary field uses computational techniques to analyse and model cell interactions, taking a holistic approach to the complex interactions within biological systems.
Metagenomics:
Metagenomics is the study of the total genomic content of a microbial community. It supports epidemiological studies of pathogenic agents such as mycobacteria, S. aureus, E. coli, cholera, influenza, HIV, and Ebola virus. The Earth Microbiome Project reconstructed approximately 500 million varieties of microbial genomes. Before the first NGS platforms emerged, metagenomics studies focused on 16S rRNA genes to genotype and detect different microbial species. Over the past 10 years, many large projects have emerged for sequencing metagenomes, such as the TerraGenome project for soils and the Tara Oceans project on the microbiome, eukaryotic plankton, and viromes of the global oceans.
Agrigenomics:
Agricultural genomics, or agrigenomics, covers studies that use NGS to advance crop improvement and understand plant biology. Arabidopsis thaliana was the first plant genome to be published, in 2000. By 2013, nearly 54 more plant genomes had been sequenced, followed by another 6, including the hexaploid bread wheat genome.
Peek into the near future:
NGS today is a science of biological information systems and “Big Data”. However, several challenges persist in NGS data acquisition, storage, analysis, integration, and interpretation. Future developments will therefore depend on new technologies and on large-scale collaborative efforts by multidisciplinary, international teams to keep producing and analysing comprehensive, high-throughput data. With new innovations, cost-effective sequencing methods, and existing “third-generation sequencing” tools, smaller industries and individual scientists will be able to take part in the genomics revolution and contribute new knowledge to the different fields of structural and functional “omics” in the life sciences.
Source: Translational research in infectious disease: Current paradigms and challenges ahead – Scientific Figure on ResearchGate. Available from: https://www.researchgate.net/figure/A-top-down-explanation-of-omics-Genomics-is-the-study-of-the-complete-set-of_fig1_225062532 [accessed 15 Nov, 2021]
Next-Generation Sequencing — An Overview of the History, Tools, and “Omic” Applications | InTechOpen, Published on: 2016-01-14. Authors: Jerzy K. Kulski