The adaptive immune system’s ability to recognize and neutralize a vast array of pathogens relies on the diversity of its lymphocyte repertoire. T and B lymphocytes, with their unique T cell receptors (TCRs) and B cell receptors (BCRs), are crucial for immunological memory and effective immune responses. This diversity allows the immune system to identify and combat a broad range of pathogens. Studying immune repertoire diversity offers valuable insights into disease mechanisms, therapeutic strategies, and vaccine development.
By Dr. Lavanya Balakrishnan and Vinay C. G., MedGenome Scientific Affairs
How TCRs and BCRs differ and contribute to immune repertoire diversity
TCRs recognize antigenic peptides presented by MHC molecules, while BCRs and antibodies bind directly to antigen surfaces. TCRs consist of either α and β chains or γ and δ chains, with most T cells expressing αβ TCRs and a smaller percentage expressing γδ TCRs involved in innate immunity. BCRs are membrane-bound and consist of heavy and light chains, while immunoglobulins (Ig) are secreted by B/plasma cells [1,2].
These immune receptors arise from somatic V(D)J recombination, which can generate more than 10^18 potential receptor variants in T cells and 10^13 in B cells. Each receptor chain is composed of variable (V), diversity (D), joining (J), and constant (C) gene segments, which together determine lymphocyte specificity. TCRs and BCRs include three complementarity-determining regions (CDR1, CDR2, CDR3), with CDR3 being crucial for antigen binding. Recombination joins the D and J segments first and then adds the V segment; exonucleases trim nucleotides at the junctions and random nucleotides are added, enhancing junctional diversity. The D segment is present only in TCR β, TCR δ, and BCR heavy chains, while the other chains involve only V and J segments. BCRs may also undergo somatic hypermutation and class-switch recombination to enhance antibody affinity and diversify the immune repertoire [1,2].
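The recombination steps described above can be illustrated with a toy simulation. This is only a sketch: the segment names and sequences are made up, the trimming and insertion limits are arbitrary assumptions, and real loci contain far more segments. It shows how joining D to J, then V to DJ, with random trimming and insertion at each junction, multiplies the number of distinct sequences well beyond the number of segment combinations.

```python
import random

# Toy gene-segment pools (hypothetical names and sequences, for illustration only).
V_SEGMENTS = {"V1": "TGTGCCAGCAGC", "V2": "TGTGCCAGTAGT"}
D_SEGMENTS = {"D1": "GGGACAGGG", "D2": "GGGACTAGC"}
J_SEGMENTS = {"J1": "AATGAAAAACTG", "J2": "CTATGGCTACAC"}


def trim(seq, rng, end, max_trim=3):
    """Remove 0..max_trim nucleotides from one end, mimicking exonuclease trimming."""
    n = rng.randint(0, min(max_trim, len(seq)))
    if n == 0:
        return seq
    return seq[n:] if end == "5p" else seq[:-n]


def random_insert(rng, max_len=4):
    """Random nucleotides added at a junction (0..max_len bases)."""
    return "".join(rng.choice("ACGT") for _ in range(rng.randint(0, max_len)))


def recombine(rng):
    """Join D to J first, then V to the DJ product."""
    v_name = rng.choice(sorted(V_SEGMENTS))
    d_name = rng.choice(sorted(D_SEGMENTS))
    j_name = rng.choice(sorted(J_SEGMENTS))
    # D-J joining: trim both ends of D and the 5' end of J, then insert random bases.
    d = trim(trim(D_SEGMENTS[d_name], rng, "5p"), rng, "3p")
    j = trim(J_SEGMENTS[j_name], rng, "5p")
    dj = d + random_insert(rng) + j
    # V-DJ joining: trim the 3' end of V, then insert random bases.
    v = trim(V_SEGMENTS[v_name], rng, "3p")
    return (v_name, d_name, j_name), v + random_insert(rng) + dj


rng = random.Random(0)
junctions = {recombine(rng)[1] for _ in range(2000)}
print(len(junctions), "distinct sequences from only 2 V x 2 D x 2 J segment combinations")
```

Even with eight possible segment combinations, the junctional edits produce many more distinct sequences, which is the effect the paragraph attributes to junctional diversity.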
Leveraging immune repertoire sequencing for innovative therapeutic strategies
Immune repertoire profiling, through TCR and BCR sequencing, provides a comprehensive view of the immune system. By analyzing clonal diversity and dynamics, researchers gain insights into immune responses, disease mechanisms, and therapeutic targets. This approach is transforming our understanding of immunity and driving advancements in immunotherapy, immuno-oncology, and the study of autoimmune and infectious diseases. Several studies have validated the utility of immune repertoire sequencing for investigating fundamental questions in healthcare and medical science. The studies summarized below highlight the benefits of bulk TCR/BCR sequencing in advancing our understanding of immune responses.
Overlapping T cell signatures in blood and tumor correlate with PD-1 efficacy
The TCR repertoire can help predict response to PD-1 immunotherapy. Given the increasing use of PD-1 blockade, identifying predictive biomarkers is crucial. TCR repertoire analysis of patients with gastrointestinal cancer treated with the anti-PD-1 antibody nivolumab revealed a correlation between treatment response and the overlap of T cell clones found in blood and tumor tissue. Individuals with a higher frequency of shared T cell clones in their blood before treatment experienced better clinical outcomes. These findings suggest that TCR repertoire analysis could serve as a predictive biomarker to guide patient stratification for PD-1 therapy [3].
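One simple way to quantify blood-tumor overlap is to sum the blood frequencies of clones that are also detected in the tumor. The sketch below assumes this definition for illustration; it is not necessarily the metric used in the cited study, and the CDR3-style labels are hypothetical placeholders for real clonotype calls.

```python
from collections import Counter


def clone_frequencies(clonotypes):
    """Convert a list of clonotype labels (one per read) into relative frequencies."""
    counts = Counter(clonotypes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {clone: n / total for clone, n in counts.items()}


def shared_clone_fraction(blood, tumor):
    """Total blood frequency carried by clones that are also detected in the tumor."""
    blood_freq = clone_frequencies(blood)
    tumor_clones = set(tumor)
    return sum(freq for clone, freq in blood_freq.items() if clone in tumor_clones)


# Hypothetical clonotype labels, one entry per sequencing read.
blood_reads = ["CASSL", "CASSL", "CASSP", "CAWSV", "CASRG", "CASRG", "CASRG", "CASTQ"]
tumor_reads = ["CASSL", "CASRG", "CASRG", "CASKK"]

overlap = shared_clone_fraction(blood_reads, tumor_reads)
print(f"Fraction of the blood repertoire shared with tumor: {overlap:.3f}")
```

Weighting by blood frequency, rather than counting shared clones, reflects the text's emphasis on the frequency of shared clones in blood.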
Characterization of TCR and BCR repertoires in hepatocellular carcinoma
In hepatocellular carcinoma (HCC), tumor-infiltrating T and B cells play a crucial role in anti-tumor immunity. Analyzing immune repertoire (IR) features of tumor and non-tumor tissues from 64 HCC cases revealed high IR heterogeneity, with non-tumor tissues showing higher BCR diversity and somatic hypermutation, while tumor tissues had comparable or higher TCR diversity and lower immune infiltration. Notably, higher IR evenness in tumors and lower TCR richness in non-tumor tissues correlated with better patient survival. These findings suggest that IR features could serve as biomarkers for HCC diagnosis and treatment, aiding future immunotherapy strategies [4].
COVID-19 severity linked to T and B cell receptor profiles
By comprehensively profiling the immune responses of individuals infected with SARS-CoV-2, researchers constructed a vast repository of B and T cell receptor sequences. Analysis of these data revealed distinct patterns in antibody and T cell responses that correlated with disease severity and progression. Specific B cell clusters associated with virus-neutralizing antibodies were identified, along with diverse T cell responses involved in early immune activation, antiviral responses, and regulatory functions. These findings provide a foundation for developing effective vaccines and immunotherapies against SARS-CoV-2 [5].
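Richness, diversity, and evenness have standard definitions that can be computed from clonotype counts. The sketch below uses clonotype count for richness, Shannon entropy for diversity, and Pielou's index for evenness. These are common choices assumed here for illustration; the cited study may define its metrics differently.

```python
import math
from collections import Counter


def repertoire_metrics(clonotypes):
    """Compute richness, Shannon diversity, and Pielou evenness from clonotype labels."""
    counts = Counter(clonotypes)
    total = sum(counts.values())
    richness = len(counts)  # number of distinct clonotypes
    shannon = -sum((n / total) * math.log(n / total) for n in counts.values())
    # Pielou evenness is undefined for fewer than two clonotypes; 0.0 is used by convention here.
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return {"richness": richness, "shannon": shannon, "evenness": evenness}


# Hypothetical repertoires with the same richness but different clonal expansion.
even_repertoire = ["A", "B", "C", "D"] * 5
skewed_repertoire = ["A"] * 17 + ["B", "C", "D"]

for name, reads in (("even", even_repertoire), ("skewed", skewed_repertoire)):
    m = repertoire_metrics(reads)
    print(f"{name}: richness={m['richness']}, shannon={m['shannon']:.3f}, evenness={m['evenness']:.3f}")
```

The two example repertoires contain the same four clonotypes, so richness is identical, but the expanded clone in the skewed repertoire lowers both Shannon diversity and evenness.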
Decode immune repertoires with MedGenome’s immune profiling solutions
MedGenome delivers advanced immune profiling solutions to unravel the intricacies of TCR and BCR repertoires. Our comprehensive platform integrates high-throughput technology with expert bioinformatics, providing precise and actionable insights. From sample preparation to in-depth repertoire characterization, including the detection of rare clonotypes, our seamless workflow empowers researchers to accelerate discoveries. Our expert team offers tailored support and generates publication-ready results. Utilizing Takara’s Immuneprofiler and VDJ tools, our analysis pipeline delivers comprehensive reports encompassing full-length clonotype sequences, clonal frequencies, diversity metrics, V and J gene usage, and phylogenetic analysis of targeted clonotypes.
Table 1. Details of sample types, library generation methods and analysis offerings for bulk TCR and BCR profiling
MedGenome’s comprehensive and precise immune repertoire analyses enable the identification of critical biomarkers and therapeutic targets, empowering researchers to develop innovative treatments for cancer, infectious diseases, and autoimmune disorders.
Optimize your immune research with MedGenome’s advanced repertoire profiling. Contact us to see how our immune profiling and antibody discovery services can accelerate your R&D efforts.
References
Liu H, Pan W, Tang C, Tang Y, Wu H, et al. The methods and advances of adaptive immune receptors repertoire sequencing. Theranostics. 11, 8945-8963 (2021).
Katoh H, Komura D, Furuya G, Ishikawa S. Immune repertoire profiling for disease pathobiology. Pathol Int. 73, 1-11 (2023).
Aoki H, Ueha S, Nakamura Y, Shichino S, Nakajima H, et al. Greater extent of blood-tumor TCR repertoire overlap is associated with favorable clinical responses to PD-1 blockade. Cancer Sci. 112, 2993-3004 (2021).
Xie S, Yan R, Zheng A, Shi M, Tang L, et al. T cell receptor and B cell receptor exhibit unique signatures in tumor and adjacent non-tumor tissues of hepatocellular carcinoma. Front Immunol. 14, 1161417 (2023).
Schultheiß C, Paschold L, Simnica D, Mohme M, Willscher E, et al. Next-Generation Sequencing of T and B Cell Receptor Repertoires from COVID-19 Patients Showed Signatures Associated with Severity of Disease. Immunity. 53, 442-455.e4 (2020).
Illumina’s TSO 500 empowers comprehensive genomic profiling (CGP), unlocking crucial tumor biomarkers to drive precision medicine. By focusing on clinically relevant genomic regions, CGP provides in-depth analysis, accurately detects low-frequency mutations, and comprehensively characterizes tumors. This facilitates tailored treatment approaches, guiding therapy selection based on specific molecular profiles. Moreover, CGP enables non-invasive disease monitoring through circulating tumor DNA (ctDNA) analysis.
By Dr. Lavanya Balakrishnan and Vinay C. G., MedGenome Scientific Affairs
TSO 500: Advanced NGS assay for in-depth pan-cancer genomic profiling
TSO 500 is a cutting-edge next-generation sequencing (NGS) assay designed for pan-cancer genomic profiling. TSO 500 targets the full coding regions of 523 genes known to be implicated in cancer and offers a comprehensive analysis of genetic alterations in solid tumors including single nucleotide variants (SNVs), insertions/deletions (InDels), copy number variations (CNVs), gene fusions and splice variants. Furthermore, TSO 500 also assesses microsatellite instability (MSI) and tumor mutational burden (TMB), biomarkers crucial for understanding the tumor’s behavior and potential response to immunotherapy [1].
TSO 500 employs a hybridization-based capture approach using unique molecular identifiers (UMIs) and allows the analysis of both DNA and RNA from a single sample [1]. The TSO 500 portfolio offered by MedGenome includes two assays: TSO 500 for tissue-based profiling and TSO 500 ctDNA for liquid biopsy analysis.
Table 1. Features of the TSO 500 portfolio offered by MedGenome

| Feature | TSO 500 | TSO 500 ctDNA |
| --- | --- | --- |
| Sample type | Tissue biopsies and FFPE | Blood |
| Sample input and amount | DNA and RNA; 50 ng purified DNA and RNA | Cell-free DNA; 20–30 ng purified ctDNA |
| Genes covered | 523 genes for DNA variants; 55 genes for RNA fusions and splice variants | 523 genes for DNA variants; 23 genes for DNA fusions |
| Variants called and genomic signatures | SNVs, InDels, CNVs, fusions (RNA), splice variants, TMB, MSI, HRD (DNA) | SNVs, InDels, CNVs, TMB, MSI |
| Panel size | 1.94 Mb DNA; 358 kb RNA | 1.94 Mb DNA |
| Sequencing platform and read depth | NovaSeq 6000 or NovaSeq X Plus; DNA: >80 M PE reads, 100 bp PE; RNA: >40 M PE reads, 100 bp PE | NovaSeq 6000 or NovaSeq X Plus; >400 M PE reads, 150 bp PE |
Predictive biomarker identification: TMB and MSI with TSO 500
The TSO 500 assay helps identify patients likely to benefit from immunotherapy. By quantifying TMB and determining MSI status, two key predictive biomarkers, the assay supports informed treatment decisions. Leveraging error-corrected sequencing and robust bioinformatics, the assay enables precise measurement of TMB, counting both synonymous and nonsynonymous mutations, and reliable assessment of MSI. The TMB and MSI values generated by this assay have demonstrated high concordance with those obtained from whole exome sequencing and PCR-based assays, respectively.
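TMB is conventionally reported as mutations per megabase of sequenced territory. The sketch below shows only that arithmetic. It assumes the 1.94 Mb DNA panel size from Table 1 as the denominator and a cutoff of 10 mutations/Mb for "TMB-high"; both are illustrative assumptions, since the assay's own pipeline applies its own eligible-region size and variant filtering rules.

```python
# DNA panel size quoted in Table 1; using it as the TMB denominator is an assumption.
PANEL_SIZE_MB = 1.94


def tumor_mutational_burden(eligible_mutation_count, region_mb=PANEL_SIZE_MB):
    """Return TMB as eligible somatic mutations per megabase."""
    if region_mb <= 0:
        raise ValueError("region_mb must be positive")
    return eligible_mutation_count / region_mb


def classify_tmb(tmb, high_cutoff=10.0):
    """Label a sample using a hypothetical cutoff (mutations/Mb)."""
    return "TMB-high" if tmb >= high_cutoff else "TMB-low"


# Hypothetical sample with 27 eligible somatic mutations after filtering.
tmb = tumor_mutational_burden(27)
print(f"TMB = {tmb:.1f} mutations/Mb ({classify_tmb(tmb)})")
```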
From data to insights: Illumina Connected Insights and Dragen
Illumina Connected Insights, powered by the Dragen bioinformatics platform, transforms complex genomic data into actionable insights. By integrating the TSO 500 assay, this powerful combination delivers comprehensive tumor profiling, enabling precise patient selection and optimized treatment strategies. Dragen rapidly processes vast amounts of sequencing data, providing accurate variant calls, while Illumina Connected Insights offers a user-friendly interface for interpretation, clinical decision support, and seamless workflow integration.
Effectiveness of TSO 500 in cancer immunotherapy
The TSO 500 assay has proven highly effective in immunotherapy by identifying biomarkers such as TMB and MSI that predict response to immune checkpoint inhibitors. Several studies have shown how the assay has helped identify actionable mutations in various types of cancer, leading to successful targeted therapies. Validation of TSO 500 on 170 clinical samples across different cancers demonstrated precision and accuracy of over 99%, with sensitivity and specificity of at least 99% for all variant types [2]. Using this assay, higher response rates to anti-PD-(L)1 therapy in TMB-high cases were observed, particularly in gastric, gallbladder, head and neck cancers, and melanoma [3]. It also highlighted significant genetic alterations, such as BRAF mutations in peritoneal metastases from colorectal cancer [4] and heterogeneity in intrahepatic cholangiocarcinoma [5]. Additionally, TSO 500’s comprehensive profiling of early-stage NSCLC [6] and endometrial serous carcinoma [7] revealed actionable mutations and potential markers for prognosis and treatment stratification. These findings underscore TSO 500’s role in enhancing diagnostic accuracy and guiding therapeutic decisions across different cancer types.
How MedGenome’s comprehensive genomic profiling enhances targeted therapies
MedGenome offers end-to-end customized genomic profiling solutions to accelerate cancer research. Our expertise in handling diverse sample types, including tumor biopsies, FFPE tissues, and liquid biopsies, combined with advanced bioinformatics capabilities, ensures rapid and accurate analysis. With rapid turnaround times and scalable operations, MedGenome empowers researchers to unlock the potential of precision oncology. We analyze TSO 500 data using Dragen with Illumina Connected Insights, delivering the highest accuracy in variant calls with the fastest analysis time. Our variant summary report includes results from Dragen and Illumina Connected Insights, with advanced analysis reports featuring rich visualizations of mutations, fusions, copy number alterations, and immuno-oncology biomarkers such as TMB and MSI.
Contact us now to learn how our TSO Targeted sequencing can drive your research forward.
Froyen G, Geerdens E, Berden S, Cruys B, Maes B. Diagnostic Validation of a Comprehensive Targeted Panel for Broad Mutational and Biomarker Analysis in Solid Tumors. Cancers (Basel), 14, 2457 (2022).
Jung J, Heo YJ, Park S. High tumor mutational burden predicts favorable response to anti-PD-(L)1 therapy in patients with solid tumor: a real-world pan-tumor analysis. J Immunother Cancer., 11, e006454 (2023).
Heuvelings DJI, Wintjens AGWE, Moonen L, Engelen SME, de Hingh IHJT, et al. Predictive Genetic Biomarkers for the Development of Peritoneal Metastases in Colorectal Cancer. Int J Mol Sci., 24, 12830 (2023).
Kinzler MN, Schulze F, Jeroch J, Schmitt C, Ebner S, et al. Heterogeneity of small duct- and large duct-type intrahepatic cholangiocarcinoma. Histopathology, 84, 1061-1067 (2024).
Choi SJ, Lee JB, Kim JH, Hong MH, Cho BC, Lim SM. Analysis of tumor mutational burden and mutational landscape comparing whole-exome sequencing and comprehensive genomic profiling in patients with resectable early-stage non-small-cell lung cancer. Ther Adv Med Oncol., 16, 17588359241240657 (2024).
Aisagbonhi O, Ghlichloo I, Hong DS, Roma A, Fadare O, et al. Comprehensive next-generation sequencing identifies novel putative pathogenic or likely pathogenic germline variants in patients with concurrent tubo-ovarian and endometrial serous and endometrioid carcinomas or precursors. Gynecol Oncol., 187, 241-248 (2024).
Single-cell sequencing offers unprecedented resolution for analyzing the unique molecular signatures of individual cells. By deconstructing the cellular landscape, it reveals hidden cell populations, tracks how cells differentiate, and identifies disease biomarkers. By dissecting the genomic, transcriptomic, and epigenetic landscape of single cells, researchers can now gain a deeper understanding of complex biological processes, disease mechanisms, and how patients respond to treatments. This method is essential for both basic and clinical research for unlocking the mysteries of cellular complexity and dynamic biological processes.
By Dr. Lavanya Balakrishnan, Vinay C. G. and Michelle Balakrishnan Vierra, MedGenome Scientific Affairs
Single-cell sequencing: techniques and practical applications in cellular research
Single-cell sequencing utilizes a variety of techniques, with each assay tailored to target specific cellular information according to the investigator’s needs. Here’s an overview of key techniques and their applications:
Single-cell RNA sequencing (scRNA-seq)
scRNA-seq enables gene expression analysis at the single-cell level, revealing hidden cellular diversity. By profiling RNA molecules, it identifies new cell types and discovers disease-associated gene networks, facilitating biomarker discovery and treatment improvement. This technique accelerates drug development by enabling precise patient stratification and monitoring, leading the way to personalized medicine.
Case Example: Recent MedGenome-Supported Publication
A recent study by Pasqualina Colella et al. from Stanford University, published in Nature Communications, employed scRNA-seq to reveal unexpected diversity in microglia-like cells within the brain. This finding shed light on how blood-derived cells can repopulate the central nervous system. Their work further demonstrates the therapeutic potential of stem cell transplants for progranulin-deficient neurodegenerative diseases [1].
Single-cell DNA sequencing (scDNA-seq)
scDNA-seq helps researchers understand the genetic makeup of individual cells and identify driver mutations and copy number variations, revealing tumor heterogeneity and cancer evolution. Ng et al. (2024) employed scDNA-seq to investigate the mechanisms that give rise to amplicons in esophageal adenocarcinoma, a cancer fueled by frequent gene amplifications. This approach provided deeper insight into how amplified regions contribute to tumor evolution over time [2].
Single-cell TCR/BCR sequencing
Single-cell TCR/BCR sequencing profiles T and B cell populations to characterize the immune system’s adaptive defenses, uncovering rare immune cells crucial for fighting infections and tumors. The technique analyzes the unique variable regions within T and B cell receptors, enabling researchers to map antigen specificities and track immune cell development. Blanco-Heredia et al. (Memorial Sloan Kettering Cancer Center, New York) used single-cell TCR sequencing to explore how the immune response interacts with tumor evolution in triple-negative breast cancer. The study found that as cancer progresses, the immune response weakens, marked by declining T cell diversity and the emergence of tumor escape mechanisms [3].
Single-cell ATAC-seq (scATAC-seq)
scATAC-seq decodes gene regulation within individual cells, identifying unique regulatory patterns and shedding light on cellular differentiation mechanisms. A study by Terekhanova et al. (2023) in Nature highlighted the impact of chromatin accessibility on cancer. Analyzing over 1 million cells from 225 tumor samples across 11 cancer types, the researchers found that changes in DNA accessibility can contribute to cancer initiation, progression, and metastasis. They identified both common and cancer-specific gene regulation patterns, including key pathways like TP53, hypoxia, and TNF signaling. The study also emphasized the role of estrogen response and epithelial-mesenchymal transition in metastasis, linking DNA accessibility to gene expression and cancer dynamics [4].
Single-cell multi-omics analysis
Single-cell multi-omics analysis integrates diverse data types beyond gene expression (scRNA-seq), including cell surface proteins, antigen receptors (TCR/BCR), and chromatin accessibility (scATAC-seq). CITE-seq further empowers this approach by simultaneously analyzing a large number of cell surface proteins alongside gene expression in a single cell, offering a deeper understanding of cellular interactions and regulation. For example, a study using scRNA-seq and scATAC-seq on pediatric KMT2A-rearranged leukemia revealed key insights for targeted therapies as well as immunotherapies [5].
A seamless workflow for new discoveries
As a trusted 10x Genomics certified provider, MedGenome offers end-to-end solutions tailored to your specific single-cell research needs. From sample preparation to data analysis, our expert team is there to support your project.
Extracting meaning from the data: Bioinformatics tools
At MedGenome, we have the expertise and the extensive computational infrastructure to provide a seamless experience from sample through insightful data analysis.
MedGenome offers a range of bioinformatics analysis options to meet your specific research needs:
Standard analysis: We provide raw sequencing data (FASTQ files) and insights from Cell Ranger, including quality control metrics, gene expression levels, and heatmap visualizations for initial exploration using Loupe Browser software.
Advanced analysis: MedGenome’s advanced analysis capabilities include all standard analysis deliverables along with advanced QC via Seurat and enhanced filtering of low-quality cells, contamination, and multiplets. The analysis also encompasses dimensionality reduction and clustering analysis on filtered data, interactive t-SNE plots with cluster information, and general cell type annotation. Additionally, MedGenome offers differential gene expression analysis for clusters and annotated cell types, pathway enrichment analysis, and group comparison.
Specialized analysis: Tailor the analysis further with custom cell type annotations based on your unique markers. Uncover functional differences between cell types through differential gene expression analysis and map their interactions via cell-to-cell interactome analysis, providing a more comprehensive understanding of cellular behavior.
MedGenome, your single cell sequencing partner
MedGenome, a certified 10x Genomics partner, offers flexible and scalable single-cell sequencing solutions. We specialize in analyzing diverse samples—from fresh to cryopreserved and fixed specimens—using a range of advanced techniques. Our meticulous approach includes gentle tissue dissociation, viability checks, and precise cell sorting via FACS. Beyond processing, our expert bioinformatics team integrates samples, performs tailored analysis, and provides comprehensive, publication-ready reports. Researchers benefit from detailed metrics, plots, and raw data access for further exploration.
Unlock the full potential of your research with MedGenome’s cutting-edge single-cell sequencing solutions. Contact us today to learn how we can accelerate your scientific discoveries and drive groundbreaking insights in biology and medicine.
References
Colella P, Sayana R, Suarez-Nieto MV, Sarno J, Nyame K, et al. CNS-wide repopulation by hematopoietic-derived microglia-like cells corrects progranulin deficiency in mice. Nat Commun 15, 5654 (2024).
Ng AWT, McClurg DP, Wesley B, Zamani SA, Black E, et al. Disentangling oncogenic amplicons in esophageal adenocarcinoma. Nat Commun, 15, 4074 (2024).
Blanco-Heredia J, Souza CA, Trincado JL, et al. Converging and evolving immuno-genomic routes toward immune escape in breast cancer. Nat Commun 15, 1302 (2024).
Terekhanova NV, Karpova A, Liang WW, et al. Epigenetic regulation during cancer transitions across 11 tumour types. Nature 623, 432–441 (2023).
Chen C, Yu W, Alikarami F, Qiu Q, Chen CH, Flournoy J, et al. Single-cell multiomics reveals increased plasticity, resistant populations, and stem-cell-like blasts in KMT2A-rearranged leukemia. Blood 139(14), 2198-2211 (2022).
Short-read sequencing technologies have, without a doubt, revolutionized genomics. The ability to look at genetic variants at the base pair level and compare gene expression levels between normal and other conditions has made it possible to diagnose more diseases, develop more robust crops, and protect the biodiversity here on Earth.
By Michelle Vierra Balakrishnan
However, short reads are inherently limited and often fall short in fully characterizing complex genomes and transcriptomes. Short reads struggle to resolve repetitive regions, assemble complex structural variants, and fully capture isoform diversity. This is where the power of PacBio long-read sequencing comes into play.
PacBio HiFi sequencing delivers highly accurate long reads, enabling us to more fully profile not only the bases involved, but also the context in which genetic variation exists. The ability to sequence every base in a genome or transcriptome opens doors previously blocked by extreme GC content, repeat-rich regions, and long stretches of homozygosity. As scientists dedicated to making the world a healthier place, MedGenome now offers researchers an unparalleled ability to delve deeper into the intricacies of genomes and transcriptomes with both short-read and long-read sequencing solutions.
De novo genome assembly and annotation solution:
Generating high-quality, contiguous genome assemblies for any organism, regardless of complexity.
Identifying and annotating genes, isoforms, and regulatory elements with greater accuracy than ever before.
Understanding the complete genomic blueprint of an organism, crucial for evolutionary studies, conservation efforts, and agricultural advancement.
Comprehensive human variant detection solution:
Detecting all types of genetic variation, including SNVs, Indels, SVs, and variants in complex regions, with high confidence.
Providing haplotype-resolved variant information, crucial for understanding inheritance patterns and disease mechanisms.
Enabling more accurate diagnosis, personalized treatment strategies, and a deeper understanding of human health and disease.
Full-Length RNA sequencing solution:
Capturing the complete length of RNA transcripts, providing a comprehensive view of isoform diversity.
Identifying novel transcripts and fusion genes, crucial for understanding disease mechanisms and identifying potential drug targets.
Offering both bulk and single-cell full-length RNA sequencing, enabling researchers to dissect cellular heterogeneity and unravel the complexities of gene expression at single-cell resolution.
Combined capabilities – your omics superpower
The future of genomics lies in leveraging the strengths of both short-read and long-read technologies. This hybrid approach, combining the comprehensiveness of long reads with the affordability and counting ability of short reads, will undoubtedly accelerate discoveries and advance our understanding of complex biological systems.
From strategy to publication: crafting omic insights with your research in mind
At MedGenome, we believe in the power of comprehensive genomic exploration. That’s why we offer both short-read and PacBio long-read sequencing services, allowing you to choose the best strategy for your specific needs or combine both for unparalleled insights. Whether you’re assembling complex genomes, characterizing cryptic genetic variation, or unraveling the complexities of the transcriptome, our expert team is here to guide you every step of the way. Contact us today to discuss how our comprehensive suite of sequencing solutions can empower your next breakthrough discovery.
Spatial analysis has become an indispensable tool in unraveling the complexities of biological systems. From understanding gene function to dissecting tissue microenvironments, spatial data holds the key to unlocking valuable insights into biological processes. However, harnessing the full potential of spatial data requires not only sophisticated analytical techniques but also considerable computational resources and expertise.
By Michelle Vierra Balakrishnan
Standard spatial analysis
At the heart of spatial analysis lies the extraction of meaningful information from spatially resolved transcriptomic data. However, the output of a spatial experimental run is a FASTQ file from a sequencer that has to be further analyzed for useful information. Space Ranger, the go-to solution for standard spatial analysis of 10x Genomics spatial data, provides researchers with essential statistics and visualizations to assess data quality and explore gene expression patterns.
Space Ranger’s standard analysis pipeline generates key metrics such as the number of spots, genes per spot, and gene expression overlaid on slide images.
While Space Ranger is a free tool, it requires expertise to install and run, as well as significant computational resources to perform the analysis. For researchers who want to save that time and those resources, MedGenome offers this standard analysis as an add-on to your spatial project at a nominal cost.
Advanced spatial analysis: getting multidimensional insights and publication-ready images for your tissue sections
To truly get valuable insights from spatial data, you need additional QC and biological context. MedGenome’s advanced spatial analysis solutions go beyond standard readouts, incorporating rigorous filtering criteria to ensure your data is high-quality and reliable. We first identify and filter out low-quality spots and contaminants such as mitochondrial and ribosomal RNA, providing a clean dataset.
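The spot filtering described above can be sketched as a threshold on total counts and on the fraction of counts from mitochondrial genes. This is a minimal illustration, not MedGenome's pipeline: the thresholds, the spot barcodes, and the count values are hypothetical, and mitochondrial genes are identified by the `MT-` prefix used for human gene symbols.

```python
def mito_fraction(spot_counts):
    """Fraction of a spot's counts that come from mitochondrial genes (MT- prefix)."""
    total = sum(spot_counts.values())
    if total == 0:
        return 0.0
    mito = sum(n for gene, n in spot_counts.items() if gene.startswith("MT-"))
    return mito / total


def filter_spots(spots, min_total_counts=500, max_mito_fraction=0.2):
    """Keep spots with enough counts and an acceptable mitochondrial fraction.

    Both thresholds are hypothetical defaults chosen for this example.
    """
    kept = {}
    for barcode, counts in spots.items():
        total = sum(counts.values())
        if total >= min_total_counts and mito_fraction(counts) <= max_mito_fraction:
            kept[barcode] = counts
    return kept


# Hypothetical spots: gene symbol -> count.
spots = {
    "spot_A": {"EPCAM": 400, "KRT8": 350, "MT-CO1": 50},
    "spot_B": {"EPCAM": 30, "MT-CO1": 20},                    # too few total counts
    "spot_C": {"EPCAM": 300, "MT-CO1": 400, "MT-ND1": 300},   # mostly mitochondrial
}

kept = filter_spots(spots)
print("Spots passing QC:", sorted(kept))
```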
Following spot filtering, we use principal component analysis (PCA) and reclustering to provide a more accurate representation of the spatial transcriptome. We then provide visualizations as spatial plots by cluster to look at spatial localization of the spots. These plots, along with a heatmap and table of differential gene expression, furnish our clients with deeper insights into spatial gene expression patterns, paving the way for novel biological discoveries.
Specialized analysis: cell type annotation that fulfills your specific custom needs and interactome data for unique research questions
For researchers seeking specialized insights, MedGenome offers tailored analysis solutions designed to address specific research questions. Cell type annotation, a cornerstone of understanding the basis of tissue organization, is seamlessly integrated into MedGenome’s repertoire of analysis solutions. Leveraging public datasets, we can annotate the cell types in your data based on the tissue you are exploring. With this analysis, we provide spatial plots by cell type and provide additional plots for the confidence of cell annotation assignments.
For researchers who have specific markers in mind for their cell types, we can take client-provided markers and annotate the specific cell types you’re interested in within your spatial data. With custom marker cell annotation, we provide spatial plots by cell type and a table of differential gene expression.
In addition to cell type annotation, MedGenome’s specialized analysis extends to pathway analysis and visualization of cell-to-cell interactions. Cell-to-cell communication analysis models the probability of communication between cell types by integrating gene expression with prior knowledge of the interactions between signaling ligands, receptors and their cofactors.
As part of the analysis, we provide a visualization of the interactome plotted on a spatial image showing not only the number of interactions between cell types, but also the weighted strength of those interactions. By elucidating molecular pathways and mapping intercellular communication networks, MedGenome equips researchers with a comprehensive understanding of tissue biology at the spatial level.
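To make the idea of integrating expression with ligand-receptor knowledge concrete, here is a deliberately simplified Python sketch. It is an assumption-laden toy, not the model used in the analysis described above: it ignores cofactors and statistical testing, and simply multiplies the mean ligand expression in a sender cell type by the mean receptor expression in a receiver cell type. The function name `interaction_scores` and the input layout are hypothetical.

```python
from statistics import mean


def interaction_scores(expr, cell_types, lr_pairs):
    """Score sender -> receiver communication for each ligand-receptor pair.

    expr: {gene: [expression value per spot]} (hypothetical layout)
    cell_types: [cell type label per spot], same order as the expression lists
    lr_pairs: [(ligand_gene, receptor_gene)] taken from prior knowledge
    Score = mean ligand expression in sender * mean receptor expression in receiver.
    """
    labels = sorted(set(cell_types))

    def avg(gene, label):
        # Mean expression of a gene across spots annotated with this cell type.
        vals = [v for v, t in zip(expr.get(gene, []), cell_types) if t == label]
        return mean(vals) if vals else 0.0

    scores = {}
    for ligand, receptor in lr_pairs:
        for sender in labels:
            for receiver in labels:
                scores[(sender, receiver, ligand, receptor)] = (
                    avg(ligand, sender) * avg(receptor, receiver)
                )
    return scores


# CD274 encodes PD-L1 and PDCD1 encodes PD-1; the values are made up.
expr = {"CD274": [4.0, 2.0, 0.0, 0.0], "PDCD1": [0.0, 0.0, 3.0, 1.0]}
cell_types = ["Macrophage", "Macrophage", "T cell", "T cell"]
scores = interaction_scores(expr, cell_types, [("CD274", "PDCD1")])
```

Summing such scores over all ligand-receptor pairs for each sender-receiver combination would give a rough "weighted strength" of interaction, while counting the non-zero pairs would give the number of interactions.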
From strategy to publication: crafting spatial insights with your research in mind
Spatial data analysis holds immense promise for advancing our understanding of complex biological systems. However, navigating the intricacies of spatial data analysis requires more than just off-the-shelf tools—it demands specialized expertise and tailored solutions. MedGenome’s comprehensive suite of spatial analysis services, ranging from standard analysis to specialized annotation and pathway analysis, offers researchers a streamlined pathway to unlocking the rich biological insights hidden within spatial data.
By MedGenome Scientific Affairs
Skin cancer is the most common type of cancer in the US. One in five Americans is expected to develop skin cancer by the age of 70. Common forms include basal cell carcinoma (BCC), squamous cell carcinoma (SCC) and melanoma. More than 5 million cases of skin cancer are estimated to occur annually in the US alone. Non-melanoma skin cancers (BCC and SCC) are the most common, followed by melanoma, the more aggressive form. Other types of skin cancer include Merkel cell cancer, cutaneous T-cell lymphoma, Kaposi sarcoma, skin adnexal tumors and sarcomas.
Understanding skin cancer risks: Several factors are associated with the risk of developing skin cancer, including chronic exposure to UV light, a history of sunburn, the number and size of moles on the skin, and fair skin with light hair and eye color. A family history of skin cancer also increases the likelihood of developing the disease because of genetic predisposition. Other risk factors include chronic skin inflammation, immune suppression, and environmental exposures such as arsenic and radiation1.
Genetic insights into melanoma’s aggressive nature
While non-melanoma skin cancer has a negligible impact on cancer mortality, melanoma accounts for most skin cancer-related deaths1.
Malignant melanoma arises from unchecked growth or damage to melanocytes. It is characterized by significant genetic heterogeneity, with tumors carrying numerous genetic alterations and mutations. This complexity fuels the aggressiveness of melanoma, driving tumor progression, metastasis, and resistance to treatment. A study published in Nature Cell Biology examined the role of the LAP1 protein in melanoma progression2. Elevated LAP1 levels in metastatic cells point to a role in cancer spread, identifying LAP1 as a regulator of melanoma’s aggressiveness and its targeting as a possible strategy to curb metastasis. Conducted by researchers from the Francis Crick Institute, Queen Mary University of London, and King’s College London, the study also proposes LAP1 as a potential prognostic marker in melanoma patients: higher LAP1 levels at the perimeter of primary tumors correlate with increased melanoma aggressiveness and poorer outcomes.
Genetic factors and syndromic associations in skin cancer pathogenesis
Skin cancer occurrence is significantly influenced by hereditary genetic predisposition. For example:
Melanoma: Hereditary melanoma arises from mutations in two key genes: cyclin-dependent kinase inhibitor 2A (CDKN2A), a major tumor suppressor gene, and cyclin-dependent kinase 4 (CDK4), an oncogene. These mutations notably elevate melanoma risk, with CDKN2A mutations alone accounting for 35-40% of familial melanomas. Other implicated genes include BAP1, BRAF, KIT, MITF, NF1, NRAS, PTEN, TERT, and TP53, affecting crucial pathways such as the phosphoinositide 3-kinase (PI3K)/AKT pathway and the RAS/RAF/MEK/ERK signaling cascade.
Basal cell carcinoma: Mutations in PTCH1 and PTCH2 genes underlie Basal Cell Nevus Syndrome, increasing the risk of BCC.
Squamous cell carcinoma: Certain syndromes like oculocutaneous albinism, epidermolysis bullosa, and Fanconi anemia are associated with higher susceptibility to SCC.
Current strategies for prevention and treatment of skin cancers:
Preventing skin cancer involves several key strategies aimed at reducing exposure to harmful UV radiation and minimizing other risk factors. These include using sunscreen, avoiding artificial sources of UV exposure like tanning beds and sunlamps, and regularly performing skin self-exams. By adopting these preventive measures, individuals can reduce their risk of developing skin cancer and promote overall skin health.
Early diagnosis is critical for skin cancer, as survival rates plummet with advanced disease stages. While surgery offers a cure for localized cases, advanced stages like metastatic melanoma have significantly worse outcomes, with 3-year overall survival ranging from a mere 4.7% to 26.4%. Fortunately, promising new avenues like immunotherapy and targeted therapies are providing renewed hope. BRAF/MEK inhibitors, anti-PD1 therapy, and combinations like nivolumab and ipilimumab have shown promising results in melanoma (Table 1). For advanced BCC, vismodegib and sonidegib target the Hedgehog pathway, while cemiplimab serves as a second-line therapy. Similarly, anti-PD1 agents like cemiplimab, pembrolizumab, and cosibelimab have yielded significant responses in locally advanced or metastatic SCC. PD-1/PD-L1 inhibitors and locoregional approaches are also under investigation for Merkel cell carcinoma3.
Additionally, tumor mutational burden (TMB), measured with genomic techniques, is increasingly recognized as a key predictor of treatment response, particularly to immunotherapy. High TMB correlates with heightened responsiveness to immunotherapeutic interventions, offering a promising avenue for personalized treatment strategies.
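TMB is usually reported as the number of somatic mutations per megabase of sequenced genome, which is simple arithmetic once the mutation count and the covered territory are known. The Python sketch below illustrates that calculation under stated assumptions: the function names are hypothetical, which mutation classes are counted depends on the assay, and the 10 mutations/Mb cutoff is used here only as an illustrative default.

```python
def tumor_mutational_burden(n_somatic_mutations, covered_bases):
    """Return TMB in mutations per megabase of sequenced territory."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return n_somatic_mutations / (covered_bases / 1_000_000)


def is_tmb_high(tmb, cutoff=10.0):
    """Flag a sample as TMB-high; the cutoff is an assumption for illustration."""
    return tmb >= cutoff


# Hypothetical example: 45 counted mutations over a 1.5 Mb panel.
tmb = tumor_mutational_burden(45, 1_500_000)
high = is_tmb_high(tmb)
```

Because panel size sits in the denominator, the same mutation count gives very different TMB values on a small targeted panel than on a whole exome, which is one reason cutoffs need to be validated per assay.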
Table 1. List of therapeutic agents approved by FDA for melanoma4
Sl. No. | Therapeutic drug | Mode of action
Targeted therapy
1 | Vemurafenib | BRAF inhibitor
2 | Cobimetinib + Vemurafenib | MEK inhibitor + BRAF inhibitor
3 | Binimetinib + Encorafenib | MEK inhibitor + BRAF inhibitor
4 | Dabrafenib + Trametinib | BRAF inhibitor + MEK inhibitor
Immunotherapy
5 | Ipilimumab | Antibody against CTLA-4
6 | Nivolumab | Antibody against PD-1
7 | Pembrolizumab | Antibody against PD-1
8 | Talimogene Laherparepvec (T-VEC) | Oncolytic virus
9 | Ipilimumab + Nivolumab | Antibody against CTLA-4 + antibody against PD-1
10 | Tebentafusp-tebn | T-cell receptor-bispecific molecule that targets both glycoprotein 100 and CD3
11 | Nivolumab + Relatlimab | Antibody against PD-1 + antibody against LAG-3
Combined (Immunotherapy + Targeted therapy)
12 | Atezolizumab + Cobimetinib + Vemurafenib | Antibody against PD-L1 + MEK inhibitor + BRAF inhibitor
Cellular therapy
13 | Lifileucel | Tumor-derived autologous T-cell immunotherapy
Numerous clinical trials are exploring pharmacological treatments for melanoma, BCC and SCC. Preclinical research has identified potential future targets such as CD126, CSPG4, tandem CD70 and B7-H3, and αvβ3 integrin for targeted melanoma therapy. In addition, novel approaches like oncolytic virus therapy and interventions to enhance immunotherapy effectiveness are being investigated. Despite the success of immunotherapy in some cases, its efficacy varies among patients, with only about 50% experiencing long-term survival. Consequently, current research aims to identify predictors of immunotherapy response and develop strategies for better prognosis in refractory patients. These ongoing investigations aim to develop more effective and personalized treatment options, ultimately improving patient outcomes and survival rates5.
The FDA has recently approved lifileucel (Amtagvi), the first tumor-infiltrating lymphocyte (TIL) therapy for cancer, for the treatment of advanced melanoma. TIL therapy entails isolating immune cells from a patient’s tumor, expanding them outside the body, and infusing them back to leverage their potency against the cancer6.
Genomics in skin cancer: In recent decades, advancements in skin cancer treatment, including targeted agents and immunotherapy, have significantly improved patient outcomes. Despite these strides, challenges persist, including limited efficacy, therapy resistance, and adverse effects. For example, while immunotherapy shows promise, it benefits only a subset of patients and can trigger adverse reactions. Combining therapies can enhance effectiveness but often leads to heightened side effects. Therefore, ongoing research aims to explore new targets, refine existing therapies, mitigate side effects, and comprehend resistance mechanisms. Genomic technologies, like next-generation sequencing (NGS), play a pivotal role in this pursuit. NGS panels have transformed skin cancer treatment by enabling precise molecular characterization. By analyzing individual patient tumor profiles, specific genetic alterations can be identified for targeted personalized therapies. This approach holds significant promise for enhancing treatment efficacy, minimizing side effects, and ultimately improving the prognosis for skin cancer patients.
Conclusions
The significant burden of skin cancer necessitates continued research efforts to improve prevention, early detection, and treatment strategies. This includes public health initiatives to promote sun protection and early detection measures, alongside advancements in personalized medicine through leveraging genetic insights. By tailoring treatment approaches based on individual patient characteristics and tumor profiles, we can strive for a future with more effective and less toxic therapies, ultimately reducing the morbidity and mortality associated with skin cancer.
MedGenome’s advanced solutions for skin cancer
MedGenome offers comprehensive tumor microenvironment solutions supported by a targeted sequencing approach, facilitating the identification of immuno-oncology biomarkers such as microsatellite instability (MSI) and tumor mutational burden (TMB). Our optimized NGS assays enable the detection of low-prevalence pathogenic variants and epigenetic regulators relevant to various skin cancers. As pioneers in advanced genomic technologies and a certified 10x service provider, we support your research journey from design to publication. Our expertise ensures the selection of optimal workflows, precise sample processing, and timely result delivery. We augment your research with custom visualizations, tailored analysis workflows, and seamless integration of external data, ensuring readiness for publication.
Feel free to contact our expert scientific team at research@medgenome.com for any questions or further information.
Jung-Garcia Y, Maiques O, Monger J, et al. LAP1 supports nuclear adaptability during constrained melanoma cell migration and invasion (2023). Nat Cell Biol. 25:108–119.
Rubatto M, Sciamarrelli N, Borriello S, Pala V, Mastorino L, Tonella L, Ribero S, Quaglino P. Classic and new strategies for the treatment of advanced melanoma and non-melanoma skin cancer (2023). Front Med (Lausanne)., 9:959289.
Fateeva A, Eddy K, Chen S. Current State of Melanoma Therapy and Next Steps: Battling Therapeutic Resistance (2024). Cancers (Basel). 16(8):1571.
Natarelli N, Aleman SJ, Mark IM, Tran JT, Kwak S, Botto E, Aflatooni S, Diaz MJ, Lipner SR. A Review of Current and Pipeline Drugs for Treatment of Melanoma (2024). Pharmaceuticals (Basel). 17(2):214.
By MedGenome Scientific Affairs
Head and neck squamous cell carcinoma (HNSCC) ranks among the most prevalent cancers worldwide. In the United States, it is estimated that 58,450 new cases will be diagnosed in 2024, primarily affecting the oral cavity and pharynx1. Incidence rates among males are highest in non-Hispanic White and American Indian/Alaska Native individuals, with lower rates observed in Hispanic and Asian/Pacific Islander populations. Among females, incidence rates are elevated in non-Hispanic White and Asian/Pacific Islander individuals and lowest in Hispanic and Black populations. Major risk factors for HNSCC include tobacco and alcohol use, as well as human papillomavirus (HPV) infection. While tobacco-related HNSCC rates have declined over time, rising incidence rates, particularly among younger individuals, are attributed to HPV-related disease2.
Molecular pathogenesis of HNSCC
HNSCC, like other solid tumors, develops through genetic and epigenetic alterations, leading to various cancer phenotypes. Whole exome sequencing analyses of HNSCC specimens have identified mutations targeting key oncogenes and tumor suppressor pathways, such as p53, Rb/INK4/ARF, and Notch, which regulate cellular processes like proliferation, differentiation, and metastasis. Studies, including The Cancer Genome Atlas (TCGA) project, have categorized HNSCC based on genetic and expression patterns, revealing distinct subtypes with unique molecular features and clinical characteristics. Notably, mutations in genes like TP53 and CDKN2A are prevalent, while HPV+ tumors exhibit different mutation profiles, with frequent amplifications in PIK3CA and SOX2 genes. Furthermore, the Notch signaling pathway and PIK3CA alterations have been implicated in HNSCC progression and immune evasion3.
Tumor microenvironment of HNSCC
The tumor microenvironment (TME) in HNSCC, as in cancer generally, comprises a diverse mix of cancer cells and nonmalignant components. These include immune cells such as T lymphocytes, tumor-associated macrophages (TAMs), myeloid-derived suppressor cells (MDSCs), natural killer (NK) cells, and tumor-associated neutrophils (TANs), along with cancer-associated fibroblasts (CAFs) and other fibroblasts, mesenchymal cells, and vascular endothelial cells. These nonmalignant cells play dual roles in tumor growth and dissemination. Understanding the immune landscape within the TME is crucial for assessing cancer progression and the efficacy of immunotherapy. Research on the TME in HNSCC highlights its impact on cancer behavior through complex cellular interactions mediated by growth factors and cytokines. Inflammatory processes within the TME worsen malignancy, with various immune cells exhibiting diverse functions affecting prognosis. Additionally, cytokines like TGFβ and IL-6 contribute to immunosuppression, influencing therapeutic responses, while hypoxia further promotes immune evasion and tumor progression3,4.
Single cell sequencing and its applications in HNSCC
Intra-tumoral heterogeneity (ITH) and plasticity present significant hurdles in translating cancer research into effective therapies due to the varied cellular compositions and dynamic functional states within tumors. Recent advances in single-cell sequencing have substantially improved the resolution of studies exploring ITH, the TME, and intra-tumoral cell-cell communication. This technology allows for the analysis of individual cells within the TME, revealing previously unseen diversity and dynamics. By identifying rare cell populations, characterizing cellular interactions, and discovering novel biomarkers and therapeutic targets, single-cell sequencing has the potential to revolutionize our understanding of HNSCC biology and pave the way for personalized diagnostic and therapeutic strategies. Some of the key highlights of single cell sequencing studies in HNSCC are provided below.
Insights into tumor composition and behavior
In one of the initial single cell sequencing studies on oral squamous cell carcinoma (OSCC), distinct non-malignant cell clusters were identified, including T-cells, macrophages, fibroblasts, and others, with T-cell subsets varying in size across patients. Conversely, malignant cells displayed patient-specific clustering, with only a few shared signatures among tumors, notably a partial epithelial-mesenchymal transition (EMT) program associated with advanced disease characteristics. Subsequent investigations covering a wider range of HNSCC subsites and HPV statuses corroborated these findings. These studies identified additional subclusters of fibroblasts and revealed patient-specific clustering patterns of malignant cells that correlated with HPV status. Further exploration of cancer stem cells through single-cell RNA sequencing (scRNAseq) highlighted metabolic program variations and extensive ITH across multiple tumor types, suggesting that transcriptional ITH reflects tissue heterogeneity5,6.
Decoding the complexity of TME
The immune system serves as a critical defense mechanism against malignant cells, which has led to the development of immunotherapy as a prominent treatment strategy in cancer. However, despite their approval for HNSCC, immune checkpoint inhibitors benefit only a minority of patients. To understand the intricacies of the HNSCC TME, several studies have applied single-cell RNA sequencing (scRNAseq) to sorted cell populations, including CD45+ hematopoietic cells or CD3+ T-cells. These studies often incorporate adjacent normal tissue, peripheral blood leukocytes, and non-tumorous tonsils for comparison. Analyses primarily focused on T-cells because of their pivotal role in antitumor immunity and immunotherapy. They revealed distinct T-cell subsets, such as CD8+ cells, which were found in higher proportions within tumor tissue than in adjacent normal tissue. Investigations into CD4+ T-cells highlighted an increased presence of regulatory T-cells (Tregs) within tumors, indicative of an immunosuppressive tumor microenvironment. Studies also identified potential routes of cell-cell communication, particularly the interaction between macrophages and T-cells mediated by PD-L1. Mouse models and scRNAseq data further showed that immune responses within tumors vary, with distinct patterns of T-cell expansion, providing insights into tumor antigen recognition and into how immune responses differ between patients. Explorations into the humoral arm of anti-tumor immunity and the contribution of natural killer (NK) cells underscored their potential therapeutic implications in HNSCC5,6.
Conclusions
HNSCC presents a complex interplay between genetics, the tumor microenvironment, and the immune system. Understanding these factors is crucial for developing effective therapeutic strategies. Single-cell sequencing has emerged as a powerful tool, shedding light on the intricate cellular diversity within HNSCC tumors and the dynamic interactions within the TME. These insights not only hold promise for personalized medicine in HNSCC but also contribute significantly to our overall understanding of cancer biology, paving the way for advancements in cancer research across different tumor types.
MedGenome solutions
As a 10x certified service provider and an early pioneer in single-cell genomic sequencing, MedGenome offers comprehensive support throughout your research journey, from experimental design to publication. Our expertise spans selecting the most suitable single-cell workflow, processing diverse sample types with efficiency and accuracy, and delivering timely results. With custom visualizations, tailored analysis workflows, and seamless integration of external data, we ensure your research is publication-ready. Our proprietary algorithm, OncoPeptTUME™, utilizes RNA-seq data to create high-resolution maps of the tumor microenvironment based on specific cell type gene signatures.
For any queries or additional details, please reach out to our expert scientific team at research@medgenome.com.
Siegel RL, Giaquinto AN, Jemal A. Cancer statistics, 2024 (2024). CA Cancer J Clin. 74(1):12-49.
Barsouk A, Aluru JS, Rawla P, Saginala K, Barsouk A. Epidemiology, Risk Factors, and Prevention of Head and Neck Squamous Cell Carcinoma (2023). Med Sci (Basel). 11(2):42.
Elmusrati A, Wang J, Wang CY. Tumor microenvironment and immune evasion in head and neck squamous cell carcinoma (2021). Int J Oral Sci. 13:24.
Ruffin AT, Li H, Vujanovic L, Zandberg DP, Ferris RL, Bruno TC. Improving head and neck cancer therapies by immunomodulation of the tumour microenvironment (2023). Nat Rev Cancer. 23(3):173-188.
Qi Z, Barrett T, Parikh AS, Tirosh I, Puram SV. Single-cell sequencing and its applications in head and neck cancer (2019). Oral Oncol. 99:104441.
Heller G, Fuereder T, Grandits AM, Wieser R. New perspectives on biology, disease progression, and therapy response of head and neck cancer gained from single cell RNA sequencing and spatial transcriptomics (2023). Oncol Res. 32(1):1-17.
By MedGenome Scientific Affairs
Introduction
Colorectal cancer (CRC) stands as the third most prevalent cancer and the second leading cause of cancer-related deaths in the US. It is projected that in 2024, there will be around 106,590 new cases of colon cancer and 46,220 new cases of rectal cancer. CRC incidence is notably higher among African Americans and lowest in Asian Americans/Pacific Islanders. The five-year relative survival rate for localized CRC is estimated to be 91%, contrasting with a 14% rate for metastatic disease. Mortality rates among older adults have seen a decline in recent years owing to factors such as the implementation of screening programs, advancements in imaging technology for precise staging, improvements in surgical procedures, and the development of new treatment modalities. Nevertheless, concerning trends reveal a 1% annual increase in CRC mortality rates among individuals under 55 since the mid-2000s1,2.
Risk factors
Risk factors for CRC include genetic predisposition, environmental factors, and lifestyle behaviors such as obesity, smoking, and unhealthy diet. Long-standing ulcerative colitis and Crohn’s disease increase CRC risk. Other factors include family history of cancer, colon polyps, diabetes mellitus, and cholecystectomy. Additionally, gut microbiome composition, age, gender, race, and socioeconomic status influence CRC risk2.
Molecular pathways associated with CRC carcinogenesis
Approximately 75–80% of CRCs are sporadic, originating from the accumulation of genetic and epigenetic alterations within specific molecular pathways. These pathways play a crucial role in regulating cell growth, differentiation, and survival. This intricate process of carcinogenesis, known as the adenoma–carcinoma sequence, entails mutations in at least 15 cancer-related genes. From a molecular point of view, CRC is highly heterogeneous, which can be ascribed to three major molecular pathways. The most prevalent pathway, accounting for 85% of sporadic CRCs, is chromosomal instability (CIN). CIN is a hallmark of genomic instability and is characterized by gain and loss of large chromosomal segments, leading to gene copy number variations, frequent loss of heterozygosity (LOH) at specific gene loci and chromosomal rearrangements. These alterations often involve mutations in specific oncogenes such as BRAF, KRAS and PIK3CA or tumor suppressor genes such as APC, SMAD4 and TP53, which regulate cell proliferation and play crucial roles in CRC initiation and progression. The second pathway is the CpG island methylator phenotype (CIMP), evident in 15% of CRC cases. This pathway is characterized by hypermethylation of CpG islands in gene promoter regions, resulting in epigenetic silencing of the associated genes. The third is microsatellite instability (MSI), which accounts for about 13–16% of sporadic cases and is also associated with hereditary forms of the disease, such as Lynch syndrome. This pathway results from defects in the DNA mismatch repair (MMR) genes responsible for correcting errors during DNA replication3,4.
Colorectal cancer: from genomic profiling to precision medicine
Genomic insights
Next-generation sequencing (NGS) technologies have revolutionized the understanding of CRC by facilitating comprehensive genomic profiling of tumors. In the past decade, extensive sequencing investigations have explored the genetic foundations of CRC, revealing significant pathways involved in its development, such as WNT, RAS-MAPK, PI3K, TGF-β, P53, and DNA mismatch repair pathways. NGS technology, utilized by global consortia like The Cancer Genome Atlas (TCGA) Research Network, adopted a comprehensive approach, examining exome sequences and DNA copy numbers, clarifying epigenetic modifications, and delineating the role of microRNA in human cancers, including CRC. These studies provided fundamental genetic insights and identified numerous new theranostic and prognostic molecular biomarkers, prompting further investigation and integration into clinical trials. These findings highlighted the genetic diversity of colorectal cancer, challenging its prior classification as a histopathologically homogeneous disease5.
Recently, single-cell RNA sequencing has effectively enhanced current molecular classifications of CRC by identifying unique sub-clones within previously identified subtypes through bulk transcriptomics, providing potential prognostic insights. Moreover, single-cell multi-omics approaches have been employed to monitor transcriptomic and epigenomic alterations in CRC, as well as to identify clinically relevant cell sub-clones linked to cancer progression and metastasis5.
Immunotherapy for metastatic CRC
Immunotherapy, particularly immune checkpoint inhibitors, has emerged as a promising treatment modality in CRC. Currently, immunotherapy in CRC treatment is primarily used for patients with metastatic disease and whose tumors are mismatch repair deficient (dMMR) or microsatellite instability-high (MSI-H). These tumors accumulate a higher number of mutations, making them more easily recognized and targeted by the immune system. Genomic research is focused on identifying biomarkers, such as tumor mutational burden (TMB) and immune-related gene expression profiles, that can predict response to immunotherapy. Additionally, researchers are investigating strategies to enhance the efficacy of immunotherapy in CRC, including combination approaches with other targeted therapies or chemotherapy.
Role of precision medicine
The concept of precision medicine, which involves tailoring treatment strategies to individual patients based on their unique genetic makeup and tumor characteristics, is gaining traction in CRC research. Targeted therapies have improved outcomes for some patients. Predictive models that incorporate genetic and environmental factors for risk assessment and personalized screening are emerging but need validation across various populations. Precision medicine approaches aim to maximize treatment efficacy while minimizing adverse effects.
Liquid biopsies, a minimally invasive approach
Liquid biopsy techniques, such as circulating tumor DNA (ctDNA) analysis and circulating tumor cell (CTC) enumeration, are being increasingly utilized in CRC research and clinical practice. Liquid biopsies offer a minimally invasive method for monitoring disease progression, detecting minimal residual disease, and identifying treatment-resistant mutations. Plasma cell-free DNA (cfDNA) levels have conventionally been associated with tumor burden, and sequencing cfDNA has additionally demonstrated the ability to recapitulate the mutational profile of the primary tumor. Liquid biopsies have the potential to revolutionize cancer diagnosis, monitoring, and treatment response assessment.
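When cfDNA is sequenced, each candidate mutation is typically summarized by its variant allele fraction (VAF): the share of reads at that position that carry the variant. The Python sketch below shows that arithmetic together with a simple detection filter. The function names, the input layout, and both thresholds are assumptions chosen for illustration, not settings from any particular assay.

```python
def variant_allele_fraction(alt_reads, total_reads):
    """Fraction of reads at a position that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def detected_variants(calls, min_vaf=0.005, min_alt_reads=5):
    """Return {variant: VAF} for variants passing both illustrative thresholds.

    calls: {variant_name: (alt_reads, total_reads)} (hypothetical layout).
    """
    passed = {}
    for variant, (alt, total) in calls.items():
        vaf = variant_allele_fraction(alt, total)
        if alt >= min_alt_reads and vaf >= min_vaf:
            passed[variant] = vaf
    return passed


# Made-up read counts for two well-known hotspot mutations.
calls = {"KRAS G12D": (12, 2000), "BRAF V600E": (2, 1800)}
hits = detected_variants(calls)
```

Tracking the VAF of the same variant across serial blood draws is one simple way such data can be used to follow disease burden over time.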
Characterization of tumor microenvironment
The tumor microenvironment (TME) plays a crucial role in CRC progression and response to therapy. Genomic research is investigating the complex interactions between tumor cells, immune cells, stromal cells, and the extracellular matrix within the TME, including the gut microbiota. Understanding these interactions may lead to the development of novel therapeutic approaches targeting the TME, such as immune-modulating agents and stromal-targeting therapies.
Table 1. List of targeted therapy drugs used for CRC2
Molecular target | Targeted therapy | Mechanism of action
VEGF | Bevacizumab, Ramucirumab, Ziv-aflibercept and Fruquintinib | Inhibits angiogenesis mediated through VEGF pathway
EGFR | Cetuximab and Panitumumab | Binds to extracellular domain of EGFR and prevents its activation
BRAF | Encorafenib | Targets key enzymes in the MAPK signaling pathway
HER2 | Trastuzumab, Pertuzumab and Lapatinib | Binds to extracellular domain of HER2 (Trastuzumab), inhibits the heterodimerization of HER2 (Pertuzumab) and disrupts the downstream signaling pathways activated by HER2 (Lapatinib)
NTRK | Larotrectinib and Entrectinib | Inhibits the tropomyosin-related kinase (TRK) receptor domains found in TRKA, TRKB, and TRKC proteins, resulting in reduced cellular proliferation
RET | Selpercatinib | Inhibits RET kinase through ATP competitive mechanism
KRAS | Adagrasib and Sotorasib | Binds to and stabilizes RAS in its GDP-bound state, leading to decreased signal transduction, particularly through the RAF-MEK-ERK/MAP pathway
Immune checkpoint inhibitors
MSI or Deficient MMR | Pembrolizumab, Nivolumab and Dostarlimab | Binds to PD-1, a receptor expressed on activated T cells, inhibiting its activation by ligands resulting in the activation of T-cell-mediated immune responses against tumor cells
MSI or Deficient MMR | Ipilimumab | Inhibits CTLA-4 leading to T cell activation
Conclusions
A concerted global research endeavor is currently underway to advance treatments for colorectal cancer. This research encompasses a broad range of activities, from fundamental investigations aimed at unraveling the biological intricacies of CRC to inquiries into the societal factors influencing cancer risk. By adopting a multifaceted approach, we aspire to a future where CRC is not only prevented and detected at its earliest stages but also treated with personalized, efficient, and accessible strategies, ultimately enhancing the quality of life for patients worldwide.
MedGenome offerings
At MedGenome, we provide advanced NGS solutions with optimized workflows and protocols. As a 10x certified service provider, we also offer state-of-the-art single-cell sequencing solutions. Our robust in-house bioinformatics platform is tailored to transform raw data into actionable insights, delivering publication-ready, high-quality figures and detailed reports across a range of NGS data types.
Please reach out to us at research@medgenome.com to get in touch with our expert scientific team for any queries and additional details.
Huang Z and Yang M. (2022). Molecular Network of Colorectal Cancer and Current Therapeutic Options. Front Oncol. 12:852927.
Luo XJ, Zhao Q, Liu J, Zheng JB, Qiu MZ, Ju HQ and Xu RH. (2021). Novel Genetic and Epigenetic Biomarkers of Prognostic and Predictive Significance in Stage II/III Colorectal Cancer. Mol Ther. 29(2):587-596.
Kyrochristos ID, Ziogas DE, Goussia A, Glantzounis GK and Roukos DH (2019). Bulk and Single-Cell Next-Generation Sequencing: Individualizing Treatment for Colorectal Cancer. Cancers (Basel). 11(11):1809.
By MedGenome Scientific Affairs
National Cancer Prevention Month is observed every February to raise awareness and promote initiatives to prevent cancer. Cancer ranks as the second leading cause of death in the United States (US). Despite government-led cancer education initiatives, the battle against this disease remains complex, with variations in cancer risk persisting among different ethnic groups due to genetic predispositions and disparities in healthcare access. The incidence of different cancer types varies among population groups, and these differences are often associated with genetic factors.
Table 1: Cancer type and rates by ethnic groups2,3
Ethnic group | Cancer incidence rate
Hispanic/Latino and Black/African American women | Higher incidence rate of cervical cancer
American Indians/Alaska Natives | Higher death rate by kidney cancer
American Indians/Alaska Natives | Highest rates of liver and intrahepatic bile duct cancer
African-American Males | Highest incidence rate of lung and prostate cancer
White, non-Hispanic | Highest incidence rate of breast cancer
Ashkenazi Jewish Women | Higher risk of breast cancer
Cancer prevention and screening
Studies have indicated that nearly 50% of cancer deaths could be prevented through healthier lifestyles and addressing key risk factors. Some of these risk factors include tobacco use, alcohol intake, poor diet, lack of physical activity, obesity, infections with cancer-related pathogens (such as Human Papilloma Virus (HPV) and Hepatitis B Virus (HBV)), and exposure to ultraviolet radiation. In the US, about four out of ten new cancer cases are linked to preventable causes.
The primary objective of cancer screening is to detect cancer at an early stage or even before symptoms develop, aiming to improve treatment outcomes and reduce mortality rates. The United States Preventive Services Task Force (USPSTF) has provided evidence-based recommendations for conducting screening tests for individuals at average or higher-than-average risk of developing cancer. These recommendations are formulated after carefully evaluating the advantages and potential drawbacks of different strategies for disease prevention, such as cancer screening tests, genetic testing, and preventive treatments.
Some of the screening tests recommended include digital mammography and digital breast tomosynthesis for breast cancer, Pap smear and HPV test for cervical cancer, stool-based tests and direct visualization tests (such as flexible sigmoidoscopy, colonoscopy, or computed tomography colonography) for colorectal cancer, low-dose spiral CT scan for lung cancer, and prostate-specific antigen (PSA) test for prostate cancer.
Understanding the healthcare implications of cancer genomics
Cancer genomics stands at the forefront of medical research due to its ability to provide unique insights into the genetic makeup of cancerous cells and tumors. This in-depth understanding facilitates various advancements in cancer diagnosis, treatment, and prevention:
Variant detection: Genetic variants and mutations within cells are the primary cause of cancer or tumor development. Identification of such variants, including novel ones, can aid in monitoring tumor progression and developing treatment strategies tailored to individual needs. These variants encompass a range of alterations, including single nucleotide substitutions, insertions, deletions, copy number alterations, and other structural rearrangements.
Biomarker discovery: Genomic techniques help identify molecular indicators of cancer. They reveal the gene expression patterns implicated in cancer, thus guiding clinical decision-making, predicting patient outcomes, and monitoring treatment effectiveness. Examples include BRCA1 and BRCA2 mutations in breast cancer, EGFR mutations in non-small cell lung cancer, KRAS mutations in colorectal cancer, the BRAF V600E mutation in melanoma, microsatellite instability (MSI) in Lynch syndrome, and PD-L1 expression in immunotherapy.
Personalized medicine: Genomics assists in pinpointing specific population cohorts susceptible to cancer types and can even furnish a comprehensive genomic portrait of individuals, expediting treatment and facilitating the delivery of effective therapies for favorable results. Some of the common cancers where precision medicine can be very useful are colorectal cancer, breast cancer, lung cancer, leukemia, lymphoma, melanoma, esophageal cancer, stomach cancer, ovarian cancer and thyroid cancer5.
Targeted therapies: Cancer genomics helps pinpoint the genetic mutations within individual tumors and the pathways that propel cancer progression, thereby identifying precise therapeutic targets for effective treatments. Examples of targeted therapies include the selective BRAF inhibitor vemurafenib for BRAF-mutant melanoma; imatinib and nilotinib, which target the BCR-ABL protein; erlotinib, which targets the epidermal growth factor receptor (EGFR); trastuzumab, which targets the HER2 cell signaling protein; lapatinib for breast cancer; crizotinib for lung cancer; bevacizumab for lung and colon cancer; and sorafenib for liver and kidney cancer6.
Immunotherapy: Analyzing the genomic profiles of cancer and immune cells sheds light on their diverse interactions within the tumor microenvironment. This insight aids in understanding how cancer cells evade immune detection and informs the development of targeted immunotherapies. Neoantigens, identified through prediction algorithms, are emerging as crucial players in cancer immunotherapy. They are unique molecules found on the surface of cancer cells due to tumor mutations. Neoantigens activate the immune system, enabling it to selectively attack cancer cells. Harnessing this knowledge is critical in designing effective cancer immunotherapies, such as immune checkpoint inhibitors and cancer vaccines.
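As a rough illustration of the first step in neoantigen prediction, the sketch below enumerates every peptide window that spans a single amino-acid substitution. It is a simplified assumption of how candidate peptides might be generated, not a description of any specific prediction algorithm: the function name `mutant_peptides`, the 9-residue window, and the toy protein sequence are all hypothetical, and real pipelines go on to score candidates for MHC binding and expression.

```python
def mutant_peptides(protein: str, pos: int, alt: str, k: int = 9) -> list[str]:
    """Return every k-mer of the mutated protein that contains the substituted residue.

    protein: wild-type amino-acid sequence (one-letter codes)
    pos:     0-based index of the substituted residue
    alt:     mutant amino acid placed at `pos`
    k:       peptide length (9 is an illustrative default)
    """
    if not 0 <= pos < len(protein):
        raise ValueError("pos is outside the protein sequence")
    mutated = protein[:pos] + alt + protein[pos + 1:]
    first = max(0, pos - k + 1)           # leftmost window that still covers pos
    last = min(pos, len(mutated) - k)     # rightmost window that fits in the protein
    return [mutated[i:i + k] for i in range(first, last + 1)]


# Toy example: a made-up 20-residue sequence with M -> V at index 10.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
candidates = mutant_peptides(toy_protein, pos=10, alt="V")
```

Each candidate differs from the wild-type sequence at exactly one position, which is what makes it a potential tumor-specific target.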
Table 2: List of different types of immunotherapy along with examples
Next-generation sequencing (NGS) has transformed cancer detection and treatment by offering extensive genomic profiles across diverse cancer types. This innovative technology allows for the sequencing of entire genomes, exomes, transcriptomes, or specific genes, thereby facilitating a deeper understanding of cancer genomics. Furthermore, NGS offers several advantages, including the ability to tailor treatment plans to individuals, predict disease outcomes, and identify individuals at higher risk.
Conclusions
Exploring cancer genomics deepens our understanding of the molecular underpinnings of cancer, including its origins, progression, and resistance to therapy. This insight propels continuous investigation into the intricate facets of cancer biology, driving the development of novel approaches for both preventing and treating the disease.
MedGenome offers a cutting-edge genomics-based approach to analyze the tumor microenvironment with unique insights beyond IHC and FACS methods. OncoPeptTUME™ deeply interrogates RNA-sequencing data sets to produce high-resolution mapping of the tumor microenvironment using proprietary cell type-specific gene expression signatures. We also provide comprehensive genomic profiling of tumor samples using the TruSight Oncology 500 (TSO-500) assay. With 523 cancer-related gene variants and 55 RNA variants, this panel provides extensive coverage of biomarkers frequently found in various cancer types. Additionally, our scientific team excels in addressing challenging sample processing scenarios and managing high-throughput sample workflows, ensuring accurate and efficient analysis of cancer genomic data.
References
Siegel RL, Giaquinto AN, Jemal A. Cancer statistics, 2024. CA Cancer J Clin. 74(1):12-49.
By MedGenome Scientific Affairs
Single-cell RNA sequencing (scRNA-seq) is a powerful method that is widely used in biomedical research. It is extensively used to determine the cell composition of complex tissues, identify rare cell types, map heterogeneity at the single-cell level, and identify paired, full-length immunoglobulin and T-cell receptor α/β sequences. Advancements in high-throughput scRNA-seq technologies, in combination with powerful computational tools, have made scRNA-seq a widely used technology across a broad spectrum of therapeutic areas such as oncology, immunology, neuroscience and developmental biology. The requirement for live cells in most single-cell workflows is a bottleneck that limits wider usage. The advent of the 10x Genomics Flex protocol has enabled single-cell gene expression profiling of fixed samples, including FFPE samples. This offers several advantages compared to conventional single-cell workflows.
Advantages of using 10x Flex
Conventional methods for scRNA-seq primarily depend on freshly isolated or cryopreserved cells, rendering them unsuitable for formaldehyde-fixed or FFPE samples. With the 10x Genomics Chromium Single Cell Gene Expression Flex kit, it is now possible to fix and store cells or nuclei at -80°C, allowing subsequent analysis without compromising data quality. Once the fixed single-cell or nuclei samples are prepared for analysis, they undergo hybridization with probe sets designed to target specific regions in the transcriptome. These probe sets are barcoded, allowing samples to be processed either individually in a singleplex workflow or pooled in a multiplex workflow. The hybridized transcripts are then amplified to generate sequencing libraries using Gel Bead-in-emulsion (GEM) droplets and the Chromium system. Finally, the samples are sequenced and analyzed using Cell Ranger, a software suite designed by 10x Genomics for single-cell RNA sequencing data analysis. An additional multiomic benefit of Flex is the ability to integrate gene expression data with the identification of cell surface proteins at the single-cell level, in both singleplex and multiplex workflows.
A key feature of Flex is its ability to make scRNA-seq adaptable for fragile tissues, ensuring immediate preservation to minimize the loss of quality. Flex is extremely useful when dealing with infectious samples as the samples are fixed. Fixing the samples can neutralize the infectious agents, potentially allowing researchers to handle and analyze samples outside of Biosafety Level 3 facilities, depending on the specific agent, fixation method, and regulatory guidelines. Also, it offers a cost-effective price per cell and is well-suited for large-scale projects. Furthermore, the option for sample multiplexing contributes to decreased batch and experimental variability.
Table illustrating the differences in capabilities between Flex and 3′ Gene Expression assays from 10x Genomics
Features | Flex | 3′ Gene expression
Type of chemistry | Probe-based | Reverse transcription-based
Sample type | Primary cells, dissociated fresh or fixed tissue, including FFPE and cell lines | Primary cells, dissociated fresh tissue and cell lines
Cell throughput | Singleplex: 10,000 cells/channel, up to 80,000 cells/chip; Multiplex: 128,000 cells/channel, up to 1,024,000 cells/chip | Low: 1,000 cells/channel, up to 8,000 cells/chip; Standard: 10,000 cells/channel, up to 80,000 cells/chip; High: 20,000 cells/channel, up to 320,000 cells/chip
Species compatibility | Human and mouse | Human, mouse, rat, model organisms, and plants
Number of reads per cell | Between 10,000 and 40,000 | Between 30,000 and 80,000
Cell recovery | High | Variable
Sensitivity | High | Moderate
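The reads-per-cell ranges in the table translate directly into a sequencing budget. The short sketch below shows the arithmetic, assuming only the ranges listed above; the dictionary keys and the function name `read_budget` are illustrative and not part of any 10x Genomics tool.

```python
# Reads-per-cell ranges taken from the comparison table above.
READS_PER_CELL = {
    "flex": (10_000, 40_000),
    "three_prime": (30_000, 80_000),
}


def read_budget(assay: str, n_cells: int) -> tuple[int, int]:
    """Return the (minimum, maximum) total reads implied for `n_cells` cells."""
    low, high = READS_PER_CELL[assay]
    return n_cells * low, n_cells * high


# One singleplex Flex channel of 10,000 cells:
flex_min, flex_max = read_budget("flex", 10_000)
# The same number of cells on the 3' assay:
gex_min, gex_max = read_budget("three_prime", 10_000)
```

For 10,000 cells this gives 100 to 400 million reads for Flex and 300 to 800 million reads for the 3′ assay, which is one way to see how read depth affects per-sample sequencing cost.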
Exploring spatial gene expression using Visium from 10x Genomics
Spatial transcriptomics is another powerful technique for measuring gene expression across a tissue section, thus providing spatial context. Characterizing spatial distribution of different cell types in healthy and disease conditions can provide significant insights. It can also provide valuable insights into biomarker discoveries, and the elucidation of tumor heterogeneity and its dynamic microenvironments.
Since its inception, spatial transcriptomics has been widely used to study tissue architecture and associated expression pattern in various conditions. Spatial technologies can be broadly categorized into two groups: imaging-based and sequencing-based technologies. The major difference between these two approaches lies in how the spatial localization and abundance of mRNA molecules are determined within a tissue section.
Among the several platforms available for spatial transcriptomics, Visium from 10x Genomics is one of the most widely used. It is an in situ capture method, in which transcripts are captured within the tissue and subsequently sequenced externally. The Visium workflow uses slides imprinted with barcoded oligonucleotide capture probes. The tissue sections are placed onto a glass slide, stained, and then imaged. They are then permeabilized, decrosslinked and incubated with transcript-specific probes. The transcriptomic probes are then transferred to Visium slides that contain capture probes, where they are extended with barcodes. These barcoded probes are then transferred to microfuge tubes to prepare a 10x barcoded sequencing library. The libraries are then sequenced using standard short-read sequencing technologies such as Illumina.
The Visium technology is compatible with fresh frozen and FFPE tissues.
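To make the role of the spatial barcodes concrete, the sketch below shows how reads might be grouped back into positions on the slide once each read carries a barcode. It is a toy model under stated assumptions: the barcode strings, the coordinate layout, and the function name `counts_per_spot` are invented for illustration, and production software additionally handles barcode error correction and duplicate removal.

```python
from collections import Counter, defaultdict


def counts_per_spot(reads, barcode_to_xy):
    """Aggregate (barcode, gene) reads into per-spot gene counts.

    reads:         iterable of (spatial_barcode, gene_name) pairs
    barcode_to_xy: dict mapping each known barcode to its (row, col) on the slide
    """
    spot_counts = defaultdict(Counter)
    for barcode, gene in reads:
        xy = barcode_to_xy.get(barcode)
        if xy is None:
            continue  # barcode is not part of the slide layout, so skip the read
        spot_counts[xy][gene] += 1
    return spot_counts


# Hypothetical layout with two spots and a handful of reads.
layout = {"AAAC": (0, 0), "GGTT": (0, 1)}
reads = [("AAAC", "EPCAM"), ("AAAC", "EPCAM"), ("GGTT", "CD3E"), ("TTTT", "CD3E")]
spots = counts_per_spot(reads, layout)
```

The result is a gene-by-position count table, which is the starting point for overlaying expression on the stained tissue image.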
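Both Flex and Visium are powerful tools for gene expression analysis, but they differ in their capabilities and workflow: Flex profiles fixed single cells or nuclei, while Visium measures gene expression across a tissue section and preserves spatial context.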
Applications of 10x Flex and Visium
Oncology: Characterize tumor heterogeneity and tumor microenvironments
Drug discovery and development: Understand how drugs affect cells at the single-cell level, identify potential drug targets, and predict therapeutic responses
Immunology: Decipher the immune response at single-cell level, investigate immune cell composition and dynamics within tissues, understand immune response mechanism to diseases or infections
Neuroscience: Analyze gene expression in specific brain regions to gain insights into neural circuits and brain function
Developmental Biology: Characterize diverse cell types within complex tissues, identify gene expression patterns crucial for tissue formation and determine the spatiotemporal dynamics
MedGenome sequencing and bioinformatics solutions
MedGenome is a 10x Genomics Certified Service Provider empowering researchers with cutting-edge single-cell sequencing solutions. Our comprehensive bioinformatics solutions enable researchers to interrogate single cell and spatial transcriptomics data to answer questions ranging from cellular heterogeneity to cell type composition to differential gene expression analysis and much more.
Explore MedGenome’s efficient and rapid 10x Flex and Visium solutions for a budget-friendly option. For detailed insights into our multi-omics solutions, connect with the MedGenome scientific team at research@medgenome.com.
By MedGenome Scientific Affairs
Overview of Transcriptomics
Transcriptome sequencing (RNA sequencing) allows unbiased characterization of the global gene expression profiles associated with different cells and tissues. As genes govern cellular function, the transcriptome profile can provide valuable insights into the molecular mechanisms operating in a biospecimen. RNA sequencing has transformed biological research by discovering almost all transcripts encoded by a genome, including mRNAs, long non-coding RNAs and miRNAs. It has also revealed many alternatively spliced variants, which are a common feature among complex multicellular organisms. It has revolutionized biomedical research by enabling characterization of global gene expression profiles in healthy and disease conditions. Understanding the molecular mechanisms of disease has paved the way for the identification of biomarkers and the development of novel drugs targeting specific genes that drive disease pathology.
RNA sequencing has been widely used to study tumor microenvironment and the interactions between cancer cells and immune cells, providing insights into mechanisms of tumor immune evasion and potential targets for immunotherapy. In infectious disease research, RNA sequencing has been used to study host-pathogen interactions and identify host factors that contribute to disease susceptibility or resistance. By analyzing gene expression profiles of infected cells, researchers have uncovered molecular mechanisms underlying pathogen replication and host immune responses. This knowledge can aid in the development of new antiviral therapies and vaccines.
Applications of RNA Sequencing
Gene expression profiling of cells/tissues
Differential gene expression profiling to identify and quantify gene expression differences between healthy and disease tissues
Identification and characterization of alternatively spliced transcripts
Identification of biomarkers based on gene expression signatures associated with different diseases
Identification of drug targets based on molecular mechanisms that drive disease pathology
Identification of oncogenic fusion transcripts that drive cancers
Characterization of molecular mechanisms associated with host-pathogen interactions in infectious diseases
Characterization of gene expression at single cell resolution
MedGenome offers a variety of RNA sequencing services based on sample quality and amount of available starting material.
Types of RNA sequencing services offered by MedGenome
Library Prep Services | Assay Type | Stranded Type | Starting Material | Input Amount
TruSeq Stranded mRNA | mRNA Seq | Yes | RNA | 100 ng – 1 μg RNA
Illumina Stranded mRNA | mRNA Seq | Yes | RNA | 25 ng – 1 μg RNA
Takara SMART-Seq V4 | mRNA Seq | No | RNA, Cells | 10 pg – 10 ng RNA, 1 – 1,000 cells
TruSeq Stranded Total RNA | Total RNA | Yes | RNA | 100 ng – 1 μg RNA
Pico V2 / V3 | Total RNA | Yes | RNA | 250 pg – 10 ng RNA
SMART-Seq Stranded | Total RNA | Yes | RNA, Cells | 10 pg – 10 ng RNA, 1 – 1,000 cells
Single-Cell RNA Sequencing
Distinct gene expression profiles are associated with different cell types. Single-cell RNA sequencing has emerged as a powerful tool, allowing researchers to unravel the heterogeneity and complexity of biological systems. This technique enables the identification and characterization of rare cell populations, which are often missed in bulk RNA sequencing. By analyzing gene expression profiles at the single-cell level, researchers can identify cell types, define cell states, and uncover novel disease-associated biomarkers. The same technology is now widely used for characterizing immune repertoire by sequencing T-cell and B-cell receptors. It has also enabled identification of heavy and light chain pairs from B cells to characterize antigen specific antibodies.
The process of single-cell RNA sequencing involves multiple steps, including cell isolation, RNA extraction, library preparation, sequencing, and data analysis. Cells are typically dissociated from the tissue of interest and captured in microfluidic devices or droplet-based systems. Individual cells are then lysed, and RNA molecules are extracted and converted into complementary DNA (cDNA). The cDNA is amplified, and sequencing libraries are prepared for high-throughput sequencing.
Data analysis in single-cell RNA sequencing is a complex task due to the large number of cells and the high-dimensional nature of the data. Bioinformatic tools and algorithms are used to cluster cells based on their gene expression profiles, identify differentially expressed genes, and infer cellular trajectories. By integrating single-cell RNA sequencing data with other omics data, such as genomics and proteomics, researchers can gain a more comprehensive understanding of disease mechanisms. MedGenome uses the 10x Genomics platform to offer single-cell transcriptomics services.
Data Generation and Analysis
RNA sequencing generates vast amounts of data, which require sophisticated computational tools and algorithms for analysis. The raw sequencing data undergoes quality control to remove low-quality reads and sequencing artifacts. The processed reads are then aligned to a reference genome or transcriptome to determine the origin of each read.
Once the reads are aligned, the next step is to quantify gene expression levels. This involves counting the number of reads that map to each gene or transcript. Various statistical methods are used to normalize the expression data and identify differentially expressed genes between different conditions or cell types.
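The quantification and normalization steps can be illustrated with a minimal sketch. It assumes counts-per-million (CPM) as the normalization and a pseudocount-stabilized log2 fold change as the comparison; these are common simple choices rather than the specific methods used in any given pipeline, and the gene names and counts are made up. Real differential expression analysis relies on replicate-aware statistical models.

```python
import math


def cpm(counts: dict[str, int]) -> dict[str, float]:
    """Scale raw read counts to counts per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: n / total * 1_000_000 for gene, n in counts.items()}


def log2_fold_change(treated: float, control: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of two normalized values; the pseudocount avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Toy samples with different sequencing depths.
healthy = cpm({"GENE_A": 500, "GENE_B": 1_500})
disease = cpm({"GENE_A": 3_000, "GENE_B": 1_000})
lfc_a = log2_fold_change(disease["GENE_A"], healthy["GENE_A"])
```

Normalizing first matters because the two samples were sequenced to different depths; comparing raw counts would confound expression changes with library size.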
Data analysis in RNA sequencing also involves the identification of alternative splicing events, non-coding RNAs, and fusion genes. These events can provide valuable insights into disease mechanisms and potential therapeutic targets. Furthermore, RNA sequencing data can be integrated with other types of omics data, such as genomic and proteomic data, to unravel complex interactions and regulatory networks.
Looking Ahead: The Future of RNA Sequencing
The field of RNA sequencing is rapidly evolving, with new technologies and analytical approaches being developed. One of the major challenges in RNA sequencing is the analysis of low-quality or degraded RNA samples. Researchers are actively working on improving the sensitivity and accuracy of RNA sequencing methods to overcome this limitation.
Another area of active research is the integration of RNA sequencing data with other omics data, such as genomics, proteomics, and metabolomics. Integrative omics analysis can provide a more comprehensive understanding of disease mechanisms and identify novel therapeutic targets. Machine learning and artificial intelligence algorithms are being developed to analyze and interpret large-scale omics data, enabling the discovery of complex interactions.
In conclusion, RNA sequencing has revolutionized biomedical research by providing a comprehensive view of gene expression patterns and molecular signatures. It has enabled the identification of biomarkers for various diseases and has shed light on the underlying mechanisms of complex diseases. With continuous advancements in technology and data analysis, RNA sequencing holds great promise for personalized medicine and the development of targeted therapies.
MedGenome RNA Solutions: Case study
Papillary thyroid carcinoma (PTC) is one of the most common forms of thyroid cancer with >90% of cases achieving remission post-surgery. Despite this favorable outcome, the emergence of aggressive variants underscores the growing need for personalized therapeutic approaches.
In collaboration with researchers at the University of Mainz, Germany, MedGenome generated RNA sequencing and proteomic data from PTC tumor samples from 22 patients1. Multiomic analysis identified a novel rearrangement in one of the patients. This novel rearrangement led to a BAIAP2L1-BRAF fusion gene product that transformed immortalized human thyroid cells. We also identified two previously known RET fusions in two other patients (Figure 1) as well as other druggable targets including TRIM25, PKCδ, and PDE5A.
Integrative analysis of RNA-seq and proteomics data was performed using the list of differentially expressed genes (Figure 2) and proteins derived independently from RNA-seq and proteomics data, respectively in the fusion-carrying patients to identify factors significantly deregulated at both the mRNA and protein levels. This analysis yielded 20 identified factors (Figure 3), including PDE5A and IGSF1 (also called p120), a factor known to be associated with hypothyroidism, which are upregulated in tumor and/or metastatic tissue in comparison to the matching normal tissue.
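The idea of intersecting two independently derived lists can be sketched as follows. This is only an illustration of the general approach, not the study's actual method: the threshold of 1.0, the function name `concordant_hits`, and the placeholder gene names and values are assumptions.

```python
def concordant_hits(rna_lfc: dict[str, float],
                    protein_lfc: dict[str, float],
                    threshold: float = 1.0) -> dict[str, tuple[float, float]]:
    """Keep genes whose mRNA and protein log2 fold changes both pass the
    threshold and point in the same direction."""
    hits = {}
    for gene in rna_lfc.keys() & protein_lfc.keys():
        r, p = rna_lfc[gene], protein_lfc[gene]
        same_direction = (r > 0) == (p > 0)
        if abs(r) >= threshold and abs(p) >= threshold and same_direction:
            hits[gene] = (r, p)
    return hits


# Placeholder values: GENE_B changes in opposite directions and is dropped,
# GENE_D and GENE_E are measured in only one data type.
rna = {"GENE_A": 2.0, "GENE_B": 1.5, "GENE_C": -2.0, "GENE_D": 3.0}
protein = {"GENE_A": 1.2, "GENE_B": -1.4, "GENE_C": -1.1, "GENE_E": 2.0}
shared = concordant_hits(rna, protein)
```

Requiring agreement at both levels trades sensitivity for confidence, since a factor must be measurable and deregulated in both data types to be reported.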
Taken together, this study demonstrates the power of multiomic analyses to identify and characterize cancer therapy targets, which in turn can advance precision medicine and personalized therapeutics.
Renaud, E., Riegel, K., Romero, R. et al. Multiomic analysis of papillary thyroid cancers identifies BAIAP2L1-BRAF fusion and requirement of TRIM25, PDE5A and PKCδ for tumorigenesis. Mol Cancer 21, 195 (2022).
By MedGenome Scientific Affairs
The field of immune repertoire profiling has witnessed remarkable advancements in recent years, revolutionizing our understanding of the immune system and its role in various diseases. One of the key techniques to understand this complex mechanism is TCR sequencing. TCR, or T-cell receptor, plays a crucial role in the adaptive immune response by recognizing and binding to specific antigens. By sequencing the TCR repertoire, researchers can gain valuable insights into the diversity and specificity of T-cell populations, leading to the development of novel treatment modalities in infectious diseases, autoimmunity, and immuno-oncology.
TCR sequencing has proven to be a powerful tool for understanding:
Host-pathogen interactions and designing effective therapeutic strategies: Through analysis and identification of specific T-cell clones associated with protective immune responses, novel vaccines or immunotherapies that target these specific T-cell clones can be developed.
Mechanisms of autoimmune disease development: By comparing the TCR repertoires of patients with autoimmune disorders to those of healthy individuals, researchers have been able to identify aberrant T-cell populations that are associated with autoimmune pathology. This knowledge aids in the development of targeted therapies that restore immune balance and alleviate autoimmune symptoms.
Personalized cancer treatment: By profiling the TCR repertoire of tumor-infiltrating lymphocytes (TILs), researchers can identify T-cell clones that are specifically targeting tumor antigens. This information can then be used to engineer T-cell-based immunotherapies, such as chimeric antigen receptor (CAR) T-cell therapy, that specifically target and eliminate cancer cells. TCR sequencing also allows for the monitoring of treatment response and the identification of potential immune escape mechanisms employed by tumors.
RNA based TCR repertoire profiling
While traditional TCR sequencing methods rely on genomic DNA, recent advancements in RNA-based sequencing techniques have expanded the scope of immune repertoire profiling. RNA-based TCR repertoire profiling offers several advantages over DNA-based methods, providing deeper insights into T-cell dynamics and functionality.
By profiling the TCR repertoire at the RNA level, researchers can capture the transcriptomic landscape of T-cells, allowing for the identification of actively expressed T-cell clones. This enables a more accurate representation of the T-cell diversity and functionality within a given sample. Moreover, RNA-based TCR sequencing facilitates the characterization of antigen-specific T-cells, enabling researchers to map the immune response to specific antigens with greater precision.
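Repertoire diversity is often summarized with a single number. The sketch below computes Shannon entropy and a derived clonality score (1 minus normalized entropy) from clonotype counts; these are widely used summary metrics, offered here as an assumption about how diversity might be quantified rather than a prescribed method. The CDR3 strings are toy examples.

```python
import math
from collections import Counter


def shannon_entropy(clone_counts: dict[str, int]) -> float:
    """Shannon entropy (natural log) of the clonotype frequency distribution."""
    total = sum(clone_counts.values())
    if total == 0:
        raise ValueError("repertoire is empty")
    freqs = (n / total for n in clone_counts.values() if n > 0)
    return -sum(f * math.log(f) for f in freqs)


def clonality(clone_counts: dict[str, int]) -> float:
    """1 - normalized entropy: 0 for a perfectly even repertoire, 1 for a single clone."""
    n_clones = sum(1 for n in clone_counts.values() if n > 0)
    if n_clones <= 1:
        return 1.0
    return 1.0 - shannon_entropy(clone_counts) / math.log(n_clones)


# Toy repertoire: one expanded clone and three rare ones.
cdr3_reads = ["CASSLGQETQYF"] * 97 + ["CASSPDRGYEQYF", "CASRTGNTEAFF", "CSARDLNTGELFF"]
score = clonality(Counter(cdr3_reads))
```

A high clonality score indicates that a few expanded clones dominate the sample, as might be expected after a strong antigen-driven response.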
RNA-based TCR repertoire profiling also allows for the detection of alternative splicing events within the TCR transcripts. Alternative splicing can result in the generation of T-cell receptor isoforms with distinct antigen-binding properties. By capturing these isoforms, researchers can gain a deeper understanding of T-cell receptor diversity and its implications for immune recognition and response.
BCR repertoire profiling in diagnostic biomarker discovery and disease diagnosis
B-cells, through their B-cell receptors (BCRs), play a crucial role in humoral immunity by recognizing and binding to antigens. Profiling the BCR repertoire offers valuable insights into B-cell differentiation, BCR somatic hypermutation, class switching, and antigen specificity.
One of the key applications of BCR repertoire profiling is in diagnostic biomarker discovery. By analyzing the BCR repertoires of patients with certain diseases, researchers can identify disease-specific B-cell clones or antibody sequences. These disease-associated BCR sequences can then be utilized as diagnostic biomarkers, aiding in the early detection and monitoring of diseases. Furthermore, BCR repertoire profiling can also provide insights into disease progression and treatment response, enabling personalized medicine approaches.
BCR repertoire profiling can also support disease diagnosis. By comparing the BCR repertoires of healthy individuals to those of patients, researchers can identify disease-specific B-cell clones or antibody sequences. This information can aid in the early diagnosis and classification of diseases, facilitating timely interventions and improving patient outcomes.
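The patient-versus-healthy comparison can be sketched as a simple set operation. The example below flags clonotypes that recur across patients but never appear in controls; the 50% prevalence cutoff, the function name, and the CDR3-like strings are illustrative assumptions, and a real study would add statistical testing and sequence-similarity clustering.

```python
from collections import Counter


def patient_enriched_clonotypes(patients: list[set[str]],
                                controls: list[set[str]],
                                min_fraction: float = 0.5) -> dict[str, float]:
    """Return clonotypes found in at least `min_fraction` of patient repertoires
    and in no control repertoire, mapped to their patient prevalence."""
    if not patients:
        raise ValueError("at least one patient repertoire is required")
    seen_in_controls = set().union(*controls) if controls else set()
    prevalence = Counter(c for repertoire in patients for c in set(repertoire))
    return {
        clonotype: n / len(patients)
        for clonotype, n in prevalence.items()
        if n / len(patients) >= min_fraction and clonotype not in seen_in_controls
    }


# Toy repertoires, each represented as a set of CDR3-like strings.
patients = [{"CARDYW", "CAKGGW"}, {"CARDYW", "CTRSSW"}, {"CARDYW", "CAKGGW"}]
controls = [{"CAKGGW"}, {"CVRAAW"}]
candidates = patient_enriched_clonotypes(patients, controls)
```

Here only the clonotype shared by all three patients and absent from the controls is retained as a candidate biomarker.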
Deeper insights into B-cell differentiation, BCR somatic hypermutation, class switching, and antigen specificity
Beyond its applications in biomarker discovery and disease diagnosis, BCR repertoire profiling provides deeper insights into various aspects of B-cell biology. By analyzing the BCR repertoires of different B-cell subsets, researchers can unravel the intricate processes of B-cell differentiation. This knowledge not only enhances our understanding of normal immune development but also sheds light on the dysregulation of B-cell differentiation in diseases such as leukemia and lymphoma.
BCR repertoire profiling is also instrumental in studying BCR somatic hypermutation and class switching. Somatic hypermutation is a key mechanism through which B-cells generate high-affinity antibodies, while class switching allows B-cells to produce different antibody isotypes with distinct effector functions. By analyzing the BCR repertoires of B-cell subsets at different stages of somatic hypermutation and class switching, researchers can decipher the underlying molecular mechanisms and regulatory networks governing these processes.
Furthermore, BCR repertoire profiling enables the characterization of antigen-specific B-cell populations. By identifying B-cell clones that are enriched for specific antigen-binding sequences, researchers can gain insights into the antigen-specific immune response. This information can be valuable for vaccine development, as it helps in the identification of immunogenic epitopes and the assessment of vaccine efficacy.
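For the somatic hypermutation analysis mentioned above, a common first measurement is how far an observed V region has diverged from its germline sequence. The sketch below assumes the two sequences are already aligned to the same length (for example by an upstream annotation tool); the sequences and the function name `shm_frequency` are hypothetical.

```python
def shm_frequency(observed, germline):
    """Fraction of comparable positions where the observed sequence differs from germline.

    Assumes both sequences are pre-aligned and equally long.
    Positions with gaps ('-') or ambiguous bases ('N') are skipped.
    """
    if len(observed) != len(germline):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for obs_base, germ_base in zip(observed.upper(), germline.upper()):
        if obs_base in "N-" or germ_base in "N-":
            continue
        compared += 1
        if obs_base != germ_base:
            mismatches += 1
    return mismatches / compared if compared else 0.0


# Hypothetical aligned fragments: two substitutions across 20 comparable bases.
germline = "CAGGTGCAGCTGGTGCAGTC"
observed = "CAGGTGCAACTGGTGCAGTT"
print(shm_frequency(observed, germline))
```

Comparing this frequency across B-cell subsets gives a rough view of how much affinity maturation each subset has undergone.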
In conclusion, immune repertoire profiling, particularly through TCR and BCR sequencing, has revolutionized our understanding of the immune system and its role in various diseases. From infectious diseases to autoimmunity and immuno-oncology, TCR sequencing has paved the way for novel treatment modalities. RNA-based TCR repertoire profiling offers deeper insights into T-cell dynamics and functionality. On the other hand, BCR repertoire profiling provides valuable information about B-cell differentiation, somatic hypermutation, class switching, and antigen specificity. By harnessing the power of immune repertoire profiling, we are unlocking new frontiers in diagnostics, therapeutics, and our overall understanding of the immune system.
At MedGenome, we provide TCR and BCR repertoire profiling from bulk input (cells or RNA) using the SMARTer TCR/BCR Profiling Kit (Takara Bio USA Inc) and the Chromium Immune Profiling solutions (10X Genomics). We have expertise in processing a variety of sample types in high-throughput mode.
Complete workflow: sample extraction, library prep, sequencing, and seamless data analysis and visualization
Detection of TCR clonotypes: detection of novel clonotypes and sensitive identification of full-length V(D)J sequences, with high-resolution TCR-α and TCR-β pairing information
Detection of all BCR isotypes: sequencing of all heavy and light chains with pooled primers
References
Shugay M. et al. Towards error-free profiling of immune repertoires. Nat. Methods 11, 653–655 (2014).
Yaari, G. and Kleinstein, S.H. Practical guidelines for B-cell receptor repertoire sequencing analysis. Genome Med. 7:121 (2015).
Georgiou, G., Ippolito, G., Beausang, J. et al. The promise and challenge of high-throughput sequencing of the antibody repertoire. Nat Biotechnol 32, 158–168 (2014).
Six, A., et al. (2013) The past, present, and future of immune repertoire biology–the rise of next-generation repertoire analysis. Front. Immunol. 4(413):1–16.
By Dr. Anwesha Ghosh, PhD, Manager – Scientific Affairs and Communications Specialist
Genetic variation can range from changes at the level of single bases to whole-chromosomal aneuploidies. Structural variations (SVs) are large alterations in chromosomal structure, typically encompassing more than 1 kbp of DNA. SVs include both balanced changes, such as inversions and some forms of translocations, and changes that alter DNA copy number through duplications and deletions of chromosomal segments. SVs account for 25% of protein truncating mutations and are 3 times more likely to associate with a genome-wide association study (GWAS) signal than single nucleotide variants (SNVs). SVs contribute to all classes of genetic disease: sporadic developmental syndromes, Mendelian diseases, complex disorders and infectious diseases, as well as health-related metabolic phenotypes.
While next generation sequencing (NGS) has enabled extensive characterization of SNVs in the human genome, the short length of sequencing reads it employs impairs its ability to provide insight into larger genomic changes like SVs. Conventional methods to detect SVs include karyotyping, fluorescence in situ hybridization (FISH) and chromosomal microarray analysis (CMA). While karyotyping is cost effective, it suffers from numerous drawbacks, including low resolution, high labor and time consumption, requirement for cell culture, and subjectivity in interpretation. FISH, while not requiring cell culture and significantly improving resolution, is a targeted approach that cannot provide genome-wide information. While CMA can overcome these limitations, it cannot detect certain classes of SVs, such as balanced translocations or inversions, expansions of repeat regions, or low-level mosaicism.
Optical genome mapping (OGM) offers a solution to these pitfalls by essentially combining the genome-wide scope of karyotyping with the visualization principle of FISH into a single workflow that provides data at the highest resolution reported for the field of cytogenetics at 500 bp, while successfully detecting all classes of SVs. A comparison of the requirements, scope, and performance of karyotyping, FISH, CMA and OGM is provided in Table 1.
Table 1
OGM relies on the isolation of intact, ultra-high molecular weight (UHMW) DNA using an extraction protocol specifically designed to minimize the shearing forces generated by standard column-based extraction methods. This yields DNA fragments of ~150 kb to a few Mb in size. This DNA is then fluorescently labeled via covalent modification at CTTAAG hexamer motifs, which occur throughout the genome at a frequency of ~14-17 per 100 kb in sequence-specific patterns. The labeled DNA is loaded on silicon microfluidic chips containing thousands of parallel nanochannels in which individual DNA molecules are linearized, imaged, and digitized. Each DNA fragment bears a distinct spacing and pattern of the hexamer labels (also known as the label profile), and fragments are subsequently grouped and aligned based on label profile matching to produce consensus maps. These maps are then compared in silico to the expected labeling profile of a reference genome. DNA containing structural variations will display a labeling profile that differs from the reference genome at the location of the variation. The type of structural variation present can be determined based on the nature of the altered labeling profile (Fig 1). Bionano’s Saphyr system is currently the best in class for OGM.
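The idea of comparing a label profile against a reference can be illustrated in a few lines. The sketch below is a toy model only: it works on short made-up base sequences rather than imaged molecules, it assumes the sample and reference carry the same number of labels, and the function names are hypothetical. It is not how any vendor pipeline is implemented.

```python
import re

MOTIF = "CTTAAG"  # the labeling motif named in the text


def label_positions(sequence):
    """Return the start position of every labeling motif in the sequence."""
    return [match.start() for match in re.finditer(MOTIF, sequence)]


def label_intervals(sequence):
    """Distances between consecutive labels: a toy version of the label profile."""
    positions = label_positions(sequence)
    return [right - left for left, right in zip(positions, positions[1:])]


def interval_differences(sample, reference):
    """Per-interval difference (sample minus reference); assumes equal label counts."""
    return [s - r for s, r in zip(label_intervals(sample), label_intervals(reference))]


# Hypothetical reference, and a sample carrying a 15-base insertion between labels 2 and 3.
reference = "AAAA" + MOTIF + "T" * 10 + MOTIF + "G" * 20 + MOTIF + "AA"
sample = "AAAA" + MOTIF + "T" * 10 + MOTIF + "G" * 10 + "A" * 15 + "G" * 10 + MOTIF + "AA"
print(label_intervals(reference), label_intervals(sample))
print(interval_differences(sample, reference))
```

An insertion shows up as a longer distance between two neighboring labels, a deletion as a shorter one, while the unaffected interval stays unchanged.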
SV calling can be performed using Bionano pipelines such as annotated de novo assembly for somatic variations or the annotated rare variant pipeline for germline variations. A snapshot of all SVs detected in the genome in any given experiment can be viewed in a circos plot, while closer inspection of specific regions can be performed in the genome map view (Figure 2).
Bionano EnFocus pipelines have also been developed for targeted detection of specific genomic variations known to be found in diseases like facioscapulohumeral muscular dystrophy (FSHD) and Fragile X syndrome.
Currently, MedGenome offers Bionano’s EnFocus solution for detection of FSHD, which is the third most common inherited skeletal muscle disease. FSHD is associated with contraction of the D4Z4 macrosatellite repeat region within the sub-telomeric region of chromosome 4q35. Normal individuals harbor 11-100 such repeats, while afflicted individuals harbor fewer than 10 repeats. The current standard of care to confirm an FSHD diagnosis is mainly a Southern blot assay. However, Southern blots are time and labor intensive, with a greater scope for human error; these drawbacks can be overcome with the OGM approach.
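The repeat ranges quoted above can be written down as a small lookup, shown here purely to make the thresholds concrete. The function name and the returned labels are hypothetical, the ranges are taken only from the text above, and the sketch is not a diagnostic rule. Note that a count of exactly 10 falls between the two quoted ranges, so it is reported as unclassified.

```python
def classify_d4z4(repeat_count):
    """Map a D4Z4 repeat count onto the ranges quoted in the text (illustration only)."""
    if repeat_count < 0:
        raise ValueError("repeat count cannot be negative")
    if repeat_count < 10:
        return "contracted"  # fewer than 10 repeats
    if 11 <= repeat_count <= 100:
        return "normal range"  # 11-100 repeats
    return "unclassified"  # exactly 10, or above 100: not covered by the quoted ranges


for count in (6, 10, 25):
    print(count, classify_d4z4(count))
```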
OGM has applications in the fields of hematological malignancies, solid tumor research, constitutional genetic disorders and quality control for cell and gene therapy such as CAR-T immune cell therapy. As a precision medicine tool for hemato-oncology, it can identify classical actionable fusions (such as the BCR-ABL1 fusion in acute lymphoblastic leukemia) and could enable patient stratification based on signature SVs associated with biological phenomena underlying specific therapeutic sensitivities, such as high replication stress. It can identify multiple complex rearrangements within a single patient in challenging cases, thereby removing the need to perform multiple conventional techniques. It can also identify novel variations, such as those that are missed by sequencing because they are too long or high in GC content. Additionally, it unlocks the potential to characterize the SV landscape of solid tumors, which had remained largely unexplored due to the challenges of karyotyping. Finally, when combined with NGS, OGM can not only provide a more comprehensive understanding of cancers, but also aid the diagnosis of rare monogenic diseases.
By MedGenome Scientific affairs
Bioinformatics plays a vital role in analyzing complex high-throughput sequencing data, particularly in the realm of single cell research. The ability to analyze and interpret massive amounts of single cell data has revolutionized our understanding of cellular heterogeneity and its implications in various biological processes. This blog explores the capabilities of the bioinformatics team at MedGenome in analyzing single cell sequencing data. Here, we explore different types of bioinformatics reports, the importance of data visualization, and the generation of interactive reports covering differential gene expression analysis, heatmap visualization, and interactive tSNE plots with cell type and cluster information.
Types of Bioinformatics Reports
In the realm of single cell analysis, bioinformatics reports play a pivotal role in summarizing and presenting the findings derived from complex datasets. There are several types of bioinformatics reports commonly used in single cell research, each serving a unique purpose.
1. Cell Type Identification Report: A cell’s phenotype and function are determined by its gene expression repertoire. Single cell transcriptome profiling is well suited to determining the cell type composition of different tissues and the relative proportions of different cell types. This report focuses on the identification and categorization of different cell types within a given dataset. It utilizes unsupervised clustering algorithms to assign cells to distinct clusters based on their gene expression profiles. The report provides insights into the composition and heterogeneity of the sample.
2. Cell State Analysis Report: This report aims to uncover the different cellular states within a cell type. It utilizes dimensionality reduction techniques such as principal component analysis (PCA) or t-distributed stochastic neighbor embedding (tSNE) to visualize the variation in gene expression across cells. By identifying different cellular states, researchers can gain insights into cell fate determination, cell differentiation, and cellular plasticity.
3. Cell-Cell Interaction Analysis Report: This report focuses on deciphering the interactions between different cell types within a tissue or organism. It utilizes network analysis algorithms to infer the communication networks and regulatory relationships between cells. By understanding cell-cell interactions, researchers can unravel the mechanisms underlying tissue development, immune response, and disease progression.
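To make the cell-cell interaction report more concrete, the sketch below scores one ligand-receptor pair between a sender and a receiver cell type as the product of their mean expression values. This is a deliberately simplified heuristic and an assumption of this example, not the network analysis method used in any particular pipeline. The expression values, cell labels, and function name are hypothetical; CXCL12 and CXCR4 are used as a well-known ligand-receptor pair.

```python
from statistics import mean


def interaction_score(expression, cell_types, sender, receiver, ligand, receptor):
    """Mean ligand expression in sender cells times mean receptor expression in receiver cells."""
    ligand_mean = mean(
        expression[cell][ligand] for cell in expression if cell_types[cell] == sender
    )
    receptor_mean = mean(
        expression[cell][receptor] for cell in expression if cell_types[cell] == receiver
    )
    return ligand_mean * receptor_mean


# Hypothetical normalized expression values for four cells.
expression = {
    "cell1": {"CXCL12": 4.0, "CXCR4": 0.0},
    "cell2": {"CXCL12": 2.0, "CXCR4": 0.0},
    "cell3": {"CXCL12": 0.0, "CXCR4": 3.0},
    "cell4": {"CXCL12": 0.0, "CXCR4": 5.0},
}
cell_types = {"cell1": "stromal", "cell2": "stromal", "cell3": "T cell", "cell4": "T cell"}

forward = interaction_score(expression, cell_types, "stromal", "T cell", "CXCL12", "CXCR4")
reverse = interaction_score(expression, cell_types, "T cell", "stromal", "CXCL12", "CXCR4")
print(forward, reverse)
```

The score is directional: it is high only when the sender expresses the ligand and the receiver expresses the receptor.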
Data Visualization and Interactive Reports
In the realm of single cell research, data visualization plays a pivotal role in unraveling the hidden patterns and structures within complex datasets. It allows for a comprehensive understanding of cellular heterogeneity and facilitates the interpretation of biological phenomena.
Data visualization tools in bioinformatics enable researchers to create interactive reports that provide a dynamic and intuitive representation of single cell data. These interactive reports allow users to explore the data at different levels of granularity, visualize gene expression patterns, perform differential gene expression analysis, and even interact with individual cells.
By leveraging advanced data visualization techniques, such as scatter plots, heatmaps, and bar plots, researchers can gain valuable insights into the relationships between cells, identify key genes or pathways associated with specific cell states or cell types, and even discover novel cellular subpopulations.
Differential Gene Expression and Heatmap Visualization
Differential gene expression analysis is a powerful bioinformatics technique used to identify differentially expressed genes between different groups of cells or conditions. It is particularly useful in single cell research, as it allows researchers to identify genes that play a crucial role in defining specific cell types or cellular states.
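As a minimal sketch of what a differential expression comparison computes for a single gene, the example below calculates a log2 fold change and a Welch t-statistic between two groups of cells. The expression values and function names are hypothetical, and the pseudocount of 1 is an assumption of this example. Real analyses typically test many genes at once and correct for multiple testing.

```python
import math
from statistics import mean, variance


def log2_fold_change(group_a, group_b, pseudocount=1.0):
    """log2 ratio of mean expression in group A over group B, with a pseudocount."""
    return math.log2((mean(group_a) + pseudocount) / (mean(group_b) + pseudocount))


def welch_t(group_a, group_b):
    """Welch's t-statistic for two groups with possibly unequal variances."""
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical expression of one marker gene in two clusters of four cells each.
cluster_a = [6.0, 8.0, 7.0, 7.0]
cluster_b = [1.0, 0.0, 1.0, 2.0]
print(log2_fold_change(cluster_a, cluster_b), welch_t(cluster_a, cluster_b))
```

A large fold change with a large t-statistic marks a gene that separates the two groups consistently rather than through a few outlier cells.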
We provide various options to visualize the data. For example, the following heatmap provides a graphical representation of gene expression patterns across different cell types or conditions. By visualizing gene expression patterns in a heatmap, researchers can easily identify clusters of genes that are co-expressed and gain insights into the underlying regulatory networks.
Interactive tSNE Plots with Cell Type and Cluster Information
t-distributed stochastic neighbor embedding (tSNE) is a dimensionality reduction technique widely used in single cell analysis. It allows for the visualization of high-dimensional data in a two-dimensional space, while preserving the local structure of the original dataset.
Interactive tSNE plots with cell type and cluster information provide an intuitive representation of cellular heterogeneity within a sample. By assigning different colors or shapes to different cell types or clusters, researchers can easily identify the distribution and composition of cell populations.
These interactive plots enable researchers to explore the data at different resolutions, zoom in on specific cell types or clusters of interest, and even interact with individual cells to extract additional information. They serve as a powerful tool for hypothesis generation, data exploration, and result validation in single cell research.
Future of Bioinformatics in Single Cell Research
As single cell analysis continues to evolve, so does the field of bioinformatics. The future of bioinformatics in single cell research holds tremendous potential for further advancements and breakthroughs.
One of the key areas of development is the integration of multi-omics data in single cell analysis. By combining single cell RNA sequencing with other omics techniques such as proteomics, epigenomics, and metabolomics, researchers can gain a more comprehensive understanding of cellular heterogeneity and molecular mechanisms.
Moreover, the development of machine learning algorithms and artificial intelligence techniques will enhance the ability of bioinformatics tools to handle large-scale single cell datasets and extract meaningful information. These advanced algorithms will enable the identification of novel cell types, the prediction of cell fate trajectories, and the discovery of new therapeutic targets.
In conclusion, bioinformatics has become an indispensable tool in single cell research. It enables researchers to tackle the challenges posed by massive amounts of single cell data, extract meaningful insights, and unravel the complexities of cellular heterogeneity. With the continuous development of bioinformatics tools and techniques, the future of single cell analysis holds great promise for further advancements in our understanding of biology and disease.
• CITE-seq: Cell surface protein expression + Gene Expression
• Single cell immune profiling: VDJ expression for paired B-cell or T-cell receptors (possible coupling with GEX data)
• Visium spatial transcriptomics: GEX analysis on sectioned tissue layer
MedGenome’s advanced analysis pipeline provides researchers with a comprehensive report which includes publication ready tables, plots and detailed metrics to visualize and interpret the results.
To learn more about our capabilities and solution offerings, reach us at research@medgenome.com
By MedGenome Scientific Affairs
The advent of single cell sequencing technologies has enabled us to understand and study the complexities of biological systems at a finer resolution. Traditional bulk sequencing methods provide an average representation of gene expression across a population of cells, masking the inherent heterogeneity that exists within a tissue or organism. However, single cell sequencing allows us to capture the maximal transcript diversity in a given cell and allows for a multi-modal analysis strategy to generate meaningful insights.
Single Cell Technologies
In recent years, technological advancements have improved the efficiency, throughput, and accuracy of single cell sequencing methods.
To achieve single cell resolution, various technologies have been developed, each with its own strengths and limitations. One commonly used approach is droplet-based sequencing, which encapsulates individual cells into tiny droplets along with a unique barcode. This barcode allows for the identification and quantification of transcripts originating from each cell. Droplet-based technologies have the advantage of high throughput, enabling the profiling of thousands to millions of cells in a single experiment. However, they may suffer from certain technical constraints, such as limited sensitivity and the inability to capture full-length transcripts.
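The role of the cell barcode can be shown with a small counting example. The sketch below assumes each read has already been reduced to a (cell barcode, molecular identifier, gene) triple; the molecular identifier (UMI) is an assumption added here so that duplicate reads of the same molecule are counted once. The barcodes, UMIs, and the function name are hypothetical.

```python
from collections import defaultdict


def count_matrix(reads):
    """Build {cell_barcode: {gene: number of distinct UMIs}} from (barcode, umi, gene) triples."""
    umis_seen = defaultdict(set)
    for cell_barcode, umi, gene in reads:
        umis_seen[(cell_barcode, gene)].add(umi)

    counts = defaultdict(dict)
    for (cell_barcode, gene), umis in umis_seen.items():
        counts[cell_barcode][gene] = len(umis)
    return dict(counts)


# Hypothetical reads; the second read is a PCR duplicate of the first.
reads = [
    ("AAACCTG", "UMI01", "CD3E"),
    ("AAACCTG", "UMI01", "CD3E"),
    ("AAACCTG", "UMI02", "CD3E"),
    ("AAACCTG", "UMI03", "GAPDH"),
    ("TTTGGCA", "UMI01", "MS4A1"),
]
matrix = count_matrix(reads)
print(matrix)
```

Grouping by barcode is what turns a pooled sequencing run back into per-cell expression profiles.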
Another approach is plate-based sequencing, where single cells are sorted into individual wells of a microplate. This method allows for more precise control over cell capture and is particularly useful when studying rare cell populations. Plate-based technologies also enable the isolation of intact cells for downstream functional assays, such as cell culture or transplantation experiments. However, they are generally lower throughput and require more extensive manual handling.
Regardless of the specific technology used, scRNA-seq data analysis is a critical step in extracting meaningful insights from the vast amount of information generated. Computational methods have been developed to handle the unique challenges posed by single cell data, such as high dimensionality, sparsity, and batch effects. These tools allow researchers to identify differentially expressed genes, perform clustering and trajectory analysis, and visualize the resulting data in a biologically interpretable manner.
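The clustering step mentioned above can be illustrated with a very small k-means implementation. k-means is used here only because it fits in a few lines; it is a stand-in chosen for this example, and single cell toolkits commonly rely on graph-based clustering instead. The two-gene expression values and the starting centroids are hypothetical.

```python
import math


def kmeans(points, initial_centroids, n_iter=10):
    """Lloyd's algorithm with caller-supplied starting centroids (deterministic)."""
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each cell joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical normalized expression of two marker genes in six cells.
cells = [(5.0, 0.2), (4.8, 0.0), (5.2, 0.1), (0.1, 4.9), (0.0, 5.1), (0.3, 5.0)]
labels, centroids = kmeans(cells, initial_centroids=[cells[0], cells[3]])
print(labels)
```

Cells with similar expression end up with the same label, which is the starting point for annotating clusters as cell types.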
Here we explore three broad areas of single cell research that help us discover novel insights:
Single Cell RNA Sequencing
One of the key advantages of scRNA-seq is its ability to capture the transcriptomes of individual cells, allowing for the identification of cell types, subpopulations, and rare cell states that may have been overlooked in bulk analyses. By profiling the gene expression patterns of thousands or even millions of single cells, researchers can gain unprecedented insight into the dynamic nature of cellular heterogeneity and its impact on development, disease progression, and therapeutic response.
Moreover, scRNA-seq has shed light on the existence of transitional cell states that occur during cellular differentiation processes. By capturing the gene expression profiles of cells at different time points, researchers can construct lineage trajectories and decipher the molecular events that drive cell fate decisions. This newfound knowledge has the potential to transform regenerative medicine, as it provides a blueprint for generating specific cell types in the laboratory for transplantation or disease modeling purposes. This has led to significant progress in various fields, such as cancer research, immunology, neuroscience, and developmental biology.
Single Cell Immune Profiling
In recent years, single cell sequencing has also made significant contributions to the field of immunology. By profiling the transcriptomes of individual immune cells, researchers can gain a deeper understanding of the complex interactions between different cell types and their roles in immune responses. This approach, known as single cell immune profiling, has the potential to revolutionize the development of immunotherapies and personalized medicine.
For example, scRNA-seq has revealed the existence of rare subsets of immune cells that have distinct functional properties and play crucial roles in disease pathogenesis. By characterizing these rare cell types, researchers can identify novel therapeutic targets and develop more effective treatments. Additionally, single cell immune profiling has shed light on the mechanisms underlying immune evasion in cancer and autoimmune diseases, providing new avenues for therapeutic intervention. Furthermore, scRNA-seq has enabled the study of immune cell dynamics in response to infection or vaccination. By capturing the gene expression profiles of immune cells at different time points, researchers can decipher the molecular events that drive immune activation and memory formation. This knowledge can inform the development of vaccines and adjuvants that elicit robust and long-lasting immune responses.
Single Cell Epigenetics
In addition to gene expression analysis, single cell sequencing has also opened the door to studying the epigenetic landscape of individual cells. Epigenetic modifications, such as DNA methylation and histone modifications, play a crucial role in regulating gene expression and cellular identity. Traditional bulk sequencing methods provide an average measurement of these modifications, masking the cell-to-cell variability that exists within a population. However, with single cell epigenetics, researchers can now explore the dynamics of epigenetic regulation at a single cell resolution.
Single cell DNA methylation sequencing allows for the identification of cell-specific DNA methylation patterns, providing insights into cell lineage relationships and developmental processes. By comparing the methylomes of different cell types, researchers can unravel the epigenetic mechanisms that drive cell fate decisions and contribute to disease states.
Furthermore, single cell chromatin accessibility assays have enabled the characterization of cell-type-specific regulatory elements and the identification of transcription factor binding sites, shedding light on the transcriptional regulatory networks that underlie cellular diversity.
Novel techniques, such as spatial transcriptomics and multiomics approaches, are also being used to gain a more holistic understanding of gene and protein expression in the tissue microenvironment. This opens the way to high resolution spatial analysis of cells and tissues without introducing biases in cell recovery.
Overall, single cell sequencing has provided a powerful toolkit for dissecting the complexities of biological systems at an unprecedented level of resolution. By profiling the transcriptomes, immune repertoires, and epigenomes of individual cells, researchers have gained new insights into the mechanisms that govern development, disease, and therapeutic response. As single cell technologies continue to evolve and improve, we can expect even greater discoveries and advancements in the field of genomics and beyond. Therefore, it’s important to stay updated with the latest developments and breakthroughs gained through single cell sequencing.
MedGenome’s Powerful Single Cell Bioinformatics Analysis Pipeline
To support single cell research, MedGenome has created highly specific, advanced single cell analysis pipelines for different data modalities. Our pipelines can analyze all 10X Genomics data outputs using tools that are well adopted in the industry. Our PhD-level team can perform sample integration and comparisons, customized analysis, integration of ad hoc tools, project-specific visualizations and final customized reporting to support your scientific publications.
• Single Cell 3’ and 5’ Gene Expression
• Single Cell Multiome: ATAC + Gene Expression
• CITE-seq: Cell surface protein expression + Gene Expression
• Single cell immune profiling: VDJ expression for paired B-cell or T-cell receptors (possible coupling with GEX data)
• Visium spatial transcriptomics: GEX analysis on sectioned tissue layer
By Derek Vargas and Dr. Anantha Kethireddy, Scientific Affairs, MedGenome Inc
Cite-Seq, short for Cellular Indexing of Transcriptomes and Epitopes by sequencing, is a powerful technology that has revolutionized single-cell sequencing. With its ability to analyze transcriptomes and protein expression at a single-cell level, Cite-Seq has the potential to greatly advance our understanding of cellular heterogeneity and function in biological systems. In this article, we will discuss the workings of Cite-Seq, its current and potential applications in various fields of research, and its limitations.
How Does Cite-Seq Work?
Cite-Seq is a technique that combines single-cell RNA sequencing (scRNA-seq) with antibody-based surface protein detection. The goal is to analyze the transcriptome of each individual cell, along with the surface proteins that are expressed on the cell membrane. By doing so, researchers can get a better understanding of the diversity and functionality of individual cells in a population. The Cite-Seq workflow involves several key steps:
1. Surface Protein Staining with Antibody-oligonucleotide Conjugates: The first step is to dissociate the tissue into a single cell suspension. Next, the cells are stained with a panel of antibodies targeting specific surface proteins of interest. Each antibody is conjugated to a unique oligonucleotide barcode, which allows for the identification of the protein that is bound to each cell. Unbound antibodies are then washed away, so that the barcodes carried into the next step reflect the surface proteins present on each cell.
2. Single Cell RNA Sequencing & Cell Surface Protein Detection: The next step is to generate gel beads-in-emulsion (GEMs) from the antibody-labeled cells, which can be done using a droplet-based method developed by 10X Genomics. The cells are encapsulated in tiny droplets, each along with a bead that carries a unique barcode. Because the antibodies remain attached to the individual cell, they end up in the same GEM as that cell. During reverse transcription, the barcode is incorporated into the cDNA generated from the cell’s mRNA transcripts, allowing each transcript to be traced back to its cell during downstream analysis. The same cell barcode is also added to the antibody-oligonucleotide conjugates.
3. Sequencing and Data Analysis: The libraries derived from each cell’s RNA and from the antibody-bound oligonucleotide barcodes are then sequenced using standard techniques. The sequencing data is then analyzed using bioinformatic tools that allow for the identification of individual cells based on their gene expression and protein markers. By analyzing the transcriptomes and protein expression of individual cells, researchers can identify new cell types, characterize the heterogeneity of cell populations, and study the relationships between different cell types.
Advantages of Cite-Seq: It links a cell’s RNA profile with its surface proteins, profiles multiple surface proteins simultaneously, and combines long-standing knowledge of surface protein analysis with increasingly complete RNA-Seq data.
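The data analysis step described above pairs two measurements for every cell barcode. The sketch below shows one minimal way to do that: antibody-derived counts are normalized per cell with a centered log-ratio (CLR) transform and then joined to the RNA counts by barcode. The CLR transform with a pseudocount of 1 is an assumption of this example rather than something stated in the workflow, and the barcodes, counts, and function name are hypothetical.

```python
import math


def clr(antibody_counts, pseudocount=1.0):
    """Centered log-ratio transform of one cell's antibody-derived counts."""
    logs = {name: math.log(count + pseudocount) for name, count in antibody_counts.items()}
    center = sum(logs.values()) / len(logs)
    return {name: value - center for name, value in logs.items()}


# Hypothetical per-cell counts keyed by cell barcode.
rna_counts = {
    "AAACCTG": {"CD3E": 5, "MS4A1": 0},
    "TTTGGCA": {"CD3E": 0, "MS4A1": 7},
}
antibody_counts = {
    "AAACCTG": {"CD3": 120, "CD19": 2, "CD14": 3},
    "TTTGGCA": {"CD3": 4, "CD19": 95, "CD14": 2},
}

# Join the two modalities on the barcodes present in both.
combined = {
    barcode: {"rna": rna_counts[barcode], "protein_clr": clr(antibody_counts[barcode])}
    for barcode in sorted(rna_counts.keys() & antibody_counts.keys())
}
for barcode, profile in combined.items():
    top_protein = max(profile["protein_clr"], key=profile["protein_clr"].get)
    print(barcode, top_protein)
```

Because both layers share the cell barcode, a cluster defined by RNA can be checked directly against its surface protein profile.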
Applications of Cite-Seq in Biological Research
Cite-Seq has a wide range of applications in biological research. One of the most significant applications is in the identification of new cell types and the characterization of cellular heterogeneity. By analyzing the transcriptomes and protein expression of individual cells, researchers can identify rare or previously unknown cell types and explore the differences between cell populations. This has important implications for understanding disease states and developing new treatments.
For example, Cite-Seq has been used to identify new immune cell subsets and characterize their roles in the immune response. Researchers used Cite-Seq to investigate the heterogeneity of T cells in the lung tissue of mice infected with influenza virus. They identified a new subset of T cells that expressed a specific set of surface proteins and had a unique gene expression profile. This subset was found to be important for controlling viral replication and preventing lung tissue damage.
Cite-Seq has also been used to investigate the differentiation of stem cells into specific cell types. By analyzing the transcriptomes and protein expression of individual cells during the differentiation process, researchers can identify the genes and proteins that are important for cell fate determination. This has important implications for regenerative medicine and the development of cell-based therapies.
MedGenome offers end-to-end project support for Cite-Seq (TotalSeq A, B & C) experiments. Our high-throughput lab can take fresh tissue samples, dissociate them into single-cell suspensions, then stain with oligonucleotide-conjugated antibodies (we recommend Biolegend’s universal cocktail for maximum coverage). We generate scRNA-Seq libraries using the 10X Genomics platform.
Our bioinformatics team is specialized in providing cutting-edge analysis of single-cell data, using the latest technology and techniques to help you gain deep insights into your biological samples. Whether you are working in genomics, transcriptomics, or other fields, our team of expert analysts can help you interpret your data with precision and efficiency. We use advanced algorithms and machine learning techniques to analyze your data and provide customized reports that meet your unique needs. With our comprehensive approach to single-cell data analysis, you can be sure that you are getting the most accurate and reliable results possible.
Single cell sequencing is a cutting-edge technique used in molecular biology that enables the sequencing of the transcriptome of individual cells. In traditional bulk sequencing techniques, RNA is extracted from a large group of cells, and then sequenced as a whole. However, single cell sequencing allows researchers to analyze the genetic material of individual cells, providing a much more detailed and precise understanding of the diversity and heterogeneity of cell populations.
By Derek Vargas, Scientific Affairs, MedGenome Inc
Single cell sequencing involves several steps, including isolating individual cells from a sample, lysing the cells to release their genetic material, amplifying the RNA to generate enough material for sequencing, and then sequencing the material using high-throughput sequencing technologies. There are several different types of single cell sequencing, each with its own strengths and limitations.
Single cell sequencing has many potential applications in basic research and clinical settings. It can be used to study complex biological processes such as embryonic development, cancer progression, and immune system function. It can also be used in research settings to identify rare cell types or genetic mutations that may be missed using traditional sequencing methods. MedGenome offers end-to-end single cell sequencing services using the 10X Genomics platform. Our extensive experience with single cell work allows us to process many samples with fast turnaround time.
Single Cell Sequencing at MedGenome
Despite the many advantages of single cell sequencing, there are also several challenges associated with this technique. Some of these challenges include:
1. High cost: Single cell sequencing can be expensive, as it requires specialized equipment and reagents, as well as significant computational resources for data analysis.
2. Low throughput: Single cell sequencing is a time-consuming process that can only sequence a limited number of cells at a time, which can limit the statistical power of the analysis.
3. Difficulty processing fresh samples: Many research labs are not able to process tissue samples quickly while cells are still viable. This may impact overall data quality.
4. Limited availability of fresh samples: Single cell sequencing generally requires high quality fresh/frozen samples. This is often not possible for scientists who focus their studies on retrospective research using biobanked samples.
As a leading provider of single cell sequencing services, MedGenome has developed solutions to many of these problems. Using a combination of skilled scientists and automation, we can support large scale projects while keeping costs low. We also have protocols in place to help researchers preserve their precious samples until they can ship them to our labs. This has allowed MedGenome to process hundreds of single cell samples every year.
10X Genomics Single Cell Flex Kit
As noted above, a major limitation of single cell sequencing is that it can be difficult for researchers to preserve samples for processing. For researchers doing retrospective studies, obtaining new samples is often impossible. 10x Genomics has recently released a new Single Cell Flex kit that addresses these issues. With the Flex kit, cells are fixed and permeabilized and can be safely stored or transported without compromising data quality. Once ready to proceed, samples are hybridized to probe sets and may be processed individually (singleplex workflow) or pooled with up to sixteen samples in a single lane of a Chromium chip (multiplex workflow). During GEM generation the probe sets are ligated and extended to incorporate unique barcodes. Libraries are then prepared, sequenced, and analyzed using 10x Genomics Cell Ranger and other bioinformatics tools.
The ability to fix cells and tissues will help scientists who aren’t able to immediately process their samples for single cell sequencing. It is even possible to stain the cells with TotalSeq antibodies prior to fixation. The cells can then be stored for months at -80 °C. This will benefit researchers doing longitudinal studies where samples are expected to be collected over a long period of time. In this case the samples can be fixed, frozen, and then shipped once the entire experiment is completed.
The Single Cell Flex kit also allows profiling of FFPE tissues. This workflow uses the Miltenyi gentleMACS Dissociator (available at MedGenome) to dissociate FFPE scrolls into single cell suspensions. Because the Single Cell Flex kit is probe-based, it allows pseudo-transcriptome profiling of these highly degraded samples and lets researchers gain insights into the cell populations of tissues that may have been collected several years ago.
MedGenome has always been an early adopter of single cell sequencing technologies. We started by offering single cell gene expression and have expanded over the years to support multiomic analysis of samples, including CITE-seq, immune profiling, and ATAC-seq. We have also adopted technologies for tissue dissociation and nuclei isolation. Our goal is to make single cell sequencing technologies available and affordable to genomics research labs, and to support research projects regardless of sample limitations or experiment complexity. Our team is excited about the new Single Cell Flex kit and expects to be offering this service very soon.
Spatial transcriptomics is a technology that allows the analysis of gene expression patterns within a tissue sample in their spatial context. It enables researchers to obtain a comprehensive and high-resolution view of the transcriptome, the set of all expressed genes, across different regions of the tissue. In traditional transcriptomics, gene expression is measured from homogenized cell populations, which can mask important differences in gene expression between different cell types and regions. Spatial transcriptomics, on the other hand, allows researchers to analyze gene expression patterns in intact tissue sections while retaining their spatial information.
By Derek Vargas, Scientific Affairs, MedGenome Inc
Spatial transcriptomics typically involves the following steps:
• Preparation of the tissue sample: Tissue sections are cut and placed on a surface that contains oligonucleotide-labeled spots or barcode arrays.
• Capture of mRNA molecules: The mRNA molecules in the tissue section are captured and attached to the labeled spots, allowing for the spatial location of the mRNA to be retained.
• High-throughput sequencing: The captured mRNA molecules are amplified and subjected to high-throughput sequencing, generating a large amount of data.
• Data analysis: The sequencing data is analyzed to identify the expression levels of different genes and their spatial distribution within the tissue section.
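The capture and analysis steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the pipeline of any particular platform: the barcode-to-coordinate table, the read tuples, the use of a molecular identifier (UMI) to collapse duplicates, and the function name `spot_gene_counts` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical lookup: spatial barcode -> (row, col) position of the spot on the slide.
BARCODE_TO_SPOT = {
    "AAAC": (0, 0),
    "AAAG": (0, 1),
    "AACT": (1, 0),
}

def spot_gene_counts(reads, barcode_to_spot):
    """Count unique molecules per (spot, gene).

    `reads` is an iterable of (spatial_barcode, umi, gene) tuples, as they
    might look after alignment. Reads whose barcode is not on the slide are
    ignored, and repeated (barcode, umi, gene) tuples are counted once.
    """
    seen = set()
    counts = defaultdict(int)
    for barcode, umi, gene in reads:
        if barcode not in barcode_to_spot:
            continue  # barcode does not correspond to a spot on this slide
        key = (barcode, umi, gene)
        if key in seen:
            continue  # amplification duplicate of a molecule already counted
        seen.add(key)
        counts[(barcode_to_spot[barcode], gene)] += 1
    return dict(counts)

# Made-up reads for illustration only.
reads = [
    ("AAAC", "U1", "GFAP"),
    ("AAAC", "U1", "GFAP"),    # duplicate of the read above
    ("AAAC", "U2", "GFAP"),
    ("AAAG", "U1", "SNAP25"),
    ("TTTT", "U9", "GFAP"),    # unknown barcode
]
counts = spot_gene_counts(reads, BARCODE_TO_SPOT)
print(counts)
```

The result is a sparse spot-by-gene table keyed by slide coordinates, which is the kind of structure that downstream spatial analyses start from.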
Overall, spatial transcriptomics provides a powerful tool for studying complex tissues, such as the brain, where multiple cell types with distinct gene expression profiles are tightly organized in intricate spatial arrangements. The technology can help researchers to gain new insights into the molecular mechanisms of development, disease, and tissue function. There are several platforms available for spatial transcriptomics; however, the two most widely used platforms are Visium (10x Genomics) and GeoMx (NanoString).
Visium Advances Biomarker Discovery
Visium is a spatial transcriptomics technology developed by 10x Genomics that enables high-throughput analysis of gene expression in intact tissue sections. Visium builds on the principle of spatial transcriptomics and allows for the analysis of the whole transcriptome in a spatially resolved manner, meaning that researchers can obtain detailed information on the gene expression patterns within a tissue sample while maintaining their spatial context.
The Visium platform is based on the capture of mRNA molecules on an array of polymeric spots on a glass slide. The captured mRNA molecules are then barcoded, reverse transcribed, and amplified. The amplified cDNA is sequenced using high-throughput sequencing technologies, generating millions of reads that are aligned to a reference genome. The resulting data can be visualized and analyzed using various software tools provided by 10x Genomics or other bioinformatics platforms. The technology can be used to study complex tissues and biological processes, such as development, disease, and tumor microenvironments, by providing information on the expression levels of thousands of genes in each region of the tissue section.
NanoString GeoMx Allows Targeted Analysis of Spatial Transcriptome
NanoString’s GeoMx is another spatial profiling technology that enables the analysis of gene expression within a tissue sample while retaining its spatial context. The GeoMx system uses a set of molecular probes that are pre-designed or customized for specific gene targets, each carrying a photocleavable oligonucleotide barcode. The probes are hybridized to the RNA molecules in the tissue section, and the tissue is imaged with fluorescent markers so that regions of interest can be selected. UV light then releases the barcodes from each selected region, and the released barcodes are collected and counted, linking gene expression to the region it came from.
The GeoMx technology can be used to analyze hundreds to thousands of genes simultaneously, allowing researchers to gain a comprehensive view of the transcriptome across different regions of the tissue. The technology can be used to study various biological questions, such as the identification of cell types, the characterization of disease-associated gene expression patterns, and the discovery of new biomarkers for diagnosis and therapy. The GeoMx system is also compatible with other NanoString technologies, such as the nCounter platform, enabling researchers to combine spatial profiling with digital quantification of gene expression in the same sample. The technology has applications in various fields, including oncology, immunology, and neuroscience, and can be used in both research and clinical settings.
Conclusion
Visium and GeoMx are both spatial profiling technologies, but there are some key differences between the two platforms. Visium doesn’t require any large lab equipment. The chemistry takes place on a special microscope slide and the workflow can be completed with standard lab equipment found in many cell biology labs. On the other hand, GeoMx does require special machinery, which can make it more difficult for smaller labs to utilize this technology. Another major difference is that Visium takes an unbiased approach to spatially profiling tissues; any tissue placed in the capture area of the slide will be sequenced. GeoMx requires some prior knowledge of the tissue since regions of interest must be chosen. This difference makes Visium a great tool for discovery research, while GeoMx is great for clinical research.
Overall, both Visium and GeoMx are powerful tools for studying gene expression patterns and cellular heterogeneity within complex tissues. The choice of platform depends on the research question and the specific needs of the experiment, as each platform has its own strengths and limitations. Currently, MedGenome is offering full bioinformatics services related to Visium datasets. There are many standard and custom analysis options available to accommodate most projects. Additionally, our genomics lab is in the process of adopting spatial transcriptomics technologies. We expect end-to-end Visium spatial profiling services to be offered in the near future.
Next-generation sequencing (NGS) data are increasingly used in clinical diagnosis to identify genetic variation that can cause disease. A major challenge in using NGS data in a clinical setting is interpreting them correctly, given their size and complexity. Technical errors can also arise during sample processing or sequencing, some of which are inherent to the sequencing technology used. The use of reference standards is therefore of paramount importance for detecting and minimizing these errors.
By Archana Deshpande, QA Manager, MedGenome Inc
Introduction
Reference standards play an important role in the life cycle of a typical NGS method before clinical application. A typical NGS assay life cycle includes assay development, optimization, validation, and continuous quality management, and reference standards matter at each of these stages, from assay validation to technical validation to sample processing. This article discusses the general selection of reference standards and describes in detail the results of the technical validation of a target-capture assay (Illumina’s TruSight Oncology 500 panel) performed at MedGenome Labs.
NIST and GIAB Standards
Several consortia, such as GIAB (Genome in a Bottle), and companies, such as Horizon Diagnostics and SeraCare, have developed DNA reference materials over the years to support the clinical translation of whole genome sequencing. NIST (National Institute of Standards and Technology) also had a program to develop whole human genome reference materials. In general, reference standards are well-characterized samples that are consistent and stable over time. Essentially, the DNA reference is characterized by collating data from various sequencing and bioinformatics methods and from multiple datasets to yield highly confident genotype calls. Laboratories can then use these data to evaluate assay performance, and accreditation agencies can use them to benchmark results.
In addition to being homogeneous and stable, NIST standards carry certified values, which indicate confidence in their accuracy. This certification indicates that NIST has fully investigated and accounted for all known or suspected sources of bias seen in the data.
MedGenome Validation Data
At MedGenome Labs, we have validated Illumina’s TSO 500 workflow and pipeline using reference samples from SeraCare. TSO 500 is a target-capture based panel that interrogates multiple biomarkers and tumor types; it identifies all relevant DNA and RNA variants implicated in various solid tumor types. Thus, it allows for in-house comprehensive genomic profiling of tumor samples. It also accurately measures key current immuno-oncology biomarkers: microsatellite instability (MSI) and tumor mutational burden (TMB). The other advantage is that the assay has a ctDNA panel that can be used for liquid biopsies.
The workflow for either TSO 500 assay is a hybrid capture protocol. We validated our process using three ctDNA controls from SeraCare (allele frequencies of 0.1% and 0.5%, plus wild type). The data were generated from as little as 30 ng of starting input.
We analyzed the somatic mutations in the gene list covered by the Seraseq ctDNA Complete Mutation Mix at AF 0.5%, AF 0.1%, and WT. The analysis included sensitivity, specificity, positive predictive value, and inter-run comparison. Below are some of the results that we obtained.
Library Quality Report
The libraries created from SeraSeq controls ranged from 23 to 46 nM and had an average size of 330 bp with ~200 bp insert size (see Figure 1 for example). This library size fell in the range that is specified in the TSO 500 protocol.
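Library concentrations in nM are typically derived from a mass concentration and an average fragment size. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the function name and the input values are illustrative assumptions, not measurements from this validation.

```python
def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA library concentration from ng/uL to nM.

    Uses the approximation that one base pair of double-stranded DNA
    weighs about 660 g/mol.
    """
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment size > 0")
    return conc_ng_per_ul * 1e6 / (660.0 * mean_fragment_bp)

# Hypothetical reading: 5 ng/uL at an average library size of 330 bp.
molarity = library_molarity_nm(5.0, 330)
print(f"{molarity:.1f} nM")
```

With these example inputs the result is roughly 23 nM, which shows how a concentration of a few ng/µL at this fragment size lands in the range reported above.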
The reference SeraSeq sets were analyzed using the TruSight Oncology 500 ctDNA local app, and the results were compared with the reference data provided by the vendor. We obtained 100% sensitivity and specificity for all three controls in the dataset. The variants are also shown in the lollipop plot given below, along with the sensitivity and specificity information for the controls (Table 1).
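For readers who want to see how such figures are derived, the following is a minimal sketch of sensitivity, specificity, and positive predictive value computed from sets of variants. It is not the TSO 500 local app's method; the function name and the variant labels are hypothetical and are not the Seraseq variant list.

```python
def performance(truth: set, called: set, assayed: set) -> dict:
    """Compare called variants with a truth set over all assayed sites.

    truth   : variants expected in the reference material
    called  : variants reported by the pipeline
    assayed : every site evaluated (needed to count true negatives)
    """
    tp = len(truth & called)
    fn = len(truth - called)
    fp = len(called - truth)
    tn = len(assayed - truth - called)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }

# Illustrative labels only.
truth = {"EGFR:L858R", "KRAS:G12D", "BRAF:V600E"}
assayed = truth | {"TP53:R175H", "PIK3CA:E545K"}

perfect = performance(truth, truth, assayed)
imperfect = performance(truth, {"EGFR:L858R", "TP53:R175H"}, assayed)
print(perfect)
print(imperfect)
```

Specificity depends on how the set of assayed negative sites is defined, so that definition should be fixed before comparing runs or controls.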
For the inter-run comparison, the tumor mutational burden count and the variant allele frequency (VAF) of the same sample were compared between two runs. The identified variants were nearly identical, and the VAF values were highly correlated (see Figure 2 below).
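One simple way to quantify inter-run agreement is the Pearson correlation of VAFs for variants detected in both runs. The sketch below assumes each run is available as a mapping from a variant label to its VAF; the labels and values are invented for illustration and are not the validation data.

```python
import math

def paired_vafs(run_a: dict, run_b: dict):
    """Return VAFs from both runs for the variants they share, in the same order."""
    shared = sorted(set(run_a) & set(run_b))
    return [run_a[v] for v in shared], [run_b[v] for v in shared]

def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical VAFs from two runs of the same sample.
run1 = {"var1": 0.0050, "var2": 0.0048, "var3": 0.0011, "var4": 0.0052}
run2 = {"var1": 0.0047, "var2": 0.0051, "var3": 0.0009, "var5": 0.0040}

xs, ys = paired_vafs(run1, run2)
r = pearson_r(xs, ys)
print(f"shared variants: {len(xs)}, r = {r:.3f}")
```

Variants seen in only one run (here `var4` and `var5`) are excluded from the correlation and are worth reporting separately, since they reflect detection differences rather than VAF differences.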
Our process workflow for Illumina’s TSO 500 panel passed technical validation, and we have started offering the TSO 500 panel for both solid tumor and ctDNA samples as part of our services for clients.
Conclusion
DNA reference standards are vital for translational medicine as well as research, and MedGenome uses commercially available reference standards to perform technical validation wherever they are available. This validation gives us confidence in our process workflows and the data they generate.
NGS technologies are at the forefront of biological research. They produce enormous amounts of data, running into gigabases in a single round of sequencing. However, several sequencing artifacts, such as read errors (base calling errors and small insertions/deletions), poor-quality reads, and primer/adaptor contamination, are quite common in NGS data.
By Parimala Nagaraja, Scientist, NGS, MedGenome Inc.
These artifacts can significantly affect downstream analyses such as sequence assembly, single nucleotide polymorphism (SNP) identification, and gene expression studies.
Quality control (QC) metrics play a critical role in minimizing errors and achieving the high-quality data needed for a successful experimental study. MedGenome maintains strict QC guidelines to deliver high-quality data to our clientele.
QC metrics are mainly applied at 3 levels:
• Sample QC (DNA/RNA)
• Library QC
• Sequencing QC
Sample QC
An ideal NGS assay requires high-quality DNA/RNA. Quality is usually determined using a TapeStation or Bioanalyzer, which reports DIN/RIN (DNA/RNA Integrity Number) values ranging from 1 to 10, where 10 indicates the highest-quality sample and 1 indicates a highly degraded, poor-quality sample.
Depending on the assay type and sample source, MedGenome provides clients with guidelines for sample quantity, quality, and volume. At MedGenome, all samples are first subjected to QC using Qubit to determine quantity and TapeStation/Bioanalyzer to determine quality.
Based on the QC results, samples are classified as Pass, Marginal, or Fail. Replacement samples are usually requested for samples that fail sample QC. For Marginal samples, replacements are strongly encouraged; otherwise, the samples proceed to library preparation after the client’s approval.
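The three-way classification can be expressed as a small rule. The sketch below is purely illustrative: the threshold values are placeholders chosen for the example and are not MedGenome’s acceptance criteria, which depend on assay type and sample source.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QcThresholds:
    """Assay-specific cut-offs; every value here must be supplied by the caller."""
    pass_min_ng: float
    marginal_min_ng: float
    pass_min_integrity: float      # DIN or RIN, on the 1-10 scale
    marginal_min_integrity: float

def classify_sample(total_ng: float, integrity: float, t: QcThresholds) -> str:
    """Return 'Pass', 'Marginal', or 'Fail' for one sample."""
    if total_ng >= t.pass_min_ng and integrity >= t.pass_min_integrity:
        return "Pass"
    if total_ng >= t.marginal_min_ng and integrity >= t.marginal_min_integrity:
        return "Marginal"
    return "Fail"

# Placeholder thresholds for demonstration only.
example = QcThresholds(pass_min_ng=500, marginal_min_ng=100,
                       pass_min_integrity=7.0, marginal_min_integrity=4.0)

for ng, rin in [(800, 8.5), (300, 6.0), (50, 9.0)]:
    print(ng, rin, classify_sample(ng, rin, example))
```

Keeping the thresholds in a separate object makes it easy to maintain one set per assay type without changing the classification logic.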
Library QC
All libraries prepared in-house are checked for quality using TapeStation/Bioanalyzer and quantified using Qubit. TapeStation and Bioanalyzer results are thoroughly reviewed for the expected library size, adapter contamination, primer dimers, and PCR artifacts before the libraries are pooled and loaded onto the sequencer. MedGenome also offers sequencing support for premade libraries, which clients prepare based on their project requirements. All premade libraries are also subjected to MedGenome QC methodologies and are diligently reviewed and classified as Pass, Marginal, or Fail before sequencing. The following images provide an example of good versus bad library QC.
Sequencing QC
Illumina allows users to monitor runs in real time, without interfering with run performance, using software called Sequencing Analysis Viewer (SAV). This software is compatible with all HiSeq, NextSeq, MiSeq, and NovaSeq platforms. The following table describes the metrics used for evaluating sequencing QC:
Table 1: Different terms and their corresponding definitions as viewed in SAV.
Term | Definition
Intensity | The 90th percentile extracted intensity for a given image (lane/tile/cycle/channel combination). On platforms using four-channel sequencing, 4 channels (A, C, G, and T) are shown.
FWHM | The average full width of clusters at half maximum (representing their approximate size in pixels).
% Base | The percentage of clusters for which the selected base has been called.
%Q ≥ 20, %Q ≥ 30 | The percentage of bases with a Phred or Q quality score of 20 or higher, or 30 or higher, respectively.
Density | The density of clusters for each tile (in thousands per mm²).
Density PF | The density of clusters passing filter for each tile (in thousands per mm²).
Clusters | The number of clusters for each tile (in millions).
Clusters PF | The number of clusters passing filter for each tile (in millions). (Metrics are given in the images below.)
% Pass Filter | The percentage of clusters passing the Chastity filter. (Metrics are given in the images below.)
% Phasing, % Prephasing | The average rate (percentage per cycle) at which molecules in a cluster fall behind (phasing) or jump ahead (prephasing) during the run.
% Aligned | The percentage of the passing filter clusters that aligned to the PhiX genome.
Error rate | The calculated error rate, as determined by the PhiX alignment. Subsequent columns display the error rate for cycles 1–35, 1–75, and 1–100.
Yield Total | The number of bases sequenced, which is updated as the run progresses. (Metrics are given in the images below.)
Projected Total Yield | The projected number of bases expected to be sequenced at the end of the run.
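The %Q ≥ 20 and %Q ≥ 30 metrics in the table rest on the Phred scale, where a quality score Q corresponds to a base-call error probability of 10^(-Q/10). The short sketch below illustrates that relationship and the percentage calculation; the function names and the example scores are illustrative and are not part of SAV.

```python
import math

def phred_to_error_prob(q: float) -> float:
    """Probability that a base call with quality score q is wrong."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p: float) -> float:
    """Quality score corresponding to an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)

def percent_at_least(qualities, threshold: int) -> float:
    """Percentage of base calls with a quality score >= threshold."""
    if not qualities:
        raise ValueError("no quality scores given")
    return 100.0 * sum(q >= threshold for q in qualities) / len(qualities)

# Q20 is a 1-in-100 error rate, Q30 is 1-in-1000.
for q in (20, 30):
    print(q, phred_to_error_prob(q))

# Made-up scores for four base calls.
print(percent_at_least([10, 20, 30, 40], 30))
```

Because the scale is logarithmic, moving from Q20 to Q30 means a tenfold reduction in expected errors, which is why %Q ≥ 30 is the more demanding of the two metrics.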
Illumina provides standardized expectations for read output, reads passing filter, and quality scores for each flow cell type on every sequencing platform. The following images provide the metrics for different flow cells on the NovaSeq 6000.
Sequencing QC also depends on the library types pooled into the same lane or flow cell. If libraries prepared using the same protocol (for example, Illumina Stranded mRNA) are pooled and sequenced, the NovaSeq can outperform the Illumina specifications. However, this is rarely the case in the real world for an NGS service provider operating at high throughput with fast turnaround times. Hence, when libraries of different types are pooled, variations in run performance and data yield are to be expected. The following images provide an example of the sequencing statistics achieved by pooling similar libraries and mixed libraries.
Quality Control of the Sequencing raw data
Raw data quality control should be the initial step of data analysis for any successful study. Several publicly available tools can perform quality control on raw FASTQ files. FastQC, developed by the Babraham Institute bioinformatics group, is one of the most popular; it reports QC parameters such as the average base quality score per read, the GC content distribution, and the most duplicated reads.
The important parameters to check for raw sequencing data quality are:
• Base Quality
• Nucleotide distribution
• %GC distribution
• PCR duplicates
Base Quality check:
A common way to visualize base quality is to draw a base Q-score versus cycle plot. Sequencing data generated on Illumina platforms tend to have a median base quality score between 35 and 40 on the Phred scale. Large variations in base quality scores (Figure 10a) usually indicate poor library quality. A sudden drop in quality scores (Figure 10b) usually indicates adapter dimer contamination or a fluidics issue in the instrument. For paired-end reads, it is common to observe higher quality in the first read than in the second, owing to the amount of time the template has spent on the instrument and the increasing laser exposure over time.
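The data behind a Q-score versus cycle plot can be computed directly from a FASTQ file. The sketch below is a simplified stand-in for what tools such as FastQC report, not their implementation. It assumes well-formed four-line FASTQ records with Phred+33 quality encoding and reports the mean (rather than median) quality at each cycle; the example reads are invented.

```python
import io

def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header.strip():
            return  # end of file (or trailing blank line)
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n"), seq, qual

def mean_quality_per_cycle(handle, offset: int = 33):
    """Mean Phred score at each cycle (read position) across all reads."""
    totals, counts = [], []
    for _, _, qual in read_fastq(handle):
        for i, ch in enumerate(qual):
            if i == len(totals):
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]

# Two made-up reads: 'I' encodes Q40, '5' encodes Q20, '#' encodes Q2.
example = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACGT\n+\nII5#\n"
)
per_cycle = mean_quality_per_cycle(example)
print(per_cycle)
```

For real data, the same function can be applied to a file opened with `open(path)` or, for compressed files, `gzip.open(path, "rt")`, and the returned list plotted against cycle number.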
Nucleotide Distribution
This parameter is useful for whole genome and whole exome libraries (high diversity) but not for amplicon or RNA libraries (medium to low diversity). For a perfect sequencing run, the distribution of the four nucleotides (A, T, C, G) across all reads should remain relatively stable (Figure 11).
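As a rough illustration of this check, the sketch below computes the fraction of each base at every cycle from a list of read sequences. The function name and the example reads are hypothetical; a stable run would show similar fractions from one cycle to the next.

```python
from collections import Counter

def base_fraction_per_cycle(seqs):
    """Fraction of A, C, G, and T at each cycle across all reads.

    Reads shorter than the current cycle are skipped for that cycle.
    Other symbols (such as N) count toward the total but are not reported,
    so the four fractions may sum to less than 1.
    """
    if not seqs:
        return []
    n_cycles = max(len(s) for s in seqs)
    fractions = []
    for i in range(n_cycles):
        column = Counter(s[i] for s in seqs if i < len(s))
        total = sum(column.values())
        fractions.append({base: column.get(base, 0) / total for base in "ACGT"})
    return fractions

# Made-up reads: cycle 1 is mixed, cycle 2 is all C (low diversity).
reads = ["ACGT", "ACGA", "TCGT", "GCGT"]
for cycle, frac in enumerate(base_fraction_per_cycle(reads), start=1):
    print(cycle, frac)
```

A cycle dominated by a single base, as in the second position of this toy example, is the kind of pattern expected from low-diversity libraries such as amplicons.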
%GC Distribution
The percentage of GC in the genome varies across species and across the regions of each genome. For exome regions, the GC content is about 49–51%, while for human whole-genome sequencing, the GC content is around 38–40%. An abnormal GC content (more than 10% deviation from the normal range) can indicate contamination.
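The GC check can be sketched as follows. The expected ranges come from the text above; interpreting the 10% deviation as 10 percentage points outside the expected range is an assumption made for this example, and the function names are hypothetical.

```python
def gc_percent(seqs) -> float:
    """GC content, in percent, over all A/C/G/T bases in the given reads."""
    gc = sum(s.count("G") + s.count("C") for s in seqs)
    acgt = sum(s.count(base) for s in seqs for base in "ACGT")
    if acgt == 0:
        raise ValueError("no A/C/G/T bases found")
    return 100.0 * gc / acgt

def gc_is_abnormal(observed: float, expected_low: float, expected_high: float,
                   tolerance: float = 10.0) -> bool:
    """True if observed GC% lies more than `tolerance` points outside the range.

    Assumption: the deviation is measured in percentage points.
    """
    return observed < expected_low - tolerance or observed > expected_high + tolerance

# Made-up reads with exactly 50% GC, checked against the human WGS range (38-40%).
observed = gc_percent(["GGCC", "AATT"])
print(observed, gc_is_abnormal(observed, 38.0, 40.0))
```

A flagged sample is only a prompt for further investigation, for example screening reads against likely contaminant genomes, since library type and target region also shift GC content.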
PCR Duplicates
PCR duplicates arise during library preparation when PCR amplifies the adapter-ligated fragments. The presence of PCR duplicates can bias variant calling algorithms, so most bioinformatics analysis pipelines remove them during data pre-processing. Common causes of a high PCR duplicate rate are low input quantity, over-sequencing, too many PCR cycles, low pre-enrichment or final library yield, and short library fragments.
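A duplicate rate can be estimated by counting reads that share the same identifying key. The sketch below assumes reads have already been aligned and that the key is (chromosome, start position, strand); this is a simplification of what dedicated duplicate-marking tools do, and the coordinates are invented.

```python
from collections import Counter

def duplicate_rate(keys) -> float:
    """Fraction of reads that are duplicates of an earlier read with the same key."""
    keys = list(keys)
    if not keys:
        raise ValueError("no reads given")
    unique = len(Counter(keys))
    return (len(keys) - unique) / len(keys)

# Made-up alignment keys: (chromosome, start, strand).
alignments = [
    ("chr1", 1000, "+"),
    ("chr1", 1000, "+"),   # duplicate
    ("chr1", 1000, "+"),   # duplicate
    ("chr1", 1000, "-"),   # same position, opposite strand: not a duplicate
    ("chr2", 5000, "+"),
]
rate = duplicate_rate(alignments)
print(f"{rate:.0%}")
```

For unaligned data, the read sequence itself can serve as the key, which gives a cruder estimate because sequencing errors make true duplicates look distinct.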
Conclusion
MedGenome strives to follow all the best practices in Lab and QC methodologies. Apart from just performing QC, we also interpret and communicate with the client regarding any deviations from MedGenome’s QC standards and recommend the best possible actions to proceed. After the sequencing is performed to the best of our abilities, the raw data is thoroughly reviewed as per Illumina’s standards prior to the data being shared with clients. MedGenome also offers data and sample storage facilities as per clients’ requests.
Our journey in 2022 was focused on providing the best possible customer experience for the services and solutions that we delivered to you. We also expanded our portfolio of services and solutions: tissue dissociation and nuclei isolation services to support our single cell customers, streamlined antibody discovery using high-throughput single B cell receptor sequencing, TSO500 targeted panels for oncology research, and single cell and bulk epigenetics assays.
By Hiranjith GH, VP & Head of Research Services, MedGenome Inc.
2022 was an eventful year,
We improved our turn-around times on bulk transcriptomics, whole exome and whole genome projects by incorporating automation at multiple project stages and installing new sequencing capacity in 2022. Our sequencing team has delivered high-quality data consistently for a variety of library types throughout the year.
We also rolled out a series of advanced and interactive analysis reports for each of our assays through our unique ManGo platform with novel data representations. This scale-up of our bioinformatics capabilities is in line with our objective of helping biologists and researchers maximize the utility of genomic data.
We partnered with our customers to discuss the data and the subsequent analyses after each project was delivered, to ensure that the results met the researchers’ needs.
We built streamlined systems and communication processes for sample and data management with our customers, which allowed us to build transparency and a trusted relationship.
In 2023,
We expect to continue to offer high-quality support to your projects in 2023. We will be dedicating lab resources to optimizing spatial transcriptomics and Hi-C assays in-house in 2023 to expand our services portfolio, along with supporting bioinformatics analyses and tools, to provide end-to-end service to our customers.
We will engage with customers on antibody discovery solutions and protein expression services given the highly experienced R&D talent that we have at MedGenome.
With growing sample volumes, we are committed to investing in our sequencing capacity even further in 2023 – to help us maintain the turn-around times (TAT) for the projects. We will discontinue our HiSeq X service by the end of this year.
We are also a preferred partner to customers who are looking to access South Asian genomic datasets in specific disease areas (rare diseases, oncology, neuro-degenerative disorders, blood disorders, metabolic diseases) for discovery or genetic modifier studies. With a vast network of hospital collaborations in India, MedGenome is able to accelerate these studies with high impact.
The pandemic has tested our systems, our processes, and the quality of our team members. It has shown the importance of the value-added engagements with our customers that MedGenome strives for. Going into 2023, MedGenome is looking forward to continuing those relationships to advance our customers’ genomics research.
The discovery of genetic and epigenetic mechanisms underlying the onset and progression of numerous diseases, including cancer, has helped redefine clinical research, diagnostic and treatment paradigms. Oncology research and diagnostics have undergone radical changes because of the development of next-generation sequencing (NGS). NGS has improved rationally designed personalized cancer medicine by identifying novel cancer mutations, detecting circulating tumor DNA (ctDNA), and discovering causative mutations for hereditary cancer syndrome. With NGS, it is now possible to sequence the whole genome, whole exome, whole transcriptome, or just targeted genes to provide detailed genomic landscape descriptions for many cancers.
By Dr. Chaitanya Ekkirala, Lab Director, NGS Operations, MedGenome Inc
Overview
MedGenome is committed to providing the highest-quality NGS services for research and clinical development. Our expert scientific team and laboratory facility with cutting-edge sequencing platforms, including NovaSeq, MiSeq, and 10X Chromium Controller, guarantee highly optimized protocols, customized solutions, and quick turnaround times.
Targeted panel for Oncology mutational profiling using TSO-500
MedGenome offers comprehensive genomic profiling of tumor samples using the TruSight Oncology 500 (TSO-500) assay from Illumina. The TSO-500 is a pan-cancer NGS assay with broad availability, rapid turnaround time, and a standardized bioinformatics pipeline that identifies key genomic signatures for clinical research and immuno-oncology. The panel covers DNA variants in 523 cancer-relevant genes and RNA variants in 55 genes, providing comprehensive coverage of biomarkers (Figure 1) frequently mutated in multiple cancer types.
TSO-500 employs a highly standardized, single integrated workflow for both DNA and RNA input material. Library preparation uses a hybridization capture-based target enrichment strategy. High analytical specificity is achieved by adding unique molecular identifiers (UMIs) during library prep: reads that carry the same UMI derive from the same original molecule, which suppresses errors and allows gene variants to be detected even at low variant allele frequency (VAF). Sequencing is carried out using fluorescently labelled nucleotides, and an off-the-shelf bioinformatics pipeline provides robust and reliable results.
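To make the role of UMIs more concrete, here is a minimal Python sketch of UMI-based error suppression. It is not the TSO-500 pipeline: the function name, the `(position, umi, sequence)` read format and the `min_family_size` threshold are assumptions made for illustration. Reads that share a mapping position and a UMI are treated as copies of one original molecule and collapsed by majority vote at each base.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_family_size=2):
    """Collapse reads sharing (position, UMI) into one consensus sequence.

    reads: iterable of (position, umi, sequence) tuples (hypothetical format).
    Sequences within a family are assumed to have equal length.
    """
    families = defaultdict(list)
    for position, umi, sequence in reads:
        families[(position, umi)].append(sequence)

    consensus = {}
    for key, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few copies to vote out an error
        # Majority vote at each base position across the family.
        consensus[key] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


reads = [
    (100, "AACG", "ACGTAC"),
    (100, "AACG", "ACGTAC"),
    (100, "AACG", "ACTTAC"),  # one copy carries a sequencing error
    (100, "GGTA", "ACGAAC"),
    (100, "GGTA", "ACGAAC"),  # both copies agree, so the change is kept
    (250, "TTCA", "GGGCCC"),  # singleton family, dropped
]
result = umi_consensus(reads)
```

In this toy input the isolated `T` in the first family is voted out, while the change seen in every copy of the second family survives, which is the intuition behind detecting low-VAF variants while suppressing errors.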
Features
• Variable sample types and throughput: MedGenome offers flexible and scalable genomic profiling from tumor biopsies, FFPE tissues, and liquid biopsies (to detect ctDNA) using the Illumina TruSight Oncology 500 portfolio (Figure 2). The platform handles batches of 8 to 192 samples and allows accurate detection even in low-input samples (from 30 ng of DNA and 40 ng of RNA).
• Easy ctDNA detection from liquid biopsies: Noninvasive plasma-based assays have emerged as an important complement to tissue-based assays, which are not feasible for repeated sampling or for inaccessible tissues. Further, single biopsies lack information on tumor heterogeneity. Blood plasma contains tumor cell fragments and DNA from apoptotic or necrotic cancer cells, providing information on cancer aggressiveness, progression and therapeutic outcomes. The TSO-500 ctDNA assay enables non-invasive, comprehensive genomic profiling of ctDNA from simple blood draws, evaluating multiple variant classes across more than 500 genes in a single assay.
• Optimized data analysis: The variant calling algorithms are optimized to eliminate errors, artefacts, and germline variants for high accuracy and analytical specificity (99.9998%). Data interpretation and reporting are powered by PierianDx Clinical Genomics Workspace (CGW), which filters and prioritizes biologically relevant variants and provides an automated, customizable genomic report.
• Accurate TMB and MSI analysis: The TSO-500 assay implements error-corrected sequencing and an informatics pipeline that provide an accurate quantitative score for microsatellite instability (MSI) status and a precise, reproducible tumor mutational burden (TMB) value. The TMB calculation counts both nonsynonymous and synonymous SNVs and indels that meet specific criteria. The results have been shown to be highly concordant with whole-exome studies.
• High sensitivity: Using the TSO-500 platform, we provide highly sensitive detection of CNVs, with a limit of detection at a 2.2-fold change. In addition, the TSO-500 library prep protocol uses probes with high binding specificity that still hybridize to targets containing small mutations and SNVs, even in low-quality DNA samples and FFPE tissues. Assay reproducibility has been verified in FFPE samples at a VAF as low as 5%. Furthermore, for RNA fusions, the hybrid-capture method accurately captures gene fusions with both known and novel fusion partners, even from FFPE samples, provided the RNA input is at least 40 ng.
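The VAF and TMB figures in the feature list above follow from simple arithmetic, sketched below in Python. TMB is conventionally reported as eligible mutations per megabase of sequence covered. The eligibility filter used here (somatic flag, SNV or indel, minimum VAF), the dictionary keys and the 1.5 Mb footprint are illustrative assumptions, not the criteria or panel size used by the TSO-500 pipeline.

```python
def variant_allele_frequency(alt_reads, total_reads):
    """Fraction of reads at a site that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def tumor_mutational_burden(variants, covered_megabases, min_vaf=0.05):
    """Eligible mutations per megabase (the filter below is hypothetical)."""
    if covered_megabases <= 0:
        raise ValueError("covered_megabases must be positive")
    eligible = [
        v for v in variants
        if v["somatic"]
        and v["type"] in {"SNV", "indel"}
        and variant_allele_frequency(v["alt_reads"], v["total_reads"]) >= min_vaf
    ]
    return len(eligible) / covered_megabases


# Made-up calls for one sample.
variants = [
    {"type": "SNV", "somatic": True, "alt_reads": 40, "total_reads": 400},
    {"type": "indel", "somatic": True, "alt_reads": 30, "total_reads": 200},
    {"type": "SNV", "somatic": False, "alt_reads": 100, "total_reads": 200},  # germline, excluded
    {"type": "SNV", "somatic": True, "alt_reads": 2, "total_reads": 400},  # below min_vaf, excluded
    {"type": "SNV", "somatic": True, "alt_reads": 60, "total_reads": 300},
]
tmb = tumor_mutational_burden(variants, covered_megabases=1.5)
```

Three of the five calls pass the illustrative filter, so this sample scores 3 / 1.5 = 2 mutations per megabase.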
The MedGenome Advantage
MedGenome offers end-to-end customized solutions for comprehensive genomic profiling of small and large-scale tumor samples to accelerate your clinical research and diagnostics R&D. We have expertise in processing a variety of sample types, including biopsies, FFPE tissue (DNA & RNA), and blood plasma (ctDNA) for DNA sequencing. Our scientific team can also provide solutions to challenging sample processing and high-throughput samples.
Utilizing the TruSight targeted panel, we provide exclusive services and advantages:
• Multiplexing solutions – Save time and samples by analyzing multiple tumor variant types in 523 genes in a single assay
• Speed – Streamlined validated workflow with quick turnaround times for TSO-500 assay
• Powerful bioinformatics pipelines – Additional insights on drug-gene interactions, identification of actionable mutations, and comprehensive reports
• High-throughput processing – Scientific expertise and state-of-the-art lab facility for large sample sizes suitable for clinical research
• TST170 Panel for ctDNA – We have validated the TruSight Tumor 170 panel on ctDNA samples for assessment of SNVs and indels in 151 genes, amplifications in 59 genes, and fusions plus splice variants in 55 genes.
Need more insights on tumor profiling using NGS?
Get in touch with our expert scientific team for unique solutions to your research. You can email us at research@medgenome.com for any queries and further details.
Alzheimer’s disease (AD) has long been one of the great challenges in medicine and imposes a constant burden on our aging population. Recent statistics show that approximately 50 million people worldwide suffer from AD or some other form of dementia. The World Health Organization has estimated that the total number of people with dementia worldwide will reach 82 million by 2030 and 152 million by 2050. Of the top 10 leading causes of death reported in United States cancer statistics, cardiovascular disease ranks first, cancer ranks second and AD ranks sixth.
By Dr. Anantha Kethireddy Ph. D., MedGenome Scientific Affairs
Why is Alzheimer’s relevant?
AD is a slowly progressing and eventually fatal neurodegenerative disorder and a major cause of dementia, leading to degeneration of neurons and their connections in parts of the brain involved in memory. Its symptoms, which include impairment in thinking, remembering, reasoning, other cognitive functions and behavior, are collectively known as dementia. Other diseases and conditions can also cause dementia, but AD is the most common cause in older adults. AD is not a normal part of aging. It is the result of complex changes in the brain that start years before symptoms appear and lead to loss of brain cells and connections. The hallmarks of AD are amyloid plaques and neurofibrillary tangles of phosphorylated tau protein. Much evidence also suggests the involvement of neuroinflammation and multiple systemic comorbidities in the pathology of AD.
Since AD was first described in the early 1900s, clinicians and scientists all over the world have dedicated their careers to studying the pathophysiology of this most common form of dementia in the hope of developing methods of prevention, treatments to halt its progression and, ultimately, a cure. AD’s pathophysiology involves neuron-glia interactions, as supported by transcriptomic and epigenomic analyses that reveal downregulation of neuronal functions and upregulation of innate immune responses in AD brains. Currently, some FDA-approved tools, such as brain imaging, can be used where applicable to aid in the diagnosis of AD, while other emerging biomarkers, such as blood tests and genetic risk profiling, are promising but still under investigation.
Drugs do exist that, if administered early enough, may help to treat the symptoms of early-stage AD and improve the person’s quality of life. Very recently, two pharmaceutical companies announced encouraging results from a clinical trial for patients with AD: a monoclonal antibody treatment called lecanemab slowed cognitive decline by 27% in people with early-stage disease compared with those on a placebo after a year and a half. Black and Hispanic populations also have a higher risk of Alzheimer’s disease than non-Hispanic white people, for reasons researchers don’t fully understand. There is no drug that works in everybody to stop or reverse AD.
Understanding AD
The molecular and cellular mechanisms of AD are incompletely understood. Typically, when scientists studied gene expression in the brain, they homogenized the tissue and took average measurements from that mixture. Such “bulk” measurements are hard to interpret, and they lose the gene expression signals that come from individual cell types, especially poorly represented ones. Although numerous studies using bulk RNA-seq analysis have revealed dysfunctions of neurons and/or innate immune responses, they are unable to disentangle the heterogeneity of different disease subtypes and the distinct responses across cell types. This makes it important to characterize the heterogeneity of the brain region that is important for learning and memory and is the first region affected in Alzheimer’s disease.
Single Cell Genomics
Even within a single brain region, there is significant variation in the morphology, connectivity and electrophysiological properties of individual neurons. A key step towards understanding the basic components of the nervous system is the systematic classification of individual neurons. For cells to be classified on a molecular basis, gene expression must be assessed at single-cell resolution. Today’s high-throughput technologies, such as single-cell and spatial multiomics, are revolutionizing neurological research by providing that resolution.
A cohesive picture of how gene expression is regulated within discrete cell types and specific anatomical regions of the brain during the early stages of AD is crucial. Profiling tens of thousands of individual cells makes it possible to study the cellular heterogeneity of the brain, capture the molecular and cellular basis of AD and identify novel therapeutic targets.
Figure 2: Multiomic integration from single brain (Image source: 10x Genomics)
Summary
Capturing a holistic view of cell-type-specific contributions to pathogenesis, mapping anatomical protein accumulation in the brain during disease progression, and understanding the relationship between abnormal protein accumulation and cellular phenotypes are all crucial to diagnosing this disease at an early stage.
Two assays can help here. Chromium Single Cell Multiome ATAC (Assay for Transposase-Accessible Chromatin) + Gene Expression (the multiome assay) profiles open chromatin and gene expression from the same cell. Visium Spatial Gene Expression for FFPE (Visium for FFPE) plus immunofluorescence (IF) combines whole-transcriptome spatial analysis with immunofluorescence protein detection. Used together, they will help to identify what is shared and what is unique among different neurological conditions.
The multidimensional datasets obtained through single-cell genomics approaches will have a major impact on biological research and clinical pathology. The implementation and expansion of single-cell technologies will lead to vast improvements in the diagnosis and treatment of patients worldwide.
Modern medicine now derives its insights through the deeper understanding of the cellular and molecular mechanisms, which involves modification of the cellular behavior through targeted molecular approaches. Experimental biologists and clinicians now employ various molecular techniques to assess the intrinsic behavior of cells in a variety of ways, such as through analyses of genomic DNA sequences, chromatin structure, messenger RNA (mRNA) sequences, non-protein-coding RNA, protein expression, protein modifications and metabolites.
By Dr. Anantha Kethireddy Ph. D., MedGenome Scientific Affairs
Today, single-cell RNA expression profiling is rapidly becoming an irreplaceable method for research in humans, animals and plants, enabling more accurate and rapid identification of rare and novel cells in tissues than ever before (Figure 1). Moreover, with this information about gene expression at mRNA and protein levels, metabolites, cell-cell communication, and the spatial landscape, it becomes possible to solve the puzzle of cell composition and function in health and disease.
Although single-cell sequencing studies have been conducted mostly by research groups over the past few years, it has become clear that biomedical researchers and clinicians can make important new discoveries using this powerful approach. Technological advances in all areas have shown great promise, with the potential to transform current protocols for diagnosing the genetic drivers of disease and for understanding treatment response mechanisms, from single cells to tissues.
Currently, there is a growing demand for single-cell technology, with nearly 200 different methods to profile not only transcriptomic but genetic, epigenetic, and proteomic information in individual cells.
An Overview of Single-Cell Technologies:
Single-cell technologies can be broadly classified into the analysis of either DNA (genomics, epigenomics) or RNA (transcriptomics), with newer applications around the corner that combine both within the same cell.
Single-Cell Genomics
Of relevance to cancer biology is the ability to study genetic variations in individual cells. Although bulk DNA sequencing (DNA-seq) can be used to infer clonal sub-populations based on variant allele frequency analysis, it cannot be used to definitively test the co-occurrence of specific mutations in individual cells. Thus, single-cell DNA sequencing (scDNA-seq) can reveal cancer clonal architecture in far greater detail.
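A small Python sketch can illustrate why bulk frequencies cannot settle co-occurrence. The two tumors below are invented, and the fraction of cells carrying a mutation is used as a simple stand-in for what a bulk assay would measure. Both tumors look identical in bulk, yet only per-cell genotypes reveal whether the two mutations sit in the same clone.

```python
from collections import Counter
from itertools import combinations


def bulk_frequency(cells):
    """Fraction of cells carrying each mutation, as a bulk assay would see it."""
    counts = Counter(m for mutations in cells.values() for m in mutations)
    return {m: n / len(cells) for m, n in counts.items()}


def co_occurrence(cells):
    """Number of cells in which each pair of mutations is found together."""
    pairs = Counter()
    for mutations in cells.values():
        pairs.update(combinations(sorted(mutations), 2))
    return pairs


# Hypothetical per-cell genotypes for two tumors of four cells each.
same_clone = {"c1": {"TP53", "KRAS"}, "c2": {"TP53", "KRAS"}, "c3": set(), "c4": set()}
separate_clones = {"c1": {"TP53"}, "c2": {"TP53"}, "c3": {"KRAS"}, "c4": {"KRAS"}}

bulk_a = bulk_frequency(same_clone)
bulk_b = bulk_frequency(separate_clones)
pairs_a = co_occurrence(same_clone)
pairs_b = co_occurrence(separate_clones)
```

Here `bulk_a` and `bulk_b` are the same (each mutation in half of the cells), while `pairs_a` counts two cells with both mutations and `pairs_b` counts none.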
Single-Cell Transcriptomics
Single-cell transcriptomics has been used to study cancer stem cells, metastasis-initiating cells, chemotherapy resistance, and cancer immune responses.
Single-Cell Epigenomics
Many epigenetic processes (including DNA methylation, histone modifications, and chromatin accessibility) become dysregulated in cancer, and this fact has been exploited in various clinical applications. Single-cell epigenomics can reveal the regulatory processes that lead to transcriptional heterogeneity in cancer, with important clinical implications. Single-cell DNA methylation analysis has also been applied to characterize circulating tumor cells and response to epigenetic therapies.
Single-Cell Multiomics
It is also possible to combine analysis of the genome, transcriptome, epigenome, and other modalities using single-cell multiomic analyses. These combinatorial approaches allow genetic regulation to be studied in incredible detail and were named the 2019 “Method of the Year” by Nature Methods.
Single-Cell and CRISPR
While the human genome was sequenced 20 years ago, we still don’t know the cellular function of most genes. Single-cell CRISPR screens are a great way to cluster genetic perturbations by phenotypes like differentiation, chromosomal instability, the cell cycle, retrovirus activation, alternative-polyadenylation, etc., not to mention the potential for combining scRNA-seq with other phenotypes like imaging or protein measurements. By understanding mechanisms, it should become easier to rationally target multiple genetic dependencies in cancer.
Spatial Transcriptomics
Spatial single-cell transcriptomics is the next wave after single-cell analysis and will be particularly useful to labs studying human disease. Spatial transcriptomics was named the 2020 “Method of the Year” by Nature Methods, and it can be performed on tissue sections using barcode arrays that record the coordinates of mRNA molecules in a sample. One of its early applications was in prostate cancer studies.
Spatial techniques can be divided into those that involve gene expression analysis on microdissected tissues and those that involve in situ hybridization, in situ sequencing, in situ capturing, and computational reconstruction of spatial data.
Single-molecule fluorescence in situ hybridization (smFISH) is “the beginning of the hybridization-based approaches” with spatial techniques. In this method, multiple oligonucleotides carry fluorescent labels and bind to an RNA molecule. smFISH yields a quantitative mRNA readout with “a near 100% detection sensitivity”.
Single-Cell Proteomics
Recent studies have used high-sensitivity mass spectrometry to achieve single-cell proteomics, and another report has coupled click chemistry with mass spectrometry to study lipid metabolism in single cells. Thus, it will soon be possible to study cell signaling pathways and altered metabolism in single cells.
Single-cell technologies are providing unique insights into disease biology and treatment response.
The “cost per cell” currently remains prohibitive for routine analysis. However, it is to be expected that these costs will fall with time as they did for bulk sequencing, allowing this technology to eventually be used in routine patient care. Perhaps it does seem impossible, overwhelming, or ambitious to talk of these numbers, but that’s what was said about the human genome project over 20 years ago.
Pushing single-cell sequencing into clinical application is one of the important missions for clinical and translational medicine (CTM), although there still are a large number of challenges to be overcome.
Since being highlighted as the ‘Method of the Year’ in 2013, single-cell technologies have provided us with a powerful tool to dissect the clonal complexity of tumor cells, deconvolute the role of immune cell types in disease mechanisms, and monitor risk and treatment strategies to guide early patient diagnosis. As our capabilities in single-cell sequencing continue to increase, the latest advances in single-cell multi-omics are providing newer ways of integrating single-cell transcriptomics with multiple other molecular measurements in a single experiment.
By Savita Jayaram Ph.D., Sheethal Umesh Nagalakshmi, Anay Limaye, Kushal Suryamohan Ph.D., MedGenome Scientific Affairs
MedGenome provides novel assay and bioinformatics services to analyze multimodal single-cell datasets, such as CITE-seq, which simultaneously interrogates RNA and surface protein expression in single cells via the sequencing of antibody-derived tags (ADTs), and ATAC-seq, which relates transcriptome changes to chromatin accessibility and nucleosome occupancy. Concurrent estimation of both protein and transcript levels opens opportunities to use CITE-seq in various biological areas, for instance, to profile disease heterogeneity, identify rare cell sub-populations and novel subtypes, and explore the mechanisms of host-pathogen interactions. ATAC-seq assays, on the other hand, can be applied to investigate chromatin accessibility signatures in diseases like macular degeneration and in human cancers, map transcription factor binding sites, explore disease-relevant gene regulation, and study the evolutionary divergence of enhancer regions during development.
Additionally, single-cell data can be used to reconstruct lineage trajectory maps that can enhance our understanding of cell-fate transitions and identify putative branch points. Spatial transcriptomics provides extra insights into cellular biology by placing gene expression in its spatial context within the tissue, and can be applied to both FFPE and frozen tissue sections.
We have handled several such ‘multiome’ projects that required customization and optimization of the lab protocols, which helped us better understand the various QC checkpoints from both a wet-lab and an analysis perspective. We have streamlined the appropriate protocols and built robust analysis pipelines incorporating the latest tools and workflows.
Multimodal Analysis Workflow:
Although single-cell transcriptomics has transformed our ability to characterize cell states, deep biological understanding requires advanced workflows such as the one depicted in the schematic below. A key analytical challenge is to integrate these multiple modalities to better understand cellular identity and function.1 Single-cell analysis tools need to accommodate the different levels of resolution and throughput of the different data types in order to comprehensively analyze single cells at the molecular level.
Single Cell CITEseq Analysis:
CITE-seq (Cellular Indexing of Transcriptomes and Epitopes by Sequencing) is a multimodal single-cell phenotyping method that performs RNA sequencing while also gaining quantitative and qualitative information on surface proteins, for which antibodies are available, at the single-cell level.2 CITE-seq uses DNA-barcoded antibodies to convert the detection of proteins into a quantitative and ‘sequenceable’ readout. Antibody-bound oligos act as synthetic transcripts that are captured during most large-scale oligo-dT-based scRNA-seq library preparation protocols (e.g., 10x Genomics, Drop-seq, ddSeq). This allows for immunophenotyping of cells with a potentially limitless number of markers alongside unbiased transcriptome analysis using existing single-cell sequencing approaches. For phenotyping, this method has been shown to be as accurate as flow cytometry, which is considered the gold standard for absolute quantitative measurements. It is currently one of the main methods for evaluating gene expression and protein levels simultaneously in different species. Recently, this method has been successfully applied to understand the ongoing immune response in COVID-19 patients with varying severity, revealing discrete cellular compartments that can be targeted for therapy.3 The single-cell readout of both protein and transcript data at the same time can uncover novel information on protein-RNA correlations, enabling precision health assessments. The higher copy number of protein molecules compared to RNA molecules typically leads to more robust detection of protein features.
The protein data in CITE-seq may therefore represent the most informative modality.2 For data analysis, we leverage the weighted nearest neighbor (WNN) analysis provided by the Satija lab in the Seurat R package, an unsupervised strategy that defines the cellular state based on a weighted combination of both modalities.2 We find that the WNN algorithm recapitulates biological expectations better than separate analysis of each modality, in which certain cell populations can be masked: it allows one data type to compensate for weaknesses in the other, demonstrating the importance of joint analysis. Additionally, this methodology enables interpretation of sources of heterogeneity from single-cell transcriptomic measurements and integration of diverse types of single-cell data.
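The ‘sequenceable’ protein readout described above can be sketched in a few lines of Python. This is only an illustration of the counting idea, not the Seurat workflow or any vendor pipeline: the read format, the antibody barcode panel and the function name are all hypothetical. Each read is assigned to a protein through its antibody barcode, and reads sharing a cell barcode, protein and UMI are counted once.

```python
from collections import defaultdict


def adt_count_matrix(reads, antibody_barcodes):
    """Count unique molecules (UMIs) per cell and protein.

    reads: iterable of (cell_barcode, umi, tag_sequence) tuples.
    antibody_barcodes: dict mapping tag sequence to protein name.
    Both formats are assumptions made for this sketch.
    """
    molecules = defaultdict(set)
    for cell, umi, tag in reads:
        protein = antibody_barcodes.get(tag)
        if protein is None:
            continue  # tag does not match any antibody in the panel
        molecules[(cell, protein)].add(umi)  # PCR duplicates share a UMI

    counts = defaultdict(dict)
    for (cell, protein), umis in molecules.items():
        counts[cell][protein] = len(umis)
    return dict(counts)


panel = {"ACGT": "CD3", "TTGA": "CD19"}  # made-up antibody barcodes
reads = [
    ("cellA", "u1", "ACGT"),
    ("cellA", "u1", "ACGT"),  # duplicate of the same molecule
    ("cellA", "u2", "ACGT"),
    ("cellA", "u3", "TTGA"),
    ("cellB", "u1", "TTGA"),
    ("cellB", "u9", "NNNN"),  # unrecognized tag, ignored
]
adt_counts = adt_count_matrix(reads, panel)
```

The resulting cell-by-protein counts form the protein modality that a joint method such as WNN would then combine with the RNA counts from the same cells.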
Single Cell Multiome (ATACseq) Analysis:
ATAC-seq aims to identify DNA sequences located in open chromatin, i.e., genomic regions whose chromatin is not densely packaged and that can be more easily accessed by proteins than closed chromatin.4 The technique makes use of an optimized hyperactive Tn5 transposase that fragments the genome and tags it with sequencing adapters in regions of open chromatin. The output of the experiment is millions of DNA fragments that can be sequenced and mapped to the genome of origin to identify regions where sequencing reads concentrate and form “peaks”. The hyperactivity of the Tn5 transposase makes the ATAC-seq protocol a simple, time-efficient method that requires 500–50,000 cells. The major steps in ATAC-seq data analysis are (1) quality control and alignment, (2) peak calling, (3) advanced analysis at the level of peaks, motifs, nucleosomes, and TF footprints, and (4) integration with multiomics data to reconstruct regulatory networks.4 scATAC-seq can be applied in multiple settings, including clinical specimens and developmental biology, to study heterogeneous cell populations at single-cell resolution. However, this analysis is particularly challenging, due to both the sparsity of genomic data collected at single-cell resolution and the lack of the interpretable gene markers available in scRNA-seq data. As with CITE-seq, the WNN analysis in Seurat can be applied to ATAC-seq data, where it shows an increased ability to resolve cell states through integrated multimodal clustering. Further, ATAC-seq data analysis uses the Signac package developed by the Satija lab for the analysis of chromatin datasets. The cells are annotated using scSorter, which has been shown to annotate efficiently even when marker genes are expressed at low levels.
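The idea of reads concentrating into “peaks” can be shown with a deliberately simple Python sketch. It bins cut-site positions along one chromosome and merges adjacent bins that reach a fixed read threshold. Real peak callers rely on statistical background models rather than a fixed cutoff, so the function, the bin size and the threshold here are illustrative assumptions only.

```python
def call_peaks(cut_sites, bin_size=100, min_reads=5):
    """Return (start, end, reads) for runs of adjacent bins with >= min_reads."""
    bins = {}
    for position in cut_sites:
        index = position // bin_size
        bins[index] = bins.get(index, 0) + 1

    peaks = []
    current = None  # [first_bin, last_bin, reads] of the peak being extended
    for index in sorted(bins):
        if bins[index] < min_reads:
            continue  # background bin, not part of any peak
        if current is not None and index == current[1] + 1:
            current[1] = index  # extend the peak into the adjacent bin
            current[2] += bins[index]
        else:
            if current is not None:
                peaks.append(current)
            current = [index, index, bins[index]]
    if current is not None:
        peaks.append(current)

    return [
        (first * bin_size, (last + 1) * bin_size, reads)
        for first, last, reads in peaks
    ]


# Made-up cut sites: a dense cluster around 100-300 plus scattered background.
cut_sites = [105, 110, 120, 130, 150, 160, 210, 220, 230, 240, 250, 900, 1500]
peaks = call_peaks(cut_sites)
```

With these inputs the two enriched bins are merged into a single peak spanning positions 100 to 300, and the isolated reads at 900 and 1500 are treated as background.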
Trajectory (Lineage) Analysis:
Trajectory inference has greatly boosted single-cell RNA-seq research by enabling the study of dynamic changes over time, which is vital to the discovery of genes that govern lineages in the trajectory or are differentially expressed between groups. The wealth of information in the transcriptomes of thousands of single cells can provide a snapshot of the dynamic changes at different stages of a transition, which is used to infer complex trajectories. The Monocle3 R package uses the concept of pseudotime to order cells along a lineage based on their distance along a trajectory from its root or progenitor cells. For instance, in the case of blood cell lineages, hematopoietic stem cells can be selected as the root cells. Monocle3 tracks gene expression changes as a function of pseudotime, allowing the trajectory to have a branched structure when there are multiple possible outcomes. It can accurately resolve complicated biological processes and heterogeneous cell populations by learning an explicit principal graph with a machine learning technique called “Reversed Graph Embedding”, followed by clustering.5 Subsequently, one can identify genes that are differentially expressed between different states, such as control and experiment, or along the trajectories as cells transition from one state to another during development, disease or cell differentiation. Alternatively, the velocity graph depicted in Figure B describes cellular trajectories using RNA velocity (velocyto or scVelo), which does not rely on root cells but models the transitions based on the relative abundance of unspliced pre-mRNAs and spliced mature mRNAs.6 Unspliced reads can be readily identified in standard single-cell RNA-seq protocols due to the presence of introns, using the velocyto or loompy/kallisto counting pipelines.
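The pseudotime concept described above, ordering cells by their distance along a trajectory from a chosen root, can be sketched in Python as a shortest-path computation on a graph. This is not Monocle3's implementation: the graph, its edge lengths and the cell-state names are invented, and the principal graph that Monocle3 learns from the data is simply given here as input.

```python
import heapq


def pseudotime(graph, root):
    """Shortest-path distance from the root to every reachable node.

    graph: dict mapping node -> list of (neighbor, edge_length) pairs.
    """
    distance = {root: 0.0}
    queue = [(0.0, root)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > distance[node]:
            continue  # stale queue entry, a shorter path was already found
        for neighbor, length in graph.get(node, []):
            candidate = d + length
            if candidate < distance.get(neighbor, float("inf")):
                distance[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return distance


# Hypothetical branched trajectory rooted at hematopoietic stem cells (HSC).
graph = {
    "HSC": [("progenitor", 1.0)],
    "progenitor": [("HSC", 1.0), ("lymphoid", 1.5), ("myeloid", 2.0)],
    "lymphoid": [("progenitor", 1.5)],
    "myeloid": [("progenitor", 2.0)],
}
times = pseudotime(graph, "HSC")
ordering = sorted(times, key=times.get)
```

The two branches share the path through the progenitor state and then diverge, so cells on either branch receive pseudotime values measured from the same root.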
Spatial Transcriptomics:
Recent developments have sparked growing interest in spatial transcriptomics technologies from various platforms, such as the Visium system from 10x Genomics, which utilizes spotted arrays of mRNA-capturing probes, and Slide-seq, a method developed at Harvard for transferring RNA from tissue sections onto a surface covered in DNA-barcoded beads with known positions. Nature Methods named spatially resolved transcriptomics its Method of the Year 2020.8 This method leverages spatial gene expression to identify genes and delineate neighbourhoods within fresh frozen or FFPE tissue sections. Using this approach, we can detect RNA species enriched in different subcellular compartments, observe distinct cell states corresponding to different cell-cycle phases, and reveal relationships between spatial position and molecular state. Each of these datasets represents an opportunity to understand the principles governing the spatial localization of different genes in different cell types while capturing cellular boundaries (segmentations). Some of these methods use targeted panels, i.e., they profile a pre-selected set of genes. Newer adaptations of single-molecule FISH (smFISH), such as multiplexed error-robust FISH (MERFISH), can achieve near-genome-wide RNA profiling of spatially resolved individual cells with high accuracy and detection efficiency. The Seurat vignette for spatial data analysis uses SCTransform-based normalization, followed by dimensionality reduction and clustering, as for other multimodal datasets. However, in addition to the UMAP embedding, it overlays the clusters on the images of the tissue sections, providing a spatial visualization. It also offers features to zoom in and visualize individual molecules at a higher resolution and, once zoomed in, to visualize individual cell boundaries.
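The principle shared by spotted arrays and barcoded beads, that each barcode has a known position, can be illustrated with a short Python sketch. The barcodes, coordinates and read format below are invented, and real pipelines also handle alignment, UMI deduplication and image registration, all of which are omitted here.

```python
from collections import Counter


def spatial_counts(reads, spot_positions, gene):
    """Map reads of one gene onto array coordinates via their spatial barcode.

    reads: iterable of (spatial_barcode, gene_name) pairs.
    spot_positions: dict mapping spatial barcode -> (row, column).
    """
    counts = Counter()
    for barcode, read_gene in reads:
        if read_gene == gene and barcode in spot_positions:
            counts[spot_positions[barcode]] += 1
    return dict(counts)


# Hypothetical three-spot array and a handful of reads.
spot_positions = {"AAAC": (0, 0), "AAAG": (0, 1), "AACT": (1, 0)}
reads = [
    ("AAAC", "GFAP"),
    ("AAAC", "GFAP"),
    ("AAAG", "GFAP"),
    ("AACT", "SNAP25"),
    ("TTTT", "GFAP"),  # barcode not on the array, ignored
]
gfap_map = spatial_counts(reads, spot_positions, "GFAP")
```

The per-coordinate counts are what a spatial visualization then overlays on the tissue image.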
References
1. Stuart, T. et al. Comprehensive Integration of Single-Cell Data. Cell 177, 1888-1902.e21 (2019).
2. Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573-3587.e29 (2021).
3. Cambridge Institute of Therapeutic Immunology and Infectious Disease-National Institute of Health Research (CITIID-NIHR) COVID-19 BioResource Collaboration et al. Single-cell multi-omics analysis of the immune response in COVID-19. Nat. Med. 27, 904–916 (2021).
4. Yan, F., Powell, D. R., Curtis, D. J. & Wong, N. C. From reads to insight: a hitchhiker’s guide to ATAC-seq data analysis. Genome Biol. 21, 22 (2020).
6. Bergen, V., Soldatov, R. A., Kharchenko, P. V. & Theis, F. J. RNA velocity—current challenges and future perspectives. Mol. Syst. Biol.17, e10282 (2021).
7. Kulkarni, A., Anderson, A. G., Merullo, D. P. & Konopka, G. Beyond bulk: A review of single cell transcriptomics methodologies and applications. Curr. Opin. Biotechnol.58, 129–136 (2019).
8. Marx, V. Method of the Year: spatially resolved transcriptomics. Nat. Methods18, 9–14 (2021).
Recent advances in next-generation sequencing technologies have heralded a paradigm shift in precision oncology and personalized/genomic medicine, with a large number of somatic and germline mutation-profiling programs worldwide. These programs have paved the way for personalized medicine, in contrast to a uniform approach that fails in some individuals and benefits only a subset of patients. While these genomic analyses are becoming increasingly accessible and almost commonplace, research scientists, clinicians and molecular geneticists still face the challenging task of interpreting and translating their results.
By Savita Jayaram Ph.D., Kushal Suryamohan Ph.D., MedGenome Scientific Affairs
Owing to the inherent heterogeneity and complexity of solid and liquid tumors and inherited cancers, newer bioinformatics tools are required to interpret and prioritize variants across a broad spectrum of genomic alterations, including SNPs, indels, copy number variations (CNVs), translocations, gene fusions, and splice variants. Recently, MedGenome Labs launched the AI-enabled VarMiner pipeline to detect actionable genetic variants in rare and inherited cancers. It is powered by internally benchmarked tools and databases and is designed to pinpoint these changes accurately and efficiently. Identifying causal variants is like finding a needle in a haystack, but it is essential for improving the prediction rate while maintaining high specificity.
Custom Tumor Panels
Further, to simplify variant identification and maximize the utility of these analyses, MedGenome offers tumor panels such as TST170 and TSO500. These panels enable fast and seamless reporting, ranging from blood-based markers from circulating tumor cells (from liquid biopsies) to high-depth sequencing for tumor mutational burden, and provide insights into potential treatment options for patients in both research and clinical settings. TruSight Oncology 500 (TSO500) analyzes multiple variant types across 523 genes in a single assay, enabling comprehensive genomic profiling of tumor samples. The assay identifies the relevant DNA and RNA variants in different types of solid tumors, including sarcomas and lung, melanoma, ovarian, breast, gastric, and bladder cancers. It also accurately measures immuno-oncology biomarkers such as microsatellite instability (MSI) and tumor mutational burden (TMB). Our TST170 panel has been validated on circulating tumor DNA (ctDNA), providing an in-depth view into cancer genetics.
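Tumor mutational burden is usually reported as the number of somatic mutations per megabase of sequenced territory. The short Python sketch below shows only that arithmetic. The function name, the mutation count and the 2 Mb panel size are hypothetical illustration values, and which variants are counted (for example, whether synonymous changes are included) differs between assays and is not modelled here.

```python
def tumor_mutational_burden(n_somatic_mutations, panel_size_bp):
    """Return TMB as somatic mutations per megabase of covered sequence."""
    if panel_size_bp <= 0:
        raise ValueError("panel_size_bp must be positive")
    megabases = panel_size_bp / 1_000_000
    return n_somatic_mutations / megabases


# Hypothetical example: 20 eligible somatic mutations over a 2 Mb panel.
tmb = tumor_mutational_burden(20, 2_000_000)
```

Normalizing by the covered territory is what makes TMB values comparable between panels of different sizes.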
Neo-antigen prediction
OncoPeptVAC™
One of the greatest achievements in cancer therapy in the past decade has been the introduction of immunotherapy drugs such as Nivolumab and Ipilimumab, which target the immune checkpoints PD-1 or PD-L1 and/or CTLA-4 and have significantly improved clinical outcomes. Notwithstanding a high overall response rate to these drugs, long-term benefit is realized by only a small fraction of treated patients. Additionally, a potential downside of antibody-based drugs, including bispecific antibodies, and of chimeric antigen receptor (CAR)-T cells is that they can themselves be immunogenic, inducing anti-drug antibodies during treatment.1 This led to the advent of personalized neoantigen-based cancer therapies and adoptive T cell therapies, which have been shown to prime host immunity against cancer. Despite their growing popularity, cancer vaccines have had only modest success. One of the key impediments to developing effective cancer vaccines has been the difficulty of selecting ideal neoantigen candidates. Neoantigens are predicted from tumor-specific mutations, derived from gene fusions, frameshifts, splice variants or other aberrations, that sufficiently distinguish them from self-antigens. Neoantigen prediction helps to identify 9-15mer neoepitope candidates for vaccine development that can elicit a strong disease-specific immunogenic response. Several studies have shown that, among the immunodominant epitopes identified for influenza, HIV, SARS-CoV-2 and other viruses, only a handful induced a strong cytolytic CD8 and/or CD4 T-cell response. This necessitated the development of a tool that could accurately predict ideal, testable neoantigen candidates for immunotherapy, with broader applications in oncology therapeutics.
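The first step of any neoepitope search, enumerating the 9-15mer peptides that span a mutated residue, can be sketched in a few lines of Python. This is only an illustration of the windowing step under simple assumptions (a single substituted residue at a known 0-based position); the function name and the toy protein sequence are hypothetical, and the sketch says nothing about how candidates are subsequently scored for HLA presentation or TCR binding.

```python
def candidate_peptides(protein, mut_index, min_len=9, max_len=15):
    """Return every peptide of length min_len..max_len covering mut_index.

    protein   -- mutant protein sequence (one-letter amino acid codes)
    mut_index -- 0-based position of the mutated residue
    """
    if not 0 <= mut_index < len(protein):
        raise ValueError("mut_index is outside the protein sequence")
    peptides = []
    for length in range(min_len, max_len + 1):
        # Leftmost window that still contains the mutated residue.
        first_start = max(0, mut_index - length + 1)
        # Rightmost window that contains it and fits inside the protein.
        last_start = min(mut_index, len(protein) - length)
        for start in range(first_start, last_start + 1):
            peptides.append(protein[start:start + length])
    return peptides


# Toy 20-residue sequence; the residue at index 10 ("M") plays the mutation.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
peptides = candidate_peptides(toy_protein, mut_index=10)
```

Every returned peptide contains the mutated residue, so each one differs from the corresponding self peptide; a prediction tool would then rank these candidates.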
We developed a novel, proprietary and now patented algorithm, OncoPeptVAC™, that can not only accelerate the identification of cancer vaccine candidates but also identify immunogenicity risks of antibody-based drugs.2 The algorithm, driven by machine learning approaches, incorporates features associated with presentation of the antigen on the surface as well as features regulating T cell receptor (TCR) binding of the HLA-peptide complex, and assigns prediction scores for neoepitope prioritization and neoantigen prediction. OncoPeptVAC™ identifies immunogenic peptides from exome as well as RNA-seq data from tumor/normal pairs to predict CD8 T-cell activating epitopes. The pipeline was validated in two different studies. In the first, three mutant peptide antigens selected from Lynch syndrome colorectal cancer patients after neoepitope prioritization with the OncoPeptVAC™ pipeline were shown to induce a potent CD8 T cell response.3 In the second, more recent study, immunodominant T-cell epitopes of the SARS-CoV-2 spike antigen revealed robust pre-existing T-cell immunity in unexposed individuals, contributed by TCRs that recognize common viral antigens such as those of influenza and CMV.4 Interestingly, these viral epitopes lacked sequence identity to the SARS-CoV-2 epitopes. Both studies were published in Scientific Reports, a Nature Portfolio journal. Ongoing studies at MedGenome have examined the effects of peptide length and peptide dosage on CD8 T-cell activation. The immune responses to 9mer and 15mer versions of the HLA-A2-restricted ‘GILGFVFTL’ epitope were compared to determine which made the better vaccine candidate, using CDR3 expansion as a measure of the diversity of T-cell epitope engagement.5 The 15mer epitope produced a more robust and sustained response and expanded private CDR3s that the 9mer peptide did not.
Together, these studies show the potential utility of our pipeline in accurately predicting prototypical immunodominant vaccine candidates, which can be further screened using our proprietary OncoPeptSCRN™ T-cell assay platform described below.
MedGenome’s OncoPeptSCRN™
Therapeutic revival of tumor-specific exhausted T cells using neutralizing antibodies that target the immune checkpoints cytotoxic T-lymphocyte-associated protein 4 (CTLA-4) and programmed cell death protein 1 (PD-1) has significantly improved clinical outcomes in cancer. T cells exist in a wide spectrum of functional states, from fully functional at one end to fully dysfunctional at the other. One of the factors governing the response of a tumor to checkpoint inhibitors is the ratio of functional to dysfunctional T cells, which in turn is modulated by a wide array of immune-suppressive signals present within the tumor microenvironment. Tumors that are immunologically ‘hot’ are characterized by high infiltration of activated T cells that also express PD-1 and CTLA-4. These inhibitory receptors evolved to prevent overactivation of the immune system, but cancer cells hijack this mechanism by expressing the corresponding ligands, driving the T cells to exhaustion.5 Blocking these checkpoints can reverse the dysfunctional state and reinvigorate the immune response, but only if the T cells are in a ‘partially exhausted’ state. This is the basis for the development of immunotherapy drugs such as Nivolumab (anti-PD-1) and Ipilimumab (anti-CTLA-4), which can rescue T cells from exhaustion.
The OncoPeptSCRN™ T-cell Activation Assay leverages single-cell transcriptomics on the 10X Genomics platform to assess the immunogenicity of HLA-peptide pairs. The cancer peptides/antigens selected using our OncoPeptVAC™ platform are either expressed as a minigene or added exogenously. They are naturally processed and presented to the T cells, which are subsequently profiled using 10X single-cell RNA-seq and 10X single-cell TCR-seq experiments. This method was used to identify T cell functional states following antigen stimulation in an ex vivo T-cell activation system. We identified functional gene clusters and molecular networks that are unique to the exhausted state of CD8+ T cells. The combined expression of our T cell exhaustion gene signature correlated with poor prognosis when applied to TCGA data from almost 1400 tumors across many different cancers. Furthermore, the molecular pathways identified in this study provide opportunities to develop novel therapeutic interventions that specifically target dysfunctional T cells in cancers, thereby enhancing the efficacy of checkpoint inhibitors.6
References
1. Davda, J. et al. Immunogenicity of immunomodulatory, antibody-based, oncology therapeutics. J. Immunother. Cancer 7, 105 (2019).
2. OncoPeptVAC: A robust TCR binding algorithm to prioritize neoepitope using tumor mutation (DNAseq) and gene expression (RNAseq) data. 31, 223–223 (2017).
3. Majumder, S. et al. A cancer vaccine approach for personalized treatment of Lynch Syndrome. Sci. Rep. 8, 12122 (2018).
4. Mahajan, S. et al. Immunodominant T-cell epitopes from the SARS-CoV-2 spike antigen reveal robust pre-existing T-cell immunity in unexposed individuals. Sci. Rep. 11, 13164 (2021).
According to the American Cancer Society, an estimated 1.9 million new cancer cases will be diagnosed in the United States in 2022 [1]. Some of the major cancer types affecting the population are prostate, lung & bronchus, colon & rectum, urinary bladder, melanoma of the skin, kidney & renal pelvis, non-Hodgkin lymphoma, oral cavity & pharynx, leukemia, pancreas, breast, uterine corpus, and thyroid.
By Vinay CG, Derek Vargas and Kushal Suryamohan, MedGenome Scientific Affairs
Lung and bronchus cancer in both men and women (21%), prostate cancer in men (11%) and breast cancer in women (31%) are among the leading causes of cancer death in the population [1]. Even though our understanding of cancer has broadened over the years, it remains a major challenge across the globe. Widely accepted approaches to cancer management include biomarker identification and testing to guide treatment, chemotherapy, hormone therapy, immunotherapy, photodynamic therapy, radiation therapy, stem cell transplant, surgery, and targeted therapy. Immunotherapy (Table 1) is emerging as a forerunner among cancer therapies because it takes into account the dynamics of immune function in the individual. Genomics has played a key role in enabling the identification of therapeutically actionable targets and in guiding the use of immunotherapy.
The magic of immunotherapy – proven right again
Recently, a team of doctors at Memorial Sloan Kettering Cancer Center published in the New England Journal of Medicine the results of a trial of Dostarlimab, an immunotherapy drug, which proved effective in a small group of 14 patients with rectal cancer, who went into complete remission [2,3]. Dostarlimab belongs to a class of drugs called checkpoint inhibitors; specifically, it is a programmed death 1 (PD-1) blocker. Mismatch repair (MMR)-deficient colorectal cancer was found to respond well to PD-1 blockade in this trial, which explains the success. PD-1 prevents T cells from killing cancer cells, so blocking PD-1 activates the T-cell machinery, which can then effectively kill the cancerous cells.
Mismatch repair deficiency (MMRd) and microsatellite instability (MSI) [4] are additional factors linked with a higher chance of developing cancer. MMR deficiency is common in colorectal, gastrointestinal, and endometrial cancers. Identifying tumor cells with MMR deficiency can be very useful in determining the course of treatment. The presence of MSI has also been identified as a predictor of response to immune-checkpoint inhibition, leading the FDA to approve the anti-programmed cell death protein 1 (PD-1) antibody pembrolizumab for use in patients with MSI-high solid tumours regardless of histology or anatomical location.
Table 1: Types of Immunotherapies [5]
Immunotherapy Type: Mode of Action
Monoclonal Antibodies (mAbs): Specially designed proteins that target antigens or markers present on cancer cells.
Checkpoint Inhibitor Drugs: Common targets of these drugs are CTLA-4 and PD-1/PD-L1. Checkpoint inhibitors release the brakes on the immune system, allowing T cells to attack cancer cells more efficiently.
Cancer Vaccines: They trigger an immune response that identifies and attacks certain markers or antigens present on cancer cells.
Oncolytic Virus Immunotherapy: Oncolytic viruses are genetically modified viruses that can attack cancer cells directly. They are often combined with other types of immunotherapy, such as a cancer vaccine or mAb therapy.
Adoptive T Cell Transfer: An anti-cancer approach in which immune cells are made more effective at tackling cancer. One such approach is to add chimeric antigen receptors (CARs) to T cells in the lab and reinfuse them into the patient. The CAR T cells can then identify cancer cells and kill them.
Cytokines: They aid in the control and growth of immune cells.
Adjuvant Immunotherapy: Methodologies in which ligands are used to boost the immune response.
How do MedGenome’s sequencing solutions support immunotherapy?
MedGenome has a primary focus on tackling immunotherapy challenges through various types of sequencing solutions:
OncoPeptTUME: A proprietary platform that interrogates RNA-Seq data sets to produce high-resolution maps of the tumor microenvironment using proprietary cell-type-specific gene expression signatures. It can be customized to fit cancer immunotherapy project needs and tailored for preclinical and clinical settings.
TCR Sequencing: Using next-generation sequencing, we offer deeper insights into CDR3 repertoire diversity, clonal composition, the potential antigenic recognition spectrum, and the quantity of antigen-specific T-cell responses, all of which can be very useful in selecting the right immunotherapy for patients.
BCR Sequencing: Our BCR sequencing solutions offer wider insights into B-cell differentiation, BCR somatic hypermutation, class switching, and antigen specificity.
HitMab (High-throughput monoclonal antibody discovery): MedGenome’s HitMab platform accelerates the drug discovery process through single-cell BCR sequencing, which provides crucial information on antibody heavy and light chains with greater specificity, enables antibody generation for low or poorly immunogenic proteins, and offers distinct advantages over current hybridoma technology.
TruSight Pan Cancer Targeted Panels (TSO500/TST170): These panels offer a distinct advantage in identifying immuno-oncology biomarkers such as microsatellite instability (MSI) and tumor mutational burden (TMB). They also help in assessing fusions, splice variants, insertions/deletions, single-nucleotide variants (SNVs), and amplifications.
MedGenome was part of a collaborative study aimed at identifying key actionable targets and potential immunotherapy strategies for gallbladder cancer (GBC) [6]. Genomic analysis of 167 gallbladder cancer samples revealed mutated GBC genes, including several targetable driver genes such as ERBB2, ERBB3, KRAS, PIK3CA, and BRAF. There is no approved line of immunotherapy for gallbladder cancer; against this background, our efforts identified neoantigens from several mutated GBC genes, including ELF3, ERBB2, and TP53. Laboratory validation showed T-cell activation, indicating that these neoantigens are potential cancer vaccine candidates. Additionally, some of the samples in this study had MSI, which could make those tumors candidates for checkpoint inhibitor therapy.
Want to know more about our unique Cancer Immunotherapy Solutions?
Get in touch with our experienced and seasoned scientific team to understand how our unique cancer immunotherapy solutions can provide deeper insights to your research projects. You can also email us at research@medgenome.com for any queries and further details.
6. Pandey, A., Stawiski, E.W., Durinck, S. et al. Integrated genomic analysis reveals mutated ELF3 as a potential gallbladder cancer vaccine candidate. Nat Commun 11, 4225 (2020). https://doi.org/10.1038/s41467-020-17880-4
Next-generation sequencing techniques have seen remarkable growth in the past two decades. Their scope has broadened, and it is now possible to look at a genome at both macro and micro levels. Single-cell RNA sequencing (scRNA-seq) is one such technique, which examines the transcriptome at the level of individual cells and can provide unparalleled insights into cellular events. scRNA-seq has an advantage over bulk RNA-seq studies because it provides higher resolution of cell-subset diversity and individual cell heterogeneity in an organism.
By Vinay CG, Derek Vargas and Neha Varma, MedGenome Scientific Affairs
Many scRNA-seq techniques are available [1], including the widely used Smart-seq2, MARS-seq, 10X Genomics, BD Rhapsody, sci-RNA-seq, inDrop and Seq-Well (Figure 1); however, all of them follow a similar approach.
scRNA-seq has several challenges, one being high cell-to-cell variability owing to both technical and biological noise (gene expression, cellular states, cell sizes and cell-cycle state) [2]. Even so, it is an effective technique for revealing complex cellular events that can aid medical research. Researchers can now obtain multi-omics profiles combining genome, epigenome and transcriptome or protein information from the same single cell.
Because it provides a deeper understanding of the nature of immune cells, scRNA-seq can be used to:
1. Profile the immune response to pathogens/infections
2. Evaluate vaccines
3. Tailor personalized cancer vaccines
4. Interrogate pathogen and host transcriptomes
5. Perform BCR analyses for mAb development
Profiling the immune response to pathogens/infections
With scRNA-seq, it is now possible to understand the molecular details of host-pathogen interactions in greater depth. Understanding the inflammatory factors triggered by immune cells in response to pathogen invasion can shed light on disease pathogenesis [3]. Useful insights can come from inflammatory response analysis, identification of genes differentially expressed during infection, identification of susceptible cell types, analysis of infection dynamics, and study of the immune repertoire [3].
Evaluation of Vaccines
scRNA-seq can be used to compare vaccine regimens and responses. The immunogenicity of a vaccine is determined by several factors, such as the vaccine antigen, the vaccine platform and the adjuvant, and scRNA-seq can measure the effects of these factors more specifically and efficiently [1]. In addition, curating and characterizing single-cell datasets of known pathogens at various stages of infection can help in identifying the right vaccine targets [1]. scRNA-seq captures crucial information about the cell types susceptible to infection, which helps in developing intervention strategies. It can also aid antigen screening and selection by providing clear insight into the immune responses generated by vaccines with different antigenic makeups.
Tailor Personalised Cancer Vaccines
scRNA-seq can be used extensively in neoepitope selection and vaccine workflows. Parameters such as growth inhibition, antibody selection and cytokine secretion, together with a comprehensive analysis of scRNA-seq datasets, can provide deeper insights into vaccine responses. In particular, when anti-cancer drugs are used to target tumor cells, some of the cells develop resistance to them. scRNA-seq can provide information on the cellular subsets that could be responsible for tumor recurrence, and on the mutations and pathways that may drive tumor growth [4]. Detailed immune cell maps of multiple immunophenotypes in the tumor microenvironment have been drawn with this sequencing method, deepening our understanding of tumor cell heterogeneity. These insights can be useful in identifying novel biomarkers, developing better immune responses and tailoring personalized vaccines.
Interrogate pathogen and host transcriptomes
Studying host-pathogen interactions after the encounter can provide invaluable insights into outcomes. scRNA-seq can help us understand which host cell types are infected and which are able to respond to pathogens [5]. Paired dual scRNA-seq is emerging as a novel technique for understanding the molecular pathways that give the host a distinct advantage over the pathogen. Another advantage of scRNA-seq over bulk studies is its high resolution in identifying signals in rare cell populations during an infection, which can help in predicting the response to a pathogen, understanding disease states and identifying the correct biomarkers.
BCR analyses for mAb Development
scRNA-seq can provide valuable insights into full-length heavy and light chain sequences in B cells. Somatic hypermutation (SHM) and class switching are key determinants of antibody diversity. Since antibody discovery is important for early-stage research studies, diagnostics and therapeutics, at MedGenome we have streamlined antibody discovery using high-throughput single B cell receptor sequencing and a recently licensed proprietary platform, HiTMab* (High-throughput monoclonal antibody discovery). HitMab uses high-throughput single-cell B-cell receptor sequencing (scBCR-seq) to obtain accurately paired full-length variable regions in a massively parallel fashion. Our scBCR-seq is not restricted to mice and humans but extends to custom species, including horse and rat.
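What "accurately paired" means in single-cell BCR data can be shown with a small Python sketch: heavy and light chain contigs that share a cell barcode are assumed to come from the same B cell. This is a generic illustration and not the HitMab workflow. The function name, barcodes, chain labels ("H" for heavy, "L" for kappa or lambda light) and placeholder sequences are all hypothetical, and the rule of keeping only cells with exactly one chain of each kind is a simplifying assumption.

```python
from collections import defaultdict


def pair_chains(contigs):
    """Pair heavy and light chain sequences that share a cell barcode.

    contigs -- iterable of (cell_barcode, chain, sequence), where chain is
               "H" (heavy) or "L" (light); other labels are ignored.
    Returns {cell_barcode: (heavy_sequence, light_sequence)} for cells with
    exactly one heavy and one light chain.
    """
    by_cell = defaultdict(lambda: {"H": [], "L": []})
    for barcode, chain, sequence in contigs:
        if chain in ("H", "L"):
            by_cell[barcode][chain].append(sequence)

    paired = {}
    for barcode, chains in by_cell.items():
        # Cells with a missing or extra chain are ambiguous and are dropped.
        if len(chains["H"]) == 1 and len(chains["L"]) == 1:
            paired[barcode] = (chains["H"][0], chains["L"][0])
    return paired


# Placeholder sequences stand in for full-length variable regions.
contigs = [
    ("cell1", "H", "HEAVY_A"), ("cell1", "L", "LIGHT_A"),
    ("cell2", "H", "HEAVY_B"),                      # no light chain
    ("cell3", "H", "HEAVY_C"), ("cell3", "H", "HEAVY_D"),
    ("cell3", "L", "LIGHT_C"),                      # two heavy chains
]
pairs = pair_chains(contigs)
```

Bulk sequencing loses this barcode link, which is why native heavy and light chain pairing requires single-cell methods.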
Want to know more about our exciting Single-cell RNA sequencing (scRNA-seq) Solutions?
Get in touch with our experienced and seasoned scientific team to understand how our unique single-cell RNA sequencing (scRNA-seq) solutions can provide deeper insights to your research projects. You can also email us at research@medgenome.com for any queries and further details.
References
1. The Application of Single-Cell RNA Sequencing in Vaccinology, Journal of Immunology Research, Volume 2020, Article ID 8624963
2. Hedlund E, Deng Q, Single-cell RNA sequencing: Technical advancements and biological applications, Molecular Aspects of Medicine (2017), http://dx.doi.org/10.1016/j.mam.2017.07.003
3. Geyang Luo, Qian Gao, Shuye Zhang, Bo Yan: Probing infectious disease by single-cell RNA sequencing: Progresses and perspectives, Computational and Structural Biotechnology Journal, Volume 18, 2020, Pages 2962-2971, ISSN 2001-0370, https://doi.org/10.1016/j.csbj.2020.10.016.
4. Li L., Xiong F., Wang, Y. et al. What are the applications of single-cell RNA sequencing in cancer research: a systematic review. J Exp Clin Cancer Res 40, 163 (2021). https://doi.org/10.1186/s13046-021-01955-1
5. Penaranda C, Hung DT. Single-Cell RNA Sequencing to Understand Host-Pathogen Interactions. ACS Infect Dis. 2019 Mar 8;5(3):336-344. doi: 10.1021/acsinfecdis.8b00369. Epub 2019 Jan 31. PMID: 30702856.
The human immune response can be divided into two components: innate and adaptive. The innate immune response is the classic, evolutionarily primitive reaction, mediated through cellular and humoral mechanisms. It is the first line of defence: the cellular response comprises a host of cells such as neutrophils, macrophages, and mast cells, which kill invading pathogens, while the humoral response acts through enzymes such as lysozyme that can kill harmful microorganisms.
By Vinay CG, Associate Director, Content & Communications and MedGenome Scientific Affairs
The most effective component of the immune system is adaptive immunity. Also known as acquired immunity, it is a highly advanced evolutionary system that recognizes pathogens and tailors a specific response towards them. This system essentially involves lymphocytes, each of which expresses a receptor on its surface that can specifically bind to a particular antigen.
Adaptive immunity, too, has two arms: the cellular and the humoral. The cellular arm comprises T lymphocytes (T cells), which help eliminate pathogens through various mechanisms, while the humoral arm involves a subset of lymphocytes called B lymphocytes (B cells).
T cells recognize antigen through the T-cell antigen receptor (TCR), while B cells recognize intact antigens through immunoglobulins (antibodies). B cells also have receptors, termed B-cell antigen receptors (BCRs).
Profiling TCRs and BCRs (Figure 1) holds great promise for understanding mechanisms that can inform the development of new therapeutics and address critical research questions in translational immunology, immunotherapy, autoimmune disorders, and transplant research [1,2].
TCR Profiling
TCRs are heterodimers of α/β (TCR2) or γ/δ (TCR1) chains [3]. These chains are encoded by four types of gene segments, namely V (variable), D (diversity), J (joining) and C (constant). A typical T cell expresses either an α/β or a γ/δ receptor [4]. Similarly, BCRs are made up of heavy and light chains [4].
TCRs are highly diverse owing to extensive recombination between different V, D, and J gene segments. These rearrangements can produce an incredibly large spectrum (10^9 to 10^10 sequences) of complementarity-determining region 3 (CDR3) [1], the region that is critical for binding to a specific antigen. When a T cell is activated by binding its antigen, it expands clonally, so T cells sharing the same CDR3 sequence (a clonotype) become selectively enriched in the repertoire. With effective NGS-based T cell receptor sequencing methodologies and assays, it is now possible to identify all such clonotypes in a diverse TCR repertoire. This allows researchers to analyze the TCR repertoire in patients and helps in predicting treatment outcomes.
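Clonal expansion shows up in sequencing data as many reads carrying the same CDR3. The Python sketch below counts clonotypes from a list of CDR3 amino acid sequences and reports each one's share of the repertoire. It is a simplified illustration: the function name and the CDR3 strings are made up, and defining a clonotype by the CDR3 amino acid sequence alone is an assumption (analysis tools often also consider the V and J genes or the nucleotide sequence).

```python
from collections import Counter


def clonotype_frequencies(cdr3_sequences):
    """Return (cdr3, count, fraction) tuples, most abundant clonotype first."""
    counts = Counter(cdr3_sequences)
    total = sum(counts.values())
    return [(cdr3, n, n / total) for cdr3, n in counts.most_common()]


# Made-up CDR3 sequences, one per sequenced receptor.
observed = ["CASSLGQ", "CASSLGQ", "CASSPDR", "CASSLGQ"]
table = clonotype_frequencies(observed)
```

A clonotype whose fraction rises between two time points, for example before and after therapy, is a candidate for an antigen-driven expansion.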
TCR repertoire sequencing can provide deeper understanding of the clonal diversity, richness and evenness of the T Cell clonal population which in turn can help in identifying biomarkers for better immunotherapy responses [5].
At MedGenome, we have standardized methodologies for bulk TCR sequencing and have built expertise in working with diverse input types to obtain TCR α/β and γ/δ clonotypes from bulk inputs (cells, RNA and FFPE tissues) and TCR α/β clonotypes from single-cell inputs [1]. Additionally, our bioinformatics analysis pipeline provides broader insights such as full-length clonotype sequences, V-J usage summaries, CDR3 length distribution, and shared clonotype analysis (Figure 2).
Starting with total RNA, either obtained from the client or prepared in-house, sample QC is performed using Agilent’s TapeStation or Fragment Analyzer. Samples that meet the QC criteria are processed using the SMARTer TCR α/β Profiling kit or a modified protocol for γ/δ TCR profiling. After library preparation, QC is again performed using Agilent’s TapeStation or Fragment Analyzer, and sequencing is performed on an Illumina platform. The data generated are demultiplexed and checked with FastQC; after trimming, MiXCR software is used to align the reads and assemble the final CDR3 and full-length clonotypes. Advanced analysis outputs can also be generated to compare V-J gene usage across samples and to determine Shannon’s diversity and frequency changes in the repertoire across samples.
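Shannon’s diversity is computed from clonotype counts as H = -Σ p_i ln(p_i), where p_i is the fraction of the repertoire occupied by clonotype i. Dividing H by ln(S), with S the number of distinct clonotypes (the richness), gives Pielou’s evenness, which is 1 for a perfectly even repertoire. The Python sketch below implements these two textbook formulas; the function names and the example counts are illustrative assumptions and are not taken from the MedGenome pipeline.

```python
import math


def shannon_diversity(clone_counts):
    """Shannon index H = -sum(p * ln p) over clonotypes with non-zero counts."""
    total = sum(clone_counts)
    if total == 0:
        return 0.0
    h = 0.0
    for n in clone_counts:
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h


def pielou_evenness(clone_counts):
    """Evenness J = H / ln(S), where S is the number of observed clonotypes."""
    richness = sum(1 for n in clone_counts if n > 0)
    if richness < 2:
        return 0.0  # evenness is not meaningful for fewer than two clonotypes
    return shannon_diversity(clone_counts) / math.log(richness)


# Hypothetical repertoires with the same richness but different clonality.
even_repertoire = [25, 25, 25, 25]
skewed_repertoire = [97, 1, 1, 1]
```

Both example repertoires contain four clonotypes, but the skewed one, dominated by a single expanded clone, has lower diversity and evenness.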
BCR Profiling
B cells also play an important role in the immune response. When they malfunction, however, they cause several B-cell-mediated diseases, including cancers, most notably B-cell malignancies such as non-Hodgkin’s lymphoma and Hodgkin’s lymphoma, and autoimmune diseases such as multiple sclerosis, rheumatoid arthritis and systemic lupus erythematosus [6,7]. As a B cell matures, its immunoglobulin genes undergo several rounds of recombination, including class switch recombination (CSR), which diversifies the B-cell repertoire [8]. B-cell receptors can diversify further after binding antigen. V(D)J recombination, class switch recombination, and somatic hypermutation all involve DNA breaks, which can trigger cancer development [8].
Therefore, BCR sequencing gives insight into the interplay between B-cell differentiation, BCR somatic hypermutation, class switching, and antigen specificity.
Major Applications of BCR Repertoire Profiling
At MedGenome, we have also developed and standardized HitMab (High-throughput antibody discovery using single cell sequencing), a streamlined workflow for antibody discovery using high-throughput single-cell B-cell receptor sequencing (scBCR-seq). HitMab allows us to obtain accurately paired full-length variable regions in a massively parallel fashion. With HitMab’s single-cell BCR sequencing, researchers can discover thousands of antibodies in much less time than with traditional hybridoma technology.
Want to know more about our exciting TCR/BCR Solutions? Get in touch with our experienced and seasoned scientific team to understand how our unique immune repertoire solutions can provide deeper insights to your research projects. You can also email us at research@medgenome.com for any queries and further details.
6. Bite-sized immunology by British Society of Immunology.
7. Bashford-Rogers, R.J.M., Bergamaschi, L., McKinney, E.F. et al. Analysis of the B cell receptor repertoire in six immune-mediated diseases. Nature 574, 122–126 (2019). https://doi.org/10.1038/s41586-019-1595-3
By MedGenome Scientific Affairs
Single-cell genomic analysis has emerged as a powerful method for studying complex disease. By providing comprehensive analyses of individual cells, single-cell sequencing allows researchers to examine cellular heterogeneity, which is especially useful in oncology, neurology, immunology, and developmental research.
Because single-cell sequencing analyzes individual cells in depth, scientists can obtain unbiased cell analyses. According to Lei et al., “Single-cell sequencing significantly outperforms previous sequencing technologies in terms of our understanding of the human biology of embryonic cells, intracranial neurons, malignant tumor cells and immune cells because it can probe cellular and microenvironmental heterogeneity at single-cell resolution.”1
Single-cell genomic sequencing is a relatively new technique, and the technologies behind it continue to improve; in turn, scientists are developing new approaches and discovering novel uses. While technological limitations remain, single-cell genomic analysis promises to advance understanding of cell types to inform novel therapies.
The next milestone in this field was single-nucleus RNA sequencing (snRNA-seq), which extended single-cell transcriptomic analyses to human diseases for which live tissue is difficult to obtain. One of the first studies, conducted by Lake et al., involved single-cell analysis of molecular pathology in the brain of patients with autism spectrum disorder (ASD).9
What is Single-Cell Genomic Sequencing?
Scientists use single-cell genomics to study the function and properties of individual cells.2 Whereas conventional sequencing of tissue samples produces an average signal across many cells, single-cell genomics drills down to the single-cell level.3 This level of specificity allows scientists to study cell-to-cell variation and identify rare cells that play a role in disease progression. With single-cell genomics, researchers can:
Characterize and identify heterogeneous cell populations
Discover new cell markers and regulatory pathways
Uncover novel cell types, cell states and rare cell types
Reconstruct developmental hierarchies and reveal lineage relationships
Single-Cell Genomic Analysis Techniques
In its early days (a little over ten years ago), single-cell sequencing caused a stir in the scientific community, but its high cost made it impractical to use in most situations. Technology has advanced, however, enabling high-throughput single-cell sequencing via multiple profiling strategies.
In addition to single-cell RNA sequencing (RNA-seq), other sequencing platforms and methods allow scientists to capture information on cell-surface proteins, chromatin state, genetic perturbations, and genome data.5 Each method expands on RNA data to provide a unique perspective on cell state and identity.
The Single-Cell Sequencing Protocol
Preparing samples for processing and capturing individual cells is a complex process that involves four primary steps:6
Isolation of single cells from a cell population
Extraction, processing, and amplification of the genetic material of each isolated cell
Preparation of a “sequencing library” including the genetic material of an isolated cell
Sequencing of the library using a next-generation sequencer
To successfully generate single-cell libraries from diverse starting material, including a range of tissue types and cells, MedGenome has integrated several validated single-cell isolation and library preparation platforms, such as Miltenyi tissue dissociation.
Single-Cell Genomics Applications
Because immune cells have several distinct functions, single-cell sequencing is a valuable tool for understanding the immune system and identifying new targets for treatment.4 Researchers have used the technique to identify subpopulations of spleen and blood natural killer (NK) cells in humans and mice, contributing to translational research. Researchers have also used single-cell RNA-seq to study highly heterogeneous immune cells, helping them better understand why the immune system weakens with age.
Researchers at Duke University recently used single-cell sequencing to study cerebral cavernous malformations (CCMs), a blood vessel abnormality that can lead to brain hemorrhage.7 The technique helped them answer questions about CCM pathogenesis.
Single-cell genomic techniques are becoming especially valuable to oncology researchers, allowing them to better understand treatment response and resistance, as well as inform diagnosis and monitoring.8 As a study published in Biomolecules concludes, “the molecular characteristics of each cellular component and its interconnections—either promoting or inhibiting tumor growth—are all points that can be leveraged during therapeutic development. Therefore, single-cell genomics and the related multi-omics technologies are exploratory tools that far exceed the scope and effectiveness of preceding bulk genomic analyses.” 8
Partner with a Single-Cell Genomics Solutions Expert
Whether you’re studying tumor cell response or immune cell functions, single-cell genomics is a complex but powerful method for understanding cellular structure, behavior, and heterogeneity. Get in touch with MedGenome to explore various single-cell sequencing solutions and our abilities to design and support your experiment.
References
1 Lei, Y., Tang, R., Xu, J. et al. Applications of single-cell sequencing in cancer research: progress and perspectives. J Hematol Oncol 14, 91 (2021). https://doi.org/10.1186/s13045-021-01105-2
3 Wang Q, Yang KL, Zhang Z, et al. Characterization of Global Research Trends and Prospects on Single-Cell Sequencing Technology: Bibliometric Analysis. J Med Internet Res. 2021;23(8):e25789. doi:10.2196/25789
4 Tang X, Huang Y, Lei J, Luo H, Zhu X. The single-cell sequencing: new developments and medical applications. Cell Biosci. (2019);9:53. doi:10.1186/s13578-019-0314-y
7 Snellings DA, Girard R, Lightle R, et al. Developmental venous anomalies are a genetic primer for cerebral cavernous malformations. Nat Cardiovasc Res. (2022);1:246-252. doi:10.1038/s44161-022-00035-7
8 Kim N, Eum HH, Lee HO. Clinical Perspectives of Single-Cell RNA Sequencing. Biomolecules. 2021;11(8):1161. doi:10.3390/biom11081161
9 Lake BB, Ai R, Kaeser GE, Salathia NS, et al. Neuronal subtypes and diversity revealed by single-nucleus RNA sequencing of the human brain. Science. 2016;352(6293):1586-1590. doi:10.1126/science.aaf1204
By Aditya Pai, VP Corporate and Business Development, MedGenome Inc, USA.
February 28th is Rare Disease Day. It is a day where the realities of Rare Diseases need to be highlighted for all health industry stakeholders; to celebrate the progress that has been made as well as to inspire us for the challenges that lie ahead.
Rare diseases are defined as those conditions that affect fewer than 1 in 1,200 people. More than 300 million people globally are affected by a rare disease1,2. Patients and families with rare diseases are one of the most underserved communities in medicine today. There are more than 7,000 documented rare diseases, yet for most, a cure or treatment is not available3. Only 5% of rare diseases have an approved treatment. Most rare diseases are misdiagnosed or remain undiagnosed, resulting in complex testing and clinical odysseys. This is often due to the variability in symptoms from disease to disease and also, for the same disease, from person to person. More than 70% of rare diseases are seen in children, while more than 72% are genetic in origin.
There are three key challenges with rare disease research: 1. Diagnosis 2. Treatment 3. Equity
Diagnosis: Especially in neonates, an undiagnosed rare disease causes heartbreaking emotional trauma to new parents. With the cost of sequencing falling, it is possible in 2022 to perform far more elaborate next generation sequencing (NGS) than even five or ten years ago. Whole genome sequencing (WGS) and whole exome (WES) / clinical exome sequencing (CES) have been shown to detect previously undiagnosed rare diseases with greater sensitivity and specificity. For example, in a 2020 UK study, WGS data for 13,037 participants, of whom 9,802 had a rare disease, were analyzed. A genetic diagnosis was provided to 1,138 of the 7,065 phenotyped participants. 95 Mendelian associations between genes and rare diseases were found, and at least 79 of these were confirmed to be etiological. Further, a 2021 UK study of whole genome sequencing in neurological patients with repeat expansion disorders (e.g., Fragile X syndrome) showed high sensitivity and specificity and led to the identification of neurological repeat expansion disorders in previously undiagnosed patients. The use of NGS, and WGS in particular, has proven invaluable in diagnosis.
Treatment: Rare disease drug development has unique challenges. The difficulty of finding patients for clinical trials, the small number of patients affected, and the heterogeneity of rare diseases and the way they progress make clinical trial design and participation a challenge. Understanding the natural history of a disease can play a vital role in drug development, especially when designing meaningful end points and patient inclusion / exclusion criteria. In the USA, the government has provided various incentives to drug manufacturers. A 2020 CDER report4 showed that of the 53 novel drugs approved in 2020, 58% were for rare diseases, demonstrating the interest of pharmaceutical and biotechnology companies.
Equity: Not everyone can afford whole genome sequencing or any form of genetic testing or treatments that are costly. This can lead to large inequities in diagnosis, treatment and ongoing care globally. The financial and emotional cost to families affected by rare disorders can be enormous as a result of such inequities.
A Call to Action:
More affordable WGS/WES/CES, together with its interpretation and appropriate genetic counseling and care, is key to cutting short long and complex diagnostic odysseys.
Pharmaceutical companies need to continue to adopt global strategies to not only understand the natural history of specific rare diseases, but also include more countries to recruit patients for trials. For example, India and South Asia have the world’s largest population of people affected by rare and inherited diseases. This is highly relevant, considering the large number of rare diseases and the small global pool of potential patients available for clinical trials.
Health policy must account for equitable access to diagnosis, treatment and care for rare disease populations. While these will vary by country, it is essential to start developing policies for rare diseases, orphan drugs, and cost-effective access to them.
Rare Disease Day 2022 represents another opportunity to remind ourselves of the work that remains. At MedGenome, we have formed a strategic alliance with Emmes across six rare disorders with the goal of expanding and accelerating Rare Disease research and development. Our partnership will combine patients’ genomic, phenotypic and epidemiological data into custom rare disease registries. MedGenome in India routinely offers CES and has solved many undiagnosed conditions successfully. Through our clinical collaborators, we continue to highlight the unique aspects of Indian genomes in understanding the natural history and heterogeneity of rare diseases as well as the role they can play in global rare disease clinical trials.
By Aditya Pai, VP Corporate and Business Development, MedGenome Inc.
World Cancer Day is a day to reflect and celebrate research victories, the battles that anyone with cancer fights, the search for new ways to detect cancer early and treat it as effectively as possible. Yet, cancer statistics remain sobering. Globally, there were an estimated 19.3 million new cancer cases and 10 million cancer deaths in 2020i. The number of people living with cancer is expected to grow by around 1 million every decade between 2010 and 2030ii.
Over the past decade, a better understanding of alterations in single driver genes and in genes involved in specific biological pathways has led to the development of better targeted therapies and, more recently, immunotherapies. For example, tyrosine kinase inhibitors for non-small cell lung cancer and checkpoint inhibitors used as tissue-agnostic immunotherapies have altered the treatment of many cancers. Yet understanding the molecular mechanisms of various cancers remains critical to developing new screening methods, e.g., non-invasive circulating tumor DNA (ctDNA) tests, and novel therapies. The quest to improve overall survival in cancer with better and more targeted treatments has been supported by significant investments by pharmaceutical and device companies as well as government funding. Precision medicine has also been aided by the decreased cost of sequencing the genome, which at large scale was cost-prohibitive in the past. This has led to complex choices about whether to use a targeted gene panel or to sequence the exome or whole genome. Various “omics” technologies now span from understanding changes at the DNA level in a tumor, to understanding how those changes are expressed at the level of the transcriptome, to understanding the tumor microenvironment using single-cell sequencing approaches. This multi-omic paradigm is gaining rapid traction in research. Yet a question I often get asked by translational oncology researchers is, “Should I sequence the entire genome, use a combination of whole exome sequencing and RNA sequencing, or use a cancer panel?”
A translational oncology researcher is often looking for a broad set of results from a retrospective set of tumor samples stored as formalin-fixed paraffin-embedded (FFPE) blocks. In this research discovery scenario, each of the above approaches has value, but the choice depends on factors such as the research hypothesis, time, research budget, and bioinformatics capabilities. For example, a Research Use Only (RUO) comprehensive gene panel like TSO 500 from Illumina covers 523 cancer-related genes and reports (1) DNA variations such as SNVs, indels, and copy number alterations, (2) RNA gene fusions, and (3) broad biomarker measurements like tumor mutational burden (TMB) that integrate many loci across the genome. The TSO 500 panel can be used with solid tumor samples or ctDNA, and it comes with the added benefit of a bioinformatics pipeline included in the workflow. With more advanced bioinformatics, as we provide at MedGenome (see Figures 1 and 2), a researcher could explore various hypotheses to better understand potential pathways and upstream and downstream cancer driver genes. Panels like TSO 500 are rigorously developed and designed to provide very robust, specific, and sensitive detection of alterations in the genes of interest (typically down to 5% variant allele frequency for most applications).
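To make the 5% variant allele frequency threshold and the idea of a panel-wide biomarker concrete, here is a minimal Python sketch. It is not the TSO 500 pipeline: the variant records, the `Variant` class, and the 2.0 Mb panel footprint are hypothetical values chosen for illustration, and TMB is simplified here to passing variants per megabase.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    gene: str          # hypothetical gene label
    alt_reads: int     # reads supporting the variant allele
    total_reads: int   # total read depth at the position

    @property
    def vaf(self) -> float:
        """Variant allele frequency = supporting reads / total depth."""
        return self.alt_reads / self.total_reads

def passing_variants(variants, min_vaf=0.05):
    """Keep variants at or above the VAF threshold (5% by default)."""
    return [v for v in variants if v.total_reads > 0 and v.vaf >= min_vaf]

def tumor_mutational_burden(variants, panel_size_mb, min_vaf=0.05):
    """Simplified TMB: passing variants per megabase of sequenced territory."""
    if panel_size_mb <= 0:
        raise ValueError("panel size must be positive")
    return len(passing_variants(variants, min_vaf)) / panel_size_mb

# Hypothetical calls from one tumor sample.
calls = [
    Variant("GENE_A", alt_reads=150, total_reads=500),  # VAF 0.30
    Variant("GENE_B", alt_reads=8, total_reads=200),    # VAF 0.04, below threshold
    Variant("GENE_C", alt_reads=24, total_reads=200),   # VAF 0.12
    Variant("GENE_D", alt_reads=3, total_reads=300),    # VAF 0.01, below threshold
    Variant("GENE_E", alt_reads=30, total_reads=400),   # VAF 0.075
]

PANEL_SIZE_MB = 2.0  # illustrative footprint, not the real panel size

print([v.gene for v in passing_variants(calls)])
print(tumor_mutational_burden(calls, PANEL_SIZE_MB))
```

A production TMB calculation would also restrict the count to eligible somatic variant classes and use the assay's validated territory, which is why the panel size is kept as an explicit parameter.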
This same researcher could instead use whole exome plus RNA sequencing or a whole genome sequencing approach. However, unlike the TSO 500 assay, where bioinformatics analysis is included in the workflow, the researcher would have to perform the bioinformatics analysis separately, which can be time-consuming. Yet they may find this worthwhile if they need to explore the genome beyond what a panel of 523 genes would cover.
Key Takeaway
Sequencing and multi-omic approaches in oncology are quickly evolving and must be tailored to the primary questions that an oncology researcher is trying to answer. There are many approaches available.
At MedGenome, we can help you with your “omics” strategy. Options include panel-based approaches that leverage comprehensive cancer gene sets, whole genome / whole exome-based approaches, and, more comprehensively, single-cell RNA sequencing approaches.
In all these choices, cost is an important consideration, as is the ultimate use of the results, which can range from exploratory research to a specific path for companion diagnostic development for a drug in clinical trials. Careful attention to the study design, sequencing strategy, and method is critical, as is collaboration with high-quality laboratories with validated assays.
Figure 1: Oncoplot displaying the somatic landscape of the cohort for top (max 20) most frequently mutated genes. Each row represents a gene and each column represents a sample. Colored squares show mutated genes, while grey squares show non-mutated genes
Figure 2: Somatic interaction plot shows mutually exclusive or co-occurring pair of genes displayed as a triangular matrix with top (max 20) most frequently mutated genes in the cohort. Green indicates tendency toward co-occurrence, whereas pink indicates tendency toward exclusiveness.
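The kind of pairwise test that typically underlies a somatic interaction plot can be sketched as follows. This is a hedged illustration, assuming a two-sided Fisher's exact test on a 2x2 table of samples per gene pair; the figure above does not state which test was used, and the sample IDs and gene labels are hypothetical.

```python
from math import comb

def contingency(mutated_a, mutated_b, all_samples):
    """Count samples with both genes, only A, only B, or neither gene mutated."""
    both = len(mutated_a & mutated_b)
    only_a = len(mutated_a - mutated_b)
    only_b = len(mutated_b - mutated_a)
    neither = len(all_samples) - both - only_a - only_b
    return both, only_a, only_b, neither

def fisher_exact_two_sided(both, only_a, only_b, neither):
    """Two-sided Fisher's exact p-value from the hypergeometric distribution."""
    n = both + only_a + only_b + neither
    a_total = both + only_a   # samples with gene A mutated
    b_total = both + only_b   # samples with gene B mutated

    def prob(k):
        # Probability that exactly k samples carry both mutations by chance.
        return comb(b_total, k) * comb(n - b_total, a_total - k) / comb(n, a_total)

    p_observed = prob(both)
    lowest = max(0, a_total + b_total - n)
    highest = min(a_total, b_total)
    # Sum every table that is at least as unlikely as the observed one.
    p = sum(prob(k) for k in range(lowest, highest + 1)
            if prob(k) <= p_observed * (1 + 1e-9))
    return min(1.0, p)

def tendency(both, only_a, only_b, neither):
    """Direction of the association, from the cross-product of the 2x2 table."""
    if both * neither > only_a * only_b:
        return "co-occurring"
    if both * neither < only_a * only_b:
        return "mutually exclusive"
    return "none"

# Hypothetical cohort of ten samples in which the two genes never overlap.
samples = {f"S{i}" for i in range(1, 11)}
gene_a_mutated = {"S1", "S2", "S3", "S4", "S5"}
gene_b_mutated = {"S6", "S7", "S8", "S9", "S10"}

table = contingency(gene_a_mutated, gene_b_mutated, samples)
print(table, tendency(*table), fisher_exact_two_sided(*table))
```

Running the test over every pair of the most frequently mutated genes would fill the triangular matrix, with the direction deciding the color and the p-value deciding which cells are flagged.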
By Dr. Neha Verma, Research Scientist – NGS, MedGenome Inc., USA
Spatial transcriptomics is a revolutionary molecular profiling method that allows scientists to measure gene activity in a tissue sample and map that activity to specific cell types and their locations. This novel technology is paving the path to new discoveries that are proving instrumental in helping researchers gain a better understanding of biological processes and diseases, which led to its being named Method of the Year in 2020.
The beginnings of spatial transcriptomics can be traced back to the 1960s, when nucleic acids were first stained at their original locations within cells or tissues. Although the term spatial transcriptomics was coined in 2016, the first steps were taken in the late ’60s with the use of in situ hybridization. This was followed in the late ’90s by the first microdissection techniques, in which a microscope is used to dissect a small portion of tissue. The term “Spatial Transcriptomics” is a variation of Spatial Genomics, first described by Doyle et al. in 2000 and then modified by Ståhl et al. in 2016.
Single cell/nuclei sequencing plays a crucial role in the identification of cellular subpopulations and their responses to various conditions/stimuli. Cells, however, are influenced by their native environment and surroundings, so cellular responses are best understood in their endogenous spatial context. With spatial transcriptomics, it is now possible to obtain information on the transcriptome of a single cell or a small group of cells while retaining information on where the cell (or group of cells) is located within the tissue.
Currently, only a few types of spatial transcriptomics techniques are available. These include GeoMx from NanoString; Slide-seq; APEX-seq; High-Definition Spatial Transcriptomics (HDST); and 10X Genomics’ Visium Spatial Gene Expression.
GeoMx
NanoString’s GeoMx Digital Spatial Profiler allows users to define a microscopic region of interest on an FFPE or frozen tissue slide by means of a UV-photocleavable barcode engineered into the in situ hybridization probes. The region of interest is selectively exposed to UV light, and the cleaved barcodes are collected and used to identify the RNA or protein present in the tissue. The defined regions of interest can range in size from ten to six hundred micrometers, allowing a wide variety of structures and cells in the histological sample to be targeted.
APEX-seq
This method uses the APEX2 gene, expressed in live cells that are incubated with biotin-phenol and hydrogen peroxide. Under these conditions, the APEX2 enzyme catalyses the transfer of biotin groups to nearby RNA molecules, which can then be purified with streptavidin beads. The purified transcripts are then sequenced to determine which molecules were near the biotin-tagging enzyme.
Slide-seq
Slide-seq relies on the attachment of RNA-binding, DNA-barcoded microbeads to a rubber-coated glass coverslip. The microbeads are mapped to their spatial location via SOLiD sequencing. Tissue sections are transferred to this coverslip to capture extracted RNA. Captured RNA is amplified and sequenced. Transcript localization is determined by the barcode oligonucleotide sequence of the bead that captured it.
High-Definition Spatial Transcriptomics (HDST)
HDST is based on decoding the location of mRNA capture beads in wells on a glass slide. This is accomplished by sequential hybridization to the barcode oligonucleotide sequence of each bead. Once the location of each bead is decoded, a tissue sample can be placed on the slide and permeabilized. The captured transcripts are then sequenced. HDST uses smaller beads than Slide-seq and can therefore achieve a spatial resolution of two micrometers, compared to ten micrometers for Slide-seq.
The most recent breakthrough is the Visium assay from 10X Genomics. The Visium spatial assay combines traditional histopathology with unbiased, high-throughput gene expression analysis from the same tissue section at high resolution and sensitivity. This enables spatial clustering of cells based on gene expression that reliably correlates with the neuroanatomy of intact tissue, across different mammalian brain regions. The addition of immunofluorescence staining enables the simultaneous examination of protein and gene expression from the same tissue, providing additional insights.
Source: 10X Genomics
The 10X Visium assay is a newer and improved version of the Spatial Transcriptomics assay. It also uses spotted arrays of mRNA-capturing probes on the surface of glass slides, but with more spots, smaller spots, and more capture probes per spot. Each Visium Spatial Gene Expression slide includes four capture areas (6.5 x 6.5 mm), each defined by a fiducial frame (fiducial frame + capture area is 8 x 8 mm). Each capture area contains approximately 5,000 barcoded spots, which in turn contain millions of spatially barcoded capture oligonucleotides. Tissue mRNA is released upon permeabilization and binds to the barcoded oligos, enabling capture of gene expression information. Each barcoded spot is 55 µm in diameter, and the distance from the center of one spot to the center of another is approximately 100 µm. The spots are staggered to minimize the distance between them. On average, mRNA from between 1 and 10 cells is captured per spot, which provides near single-cell resolution. The capture oligonucleotides in each spot include:
Illumina TruSeq Read 1 (partial read 1 sequencing primer)
16 nt Spatial Barcode (all primers in a specific spot share the same Spatial Barcode)
12 nt unique molecular identifier (UMI)
30 nt poly(dT) sequence (captures poly-adenylated mRNA for cDNA synthesis)
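The barcode and UMI layout described above can be sketched in a few lines of Python. This is an illustration only, assuming that the sequenced read begins with the 16 nt spatial barcode followed directly by the 12 nt UMI; the barcode sequences, spot coordinates, gene label, and function names are hypothetical.

```python
from collections import Counter

SPATIAL_BARCODE_LEN = 16  # nt, shared by all oligos in one spot
UMI_LEN = 12              # nt, tags individual captured molecules

def split_read1(read1):
    """Split a read into (spatial barcode, UMI), assuming barcode-then-UMI order."""
    needed = SPATIAL_BARCODE_LEN + UMI_LEN
    if len(read1) < needed:
        raise ValueError(f"read is shorter than {needed} nt")
    return read1[:SPATIAL_BARCODE_LEN], read1[SPATIAL_BARCODE_LEN:needed]

def count_umis_per_spot(reads, barcode_to_spot):
    """Count unique UMIs per (spot, gene); duplicate UMIs are counted once."""
    seen = set()
    counts = Counter()
    for read1, gene in reads:
        barcode, umi = split_read1(read1)
        spot = barcode_to_spot.get(barcode)
        if spot is None:
            continue  # barcode is not on the slide's list, so skip the read
        key = (spot, gene, umi)
        if key not in seen:
            seen.add(key)
            counts[(spot, gene)] += 1
    return counts

# Hypothetical barcodes mapped to hypothetical (row, column) spot positions.
BARCODE_1 = "AAACAAGTATCTCCCA"
BARCODE_2 = "AAACACCAATAACTGC"
barcode_to_spot = {BARCODE_1: (0, 0), BARCODE_2: (0, 1)}

UMI_1 = "ACGTACGTACGT"
UMI_2 = "TTTTGGGGCCCC"

reads = [
    (BARCODE_1 + UMI_1, "GENE1"),
    (BARCODE_1 + UMI_1, "GENE1"),      # same molecule seen twice
    (BARCODE_1 + UMI_2, "GENE1"),
    (BARCODE_2 + UMI_1, "GENE1"),
    ("G" * 16 + UMI_1, "GENE1"),       # unknown barcode
]

spot_counts = count_umis_per_spot(reads, barcode_to_spot)
print(dict(spot_counts))
```

Because every oligo in a spot shares one barcode, the barcode alone places a transcript on the slide, while the UMI keeps amplification duplicates from inflating the expression count.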
Tissue sections on the capture areas of the Visium Spatial Gene Expression Slide are fixed using methanol. Hematoxylin is used to stain the nuclei, followed by eosin staining for the extracellular matrix and cytoplasm. The stained tissue sections are imaged. The same tissue section is permeabilized to release mRNA onto capture spots that contain spatially barcoded oligos fixed to the slide. mRNAs are converted to cDNAs and then collected for dual-indexed Illumina library construction and sequencing. The H&E stained image and the spatially barcoded cDNAs are overlaid to allow visualization of the gene expression within the original tissue placement.
The tissue sections should be no larger than the capture area (6.5 mm x 6.5 mm) to avoid covering the fiducial frame that is used to align the RNA-seq data with the stained tissue images. In addition, tissue placed outside the capture area will not generate any additional gene expression data and could confound the gene expression data that is generated.
With Visium’s whole transcriptome and protein co-detection approach, researchers can:
Gain insights on cell-to-cell interactions with spatial context: Discover new biomarkers by examining histology, protein, and mRNA from the same fresh frozen tissue section
Characterize cellular sub-types and functional states: Reveal the spatial organization of newly discovered cell types, states, and biomarkers with whole transcriptome analysis
Discover regional cell heterogeneity throughout the tissue: Examine gene and protein expression heterogeneity and how it contributes to the biological system
Spatially resolved gene expression can provide a powerful complement to traditional histopathology methods, enabling a greater understanding of cellular heterogeneity and organization within the tissue architecture.
Ståhl PL, et al. Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science 353: 78–82, 2016. doi: 10.1126/science.aaf2403
By Parimala Nagaraja, Assistant Manager-NGS, MedGenome Inc., USA
It is known that all hereditary information is contained within an organism’s genome. Owing to continuous global efforts, many new bioinformatics databases are emerging, and their number has trended upward in the recent past. This reflects how NGS data is shaping our understanding of life and our need to constantly develop new methods to investigate and decode the information in and around DNA (or RNA for some viruses) and its nucleotide sequences.
A comprehensive view and understanding of a full genome is now possible with de novo whole-genome shotgun sequencing and annotation. Because of novel technological developments in recent years and the availability of several reference genomes in the public domain that can be used for annotation, WGS has become easier, faster, and cheaper. NGS plays a significant role in human genomics, ushering in a new era of personalized therapeutics aimed at a healthy and disease-free lifestyle, for which the broad term is “personomics”. SNPs, mutations, and other sequence variants such as InDels, copy number variations (CNVs), and structural variations (SVs) can be detected within or among different species through targeted sequencing such as AmpliSeq or whole exome sequencing (WES). Other widely used targeted sequencing methods include HLA genotyping, of an entire gene or just its exonic regions, to identify polymorphisms that are important in tissue or cell matching for transplantation. Targeted sequencing of just the coding regions, to detect exonic mutations responsible for rare Mendelian genetic disorders such as hearing loss, intellectual disabilities, and movement disorders, and to investigate common disorders such as heart disease, hypertension, diabetes, and cancer, can be termed “exomics”.
Transcriptomics:
The sum total of RNA transcripts expressed by the genome in cells, tissues, and organs at different stages of an organism’s life cycle is termed the “transcriptome” of an organism. High-throughput RNA sequencing of complementary DNA (cDNA) molecules helps us understand the complex and intricate functions of the genome in biological systems. It also allows us to measure, in a highly sensitive and accurate manner, the expression levels of genes, tissue-specific transcript variants and isoforms, and small and large non-coding RNAs that are involved in the regulation of gene expression or associated with various types of cancer.
Methylomics and epigenomics:
The study of the complete set of epigenetic modifications, including DNA nucleotide methylation, posttranslational modifications of histones, the interaction between transcription factors and their targets, and nucleosome positioning, is called epigenomics. The genome-wide analysis (GWA) of DNA methylation and its effects on gene expression and heredity is called methylomics. Bisulfite DNA sequencing (Methyl-seq) aids in mapping DNA cytosine methylation at single-base resolution. Methyl-seq is a well-established method for DNA methylation profiling in various organisms, including humans, and for evaluating pathogenic variants of genes.
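The principle behind single-base methylation mapping can be illustrated with a short Python sketch: bisulfite treatment converts unmethylated cytosines so that they read as T, while methylated cytosines still read as C. The sketch assumes reads that are already aligned to the forward strand and are the same length as the reference; the sequence, the methylation patterns, and the function names are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate conversion: unmethylated C reads as T, methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )

def methylation_levels(reference, reads):
    """For each reference cytosine, the fraction of reads that still show C."""
    levels = {}
    for pos, ref_base in enumerate(reference):
        if ref_base != "C":
            continue
        c_count = sum(1 for read in reads if read[pos] == "C")
        t_count = sum(1 for read in reads if read[pos] == "T")
        if c_count + t_count > 0:
            levels[pos] = c_count / (c_count + t_count)
    return levels

# Hypothetical reference with cytosines at positions 1, 4 and 7.
reference = "ACGTCGAC"

# Four hypothetical DNA molecules with different methylated positions.
molecules = [{1, 4}, {1}, {1, 4}, {1}]
reads = [bisulfite_convert(reference, methylated) for methylated in molecules]

levels = methylation_levels(reference, reads)
print(reads)
print(levels)
```

In this toy example the cytosine at position 1 is fully methylated, position 4 is methylated in half of the molecules, and position 7 is unmethylated, which is the per-base readout that methylation profiling aims for.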
ChIP-seq (chromatin immunoprecipitation followed by sequencing) allows the genome-wide profiling of DNA-binding proteins and of histone and nucleosome modifications. It is the most widely used method for detecting and analysing transcription factor binding sites and histone modifications in a variety of organisms. Another NGS method commonly used in epigenomics is Hi-C, which is generally used to identify DNA regions such as promoters, enhancers, and insulators that come together to mediate their regulatory activities.
Proteomics, metabolomics, and systeomics:
Proteomics is the study of the structure, function, and characterization of peptides and proteins. Sequencing the open reading frames (ORFs) of genomic regions, exonic regions, and transcripts aids in constructing proteomic profiles from NGS data, although this is not the only way to build proteomics data. A variety of other hardware and software tools are employed to build up an organism’s peptide and protein profiles. These include 2D-PAGE, liquid chromatography coupled with tandem mass spectrometry, affinity-tagged proteins, and yeast two-hybrid assays.
The study of an organism’s total metabolic response to an environmental stimulus or a genetic modification is called Metabolomics. The metabolomics of an organism is mainly drawn from the known functions of enzymes and proteins involved in metabolic and biochemical pathways. This field forms an integral part of functional genomics in determining the phenotypic effects of genetic modifications such as gene deletions, insertions, and other mutations.
Integration of genomics, proteomics, metabolomics into a single network system is termed as Systeomics. This field of study uses computational techniques to analyse and model cell interactions. This is an interdisciplinary field of study that focuses on complex interactions within biological systems using a holistic approach.
Metagenomics:
The study of the total genomic content of a microbial community is called Metagenomics. It helps in the epidemiological study of various pathogenic agents such as mycobacteria, S. aureus, E. coli, cholera, influenza, HIV, Ebola virus, etc. The Earth Microbiome Project reconstructed approximately 500 million varieties of microbial genomes. Before the first NGS platforms emerged, metagenomics studies focused on 16S rRNA genes to genotype and detect different species of microbes. Over the past 10 years, many large metagenome sequencing projects have emerged, such as the TerraGenome project for soils and the Tara Oceans project on the microbiome, eukaryotic plankton, and viromes of the global oceans.
Agrigenomics:
Studies that use NGS to advance crop improvement and the understanding of plant biology are called Agricultural genomics or Agrigenomics. Arabidopsis thaliana was the first plant genome to be published, in 2000. Nearly 54 new plant genomes had been sequenced by 2013, followed by another 6 plant genomes including the hexaploid bread wheat genome.
Peek into the near future:
NGS today is a science of biological information systems and “Big Data”. However, several challenges persist with regard to NGS data acquisition, storage, analysis, integration, and interpretation. Future developments will therefore depend on new technologies and on large-scale collaborative efforts from multidisciplinary and international teams to keep producing and analysing comprehensive, high-throughput data. With new innovations, cost-effective sequencing methods, and the existing “Third generation sequencing” tools, smaller industries and individual scientists will be able to participate in the genomics revolution and contribute new knowledge to the different fields of structural and functional “Omics” in the life sciences.
Next-Generation Sequencing — An Overview of the History, Tools, and “Omic” Applications | InTechOpen, Published on: 2016-01-14. Authors: Jerzy K. Kulski
With the advent of novel next generation sequencing (NGS) technology platforms, DNA sequencing has seen a revolutionary leap in both cost and application in cutting-edge research. Today, we can sequence an entire human genome in a day, far faster than is possible with conventional Sanger sequencing using capillary electrophoresis. It is now possible to identify and track genetic variation more efficiently and precisely. Owing to this sequencing capability, thousands of variants can now be analysed within a large population in a short span of time.
By Parimala Nagaraja, Assistant Manager-NGS, Peer Reviewers: Neha Verma, Research Scientist – NGS, Ekkirala Chaitanya, Lab Director- NGS operations, MedGenome Inc., USA
NGS is an umbrella term that describes the various sequencing platforms of modern research. These technologies allow us to decode DNA and RNA cost-effectively and quickly, and they have a wide range of applications, from detecting a single nucleotide polymorphism (SNP) to constructing a whole genome via de novo assembly.
Ever since the discovery of the double helix model of DNA in the 1950s, researchers have invested their time and effort in decoding the sequences of many different genomes.
Before any attempts were made to sequence DNA, Robert Holley, an American biochemist, sequenced and determined the complete structure of the alanine tRNA molecule, which consists of 77 ribonucleotides, in 1964. His work paved the way for other scientists to sequence other RNA as well as DNA molecules.
Paul Berg developed the first technology which permitted isolation of defined fragments of DNA in 1972 leading to the development of modern genetic engineering tools. Before this only Phage DNA was available for sequencing.
Walter Gilbert and Allan Maxam published the first nucleotide sequence of the lac operator, which consists of 24 base pairs, in 1973. Later, in 1977, Frederick Sanger was the first to sequence a complete genome, that of bacteriophage φX174, and developed the “chain termination” sequencing technology. In the same year, Walter Gilbert, an American biochemist, introduced DNA sequencing by chemical degradation. Sanger and Gilbert shared one half of the 1980 Nobel Prize in Chemistry for developing methods to sequence long DNA molecules. The other half was awarded to Paul Berg “for his fundamental studies of the biochemistry of nucleic acids, with particular regard to recombinant DNA”.
Later in 1986, Leroy Hood from California Institute of Technology first developed a Semi-automated DNA sequencing machine. This machine automated the Enzymatic Chain termination method of Sanger sequencing and became a key tool in mapping and sequencing the genetic material.
Applied Biosystems in the USA then marketed the first automated sequencing machine, the ABI 370.
Later came another method of sequencing, called “Pyrosequencing”, which did not require electrophoresis. It was developed by Pål Nyrén and Mostafa Ronaghi in Sweden in 1996. It relies on the detection of DNA polymerase activity by an enzymatic luminometric inorganic pyrophosphate detection assay developed by P. Nyrén in 1987.
Automated Sanger sequencing led to many significant accomplishments, including the completion of the Human Genome Project and of other plant and animal genomes. Because of the limitations of the method, further advances were needed to sequence large numbers of genomes. While automated Sanger sequencing is called “First Generation Sequencing”, the later, more advanced methods are called “Next Generation Sequencing”.
Emergence of NGS platforms
NGS became available in the early 21st century. The most significant improvement of NGS is its capability to produce a large amount of data in a very short time in the most cost effective and accurate manner beyond the reach of traditional Sanger methods.
Lynx Therapeutics launched the first NGS technology, called massively parallel signature sequencing (MPSS), in 2000. The company later merged with Solexa, which was in turn acquired by Illumina.
Later, in 2004, 454 Life Sciences (Branford, CT, USA) marketed a parallelized version of pyrosequencing, which reduced the cost of sequencing sixfold compared with automated Sanger sequencing and was the second of the new generation of sequencing methods. The company was later acquired by Roche. Roche’s 454 GS 20 reshaped sequencing in 2005-2006, as it could produce about 20 million bases (20 Mbp) per run. It was replaced in 2007 by the GS FLX model, which could produce over 100 Mbp of sequence in just four hours, increasing to 400 Mbp in 2008. This model was then upgraded to the 454 GS-FLX+ Titanium sequencing platform, which can produce over 600 Mbp of sequence data in a single run with Sanger-like read lengths of up to 1,000 bp.
In 2005, Solexa released the now ubiquitous Sequencing by Synthesis (SBS) method. Solexa was purchased by Illumina in 2007. This method is responsible for 90% of the sequencing data produced in biological research.
Recently, Illumina developed the NovaSeq 6000 platform, which can sequence several hundred libraries within 24-48 hours. Its output ranges from 400 Gb to 3,000 Gb. The platform gives users the throughput, speed, and flexibility to complete projects faster and more economically than before. Users can choose between four flow cell types (SP, S1, S2, S4) and sequence one or two flow cells at a time. The technology provides simple, scalable, and reliable high-throughput sequencing with high accuracy.
Other novel sequencing technologies parallel to SBS include True Single Molecule Sequencing developed by Helicos Biosciences, Ion Torrent sequencing developed by Life Technologies, the single molecule real-time (SMRT) sequencer developed by Pacific Biosciences, and the Oxford Nanopore Technologies (Oxford, UK) single molecule sequencer with ultra-long single molecule reads, which became available in 2012–2013.
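As a rough illustration of what these output figures mean in practice, the Python sketch below estimates how many human genomes fit into one run. The approximate 3.1 Gb genome size and the 30x target coverage are assumptions added here, not figures from the text, and the calculation ignores duplicate and low-quality reads.

```python
HUMAN_GENOME_GB = 3.1   # approximate human genome size in gigabases (assumption)
TARGET_COVERAGE = 30    # commonly used whole-genome depth (assumption)


def genomes_per_run(run_output_gb, coverage=TARGET_COVERAGE, genome_size_gb=HUMAN_GENOME_GB):
    """Number of whole genomes that one run can cover at the requested depth."""
    bases_needed_per_genome = genome_size_gb * coverage
    return int(run_output_gb // bases_needed_per_genome)


low_output = genomes_per_run(400)     # lower end of the quoted output range
high_output = genomes_per_run(3000)   # upper end of the quoted output range
print(low_output, high_output)
```

Under these assumptions, the lower end of the range covers a handful of genomes per run and the upper end a few dozen.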
Future of NGS
NGS will continue to revolutionize genomics research by becoming increasingly efficient and cost effective. For now, all NGS platforms require the preparation of NGS libraries (making the DNA/RNA compatible with sequencing), which includes fragmenting the DNA/cDNA molecules, attaching adapter molecules to both ends, and amplifying the libraries using PCR. Another new class of sequencing, called “Third Generation Sequencing”, is under development. It allows the sequencing of single DNA molecules without amplification and can produce longer reads than NGS platforms. The single molecule real-time (SMRT) sequencer developed by Pacific Biosciences and the nanopore sequencing offered by Oxford Nanopore Technologies already operate in this domain, rapidly generating up to 15,000 bases from a single DNA or RNA molecule. This allows smaller genomes to be sequenced completely, without PCR bias, in a time- and cost-efficient way.
References:
Barba M, Czosnek H, Hadidi A. Historical perspective, development and applications of next-generation sequencing in plant virology. Viruses. 2014;6(1):106-136. Published 2014 Jan 6. doi:10.3390/v6010106
Behjati S, Tarpey PS. What is next generation sequencing?. Arch Dis Child Educ Pract Ed. 2013;98(6):236-238. doi:10.1136/archdischild-2013-304340
Since December 2019, the outbreak of Corona Virus Disease (COVID-19) has posed a serious threat to global health. The number of cases increased quickly and had resulted in over four million deaths worldwide as of July 2021. In response, numerous research projects have been conducted to study the disease etiology, the patterns of the epidemic, and potential treatments for the disease. The adaptive immune response plays a central role in clearing viral infections and in turn directly influences patients’ clinical outcomes.
By Dr. Kushal Suryamohan, Director of Bioinformatics Services & Aditya Pai, Vice President, Corporate and Business Development, MedGenome Inc, USA
Antibody discovery has gained immense prominence with the COVID-19 pandemic. Numerous technologies are available to produce antibodies. For several decades now, hybridoma technology has been the mainstay to generate monoclonal antibodies by immortalization of antigen-specific B-lymphocytes.
Hybridomas are cells formed by the fusion of a short-lived antibody-producing B-cell with an immortal myeloma cell. Each hybridoma expresses a large amount of one specific antibody (a monoclonal antibody). If a hybridoma is stable, it can be used and cryopreserved as a long-term source of monoclonal antibody production. The workflow for creating hybridomas involves several technical procedures, including antigen preparation, animal immunization, cell fusion, hybridoma screening and sub-cloning, as well as characterization and production of specific antibodies. Although hybridomas have fueled the discovery and production of antibodies for a multitude of applications, the technology has some disadvantages. These include contamination of hybridoma cultures, a reduced range of useful antibodies owing to losses during the fusion process, genetic drift over time leading to batch effects, the time and cost of maintenance, and the limitation of the method to rats and mice. Furthermore, generating and identifying high-quality hybridoma clones is a labor-intensive, low-throughput process that requires months of work from immunization to identification of a specific hybridoma.
Newer antibody discovery methods such as phage display overcome some of the limitations of hybridoma technology. Building on advances in sequencing technology, MedGenome has developed and standardized HitMab (High-throughput antibody discovery using single cell sequencing), a streamlined workflow that uses high-throughput single-cell B-cell receptor sequencing (scBCR-seq) to obtain accurately paired full-length variable regions in a massively parallel fashion. This method allows the rapid discovery of thousands of antibodies and bypasses hybridoma-based antibody discovery. When combined with high-quality antigen-specific B-cell sorting, scBCR-seq can be used for rapid discovery of large, diverse panels of high-affinity antigen-specific antibodies with natively paired heavy and light chains. By interrogating each individual B-cell, HitMab maximizes the antibody repertoire that can be accessed, making it inherently superior to existing antibody discovery methods. HitMab provides users with a prioritized list of candidate antibody sequence panels with a turnaround time of 12 weeks, which covers immunization of mice or rats with the antigen of interest, isolation of individual B cells, scBCR-seq, antibody cloning, recombinant IgG expression and purification, and initial testing of binding properties (Figure 1). Briefly, mice or rats are immunized with a specific antigen of interest. B-cells are then isolated from the antibody-forming cells, and each B-cell is processed through the 10X Chromium platform for single cell library preparation and sequencing to identify paired heavy (Vh) and light (Vl) chain antibody sequences.
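The idea of natively paired chains can be sketched in a few lines of Python: because every contig from a single cell carries the same cell barcode, heavy and light chains can be paired by grouping on that barcode. This is an illustrative sketch only and not the HitMab pipeline. The record fields, barcodes and sequences are hypothetical, and the rule of keeping only cells with exactly one heavy and one light chain is a simplifying assumption.

```python
from collections import defaultdict


def pair_chains(contigs):
    """Group contigs by cell barcode and keep cells with one heavy and one light chain."""
    by_cell = defaultdict(lambda: {"heavy": [], "light": []})
    for contig in contigs:
        by_cell[contig["barcode"]][contig["chain"]].append(contig["sequence"])

    paired = {}
    for barcode, chains in by_cell.items():
        # Cells with missing or multiple chains are ambiguous and are skipped here.
        if len(chains["heavy"]) == 1 and len(chains["light"]) == 1:
            paired[barcode] = (chains["heavy"][0], chains["light"][0])
    return paired


# Hypothetical contig records; sequences are placeholders, not real antibodies.
contigs = [
    {"barcode": "AAAC", "chain": "heavy", "sequence": "VH_seq_1"},
    {"barcode": "AAAC", "chain": "light", "sequence": "VL_seq_1"},
    {"barcode": "CCGT", "chain": "heavy", "sequence": "VH_seq_2"},  # no light chain
    {"barcode": "GGTA", "chain": "heavy", "sequence": "VH_seq_3"},
    {"barcode": "GGTA", "chain": "light", "sequence": "VL_seq_3a"},
    {"barcode": "GGTA", "chain": "light", "sequence": "VL_seq_3b"},  # ambiguous
]
paired = pair_chains(contigs)
print(paired)
```

Only the first cell yields a usable pair in this toy input. The other two show the kinds of ambiguity a production workflow has to resolve.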
HitMab has some key advantages over traditional hybridoma-based antibody discovery:
Does not rely on immortalizing B-cells and instead relies on isolating B-cells from spleen or lymph nodes of immunized rats or mice
Cost effective and time efficient – obtain a panel of prioritized paired antibody sequences for testing/screening in 12 weeks compared to 6-8 months for hybridoma-based antibody discovery.
Maximal antibody diversity compared to hybridoma, where the B-cell repertoire is reduced by fusion loss.
scBCR-seq is not restricted to mice and humans. MedGenome offers custom species including horse and rat.
For more information to customize a project to your needs, please contact aditya.pai@medgenome.com
Figure 1: MedGenome’s workflow for antibody discovery using high-throughput single-cell B-cell receptor sequencing (HitMab)
By Derek Vargas and Dr. Kushal Suryamohan, MedGenome Inc
Since December 2019, the outbreak of Corona Virus Disease (COVID-19) has posed a serious threat to global health. The number of cases increased quickly and had resulted in over four million deaths worldwide as of July 2021. In response, numerous research projects have been conducted to study the disease etiology, the patterns of the epidemic, and potential treatments for the disease. The adaptive immune response plays a central role in clearing viral infections and in turn directly influences patients’ clinical outcomes. T cells play a crucial role in the immune response to viral infection. Since the start of this pandemic, several innovative tools have been made available for studying the role of T cells in viral infection and other diseases.
T cells play a critical role in clearing viral infections and providing long-term immune memory. Two subsets of T cells participate in the immune response to viral infection. Activated CD8+ T cells directly kill infected cells, while CD4+ T cells produce signaling molecules that drive and support the CD8+ response and the formation of long-term CD8+ memory. CD4+ T cells also participate in the selection and affinity maturation of antigen-specific B cells, which ultimately leads to the generation of neutralizing antibodies. T cells use hypervariable T-cell receptors (TCRs) to recognize short pathogen-derived peptides presented on the cell surface by the major histocompatibility complex (MHC). TCR repertoire sequencing allows for the quantitative tracking of T-cell clones in patients as these populations go through expansion and contraction. Additionally, TCR repertoire sequencing can provide full-length sequences that can be used for biomarker discovery or for developing immunotherapies to treat diseases such as COVID-19.
Despite its widespread adoption, there has been a lack of simple and interactive tools to analyze and explore TCR sequencing data. Many established tools require programming or Unix/Bash knowledge to analyze and visualize results, which can be a barrier for many labs interested in TCR sequencing. To help researchers overcome this hurdle, MedGenome offers end-to-end TCR sequencing services. Using RNA or cells as input, libraries can be generated and sequenced to analyze TCR data, including alpha, beta, delta, and gamma chains. Additionally, we provide a number of analysis outputs, which include critical information such as full length clonotype sequences, V-J usage summaries, CDR3 length distribution, and shared clonotype analysis (Figure 1).
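For readers who want to see what such summaries involve, here is a minimal Python sketch of three of the outputs named above: CDR3 length distribution, V-J usage, and shared clonotypes between two samples. It is not MedGenome’s pipeline. The record layout, the function names and the example clonotypes are hypothetical, and defining a clonotype as the combination of V gene, CDR3 amino acid sequence and J gene is an assumption.

```python
from collections import Counter


def cdr3_length_distribution(clonotypes):
    """Count reads per CDR3 amino acid length, weighted by clonotype count."""
    lengths = Counter()
    for clone in clonotypes:
        lengths[len(clone["cdr3"])] += clone["count"]
    return lengths


def vj_usage(clonotypes):
    """Count reads per (V gene, J gene) combination."""
    usage = Counter()
    for clone in clonotypes:
        usage[(clone["v"], clone["j"])] += clone["count"]
    return usage


def shared_clonotypes(sample_a, sample_b):
    """Return clonotypes (V, CDR3, J) present in both samples."""
    def keys(sample):
        return {(c["v"], c["cdr3"], c["j"]) for c in sample}
    return keys(sample_a) & keys(sample_b)


# Hypothetical beta-chain clonotypes for two samples.
sample_a = [
    {"v": "TRBV5-1", "j": "TRBJ2-7", "cdr3": "CASSLGQGYEQYF", "count": 10},
    {"v": "TRBV7-9", "j": "TRBJ1-1", "cdr3": "CASSPTGNTEAFF", "count": 3},
]
sample_b = [
    {"v": "TRBV5-1", "j": "TRBJ2-7", "cdr3": "CASSLGQGYEQYF", "count": 2},
    {"v": "TRBV20-1", "j": "TRBJ2-1", "cdr3": "CSARDRGNEQFF", "count": 5},
]
print(cdr3_length_distribution(sample_b))
print(vj_usage(sample_a))
print(shared_clonotypes(sample_a, sample_b))
```

Tracking the count of a shared clonotype across time points from the same patient is the basis of the clonal expansion and contraction analysis described earlier.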
Beyond that, there are additional resources available to scientists interested in studying the immune repertoires of samples. We recently co-hosted a webinar with iReceptor (link below) who discussed the Adaptive Immune Receptor Repertoire Community that they established as a way for researchers to share and access TCR and BCR data. This is a group of immunologists, bioinformaticians, and other experts working together to develop guidelines and standards for the generation, storage, and annotation of this data to facilitate its use by the larger research community. iReceptor facilitates the curation and sharing of TCR seq data from multiple labs and institutions. By engaging this community, scientists can access these datasets, make comparisons with their own data, and effectively increase the amount of data available to answer complex questions about the adaptive immune response.
TCR sequencing data has enormous potential to be used in biomarker and immunotherapy development. The recent pandemic has created a demand for deeper understanding of TCR repertoires. The solutions offered by MedGenome and iReceptor are just a few examples that show how the industry is embracing immunogenomics, and developing solutions to streamline these projects.
In my previous blog, I highlighted the uniqueness of single cell RNA sequencing technologies and how these can be used to understand 5’ and 3’ gene expression, T and B cell immune repertoire profiles, and more specific antibody-based approaches such as CITE-Seq as well as epigenetics approaches with ATAC-Seq. In this blog, the power of multi-omic approaches to simultaneously determine open chromatin regions with gene expression in a single cell is reviewed.
By Aditya Pai, Vice President, Corporate and Business Development, MedGenome Inc.
Understanding the epigenetic profile of cells in development and/or disease can provide key insights into the expression of genes during such processes. Epigenetic changes do not alter the DNA sequence but instead involve the attachment of chemical groups. Methylation and acetylation are two examples of such modifications, which are linked to gene regulation and to many physiological and pathological processes. In DNA methylation, for example, methyl groups attach to DNA, rendering it inactive and thus preventing the creation of a specific protein. RNA can be methylated and acetylated as well, and such modifications can affect RNA processing and the translation of mRNA into protein. Two very distinct syndromes, Prader-Willi syndrome and Angelman syndrome, demonstrate how a single locus on chromosome 15q11.2-q13.3 can be deleted or mutated and, depending on maternal or paternal imprinting, result in very different phenotypes. DNA methylation is at the heart of both syndromes, and epigenetics has allowed for a greater understanding of both.
Epigenetic changes are studied with several techniques, for example by examining modifications of the histone proteins around which DNA is wound. Chromatin immunoprecipitation sequencing (ChIP-seq) has historically been used to study protein-chromatin interactions. ChIP-seq has limitations, such as requiring a large number of cells, lengthy protocols, and high sequencing depth. For “bulk cell” studies, more recent methods include CUT&Tag and CUT&RUN (Cleavage Under Targets and Tagmentation, and Cleavage Under Targets and Release Using Nuclease, respectively). In these methods, DNA fragments bound to the modified histones are directly cleaved and released, avoiding the more laborious cross-linking and subsequent immunoprecipitation that ChIP-seq requires.
Single cell-based methods like ATAC-Seq, 5’ and 3’ gene expression or multi-omic approaches such as ATAC-Seq with 3’ gene expression offer unique insights to our clients compared to bulk RNA approaches.
Single cell ATAC-seq (assay for transposase-accessible chromatin using sequencing) identifies regions of open chromatin, which provide insights into the regulatory sequences that are accessible to DNA-binding proteins. ATAC-seq can be combined with 3’ gene expression profiling in the same single cell, providing multi-omic information and insights into how chromatin structure influences the regulation of gene expression.
Takeaway summary:
Performing ATAC-Seq and 3’ gene expression profiling simultaneously in a single cell can help in the following ways:
Deeper cell type characterization, where epigenetic profiles can be overlaid with gene expression to allow for better interpretation of such profiles
Understanding how gene regulatory networks may be disrupted in disease
Finding new gene regulatory interactions using multi-omic data.
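One simple way to look for candidate regulatory interactions in paired data is to correlate the accessibility of each chromatin peak with the expression of each gene across the same cells. The Python sketch below illustrates that idea under stated assumptions: the peak and gene names, the values and the 0.8 correlation threshold are all hypothetical, and real analyses typically restrict the comparison to peaks near each gene and correct for multiple testing.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return covariance / sqrt(var_x * var_y)


def link_peaks_to_genes(accessibility, expression, threshold=0.8):
    """Return (peak, gene, r) for pairs whose absolute correlation meets the threshold."""
    links = []
    for peak, peak_values in accessibility.items():
        for gene, gene_values in expression.items():
            r = pearson(peak_values, gene_values)
            if abs(r) >= threshold:
                links.append((peak, gene, round(r, 3)))
    return links


# Hypothetical per-cell values; each list position is the same cell in both assays.
accessibility = {"peak_1": [0, 1, 2, 3, 4], "peak_2": [2, 2, 1, 2, 2]}
expression = {"GENE_A": [1, 3, 5, 7, 9], "GENE_B": [5, 1, 4, 2, 3]}
links = link_peaks_to_genes(accessibility, expression)
print(links)
```

A correlated pair is only a candidate interaction. It shows that accessibility and expression vary together across cells, not that the peak regulates the gene.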
The scientific curiosity to understand the cause of a disease has led to many technological innovations. As the cost of genomic sequencing started to fall a decade ago, it opened up numerous new technologies that could provide unique insights in understanding disease biology even at a molecular level. These include whole genome data (genomics), changes in the structure of chromatin, understanding RNA sequences and their expression (transcriptomics) to proteomics-based approaches to understand protein structure, folding and the measurement of various metabolites (metabolomics).
By Aditya Pai, Vice President, Corporate and Business Development, MedGenome Inc.
This broad array of technological advancements has helped decipher causal factors, enhancing our ability to study and gain insight into many diseases at dimensions and resolutions that were not previously possible.
While whole genome sequencing can provide excellent genomic data, a large emphasis has shifted to understanding the transcript and how RNA is expressed. Bulk RNA sequencing approaches are relatively cost effective, yet provide useful transcriptomic data. Typically, RNA is extracted from a tissue comprised of several cell types. Hence the name “bulk”: the analysis is not cell specific but instead gives an average expression level for each gene across a large population of cells. Such approaches can be very useful for differential gene expression analysis or for comparing the transcriptome of a given tissue across different species.
However, to maximize one’s understanding of a disease or of the particular microenvironment where a disease manifests itself, single cell RNA sequencing approaches have gained prominence. This has been aided by technological innovation in single cell sequencing and by a reduction in the cost of sequencing. Together, these allow stochastic changes in cellular state and cellular heterogeneity to be understood at far greater resolution or “fidelity”. For example, single cell RNA sequencing approaches allow for an understanding of heterogeneous cell types, including rare cell populations, and of molecular differences between healthy and abnormal tissue or clusters of cells that can be grouped to allow for a greater understanding of the microenvironment of the disease. Sheih et al1 demonstrated how single-cell transcriptional profiling of CAR-T cells can be used in patients undergoing CD19 CAR-T immunotherapy. Their use of scRNA-seq allowed for a unique assessment of transcriptional attributes in patients infused with CD19 CAR-T cells and of how these could potentially be impacted by tumor burden and the tumor microenvironment. Similarly, for developers of cellular therapies, including CAR-T and NK-based products, scRNA-seq affords a unique opportunity to characterize the therapeutic product (i.e. the transduced cells prior to infusion into the patient) and compare it with the genetically modified and other host cells later recovered from treated patients.
Depending on the number of cells and cell size, the most commonly used single cell RNA sequencing approaches are the 10X Chromium system and Takara SMART-Seq. The 10X Chromium system, normally used for more than 100,000 cells, takes a cell suspension and partitions it into nanoliter-scale droplets called GEMs (Gel Bead-In Emulsions), each containing a uniquely barcoded gel bead. The system can be used for single cell partitioning or even single nuclei partitioning. For starting cell numbers below 20,000, approaches like Takara SMART-Seq are used, in which cells are sorted into a plate.
These approaches can be combined in various single cell RNA sequencing experiments to profile 5’ gene expression, 3’ gene expression, and T and B cell immune repertoires, and to support more specific approaches such as antibody-based CITE-Seq and ATAC-Seq. In my next blog, we will review the single cell approaches most commonly used by MedGenome’s clients.
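The averaging effect of a bulk measurement can be shown with a small Python sketch. The gene names and counts below are hypothetical toy values chosen only to illustrate the point. Nine cells do not express a marker gene and one rare cell expresses it strongly.

```python
def bulk_profile(cells):
    """Average each gene across all cells, as a bulk measurement effectively does."""
    genes = sorted(cells[0])
    return {gene: sum(cell[gene] for cell in cells) / len(cells) for gene in genes}


# Hypothetical tissue: nine common cells and one rare cell expressing MARKER.
common_cells = [{"HOUSEKEEPING": 10, "MARKER": 0} for _ in range(9)]
rare_cell = {"HOUSEKEEPING": 10, "MARKER": 50}
cells = common_cells + [rare_cell]

bulk = bulk_profile(cells)
expressing = [index for index, cell in enumerate(cells) if cell["MARKER"] > 0]
print(bulk)        # the bulk view: MARKER looks moderately expressed everywhere
print(expressing)  # the per-cell view: only one cell expresses MARKER
```

The bulk value for the marker gene is the same as it would be if every cell expressed it at a low level, which is why per-cell data are needed to distinguish the two situations.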
References
1 Sheih et al: “Clonal kinetics and single-cell transcriptional profiling of CAR-T cells in patients undergoing CD19 CAR-T immunotherapy.” Nat Commun 11, 2019-2020
A fundamental challenge in biomedical research is to identify accurate, early indicators of a disease. Recent advances in sequencing technologies have led to unparalleled efforts to characterize the molecular changes that underlie the development and progression of complex human diseases, including cancer. Scientists have widely used RNA-seq analysis to study the transcriptome in populations of cells. More recently, single-cell RNA seq studies have been used to gain insight on cellular traits and changes in cellular state.
By Derek Vargas, Application Scientist, MedGenome Inc.
The increasing commercial availability of single-cell sequencing platforms, such as 10x Genomics’ Chromium, has led to many exciting discoveries. Additionally, researchers can now go beyond single-cell RNA-seq analysis and also capture information on cell-surface proteins, chromatin state, genetic perturbations, and even genome data. Each modality complements RNA data to provide a unique perspective on cell state and identity.
Researchers at Peter MacCallum Cancer Centre have recently outlined a method for SUGAR-seq (SUrface-protein Glycan AND RNA-seq) which enables detection and analysis of N-linked glycosylation, extracellular epitopes, and the transcriptome at the single-cell level. Specifically, they used biotinylated lectins (carbohydrate-binding proteins) to label the complex N-glycan branches on the surface of cells. They then used an anti-biotin monoclonal antibody conjugated to an oligonucleotide tag that is compatible with the 10x Genomics workflow. This allows for easy integration with the Chromium platform, allowing simultaneous profiling of N-glycan levels, cell-surface protein expression, TCR sequences, and the transcriptome. Additionally, this modular approach allows for this protocol to be easily translated to other single-cell platforms.
The SUGAR-seq technique was used to examine tumor-infiltrating T cells (TILs) from multiple tumor sources. Analysis of the data revealed divergent levels of N-glycosylation across distinct TIL populations. Regulatory T cells and exhausted T cell subsets showed high levels of N-glycosylation, whereas memory T cells showed lower levels. Additionally, N-glycosylation levels were significantly higher in the TILs than in lymph nodes, suggesting that the tumor microenvironment modulates high N-glycan levels.
The use of SUGAR-seq has allowed for the simultaneous detection of surface glycans, epitopes, transcripts, and TCR repertoire to characterize TILs. This allows for deeper insights into the cellular environment and can be used to identify cellular phenotypes associated with disease. It demonstrates the power of using multi-omic data to answer scientific questions, and shows how single cell platforms can be modified to maximize the data output.
References
Kearney CJ, Vervoort SJ, et al. (2021) SUGAR-seq enables simultaneous detection of glycans, epitopes, and the transcriptome in single cells. Science Advances Vol. 7 (8) DOI: 10.1126/sciadv.abe3610
By Derek Vargas, Application Scientist, MedGenome Inc.
Single-cell genomics techniques are revolutionizing our ability to characterize complex tissues. Although bulk RNA sequencing experiments can be insightful, they often mask important biological activity of rare cell types and fail to show the variability in gene expression between individual cells. The rapid development of low-input RNA-seq methods has led to an explosion of single-cell RNA-seq platforms, each with its own advantages and limitations. Droplet-based methods (10X Chromium, DropSeq) can be used to analyze thousands of cells in a single prep. In these methods, single cells are separated using microfluidics, captured in emulsion, tagged with cell barcodes, and then processed together into a single library. Plate-based methods (SMART-seq), on the other hand, require flow sorting cells into individual wells. Each cell is individually lysed and its RNA is used to generate a library. This approach requires sorting equipment and has a lower throughput, but it is far more sensitive than droplet-based methods: plate-based methods can detect thousands more genes per cell. Additionally, plate-based methods generate data from full-length mRNA transcripts, whereas droplet-based methods only provide data on the 3’ or 5’ end.
While each method for single-cell sequencing offers benefits, combining these two platforms can provide greater insights into heterogeneous cell populations, and offer a window into the various stages of differentiation and activation states in developing cell populations. Researchers from Xiamen University (Lin et al., 2021) were able to design a computational method which uses both types of data. By analyzing data from both droplet-based and plate-based single-cell experiments, they were able to describe the lineage features and predict the developmental path of mature human pancreatic islet cells. They used mitochondrial genome variants as endogenous lineage-tracking markers, and clustered their data based on these variants. Based on the distinct lineage features between alpha and beta cells, they determined that these cell types develop from different progenitors.
Single-cell RNA sequencing is now widely employed in immunological studies seeking to resolve previously unrecognized cellular heterogeneity, define processes in cell development and differentiation, and understand the gene regulatory networks that predict immune function. In this area too, researchers are finding that combining droplet-based and plate-based single-cell sequencing methods can help characterize cytotoxic T lymphocytes. Scientists at the Technical University of Munich (Kanev et al., 2021) tailored a droplet-based approach for high-throughput analysis and a plate-based method for high-depth sequencing. They named these methods tDrop-seq and tSCRB-seq, respectively. They noted that conventional droplet-based methods have inherently low mRNA capture efficiency for cytotoxic T cells and optimized a method for increasing the sensitivity. Although tDrop-seq allowed them to process large numbers of cells, the low copy number of genes limited the power of analysis for these cells. They then used tSCRB-seq to generate the high-resolution data necessary for defining the critical mechanisms of T cell differentiation.
These studies highlight the importance of both droplet-based and plate-based single-cell sequencing methods. Though many researchers struggle with the decision between high throughput and low-cost drop-seq, or high-resolution data from plate-seq, there are certainly benefits when combining both methods. A dual method approach has the potential to shed more light on cellular processes involved in health and disease, as well as provide insights into cellular development.
References
Lin et al. (2021) Single-cell transcriptome lineage tracing of human pancreatic development identifies distinct developmental trajectories of alpha and beta cells. DOI: https://doi.org/10.1101/2021.01.14.426320
By Dr. Kushal Suryamohan, MedGenome Inc.
Technological advances in sequencing capabilities have rapidly accelerated our understanding of human health and disease. From workhorse short-read Illumina sequencing to the more recent third-generation instruments such as PacBio and Nanopore, which enable single-molecule sequencing, genomics and its applications have assumed a wider scope in recent times. That scope ranges from specialised studies such as transcriptomics, epigenomics and metagenomics to more specific application areas such as biomarker discovery, pharmacogenomics, disease diagnosis, identification of novel genes in disease mechanisms, drug repurposing and disease risk prediction.
Further, innovations in microfluidics now allow researchers to study biological processes at the level of single cells. The resulting data provide insights into cellular heterogeneity, lineage tracing, cell population dynamics and the gene expression profiles of single cells, with wide-ranging application possibilities.
Given the increasing amounts of data being generated, analysis and interpretation of such complex data are paramount to building on our knowledge base. Moreover, it is imperative to have a scalable framework to collect, store, analyze and visualize data with reproducible results. Bioinformatics is the interdisciplinary field of science that is aimed at realizing this goal.
MedGenome has successfully set up MAnGO (MedGenome Analytics for Genomics Platform), a universal one-stop bioinformatics platform with powerful algorithms and workflows that allow parallel analysis of hundreds to thousands of samples, with data interpretation and visualization capabilities that are scalable and customized for specific research applications and queries.
By Dr Kushal Suryamohan*, Bioinformatics Scientist, MedGenome Inc
Gall bladder cancer (GBC) is an aggressive gastrointestinal malignancy with a poor prognosis. It is the 20th most common type of cancer worldwide and its incidence is particularly high in specific regions of the world including Bolivia, Chile, Ecuador, Peru, Korea, Japan and India and is currently rising in Western populations (https://bit.ly/3kSLDMw) (Figure 1). In the United States, it is more common in Southwestern Native Americans and Mexican Americans. There is also a gender disparity, with GBC more prevalent in females than males. The median survival of patients with GBC is typically <1 year. This is mainly because it is difficult to diagnose in the early stages and most patients are asymptomatic until the disease reaches an advanced or metastatic stage. Furthermore, the anatomical location of the gallbladder under the liver makes it easier for GBC to grow undetected. Radical surgery, chemotherapy and radiation therapy remain the current mainstay for treating GBC. However, only about 10-15% of patients are amenable to surgery and the 5-year overall survival rate is less than 5%. Currently, it is not clear what causes gallbladder cancer. Likely causative factors include gallstones, the female hormone estrogen, lifestyle and dietary habits.
While most cancer genome sequencing studies to date have focused on highly prevalent cancers, few large-scale studies have been performed on rarer forms of cancer such as GBC. Thus, we decided to focus on GBC and created a global consortium involving researchers from India, USA, Korea and Chile. This collaborative effort allowed us to obtain 167 GBC primary samples as well as 39 non-GBC samples and the corresponding matched normal tissue. This unprecedented dataset enabled us to map genomic alterations frequently observed in GBC and to determine if there were differences among tumors from different geographic regions. In our study published in Nature Communications, we carried out exome, whole genome and transcriptome sequencing of GBCs from these ethno-geographically diverse populations and identified several significantly mutated genes that were not previously linked to GBC. These included ELF3, a frequently mutated gene in GBC with genomic alterations in 21% of the sequenced tumors. We integrated somatic mutation, copy number variation and gene fusion data to identify affected pathways in GBC. The TP53/RB1 pathway was the most commonly altered in GBC. We also found WNT pathway and KEAP1/NFE2L2 pathway activation in GBC. WNT pathway activation was primarily driven by activating mutations in CTNNB1 and by RSPO3 fusion. We observed frequent inactivating mutations in SWI/SNF pathway genes including SMARCA4, ARID1A and ARID2. We also found several therapeutically actionable mutations in the RAS/PI3K pathway involving frequent alterations in ERBB2, ERBB3, BRAF and PIK3CA.
The advent of immunotherapy has revolutionized cancer treatment with significant survival benefits observed in various cancers including melanoma and lung cancer. In order to determine potential opportunities for immunotherapy in GBC, we evaluated neoantigens arising from somatic mutations. We predicted high-affinity MHC class I binding neoantigen peptides for each tumor. This resulted in the identification of roughly 15 neoantigens per tumor. Most predicted neoantigens were derived from frequently mutated genes in GBC which included TP53, ELF3, CTNNB1, ERBB2, ARID1A and CDKN2A. Using peripheral blood mononuclear cells (PBMCs) from HLA-matched healthy donors, we were able to determine the ability of mutant peptides to activate T-cells. Three mutant ELF3 peptides, two mutant ERBB2 peptides and one mutant TP53 peptide were indeed found to activate T-cells and can be used as potential cancer vaccines. We also identified several actionable targets in GBC based on comprehensive characterization of genomic alterations. Up to 20% of all tumors in our study were found to have actionable targets based on available approved targeted therapies (OncoKB). We also identified neoantigens that can be pursued for developing immunotherapy strategies to treat gall bladder cancers (Figure 2).
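The first step of the neoantigen evaluation described above, listing the candidate mutant peptides that span a somatic missense mutation, can be sketched as follows. This is a minimal illustration only, not the pipeline used in the study: the function name, the 9-residue window and the toy protein sequence are all assumptions, and the MHC class I binding-affinity prediction that would rank these peptides is not shown.

```python
def mutant_peptides(protein: str, position: int, alt: str, k: int = 9) -> list[str]:
    """Return every k-mer of the mutated protein that contains the mutated residue.

    protein  -- wild-type amino acid sequence (one-letter codes)
    position -- 1-based position of the substituted residue
    alt      -- the substituted (mutant) amino acid
    k        -- peptide length; 9 is an illustrative choice, not the study's setting
    """
    if not 1 <= position <= len(protein):
        raise ValueError("position lies outside the protein")
    idx = position - 1
    mutated = protein[:idx] + alt + protein[idx + 1:]
    # Window start positions are clipped so every window stays inside the protein.
    first = max(0, idx - k + 1)
    last = min(idx, len(mutated) - k)
    return [mutated[start:start + k] for start in range(first, last + 1)]


# Hypothetical 20-residue toy sequence with a substitution at position 10.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
for peptide in mutant_peptides(toy_protein, position=10, alt="R"):
    print(peptide)
```

Each peptide returned this way would then be scored against the patient's HLA alleles by a binding predictor of choice, and only the high-affinity binders kept as candidate neoantigens.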
By studying a diverse set of GBC samples across geographically diverse populations, our study has identified novel potential cancer vaccine candidate genes for treating GBC. This is an important milestone in the ongoing global effort to find biomarkers of translational significance. By selecting ethnically (and genetically) diverse population groups, this study further underscores the importance of incorporating genomic data analysis to identify candidate marker genes for diagnostic as well as therapeutic applications. This should lead to better patient outcomes in the clinic through the use of approved targeted therapies.
1 MedGenome Inc., Foster City, California, USA, 2 Jiwaji University, Gwalior, Madhya Pradesh, India, 3 QIMR Berghofer Medical Research Institute, Brisbane, Australia, 4 SciGenom Research Foundation, Chennai, Tamil Nadu, India
By Dr. Jing Wang, Project manager, Bioinformatics, MedGenome Inc
In 2013, single-cell sequencing was selected as the method of the year to highlight its ability to sequence DNA and RNA in individual cells. The advantages of such high-resolution sequencing are that it unveils previously unknown cell population heterogeneity and enables more accurate analysis. The high degree of heterogeneity in tumor tissue is widely considered to relate to the mechanisms of tumorigenesis and metastasis. Traditional sequencing methods can only profile cell populations and report the average signal across a group of cells. Now, with the help of single-cell sequencing and analysis, researchers can explore the heterogeneity among individual cells, detect rare cell types, study immune processes and do much more (Figure 1).
Single-cell techniques have improved dramatically since then, together with the emerging next-generation sequencing platforms. Commercially available single-cell kits such as inDrops, 10X Genomics, Drop-seq and SMART-seq use two major methods to reach single-cell resolution: droplet encapsulation and plate-based isolation. More and more single-cell applications have entered the market, including transcriptome sequencing, TCR/BCR sequencing, ATAC-seq and even spatial gene expression. At MedGenome US research group, we offer various levels of analysis for single-cell applications. We can perform the standard Cell Ranger analysis using our intensive computing resources or AWS. The analysis results include a gene expression matrix for transcriptome sequencing, clonotypes for TCR/BCR sequencing and more. Further downstream analyses, such as cell type assignment and clonal expansion studies, are also offered with sophisticated pipelines developed in house.
Along with these individual applications, multi-modal analyses are another big trend in the single-cell field. In 2019, single-cell multi-modal omics was selected as the method of the year. It makes the most of your samples by measuring multiple data types simultaneously from the same cell. Common multi-modal applications include antibody-tagging CITE-seq, overlaying transcriptome sequencing with TCR/BCR and overlaying transcriptome sequencing with ATAC-seq. Antibody-tagging CITE-seq can help researchers understand protein-level expression in a quantitative way together with transcriptome expression. It can also be used to tag multiple samples and combine them into one pool to construct a single library, which is more cost effective. Overlaying transcriptome sequencing with TCR/BCR allows researchers to study clonotype distribution across different cell types and to track the abundance of specific clonotypes. Overlaying transcriptome sequencing with epigenome ATAC-seq enables researchers to gain deeper insights into gene regulation mechanisms. At MedGenome US research group, we offer customized multi-modal analyses consisting of custom pipeline development, publication-ready visualization and knowledge-based data interpretation.
More and more bioinformatics tools are being developed to meet complicated single-cell analysis needs. With tumor and normal single-cell transcriptome datasets, researchers can use tools such as InferCNV to understand CNVs. Complex cell communications can also be characterized using tools like CellPhoneDB, which predicts communicating pairs of cells. With the unique advantages of single-cell techniques, we believe studies in various fields of biology will become easier and more accurate.
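As a rough illustration of overlaying TCR/BCR data on the transcriptome, the sketch below joins two per-cell annotations by cell barcode and tabulates how many cells of each clonotype fall into each cell type. It is a simplified example under stated assumptions: the function name, the dictionary inputs, the barcodes and the cell-type labels are all hypothetical, and real pipelines would read these annotations from the outputs of the upstream analysis.

```python
from collections import Counter, defaultdict


def clonotype_by_cell_type(cell_types: dict[str, str],
                           clonotypes: dict[str, str]) -> dict[str, Counter]:
    """Count cells per clonotype within each cell type.

    cell_types -- cell barcode -> assigned cell type (from the transcriptome)
    clonotypes -- cell barcode -> clonotype ID (from TCR/BCR sequencing)
    Only barcodes present in both inputs are counted.
    """
    table: dict[str, Counter] = defaultdict(Counter)
    for barcode, clonotype in clonotypes.items():
        cell_type = cell_types.get(barcode)
        if cell_type is not None:
            table[cell_type][clonotype] += 1
    return dict(table)


# Hypothetical toy annotations keyed by made-up barcodes.
toy_cell_types = {"AAAC": "exhausted T", "AAAG": "exhausted T",
                  "AACT": "regulatory T", "AAGT": "memory T"}
toy_clonotypes = {"AAAC": "clonotype1", "AAAG": "clonotype1",
                  "AACT": "clonotype2", "TTTT": "clonotype3"}

for cell_type, counts in clonotype_by_cell_type(toy_cell_types, toy_clonotypes).items():
    print(cell_type, dict(counts))
```

A clonotype whose count within a cell type is well above one is a candidate for clonal expansion in that population, which is the kind of question this overlay is meant to answer.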
References
Tang et al., The single-cell sequencing: new developments and medical applications. Cell Biosci 9, 53 (2019). https://doi.org/10.1186/s13578-019-0314-y
By Dr. N. Soumittra, Disease Head – Ophthalmology, MedGenome Labs
Medical Big Data
Big data and big data analytics are buzzwords that we have been hearing for the past few years, and they have relevance in all fields and specialties. In medicine, clinical documentation and analysis have long been meticulous and exhaustive, contributing to major discoveries in associating diseases with genes, understanding disease epidemiology and generating and testing hypotheses. Advances in computational science and data processing have streamlined the management of medical big data, creating opportunities to improve the health care system through accurate prognostication and disease management. The sources of medical big data are electronic medical records, medical imaging, clinical registries, large clinical trials, large epidemiological studies and administrative claim records. Medical big data applications include rational clinical decision-making, predictive or prognostic modelling of disease progression, disease surveillance, public health and research.
MedGenome has taken a lead to create disease-specific knowledgebases using large clinical datasets to initiate hypothesis-driven research in complex diseases.
Ophthatome
Ophthatome – a knowledgebase for ophthalmic disease research was launched in April 2018. This knowledgebase of ocular diseases is a comprehensive collection of clinical, phenotype and biochemical data providing researchers and clinicians with a platform to design studies that address critical unmet needs in eye disorders. Ophthatome currently contains curated clinical and phenotype data of 581,466 cases that includes 524 disease types and 1800 disease subtypes, covering 35 different eye parts and more than 40 clinical variables. Nearly half of the total cohort have longitudinal data with a maximum of five-year follow-up.
The searchable interface enables performing complex queries to select specific disease cohorts based on demographics, disease types and subtypes, disease course or severity, specific tissues affected by the disease, drug response and many other clinical and phenotypic parameters.
The knowledgebase provides options to select cohorts with specific well-defined quantitative and qualitative traits apart from disease types and subtypes. The availability of clinically well-defined disease cohorts facilitates powerful genomic, pharmacogenomic and clinical research to discover novel biology in ocular diseases.
The database will be continuously updated with new and follow-up cases as they are registered in the EMR of Narayana Nethralaya.
Ophthatome – Additional Note
Ophthatome is developed by MedGenome in collaboration with Narayana Nethralaya, a tertiary eye care hospital and research institute in Bengaluru. The Ophthatome knowledgebase is built on the electronic medical record data.
Reach out to MedGenome, your trustworthy partner in research and development of Advanced Pharma Solutions.
The COVID-19 pandemic had infected over 23 million individuals and claimed over 800,000 lives globally as of August 2020. SARS-CoV-2, the virus that causes COVID-19, belongs to the coronavirus family and shares 79% genome sequence identity with SARS-CoV. The spike antigen used by the virus to enter host cells became the prime target for immediate vaccine development efforts because prior work on SARS-CoV showed that neutralizing antibodies against the spike antigen protected mice and chimps against new infection (1, 2). Currently, over 166 vaccines are in development against SARS-CoV-2, which can be divided into two major categories: nucleic acid (DNA/RNA) vaccines and protein-based vaccines (antibodies, viral proteins, inactivated/attenuated viruses and virus-like particles) (3, 4). Several vaccine candidates, namely RNA-based delivery of spike protein (Moderna), RNA-based delivery of the receptor-binding domain (RBD) of spike protein (Pfizer/BioNTech), DNA-based delivery of spike protein (Inovio), non-replicating adenovirus-based delivery of spike protein (Oxford/AstraZeneca and CanSino Biologics) and inactivated virus (Sinovac), are in Phase-I or have completed Phase-I/II trials (5-9). Preliminary data from some of these trials show the induction of neutralizing antibodies against the spike antigen. T-cell immunity is relatively less robust in treated individuals. At present, efficacy data are lacking to support whether the induction of neutralizing antibodies alone will be sufficient to protect individuals from new infections.
Several features of SARS-CoV-2 infection are unique and do not follow the path of other respiratory viruses. For example, infected individuals can remain asymptomatic while carrying a high viral load, thereby becoming a potent source of viral transmission. Further, the immune response against SARS-CoV-2 is skewed towards a TH1/TH2 CD4 T-cell response, which results in severe immune toxicity in 15-20% of infected individuals (12, 13). Based on studies in other respiratory viruses, the development of protective immunity against SARS-CoV-2 in vaccinated individuals may face certain challenges. First, the mucosal antibody response against respiratory viruses is short-lived and shows a similar trend in SARS-CoV-2, raising the concern of whether an antibody response is sufficient for long-term protection (10). Second, some vaccinated individuals can experience life-threatening immune toxicity when exposed to the virus (11). Our understanding of how the immune system interacts with the vaccine, and how this interaction will translate to a response during active infection, remains rudimentary. This is a source of concern especially in older individuals, who are susceptible both to infection and to immune toxicity but need the vaccine most urgently. A good vaccine will engage both T-cell and B-cell immunity to provide immediate pathogen clearance and induce memory for long-term protection.
B and T-Cells in Adaptive Immunity
Historically, the B-cell memory arm responsible for pathogen-specific neutralizing antibodies has been characterized in greater detail in vaccine development studies. Figure 1 (left) demonstrates how B-cells secreting IgM molecules provide short-term clearance of pathogens and pathogen-infected cells, while concurrently maturing into IgG-secreting plasma cells that confer long-term protection. The kinetics of the T-cell arm (Figure 1, right) follow a similar profile: antigen-experienced T-cells expand in number, differentiate into a killer phenotype, and clear infected cells. Following clearance of the infection, T-cells persist as resident memory cells in the tissues and as effector memory cells in circulation, becoming sentinels to prevent future infections (1). While the B-cell arm of adaptive immunity has been widely emphasized in vaccine development, T-cell mediated immunity has remained underexplored, primarily due to a lack of the right technologies. However, significant technological advancements in the last decade have galvanized the study of T-cell mediated immunity in human diseases.
The activation of T-cells is MHC-peptide dependent. T-cell receptors (TCRs) on CD8+ cytotoxic T-cells and CD4+ helper T-cells bind 9-15-mer peptide fragments in complex with MHC (referred to as HLA in humans). These peptides are recognized as foreign (non-self) and activate T-cells to generate protective immunity. A large diversity of MHC genes and their polymorphic variants bind millions of peptide fragments from proteins and non-protein antigens and present them to a large diversity of T-cells expressing ~10^6 to 10^9 unique TCRs. A single peptide can mobilize and activate many T-cells, each expressing a unique TCR. Identifying a good TCR-MHC-peptide pair that can clear pathogens is the holy grail of cellular immunology. Figure 2 is a schematic of MedGenome’s OncoPept platform that identifies potent MHC-peptide-TCR combinations for efficient pathogen clearance and protective immunity.
T-Cells and COVID-19
In early April, the OncoPept team conducted their very first immune assays with peptide pools from the spike antigen of SARS-CoV-2 and were surprised to observe pre-existing T-cell immunity in healthy donors who were unexposed to the virus. Sequence alignment analysis revealed the possibility that a part of the global population may be protected against the current pandemic as a result of pre-existing immunity against other “common cold viruses” in the coronavirus family. In fact, a recent study in Cell reports that many healthy individuals have both CD4+ and CD8+ T-cells that elicit an antigenic response to the new coronavirus even though they have never been exposed to SARS-CoV or MERS (2).
Going back to clues on de novo CD8 immunity against SARS-CoV-2, earlier infection by SARS-CoV can perhaps shed some key insights since it is one of the closest homologs in the phylogenetic tree. Patients who had recovered from SARS-CoV back in 2003 had indeed presented an antibody response that faded within two or three years. Interestingly these patients also elicited detectable virus-specific robust T cell response 10-17 years after the infection had disappeared (3, 4). These startling pieces of evidence support the notion that perhaps targeting T-cell driven immunity may be a key to a robust long-term immune protection against the COVID-19 pandemic in the population (5).
The OncoPept platform was used to assess CD8 T-cell-targeted immunity to the spike protein, which was demonstrated to be highly immunogenic in SARS-CoV studies (6). The algorithm identified CD8 T-cell epitopes restricted to 23 HLAs. Figure 4 shows that certain HLAs were restricted to a small number of peptides, whereas other HLAs presented many peptides. If an HLA presents only a few peptides from a pathogenic organism, the immune system may not register its presence and will fail to mount a protective response. Our prediction analysis indicated that individuals harboring HLA-A*03 or A*23 will present fewer peptides for immune recognition, so individuals carrying these HLAs may be more susceptible to infection. Interestingly, a study showed that individuals carrying HLA-A*03 and A*23 were indeed more susceptible to SARS-CoV than individuals carrying HLA-C*15 and HLA-A*33, which bound many more peptides (Figure 4). The data are purely correlational but support the idea that CD8 T-cell immunity is critical for developing a robust protective SARS-CoV-2-specific immune response.
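The per-HLA comparison described above comes down to counting how many distinct predicted epitopes each allele presents and flagging alleles that present only a few. The sketch below shows that tally on made-up input; it is not the OncoPept algorithm. The function names, the threshold and the placeholder peptide strings are assumptions for illustration, and the counts it prints say nothing about real alleles.

```python
from collections import defaultdict


def peptides_per_hla(predictions: list[tuple[str, str]]) -> dict[str, int]:
    """Count distinct predicted epitopes per HLA allele.

    predictions -- (hla_allele, peptide) pairs from any epitope predictor;
                   repeated pairs are counted once.
    """
    seen: dict[str, set[str]] = defaultdict(set)
    for hla, peptide in predictions:
        seen[hla].add(peptide)
    return {hla: len(peptides) for hla, peptides in seen.items()}


def low_presenters(counts: dict[str, int], threshold: int) -> list[str]:
    """Return alleles presenting fewer than `threshold` distinct peptides."""
    return sorted(hla for hla, n in counts.items() if n < threshold)


# Placeholder predictions: the peptide strings are invented, not real epitopes.
toy_predictions = [
    ("HLA-A*03", "PEPTIDEAA"), ("HLA-A*03", "PEPTIDEAA"),
    ("HLA-C*15", "PEPTIDEAB"), ("HLA-C*15", "PEPTIDEAC"), ("HLA-C*15", "PEPTIDEAD"),
]
toy_counts = peptides_per_hla(toy_predictions)
print(toy_counts)
print(low_presenters(toy_counts, threshold=2))
```

The threshold separating "few" from "many" peptides is a judgment call that would need to be set against the full distribution across all alleles analyzed.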
Possible Challenges in Designing COVID-19 Vaccines
Published studies from Phase-I/II vaccine clinical trials by different groups have indicated a milder CD8 T-cell response against the spike antigen (8, 9). It is possible that potent T-cell response may lie in regions outside the spike protein as shown by the presence of memory T-cell responses against nucleocapsid protein and the non-structural proteins NSP7 and NSP13 gene products from the ORF-1 region in convalescent individuals (4). Identification and characterization of these epitopes and their inclusion in a selected vaccine cocktail may yield a more robust and targeted T-cell protective effect.
Another important consideration when using the full-length spike antigen rather than “selected epitopes” with immunogenic potential is the phenomenon of epitope dominance. The use of a full-length antigen may result in immuno-dominant T-cell responses outcompeting sub-dominant ones. In many scenarios, these immuno-dominant epitopes saturate MHC molecules on cells and mobilize large families of TCRs with reduced cytotoxic potential. This is a common mechanism employed by viruses as a means of evading the immune system. The response can be made broader and more potent through a rational selection of sub-dominant T-cell epitopes, or of immuno-dominant epitopes that mobilize a productive and robust cytotoxic response, for a well-rounded T-cell immunity (13, 14).
An important hallmark of SARS-CoV-2 infection is viral pneumonia accompanied by pulmonary inflammation and edema characterized by eosinophilic infiltrates. A current study highlights the critical role of host TH17 inflammatory responses in mediating this process. Elevated TH17 responses were also observed previously in MERS-CoV and SARS-CoV patients. Also, studies in experimental animals using vectored vaccines have reported substantial immune enhancement in both the lungs and liver of experimental animals characterized by eosinophilic infiltrates (15, 16). Therefore, careful parsing of T-cell epitopes is much needed to prevent the presentation of TH17-inducing epitope elements increasing the chances of immune toxicity and disease susceptibility.
Conclusion
To conclude, a wide array of respiratory viruses induce severe pneumonia, bronchitis, and even death following infection. Despite this immense clinical burden, there is a lack of efficacious vaccines with long-term therapeutic benefit. Most current vaccination strategies rely on the generation of broadly neutralizing antibodies; however, the mucosal antibody response to many respiratory viruses is short-lived and declines with age. In contrast, several studies on respiratory viruses have shown the presence of robust virus-specific CD8 T-cell responses that have been shown to last for decades. Therefore, vaccine designs for emerging respiratory viruses need to consider the rational inclusion of CD8 epitopes to confer long-term resistance. The OncoPept platform developed at MedGenome combines computational and experimental methods to mine strong CD8 immunomodulatory antigens with therapeutic utility across disease areas spanning respiratory and other infectious diseases, cancer, and autoimmunity.
References
Actor JK. Introductory Immunology: Basic Concepts for Interdisciplinary Applications. 1st ed. June 2014.
Grifoni A, Weiskopf D, Ramirez SI, Mateus J, Dan JM, Moderbacher CR, et al. Targets of T Cell Responses to SARS-CoV-2 Coronavirus in Humans with COVID-19 Disease and Unexposed Individuals. Cell. 2020;181(7):1489-501 e15.
Ng OW, Chia A, Tan AT, Jadi RS, Leong HN, Bertoletti A, et al. Memory T cell responses targeting the SARS coronavirus persist up to 11 years post-infection. Vaccine. 2016;34(17):2008-14.
Le Bert N, Tan AT, Kunasegaran K, Tham CYL, Hafezi M, Chia A, et al. SARS-CoV-2-specific T cell immunity in cases of COVID-19 and SARS, and uninfected controls. Nature. 2020;584(7821):457-62.
Schmidt ME, Varga SM. The CD8 T Cell Response to Respiratory Virus Infections. Front Immunol. 2018;9:678.
Subbarao K. SARS-CoV-2: A New Song Recalls an Old Melody. Cell Host Microbe. 2020;27(5):692-4.
Jackson LA, Anderson EJ, Rouphael NG, Roberts PC, Makhene M, Coler RN, et al. An mRNA Vaccine against SARS-CoV-2 – Preliminary Report. N Engl J Med. 2020.
Folegatti PM, Ewer KJ, Aley PK, Angus B, Becker S, Belij-Rammerstorfer S, et al. Safety and immunogenicity of the ChAdOx1 nCoV-19 vaccine against SARS-CoV-2: a preliminary report of a phase 1/2, single-blind, randomised controlled trial. Lancet. 2020;396(10249):467-78.
Mulligan MJ, Lyke KE, Kitchin N, Absalon J, Gurtman A, Lockhart S, et al. Phase 1/2 study of COVID-19 RNA vaccine BNT162b1 in adults. Nature. 2020.
Walsh EE, Frenck R, Falsey AR, Kitchin N, Absalon J, Gurtman A, et al. RNA-Based COVID-19 Vaccine BNT162b2 Selected for a Pivotal Efficacy Study. medRxiv. 2020.
Long QX, Tang XJ, Shi QL, Li Q, Deng HJ, Yuan J, et al. Clinical and immunological assessment of asymptomatic SARS-CoV-2 infections. Nat Med. 2020;26(8):1200-4.
Scherer A, Salathe M, Bonhoeffer S. High epitope expression levels increase competition between T cells. PLoS Comput Biol. 2006;2(8):e109.
Im EJ, Hong JP, Roshorm Y, Bridgeman A, Letourneau S, Liljestrom P, et al. Protective efficacy of serially up-ranked subdominant CD8+ T cell epitopes against virus challenges. PLoS Pathog. 2011;7(5):e1002041.
Ruckwardt TJ, Luongo C, Malloy AM, Liu J, Chen M, Collins PL, et al. Responses against a subdominant CD8+ T cell epitope protect against immunopathology caused by a dominant epitope. J Immunol. 2010;185(8):4673-80.
Hotez PJ, Bottazzi ME, Corry DB. The potential role of Th17 immune responses in coronavirus immunopathology and vaccine-induced immune enhancement. Microbes Infect. 2020;22(4-5):165-7.
Wu D, Yang XO. TH17 responses in cytokine storm of COVID-19: An emerging target of JAK2 inhibitor Fedratinib. J Microbiol Immunol Infect. 2020;53(3):368-70.
By Dr. Malini Manoharan – Bioinformatics Scientist-II, MedGenome Labs
Cancer immunologists scooped the 2018 Nobel Prize in medicine for pioneering treatments that unleash the body’s own immune system to attack cancer cells. These treatments represent a completely new principle: unlike previous strategies that target the cancer cells, they target the brakes (the checkpoints) of the host immune system.
Immunotherapy based on checkpoint inhibitors has shown astounding clinical success, with countless patients across varied tumor types showing a pronounced clinical response; however, many more patients show reduced or no clinical benefit. Understanding the complexity and diversity of the tumor microenvironment in the context of its immune composition can greatly improve patient stratification. To this end, MedGenome has developed OncoPeptTUME, a genomic solution that uses highly cell-type-specific, proprietary minimal gene expression signatures to characterize the composition of, currently, 8 different immune cell types. The expression of the genes in a given signature is transformed into a cell-type-specific immune score that is used to quantitate the relative proportion of each cell type present in the tumor microenvironment (Figure 2).
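The signatures are proprietary and the exact transformation is not described here, so the following is only a minimal sketch of the general idea under a simple assumed scheme: z-score each gene across samples, then average the z-scores of a signature's genes in each sample. The marker genes, expression values, and function names are illustrative assumptions and are not the OncoPeptTUME signatures or method.

```python
from statistics import mean, pstdev


def zscore_by_gene(expr):
    """Standardize each gene's expression across samples."""
    standardized = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        # A gene with no variation carries no information; score it as 0.
        standardized[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return standardized


def signature_scores(expr, signatures):
    """Return one score per sample for each cell type (mean signature z-score)."""
    z = zscore_by_gene(expr)
    n_samples = len(next(iter(expr.values())))
    scores = {}
    for cell_type, genes in signatures.items():
        present = [g for g in genes if g in z]  # ignore genes missing from the data
        scores[cell_type] = [
            mean(z[g][i] for g in present) if present else float("nan")
            for i in range(n_samples)
        ]
    return scores


# Toy log-expression values for three tumor samples (made up for illustration).
expr = {
    "CD8A": [1.0, 5.0, 3.0],
    "CD8B": [2.0, 6.0, 4.0],
    "CD19": [4.0, 1.0, 1.0],
    "MS4A1": [5.0, 2.0, 2.0],
}
# Hypothetical two-gene signatures; real signatures would be larger and validated.
signatures = {"CD8_T": ["CD8A", "CD8B"], "B_cell": ["CD19", "MS4A1"]}

scores = signature_scores(expr, signatures)
for cell_type, per_sample in scores.items():
    print(cell_type, [round(s, 2) for s in per_sample])
```

Averaging z-scores keeps the score relative: it ranks samples by how strongly a signature is expressed compared with the cohort, rather than estimating absolute cell counts.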
Pan-cancer analysis of the TCGA data using OncoPeptTUME revealed immunogenic features that impact prognosis in human cancers. Our analysis revealed that CD8+ T cells expressing higher levels of anergy and exhaustion markers, which are hallmarks of dysfunctional T cells, were enriched in the deceased group compared to the alive group. The recently published analysis (Manoharan et al., 2018) reveals critical determinants of long-term survival and points to an integrated approach for selecting patients who will benefit from cancer immunotherapy.
Manoharan Malini, Mandloi Nitin, Priyadarshini Sushri, Patil Ashwini, Gupta Rohit, Iyer Laxman, Gupta Ravi, Chaudhuri Amitabha (2018). A Computational Approach Identifies Immunogenic Features of Prognosis in Human Cancers. Front. Immunol., 9.
Dr. Ramesh Menon, Senior Bioinformatics Scientist – II, MedGenome Labs
It’s widely believed that South Asians are born with a high risk for several non-communicable diseases (NCDs) such as diabetes and cardiovascular diseases, a risk that is often attributed to unhealthy lifestyle and environmental factors. For example, the higher prevalence of type 2 diabetes in Indians is often over-attributed to overweight/obesity. How many of us know that in India a higher incidence of type 2 diabetes is reported even in people with lower BMI? [1,2]
Compared to Europeans, South Asians have, on average, lower muscle mass, which could be due to long-term adaptation to climate [2]. Genetic studies in South Asian populations have found increased selection of a gene encoding myostatin, a protein that inhibits skeletal muscle growth in utero through poor placental glucose uptake [3]. Could this be a factor in the higher incidence of type 2 diabetes in South Asians? Recent studies suggest the answer is yes [3]. In fact, South Asians may have a specific tendency for fat accumulation in the liver (ectopic hepatic fat) and for intramyocellular fat deposition, which further disrupt insulin action [2,4,5]. In addition, the conversion from pre-diabetes to diabetes is alarmingly rapid in South Asians, as reported in a 10-year follow-up study by the Madras Diabetes Research Foundation [6]. We are yet to completely understand the reason behind this, but hepatic fat accumulation has been found to play an important role in post-dysglycaemia [6].
A typical genome-wide association study (GWAS) attempts to identify trait-associated genetic variants of moderate or high frequency by assaying several thousand to millions of genomic markers across a large number of samples. One important study, by Saxena and colleagues in 2013, performed a GWAS in Punjabi Sikhs, a sub-population with a high prevalence of type 2 diabetes and cardiovascular disease despite low obesity rates, a moderate non-vegetarian diet, and strict tobacco abstinence [7]. The authors identified a previously unreported genetic variant (rs9552911) in the SGCG (skeletal muscle-expressed sarcoglycan) gene, which has a strong association with type 2 diabetes in Punjabi Sikhs. The findings were confirmed by a replication study.
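At its core, a case-control GWAS repeats one association test per marker. The sketch below shows a basic allelic chi-square test for a single SNP, as an illustration only: the allele counts are invented (they are not taken from Saxena et al.), the function name is hypothetical, and real analyses typically use regression models with covariates such as age, sex, and ancestry. The 5e-8 cutoff is the conventional genome-wide significance threshold.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # conventional genome-wide significance threshold


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 d.f.) on a 2x2 table of allele counts."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio


# Invented allele counts: alternate/reference alleles in cases, then in controls.
chi2, p_value, odds_ratio = allelic_association(60, 40, 40, 60)
is_genome_wide_significant = p_value < GENOME_WIDE_ALPHA
print(f"chi2={chi2:.2f} p={p_value:.4g} OR={odds_ratio:.2f} "
      f"significant={is_genome_wide_significant}")
```

With only 200 alleles the toy SNP falls far short of genome-wide significance even though its odds ratio is large, which is why GWAS need large sample sizes.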
In a larger multi-ethnic study published very recently in Nature Genetics, Vujkovic and colleagues discovered 558 independent SNPs associated with T2D. Interestingly, 21 of the 558 markers were ancestry informative markers (AIMs) present in Europeans [8]. However, they could not find a population-specific type 2 diabetes-associated marker in South Asians, which may be due to the limited representation of South Asian sub-populations in the public-domain data used in the study. In India, the major limiting factor is the lack of systematic recording of the clinical, biochemical, and anthropometric parameters of patients visiting the clinic. Except for a few well-established hospitals or clinics that are active in research, our primary health centres, where most patients get treatment, are not equipped to capture this important information. This limits the application of current computational tools and statistical methods to deeper analyses combining genotype and phenotype data. Models like the UK Biobank are good examples that can be followed in India.
References
Narayan and Kanaya (2020). Why are South Asians prone to type 2 diabetes? A hypothesis based on underexplored pathways. Diabetologia
Pomeroy et al. (2019). Ancient origins of low lean mass among South Asians and implications for modern type 2 diabetes susceptibility. Scientific Reports
Metspalu et al. (2011). Shared and unique components of human population structure and genome-wide signals of positive selection in South Asia. American Journal of Human Genetics
Trouwborst et al. (2018). Ectopic Fat Accumulation in Distinct Insulin Resistant Phenotypes; Targets for Personalized Nutritional Interventions. Frontiers in Nutrition
Gujral et al. (2019). Diabetes in Normal-Weight Individuals: High Susceptibility in Nonwhite Populations. Diabetes Care
Mohan et al. (2015). Incidence of Diabetes and Prediabetes and Predictors of Progression Among Asian Indians: 10-Year Follow-up of the Chennai Urban Rural Epidemiology Study (CURES). Diabetes Care
Saxena et al. (2013). Genome-wide association study identifies a novel locus contributing to type 2 diabetes susceptibility in Sikhs of Punjabi origin from India. Diabetes
Vujkovic et al. (2020). Discovery of 318 new risk loci for type 2 diabetes and related vascular outcomes among 1.4 million participants in a multi-ancestry meta-analysis. Nature Genetics
Contact MedGenome for helpful insights into genetic studies. A trusted partner in your genomic research and development.
Genetic testing can prove to be a crucial element in clinical diagnosis, as it offers critical insights that can help a clinician manage, treat, and sometimes even prevent an inherited condition. A specific diagnosis and treatment regimen can be determined once the clinician runs a specific array of genetic tests after learning the detailed family history of the patient.
Reporting the one or two variants specific to the patient's phenotype out of the few lakh (several hundred thousand) variants produced by an NGS assay involves steps that strategically eliminate the variants of least importance. Variants are excluded if they have a high minor allele frequency (that is, they are common in the control population), have benign in silico predictions, or are not significant to the phenotype mentioned. However, the differential diagnosis, the clinical indications manifested, and other details play an important role in identifying the causative variant.
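The filtering steps described above can be sketched as follows. This is a minimal illustration, assuming a 1% minor allele frequency cutoff and a simple "benign"/"damaging" prediction label; the `Variant` fields, the `shortlist` function, the cutoff, and the gene names are all hypothetical and do not describe a specific clinical pipeline.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    maf: float       # minor allele frequency in a control population
    prediction: str  # in silico call, e.g. "damaging" or "benign"


def shortlist(variants, phenotype_genes, maf_cutoff=0.01):
    """Drop common and predicted-benign variants, then split by phenotype match."""
    rare_damaging = [
        v for v in variants
        if v.maf <= maf_cutoff and v.prediction != "benign"
    ]
    matched = [v for v in rare_damaging if v.gene in phenotype_genes]
    unmatched = [v for v in rare_damaging if v.gene not in phenotype_genes]
    return matched, unmatched


# Toy variant list with placeholder gene names.
variants = [
    Variant("GENE_A", 0.25, "damaging"),    # common in controls: excluded
    Variant("GENE_B", 0.001, "benign"),     # predicted benign: excluded
    Variant("GENE_C", 0.0005, "damaging"),  # rare, damaging, phenotype gene
    Variant("GENE_D", 0.002, "damaging"),   # rare, damaging, other gene
]
matched, unmatched = shortlist(variants, phenotype_genes={"GENE_C"})
print("phenotype-matched:", [v.gene for v in matched])
print("for review:", [v.gene for v in unmatched])
```

Rare, damaging variants outside the suspected gene list are returned separately rather than discarded, a design choice that leaves room for revisiting the differential diagnosis.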
We analysed the case of a 21-year-old male, born of a consanguineous marriage, who had had difficulty walking since childhood. His other findings included difficulty getting up from the floor, weakness in the left lower limb for the past 5 years followed by weakness in the right lower limb for the past 4 years, slipping of footwear without awareness, outward deviation of the feet for the past 4 years, tremors in both hands for the past 2-4 months, malformed ears, bifacial weakness, pes cavus, hammer toes, reduced deep tendon reflexes, and elevated levels of creatine phosphokinase (CPK). He also had a family history: an elder sister who was similarly affected. The clinician suspected hereditary spastic paraparesis. We analysed the NGS data for all the genes related to hereditary spastic paraparesis but did not find anything significant. However, we noted a rare homozygous missense variant in the ARG1 gene, which causes argininemia.
Arginase deficiency is an autosomal recessive inborn error of metabolism caused by a defect in the final step in the urea cycle, the hydrolysis of arginine to urea and ornithine. Primarily, urea cycle disorders are characterized by the triad of hyperammonemia, encephalopathy, and respiratory alkalosis. Sometimes, arginase deficiency may be misdiagnosed as static spastic diplegia (cerebral palsy). It should be noted that arginase deficiency is one of the few treatable causes of spastic diplegia. (https://www.omim.org/entry/207800)
Get in touch with MedGenome, your reliable research and development partner for advanced pharmaceutical solutions.
For the majority of people, the mere mention of snakes conjures involuntary shivers! These stealthy critters have a forked tongue and unblinking eyes, and have either fangs that deliver venom to immobilize or kill prey, or strong muscles to asphyxiate it. Snakes have been around for millions of years, have used this time to become incredibly effective predators, and can be found on all continents except Antarctica. Beginning over 100 million years ago, snakes diverged from lizards, lost their legs and evolved into smaller and faster hunters to catch quick-moving prey. Rather than expend a great deal of energy to forage for food, many snakes developed venom, a complex chemical cocktail of proteins and enzymes designed to kill or incapacitate prey even before it is ingested. The toxins in these venoms have been refined over millions of years to target highly specific pathways that affect their prey’s vital bodily functions. Some toxins are neurotoxins, some disrupt hemostasis, and several others are cytotoxic. Some snakes are so dangerous that people die from encounters with them. According to the most recent report by the World Health Organization, about 5 million people are bitten by snakes and ~100,000 are killed annually.
There are more than 3000 identified species of snakes, of which over 600 are known to be venomous. India, where snakes are both feared and worshipped as mythological animals, has roughly 300 snake species, of which ~60 are venomous. Given that most of India’s population still lives in rural areas, encounters with snakes are quite frequent, with >45,000 snakebite-related deaths every year; these are only estimates, as snakebites are often underreported. The “big four” snakes of India (the spectacled cobra, common krait, Russell’s viper and saw-scaled viper) cause the most fatalities.
While antivenom, the only currently approved form of treatment for snakebites, is freely available in public hospitals, there are several issues with current antivenom manufacturing practices, which often result in antivenom that is poorly efficacious. One reason is that snake venom and its potency differ between species, and even between snakes of the same species from different regions. For instance, while doctors have been known to administer 2-3 vials of a certain antivenom against a species of snake in one part of India, more than 25 vials of the same serum are required to treat a victim bitten by the same species in another part of the country. Another reason for this lack of efficacy is the archaic technology used for antivenom manufacture. Antivenom manufacturers still use a technology that was first pioneered ~120 years ago (about 30 years before penicillin was discovered by Alexander Fleming). This method relies on milking the venom glands of snakes to extract venom. Small amounts of venom are then repeatedly injected into horses to create an immune response. Antibodies are then extracted from the blood and packaged as antivenom (with a few minor steps in between that help extend the antivenom shelf life). Needless to say, this is a laborious and expensive process. More importantly, over 70% of the antibodies in this antivenom cocktail do not target the toxins that cause the most damage in snakebite victims. This is because the antibodies extracted from the horse’s blood include not only those that recognize the snake toxins but also countless others that have no therapeutic effect on the snakebite, as they recognize bacteria, viruses, hay, dust, and other environmental stimuli the horse may have been exposed to. Therefore, such antivenoms are typically less potent and necessitate the administration of multiple doses per treatment.
Another consequence of this is that the antivenom can cause adverse reactions in the patient/victim including hyperallergic reactions such as serum sickness, kidney failure and anaphylactic shock, which can kill the snakebite victim if the snake does not do so first. Another critical drawback is the poor efficacy of such antivenoms – many snake venom toxins are small proteins and are poorly immunogenic and therefore are not attacked by the horse’s immune system.
Given these drawbacks, several alternative antivenom manufacturing methods have been proposed. One such approach that has gained increasing attention is the use of phage display technology. Phage display technology is a high-throughput approach to discover human antibodies specific to different antigens. Several antibodies discovered using phage display technology have been approved for use as drugs to treat a number of human diseases, ranging from cancers to autoimmune disorders. Using this approach, it is now feasible to develop humanized antibodies that can target key proteins, like potent toxins, to create effective and safe antivenom. In lieu of relying on snakes for venom, knowledge of the toxins and their genomic coding sequences for a given species will instead allow for the synthesis and expression of venom components using recombinant DNA technology. These can then be used as antigens for antibody discovery. A cocktail of such antibodies against the most potent toxins can be combined to yield a synthetic antivenom of a defined composition. Importantly, these humanized antibodies will not elicit an immune response from patients and can be produced using standard lab approaches for drug manufacture. Equally important, this approach will lead to a more humane and cost-affordable approach to antivenom manufacture as it does not require maintaining a collection of snakes for venom extraction or horses for antivenom development.
While the above-mentioned methods are superior for antivenom development, large-animal-based antivenom production using extracted snake venoms continues to be the standard practice. This is due to several factors, including socio-economic barriers, low funding for research initiatives, the complexity of developing an alternative treatment, and low economic incentives for pharmaceutical companies to develop antivenoms. In 2017, snakebite envenoming was added to the World Health Organization’s (WHO) list of Category ‘A’ Neglected Tropical Diseases, thereby bringing renewed attention and focus on promoting research and development efforts into novel snakebite antivenom therapies.
A significant hurdle in developing next-generation antivenom is the gap in our understanding of snake venom. Much of our current knowledge of snake venoms is based on proteomic studies, which have provided an incomplete picture of the venom components. Mass spectrometry of venom relies on a good database of proteins to accurately identify the constituent components. Given the limited genomic or transcriptomic reference datasets for venomous snakes, this database of venom proteins is not comprehensive.
Our impetus to get involved in snakebite and antivenom research was fueled by this gap in antivenom manufacture technology. Given MedGenome’s expertise in the NGS space, we leveraged this experience and utilized several genomics sequencing technologies including long-read, short-read sequencing platforms, optical mapping and chromosome conformation capture methods to produce the first high-quality reference genome of the Indian cobra. This study was recently published in Nature Genetics and was featured on the cover of the January 2020 issue. Besides the genome, we also published a comprehensive catalogue of venom genes for this medically important snake. An integrated analysis of genome, transcriptome and venom proteome of the Indian cobra revealed 12,346 genes that were expressed in the venom gland that included 139 toxin genes from 33 different toxin families. From this list, we identified 19 genes that were primarily expressed in the venom gland. Using proteomic data from the venom, we confirmed the presence of 16 of these toxins. It is likely that these toxins form the major components of this species’ venom and targeting these venom-specific toxins using synthetic antibodies should neutralize the major toxic effects. This information can be used for rational design and expression of toxins of interest using recombinant DNA technology. Recombinantly produced toxins can then be used for developing synthetic antibodies using phage display technology. Once identified and tested, the resulting synthetic human antibodies against the different toxins can be produced on a large scale and combined to yield a safe and effective antivenom. We envision such an antivenom can be manufactured in a cost-effective manner and be made more accessible across India. This approach will modernize antivenom development and set the stage for the generation of a broad spectrum antivenom against the ‘Big Four’ Indian snakes.
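The narrowing from all toxin genes to venom-gland-specific toxins confirmed in the venom proteome can be pictured as a simple filter followed by a set intersection. The sketch below is only an illustration of that logic: the tissue names, expression values, gene identifiers, and the 10-fold enrichment threshold are assumptions, and the criteria used in the published analysis may differ.

```python
def venom_specific_toxins(expression, toxin_genes, venom_proteome,
                          tissue="venom_gland", min_fold=10.0):
    """Return (venom-gland-enriched toxin genes, subset confirmed by proteomics)."""
    enriched = set()
    for gene in toxin_genes:
        levels = expression.get(gene, {})
        venom = levels.get(tissue, 0.0)
        others = [value for name, value in levels.items() if name != tissue]
        highest_other = max(others, default=0.0)
        # Enriched: expressed in the venom gland and well above every other tissue.
        if venom > 0 and venom >= min_fold * highest_other:
            enriched.add(gene)
    confirmed = enriched & venom_proteome
    return enriched, confirmed


# Toy expression table (arbitrary units) with placeholder gene identifiers.
expression = {
    "toxin_1": {"venom_gland": 500.0, "liver": 2.0, "kidney": 1.0},
    "toxin_2": {"venom_gland": 40.0, "liver": 35.0, "kidney": 30.0},
    "toxin_3": {"venom_gland": 900.0, "liver": 0.0, "kidney": 0.5},
}
toxin_genes = {"toxin_1", "toxin_2", "toxin_3"}
venom_proteome = {"toxin_1"}  # toxins detected by mass spectrometry (toy data)

enriched, confirmed = venom_specific_toxins(expression, toxin_genes, venom_proteome)
print("venom-gland enriched:", sorted(enriched))
print("confirmed in venom:", sorted(confirmed))
```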
Our study has given insights into previously unknown genetic structure and variations in venom genes within a given snake species. This study provides a useful genomic resource which will facilitate studies of venom biology, evolution, drug discovery and antivenom research in Asia and across the world.
Another aspect of our venture into snakes is the fact that, despite its deadly nature, venom is one of nature’s most beautiful paradoxes. By design, venom is meant to kill, and it does this job frighteningly quickly and efficiently! Yet, the same properties that make it deadly can also be harnessed to provide potent healing. Several components of venom often target the same molecules that medicines target to treat diseases. Indeed, out of the ~1,000 venom toxins that have been analyzed by scientists so far, about a dozen drugs have been developed and brought to market. There are already six drugs approved for use by the FDA (Food and Drug Administration) in the USA, all derived from venom. To date, the FDA has approved seven drugs derived from animal venom for conditions such as high blood pressure, heart conditions, chronic pain, and diabetes. Ten more are currently in clinical trials while many others are in pre-trial stages. And we have only barely scratched the surface: with an estimated 300,000 venomous animals found across the world and ~50-60 unique toxins in each species, there are ~20 million potential toxins, each with its own targets and effects, that remain unexplored. While snakes are our primary species of interest for venom research, we are also studying other venomous creatures, from venomous caterpillars to jellyfish, centipedes, scorpions and many more. Our goal is to catalogue the genomes and venom of these species and thus create a rich resource for drug discovery. Stay tuned for more updates on the fascinating world of venomous animals.
References:
Suryamohan K., et al, The Indian cobra reference genome and transcriptome enables comprehensive identification of venom toxins. Nat Genet 52, 106–117 (2020).
World Health Organization snakebite resource – https://www.who.int/health-topics/snakebite
Gutiérrez, J., Calvete, J., Habib, A. et al. Snakebite envenoming. Nat Rev Dis Primers 3, 17063 (2017).
Venoms to Drugs: Venom as a Source for the Development of Human Therapeutics. Ed. By Glenn F. King (https://doi.org/10.1039/9781849737876)
Phage display technology: George P. Smith and Valery A. Petrenko, Chemical Reviews 1997 97 (2), 391-410
By Papia Chakraborty, Assoc. Dir. and Head of Immuno-Oncology, Amit Chaudhuri, VP, R&D
Introduction
Cancer vaccines are an upcoming therapeutic modality that mobilizes the body’s own immune system to eradicate advanced tumors, and they hold tremendous promise for cancer patients unresponsive to checkpoint inhibitors.
Theoretically, targeting cancer with cancer vaccines has a clear advantage over other targeted therapies in that every cancer patient can be treated with a unique vaccine cocktail derived from the set of tumor-specific mutations present in that patient. Although conceptually attractive, the field of cancer vaccines has not delivered on this promise, as results from several Phase I/II clinical trials have shown. Surprisingly, in all clinical trials, patients treated with the vaccine cocktail mount an antigen-specific immune response, which in a large subset of patients fails to control tumor growth. There are many reasons for this disconnect between immune response and benefit: the quality and magnitude of the response, the lack of generation of T-cells that will eliminate tumor cells, the inability of primed and activated T-cells to enter the tumor compartment, the hostile immune-suppressive tumor microenvironment that blunts the T-cell response and, finally, the absence of T-cell targets on tumor cells.
Antigen Selection – one side of the story
A formidable challenge in the field of vaccines, in particular cancer vaccines, is identifying epitopes that will engage T-cells and induce a robust T-cell response. Thirty years of vaccine development in infectious diseases have taught us that it is relatively easier to produce a prophylactic vaccine than a therapeutic one. One of the early approved prophylactic vaccines, which significantly reduces the risk of developing human papillomavirus (HPV)-induced cervical cancer in young women, combines two proteins expressed by the virus at an early stage of infection. Other successful vaccines that protect us from, for example, influenza, yellow fever and polio use attenuated or killed viruses, exposing our immune system to a gamut of viral antigens and thereby ensuring that most individuals mount their own virus-specific immune response. Through these studies, many immunodominant epitopes restricted to specific HLAs were discovered. In contrast, a large body of work on cancer vaccines has yet to identify similar immunodominant cancer neoepitopes. It is an open question whether tumor cells can harbor immunodominant epitopes without being eliminated by the immune system.
The exposure of the immune system to cancer cells for a prolonged period before disease manifestation may act against the emergence of immunodominant epitopes in cancer, in contrast to infections, which occur on a much shorter time scale. It is likely that tumor antigens that mount a strong immune response at the time of cancer initiation lead to the elimination of the tumor cells, thereby erasing the immunodominant epitope at the very outset. By contrast, tumors that have broken the immune barrier and manifested as full-blown disease may have eliminated, anergized or suppressed the relevant population of tumor-reactive T-cells. The field of cancer vaccines requires two different sets of cancer neoepitopes: one that the immune system has encountered but failed to mount a sustained response against due to tumor-mediated immune suppression, and a second set that T-cells may not have encountered, to create an army of activated tumor-directed T-cells. The efficacy of checkpoint blockade antibodies can be enhanced by combining the reinvigoration of antigen-experienced suppressed T-cells with the mobilization of antigen-inexperienced T-cells to become antigen-specific. Finding cancer epitopes that mount both recall (epitopes already encountered by T-cells in the patient) and de novo (epitopes not seen by T-cells earlier) responses may favor the generation of a broad tumor-directed host immune response. Therefore, assays to identify robust T-cell epitopes are a priority area, and a variety of technologies to identify antigen-specific T-cells are being pursued in both academia and industry.
Antigen-specific T-cells (other side of the story)
For cancer neoepitopes to mount an effective anti-tumor response, they must engage T-cells, expand them and equip them with ammunition to kill the tumor cells. Antigen-specific T-cell activation starts with the engagement of a neoepitope, in the form of a mutated peptide-MHC (peptide-major histocompatibility complex, pMHC) complex, with the T-cell receptor (TCR). The pMHC complex is presented on antigen-presenting cells (APCs) and on tumor cells. Engagement with a neoepitope-presenting APC activates T-cells, making them functionally competent to kill tumors (cytolytic T-cells, CTLs) or inducing them to become helper T-cells that lack cytolytic activity but perform other functions related to long-term disease control. Neoepitope cocktails that generate both cytolytic T-cells and helper T-cells may provide long-term survival benefit. Discovering TCRs that engage a pMHC complex to produce CTLs is difficult because one pMHC complex will engage and expand many unique T-cell clones. One has to characterize the function and phenotype of individual T-cells by interrogating a variety of intracellular and cell-surface markers and finally test the T-cells for their tumor-killing potential. One approach is to isolate all T-cells that bind a specific pMHC complex (using multimers) and analyze them for function, such as their cell-killing activity. A second approach, popularized by the 10x Genomics single-cell sequencing platform, is to label individual T-cells using a peptide-MHC multimer and characterize them transcriptionally for functional phenotype. A third approach is to combine multimer staining with cell-surface marker staining (CITE-Seq) to identify antigen-specific polyfunctional T-cells. Identifying T-cell receptors that engage a pMHC complex to deliver a protective T-cell response is an active area of research and can bring novel T-cell-directed therapies to treat cancer.
Besides the therapeutic use of T-cells, antigen-specific modulation of the host T-cell repertoire is turning out to be a useful biomarker in cancer immunotherapy. Combining ex vivo and in vivo analysis of T-cell dynamics can be a powerful method to assess the effect of vaccines on patients. The expansion and contraction of specific T-cell clones over the course of therapy can give valuable insights into the interactions between T-cells and tumor cells and their impact on long-term disease control. Bulk TCR sequencing is a relatively inexpensive technology that provides information on T-cell dynamics through a TCR-centric lens and is of considerable clinical value.
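As a rough illustration of how clonal expansion and contraction might be read out of bulk TCR sequencing, the sketch below compares clonotype frequencies between two time points. It is a minimal sketch under stated assumptions: the CDR3 strings and read counts are made up, the two-fold threshold and the pseudocount are arbitrary choices, and the function names are hypothetical rather than taken from any particular analysis package.

```python
from collections import Counter


def clone_frequencies(cdr3_reads):
    """Return each clonotype's share of the reads in one sample."""
    counts = Counter(cdr3_reads)
    total = sum(counts.values())
    return {clone: n / total for clone, n in counts.items()}


def clone_dynamics(pre, post, pseudo=1e-6, fold=2.0):
    """Label each clonotype as expanded, contracted, or stable.

    pre and post are lists of CDR3 sequences, one entry per read.
    pseudo avoids division by zero for clones seen at only one time point.
    fold is an arbitrary illustrative threshold, not a validated cut-off.
    """
    f_pre = clone_frequencies(pre)
    f_post = clone_frequencies(post)
    labels = {}
    for clone in set(f_pre) | set(f_post):
        ratio = (f_post.get(clone, 0.0) + pseudo) / (f_pre.get(clone, 0.0) + pseudo)
        if ratio >= fold:
            labels[clone] = "expanded"
        elif ratio <= 1.0 / fold:
            labels[clone] = "contracted"
        else:
            labels[clone] = "stable"
    return labels


# Made-up reads: three toy clonotypes before and after therapy.
pre_reads = ["CASSLGQGYEQYF"] * 2 + ["CASSPDRGNTEAFF"] * 6 + ["CASSQETQYF"] * 2
post_reads = ["CASSLGQGYEQYF"] * 6 + ["CASSPDRGNTEAFF"] * 2 + ["CASSQETQYF"] * 2

dynamics = clone_dynamics(pre_reads, post_reads)
for clone, label in sorted(dynamics.items()):
    print(clone, label)
```

A real analysis would also need to account for sequencing depth, sampling noise and multiple testing, which this sketch ignores.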
Where is the field heading?
Discovery of targets that engage the immune system will remain a focus area in cancer immunotherapy. Novel immunogenic T-cell epitopes will be identified by a variety of new technologies leveraging genomics and proteomics. For example, scientists are using Ribo-Seq to probe the dark matter of the genome, the short open reading frames scattered within non-coding regions, to identify epitopes recognized by the immune system. A variety of delivery platforms for neoepitopes are being considered for evoking a robust immune response. In addition, as single-cell sequencing and spatial transcriptomics mature and become inexpensive, they will find greater use in selecting optimal antigen-specific T-cells with the properties needed to enter the tumor, thereby increasing their effectiveness as a therapy.
By Ramesh Menon, Anjali Verma, Manjari Deshmukh, Akshi Bassi and Ravi Gupta (Bioinformatics R&D division, MedGenome Labs)
The GenomeAsia100K pilot project included 1,739 individuals from 219 population groups and 64 countries across Asia. The samples included 598 from India, 156 from Malaysia, 152 from South Korea, 113 from Pakistan, 100 from Mongolia, 70 from China, 70 from Papua New Guinea, 68 from Indonesia, 52 from the Philippines, 35 from Japan and 32 from Russia. The high-quality sequence data for the Indian samples were generated at MedGenome’s sequencing lab located at Narayana Nethralaya hospital in Bangalore, India.
The study has given insights into previously unknown population genetic structure, as well as the implications of sub-population- and community-specific genetic variation for diseases and drug reactions. It provides a useful genomic resource that will facilitate genetic studies in Asia, including India. More than 20% of the genetic variants identified in this study are not reported in previous resources such as the Exome Aggregation Consortium (ExAC), the 1000 Genomes Project and gnomAD. In rare disease genetics, databases such as ExAC, gnomAD, 1000G and dbSNP are used to filter variants based on allele frequency. Since the majority of the samples in these databases are of European origin, population-specific variants that are present at higher frequency in Asian populations would otherwise be taken as rare variants. For example, when both gnomAD and the data published in this study are used to filter out common variants (allele frequency > 0.1%), the number of candidate variants is reduced roughly two-fold compared with using gnomAD alone. This study will therefore make the identification of pathogenic variants for rare diseases more accurate, particularly by improving variant filtering for South Asian ancestry.
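The filtering step described above can be sketched as follows. This is a minimal illustration, assuming each reference panel is available as a simple mapping from a variant identifier to its allele frequency; the variant names and frequencies are invented, and the function name is hypothetical. Only the 0.1% threshold comes from the text.

```python
def passes_rarity_filter(variant_id, panels, max_af=0.001):
    """Keep a variant only if it is rare (AF <= max_af) in every panel.

    A variant missing from a panel is treated as unobserved (AF 0) there,
    which is itself a simplifying assumption.
    """
    return all(panel.get(variant_id, 0.0) <= max_af for panel in panels)


# Invented allele frequencies, for illustration only.
gnomad_af = {"var1": 0.00002, "var2": 0.0004, "var3": 0.0, "var4": 0.02}
asia_af = {"var1": 0.00001, "var2": 0.015, "var3": 0.006, "var4": 0.03}

candidates = ["var1", "var2", "var3", "var4"]

# Filtering against one panel alone versus both panels together.
gnomad_only = [v for v in candidates if passes_rarity_filter(v, [gnomad_af])]
combined = [v for v in candidates if passes_rarity_filter(v, [gnomad_af, asia_af])]

print("gnomAD only:", gnomad_only)
print("gnomAD + population panel:", combined)
```

In this toy data, two variants that look rare in the first panel are common in the second, so adding the population-matched panel shortens the candidate list. The size of the reduction here reflects only the invented numbers, not the study's result.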
The complex history and structure of Asian populations have also been reported in this study. It shows that people from India, Malaysia and Indonesia consist of multiple ancestral populations as well as multiple admixed groups. Strong founder effects have increased the rate of recessive diseases. Our study found that indigenous and tribal population groups have higher identity by descent (IBD) than other groups. Further, we found that the urban population of Chennai (population 9 million) has an IBD score 1.3 times higher than that of the Finnish group. This suggests that our population groups from the southern part of India have a stronger founder effect and carry a higher chance of recessive disorders.
Variation in certain ancestry-related regions of the genome sometimes has implications for drug response. In several clinics globally, dosing recommendations for certain drugs are guided by apparent or self-reported population identity. In this study, we assessed the allele frequencies of key pharmacogenomic variants in the GenomeAsia pilot dataset to identify inter-population differences that have potential implications for drug testing and treatment. Interestingly, the study identified carbamazepine, clopidogrel, peginterferon and warfarin as the drugs most affected by ancestry-related genetic variation, and predicted adverse drug responses in several population sub-groups. For example, a genetic variant in the HLA-B gene that is associated with the risk of developing Stevens-Johnson syndrome in patients treated with carbamazepine was found to occur at an increased frequency in people of the Austronesian group (~400 million) from Indonesia, Malaysia and the Philippines. These novel findings can help the pharma industry reduce time and investment in research when assessing the efficacy and toxicity of new drugs. GenomeAsia has deeply catalogued population-specific genetic variants in “very important pharmacogenes” (VIP genes) such as VKORC1, IFNL3, CYP2B6, CYP2D6 and CYP2C19, which affect the dosage, efficacy and toxicity of associated FDA-approved drugs.
Human genetic studies across the world have minimal representation from Asian population groups; most have been performed on people of European origin. Discoveries and genetic associations found in European populations do not necessarily translate to non-European population groups. This limits researchers' ability to understand human diseases accurately in non-European populations, including those from Asia (which represents 60% of the global population). Recently there has been a slight improvement, but non-European populations remain highly underrepresented. This study has also published an imputation reference panel, available at the Michigan Imputation Server (MIS – https://imputationserver.sph.umich.edu/index.html). Our analysis revealed that our panel provides much better imputation for South Asian ancestry than the existing published reference panels. This will help GWAS studies performed on populations of South Asian ancestry.
The GenomeAsia consortium is continuously collecting and analyzing several thousand diverse genomes across Asia, creating a unique platform for genetic studies and pharmacogenomic research that can pave the way to the well-being of people in Asia. The pilot study genome browser is freely available and can be accessed at https://browser.genomeasia100k.org.
GenomeAsia100K Consortium (2019). The GenomeAsia 100K Project enables genetic discoveries across Asia. Nature
David Reich et al. (2009). Reconstructing Indian population history. Nature Genetics.
Analabha Basu et al. (2016). Genomic reconstruction of the history of extant populations of India reveals five distinct ancestral components and a complex structure. PNAS.
By Dr. Ravi Gupta, PhD, Chief Scientist, Bioinformatics
Introduction
High-throughput sequencing of cancer patients has enabled rapid identification of somatic coding mutations that could generate neoantigens [1,2]. Tumor neoantigens are ideal targets for immunotherapy because they are expressed only by the tumor cells [3]. Several studies have suggested that neoantigens are important targets for an effective antitumor immune response and have proposed their use in developing personalized vaccines [4-6]. Many published studies have shown that a higher mutation burden is linked to stronger T-cell responses and better patient survival [7,8]. Such associations have been reported in endometrial cancer, melanoma, non-small cell lung cancer (NSCLC) and colorectal cancer [9-12]. The neoantigen-specific T cell population has also been found to be expanded in effective antitumor immunity [9,10]. Both animal and human studies have shown that tumor cells presenting immunogenic peptides can be selectively targeted by T cells, leading to complete or partial regression of the tumor [13-15].
Challenges in developing cancer vaccine
The vaccine has to be designed such that the patient’s immune cells (T cells) selectively hunt and kill only those tumor cells that present the targeted neoantigens. Training a patient’s immune system to specifically target and kill cancer cells has proven to be a difficult task. The first successful molecular identification of a neoantigen was reported by De Plaen et al. in 1988 [16].
Identifying the right immunogenic neoantigens is one of the central problems in the successful development of a cancer vaccine. A patient’s tumor contains anywhere from a few hundred to several thousand candidate neoantigens. The real challenge is in selecting the candidates that would be best at stimulating the patient’s T cells. Computational algorithms have been developed, but these programs lack sensitivity and specificity because they rely heavily on features associated with antigen presentation alone, without considering features required for T cell receptor (TCR) binding. A recent paper describes a novel approach of quantifying neoantigen fitness in tumors to predict immunogenic peptides, in which both HLA presentation and TCR recognition are used as fitness components [17]. The neoantigen fitness model predicts immunogenic epitopes without examining the structural features in a peptide that enable interaction with the TCR. The model was used to predict long-term survivors of pancreatic cancer in a recent study [18].
MedGenome solution
The MedGenome computational group has developed a highly accurate new method (IPepPredicT) to select immunogenic peptides [19]. IPepPredicT applies an ensemble voting-based machine learning approach to identify immunogenic peptides from a patient’s somatic mutations. Our method is the first in-silico model that combines physicochemical properties of amino acids favorable for TCR binding with features relevant for antigen presentation and processing. IPepPredicT is trained on MHC Class I HLA-A*02:01 9-mer peptides present in the IEDB data. Our analysis revealed enrichment of helix/turn features at TCR contact residues, along with hydrophobicity features enriched at the HLA-binding anchor residues. It also provides a spatial feature enrichment map that serves as a guideline for selecting immunogenic peptides. While developing the method we also analyzed MHC-peptide-TCR complex crystal structures, and found that many of the features selected by our prediction algorithm agree with the findings from crystal structure analysis.
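To make the idea of ensemble voting over peptide features more concrete, here is a toy sketch. It is not IPepPredicT and does not reproduce its features or training: the three "voters" are hand-written rules with arbitrary thresholds, the choice of P2 and P9 as anchor positions follows the commonly described HLA-A*02:01 binding motif, and treating P4 to P6 as TCR-facing positions is an assumption made only for illustration. The hydropathy values are the published Kyte-Doolittle scale.

```python
# Kyte-Doolittle hydropathy scale (published values).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def anchor_hydrophobicity(peptide):
    """Mean hydropathy at positions 2 and 9 of a 9-mer peptide."""
    if len(peptide) != 9:
        raise ValueError("expected a 9-mer peptide")
    return (KD[peptide[1]] + KD[peptide[8]]) / 2


# Three toy voters; all thresholds and residue sets are placeholders.
def vote_anchor_hydrophobic(peptide):
    return anchor_hydrophobicity(peptide) > 2.0


def vote_p2_aliphatic(peptide):
    return peptide[1] in "LMIV"


def vote_central_aromatic(peptide):
    # Assumes positions 4-6 face the TCR; a simplification.
    return any(residue in "FWY" for residue in peptide[3:6])


VOTERS = [vote_anchor_hydrophobic, vote_p2_aliphatic, vote_central_aromatic]


def majority_vote(peptide, voters=VOTERS):
    """Return True when more than half of the voters say 'immunogenic'."""
    votes = [bool(voter(peptide)) for voter in voters]
    return sum(votes) > len(votes) / 2


# GILGFVFTL is a well-known HLA-A*02:01-restricted influenza epitope.
print(majority_vote("GILGFVFTL"))
print(majority_vote("DDDDDDDDD"))
```

In a trained ensemble the voters would be fitted classifiers rather than fixed rules, and the vote could be weighted. The sketch only shows the voting mechanics.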
Promising results from recent clinical trials
Two recently reported clinical trials have shown encouraging results. The first study was conducted at Boston’s Dana-Farber Cancer Institute on six melanoma patients [4]. The cancer vaccine targeted 20 neoantigens. Of the six patients given the vaccine, four remained disease-free at 25 months. In the remaining two patients, the disease recurred and was treated with anti-PD-1 therapy. The neoantigen-specific T cells were found to be expanded, which led to complete tumor regression. The second clinical trial was conducted on 13 melanoma patients by Biopharmaceutical New Technologies (BioNTech) in Germany [5]. The cancer vaccine in this trial targeted 10 neoantigens for each patient. Eight patients were disease-free for 12–23 months. These studies indicate that personalized vaccines have the potential to render cancer patients disease-free.
By accurately predicting neoantigens, we believe IPepPredicT could help in developing effective personalized cancer vaccines.
References
1. Snyder, A. & Chan, T. A. Immunogenic peptide discovery in cancer genomes. Curr Opin Genet Dev 30, 7-16, doi:10.1016/j.gde.2014.12.003 (2015).
2. van Rooij, N. et al. Tumor exome analysis reveals neoantigen-specific T-cell reactivity in an ipilimumab-responsive melanoma. J Clin Oncol 31, e439-442, doi:10.1200/JCO.2012.47.7521 (2013).
3. Schumacher, T. N. & Schreiber, R. D. Neoantigens in cancer immunotherapy. Science 348, 69-74, doi:10.1126/science.aaa4971 (2015).
4. Ott, P. A. et al. An immunogenic personal neoantigen vaccine for patients with melanoma. Nature 547, 217-221, doi:10.1038/nature22991 (2017).
5. Sahin, U. et al. Personalized RNA mutanome vaccines mobilize poly-specific therapeutic immunity against cancer. Nature 547, 222-226, doi:10.1038/nature23003 (2017).
6. Vasquez, M., Tenesaca, S. & Berraondo, P. New trends in antitumor vaccines in melanoma. Ann Transl Med 5, 384, doi:10.21037/atm.2017.09.09 (2017).
7. Yarchoan, M., Johnson, B. A., 3rd, Lutz, E. R., Laheru, D. A. & Jaffee, E. M. Targeting neoantigens to augment antitumour immunity. Nat Rev Cancer 17, 209-222, doi:10.1038/nrc.2016.154 (2017).
8. Hu, Z., Ott, P. A. & Wu, C. J. Towards personalized, tumour-specific, therapeutic vaccines for cancer. Nat Rev Immunol, doi:10.1038/nri.2017.131 (2017).
9. Chan, T. A., Wolchok, J. D. & Snyder, A. Genetic Basis for Clinical Response to CTLA-4 Blockade in Melanoma. N Engl J Med 373, 1984, doi:10.1056/NEJMc1508163 (2015).
10. Rizvi, N. A. et al. Cancer immunology. Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer. Science 348, 124-128, doi:10.1126/science.aaa1348 (2015).
11. Le, D. T. et al. PD-1 Blockade in Tumors with Mismatch-Repair Deficiency. N Engl J Med 372, 2509-2520, doi:10.1056/NEJMoa1500596 (2015).
12. Howitt, B. E. et al. Association of Polymerase e-Mutated and Microsatellite-Instable Endometrial Cancers With Neoantigen Load, Number of Tumor-Infiltrating Lymphocytes, and Expression of PD-1 and PD-L1. JAMA Oncol 1, 1319-1323, doi:10.1001/jamaoncol.2015.2151 (2015).
13. Matsushita, H. et al. Cancer exome analysis reveals a T-cell-dependent mechanism of cancer immunoediting. Nature 482, 400-404, doi:10.1038/nature10755 (2012).
14. DuPage, M., Mazumdar, C., Schmidt, L. M., Cheung, A. F. & Jacks, T. Expression of tumour-specific antigens underlies cancer immunoediting. Nature 482, 405-409, doi:10.1038/nature10803 (2012).
15. Castle, J. C. et al. Exploiting the mutanome for tumor vaccination. Cancer Res 72, 1081-1091, doi:10.1158/0008-5472.CAN-11-3722 (2012).
16. De Plaen, E. et al. Immunogenic (tum-) variants of mouse tumor P815: cloning of the gene of tum- antigen P91A and identification of the tum- mutation. Proc Natl Acad Sci U S A 85, 2274-2278 (1988).
17. Luksza, M. et al. A neoantigen fitness model predicts tumour response to checkpoint blockade immunotherapy. Nature 551, 517-520, doi:10.1038/nature24473 (2017).
18. Balachandran, V. P. et al. Identification of unique neoantigen qualities in long-term survivors of pancreatic cancer. Nature 551, 512-516, doi:10.1038/nature24462 (2017).
19. Priyanka Shah, R. G., Anand Kumar Maurya, Ravi Gupta, Amit Chaudhuri. A machine learning approach for accurate prediction of immunogenic peptides from somatic mutations. Under Review (2017).
By Mohana (Senior Project Manager), Angelica (Executive Account Manager) and Hiranjith GH (Senior Director, Corporate Marketing & Business Operations), MedGenome Inc., USA
The COVID-19 pandemic has changed the way we interact with our customers. The commercial team’s interactions with customers and prospects have been pushed to become more virtual. With the pandemic, we are all discussing the ‘new normal’, in which this behavioral change in interactions is expected to persist and become a habit.
During these times of virtual interaction, one aspect that we at MedGenome have focused on is developing a platform to manage project logistics and project delivery. The intent was to offer a platform for our customers to register project samples with us, to streamline the shipment logistics, to easily accept delivery of data and analyses, and to access all projects with MedGenome in a single dashboard. This is at the core of our business: improving the customer experience.
MedGenome’s Customer Portal will allow for:
1. Project & Sample registration
Customers can register samples for a confirmed project through the portal. Following a simplified process, the client can access the sample order form and generate a shipping document to accompany the samples in their shipment. Each sample is assigned a unique identifier to allow for seamless sample tracking.
2. Sample Accessioning at MedGenome lab in Foster City, CA
Once the shipment arrives in our lab, it is accessioned using the label accompanying the shipment. Sample names on the shipping documents are cross-referenced against the tube labels to ensure accurate alignment.
3. Project Execution & Tracking
Our scientists execute the project as per the statement of work to generate data and perform analyses, all accessible through the customer portal. The portal provides increased visibility and access to QC reports and project status.
4. Customer access to data & analyses
Customers can access and download the data and analyses for the corresponding project through the portal. The portal allows customers to access all historic projects that they have executed with MedGenome in a single view. The new interactive analysis feature allows clients to visualize and interact with their data and download publication-ready figures.
In all of these, customer communication is maintained through the system.
The platform supports MedGenome’s growth as a nationwide preferred service provider offering end-to-end services for extraction, library prep, sequencing and analysis, specializing in bulk RNA sequencing (low input and degraded), single-cell (GEX, TCR, BCR, CITE) sequencing, immune profiling (TCR, BCR) and genome (WGS, WES, DeNovo, CUT&TAG) sequencing services. By doing so, we are able to provide a seamless experience to our customers.
One of the early adopters of our portal, a Staff Scientist at a large academic lab in the US, describes the benefit as follows:
“MedGenome’s Portal, a place where sequencing projects can be uploaded and sequence files downloaded, works the way we would like. It is nice to have all projects that are easily accessible from one location.”
Our endeavor is to continue to innovate and provide best-in-class experience, services and solutions to our customers. We look forward to continued engagement with our customers in 2020.
By Vinay CG, Sr. Manager – Content & Communications, Peer Reviewers: Kushal Suryamohan (Bioinformatics Scientist) & Hiranjith GH (Senior Director, Corporate Marketing & Business Operations), MedGenome Inc., USA
COVID-19, caused by the SARS-CoV-2 virus, has emerged as a major challenge, with no vaccine available on the market so far. The pandemic has prompted pharma companies and university research centres to scale up research efforts to develop a viable vaccine. Although there are many promising candidates in the pipeline, none will be available anytime soon, leaving social distancing and quarantine as the only effective measures to contain the disease.
The virus itself is zoonotic in origin, capable of jumping from its natural host, bats, to other species, including humans. Phylogenetic analyses of viral genome sequences have indicated that it closely resembles the SARS-like coronavirus strain BatCoV RaTG13 [1] and that it spread to the human population through an intermediary host, likely pangolins.
Structure and Entry
The novel coronavirus is made up of a Nucleocapsid Protein (N), Spike Proteins (S), Envelope Protein (E) and the Membrane Protein (M) – see Figure 1.
The S-protein is the most critical one, as it binds to the host cell receptor and thereby mediates entry into the host cell. Specifically, in humans the β-coronaviruses, the class to which both the earlier SARS-CoV and the current SARS-CoV-2 belong, are found to bind to the angiotensin-converting enzyme 2 (ACE2) receptor. The S-protein consists of two subunits, S1 and S2. The S1 subunit, which contains the receptor-binding domain (RBD), binds to the host cell receptor, while the S2 subunit helps fuse the viral and host membranes [3]. This process is aided by a cellular serine protease, TMPRSS2, which helps in S-protein priming [4]. Once the viral RNA genome enters the cytoplasm, it is translated into two polyproteins and structural proteins, and the viral genome then begins to replicate. Newly formed glycoproteins are inserted into the membrane of the endoplasmic reticulum (ER) or the Golgi complex, and the nucleocapsid forms from the combination of genomic RNA and nucleocapsid protein. The viral particles then bud into the ER-Golgi intermediate compartment (ERGIC), and finally the vesicles containing virus particles fuse with the plasma membrane to release the virus [5] (Figure 2 [6]).
Clinical Features and Pathogenicity
SARS-CoV-2 infection is transmitted primarily through respiratory droplets. Infected individuals may exhibit fever, cough, and fatigue, while other symptoms include sputum production, headache, haemoptysis, diarrhoea, dyspnoea, and lymphopenia [7]. However, the incubation period may range from 1 to 14 days before a patient displays any symptoms.
Replication usually occurs in the mucosal epithelium of the upper respiratory tract (nasal cavity and pharynx), with further multiplication in the lower respiratory tract and gastrointestinal mucosa.
Possible Drug Targets
A thorough understanding of the underlying mechanisms of SARS-CoV-2 is proving vital for the development of novel drugs and vaccines against this virus. Possible approaches to drug and vaccine discovery are shown in Figure 4 [9].
MedGenome Study, Services and Genomic Solutions
I. SARS-CoV-2 Virus and Structural evaluation of Human ACE2 receptor polymorphism
MedGenome recently performed an extensive analysis to identify ACE2 polymorphisms that might alter host susceptibility to SARS-CoV-2 by affecting the ACE2-S-protein interaction. This comprehensive analysis drew on several large genomic datasets comprising over 290,000 samples and representing >400 population groups, and identified multiple ACE2 protein-altering variants, some of which mapped to the S-protein-interacting surface of ACE2. Using recently reported structural data and a recent map of synthetic ACE2 mutants that affect S-protein interaction, our scientists identified natural ACE2 variants that are predicted to alter the virus-host interaction and thereby potentially alter host susceptibility. In particular, human ACE2 variants S19P, I21V, E23K, K26R, T27A, N64K, T92I, Q102P and H378R are predicted to increase susceptibility. The T92I variant, part of a consensus NxS/T N-glycosylation motif, confirmed the role of N90 glycosylation in immunity from non-human CoVs. Other ACE2 variants, K31R, N33I, H34R, E35K, E37K, D38V, Y50F, N51S, M62V, K68E, F72V, Y83H, G326E, G352V, D355N, Q388L and D509Y, are putative protective variants predicted to show decreased binding to the SARS-CoV-2 S-protein.
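The two variant lists above lend themselves to a simple lookup. The sketch below is illustrative only: the variant names and their predicted direction are taken from the text, while the function names, the labels and the helper that checks for an N-x-S/T sequon are hypothetical. The proline exclusion in the sequon pattern is the commonly stated rule and is not mentioned in the text, and the middle residue used in the example tripeptide is an assumption.

```python
import re

# Variant lists as given in the text (predictions, not clinical findings).
PREDICTED_INCREASED = {
    "S19P", "I21V", "E23K", "K26R", "T27A", "N64K", "T92I", "Q102P", "H378R",
}
PREDICTED_DECREASED = {
    "K31R", "N33I", "H34R", "E35K", "E37K", "D38V", "Y50F", "N51S", "M62V",
    "K68E", "F72V", "Y83H", "G326E", "G352V", "D355N", "Q388L", "D509Y",
}


def parse_variant(name):
    """Split a name like 'K26R' into (reference, position, alternate)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", name)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {name!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def predicted_effect(name):
    """Look up the predicted effect on S-protein binding for a variant."""
    parse_variant(name)  # validates the notation
    if name in PREDICTED_INCREASED:
        return "increased"
    if name in PREDICTED_DECREASED:
        return "decreased"
    return "not listed"


def has_sequon(tripeptide):
    """True for an N-x-S/T motif, with x taken to be any residue but proline."""
    return re.fullmatch(r"N[^P][ST]", tripeptide) is not None


print(predicted_effect("K26R"))
print(predicted_effect("D355N"))

# Residues 90-92 form the motif; the middle residue 'L' is an assumption.
ref, pos, alt = parse_variant("T92I")
before = "NL" + ref
after = "NL" + alt
print(has_sequon(before), has_sequon(after))
```

Replacing the threonine at the third position removes the motif, which is why T92I is expected to abolish glycosylation at N90.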
To know more about the study, please click here to access our publication.
II. MedGenome COVID-19 research services and solutions
MedGenome also offers critical services out of our high-throughput next-generation sequencing lab in Foster City, California, to help manage and accelerate our customers’ COVID-19-related R&D projects.
H.A. Rothan and S.N. Byrareddy, Journal of Autoimmunity 109 (2020) 102433
Yuefei Jin et al, Virology, Epidemiology, Pathogenesis, and Control of COVID-19, Viruses 2020, 12, 372.
Haibo Zhang et al, Angiotensin-converting enzyme 2 (ACE2) as a SARS-CoV-2 receptor: molecular mechanisms and potential therapeutic target, Intensive Care Med (2020) 46:586–590.
By Amit Chaudhuri, VP R&D, MedGenome Inc. USA
Definition of Biomarkers
Biomarkers are biological indicators of early disease detection (diagnostic), disease progression and outcome (prognostic), and response to therapy (predictive). The inclusion of biomarkers in patient selection has led to superior drug response rates and increased overall survival in pivotal clinical trials. The use of biomarkers to select drug-sensitive patients has also greatly improved quality of life by improving therapeutic efficacy and reducing toxicity. Biomarkers discovered and used in clinical trials have been approved as companion diagnostics and are used routinely in making treatment decisions. In this review, I will give an overview of cancer biomarkers, their discovery using traditional approaches and, more recently, genomics and proteomics technologies, and their validation through clinical trials.
Diagnostic biomarkers
Diagnostic biomarkers allow disease detection and/or disease staging. Traditionally, diagnostic biomarkers in cancer came from histopathology. The WHO classification of solid and hematological tumors is based on histopathological examination of the tissues and is available as monographs, or blue books, for consultation (whobluebooks.iarc.fr/). For example, WHO recognizes 30 subtypes of lymphoma based on their histopathology, which has significantly improved the accuracy of patient diagnosis without impacting drug development or treatment decisions, because of molecular heterogeneity within the subtypes [1]. Molecular profiling can resolve this heterogeneity: gene expression profiling of diffuse large B-cell lymphoma (DLBCL) has identified three distinct molecular subtypes that are treated differently. Other molecular rearrangements have aided in the diagnosis of solid tumors, such as ALK-fusion for the diagnosis and therapy of ALK-positive non-small cell lung cancer. Diagnostic markers in many instances have become both predictive and prognostic. For example, estrogen receptor positive (ER+) status in breast cancer is a diagnostic marker, a predictive marker for hormone inhibition therapy, and a prognostic marker of good clinical outcome when compared with hormone receptor negative tumors [2].
Predictive vs. prognostic biomarkers
There is considerable confusion about what distinguishes a predictive biomarker from a prognostic biomarker. Predictive biomarkers are associated with response to treatment. Tumors positive for the marker will show differential treatment effects compared with tumors negative for the marker. As an example, in non-small cell lung cancer (NSCLC), tumors harboring activating mutations in epidermal growth factor receptor (EGFR) benefited more from erlotinib (Tarceva) treatment (hazard ratio, HR 0.10) than tumors harboring wild-type EGFR treated with erlotinib (HR 0.78) [3]. In this example, both groups benefited from treatment (HR < 1); however, there was a quantitative difference in benefit between the EGFR mutant and EGFR wild-type groups (quantitative interaction) [2, 4]. The difference can also be qualitative, in which case the biomarker-positive group benefits from the therapy, whereas the biomarker-negative group derives no benefit and may even be harmed by the treatment. For example, the anti-EGFR monoclonal antibody cetuximab provides benefit to metastatic colorectal cancer patients harboring wild-type KRAS, but patients harboring mutant KRAS fare poorly in the presence of the drug [5]. This makes KRAS a predictive marker of response to anti-EGFR therapy in metastatic colon cancer. Surprisingly, KRAS status is not a predictive biomarker of response to anti-EGFR tyrosine kinase inhibitors (erlotinib or gefitinib) in non-small cell lung cancer [6], indicating deeper biological differences between the two cancer types.
A prognostic biomarker provides information on disease outcome, such as disease progression, disease recurrence or death, independent of drug treatment [2]. For example, activating mutations in phosphatidyl-inositol-3-kinase catalytic subunit alpha (PIK3CA) are associated with worse prognosis in women with HER2-positive metastatic breast cancer, regardless of treatment [7, 8]. A prognostic biomarker may reveal the underlying mechanism of disease progression and can guide the development of novel therapies.
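The distinction between quantitative and qualitative interaction can be expressed as a small rule on the two subgroup hazard ratios. The sketch below is a deliberately simplified illustration: it compares point estimates only and ignores confidence intervals and formal interaction tests, the function name and labels are hypothetical, and the second pair of hazard ratios is invented. Only the values 0.10 and 0.78 come from the text.

```python
def interaction_type(hr_marker_positive, hr_marker_negative):
    """Classify a treatment-by-biomarker pattern from two hazard ratios.

    HR < 1 means the treated arm did better than control in that subgroup.
    Point estimates only; a real analysis would test the interaction.
    """
    benefit_positive = hr_marker_positive < 1.0
    benefit_negative = hr_marker_negative < 1.0
    if benefit_positive and benefit_negative:
        if hr_marker_positive == hr_marker_negative:
            return "no interaction"
        return "quantitative"
    if benefit_positive != benefit_negative:
        return "qualitative"
    return "no benefit in either group"


# Erlotinib example from the text: EGFR mutant vs. wild-type.
print(interaction_type(0.10, 0.78))

# Invented values: benefit in one subgroup, harm in the other.
print(interaction_type(0.70, 1.30))

# Ratio of the two hazard ratios as a crude measure of effect modification.
print(round(0.10 / 0.78, 2))
```

With the erlotinib values, both subgroups have HR < 1, so the rule reports a quantitative interaction, matching the description above.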
BIOMARKER DETECTION IN CLINICAL SETTINGS
Platform technologies
Biomarkers are derived from tumor tissues or body fluids and are detected by histopathological, immunohistochemical (IHC), fluorescence, ELISA, and PCR-based techniques. Tumor tissue-derived biomarkers, such as overexpressed genes, are detected by IHC; an example is HER2 overexpression in HER2+ breast cancer. Chromosomal translocations, such as the BCR-ABL fusion of the Philadelphia chromosome, are detected by fluorescence in situ hybridization (FISH). ELISA methods are used to detect proteins in blood or other body fluids, such as carbohydrate antigen 19-9 (CA19-9) in the serum of pancreatic cancer patients. More recently, DNA and RNA sequencing have expanded the scope of biomarker detection from limited tissue material. Mutations in EGFR, BRAF, KRAS and other oncogenes are detected by sequencing and are used routinely in clinical settings as predictive and prognostic markers. Similarly, mass-spectrometric approaches have identified biomarkers in complex body fluids such as serum and saliva. Biomarkers discovered using high-throughput proteomics methods are validated in the clinic using more robust multiplex ELISA methods.
Multi-omics approaches
In recent years, technological breakthroughs in genomics and proteomics have resulted in a shift from the use of a single biomarker to multiple biomarkers for disease classification, diagnostics, and prognosis. This is especially true for oncology indications, where the genetic and biochemical heterogeneity of tumor cells and the need for combination therapies to achieve maximum efficacy require a deeper understanding of the molecular features of the tumor and its microenvironment. These molecular features can be accurately assessed using carefully selected biomarkers.
This multi-omics biomarker discovery approach has found extensive application in cancer immunotherapy, a rapidly developing field of cancer treatment in which the host immune response is boosted to elicit an anti-tumor response. The efficacy of immune-boosting checkpoint inhibitors is closely associated with molecular features present in tumor cells and the tumor microenvironment. Both exome and RNA-sequencing analyses reveal critical determinants of drug response. The scope of such an analysis is schematically represented in Figure 1.
Biomarkers make meaningful differences in clinical trials
A review of clinical trials conducted between 2006 and 2015 (9985 trials) reveals a lower Phase-I-to-approval success rate for oncology drugs than for non-oncology disease areas (5.1% vs. 11.8%, respectively) [12]. Further, the success rate of biomarker-driven clinical trials was three times higher than that of trials without biomarkers (25.9% vs. 8.4%, respectively) [12]. Therefore, biomarker discovery has become mandatory for the clinical development of therapeutic molecules in all disease areas, particularly in oncology.
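The fold differences behind these figures are easy to check. The short Python sketch below uses only the percentages quoted above from reference [12]; the variable and function names are illustrative.

```python
# Success rates (percent) as quoted in the text from reference [12].
ONCOLOGY_RATE = 5.1
NON_ONCOLOGY_RATE = 11.8
WITH_BIOMARKER_RATE = 25.9
WITHOUT_BIOMARKER_RATE = 8.4


def fold_difference(rate_a, rate_b):
    """Return how many times larger rate_a is than rate_b."""
    return rate_a / rate_b


biomarker_fold = fold_difference(WITH_BIOMARKER_RATE, WITHOUT_BIOMARKER_RATE)
non_oncology_fold = fold_difference(NON_ONCOLOGY_RATE, ONCOLOGY_RATE)

print(f"Biomarker-driven vs. non-biomarker trials: {biomarker_fold:.1f}-fold")
print(f"Non-oncology vs. oncology programs: {non_oncology_fold:.1f}-fold")
```

The first ratio comes out at roughly three, which matches the "three times higher" statement in the text.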
Biomarkers have become particularly important for targeted therapies and patient selection during clinical trials. In the early days of cancer treatment, non-targeted therapies such as chemotherapy or radiation therapy did not require specific biomarkers for patient selection. Histopathological examination of tumor tissue helped in tumor staging, which guided treatment decisions. With the advent of targeted therapies, biomarkers for selecting patients who will benefit from treatment became pivotal in designing Phase-II and Phase-III clinical trials. In 2005, AstraZeneca’s EGFR inhibitor gefitinib was tested in a Phase-III multicenter clinical trial involving 1692 patients. The trial failed to show a significant benefit for the treated group over the placebo group, although indications of benefit in certain patient subgroups, such as never-smokers or patients of Asian origin, were noted [13]. However, follow-up molecular studies investigating the reasons for the lack of benefit discovered that only patients harboring activating mutations in EGFR were highly responsive to the EGFR tyrosine kinase inhibitors erlotinib and gefitinib [14-16]. These findings rescued the drugs, which have become the standard-of-care treatment for NSCLC patients harboring activating mutations in EGFR. Similarly, crizotinib was approved for NSCLC tumors harboring anaplastic lymphoma kinase fusions (ALK fusions) and became the standard-of-care treatment within four years of the discovery that 3-5% of NSCLC tumors harbor ALK fusion genes [17] and ROS fusion genes [18]. Such accelerated clinical development was possible only because biomarkers for selecting tumors that would benefit from therapy were well established and FISH assays to detect such fusions were in place.
Biomarkers for drug repurposing
Drug repurposing, or drug repositioning, is finding new uses for existing drugs against new disease indications. Repurposed drugs may already be approved for one disease indication, or may have failed clinical development due to inadequate efficacy or unacceptable toxicity. An example of an approved drug repurposed for a totally different indication is the cyclooxygenase-2 (COX2) inhibitor Celebrex (celecoxib). Celebrex and its generic counterpart celecoxib reduce inflammation and are approved for osteoarthritis, rheumatoid arthritis, acute pain and other indications. However, the drug has been repurposed for use against colon polyps based on the finding that COX2 overexpression increases the risk of colorectal cancer, and a clinical trial to that effect demonstrated a decrease in the risk of additional polyp formation in individuals with colorectal cancer [19]. Drug repurposing requires identification of diagnostic biomarkers associated with disease mechanisms. In the example above, the discovery that COX2 is highly overexpressed in colon cancer and that inflammation is a key mediator of colon polyp formation led to the repurposing of the COX2 inhibitor in this disease indication, which is considered a milestone discovery in colon cancer research. Another example is the use of the type 2 diabetes drug metformin in preventing cancer. Metformin inhibits mitochondrial complex I, reducing the generation of ATP and thereby increasing AMP levels, which triggers AMPK activation and results in an increase in glucose metabolism [20]. Discoveries made in the last few years have identified pleiotropic effects of metformin on cellular pathways, such as inhibition of reactive oxygen species (ROS) generation, inhibition of p53-mediated cyclin-D1 expression, and inhibition of autophagy and insulin-like growth factor signaling, triggering a flurry of over 200 clinical trials in cancer (www.clinicaltrials.gov).
Drug repurposing will rely heavily on the discovery of biomarkers for patient stratification and for measuring the positive effects of drugs in the repurposed disease indications.
Future of biomarkers in precision medicine and personalized therapies
Biomarker discovery is a critical bottleneck in ensuring the success of drugs in clinical trials. The cost of new drug development has skyrocketed in the last decade, with discovery, development and clinical trial costs together exceeding 1 billion dollars. Failure in late-stage clinical trials results in a significant erosion of a company’s market value, the winding down of future research activities and a blunting of the innovation that small companies bring to the table. A recent example is the failure of BMS’s drug Opdivo (nivolumab) in the first-line treatment of advanced non-small cell lung cancer. The results of the failed clinical trial demonstrated that PD-L1, which is used routinely as a biomarker for selecting patients, might not be robust enough to ensure approval of BMS’s drug. The lack of positive clinical trial data erased 20% of BMS’s market cap in a day and ceded market adoption to a competing product, Keytruda (pembrolizumab) from Merck, which was approved for the same indication. The Opdivo CheckMate trial and other unsuccessful clinical trials emphasize the need to identify robust biomarkers very early during drug development, and to design efficacy and toxicity studies around these biomarkers to evaluate their utility, before transitioning the drug into pivotal clinical trials.
A large number of technological platforms, including next-generation sequencing and mass spectrometry, are available for the rapid discovery of biomarkers in complex tissues and body fluids [21]. The robustness of these technologies makes them well suited for clinical adoption, and they are rapidly gaining momentum with regulatory authorities. Equipped with multi-omics-based biomarkers, the era of precision medicine will enter its next phase of delivering personalized medicine, in which each patient receives a tailored therapy at the right time and at the right dose to maximize efficacy and avoid toxicity, fighting cancer while still enjoying a better quality of life.
References
1. Younes, A. and D.A. Berry, From drug discovery to biomarker-driven clinical trials in lymphoma. Nat Rev Clin Oncol, 2012. 9(11): p. 643-53.
2. Ballman, K.V., Biomarker: Predictive or Prognostic? J Clin Oncol, 2015. 33(33): p. 3968-71.
3. Brugger, W., et al., Prospective molecular marker analyses of EGFR and KRAS from a randomized, placebo-controlled study of erlotinib maintenance therapy in advanced non-small-cell lung cancer. J Clin Oncol, 2011. 29(31): p. 4113-20.
4. Khan, S.A., et al., EGFR Gene Amplification and KRAS Mutation Predict Response to Combination Targeted Therapy in Metastatic Colorectal Cancer. Pathol Oncol Res, 2016.
5. Song, Q.B., Q. Wang, and W.G. Hu, Anti-epidermal growth factor receptor monoclonal antibodies in metastatic colorectal cancer: a meta-analysis. World J Gastroenterol, 2015. 21(14): p. 4365-72.
6. Hames, M.L., et al., Correlation between KRAS mutation status and response to chemotherapy in patients with advanced non-small cell lung cancer. Lung Cancer, 2016. 92: p. 29-34.
7. Swain, S.M., et al., Pertuzumab, trastuzumab, and docetaxel in HER2-positive metastatic breast cancer. N Engl J Med, 2015. 372(8): p. 724-34.
8. Baselga, J., et al., Biomarker analyses in CLEOPATRA: a phase III, placebo-controlled study of pertuzumab in human epidermal growth factor receptor 2-positive, first-line metastatic breast cancer. J Clin Oncol, 2014. 32(33): p. 3753-61.
9. Topalian, S.L., et al., Mechanism-driven biomarkers to guide immune checkpoint blockade in cancer therapy. Nat Rev Cancer, 2016. 16(5): p. 275-87.
10. Motz, G.T. and G. Coukos, Deciphering and reversing tumor immune suppression. Immunity, 2013. 39(1): p. 61-73.
11. Chen, D.S. and I. Mellman, Elements of cancer immunity and the cancer-immune set point. Nature, 2017. 541(7637): p. 321-330.
12. Thomas, D.W., Burns, J. et al., Clinical Development Success Rates 2006-2015. BIO Industry Analysis, 2016.
13. Thatcher, N., et al., Gefitinib plus best supportive care in previously treated patients with refractory advanced non-small-cell lung cancer: results from a randomised, placebo-controlled, multicentre study (Iressa Survival Evaluation in Lung Cancer). Lancet, 2005. 366(9496): p. 1527-37.
14. Haber, D.A., et al., Molecular targeted therapy of lung cancer: EGFR mutations and response to EGFR inhibitors. Cold Spring Harb Symp Quant Biol, 2005. 70: p. 419-26.
15. Lynch, T.J., et al., Activating mutations in the epidermal growth factor receptor underlying responsiveness of non-small-cell lung cancer to gefitinib. N Engl J Med, 2004. 350(21): p. 2129-39.
16. Pao, W., et al., EGF receptor gene mutations are common in lung cancers from “never smokers” and are associated with sensitivity of tumors to gefitinib and erlotinib. Proc Natl Acad Sci U S A, 2004. 101(36): p. 13306-11.
17. Soda, M., et al., Identification of the transforming EML4-ALK fusion gene in non-small-cell lung cancer. Nature, 2007. 448(7153): p. 561-6.
18. Rikova, K., et al., Global survey of phosphotyrosine signaling identifies oncogenic kinases in lung cancer. Cell, 2007. 131(6): p. 1190-203.
19. Arber, N., et al., Celecoxib for the prevention of colorectal adenomatous polyps. N Engl J Med, 2006. 355(9): p. 885-95.
20. Pryor, R. and F. Cabreiro, Repurposing metformin: an old drug with new tricks in its binding pockets. Biochem J, 2015. 471(3): p. 307-22.
21. Simon, R. and S. Roychowdhury, Implementing personalized cancer genomics in clinical trials. Nat Rev Drug Discov, 2013. 12(5): p. 358-69.
It is a miracle of a billion years of evolution that vertebrates, including humans, constantly thwart attacks from an ever-expanding universe of foreign invaders such as bacteria, viruses and other pathogenic organisms throughout their lifetime. The miracle that makes this happen is our adaptive immune system, comprising B and T cells and a host of other regulatory cell types that function as a central command to activate, mobilize and eventually suppress the army of rogue killers once the threat is eliminated. The puzzle of how our immune system recognizes new organisms and biomolecules that may not have existed when we were born was solved by the work of Susumu Tonegawa and others, who discovered that the recognition mechanism is mediated by a family of highly diverse immune receptors expressed by the cells of the adaptive immune system: B cells, T cells and antigen-presenting cells (APCs) (Figure 1). This diversity enables the immune system to identify and mount an attack against any foreign element, whether invading from outside the body (bacteria, viruses) or generated inside it (tumor cells), protecting us from deadly diseases. It is estimated that there are 10^9 to 10^11 unique B cell receptors, 10^6 to 10^8 unique T cell receptors and about 301 known human leukocyte antigen (HLA) proteins expressed by the APCs in healthy humans.
Deeper insight into immune receptor diversity became possible with the advent of NGS and powerful bioinformatics and computational tools. Through these sequencing efforts, we know that no two individuals, not even monozygotic twins, share an identical immune receptor repertoire, although each of us is capable of mounting an immune response against common pathogens, indicating that there is enormous redundancy and plasticity in the recognition process. Further, the receptor repertoire undergoes significant expansion and contraction during disease, and these changes have led to the development of novel diagnostics in the area of autoimmune diseases.
In this essay, I will give an overview of the immune receptors and discuss how MedGenome is leveraging NGS data on the immune receptor repertoire to develop tools that will not only enhance our fundamental knowledge of how the immune system works but also show how this diversity can be interrogated to discover biomarkers that distinguish a productive immune response, which eliminates pathogens, from an adverse response that targets the body’s own cells and leads to autoimmunity.
Immune repertoire diversity – how is it generated?
Immune receptors expressed by B cells (B cell receptors, BCRs) and T cells (T cell receptors, TCRs) are formed during B cell development in the bone marrow and T cell development in the thymus. BCRs resemble antibodies in structure, with heavy and light chains, and are membrane-bound (Figure 2A). TCRs are heterodimers of α and β polypeptide chains (αβ TCR) or of γ and δ chains (γδ TCR). More than 90% of TCRs are αβ TCRs (Figure 2B), while the rest are γδ TCRs. Both receptors are created by recombining multiple gene segments that reside at separate loci in the germline DNA and are brought together into a single coding sequence during B and T cell development. These gene segments are the variable (V) gene, the joining (J) gene and, for the heavy chain and the β chain, an additional diversity (D) gene; a constant (C) gene is then added to all receptors. Figure 2A shows the assembly of a full-length BCR, while Figure 2B shows the mechanism that generates a functional αβ TCR following V(D)J recombination of the V, D, J and C genes. Receptor diversity arises at two levels. The first is combinatorial diversity, in which recombination joins one of the 40-50 V gene segments to a D and a J gene segment in the DNA, followed by splicing of the C gene at the RNA level.
The second level of diversity is introduced by the random addition and deletion of nucleotides between gene segments (junctional diversity). Together, combinatorial and junctional diversity create the final diversity of an individual’s immune receptor repertoire and explain why no two individuals share an identical repertoire. The sequence spanning the V-D-J junction is the ‘hypervariable’ segment, which is unique to each TCR-β chain and is called the complementarity determining region 3 (CDR3). The CDR3 region recognizes the antigen. The diversity of the TCR repertoire is analysed by counting the number of unique CDR3 sequences present in a T cell pool. Earlier experiments used bulk RNAseq data to quantify the enrichment of the CDR3 region. With recent developments in single-cell RNA sequencing technology (scRNAseq), however, the transcriptomes of thousands of cells can be processed simultaneously, bringing an extra dimension to the analysis of TCRs (Ref 2). Identifying each cell’s unique TCRs using single-cell technology now enables the pairing of the α and β chains of each heterodimer, which was not possible with bulk RNA sequencing. The enormous diversity of the TCR repertoire represents a major analytical challenge, which has led to the development of specialized software that aims to characterize the TCR repertoire in greater detail.
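As a rough illustration of what counting unique CDR3 sequences looks like in practice, the Python sketch below summarizes a list of CDR3 amino acid sequences by the number of unique clonotypes, the Shannon diversity index and the frequency of the largest clone. The function name and the toy sequences are made up for illustration; this is not the method used by any particular tool mentioned in this essay, and real pipelines also handle sequencing errors and normalization, which are ignored here.

```python
import math
from collections import Counter


def repertoire_summary(cdr3_sequences):
    """Summarize a T cell pool from its CDR3 sequences (one entry per read or cell)."""
    counts = Counter(cdr3_sequences)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no CDR3 sequences supplied")

    # Shannon diversity index (natural log): higher means a more diverse pool.
    shannon = -sum((n / total) * math.log(n / total) for n in counts.values())

    return {
        "total_sequences": total,
        "unique_cdr3": len(counts),
        "shannon_index": shannon,
        "top_clone_frequency": max(counts.values()) / total,
    }


# Toy input: made-up CDR3 strings, with one clonotype observed twice.
toy_pool = ["CASSLGQAYEQYF", "CASSLGQAYEQYF", "CASSPGTGGYEQYF", "CASRDRGNTEAFF"]
summary = repertoire_summary(toy_pool)
for key, value in summary.items():
    print(key, value)
```

A pool dominated by a few expanded clones gives a low Shannon index and a high top-clone frequency, while a naive, diverse pool gives the opposite pattern.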
Applications of immune repertoire profiling
Immune repertoire profiling holds great potential not only for understanding the development of the normal immune response but also for providing insights into disease mechanisms, leading to the development of new therapeutics and treatment modalities in infectious diseases, autoimmunity, and immuno-oncology. There is now increasing evidence that BCR (and TCR) repertoires can serve as a proxy for aberrant immune responses to many infections and autoimmune conditions. These repertoires can be monitored through patient blood or plasma, helping to gain a better understanding of the aetiology and progression of these conditions (Figure 3).
Recent studies have demonstrated that TCR diversity enables monitoring and predicting response to immunotherapy drugs and the occurrence of immune-related adverse effects. Studies investigating tumor-immune interaction in cancer patients have shown that the circulating TCR repertoire captures aspects of the tumor TCR repertoire with prognostic potential (Figure 4). Additionally, immune repertoire data is being used to distinguish viral-driven cancers from non-viral ones, and for precise tracking of vaccine-responsive T cell clones to enable more effective vaccine development. The length of the CDR3 sequences has been linked to the T cell differentiation state, with longer CDR3 sequences more enriched in antigen-naïve T cells than in effector T cells.
Despite variations in clonotypic diversity between individuals, there are instances where many individuals share the same clonotypes, referred to as shared or “public” clonotypes (Figure 5). When these individuals also share a common disease, this suggests that the shared clonotypes may be directed towards a common disease-specific antigen.
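Finding public clonotypes amounts to asking which CDR3 sequences appear in more than one individual. The Python sketch below shows one simple way to do this, assuming exact sequence matching; the function name, the donor labels and the toy sequences are hypothetical, and published analyses may use looser matching criteria.

```python
from collections import Counter


def public_clonotypes(repertoires, min_individuals=2):
    """Return clonotypes seen in at least `min_individuals` repertoires.

    `repertoires` maps an individual's ID to an iterable of CDR3 sequences.
    The result maps each shared CDR3 to the number of individuals carrying it.
    """
    carriers = Counter()
    for cdr3_sequences in repertoires.values():
        # set() ensures each individual is counted once per clonotype,
        # however many copies of that clonotype they carry.
        carriers.update(set(cdr3_sequences))
    return {cdr3: n for cdr3, n in carriers.items() if n >= min_individuals}


# Toy data: made-up donors and CDR3 sequences.
toy_repertoires = {
    "donor_A": ["CASSLGQAYEQYF", "CASSLGQAYEQYF", "CASRDRGNTEAFF"],
    "donor_B": ["CASSLGQAYEQYF", "CASSPGTGGYEQYF"],
    "donor_C": ["CASSPGTGGYEQYF", "CASSLGQAYEQYF"],
}
shared = public_clonotypes(toy_repertoires)
print(shared)
```

Raising `min_individuals` restricts the result to clonotypes shared more widely across the cohort.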
Tools/resources for repertoire analysis
Given the complexity of immune repertoire data, there is a need to assemble the right tools and algorithms to estimate both the abundance and the diversity of the unique T cell clones that characterize the T cell repertoire of any individual. Currently, for TCR sequencing of samples, MedGenome offers NGS-based solutions using the SMARTer® TCR Profiling Kit (Takara Bio USA Inc) and the Single-cell V(D)J Immune Profiling solution (10X™ Genomics Inc.). Data generated using these kits are currently analysed using CellRanger, MiXCR, and VDJtools (reviewed in Ref 3). However, developing improved tools that accurately predict which TCR sequences, out of a pool of non-binding TCRs, bind their cognate peptide-MHC complex remains an important area of research. MedGenome has created additional software that integrates with and works on top of these existing software solutions. Much like the genomic data explosion, we are now seeing a rapid accumulation of immune repertoire data in public repositories. This growing body of immune receptor data has tremendous utility in analyzing, annotating and interpreting TCR and BCR sequence data. The Adaptive Immune Receptor Repertoire sequencing (AIRRseq) federated databases and repositories have created standardized representations of immune repertoire data to facilitate cross-dataset analysis and promote the reusability of AIRRseq data (Ref 4). The AIRR community, formed in September 2014, initiated the iReceptor resource to provide a unified gateway (http://ireceptor.irmacs.sfu.ca/) to query and access AIRRseq TCR and IG data from different repositories (Figure 6). Since its inception, the volume of AIRRseq data has been growing exponentially, and the gateway currently provides access to 1.3 billion sequences and 879 samples. Several computational and statistical analysis methods are being developed to resolve the complexity and deconvolute the dynamics of adaptive immunity from these large-scale AIRRseq data.
MedGenome is part of an international consortium that has been awarded a European/Canadian project grant to develop the next generation of the iReceptor platform, referred to as iReceptor-plus.
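To give a feel for how a standardized representation helps with cross-dataset analysis, the Python sketch below tallies V gene usage from a tab-separated table that uses AIRR-style rearrangement column names (`sequence_id`, `productive`, `v_call`, `j_call`, `junction_aa`). The function name and the toy rows are made up for illustration, the handling of the `productive` flag is a simplifying assumption, and this is not MedGenome's or iReceptor's software.

```python
import csv
import io
from collections import Counter


def v_gene_usage(tsv_handle):
    """Count productive rearrangements per V gene in an AIRR-style TSV table."""
    usage = Counter()
    for row in csv.DictReader(tsv_handle, delimiter="\t"):
        # Assumption: productive rearrangements are flagged "T" or "true".
        if row.get("productive", "").strip().upper() not in {"T", "TRUE"}:
            continue
        # v_call can list several candidate alleles separated by commas;
        # keep the first one and drop the allele suffix after "*".
        gene = row["v_call"].split(",")[0].split("*")[0]
        usage[gene] += 1
    return usage


# Toy table with made-up rows; a real file would come from a repository export.
toy_rows = [
    ["sequence_id", "productive", "v_call", "j_call", "junction_aa"],
    ["s1", "T", "TRBV7-9*01", "TRBJ2-7*01", "CASSLGQAYEQYF"],
    ["s2", "T", "TRBV7-9*01,TRBV7-9*02", "TRBJ2-7*01", "CASSPGTGGYEQYF"],
    ["s3", "F", "TRBV5-1*01", "TRBJ1-1*01", ""],
    ["s4", "T", "TRBV5-1*01", "TRBJ1-1*01", "CASRDRGNTEAFF"],
]
toy_tsv = "\n".join("\t".join(fields) for fields in toy_rows)

usage = v_gene_usage(io.StringIO(toy_tsv))
print(usage.most_common())
```

Because the column names are shared across repositories, the same function could in principle be pointed at files from different studies without per-dataset parsing code.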
Conclusions
The scientific landscape is seeing an amalgamation of hypothesis-driven science and data-driven science that will have important ramifications for developing future therapeutics. The promising field of immune receptor repertoire analysis faces new scientific and analytical challenges for which no scalable solutions currently exist. With the exponential increase in genomic and transcriptomic data (both bulk and single-cell), in addition to the rapidly accumulating immune receptor data, scalable solutions are expected to emerge from new developments in data aggregation, database management, cloud computing technologies and workflows for data integration, along with more scalable computational tools for analysis. Although clear opportunities exist in analysing bigger volumes of data, it is important not to lose sight of the underlying biology. Dr. Sydney Brenner, a Nobel laureate in molecular biology, commented, “There is a crisis these days. We are drowning in data and are still thirsty for more.” He said, “If we do not clearly define the problem, we won’t know what information is important.”
References
1. Overview of methodologies for T-cell receptor repertoire analysis. Rosati et al. BMC Biotechnology (2017); 17:61
2. Single cell T cell receptor sequencing: Techniques and future challenges. Simone et al. Front Immunol. (2018); 9:1638
3. Computational Strategies for dissecting the high-dimensional complexity of adaptive immune repertoires. Miho et al. Front Immunol. (2018); 9:244
4. AIRR Community Standardized Representations for Annotated Immune Repertoires. Heiden et al. Front Immunol. (2018); 9:2206
To know more about MedGenome’s unique Immune Repertoire Sequencing Solutions, download the white paper here