Leveraging big data using a novel clinical database and analytic platform based on 323,145 individuals with and without Diabetes ::

Hiranjith G.H., Anjana Ranjit Mohan, Praveen Raj, Jebarani Saravanan, Srinivasan Vedantham, Radha Venkatesan, Muthu Narayanan, Pradeepa Rajendra, Ranjit Unnikrishnan, Somasekhar Jayaram, Rohit Gupta, Paul George, Brijendra Kumar Srivastava, Uthra Subash Chandra Bose, Lovelena Munawar, Sam Santhosh, Mohan Viswanathan

Diabetes is a chronic disorder of glucose metabolism and a major cause of heart disease and end-stage renal disease worldwide. It is also the single biggest cause of preventable blindness, the leading cause of non-traumatic lower extremity amputation and a major cause of premature mortality. Globally, 415 million people have diabetes, and this number is expected to reach 642 million by 2040 [1]. Large volumes of biomedical data on diabetes are produced every day, but they have not been used effectively. Leveraging this volume of patient data with data science approaches helps to uncover hidden patterns, unknown correlations, and other insights into the disease. Integrating diverse genomic data with comprehensive electronic health records (EHRs) poses challenges, but it provides a feasible opportunity to better understand the underlying diseases and treatment patterns, and to develop an efficient and effective approach to identifying biomarkers for diagnosis and improving therapy.

The value of studying the Indian population to identify novel genetic variants to inform mechanisms of disease and pharmacological response ::

A. Das, P. Raj, V. Gopalan, Hiranjith G.H., E. Stawiski, S. Santhosh, R. Gupta, A. Chaudhuri, R. Gupta

While genome-wide association studies (GWAS) can shed light on the significance of variants in susceptibility to a disease, or allow patients to be stratified for specific therapeutic modalities, rare variants that could be significant are often not identified in these studies. This can occur due to allelic heterogeneity in a complex disease. Furthermore, spurious differences in allele frequencies between healthy and affected individuals, resulting from systematic differences in ancestry, can confound the conclusions drawn from a GWAS. Therefore, studying population isolates, in which affected and healthy individuals share a homogeneous genetic background, can enrich for rare alleles, improve the elimination of false positives, and make it possible to accurately correlate segregation of the variants with disease traits. One such population is that of the Indian subcontinent, where the ancestral populations date back to modern humans travelling out of Africa 65,000 years ago, and where a gene pool built up over more than 1000 years from a few founder families has resulted in an accumulation of unique disease-causing and disease-protective alleles that were preserved and enriched within various ethnic groups in the country.

OncoPeptVAC : A machine learning based approach for candidate vaccine identification and their validation using cell based assays ::

Ankita Das, Priyanka Shah, Xiaoshan “Shirley” Shi, Vasumathi Kode, Kayla Lee, Ravi Gupta, Amit Chaudhuri and Papia Chakraborty

  • T cell immunity provides significant therapeutic benefit to cancer patients treated with checkpoint inhibitors. Most tumors harbor a repertoire of somatic mutations, a fraction of which is capable of initiating potent T cell mediated anti-tumor activity. However, accurate identification of relevant immunogenic neoantigens remains a major challenge in therapeutic cancer vaccine research.
  • Current in silico methods to predict immunogenic neoantigens suffer from a lack of sensitivity and specificity because they rely heavily on features associated with antigen presentation alone, without considering features required for T cell receptor (TCR) binding.
  • Here we report OncoPeptVAC, an algorithm that uses an ensemble voting machine learning approach to identify immunogenic peptides from a patient’s somatic mutations. Our method combines physicochemical properties of amino acids favourable for TCR binding with features relevant for antigen presentation and processing.
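
The ensemble voting idea described above can be illustrated with a minimal sketch. This is not the OncoPeptVAC implementation: the three voters, their thresholds, the choice of "central" positions as a stand-in for TCR-facing residues, and the anchor-residue sets are all hypothetical and chosen only to show how TCR-related and presentation-related features might be combined by majority vote. Only the Kyte-Doolittle hydropathy values are taken from a published scale.

```python
from statistics import mean

# Kyte-Doolittle hydropathy scale (published values).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def central(peptide):
    """Residues from position 4 up to (not including) the last one.

    Hypothetical stand-in for TCR-facing residues; assumes a peptide
    of at least 5 residues (for example an 8- to 11-mer).
    """
    return peptide[3:-1]

def vote_hydrophobic(peptide, threshold=0.0):
    """Hypothetical TCR-side voter: central residues are hydrophobic on average."""
    return mean(KD[aa] for aa in central(peptide)) > threshold

def vote_aromatic(peptide):
    """Hypothetical TCR-side voter: at least one aromatic central residue."""
    return any(aa in "FWY" for aa in central(peptide))

def vote_anchor(peptide):
    """Hypothetical presentation-side voter: illustrative anchor residues
    at position 2 and at the C-terminus."""
    return peptide[1] in "LM" and peptide[-1] in "VLI"

VOTERS = (vote_hydrophobic, vote_aromatic, vote_anchor)

def predict_immunogenic(peptide):
    """Majority vote: True if more than half of the voters say yes."""
    votes = sum(voter(peptide) for voter in VOTERS)
    return votes * 2 > len(VOTERS)

if __name__ == "__main__":
    # Made-up peptides for illustration only.
    for pep in ("GLFWIVAAV", "GAFWIVAAK", "GLKDEDKSV", "GDKDEDKSG"):
        print(pep, predict_immunogenic(pep))
```

In a real ensemble, each voter would typically be a trained classifier rather than a hand-written rule; the point of the sketch is only that a peptide is flagged when a majority of independent feature-based judgments agree.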

The Ophthatome™ Knowledgebase : A curated knowledgebase of over 500,000 ocular disease phenotypic records coupled with analysis tools to enable novel discoveries for drug development and pharmacogenomics ::

A. Das, Nagasamy S, P. Raj, B. Muthu Narayanan, J. Somasekhar, T. Chandrasekhar, D. Kumar, A. Shetty, S. Das, S. Tejwani, P. Narendra, A. Ghosh

  • Medical big data analytics has applications in clinical decision-making, predictive/prognostic modelling of disease progression, disease surveillance, public health and research.
  • The electronic medical record (EMR) system is the digital storehouse of rich medical data that includes demographic, clinical (diagnosis, clinical diagnostic tests, treatment, prescription drugs, surgery, laboratory test reports) and administrative (bills, insurance claims) details of patients’ visits to hospital(s).
  • Although the EMR is a repository of vast clinical data on a large patient cohort collected over many years, the data lack the structure needed for deep learning methods and advanced analytics to improve disease management, either for individual patients or for the field in general.
  • Aggregated data from hospital EMRs need to be captured in a structured knowledge base to support clinical and translational research (CTR).

2019 © MedGenome • All Rights Reserved