About Me

Hi! I'm Bobby Ranjan, a predoctoral fellow at EMBL Rome. Born in Scotland, raised in India, educated in Singapore and currently working in Italy, I am an ambitious student-researcher in the field of computational biology.


Predoctoral Fellow, Hackett Group

2021 -

European Molecular Biology Laboratory, Rome

Bachelor of Engineering in Computer Engineering

2014 - 2018
Minor in Entrepreneurship
Minor in Life Sciences
Nanyang Technological University, Singapore

Selected Publications


Feature selection (marker gene selection) is widely believed to improve clustering accuracy, and is thus a key component of single cell clustering pipelines. Existing feature selection methods perform inconsistently across datasets, occasionally even resulting in poorer clustering accuracy than without feature selection. Moreover, existing methods ignore information contained in gene-gene correlations. Here, we introduce DUBStepR (Determining the Underlying Basis using Stepwise Regression), a feature selection algorithm that leverages gene-gene correlations with a novel measure of inhomogeneity in feature space, termed the Density Index (DI). Despite selecting a relatively small number of genes, DUBStepR substantially outperformed existing single-cell feature selection methods across diverse clustering benchmarks. Additionally, DUBStepR was the only method to robustly deconvolve T and NK heterogeneity by identifying disease-associated common and rare cell types and subtypes in PBMCs from rheumatoid arthritis patients. DUBStepR is scalable to over a million cells, and can be straightforwardly applied to other data types such as single-cell ATAC-seq. We propose DUBStepR as a general-purpose feature selection solution for accurately clustering single-cell data.


The transcriptomic diversity of cell types in the human body can be analysed in unprecedented detail using single cell (SC) technologies. Unsupervised clustering of SC transcriptomes, which is the default technique for defining cell types, is prone to group cells by technical, rather than biological, variation. Compared to de-novo (unsupervised) clustering, we demonstrate using multiple benchmarks that supervised clustering, which uses reference transcriptomes as a guide, is robust to batch effects and data quality artifacts. Here, we present RCA2, the first algorithm to combine reference projection (batch effect robustness) with graph-based clustering (scalability). In addition, RCA2 provides a user-friendly framework incorporating multiple commonly used downstream analysis modules. RCA2 also provides new reference panels for human and mouse and supports generation of custom panels. Furthermore, RCA2 facilitates cell type-specific QC, which is essential for accurate clustering of data from heterogeneous tissues. We demonstrate the advantages of RCA2 on SC data from human bone marrow, healthy PBMCs and PBMCs from COVID-19 patients. Scalable supervised clustering methods such as RCA2 will facilitate unified analysis of cohort-scale SC datasets.

Availability: RCA2 is implemented in R and is available on GitHub.


Background: Clustering is a crucial step in the analysis of single-cell data. Clusters identified in an unsupervised manner are typically annotated to cell types based on differentially expressed genes. In contrast, supervised methods use a reference panel of labelled transcriptomes to guide both clustering and cell type identification. Supervised and unsupervised clustering approaches have their distinct advantages and limitations. Therefore, they can lead to different but often complementary clustering results. Hence, a consensus approach leveraging the merits of both clustering paradigms could result in a more accurate clustering and a more precise cell type annotation.
Results: We present scConsensus, an R framework for generating a consensus clustering by (i) integrating the results from both unsupervised and supervised approaches and (ii) refining the consensus clusters using differentially expressed (DE) genes. The value of our approach is demonstrated on several existing single-cell RNA sequencing datasets, including data from sorted PBMC sub-populations.
Conclusions: scConsensus combines the merits of unsupervised and supervised approaches to partition cells with better cluster separation and homogeneity, thereby increasing our confidence in detecting distinct cell types. scConsensus is freely available on GitHub.


Background: Alzheimer's disease (AD) is a progressive neurological disorder, recognized as the most common cause of dementia affecting people aged 65 and above. AD is characterized by an increase in amyloid metabolism, and by the misfolding and deposition of β-amyloid oligomers in and around neurons in the brain. These processes remodel the calcium signaling mechanism in neurons, leading to cell death via apoptosis. Despite accumulating knowledge about the biological processes underlying AD, mathematical models to date are restricted to depicting only a small portion of the pathology.
Results: Here, we integrated multiple mathematical models to analyze and understand the relationship among amyloid deposition, calcium signaling and mitochondrial permeability transition pore (PTP)-related cell apoptosis in AD. The model was used to simulate calcium dynamics in the absence and presence of AD. In the absence of AD, i.e. without β-amyloid deposition, mitochondrial and cytosolic calcium levels remain at the low resting concentration. However, our in silico simulation of AD with β-amyloid deposition shows an increase in the entry of calcium ions into the cell and dysregulation of Ca2+ channel receptors on the endoplasmic reticulum. This composite model enabled us to simulate conditions that cannot be measured experimentally.
Conclusions: Our mathematical model depicting the mechanisms affecting calcium signaling in neurons can help understand AD at the systems level and has potential for diagnostic and therapeutic applications.


Bioinformatics Specialist

August 2018 - July 2021
Prabhakar Lab, Genome Institute of Singapore
  • Developed algorithms for cell type identification in single-cell data

Software Design Engineer Intern

May - August 2017
  • Built customer-facing license consumption report for all BitTitan products
  • Conducted technical feasibility analysis to improve BitTitan’s reporting capacity
  • Built code analysis tool to clean up database references across codebase

Technology Analyst Intern

August - December 2016
Bank of America, Merrill Lynch (Singapore)
  • Worked on the payments processing and payments testing development teams
  • Redesigned database logging around a queueing mechanism, using Apache ActiveMQ and the Java Spring Framework
  • Built an application to help onboard new testers onto the testing platform, using Java, AngularJS and SQL

Conferences & Workshops

  • SymbNET Metabolomics Workshop 2023 - Lausanne, Switzerland
  • EMBO Workshop on Epigenome Inheritance and Reprogramming in Health and Disease 2022 - Split, Croatia
  • Intelligent Systems for Molecular Biology 2021 - Virtual
  • Human Cell Atlas Asia 2020 - Virtual
  • Intelligent Systems for Molecular Biology 2020 - Virtual
  • Cell Symposia 2019 - Singapore
  • Single Cell Analyses Meeting 2019 - Cold Spring Harbor Laboratory, New York, USA