Prateek Arora

A Bioinformatics Research Associate with expertise in

About

I am an AI-native bioinformatician and data scientist with 6+ years of experience transforming messy, high-throughput experimental data into high-quality, structured datasets. My work spans single-cell and spatial transcriptomics, multi-omics integration, lipidomics, proteomics, and ATAC-seq, combining deep biological grounding with statistical rigor to eliminate batch effects and instrument drift.

I build reproducible pipelines (Snakemake, Nextflow, Docker, Singularity) and custom tools — like 'LipidLocator', an open-source R Shiny app for spatial lipidomics published as co-corresponding author — that bridge computational workflows with wet-lab needs. I'm fluent in agentic coding (Claude Code, GitHub Copilot, local LLMs) to rapidly build reusable functions, optimize token strategies, and ship robust data tools.

In my personal time, I've created multiple Shiny apps in R — from analyzing Google search trends to everyday lab calculators. Outside of work, I enjoy reading fantasy novels, cooking, and swimming.

As a Ph.D. holder in Cellular and Molecular Biology, I've worked with top institutions including the University of Bern and the Tata Institute of Fundamental Research. My 12 publications in peer-reviewed journals and presentations at international conferences reflect my commitment to rigorous, well-communicated science.


Bioinformatics Research Associate

Get to know me better! Here are some personal and professional details that define who I am.

Facts

Beyond experimental biology and bioinformatics, my work includes publications, collaborative projects across multiple groups, web app development, and the training and mentoring of students. Some highlights:

Publications published or in preparation

Collaborative Projects within and outside the group

Web Apps developed and hosted

Students Trained for Masters/Bachelors dissertation

Skills

In the dynamic field of bioinformatics, I have honed a diverse skill set that enables me to tackle complex biological questions with precision and innovation. My expertise spans a range of technical proficiencies including advanced programming in R and Python, developing and deploying Shiny apps, and leveraging Docker for efficient, reproducible research. I am adept in cloud computing, utilizing platforms like AWS to handle large-scale data analysis. Additionally, I possess strong competencies in experimental design, pipeline development, and multi-omics data integration. More recently, I've extended this toolkit with Snakemake and Nextflow for reproducible pipeline orchestration, and agentic development with Claude Code and GitHub Copilot to build reusable, token-efficient analysis tools. These skills, combined with my dedication to continuous learning and collaboration, empower me to drive impactful research and deliver insightful, data-driven solutions.

R 100%
Python 80%
Bash 80%
Docker 90%
Cloud Computing 70%
App Development and Deployment 70%

Additional Tools & Skills

High-Throughput Analysis
scRNA-seq (10x, Parse, Nanopore), Spatial Transcriptomics (Visium, NanoString), Proteomics, Lipidomics, ATAC-seq, Statistical Modeling & QC
Pipeline Development & AI
Snakemake, Nextflow, Docker, Singularity, Git, FAIR Data Standards, HPC (SLURM), Agentic Development (Claude Code, GitHub Copilot, Local LLMs)
Programming & Visualization
R, Python, UNIX/Bash, Shiny (Full-stack)
Consultancy
Experimental Design, Statistical Consulting, Data Visualization, Manuscript Preparation

Resume

Here you can find a brief overview of my professional journey and academic qualifications.

You can download my full resume in PDF format here.

Summary

Prateek Arora

AI-Native Bioinformatician & Data Scientist with 6+ years of experience transforming messy, high-throughput experimental data into high-quality structured datasets. Expert in building reproducible pipelines (Snakemake/Nextflow/Docker) and custom tools (LipidLocator) that bridge computational workflows with wet-lab needs. Fluent in agentic coding (Claude Code, Copilot) to rapidly build reusable functions, optimize token strategies, and ship robust data tools. Combines deep biological grounding in multi-omics with statistical rigor to eliminate batch effects and instrument drift.

Education

PhD + MSc (Cellular and Molecular Biology)

2011 - 2020

Tata Institute of Fundamental Research, Mumbai, India

My project involved designing experiments, collecting and analyzing data, and performing statistical tests and regression fitting. Through this project I established several analysis paradigms in the research group. The work was published in an international peer-reviewed journal.

Earlier, I worked on a project studying the role of canonical Wnt signalling in regulating levels of laminin (an extracellular matrix molecule) during development of the zebrafish fin fold, an appendage homologous to the limb.

B.Sc. (H) Microbiology

2011

Ram Lal Anand College, University of Delhi, New Delhi, India

Conferences and Publications

Conference Organization

Organized "Exploring Regeneration through Omics" from scratch (June 2025): secured CHF 10,000 in funding, invited speakers, and managed logistics.

Oral Presentations

Shiny Conf 2024

Swiss Zebrafish Meeting 2023

Selected publications

Carvalho, J.A.S., Arora, P., et al., Developmental Dynamics, 2026 (Co-corresponding author) - https://doi.org/10.1002/dvdy.70158

Arora, P., et al., Bioinformatics Advances, 2026 (Co-corresponding author; LipidLocator) - https://doi.org/10.1093/bioadv/vbag012

Coppe, B., Galardi-Castilla, M., Sanz-Morejón, A., Arora, P., et al., Circulation, 2025 - https://doi.org/10.1161/CIRCULATIONAHA.124.070323

Vahdani, N., Arora, P., et al., Lab on a Chip, 2025 - https://doi.org/10.1039/D5LC00553A

García-Poyatos, C., Arora, P., et al., Developmental Cell, 2024 - https://doi.org/10.1016/j.devcel.2024.04.012

Arora, P., et al., eLife, 2020 - https://doi.org/10.7554/eLife.49064

View all 12 publications on Google Scholar.

Trainings and Certifications

Diving into Deep Learning - Theory and Applications with PyTorch (SIB, Swiss Institute of Bioinformatics)

Introduction to Machine Learning with Python (SIB, Swiss Institute of Bioinformatics)

Long-read sequence analysis (SIB, Swiss Institute of Bioinformatics)

Languages

English - Proficient User (C1/C2)

German - Intermediate (B1)

Hindi - Mother Tongue

Professional Experience

RESEARCH ASSOCIATE - BIOINFORMATICS

July 2023 - Present

University of Bern, Bern, Switzerland

  • Single-Cell & Spatial Innovation: Led the analysis of >30 scRNA-seq datasets (10x, Parse) and pioneered the department's first Oxford Nanopore long-read snRNA-seq workflow. Handled spatial transcriptomics (NanoString, 10x Visium) projects studying in situ vaccination in cancer, using SingleR and clusterProfiler for niche-specific pathway analysis.
  • Tool Development: Conceptualized and built 'LipidLocator', an open-source R Shiny tool for spatial lipidomics. Optimized workflows to reduce project turnaround time by ~97% (from 3 months to 3 days).
  • Multi-omics Integration: Successfully integrated ATAC-seq with RNA-seq, and Proteomics with Transcriptomics to identify common regulatory pathways.
  • Reproducibility Standards & AI-Native Workflows: Enforced strict reproducibility standards using Docker and Snakemake across all computational pipelines. Leveraged agentic workflows (Claude Code/Copilot) to scale capacity, building a customized repository of reusable functions and skills that allowed simultaneous execution of multiple complex projects while ensuring highly standardized, error-checked, and reproducible analytical results across all datasets.
  • Experimental Coordination & Design: Served as primary liaison to the Sequencing Core Facility, collaborating closely with wet-lab scientists to optimize experimental design, select appropriate chemistries, and troubleshoot raw instrument data anomalies at the source to guarantee benchmark-grade data quality.
  • Training & Mentorship: Introduced and trained PhD students in RNA-seq analysis, pathway enrichment (clusterProfiler), and functional interpretation, empowering them to perform independent data exploration.
  • Infrastructure Management: Established and currently maintain the group's high-performance Linux workstation; ensured 100% environment stability.
  • Machine Learning: Trained and improved an AI pipeline based on U-Net and TensorFlow to detect injury vs. healthy tissue in AFOG-stained zebrafish hearts.
  • Methylation Analysis: Discovered methylation differences across generations in mice after cardiac injury using the Infinium Mouse Methylation BeadChip.

POSTDOCTORAL FELLOW - BIOINFORMATICS

January 2021 - June 2023

University of Bern, Bern, Switzerland

  • Reproducible Research: Built robust pipelines (Snakemake) for ATAC-seq studies investigating the WT1 transcription factor's role in zebrafish and mouse sperm, using Bowtie, Picard, MACS2/MACS3, Samtools, Genrich, Bedtools, ChIPseeker, DiffBind, and the MEME Suite for transcription factor site discovery.
  • snRNA-seq Analysis: Advanced autophagy model studies in zebrafish hearts using Seurat and LIANA (ligand-receptor); explored inheritance patterns in mouse hearts via pseudobulk analysis with Libra, pathway analysis, and a Shiny app built with ShinyCell.
  • Algorithm Development: Developed a Ligand-Receptor Network pipeline using bulk RNA-seq, DESeq2, and network theory, directly facilitating two PhD projects.
  • Meta-analysis: Conducted a comprehensive meta-analysis of bulk RNA-seq data for regenerating hearts, uncovering conserved pathways and training PhD students.

SENIOR BIOINFORMATICIAN

June 2019 - June 2020

Elucidata Data Consulting Pvt Ltd, New Delhi, India

  • Pharma Consulting: Led a team developing scalable drug target discovery pipelines on AWS for pharmaceutical partners; identified 20+ potential targets.
  • Project Management: Served as Line Manager for a team of two; managed timelines, validated scientific products, and aligned deliverables with client OKRs.

Portfolio

Welcome to my portfolio! Here, you'll find a selection of my most impactful projects that demonstrate my expertise in bioinformatics and computational biology. From single-cell RNA sequencing in Switzerland to developing intuitive Shiny apps for data visualization, my work spans various innovative areas in the field. I have led advanced ATAC-seq studies to uncover key transcription factors, performed meta-analyses to reveal conserved pathways in heart regeneration, and created efficient pipelines for spatial lipidomics. Each project highlights my commitment to advancing scientific discovery through robust data analysis and collaborative efforts.

  • All
  • Single-cell/Bulk RNA-seq
  • Proteomics/Lipidomics/ATAC-seq
  • Other Shiny Apps
Graphs from single-cell RNA-seq project
Ligand-receptor network visualization app
ATAC-seq analysis in zebrafish hearts
Proteomics heatmap for zebrafish muscle and hearts
Single-nuclei RNA-seq of regenerating zebrafish heart
LipidLocator: spatial lipidomics Shiny app for Mass Spectrometry Imaging on Zebrafish
Graphs from single-nuclei RNA-seq of regenerating mouse hearts
Lab calculator app screenshot

Contact

Get in touch! Whether you have questions, collaboration ideas, or just want to connect, feel free to reach out using the form below or through my social profiles.

Location

University of Bern, Switzerland

Follow Me
