Vaibhav Kulkarni

Engineering Lead | Data & AI Platforms for Drug Discovery & Development

📍 Switzerland

about

I am an engineering lead who enjoys building production-grade data & AI ecosystems that accelerate modern drug discovery. I specialize in architecting unified data platforms, from data warehouses for clinical & multi-omics data to intuitive analytical applications that empower scientific teams.

At Debiopharm, I drive the technical implementation of our core data-fabric and AI-driven drug development programs, leading the team in delivering platforms for small molecule & ADC research. My expertise is grounded in years of hands-on data & ML engineering across genomics, supply-chain, & IoT domains, complemented by a PhD in privacy-preserving machine learning.

I believe in continuous learning & enjoy using my skills to solve meaningful problems, both professionally & personally. I regularly contribute to open-source software & participate in global competitions focused on computational drug discovery. At the heart of my work is a commitment to building high-quality software, guided by a passion for writing clean, pragmatic code.

interests

AI-driven therapeutic design
Unified data-analytics platforms
Production-grade MLOps & systems

experience

Debiopharm

Engineering Manager: Data & Bioinformatics

July 2025 – Present
Lausanne, Switzerland
  • Leading a cross-functional team of data engineers & bioinformaticians to architect & deliver next-generation data fabric & bioinformatics platforms
  • Engineering bioinformatics pipelines to process & harmonize multi-omics & non-clinical data for biomarker discovery programs
  • Partnering with the Translational Medicine department to provide data analysis & bioinformatics support for projects in small molecule discovery & antibody-drug conjugate (ADC) research
Debiopharm

Data Engineering Team Lead

July 2022 – June 2025
Lausanne, Switzerland
  • Architected & led the full-stack development of the company's most utilized data platform, the Central Data Repository, automating ingestion & quality control of clinical & non-clinical trial data at enterprise scale & positioning it as the central data hub for all research & development programs
  • Engineered the deployment of custom-configured scientific analytics platforms (SAS Viya, Posit, Dotmatics, Expressions) on company-controlled cloud infrastructure, creating a unified ecosystem interfaced with the central data repository to ensure stringent performance & data governance
  • Engineered an entity-recognition & search platform that mines scientific publication databases, automating internal trend-monitoring functions
  • Productionised privacy-aware large language model-based applications to streamline document analysis & study protocol writing processes
  • Supported the compliance team via DevOps-based computerised system validation, infrastructure qualification & security audits, automating GxP workflows
  • Responsible for technical due diligence of healthcare startups (technical stack, software & ML strategy) to inform the investment decisions of our VC funds
SOPHiA GENETICS

Senior Data Engineer

April 2021 – June 2022
Lausanne, Switzerland
  • Implemented terabyte-scale ETL pipelines to transform raw genomics data into an analytics-ready format, significantly lowering processing time
  • Developed API endpoints & CLI tools to expose database functionalities to internal teams, facilitating data democratisation & access control
  • Implemented regression testing mechanisms for variant detection bioinformatics pipelines & dashboards delivering real-time insights into KPIs
  • Authored formal TLA+ specifications for ETL operations, enabling quantification & benchmarking of logical correctness across the data stack
GenLots

Software Engineer | Research & Development Lead

September 2019 – March 2021
Lausanne, Switzerland
  • Developed a microservices architecture, streamlining ingestion, transformation, & delivery of client ERP data to production-grade ML services
  • Deployed a reinforcement learning-based solution, optimising end-to-end supply chain planning, warehouse management, & CO2 emissions
  • Engineered a suite of tools for reporting & monitoring integrated into dashboards delivering real-time business performance metrics to clients
  • Directed research projects in collaboration with Swiss watch manufacturing firms to predict inbound material needs & stabilise supply chains
HEC Lausanne

Research Scientist

Distributed Object Programming Lab

November 2015 – August 2019
Lausanne, Switzerland
  • Partnered with public transport firms to architect & implement systems capturing high-frequency, real-time ridership data into a data warehouse
  • Developed a Machine Learning-driven platform to proactively detect & reduce ticketless travel across the Lausanne public transport network
  • Led platform development for a large-scale (>300 participants) spatiotemporal mobility-data collection project to facilitate ML-privacy research
ETH Zürich

Project Engineer

January 2015 – August 2015
Zürich, Switzerland
  • Implemented a software-defined-radio-based platform for reliable IoT product testing, generating interference patterns of wireless appliances
  • Deployed the platform in production at ETH Zurich & TU Berlin with toolsets allowing for remote configuration & logging performance metrics
BOLT IoT

Embedded Systems Engineer

September 2011 – January 2013
Goa, India
  • Designed PCB layouts for embedded platforms used in robotics & IoT systems & collaborated with software teams to ensure firmware integration
  • Implemented toolsets & applications to benchmark platform performance, validation tests, EMI simulations & industry compliance certifications
NIO

Project Intern

National Institute of Oceanography

June 2011 – August 2011
Goa, India
  • Contributed to an autonomous underwater vehicle localization project by developing triangulation algorithms, while learning satellite communication & ocean current modeling

skills

▶ Data & Platform Engineering

Enterprise Platform Architecture

Designing & building scalable, production-grade platforms from data ingestion to user-facing applications, with focus on system reliability, performance optimization, & maintainability.

Distributed Systems, Microservices Architecture, Domain-Driven Design (DDD), Onion Architecture, API Design, System Integration

Data Architecture

Designing & implementing enterprise-scale data repositories, data warehouses, & real-time data processing systems.

Data Warehousing, Data Lakes, SQL (PostgreSQL), NoSQL (MongoDB), ETL/ELT Pipelines

Cloud & Infrastructure

Deploying & managing scalable, cloud-native applications & infrastructure.

AWS, Azure, GCP, Docker, Kubernetes, Terraform, Infrastructure as Code (IaC)
▶ AI & Machine Learning

Production ML Systems (MLOps)

Building & deploying robust, scalable, & reproducible ML pipelines & services.

PyTorch, Scikit-learn, MLflow, Kubeflow, CI/CD for ML

Large Language Models (LLMs)

Developing & productionizing privacy-aware LLM applications for scientific document analysis & generation.

Hugging Face, LangChain, Vector Databases, Prompt Engineering

Predictive Modeling

Creating ML-driven platforms for predictive analytics & statistical analysis.

Python (Pandas, NumPy), Jupyter, Reinforcement Learning
▶ Bioinformatics & Computational Biology

Multi-Omics Data Processing

Engineering bioinformatics pipelines for harmonizing & analyzing genomics, transcriptomics, & other omics data.

Nextflow, Snakemake, Bioconductor, RNA-seq & Variant Calling Pipelines

Computational Drug Discovery

Supporting small molecule & antibody-drug conjugate (ADC) research with data analysis & bioinformatics.

RDKit, Cheminformatics, Molecular Modeling Tools

Biomarker Discovery

Developing platforms & strategies to identify & validate novel biomarkers from complex datasets.

R, Python, Statistical Analysis, Data Visualization
▶ Quality & DevOps

Automated Testing Frameworks

Developing & owning regression & performance testing systems for critical data pipelines.

Pytest, CI/CD Integration, GitHub Actions, Jenkins

System Validation & Qualification

Implementing DevOps-based automation for GxP compliance, infrastructure qualification, & security audits.

GxP Compliance, DevOps Methodologies, Automated Auditing

Formal Methods

Using formal specifications to verify the logical correctness & reliability of complex data systems.

TLA+
▶ Leadership & Strategy

Cross-Functional Team Leadership

Leading teams of data engineers, bioinformaticians, & software developers.

Agile, Scrum, Kanban, Jira, Confluence

Technical Strategy & Roadmapping

Defining the technical vision for AI & data platforms in R&D environments.

Product Roadmapping, Stakeholder Management

Technical Due Diligence

Evaluating the technology stacks & ML strategies of healthcare startups for investment purposes.

Tech Stack Evaluation, Risk Assessment

education

UNIL

PhD in Computer Science

University of Lausanne - Information Systems Department

November 2015 – August 2019

Thesis: Information Systems for Privacy-Aware Machine Learning

TU/e

M.Sc. Embedded Systems

Technische Universiteit Eindhoven (Master Thesis at ETH Zürich)

August 2014 – August 2015

Graduated Cum Laude

TU Berlin

M.Sc. Information & Communication Technology

Technische Universität Berlin

February 2013 – August 2014

Graduated with Honours

peer-reviewed publications

Generating Synthetic Mobility Traffic using Recurrent Neural Networks

V. Kulkarni, B. Garbinato

ACM SIGSPATIAL - AI and Deep Learning for Geographic Knowledge Discovery, 2017

Examining the Limits of Predictability of Human Mobility

V. Kulkarni, A. Mahalunkar, B. Garbinato, J. D. Kelleher

Entropy, 2019

Generative Models for Simulating Mobility Trajectories

V. Kulkarni, N. Tagasovska, T. Vatter, B. Garbinato

Neural Information Processing Systems Workshop on spatiotemporal modeling, 2018

MobiDict: A Mobility Prediction System Leveraging Realtime Location Data Streams

V. Kulkarni*, A. Moro*, B. Garbinato (*co-primary authors)

ACM SIGSPATIAL Workshop on GeoStreaming, 2016

Information Disclosure in Location-based Services: An Extended Privacy Calculus Model

D. Naous, V. Kulkarni, C. Legner, B. Garbinato

International Conference on Information Systems (ICIS), 2019

Privacy-Preserving Location-Based Services by using Intel SGX

V. Kulkarni, B. Chapuis, B. Garbinato

ACM SenSys Workshop on Human-centered Sensing, Networking, and Systems, 2017

20 Years of Mobility Modeling & Prediction: Trends, Shortcomings & Perspectives

V. Kulkarni, B. Garbinato

Advances in Geographic Information Systems (ACM SIGSPATIAL), 2019

Capturing Complex Behavior for Predicting Distant Future Trajectories

B. Chapuis, A. Moro, V. Kulkarni, B. Garbinato

ACM SIGSPATIAL Workshop on Mobile Geographic Information Systems, 2016

On the Inability of Markov Models to Capture Criticality in Human Mobility

V. Kulkarni, A. Mahalunkar, B. Garbinato, J. D. Kelleher

28th International Conference on Artificial Neural Networks (ICANN), 2019

Extracting Hotspots without A-priori by Enabling Signal Processing over Geospatial Data

V. Kulkarni, A. Moro, B. Chapuis, B. Garbinato

Advances in Geographic Information Systems (ACM SIGSPATIAL), 2017

Breadcrumbs: A Feature Rich Mobility Dataset with Point of Interest Annotation

A. Moro, V. Kulkarni, P. Ghiringhelli, B. Chapuis, K. Huguenin, B. Garbinato

Advances in Geographic Information Systems (ACM SIGSPATIAL), 2019

Controlled Interference Generation for Wireless Coexistence Research

A. Hithnawi, V. Kulkarni, S. Li, H. Shafagh

ACM MobiCom workshop in Software Radio Implementation Forum (SRIF), 2016

Capstone: Mobility Modeling on Smartphones to Achieve Privacy by Design

V. Kulkarni, A. Moro, B. Chapuis, B. Garbinato

IEEE Conference On Trust, Security & Privacy In Computing & Communications, 2018

A Mobility Prediction System Leveraging Realtime Location Data Streams

V. Kulkarni, A. Moro, B. Garbinato

ACM Conference on Mobile Computing and Networking (MobiCom) (Poster), 2016

certifications

Machine Learning in Production

DeepLearning.AI

July 2025

Drug Discovery

UC San Diego

September 2024

Rust Fundamentals

Duke University

June 2024

Generative AI with Large Language Models

DeepLearning.AI & AWS

December 2023

Data Engineering with Azure

Udemy

January 2023

Applied Machine Learning

University of Michigan

April 2020

Data Science

University of Michigan

April 2020

Amazon Web Services

AWS

May 2019

Laws & Economics of Media Platforms

University of Chicago

June 2018

Information Security

University College London

January 2018

Machine Learning

Stanford University

January 2016
