The Data Explorer

Navigating the World of Data Science, Engineering, and Machine Learning.


I am a data professional with extensive experience in data engineering, data science, and machine learning. I specialize in developing data-driven solutions for enterprise systems, building scalable platforms for data capture and analysis, and creating algorithms for personalized recommendations and predictive insights.

Currently, I work as a Senior Data Scientist at Optum (UnitedHealth Group), where I develop predictive models, optimize healthcare systems, and enhance fraud detection using machine learning. My prior roles at Amazon Web Services, Fractal Analytics, Bewakoof Brands, and Reliance Jio have equipped me with expertise in designing recommendation engines, implementing A/B testing frameworks, and engineering real-time data pipelines.

I hold an M.S. in Data Science from the University of San Francisco, where I focused on Advanced Machine Learning, Distributed Systems, Data Structures & Algorithms, and Statistical Modeling.

On this website, you can explore my skills, experience, and projects, or get in touch with me to discuss potential opportunities and collaborations.

View Resume

Education

M.S. in Data Science

University of San Francisco | Jul 2022 - Jun 2023

Relevant Coursework:

  • Advanced Machine Learning
  • Distributed Data Systems
  • Data Structures & Algorithms
  • Relational Databases
  • Experiments in Data Science
  • Linear Regression

B.E. in Computer Engineering

University of Mumbai | Jul 2015 - May 2018

Relevant Coursework:

  • Data Structures & Algorithms
  • Computer Networks
  • Database Management Systems
  • Machine Learning

Skills & Capabilities

Discover my wide-ranging skills in data engineering, data science, and machine learning, developed through real-world experience working with various technologies and data-focused projects.

Big Data & Databases

Proficient in Kafka, Flink, Spark, HDFS, Elasticsearch, MySQL, and MongoDB.

Cloud Platforms

Skilled in GCP tools such as Pub/Sub, Dataflow, Storage, BigQuery, Cloud Functions, Firestore, Dataproc, Recommendations AI, and Composer.

Programming Languages

Strong programming skills in Python and Java, with expertise in SQL.

Machine Learning Techniques

Knowledge of decision trees, random forests, linear regression, logistic regression, and feature engineering.

Data-driven Solutions

Experience developing data warehouse solutions and building scalable platforms for data capture, visualization, and analysis.

Collaboration & Communication

Effective at collaborating with cross-functional teams and translating complex technical concepts for non-technical stakeholders.

For a more comprehensive overview of my work experience, please download my resume using the button below:

View Resume

Projects

Here are some of the projects I've worked on:

OnlyStats - Entrepreneurship in Data Science

As part of a data science entrepreneurship course, I collaborated in a team of five to conceptualize, design, and build an MVP for a sports analytics company. The platform serves both B2B and B2C markets, providing data-driven insights to sports franchises and fantasy sports players.

Product Search Engine

A search engine pipeline built with PySpark, MongoDB, and Airflow. The pipeline fetches product data from the ASOS API, stores the data in Google Cloud Storage (GCS), loads the data into MongoDB, and calculates BM25 scores to rank the top 10 items.
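To illustrate the ranking step, here is a minimal sketch of BM25 scoring in plain Python. It is not the project's PySpark code: the function name `bm25_rank`, the whitespace tokenizer, the smoothed IDF variant, and the parameter defaults `k1=1.5` and `b=0.75` are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_rank(query, documents, k1=1.5, b=0.75, top_n=10):
    """Return (doc_index, score) pairs for the top_n documents by BM25.

    Hypothetical helper: tokenization is a plain lowercase whitespace split,
    and k1 / b are common default values, not values taken from the project.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for toks in tokenized:
        doc_freq.update(set(toks))

    query_terms = set(query.lower().split())
    scores = []
    for idx, toks in enumerate(tokenized):
        term_freq = Counter(toks)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Smoothed IDF; the "+ 1" inside the log keeps it non-negative.
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append((idx, score))

    # Highest score first; keep only the top_n results.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Toy product titles standing in for the fetched catalogue.
products = ["black leather jacket", "blue denim jacket", "red cotton dress"]
ranked = bm25_rank("leather jacket", products)
```

In this toy run, the title matching both query terms ranks first, the title matching only "jacket" ranks second, and the unrelated title scores zero.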

Dataflow Pipelines

A collection of Apache Beam Dataflow pipelines, demonstrating various use cases for reading, processing, and writing data using Google Cloud Platform services, such as Pub/Sub, BigQuery, and Cloud Storage.

JDBC to BigQuery with Encryption

This project demonstrates how to read data from JDBC and write it to BigQuery with encryption using Google Tink and KMS. It also provides an example of how to customize the encryption by specifying the PII columns.

Recommendation System

Product recommendations for Amazon apparel using content-based recommendation techniques. The project explores different text-based approaches to create meaningful recommendations for users based on the textual data associated with each product.
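One common text-based approach is to compare product titles with TF-IDF vectors and cosine similarity. The sketch below shows that idea in plain Python; it is an illustrative assumption rather than the project's actual method, and the names `tfidf_vectors`, `cosine`, and `recommend` as well as the sample titles are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Build sparse TF-IDF vectors (dicts) from lowercase, whitespace-split texts."""
    tokenized = [text.lower().split() for text in texts]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for toks in tokenized:
        doc_freq.update(set(toks))
    vectors = []
    for toks in tokenized:
        term_freq = Counter(toks)
        # Smoothed IDF so that terms present in every text keep a positive weight.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(texts, item_index, top_n=3):
    """Return (index, similarity) pairs for the items most similar to item_index."""
    vectors = tfidf_vectors(texts)
    target = vectors[item_index]
    scored = [
        (idx, cosine(target, vec))
        for idx, vec in enumerate(vectors)
        if idx != item_index
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Made-up titles standing in for real product data.
titles = [
    "mens blue cotton shirt",
    "mens blue denim shirt",
    "womens red silk scarf",
]
recs = recommend(titles, 0, top_n=2)
```

Here the second shirt is recommended first because it shares three title words with the query item, while the scarf shares none and gets a similarity of zero.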

Blogs

I also write about my professional interests on Medium. Here are a couple of examples:

Data Encryption in Dataflow Streaming Jobs

Join us as we explore how to use Google Tink and Google KMS to encrypt and decrypt data in Google Dataflow streaming jobs.

Deploy Flask App on Ubuntu Under 15 minutes!

Flask is a micro web framework written in Python. It is classified as a microframework because it does not require particular tools or…

Contact Me

Get in touch with me through any of the following channels: