~ Explore My Latest Projects ~
Analyzed a large-scale A/B test for the mobile game 'Cookie Cats' using Python, measuring the impact of moving an in-game gate (level 30 → 40) on player retention and engagement.
Read More
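As a rough illustration of how a retention comparison between the two gate groups can be tested, here is a minimal sketch of a two-proportion z-test using only the standard library. The choice of test is an assumption made for this example (the write-up covers the method actually used), and the counts below are made up, not the Cookie Cats results.

```python
from math import erf, sqrt


def two_proportion_ztest(success_a, n_a, success_b, n_b):
    """Return (difference in rates, z statistic, two-sided p-value)."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that both groups retain equally.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return p_a - p_b, z, p_value


# Hypothetical counts for illustration only (retained players, group size).
diff, z, p_value = two_proportion_ztest(
    success_a=1900, n_a=10000,  # e.g. gate at level 30
    success_b=1800, n_b=10000,  # e.g. gate at level 40
)
print(f"difference={diff:.4f}, z={z:.2f}, p={p_value:.3f}")
```

A bootstrap over per-player retention flags is a common alternative when a normal approximation is not desirable.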
Dashboard to help executives answer a fundamental question: “Is it worth continuing the iPhone Mini line?” Balances high-level KPIs with the ability to compare across product lines, drill into time trends, and examine product attributes such as color, storage, and geography.
Read More
Building an automated data pipeline that extracts GPU listings from Amazon, transforms the data, and loads it into a PostgreSQL database for analysis. This post walks through the development of this ETL pipeline using Apache Airflow for orchestration, Docker for containerization, and PostgreSQL for data storage.
Read More
An interactive dashboard that lets executives track product quantity performance across countries and time periods. The dashboard provides both high-level KPIs and the ability to drill down into specific areas of concern.
Read More
A machine learning implementation to facilitate an autofill system by predicting the frame materials of a building based on its other structural attributes. The model uses a Balanced Random Forest Classifier, and the project also experiments with other techniques such as Logistic Regression, Decision Trees, and the Synthetic Minority Over-sampling Technique (SMOTE).
Read More
Data cleaning and wrangling on a proprietary dataset provided by Partner Engineering, containing millions of data points on various commercial real estate properties. The dataset was transformed to prepare it for analysis, in which correlations between structural attributes were explored and evaluated as potential inputs to the machine learning model.
Read More
This is a quick demo of what can be done with the Pokemon REST API and pandas! In this project, we go over accessing data via an API GET request, transforming the JSON output into a pandas DataFrame, and querying and aggregating the data with pandas to answer some questions. We use the pokemon endpoint of the PokeAPI; documentation on this and the other endpoints can be found in the notebook.
Read More
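The three steps in the Pokemon demo (GET request, JSON to DataFrame, aggregation) can be sketched roughly as follows. This is a minimal sketch, not the notebook's code: the helper names `fetch_pokemon` and `to_frame` are hypothetical, the choice of fields is an assumption, and the sample records are hand-written stand-ins so the example runs without a network call.

```python
import json
from urllib.request import Request, urlopen

import pandas as pd

API_URL = "https://pokeapi.co/api/v2/pokemon/{name}"


def fetch_pokemon(name: str) -> dict:
    """GET one record from the pokemon endpoint and decode the JSON body."""
    request = Request(API_URL.format(name=name), headers={"User-Agent": "pokeapi-demo"})
    with urlopen(request, timeout=10) as response:
        return json.load(response)


def to_frame(records: list[dict]) -> pd.DataFrame:
    """Flatten the fields of interest into one row per Pokemon."""
    rows = [
        {
            "name": r["name"],
            "height": r["height"],
            "weight": r["weight"],
            "primary_type": r["types"][0]["type"]["name"],
        }
        for r in records
    ]
    return pd.DataFrame(rows)


# Hand-written records with illustrative values, trimmed to the fields used above.
# With network access: sample = [fetch_pokemon(n) for n in ("bulbasaur", "charmander")]
sample = [
    {"name": "bulbasaur", "height": 7, "weight": 69, "types": [{"type": {"name": "grass"}}]},
    {"name": "ivysaur", "height": 10, "weight": 130, "types": [{"type": {"name": "grass"}}]},
    {"name": "charmander", "height": 6, "weight": 85, "types": [{"type": {"name": "fire"}}]},
]

df = to_frame(sample)
# Aggregate: mean weight per primary type.
avg_weight = df.groupby("primary_type")["weight"].mean()
print(avg_weight)
```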
Understanding the landscape of data science salaries is crucial for job seekers and employers alike. To provide a clear picture, I developed a Tableau dashboard showcasing average data science job salaries worldwide.
Read More
This project aims to explore machine learning methods and their development processes by training on an IMDb ratings dataset to predict whether a user had a positive or negative sentiment towards a given movie. In this project we utilized kNN, Logistic Regression, Feedforward Neural Network, and Random Forest models to identify the most accurate fit.
Read More
An intersection of brain imaging, data analysis, and the legal system defined my experience as an Imaging Analysis Research Assistant at the University Neurocognitive Imaging lab. Over several months, I analyzed PET and DTI scans using MATLAB and SPM, mapping regions of interest and producing z-score maps. Our research provided pivotal evidence in three criminal trials, underscoring the practical relevance of neuroimaging and enhancing my proficiency in statistical inference in a unique context.
Read More
A robust web crawler that scrapes extensive textual data from diverse ICS webpages, paired with a custom search engine that indexes and queries the data in under 300 milliseconds. This application utilizes libraries such as bs4, pandas, and NLTK, as well as advanced page ranking algorithms to optimize search result relevance and efficiency. This project not only deepened my expertise in web scraping and algorithmic optimization but also highlighted the practical applications of data acquisition.
Read More
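To give a feel for the indexing and querying side of the search engine, here is a minimal sketch of an inverted index with tf-idf ranking, written with the standard library only. It is an illustration under assumptions, not the project's implementation: the real engine's tokenization, ranking signals, and storage are not shown here, and the function names and sample pages are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, n_docs, query):
    """Rank documents by summed tf-idf weight over the query terms."""
    scores = Counter()
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        # Smoothed idf so that terms present in every document still count.
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += (1 + math.log(tf)) * idf
    return [doc_id for doc_id, _ in scores.most_common()]


# Hypothetical pages standing in for crawled content.
docs = {
    "a": "machine learning course schedule",
    "b": "learning resources for machine learning research",
    "c": "campus parking information",
}
index = build_index(docs)
results = search(index, len(docs), "machine learning")
print(results)
```

Because lookups touch only the postings for the query terms, query time depends on how many documents contain those terms rather than on the size of the whole corpus.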