Recent graduate in data analytics with a passion for using data to drive business insights and decision-making. Strong analytical skills and real-world experience with a variety of data analysis tools and technologies, including SQL, Python, and Excel. Through previous internships and academic projects, I have developed the ability to collect, clean, and analyze large datasets, as well as communicate findings clearly and effectively to diverse audiences. Seeking a role that will utilize my skills and experience to make a meaningful impact in a data-driven organization.
Logistic Regression, Linear Regression, Decision Tree, Random Forest, Forecasting
Pandas, NumPy, scikit-learn, Seaborn
Data Mining, Wrangling, Cleansing, Manipulation, Storytelling, Visualization, Anomaly Detection, Advanced Querying, Dashboarding, EDA
Data + Analytics (Intern)
PepsiCo
Reporting / Data Analysis
Old Cobblers Farm
M.S. Data Analytics GPA: 3.5
B.S. Strategic Marketing GPA: 3.3
This project uses SQL and Python to create a dynamic probability calculator that lets you take a deep dive into different pitch statistics, such as:
This machine learning project used a Gradient Boosting Classifier, which returned a cross-validation training score of 0.9079. The model had a binary target variable: survived (1) or died (0). The GBC predicted 73 deaths versus an actual count of 84. The model scored 0.73 recall for survived and 0.91 recall for died; precision came out to 0.85 for died and 0.84 for survived. The total accuracy of the model was 0.843.
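The original dataset is not included here, so the following is only a minimal sketch of the training-and-evaluation workflow described above, using synthetic data in place of the real survival data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the real dataset (binary target: 1 = survived, 0 = died).
X, y = make_classification(n_samples=891, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = GradientBoostingClassifier(random_state=42)

# Cross-validation score on the training split, as in the project write-up.
cv = cross_val_score(model, X_train, y_train, cv=5).mean()

# Fit on the training split, then report precision/recall/accuracy on held-out data.
model.fit(X_train, y_train)
acc = accuracy_score(y_test, model.predict(X_test))
print(classification_report(y_test, model.predict(X_test)))
```

The exact scores depend on the data, so the figures quoted above will not reproduce from this synthetic sketch.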
With a passion for data and drawing conclusions from it, I have always wanted to gain more experience with different technologies as I grow. This particular project could have been done in an Excel sheet, but I wanted to apply some of the knowledge I have picked up through Codecademy to a real analysis. The technologies I chose were Python and SQL. Python is great for data collection with the bs4 package, so I used it to loop through a series of names and link IDs to get the span of time each player played in the NBA. I then stored that data in SQL so it could be easily queried from a Jupyter notebook. Within Python, I used SQL queries to my advantage when performing data manipulation, which made it seamless to create different visualizations. From this experience I learned a lot about working through a clean notebook, and I picked up many data engineering skills while working with different structures to make the analysis smooth. It also helped me refresh familiar topics such as matplotlib, pandas, and seaborn, and I used new visualization types to explore the data.
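A minimal sketch of the parse-then-store pattern described above. The HTML snippet, player name, and table layout here are hypothetical (the real project fetched live player pages and looped over many link IDs), and SQLite stands in for whichever SQL database the project actually used:

```python
import sqlite3

from bs4 import BeautifulSoup

# Hypothetical fragment of a player page; the real project requested
# pages over the network and iterated through a list of link IDs.
html = """
<div id="meta">
  <h1>Kobe Bryant</h1>
  <span class="years">1996-2016</span>
</div>
"""

# Parse the name and career span out of the page with bs4.
soup = BeautifulSoup(html, "html.parser")
name = soup.find("h1").get_text()
start, end = soup.find("span", class_="years").get_text().split("-")

# Store the scraped data in SQL so it can be queried from a notebook.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE careers (player TEXT, start_year INT, end_year INT)")
conn.execute("INSERT INTO careers VALUES (?, ?, ?)", (name, int(start), int(end)))

# Career length in years, computed by the database rather than in Python.
span, = conn.execute(
    "SELECT end_year - start_year FROM careers WHERE player = ?", (name,)
).fetchone()
print(span)  # 20
```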
This is a link to my Tableau Public page, which contains some of my own work. The data in these dashboards is all real-world data collected and cleansed by me.
This is a general repository showcasing SQL code that I have developed for a small e-commerce company, as well as general practice.
This project used data from my collegiate baseball team to predict runs scored against and runs batted in. I used two different ML models, one per outcome: logistic regression (runs scored against SNHU) and random forest (runs batted in for SNHU). Apart from the models and their assessment, the analysis explores correlations, distributions, and box plots (for outlier detection).
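A rough sketch of the two-model setup described above. The box-score features and targets here are randomly generated placeholders (the real project used team game data not included here), and the feature names are assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical game-level features (e.g. hits, errors, walks, pitches).
X = rng.normal(size=(120, 4))
# Binarized "runs scored against" outcome for the logistic regression.
runs_against = (X[:, 0] + rng.normal(size=120) > 0).astype(int)
# Non-negative RBI counts for the random forest.
rbis = np.clip(np.round(2 + X[:, 1] + rng.normal(size=120)), 0, None)

X_tr, X_te, ya_tr, ya_te, yr_tr, yr_te = train_test_split(
    X, runs_against, rbis, test_size=0.25, random_state=0
)

# One model per outcome, as in the project write-up.
clf = LogisticRegression().fit(X_tr, ya_tr)               # runs scored against
reg = RandomForestRegressor(random_state=0).fit(X_tr, yr_tr)  # runs batted in
```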
This project was an analysis of a small e-commerce store that sells on Amazon, Shopify, eBay, and Etsy. The goal was to get a better understanding of the customers we serve, as well as warehouse optimization for products. The project takes in data from multiple sources to extract the best information we can from the data. The technologies used were Python and MySQL Workbench.
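A small sketch of combining order data from multiple sales channels, the core step of the project above. The SKUs, quantities, and column names are made up for illustration, and pandas stands in for the MySQL staging step:

```python
import pandas as pd

# Hypothetical order extracts; the real project pulled exports from
# Amazon, Shopify, eBay, and Etsy and staged them in MySQL.
amazon = pd.DataFrame({"sku": ["A1", "A2"], "qty": [3, 1], "channel": "amazon"})
shopify = pd.DataFrame({"sku": ["A1", "B1"], "qty": [2, 7], "channel": "shopify"})

# Stack the per-channel extracts into one orders table.
orders = pd.concat([amazon, shopify], ignore_index=True)

# Total units per SKU across channels -> which products belong in
# the most accessible warehouse locations.
demand = orders.groupby("sku")["qty"].sum().sort_values(ascending=False)
print(demand.index[0])  # "B1"
```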