The Data Science and Visualization bootcamp at UC Berkeley was a six-month program designed to make me highly competent with a standard data science tech stack. Throughout the bootcamp, I was challenged to use a wide variety of technologies to complete assignments with data from many industries, including financial, political, geographical, weather, performance, inventory, and biological data (among others). My skills in Python, SQL, Javascript, HTML, and CSS were fortified, and I was exposed to many incredibly useful and oftentimes blazingly efficient libraries, modules, software tools, and packages that help me take any project from raw, unstructured data to fully furnished visualizations on fully deployed dashboards and webpages.
Below, I present work I completed for this bootcamp. Click on the titles or images below to see my complete solutions for each assignment and project.
Language: Python Keywords: Python
Given a 2-column dataset, I used a Python for-loop to collect summary statistics and write the output to a .txt file.
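A minimal sketch of that kind of loop, assuming the two columns hold a label and a numeric value. The column names `label` and `value`, the sample rows, and the helper names are hypothetical and not taken from the assignment itself:

```python
import csv
import io

# Hypothetical sample data standing in for the assignment's 2-column file.
SAMPLE_CSV = """label,value
Jan,100
Feb,-50
Mar,250
"""


def summarize(rows):
    """Collect count, total, mean, max, and min in a single for-loop."""
    count = 0
    total = 0.0
    largest = None   # (label, value) pair with the largest value so far
    smallest = None  # (label, value) pair with the smallest value so far
    for row in rows:
        value = float(row["value"])
        count += 1
        total += value
        if largest is None or value > largest[1]:
            largest = (row["label"], value)
        if smallest is None or value < smallest[1]:
            smallest = (row["label"], value)
    if count == 0:
        raise ValueError("no rows to summarize")
    return {"count": count, "total": total, "mean": total / count,
            "max": largest, "min": smallest}


def format_report(stats):
    """Render the statistics as the text that would go in the .txt file."""
    max_label, max_value = stats["max"]
    min_label, min_value = stats["min"]
    return (
        f"Rows: {stats['count']}\n"
        f"Total: {stats['total']:.2f}\n"
        f"Mean: {stats['mean']:.2f}\n"
        f"Max: {max_label} ({max_value:.2f})\n"
        f"Min: {min_label} ({min_value:.2f})\n"
    )


def write_report(stats, path):
    """Write the formatted report to a text file at the given path."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(format_report(stats))


stats = summarize(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = format_report(stats)
print(report, end="")
```

Calling `write_report(stats, "summary.txt")` would save the same text to disk; the file name is only an example.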
Language: Python Keywords: Python
Given a 2-column dataset, I used a Python for-loop to count votes, determine the winner, and write the output to a .txt file.
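The vote count can be sketched along these lines, assuming one row per ballot with a voter ID and a candidate name. The column names, the sample ballots, and the report layout are hypothetical:

```python
import csv
import io

# Hypothetical ballots: one row per vote (voter ID, candidate).
SAMPLE_CSV = """voter_id,candidate
1,Avery
2,Blake
3,Avery
4,Casey
5,Avery
"""


def tally_votes(rows):
    """Count votes per candidate with a plain for-loop."""
    counts = {}
    for row in rows:
        name = row["candidate"]
        counts[name] = counts.get(name, 0) + 1
    return counts


def build_report(counts):
    """Build the text report: total, per-candidate share, and the winner."""
    total = sum(counts.values())
    lines = [f"Total votes: {total}"]
    for name, votes in sorted(counts.items(), key=lambda item: -item[1]):
        lines.append(f"{name}: {votes / total:.1%} ({votes})")
    # On a tie, max() returns the first candidate encountered in the data.
    winner = max(counts, key=counts.get)
    lines.append(f"Winner: {winner}")
    return "\n".join(lines) + "\n"


counts = tally_votes(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = build_report(counts)
print(report, end="")
```

Writing `report` to a .txt file is then a single `open(path, "w")` and `write` call.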
Language: Python Keywords: Jupyter, Pandas, Numpy
Given a microtransaction dataset with several columns, I used pandas to munge the data and perform some exploratory data analysis (EDA).
Language: Python Keywords: Jupyter, Pandas, Numpy
I used pandas to munge and perform some EDA on school district data, merging the source tables and building my own dataframes.
Language: Python Keywords: Jupyter, Matplotlib, Pandas, Numpy
I provided insights about a ride-sharing app with a bubble plot and three pie charts.
Language: Python Keywords: Jupyter, Matplotlib, Pandas, Numpy
I provided visualizations for a study of several anti-cancer drugs and their effects on mice.
Language: Python, HTML, CSS Keywords: Jupyter, Matplotlib, Pandas, Numpy, Requests, Citipy, OpenWeather API
I leveraged the OpenWeather API to analyze the relationships between latitude and several weather dimensions. I also deployed a static webpage that displays the resulting scatterplots.
Language: SQL Keywords: MySQL, Sakila, MySQL Workbench
I used SQL queries on a MySQL database to explore relationships within a video rental relational database. The commands listed herein include insertions, joins, group-bys, nested queries, and views.
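As a rough illustration of those command types, the sketch below runs an insertion, a join with a group-by, a nested query, and a view. It uses Python's built-in `sqlite3` with an in-memory database as a stand-in for MySQL, and a toy three-table schema loosely modeled on Sakila's `actor`, `film`, and `film_actor` tables; the rows and the view name `actor_film_count` are made up for the example:

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
""")

# Insertions (made-up rows).
conn.executemany("INSERT INTO actor VALUES (?, ?, ?)",
                 [(1, "ANN", "LEE"), (2, "BOB", "RAY")])
conn.executemany("INSERT INTO film VALUES (?, ?)",
                 [(1, "ALPHA"), (2, "BETA"), (3, "GAMMA")])
conn.executemany("INSERT INTO film_actor VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 2), (1, 3)])

# Join + group-by: number of films per actor.
film_counts = conn.execute("""
    SELECT a.first_name, a.last_name, COUNT(fa.film_id) AS film_count
    FROM actor AS a
    JOIN film_actor AS fa ON fa.actor_id = a.actor_id
    GROUP BY a.actor_id
    ORDER BY film_count DESC
""").fetchall()

# Nested query: actors who appear in the film titled 'BETA'.
beta_cast = conn.execute("""
    SELECT first_name FROM actor
    WHERE actor_id IN (
        SELECT actor_id FROM film_actor
        WHERE film_id IN (SELECT film_id FROM film WHERE title = 'BETA')
    )
    ORDER BY first_name
""").fetchall()

# View: save the per-actor count so it can be queried like a table.
conn.execute("""
    CREATE VIEW actor_film_count AS
    SELECT a.actor_id, a.last_name, COUNT(fa.film_id) AS film_count
    FROM actor AS a
    JOIN film_actor AS fa ON fa.actor_id = a.actor_id
    GROUP BY a.actor_id
""")
busy_actors = conn.execute(
    "SELECT last_name FROM actor_film_count WHERE film_count > 1"
).fetchall()

print(film_counts, beta_cast, busy_actors)
```

The SQL statements themselves are plain enough that they should carry over to MySQL largely unchanged, though that is an assumption rather than something tested here.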
Language: Python Keywords: Flask, SQLAlchemy, Pandas, MongoDB
I used SQLAlchemy within a Jupyter notebook to obtain data from a SQLite file and created some simple visualizations of the data. I then used SQLAlchemy to query that file dynamically via a Flask route.
Language: Python, HTML Keywords: Jupyter, BeautifulSoup, Pandas, Requests, Splinter, Flask, MongoDB
I created a webpage that, at the press of a button, calls a Splinter web scraper to obtain information from four different outer space-related webpages, then displays the results, including images and text, on my webpage.
Language: Javascript, HTML, CSS Keywords: JSON, Bootstrap CSS
I was given a JSON array of objects containing information about purported UFO sightings. I used a Javascript file to parse that data and dynamically add it to a table on an HTML page. I also added two filters to control which data populated the table.
Language: Javascript, Python, HTML Keywords: Plotly.js, Flask, Heroku, SQLite
I created and deployed a webpage that dynamically queries a SQLite database of bacteria found on a specific site on the body for a large number of participants. The webpage produces a pie chart and a bubble plot for the selected participant ID.
Language: Javascript Keywords: D3, csv
I used D3.js to produce a dynamic visualization of the interrelations between health and socioeconomic status (SES) factors. The data file includes state-by-state demographic data from the US Census and measurements of health risks obtained by the Behavioral Risk Factor Surveillance System.
Language: Python(3.7) Keywords: Jupyter, SciPy, Keras, Pandas, SVM, Binary Logistic Regression, Deep Learning
I used scikit-learn and Keras to create several ML models to predict exoplanet classification, then compared the models' predictive accuracy.
Language: Python, SQL, Javascript, HTML, CSS Keywords: Elastic Net, Data Engineering, Heroku, DarkSky API, D3, SVG, Autocomplete
This application predicted head-to-head outcomes of disc-golf matches between any two of the 615 players in our private database. The underlying model considered prior performance on each of three courses, relevant weather factors both for the recorded scores and for the date of the prediction, and individual factors. From these it produced an odds ratio describing which of the two players would come out on top, and it predicted the exact score that each would attain. It used a bespoke machine-learning model based on an elastic net and pulled live weather forecasts through the DarkSky API. We checked the weather-station data against our own readings recorded by anemometer.
Internal testing showed that the model's score predictions were accurate to within 3 strokes on average. Unfortunately, the files underlying the app have been lost.
Language: Python, Javascript, HTML, CSS, SQL Keywords: Flask, Multiple.js, Plotly.js, AWS, Heroku, Jinja, Binance API
This full-fledged application features live, minute-by-minute exchange-rate information (price compared to Bitcoin) for the top 6 altcoins (Ether, Binance Coin, Litecoin, EOS, Bitcoin Cash, and Ripple). With very little effort, the user can systematically explore the Pearson correlations among them within user-specified time periods; the app provides visualizations (for statistical assumption testing) and saves the results in an organized table.
This app also provides historical data (since inception) of the USD price of each of the coins, drawing text-based summary descriptions directly from Binance.com. Additionally, the front page of the app displays the current ticker data for each altcoin.
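The Pearson correlation over a user-specified time period, mentioned above, can be sketched as follows. This is a standard-library illustration and not the app's actual code: the function names, the dictionary-of-timestamps layout, and the price values are all hypothetical.

```python
from datetime import datetime, timedelta
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def correlate_window(series_a, series_b, start, end):
    """Correlate two {timestamp: price} series over [start, end].

    Only timestamps present in both series are used.
    """
    shared = sorted(t for t in series_a
                    if t in series_b and start <= t <= end)
    return pearson([series_a[t] for t in shared],
                   [series_b[t] for t in shared])


# Made-up minute-by-minute prices (in BTC) for two coins.
t0 = datetime(2019, 1, 1, 0, 0)
minutes = [t0 + timedelta(minutes=i) for i in range(4)]
coin_a = dict(zip(minutes, [1.0, 2.0, 3.0, 10.0]))
coin_b = dict(zip(minutes, [2.0, 4.0, 6.0, 1.0]))

r_first_three = correlate_window(coin_a, coin_b, minutes[0], minutes[2])
r_all_four = correlate_window(coin_a, coin_b, minutes[0], minutes[3])
print(r_first_three, r_all_four)
```

In the sample data the two coins move together for the first three minutes and diverge in the fourth, so the choice of window changes the sign of the coefficient, which is why exploring several time periods is informative.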