Projects


Divvy Bikes 2024 Q1-Q2
Power BI Dashboard
Divvy is a system of shared bicycles in Chicago. It allows users to rent bicycles at one station and return them to another nearby. With a daily, monthly, or annual membership, Divvy promotes sustainable and healthy transportation in the city. With stations throughout the city, it offers a convenient and flexible option to get around, reducing traffic congestion and pollution.
This project develops a dashboard from Divvy datasets. The work covers data cleaning, transformation, Data Analysis Expressions (DAX), and data modeling to connect the different datasets.
This dashboard includes answers to case study questions such as:
What is the station where most trips start?
Which stations have an average trip duration greater than one hour?
Who uses the Divvy service more, subscribers or customers? Is there any significant difference in gender?
What was the trend in trips during the first half of 2014?
When are users more active, on weekdays or on weekends?
How does user behavior change throughout the day?
Which time slot has the most trips: morning (6 a.m. to 11 a.m.), afternoon (12 p.m. to 6 p.m.), or night (7 p.m. to 5 a.m.)?
What is the age distribution of the users?
Etc... (Download Full Report)
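The time-slot and weekday/weekend questions above come down to bucketing each trip's start time. The dashboard itself was built in Power BI, so the following is only a minimal Python sketch of that bucketing logic, not the project's implementation. It assumes whole-hour boundaries (11:59 a.m. still counts as morning, 5:59 a.m. as night), and the function names and sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime


def time_slot(ts: datetime) -> str:
    """Map a trip start time to one of the three time slots (assumed whole-hour bounds)."""
    hour = ts.hour
    if 6 <= hour <= 11:
        return "morning"
    if 12 <= hour <= 18:
        return "afternoon"
    return "night"  # hours 19-23 and 0-5


def day_type(ts: datetime) -> str:
    """Classify a date as weekday or weekend (Monday is 0, Sunday is 6)."""
    return "weekend" if ts.weekday() >= 5 else "weekday"


# Hypothetical trip start times, for illustration only (not Divvy data).
sample_starts = [
    datetime(2024, 3, 4, 8, 15),   # Monday morning
    datetime(2024, 3, 4, 17, 40),  # Monday afternoon
    datetime(2024, 3, 9, 23, 5),   # Saturday night
    datetime(2024, 3, 10, 2, 30),  # Sunday night
]

slot_counts = Counter(time_slot(t) for t in sample_starts)
day_counts = Counter(day_type(t) for t in sample_starts)
```

Counting trips per bucket this way gives the kind of totals a bar chart by time slot or day type would display.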






Relational Database Management System (RDBMS):
Spanish Soccer League Database
PostgreSQL
This database covers the 2023/2024 season of LaLiga, Spain's first-division soccer league. I collected the data manually from ESPN, capturing the attributes for each table, and built the database in SQL. The design follows fourth normal form (4NF), which builds on the rules of Boyce-Codd normal form (BCNF). LaLiga runs from August to May with 20 teams competing. Teams earn points based on wins, draws, and defeats, and the team with the most points becomes champion. The bottom three teams are relegated to the second division, while the top two second-division teams are promoted. A playoff among the third- to sixth-placed second-division teams determines the third team promoted to LaLiga.
Database Schema:
Players Table: Tracks player details like ID, name, club ID (linked to teams), season, position, birthdate, birthplace, height, and weight. Player records need to be updated each season.
Players Statistics Table: Logs player stats with a sequence number, player ID, season, stat ID (linked to description table), and value.
Players Statistics Description Table: Defines statistic types using a unique stat ID and description.
Teams Table: Stores team data including ID, name, president, city, stadium ID (linked to stadium table), founding year, and kit sponsor.
Titles Table: Records team titles with a sequence number, club ID (linked to teams), title ID, and number of titles.
Titles Description Table: Describes titles with a unique title ID and name.
Stadium Table: Details stadiums with ID, name, capacity, construction date, and field size in square feet.
Referees Table: Lists referees with ID, name, birthdate, and birthplace.
Games Table: Captures game details like ID, team IDs (local/away, linked to teams), season, date, referee ID, stadium ID, attendance, scores, possession percentages, and shot statistics.
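To show how the linked IDs in the schema above work in practice, here is a small sketch covering only three of the tables. It is an illustration under stated assumptions, not the project's actual DDL: it uses Python's built-in sqlite3 module with an in-memory database instead of PostgreSQL, the column names and types are my guesses from the descriptions above, and the inserted rows are placeholders.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical column names and types, trimmed to a few attributes per table.
conn.executescript("""
CREATE TABLE stadiums (
    stadium_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    capacity   INTEGER
);
CREATE TABLE teams (
    club_id    INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    city       TEXT,
    stadium_id INTEGER REFERENCES stadiums(stadium_id)
);
CREATE TABLE players (
    player_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    club_id   INTEGER NOT NULL REFERENCES teams(club_id),
    season    TEXT NOT NULL,
    position  TEXT
);
""")

# Placeholder rows, not real league data.
conn.execute("INSERT INTO stadiums VALUES (1, 'Example Stadium', 50000)")
conn.execute("INSERT INTO teams VALUES (1, 'Example FC', 'Example City', 1)")
conn.execute("INSERT INTO players VALUES (1, 'Example Player', 1, '2023/2024', 'MF')")

# Follow the links: player -> team -> stadium.
rows = conn.execute("""
    SELECT p.name, t.name, s.name
    FROM players AS p
    JOIN teams AS t ON t.club_id = p.club_id
    JOIN stadiums AS s ON s.stadium_id = t.stadium_id
""").fetchall()

# A player pointing at a club that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO players VALUES (2, 'Orphan Player', 999, '2023/2024', 'FW')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Keeping stadium and team details in their own tables means a player row stores only a club ID, so a stadium rename or capacity change is made in one place.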






Econometrics Research Regression:
South Florida Housing Market Dynamics
Housing dynamics, shaped by demographic and economic factors, are key to understanding affordability issues in South Florida's regional markets. This study uses regression analysis in Stata, with data from Rocket Homes, data.census.gov, Ownwell, Realtor, and Redfin, to identify the main variables affecting median house prices. Variables like population, median household income, family size, employment rate, education levels, vacant housing units, and price per square foot are analyzed together to assess their combined impact. The results show that demographic factors (population, municipality size, total houses) and economic factors (household income, price per square foot) significantly influence median house prices in South Florida. These findings highlight key drivers of affordability challenges and the need for targeted policy interventions to improve housing accessibility.
Population Growth: A slight increase in total population negatively affects housing prices, possibly due to higher density reducing demand.
Household Income: The logarithm of total household income strongly boosts housing prices, reflecting greater purchasing power as a key driver.
Family Size: Larger average family size increases housing prices, likely due to demand for bigger homes.
Employment Rate: Surprisingly, higher employment rates negatively impact prices, possibly tied to local economic or affordability factors.
Housing Supply: More housing units slightly lower prices by reducing competition and improving affordability.
Key Drivers: Demographic factors (population, municipality size, total houses) and economic factors (household income, price per square foot) significantly influence median house prices in South Florida.
Policy Solutions: Affordable housing can be incentivized through tax breaks/subsidies for developers or Public-Private Partnerships (PPPs) to share costs and risks.
Number of Households: More households increase demand, pushing housing prices up.
Etc... (Download Full Report)
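To make the log-income finding above more concrete, here is a deliberately simplified sketch. The study itself ran a multiple regression in Stata; this example fits a single-regressor least-squares line in plain Python, assumes a level-log form (price regressed on the logarithm of income), and uses synthetic numbers generated from a made-up relationship. None of the values come from the study.

```python
import math
from statistics import mean


def ols_simple(x, y):
    """Ordinary least squares for y = intercept + slope * x (one regressor)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Synthetic household incomes, for illustration only.
incomes = [40_000, 55_000, 70_000, 90_000, 120_000]
# Prices built from an assumed exact relationship so the fit can be checked.
prices = [50_000 + 120_000 * math.log(income) for income in incomes]

intercept, slope = ols_simple([math.log(income) for income in incomes], prices)

# In a level-log model, a 1% rise in income shifts the price by roughly slope / 100.
effect_of_one_percent = slope / 100
```

With noise-free synthetic data the fit recovers the assumed coefficients. With real data, the same slope would be read as the approximate price change per 1% change in income, holding nothing else constant in this one-variable version.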





Forecasting & Error Methodology
Airline Passenger Miles
This report analyzes 192 monthly observations of U.S. airline passenger miles from 2001 to 2016. We examine seasonality, trends, tables, and forecasting methods to find insights that could inform decisions about airline passenger miles. Because correlation does not imply causation, the analysis is an overview of passenger miles over specific dates, aimed at finding the forecasting method with the lowest error so that we can make an educated prediction for future years.
Forecasting Methods Performed
Naïve
Average
Moving Average (3 months)
Moving Average (5 months)
Exponential Smoothing
After computing the error metrics for each forecasting method, we determined that the Average method was the best model because it had the lowest MAPE (mean absolute percentage error), which is the average of the absolute percentage differences between actual and predicted values.
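The methods listed above and the MAPE comparison can be sketched in a few lines of Python. This is an illustrative sketch, not the report's actual workbook: the function names are hypothetical, the smoothing constant alpha is an assumed value, the demo series is made up, and each forecast for a period uses only the actual values from earlier periods.

```python
def naive(series):
    """Forecast each period with the previous period's actual value."""
    return [None] + series[:-1]


def average(series):
    """Forecast each period with the mean of all earlier actual values."""
    out, total = [None], 0.0
    for t in range(1, len(series)):
        total += series[t - 1]
        out.append(total / t)
    return out


def moving_average(series, window):
    """Forecast each period with the mean of the previous `window` actual values."""
    return [
        None if t < window else sum(series[t - window:t]) / window
        for t in range(len(series))
    ]


def exponential_smoothing(series, alpha):
    """Forecast = alpha * previous actual + (1 - alpha) * previous forecast."""
    out = [None]
    for t in range(1, len(series)):
        if t == 1:
            out.append(series[0])  # seed the first forecast with the first actual
        else:
            out.append(alpha * series[t - 1] + (1 - alpha) * out[t - 1])
    return out


def mape(actuals, forecasts):
    """Mean absolute percentage error over the periods that have a forecast."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if f is not None]
    return 100 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


# Made-up demo series, for illustration only (not the passenger-miles data).
demo = [100.0, 110.0, 105.0, 120.0, 115.0, 130.0]

scores = {
    "Naive": mape(demo, naive(demo)),
    "Average": mape(demo, average(demo)),
    "Moving Average (3)": mape(demo, moving_average(demo, 3)),
    "Moving Average (5)": mape(demo, moving_average(demo, 5)),
    "Exponential Smoothing": mape(demo, exponential_smoothing(demo, alpha=0.5)),
}
best_method = min(scores, key=scores.get)  # method with the lowest MAPE
```

Note that the methods have forecasts for different numbers of periods (a 5-period moving average cannot forecast the first five), so each MAPE here is averaged over a different set of periods. A stricter comparison would score all methods on the same periods.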
Key Insights & Trends
U.S. Passenger Miles increase every five years, decrease for the next two years, then rise again.
From February to August, U.S. Passenger Miles show an increase.
Passenger Miles increase from Quarter 1 to Quarter 3, then decrease in Quarter 4.
Using the Average forecasting method, a linear regression predicted a 5.27% increase in U.S. Passenger Miles for 2017 compared to 2016.
In 2017, compared to 2016, Passenger Miles decreased between February and July but increased in the following months.
