Conclusion
What Was Built
This project delivers a complete data warehousing pipeline from raw CSV files to an analytics-ready Data Mart:
| Deliverable | Details |
|---|---|
| OLTP schema | 14 tables, 3NF, MySQL 8.4 |
| Seed script | .NET 10 single-file C# script, loads ~700 rows across all tables |
| Docker setup | One-command start/stop for MySQL on Linux and Windows |
| Data Mart schema | 3 fact tables, 5 dimension tables, star schema, Oracle DDL |
| NiFi ETL | 8 pipelines, extract SQL + load SQL, documented step-by-step |
| Documentation | This docs folder |
Analytical Potential
The Data Mart enables a wide range of OLAP analyses. Below are the most interesting ones, mapped to the fact and dimension tables that support them.
Prize Money Distribution
"Where did the $100M go?"
Using FACT_TOURNAMENT sliced by DIM_GAME:
- Total prize pool by game genre (MOBA, FPS, Battle Royale, etc.)
- Average prize pool per participant by platform (PC vs Mobile)
- Prize concentration — what percentage of total prize money went to the top 5 tournaments
- Relationship between tournament duration and prize pool size
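
As a rough illustration of the prize concentration question, the sketch below computes the share of total prize money held by the top N tournaments. It uses Python's built-in `sqlite3` as a stand-in for the Oracle Data Mart, and the column names `tournament_key` and `prize_pool_usd` are assumptions, not taken from the actual DDL. The rows are made-up placeholders, not EWC figures.

```python
import sqlite3

# In-memory stand-in for FACT_TOURNAMENT; column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_tournament ("
    "tournament_key INTEGER PRIMARY KEY, prize_pool_usd REAL)"
)
# Made-up prize pools that sum to 1000, so shares are easy to check by hand.
conn.executemany(
    "INSERT INTO fact_tournament VALUES (?, ?)",
    [(1, 500.0), (2, 300.0), (3, 100.0), (4, 50.0),
     (5, 30.0), (6, 10.0), (7, 10.0)],
)

TOP_N_SHARE_SQL = """
SELECT 100.0
       * (SELECT SUM(p)
          FROM (SELECT prize_pool_usd AS p
                FROM fact_tournament
                ORDER BY prize_pool_usd DESC
                LIMIT ?))
       / (SELECT SUM(prize_pool_usd) FROM fact_tournament)
"""


def top_n_share(connection, n=5):
    """Return the percentage of total prize money held by the n largest pools."""
    return connection.execute(TOP_N_SHARE_SQL, (n,)).fetchone()[0]


# With the placeholder rows, the top five pools hold 980 of 1000.
share = top_n_share(conn, 5)
```

On Oracle 12c or later, the `LIMIT ?` clause would be written as `FETCH FIRST n ROWS ONLY`; the rest of the query shape should carry over.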
Country and Regional Dominance
"Which part of the world ruled EWC 2025?"
Using FACT_MEDAL_AWARD sliced by DIM_COUNTRY and DIM_GAME:
- Medal tally by country (gold/silver/bronze breakdown)
- Medal points by region (Asia vs Europe vs North America vs Middle East)
- Genre specialization — do Asian countries dominate MOBAs? Does Europe lead in FPS?
- Countries with the most medals per player (efficiency metric using medal_count / total_players)
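
The medals-per-player metric could be computed along these lines. This is a minimal sketch that again uses `sqlite3` in place of Oracle. It assumes that `medal_count` lives on FACT_MEDAL_AWARD, that `total_players` is an attribute of DIM_COUNTRY, and that the two join on a `country_key` surrogate key; the real schema may differ. Country names and numbers are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_country (
    country_key   INTEGER PRIMARY KEY,
    country_name  TEXT,
    total_players INTEGER          -- assumed attribute
);
CREATE TABLE fact_medal_award (
    country_key INTEGER,
    medal_count INTEGER            -- assumed measure
);
-- Placeholder rows only.
INSERT INTO dim_country VALUES
    (1, 'Country A', 40), (2, 'Country B', 10), (3, 'Country C', 0);
INSERT INTO fact_medal_award VALUES (1, 3), (1, 5), (2, 4), (3, 1);
""")

EFFICIENCY_SQL = """
SELECT c.country_name,
       SUM(f.medal_count) AS medals,
       -- NULLIF avoids a divide-by-zero when a country has no roster rows.
       1.0 * SUM(f.medal_count) / NULLIF(c.total_players, 0) AS medals_per_player
FROM fact_medal_award f
JOIN dim_country c ON c.country_key = f.country_key
GROUP BY c.country_key, c.country_name, c.total_players
ORDER BY medals_per_player DESC
"""

# Each row is (country_name, medals, medals_per_player).
ranking = conn.execute(EFFICIENCY_SQL).fetchall()
```

SQLite sorts NULL ratios last in a descending sort, while Oracle sorts them first by default, so an Oracle version would likely want `NULLS LAST` on the `ORDER BY`.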
Club Championship Performance
"Which clubs built the best all-around teams?"
Using FACT_CLUB_STANDING sliced by DIM_ORGANIZATION:
- Points vs prize money correlation — are points a good predictor of earnings?
- Tournament breadth vs depth — do clubs that win fewer tournaments but make more top-8 finishes rank higher?
- Performance by region — Middle Eastern clubs (Team Falcons, Twisted Minds) vs European clubs
- Club partner ROI — do Current club partners finish higher than non-partners on average?
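
One way to put a number on the points vs prize money question is a Pearson correlation over the standings rows. The sketch below is plain Python and assumes the rows have already been pulled from FACT_CLUB_STANDING as `(total_points, prize_money_usd)` pairs. The sample values are invented for illustration and are not real standings.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented (total_points, prize_money_usd) pairs, one per club.
standings = [(100, 1000.0), (80, 700.0), (60, 650.0), (40, 200.0)]
points = [row[0] for row in standings]
prize = [row[1] for row in standings]

r = pearson_r(points, prize)  # close to +1 means points track earnings well
```

With so few clubs in the table, a single outlier can move the coefficient a lot, so a scatter plot alongside the number is probably the safer presentation.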
Event Timeline Analysis
"How did the event unfold week by week?"
Using FACT_TOURNAMENT sliced by DIM_DATE:
- Prize money at stake each week
- Which weeks had the most high-value tournaments running simultaneously
- Tournament density (how many events overlapped)
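
Tournament density can be measured with a simple sweep over start and end dates. The sketch below assumes each tournament exposes an inclusive start and end date (for example via two DIM_DATE lookups, which is an assumption about the schema). The dates are placeholders, not the actual EWC schedule.

```python
from datetime import date


def max_overlap(intervals):
    """Return the largest number of tournaments running on the same day.

    intervals: iterable of (start, end) date pairs, both ends inclusive.
    """
    events = []
    for start, end in intervals:
        events.append((start.toordinal(), 1))     # tournament opens
        events.append((end.toordinal() + 1, -1))  # closes the day after `end`
    # Tuples sort by day first; on the same day, -1 sorts before +1,
    # so back-to-back tournaments are not counted as overlapping.
    events.sort()
    current = best = 0
    for _, delta in events:
        current += delta
        best = max(best, current)
    return best


# Placeholder schedule: three tournaments share days 6 and 7.
schedule = [
    (date(2025, 1, 1), date(2025, 1, 10)),
    (date(2025, 1, 5), date(2025, 1, 7)),
    (date(2025, 1, 6), date(2025, 1, 12)),
]
peak = max_overlap(schedule)
```

The same event list can be bucketed by `week_number` to get a per-week count instead of a single peak value.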
Suggested Power BI Reports
Report 1 — Prize & Tournament Analysis
A financial overview dashboard with:
- KPI cards: Total prize pool, number of tournaments, average prize per tournament
- Bar chart: Prize pool by game_type (filter: platform)
- Treemap: Prize pool breakdown by individual game
- Line chart: Cumulative prize money awarded by week (using DIM_DATE.week_number)
- Scatter plot: Prize pool vs num_participants (are bigger tournaments better funded?)
Slicers: Platform, Gender, Club Championship Points (Yes/No)
Report 2 — Performance & Medal Analysis
A competitive performance dashboard with:
- Map visual: Medal points by country (filled map using country name)
- Stacked bar chart: Gold/Silver/Bronze medals by region
- Matrix: Organizations × Game genres (medal_count as values — shows which orgs are specialists vs all-rounders)
- Bar chart: Top 10 organizations by total medal_points
- Table: Club Championship standings with conditional formatting on total_points and prize_money_usd
Slicers: Region, Game Type, Medal Type
Limitations
Dataset size — With only 27 tournaments and 257 medalists, the dataset is small by data warehousing standards. The analyses are valid but a real production data mart would have years of historical data for trend analysis.
Player earnings — The prize_earned_usd column in the player roster is zero for all players. Individual prize splits were not publicly available at the time the dataset was compiled, so per-player financial analysis is not possible.
Individual vs team events — Games like Chess, StarCraft II, and the fighting games are individual competitions. Their medal and match data is structured the same way as team events, but the "organization" in those cases is the player's sponsoring team rather than a competing unit. This is a nuance that Power BI visuals should label clearly.
Static snapshot — This is a point-in-time dataset for EWC 2025. The Data Mart has no slowly changing dimension (SCD) logic or historical tracking. It reflects the final state of the event.