docs
This commit is contained in:
98
docs/01-overview.md
Normal file
98
docs/01-overview.md
Normal file
@@ -0,0 +1,98 @@
|
||||
# Hotel Reservations — Data Warehouse Project
|
||||
|
||||
## Project Summary
|
||||
|
||||
This project implements a complete **Data Warehousing pipeline** for a hotel reservation system, covering all standard DW layers:
|
||||
|
||||
```
|
||||
MySQL OLTP ──► Apache NiFi ETL ──► Oracle Data Mart ──► Power BI Reports
|
||||
(source) (transform) (analytical store) (OLAP queries)
|
||||
```
|
||||
|
||||
The system is built around the **A.24 Hotel Reservations** domain from the course specification. The OLTP database was populated with **~635,000 synthetically generated rows** covering 200 hotels, 100,000 guests, 500,000 bookings, and 531,000 room bookings across a 4-year period (2022–2025).
|
||||
|
||||
---
|
||||
|
||||
## Business Context
|
||||
|
||||
A hotel chain needs to answer questions like:
|
||||
|
||||
- Which countries generate the most revenue per quarter?
|
||||
- How does occupancy differ between peak and off-peak seasons?
|
||||
- What is the revenue contribution of 5-star vs 3-star hotels?
|
||||
- How has a hotel's revenue changed after upgrading its star rating?
|
||||
|
||||
These questions require **historical, multi-dimensional analysis** that a normalized OLTP database cannot serve efficiently. The data mart provides pre-modelled, denormalized data optimized for analytical queries.
|
||||
|
||||
---
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ SOURCE LAYER │
|
||||
│ MySQL 8.4 (Docker/Podman, port 13306) │
|
||||
│ Database: hotel_reservations │
|
||||
│ 13 normalized tables, ~635K rows │
|
||||
└───────────────────────┬─────────────────────────────────┘
|
||||
│ JDBC (MySqlConnector)
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ ETL LAYER │
|
||||
│ Apache NiFi │
|
||||
│ 5 Process Groups: Date Dim / Static Dims / │
|
||||
│ SCD2 Hotel / SCD1 Guest / Incremental Fact │
|
||||
└───────────────────────┬─────────────────────────────────┘
|
||||
│ JDBC (Oracle JDBC)
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ DATA MART LAYER │
|
||||
│ Oracle (university lab schema) │
|
||||
│ Star schema: 6 dimensions + 1 fact table │
|
||||
│ SCD Type 2 on DIM_HOTEL │
|
||||
└───────────────────────┬─────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ PRESENTATION LAYER │
|
||||
│ Power BI Desktop │
|
||||
│ OLAP reports via DirectQuery / Import │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Technology Stack
|
||||
|
||||
| Component | Technology | Version |
|
||||
|-----------|-----------|---------|
|
||||
| OLTP Database | MySQL | 8.4 |
|
||||
| Container runtime | Docker / Podman | — |
|
||||
| Data generator | C# (.NET) | 10 |
|
||||
| ETL tool | Apache NiFi | — |
|
||||
| Data Mart | Oracle RDBMS | university lab |
|
||||
| Reporting | Power BI Desktop | — |
|
||||
|
||||
---
|
||||
|
||||
## Repository Structure
|
||||
|
||||
```
|
||||
IPZ_1/
|
||||
├── docker/
|
||||
│ ├── start.sh # Start MySQL container (Linux/macOS)
|
||||
│ ├── stop.sh # Stop MySQL container
|
||||
│ ├── start.ps1 # Start MySQL container (Windows)
|
||||
│ └── stop.ps1 # Stop MySQL container
|
||||
├── sql/
|
||||
│ ├── schema.sql # MySQL OLTP DDL
|
||||
│ └── datamart_schema.sql # Oracle Data Mart DDL
|
||||
├── generator/
|
||||
│ └── generate.cs # .NET 10 data generator script
|
||||
└── docs/
|
||||
├── 01-overview.md # This file
|
||||
├── 02-oltp.md # OLTP database design
|
||||
├── 03-datamart.md # Data mart design
|
||||
├── 04-setup.md # Setup and run guide
|
||||
└── nifi-flow.md # NiFi ETL flow reference
|
||||
```
|
||||
Reference in New Issue
Block a user