# OLTP Database — Design & Details ## Overview The OLTP (Online Transaction Processing) database models a **hotel reservation system** using a fully normalized relational schema in **MySQL 8.4**. It follows 3NF and enforces referential integrity via foreign keys. - **Database:** `hotel_reservations` - **Character set:** `utf8mb4` / `utf8mb4_unicode_ci` - **Tables:** 13 - **Total rows:** ~635,000 --- ## Entity-Relationship Model The schema covers five entity groups: ``` hotel_chain ──┐ country ───────┼──► hotel ──► hotel_room ──► room_booking ──► booking ──► guest star_rating ──┘ │ └──► country hotel_characteristic ◄──► hotel (M:N via hotel_hotel_characteristic) room_type ◄──── hotel_room room_type ◄──┐ rate_period ◄─┴── period_room_rate (price per room type per season) ``` --- ## Table Descriptions ### Reference / Lookup Tables #### `hotel_chain` International hotel chains (Hilton, Marriott, Accor, etc.). | Column | Type | Description | |--------|------|-------------| | `hotel_chain_id` | INT UNSIGNED PK | Surrogate key | | `code` | VARCHAR(10) UNIQUE | Short code (e.g. `HLT`) | | `name` | VARCHAR(100) | Full name | **Rows:** 10 --- #### `country` Countries from which guests come and where hotels are located. | Column | Type | Description | |--------|------|-------------| | `country_id` | INT UNSIGNED PK | Surrogate key | | `code` | CHAR(2) UNIQUE | ISO 3166-1 alpha-2 (e.g. `GB`) | | `name` | VARCHAR(100) | Country name | | `currency` | VARCHAR(10) | ISO currency code (e.g. `EUR`) | **Rows:** 40 (Europe, Americas, Asia, Africa, Oceania) --- #### `star_rating` Hotel classification from 1★ to 5★. | Column | Type | Description | |--------|------|-------------| | `star_rating_id` | INT UNSIGNED PK | Surrogate key | | `code` | TINYINT UNIQUE | 1–5 | | `description` | VARCHAR(20) | e.g. `3 Star` | **Rows:** 5 --- #### `hotel_characteristic` Amenities and features a hotel may offer. | Column | Type | Description | |--------|------|-------------| | `characteristic_id` | INT UNSIGNED PK | Surrogate key | | `code` | VARCHAR(20) UNIQUE | e.g. `POOL`, `SPA`, `WIFI` | | `description` | VARCHAR(100) | Human-readable label | **Rows:** 12 (WiFi, Pool, Gym, Spa, Restaurant, Bar, Parking, Valet, Conference, Shuttle, Room Service, Pet Friendly) --- #### `room_type` Types of rooms a hotel can offer, with a standard (base) rate. | Column | Type | Description | |--------|------|-------------| | `room_type_id` | INT UNSIGNED PK | Surrogate key | | `code` | VARCHAR(20) UNIQUE | e.g. `SINGLE`, `SUITE` | | `description` | VARCHAR(100) | e.g. `Junior Suite` | | `standard_rate` | DECIMAL(10,2) | Base nightly rate (EUR) | | `smoking_yn` | BOOLEAN | Smoking allowed flag | **Rows:** 7 (Single €80, Double €120, Twin €115, Deluxe €180, Suite €280, Executive €450, Family €200) --- #### `rate_period` Seasonal pricing periods. Each period maps to a month range and applies a rate multiplier. | Column | Type | Description | |--------|------|-------------| | `rate_period_id` | INT UNSIGNED PK | Surrogate key | | `code` | VARCHAR(20) UNIQUE | e.g. `PEAK`, `WINTER` | | `description` | VARCHAR(50) | Human-readable label | | `month_from` | TINYINT | Start month (1–12) | | `month_to` | TINYINT | End month (1–12) | **Rows:** 4 | Code | Period | Months | Multiplier | |------|--------|--------|-----------| | PEAK | Peak Season | Jun–Aug | ×1.5 | | HIGH | High Season | Mar–May | ×1.2 | | AUTUMN | Autumn Season | Sep–Nov | ×1.1 | | WINTER | Winter Season | Dec–Feb | ×0.9 | --- ### Junction Tables #### `period_room_rate` The effective nightly rate for each (room_type, rate_period) combination. Rate = `standard_rate × season_multiplier`. | Column | Type | Description | |--------|------|-------------| | `room_type_id` | INT UNSIGNED PK/FK | | | `rate_period_id` | INT UNSIGNED PK/FK | | | `rate` | DECIMAL(10,2) | Effective nightly rate | **Rows:** 28 (7 room types × 4 seasons) --- #### `hotel_hotel_characteristic` M:N junction between hotels and their amenities. | Column | Type | |--------|------| | `hotel_id` | INT UNSIGNED PK/FK | | `characteristic_id` | INT UNSIGNED PK/FK | **Rows:** ~1,415 --- ### Core Entity Tables #### `hotel` Individual hotel properties. | Column | Type | Description | |--------|------|-------------| | `hotel_id` | INT UNSIGNED PK | | | `hotel_chain_id` | INT UNSIGNED FK | NULL for independent hotels | | `country_id` | INT UNSIGNED FK | | | `star_rating_id` | INT UNSIGNED FK | | | `code` | VARCHAR(20) UNIQUE | e.g. `HTL0001` | | `name` | VARCHAR(150) | | | `address` | VARCHAR(200) | | | `postcode` | VARCHAR(20) | | | `city` | VARCHAR(100) | | | `url` | VARCHAR(200) | | **Rows:** 200 (50 cities, star distribution: 5% 1★, 10% 2★, 40% 3★, 30% 4★, 15% 5★) --- #### `hotel_room` Individual rooms within each hotel. | Column | Type | Description | |--------|------|-------------| | `room_id` | INT UNSIGNED PK | | | `hotel_id` | INT UNSIGNED FK | | | `room_type_id` | INT UNSIGNED FK | | | `room_number` | VARCHAR(10) | Format: `{floor}{number}`, e.g. `101` | | `floor` | TINYINT UNSIGNED | | **Rows:** 5,334 (5–60 rooms per hotel depending on star rating) --- #### `guest` Hotel guests. | Column | Type | Description | |--------|------|-------------| | `guest_id` | INT UNSIGNED PK | | | `country_id` | INT UNSIGNED FK | Guest's home country | | `name` | VARCHAR(150) | Full name | | `email` | VARCHAR(150) | Unique synthetic email | | `address` | VARCHAR(200) | | | `city` | VARCHAR(100) | | **Rows:** 100,000 --- #### `booking` A reservation made by a guest at a hotel. One booking can cover multiple rooms. | Column | Type | Description | |--------|------|-------------| | `booking_id` | INT UNSIGNED PK | | | `guest_id` | INT UNSIGNED FK | | | `hotel_id` | INT UNSIGNED FK | | | `date_from` | DATE | Check-in | | `date_to` | DATE | Check-out | | `status` | ENUM | `confirmed`, `cancelled`, `completed`, `no_show` | | `created_at` | DATETIME | When booking was made | **Rows:** 500,000 **Status distribution:** 80% completed, 10% confirmed, 7% cancelled, 3% no_show **Date range:** 2022-01-01 – 2025-12-31 **Seasonal distribution:** June–August heaviest (peak), December–February lightest --- #### `room_booking` A specific room assigned within a booking. Stores the rate **as it was at booking time** (snapshot), independent of any future rate changes. | Column | Type | Description | |--------|------|-------------| | `room_booking_id` | INT UNSIGNED PK | | | `booking_id` | INT UNSIGNED FK | | | `room_id` | INT UNSIGNED FK | | | `date_from` | DATE | | | `date_to` | DATE | | | `nightly_rate` | DECIMAL(10,2) | Rate at time of booking | | `total_amount` | DECIMAL(10,2) | `nightly_rate × nights` | **Rows:** 531,382 **Room count per booking:** 90% single room, 8% two rooms, 2% three rooms --- ## Data Generation The database was populated using a **single-file C# script** (`generator/generate.cs`) running on .NET 10, using `MySqlConnector` as the only dependency. Key generation decisions: - **Seasonal booking distribution** via rejection sampling — months Jun–Aug are ~2.7× more likely than Jan–Feb - **Rate snapshot** — each `room_booking.nightly_rate` is looked up from `period_room_rate` at insert time and stored, not re-computed later - **Realistic stay lengths** — 30% one night, 25% two nights, 20% three nights, tapering off to 14-night stays - **Cancelled/no-show bookings** partially skip room assignment (60% of cancellations have no room_booking) ```bash # Run generator dotnet run generator/generate.cs ```