7.7 KiB
OLTP Database — Design & Details
Overview
The OLTP (Online Transaction Processing) database models a hotel reservation system using a fully normalized relational schema in MySQL 8.4. It follows 3NF and enforces referential integrity via foreign keys.
- Database:
hotel_reservations - Character set:
utf8mb4/utf8mb4_unicode_ci - Tables: 13
- Total rows: ~635,000
Entity-Relationship Model
The schema covers five entity groups:
hotel_chain ──┐
country ───────┼──► hotel ──► hotel_room ──► room_booking ──► booking ──► guest
star_rating ──┘ │
└──► country
hotel_characteristic ◄──► hotel (M:N via hotel_hotel_characteristic)
room_type ◄──── hotel_room
room_type ◄──┐
rate_period ◄─┴── period_room_rate (price per room type per season)
Table Descriptions
Reference / Lookup Tables
hotel_chain
International hotel chains (Hilton, Marriott, Accor, etc.).
| Column | Type | Description |
|---|---|---|
hotel_chain_id |
INT UNSIGNED PK | Surrogate key |
code |
VARCHAR(10) UNIQUE | Short code (e.g. HLT) |
name |
VARCHAR(100) | Full name |
Rows: 10
country
Countries from which guests come and where hotels are located.
| Column | Type | Description |
|---|---|---|
country_id |
INT UNSIGNED PK | Surrogate key |
code |
CHAR(2) UNIQUE | ISO 3166-1 alpha-2 (e.g. GB) |
name |
VARCHAR(100) | Country name |
currency |
VARCHAR(10) | ISO currency code (e.g. EUR) |
Rows: 40 (Europe, Americas, Asia, Africa, Oceania)
star_rating
Hotel classification from 1★ to 5★.
| Column | Type | Description |
|---|---|---|
star_rating_id |
INT UNSIGNED PK | Surrogate key |
code |
TINYINT UNIQUE | 1–5 |
description |
VARCHAR(20) | e.g. 3 Star |
Rows: 5
hotel_characteristic
Amenities and features a hotel may offer.
| Column | Type | Description |
|---|---|---|
characteristic_id |
INT UNSIGNED PK | Surrogate key |
code |
VARCHAR(20) UNIQUE | e.g. POOL, SPA, WIFI |
description |
VARCHAR(100) | Human-readable label |
Rows: 12 (WiFi, Pool, Gym, Spa, Restaurant, Bar, Parking, Valet, Conference, Shuttle, Room Service, Pet Friendly)
room_type
Types of rooms a hotel can offer, with a standard (base) rate.
| Column | Type | Description |
|---|---|---|
room_type_id |
INT UNSIGNED PK | Surrogate key |
code |
VARCHAR(20) UNIQUE | e.g. SINGLE, SUITE |
description |
VARCHAR(100) | e.g. Junior Suite |
standard_rate |
DECIMAL(10,2) | Base nightly rate (EUR) |
smoking_yn |
BOOLEAN | Smoking allowed flag |
Rows: 7 (Single €80, Double €120, Twin €115, Deluxe €180, Suite €280, Executive €450, Family €200)
rate_period
Seasonal pricing periods. Each period maps to a month range and applies a rate multiplier.
| Column | Type | Description |
|---|---|---|
rate_period_id |
INT UNSIGNED PK | Surrogate key |
code |
VARCHAR(20) UNIQUE | e.g. PEAK, WINTER |
description |
VARCHAR(50) | Human-readable label |
month_from |
TINYINT | Start month (1–12) |
month_to |
TINYINT | End month (1–12) |
Rows: 4
| Code | Period | Months | Multiplier |
|---|---|---|---|
| PEAK | Peak Season | Jun–Aug | ×1.5 |
| HIGH | High Season | Mar–May | ×1.2 |
| AUTUMN | Autumn Season | Sep–Nov | ×1.1 |
| WINTER | Winter Season | Dec–Feb | ×0.9 |
Junction Tables
period_room_rate
The effective nightly rate for each (room_type, rate_period) combination.
Rate = standard_rate × season_multiplier.
| Column | Type | Description |
|---|---|---|
room_type_id |
INT UNSIGNED PK/FK | |
rate_period_id |
INT UNSIGNED PK/FK | |
rate |
DECIMAL(10,2) | Effective nightly rate |
Rows: 28 (7 room types × 4 seasons)
hotel_hotel_characteristic
M:N junction between hotels and their amenities.
| Column | Type |
|---|---|
hotel_id |
INT UNSIGNED PK/FK |
characteristic_id |
INT UNSIGNED PK/FK |
Rows: ~1,415
Core Entity Tables
hotel
Individual hotel properties.
| Column | Type | Description |
|---|---|---|
hotel_id |
INT UNSIGNED PK | |
hotel_chain_id |
INT UNSIGNED FK | NULL for independent hotels |
country_id |
INT UNSIGNED FK | |
star_rating_id |
INT UNSIGNED FK | |
code |
VARCHAR(20) UNIQUE | e.g. HTL0001 |
name |
VARCHAR(150) | |
address |
VARCHAR(200) | |
postcode |
VARCHAR(20) | |
city |
VARCHAR(100) | |
url |
VARCHAR(200) |
Rows: 200 (50 cities, star distribution: 5% 1★, 10% 2★, 40% 3★, 30% 4★, 15% 5★)
hotel_room
Individual rooms within each hotel.
| Column | Type | Description |
|---|---|---|
room_id |
INT UNSIGNED PK | |
hotel_id |
INT UNSIGNED FK | |
room_type_id |
INT UNSIGNED FK | |
room_number |
VARCHAR(10) | Format: {floor}{number}, e.g. 101 |
floor |
TINYINT UNSIGNED |
Rows: 5,334 (5–60 rooms per hotel depending on star rating)
guest
Hotel guests.
| Column | Type | Description |
|---|---|---|
guest_id |
INT UNSIGNED PK | |
country_id |
INT UNSIGNED FK | Guest's home country |
name |
VARCHAR(150) | Full name |
email |
VARCHAR(150) | Unique synthetic email |
address |
VARCHAR(200) | |
city |
VARCHAR(100) |
Rows: 100,000
booking
A reservation made by a guest at a hotel. One booking can cover multiple rooms.
| Column | Type | Description |
|---|---|---|
booking_id |
INT UNSIGNED PK | |
guest_id |
INT UNSIGNED FK | |
hotel_id |
INT UNSIGNED FK | |
date_from |
DATE | Check-in |
date_to |
DATE | Check-out |
status |
ENUM | confirmed, cancelled, completed, no_show |
created_at |
DATETIME | When booking was made |
Rows: 500,000
Status distribution: 80% completed, 10% confirmed, 7% cancelled, 3% no_show
Date range: 2022-01-01 – 2025-12-31
Seasonal distribution: June–August heaviest (peak), December–February lightest
room_booking
A specific room assigned within a booking. Stores the rate as it was at booking time (snapshot), independent of any future rate changes.
| Column | Type | Description |
|---|---|---|
room_booking_id |
INT UNSIGNED PK | |
booking_id |
INT UNSIGNED FK | |
room_id |
INT UNSIGNED FK | |
date_from |
DATE | |
date_to |
DATE | |
nightly_rate |
DECIMAL(10,2) | Rate at time of booking |
total_amount |
DECIMAL(10,2) | nightly_rate × nights |
Rows: 531,382
Room count per booking: 90% single room, 8% two rooms, 2% three rooms
Data Generation
The database was populated using a single-file C# script (generator/generate.cs) running on .NET 10, using MySqlConnector as the only dependency.
Key generation decisions:
- Seasonal booking distribution via rejection sampling — months Jun–Aug are ~2.7× more likely than Jan–Feb
- Rate snapshot — each
room_booking.nightly_rateis looked up fromperiod_room_rateat insert time and stored, not re-computed later - Realistic stay lengths — 30% one night, 25% two nights, 20% three nights, tapering off to 14-night stays
- Cancelled/no-show bookings partially skip room assignment (60% of cancellations have no room_booking)
# Run generator
dotnet run generator/generate.cs