How to Scrape Booking.com with Python

How to scrape Booking.com with Python is a question that often surfaces among developers, data analysts, and digital marketers looking to gather pricing, availability, or review data from one of the world's largest travel platforms. In this guide, we walk through practical ways to scrape Booking.com with Python effectively, ethically, and efficiently. Whether you're working on market intelligence, price comparison, or sentiment analysis, you will find this guide useful.
By the end of this article, you will understand not only how to technically approach the scraping process but also how to apply it in real-world scenarios using Python tools, proxy solutions, and strategic planning.
Why Scrape Booking.com with Python
Booking.com is a rich source of public data for accommodation details, reviews, pricing trends, and regional availability. However, scraping this data is not straightforward due to the platform’s frequent structure changes, bot detection systems, and dynamic content loading. Python offers multiple libraries that simplify these challenges and make it easier to collect and process data at scale.
Scraping Booking.com with Python is often used for:
- Competitor price tracking
- Market trend analysis for hotels and travel agencies
- Review aggregation for sentiment analysis
- Listing verification for travel affiliates
Data You Can Scrape From Booking.com
Here’s a list of key data points that can be extracted from Booking.com:
- Property details: Hotel name, address, distance from landmarks (e.g., city center).
- Pricing information: Regular and discounted prices.
- Reviews and ratings: Review score, number of reviews, and guest feedback.
- Availability: Room types available, booking options, and dates with availability.
- Media: Property and room images.
- Amenities: Facilities offered (e.g., Wi-Fi, parking, pool) and room-specific amenities.
- Promotions: Special offers, discounts, and limited-time deals.
- Policies: Cancellation policy and check-in/check-out times.
- Additional details: Property description and nearby attractions.
How To Scrape Booking.com with Python: Step-by-Step Guide
This step-by-step guide shows you how to build a Booking.com scraper in Python for efficient, reliable data extraction.
Step 1️⃣: Project Setup
Make sure you have Python 3 installed. If not, download and install it from the official website.
Create a folder for your project:
mkdir booking-scraper
Navigate into the project folder and initialize a virtual environment:
cd booking-scraper
python -m venv env
Activate the virtual environment:
- Linux/macOS:
source env/bin/activate
- Windows:
env\Scripts\activate
Create a scraper.py file in the project directory, ready for your scraping logic.
Step 2️⃣: Select the Scraping Library
Booking.com is a dynamic website: much of its content is rendered in the browser via JavaScript. The best approach for scraping dynamic sites is browser automation. For this tutorial, we'll use Selenium.
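To see why a plain HTTP client falls short here, you can fetch a results page without a browser and check whether the property cards appear in the raw HTML. Below is a quick diagnostic sketch, assuming the requests library is installed (pip install requests) and reusing the data-testid attribute relied on later in this guide; note that Booking.com may also block bare clients outright:

import requests

# fetch a search results page without running any JavaScript
response = requests.get(
    "https://www.booking.com/searchresults.html?ss=New+York",
    headers={"User-Agent": "Mozilla/5.0"},  # the default client User-Agent is often rejected
)
print("HTTP status:", response.status_code)
# property cards are rendered client-side, so this count is
# usually zero or far below what the browser actually shows
print("cards in static HTML:", response.text.count('data-testid="property-card"'))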
Step 3️⃣: Install and Configure Selenium
Install Selenium using pip:
pip install selenium
Import Selenium in scraper.py and initialize a WebDriver:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
# create a Chrome web driver instance
driver = webdriver.Chrome(service=Service())
driver.quit()
Remember to include driver.quit() at the end of the script to close the browser.
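While developing, it helps to watch the browser work, but for scheduled runs you will probably want Chrome in headless mode. A minimal variant of the setup above (the --headless=new flag targets recent Chrome versions):

from selenium import webdriver
from selenium.webdriver.chrome.service import Service

options = webdriver.ChromeOptions()
# run Chrome without a visible window
options.add_argument("--headless=new")
# a fixed window size keeps the responsive layout (and your selectors) stable
options.add_argument("--window-size=1920,1080")

driver = webdriver.Chrome(service=Service(), options=options)
driver.quit()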
Step 4️⃣: Visit the Target Page
Manually perform a search on Booking.com and copy the resulting URL. Then, use Selenium to visit the target page:
driver.get("https://www.booking.com/searchresults.html?ss=New+York&ssne=New+York&ssne_untouched=New+York&label=gen173nr-1FCAEoggI46AdIM1gEaHGIAQGYATG4ARfIAQzYAQHoAQH4AQKIAgGoAgO4Aof767kGwAIB0gIkNGE2MTI1MjgtZjJlNC00YWM4LWFlMmQtOGIxZjM3NWIyNDlm2AIF4AIB&sid=b91524e727f20006ae00489afb379d3a&aid=304142&lang=en-us&sb=1&src_elem=sb&src=index&dest_id=20088325&dest_type=city&checkin=2025-11-18&checkout=2025-12-18&group_adults=2&no_rooms=1&group_children=0")
Step 5️⃣: Deal With the Login Alert
Booking.com often shows a sign-in modal that blocks the page content, so you need Selenium to close it. Identify the close button using your browser's developer tools, then dismiss it in a try-except block:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
try:
    close_button = WebDriverWait(driver, 20).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "[role=\"dialog\"] button[aria-label=\"Dismiss sign-in info.\"]"))
    )
    close_button.click()
except TimeoutException:
    print("Sign-in modal did not appear, continuing...")
Step 6️⃣: Select the Booking.com Items
Initialize an empty list to hold scraped data:
items = []
Select all property card elements on the page using a CSS selector:
property_items = driver.find_elements(By.CSS_SELECTOR, "[data-testid=\"property-card\"]")
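The initial page load may only render a limited set of cards, with more fetched as you scroll. If you need them, a simple scroll loop before selecting the cards can help. This is a sketch with an arbitrary iteration count and delay; Booking.com may instead paginate results, in which case you would follow the next-page link:

import time

# scroll to the bottom a few times to trigger lazy-loaded results
for _ in range(5):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(2)  # give the page time to fetch and render new cards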
Step 7️⃣: Scrape the Booking.com Items
Use a custom exception handler function to deal gracefully with inconsistent elements across property items:
from selenium.common.exceptions import NoSuchElementException
def handle_no_such_element_exception(data_extraction_task):
    try:
        return data_extraction_task()
    except NoSuchElementException:
        return None
Then, inside the loop, extract property data using CSS selectors and our error handler. For example:
for property_item in property_items:
    # scraping logic...
    url = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "a[data-testid=\"property-card-desktop-single-image\"]").get_attribute("href"))
    image = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "img[data-testid=\"image\"]").get_attribute("src"))
    title = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"title\"]").text)
    address = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"address\"]").text)
    distance = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"distance\"]").text)
    review_score = None
    review_count = None
    review_text = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"review-score\"]").text)
    if review_text is not None:
        # split the review string by newline
        parts = review_text.split("\n")
        # process each part
        for part in parts:
            part = part.strip()
            # check if this part is a number (potential review score)
            if part.replace(".", "", 1).isdigit():
                review_score = float(part)
            # check if it contains the "reviews" string
            elif "reviews" in part:
                # extract the number before "reviews"
                review_count = int(part.split(" ")[0].replace(",", ""))
    description = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"recommended-units\"]").text)
    # initialize the prices so they are defined even when no rate information is shown
    original_price = None
    price = None
    price_element = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"availability-rate-information\"]"))
    if price_element is not None:
        original_price = handle_no_such_element_exception(lambda: price_element.find_element(By.CSS_SELECTOR, "[aria-hidden=\"true\"]:not([data-testid])").text.replace(",", ""))
        price = handle_no_such_element_exception(lambda: price_element.find_element(By.CSS_SELECTOR, "[data-testid=\"price-and-discounted-price\"]").text.replace(",", ""))
    item = {
        "url": url,
        "image": image,
        "title": title,
        "address": address,
        "distance": distance,
        "review_score": review_score,
        "review_count": review_count,
        "description": description,
        "original_price": original_price,
        "price": price
    }
    items.append(item)
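To make the review parsing concrete: the review widget's text is a newline-separated blob, and the loop above scans each segment for a numeric score and a reviews count. Here is the same logic as a standalone function you can test without a browser; the sample string is hypothetical and only mimics the widget's apparent format:

def parse_review_text(review_text):
    review_score = None
    review_count = None
    for part in review_text.split("\n"):
        part = part.strip()
        # a bare number is the review score
        if part.replace(".", "", 1).isdigit():
            review_score = float(part)
        # a segment like "1,234 reviews" carries the count
        elif "reviews" in part:
            review_count = int(part.split(" ")[0].replace(",", ""))
    return review_score, review_count

# hypothetical widget text
print(parse_review_text("8.6\nFabulous\n1,234 reviews"))  # (8.6, 1234)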
Step 8️⃣: Export to CSV
Import the csv library and write the scraped data to CSV:
import csv
output_file = "properties.csv"
with open(output_file, mode="w", newline="", encoding="utf-8") as file:
    writer = csv.DictWriter(file, fieldnames=["url", "image", "title", "address", "distance", "review_score", "review_count", "description", "original_price", "price"])
    writer.writeheader()
    writer.writerows(items)
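If you prefer a format that preserves null values and numeric types, the same list exports to JSON just as easily with the standard library:

import json

with open("properties.json", mode="w", encoding="utf-8") as file:
    # ensure_ascii=False keeps accented property names readable
    json.dump(items, file, ensure_ascii=False, indent=2)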
Step 9️⃣: Put It All Together
Review the complete scraper.py:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException, NoSuchElementException
import csv

def handle_no_such_element_exception(data_extraction_task):
    try:
        return data_extraction_task()
    except NoSuchElementException:
        return None

# create a Chrome web driver instance
driver = webdriver.Chrome(service=Service())

# connect to the target page
driver.get("https://www.booking.com/searchresults.html?ss=New+York&ssne=New+York&ssne_untouched=New+York&label=gen173nr-1FCAEoggI46AdIM1gEaHGIAQGYATG4ARfIAQzYAQHoAQH4AQKIAgGoAgO4Aof767kGwAIB0gIkNGE2MTI1MjgtZjJlNC00YWM4LWFlMmQtOGIxZjM3NWIyNDlm2AIF4AIB&sid=b91524e727f20006ae00489afb379d3a&aid=304142&lang=en-us&sb=1&src_elem=sb&src=index&dest_id=20088325&dest_type=city&checkin=2025-11-18&checkout=2025-12-18&group_adults=2&no_rooms=1&group_children=0")

# handle the sign-in alert
try:
    # wait up to 20 seconds for the sign-in alert to appear
    close_button = WebDriverWait(driver, 20).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "[role=\"dialog\"] button[aria-label=\"Dismiss sign-in info.\"]"))
    )
    # click the close button
    close_button.click()
except TimeoutException:
    print("Sign-in modal did not appear, continuing...")

# where to store the scraped data
items = []

# select all property items on the page
property_items = driver.find_elements(By.CSS_SELECTOR, "[data-testid=\"property-card\"]")

# iterate over the property items and extract data from them
for property_item in property_items:
    # scraping logic...
    url = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "a[data-testid=\"property-card-desktop-single-image\"]").get_attribute("href"))
    image = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "img[data-testid=\"image\"]").get_attribute("src"))
    title = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"title\"]").text)
    address = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"address\"]").text)
    distance = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"distance\"]").text)
    review_score = None
    review_count = None
    review_text = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"review-score\"]").text)
    if review_text is not None:
        # split the review string by newline
        parts = review_text.split("\n")
        # process each part
        for part in parts:
            part = part.strip()
            # check if this part is a number (potential review score)
            if part.replace(".", "", 1).isdigit():
                review_score = float(part)
            # check if it contains the "reviews" string
            elif "reviews" in part:
                # extract the number before "reviews"
                review_count = int(part.split(" ")[0].replace(",", ""))
    description = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"recommended-units\"]").text)
    # initialize the prices so they are defined even when no rate information is shown
    original_price = None
    price = None
    price_element = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"availability-rate-information\"]"))
    if price_element is not None:
        original_price = handle_no_such_element_exception(lambda: price_element.find_element(By.CSS_SELECTOR, "[aria-hidden=\"true\"]:not([data-testid])").text.replace(",", ""))
        price = handle_no_such_element_exception(lambda: price_element.find_element(By.CSS_SELECTOR, "[data-testid=\"price-and-discounted-price\"]").text.replace(",", ""))
    # populate a new item with the scraped data
    item = {
        "url": url,
        "image": image,
        "title": title,
        "address": address,
        "distance": distance,
        "review_score": review_score,
        "review_count": review_count,
        "description": description,
        "original_price": original_price,
        "price": price
    }
    # add the new item to the list of scraped items
    items.append(item)

# specify the name of the output CSV file
output_file = "properties.csv"

# export the items list to a CSV file
with open(output_file, mode="w", newline="", encoding="utf-8") as file:
    # create a CSV writer object
    writer = csv.DictWriter(file, fieldnames=["url", "image", "title", "address", "distance", "review_score", "review_count", "description", "original_price", "price"])
    # write the header row
    writer.writeheader()
    # write each item as a row in the CSV
    writer.writerows(items)

# close the web driver and release its resources
driver.quit()
Run the script using the command python scraper.py, and you'll find the results in properties.csv.
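As a quick sanity check, you can load the CSV back and inspect it, assuming pandas is installed (pip install pandas):

import pandas as pd

df = pd.read_csv("properties.csv")
print(df.head())  # preview the first few properties
print(df["price"].notna().sum(), "rows include a price")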
Real-World Use Cases for Scraping Booking.com with Python
Scraping Booking.com with Python has enabled companies and solo developers to build innovative solutions. Here are some real-world scenarios:
- Price Optimization: A travel startup used Booking.com data to adjust their nightly rates based on city-level trends. This increased their average booking revenue by 18 percent.
- Competitor Monitoring: Digital agencies track local competitor listings across cities to assess how often they appear in search results or gain new reviews.
- Affiliate Automation: Webmasters automate the extraction of top listings and synchronize their affiliate marketing platforms without manual updates.
- Review Sentiment Analysis: Companies collect guest review data, run NLP models to extract sentiment, and correlate that with pricing strategies.
- Regional Availability Reports: Analytics teams in large OTAs analyze when regions are overbooked to recommend alternate destinations to users in real time.
Next Steps After You Scrape Booking.com with Python
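One building block for that kind of robustness is a retry loop with exponential backoff around the whole scraping run. A minimal sketch, assuming a hypothetical scrape() function that wraps the script above and raises WebDriverException on failure:

import time
from selenium.common.exceptions import WebDriverException

def run_with_retries(scrape, max_attempts=3):
    for attempt in range(1, max_attempts + 1):
        try:
            return scrape()  # scrape() is a hypothetical wrapper around the script above
        except WebDriverException as error:
            print(f"Attempt {attempt} failed: {error}")
            if attempt == max_attempts:
                raise
            # exponential backoff: wait 2, 4, 8... seconds between attempts
            time.sleep(2 ** attempt)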
If you now feel confident about how to scrape Booking.com with Python, you are ready to scale, analyze, and integrate your results. The goal is not to scrape and forget, but to build workflows that derive business intelligence or power real-time services. Be mindful of data usage policies and implement caching, retries, and proxy rotation as you grow your scraping infrastructure. Consider running your scraper via cloud functions or cron jobs for consistent, periodic updates.