    Tutorial

    How to Scrape Booking.com with Python

    April 23, 2025 Mike

    How to Scrape Booking.com with Python is a question that often surfaces among developers, data analysts, and digital marketers looking to gather pricing, availability, or review data from one of the world’s largest travel platforms. In this guide, we will walk through practical ways to scrape Booking.com with Python effectively, ethically, and efficiently. Whether you’re working on market intelligence, price comparisons, or sentiment analysis, you will find this guide useful.

    By the end of this article, you will understand not only how to technically approach the scraping process but also how to apply it in real-world scenarios using Python tools, proxy solutions, and strategic planning.

    Why Scrape Booking.com with Python

    Booking.com is a rich source of public data for accommodation details, reviews, pricing trends, and regional availability. However, scraping this data is not straightforward due to the platform’s frequent structure changes, bot detection systems, and dynamic content loading. Python offers multiple libraries that simplify these challenges and make it easier to collect and process data at scale.

    Scraping Booking.com with Python is often used for:

    • Competitor price tracking
    • Market trend analysis for hotels and travel agencies
    • Review aggregation for sentiment analysis
    • Listing verification for travel affiliates

    Data You Can Scrape From Booking.com

    Here’s a list of key data points that can be extracted from Booking.com:

    • Property details: Hotel name, address, distance from landmarks (e.g., city center).
    • Pricing information: Regular and discounted prices.
    • Reviews and ratings: Review score, number of reviews, and guest feedback.
    • Availability: Room types available, booking options, and dates with availability.
    • Media: Property and room images.
    • Amenities: Facilities offered (e.g., Wi-Fi, parking, pool) and room-specific amenities.
    • Promotions: Special offers, discounts, and limited-time deals.
    • Policies: Cancellation policy and check-in/check-out times.
    • Additional details: Property description and nearby attractions.

    How To Scrape Booking.com with Python: Step-by-Step Guide

    This step-by-step guide will show you how to build a Booking.com scraper using Python to ensure efficient and reliable data extraction.

    Step 1️⃣: Project Setup

    Make sure you have Python 3 installed. If not, download and install it from the official website.
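
    You can verify the installed version from a terminal:

    python --version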

    Create a folder for your project:

    mkdir booking-scraper

    Navigate into the project folder and initialize a virtual environment:

    cd booking-scraper
    python -m venv env

    Activate the virtual environment:

    • Linux/macOS: source env/bin/activate
    • Windows: env\Scripts\activate

    Create a scraper.py file in the project directory; this is where the scraping logic will live.

    Step 2️⃣: Select the Scraping Library

    Booking.com is a dynamic website: much of its content is rendered client-side with JavaScript. The best approach for scraping dynamic sites is to use a browser automation tool. For this tutorial, we’ll use Selenium.

    Step 3️⃣: Install and Configure Selenium

    Install Selenium using pip:

    pip install selenium

    Import Selenium in scraper.py and initialize a WebDriver:

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    
    # create a Chrome web driver instance
    driver = webdriver.Chrome(service=Service())
    
    driver.quit()

    Remember to include driver.quit() at the end of the script to close the browser.
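
    If you plan to run the scraper on a server or in the background, you can optionally start Chrome in headless mode. This is a minimal sketch using Selenium’s standard Options API; the flag values shown are common defaults, not anything specific to Booking.com:

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    # configure Chrome to run without a visible window
    options = Options()
    options.add_argument("--headless=new")  # use "--headless" on older Chrome versions
    options.add_argument("--window-size=1920,1080")  # desktop-sized viewport so desktop selectors match

    driver = webdriver.Chrome(service=Service(), options=options)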

    Step 4️⃣: Visit the Target Page

    Manually perform a search on Booking.com and copy the resulting URL. Then, use Selenium to visit the target page:

    driver.get("https://www.booking.com/searchresults.html?ss=New+York&ssne=New+York&ssne_untouched=New+York&label=gen173nr-1FCAEoggI46AdIM1gEaHGIAQGYATG4ARfIAQzYAQHoAQH4AQKIAgGoAgO4Aof767kGwAIB0gIkNGE2MTI1MjgtZjJlNC00YWM4LWFlMmQtOGIxZjM3NWIyNDlm2AIF4AIB&sid=b91524e727f20006ae00489afb379d3a&aid=304142&lang=en-us&sb=1&src_elem=sb&src=index&dest_id=20088325&dest_type=city&checkin=2025-11-18&checkout=2025-12-18&group_adults=2&no_rooms=1&group_children=0")

    Step 5️⃣: Deal With the Login Alert

    Booking.com often shows a sign-in modal that blocks the page content, so you need Selenium to close it. Identify the close button’s selector with your browser’s developer tools and wrap the interaction in a try-except block:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.by import By
    from selenium.common.exceptions import TimeoutException
    
    try:
        close_button = WebDriverWait(driver, 20).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "[role=\"dialog\"] button[aria-label=\"Dismiss sign-in info.\"]"))
        )
        close_button.click()
    except TimeoutException:
        print("Sign-in modal did not appear, continuing...")
    

    Step 6️⃣: Select the Booking.com Items

    Initialize an empty list to hold scraped data:

    items = []
    

    Select all property card elements on the page using a CSS selector:

    property_items = driver.find_elements(By.CSS_SELECTOR, "[data-testid=\"property-card\"]")
    
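    If find_elements runs before the search results finish rendering, it may return an empty list. A more defensive variant, reusing the WebDriverWait imports from Step 5, waits up to 20 seconds for at least one card to appear:

    property_items = WebDriverWait(driver, 20).until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR, "[data-testid=\"property-card\"]"))
    )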

    Step 7️⃣: Scrape the Booking.com Items

    Use a custom exception-handler function to gracefully deal with inconsistent elements across property items:

    from selenium.common.exceptions import NoSuchElementException
    
    def handle_no_such_element_exception(data_extraction_task):
        try:
            return data_extraction_task()
        except NoSuchElementException:
            return None
    

    Then, loop over the property cards and, using CSS selectors and the error handler, extract the data from each one:

    for property_item in property_items:
        # scraping logic...
        url = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "a[data-testid=\"property-card-desktop-single-image\"]").get_attribute("href"))
        image = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "img[data-testid=\"image\"]").get_attribute("src"))

        title = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"title\"]").text)
        address = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"address\"]").text)
        distance = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"distance\"]").text)

        review_score = None
        review_count = None
        review_text = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"review-score\"]").text)
        if review_text is not None:
            # split the review string by newline
            parts = review_text.split("\n")

            # process each part
            for part in parts:
                part = part.strip()
                # check if this part is a number (potential review score)
                if part.replace(".", "", 1).isdigit():
                    review_score = float(part)
                # check if it contains the "reviews" string
                elif "reviews" in part:
                    # extract the number before "reviews"
                    review_count = int(part.split(" ")[0].replace(",", ""))

        description = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"recommended-units\"]").text)

        # initialize both prices to None so the item dict below is always
        # valid, even when no price element is found for this card
        original_price = None
        price = None
        price_element = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"availability-rate-information\"]"))
        if price_element is not None:
            original_price = handle_no_such_element_exception(lambda: (
                price_element.find_element(By.CSS_SELECTOR, "[aria-hidden=\"true\"]:not([data-testid])").text.replace(",", "")
            ))
            price = handle_no_such_element_exception(lambda: (
                price_element.find_element(By.CSS_SELECTOR, "[data-testid=\"price-and-discounted-price\"]").text.replace(",", "")
            ))

        item = {
            "url": url,
            "image": image,
            "title": title,
            "address": address,
            "distance": distance,
            "review_score": review_score,
            "review_count": review_count,
            "description": description,
            "original_price": original_price,
            "price": price
        }
        items.append(item)

    Step 8️⃣: Export to CSV

    Import the csv library and write scraped data to CSV:

    import csv
    
    output_file = "properties.csv"
    with open(output_file, mode="w", newline="", encoding="utf-8") as file:
        writer = csv.DictWriter(file, fieldnames=["url", "image", "title", "address", "distance", "review_score", "review_count", "description", "original_price", "price"])
        writer.writeheader()
        writer.writerows(items)
    
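    If you prefer JSON over CSV, the same items list can be serialized with the standard library. This is an optional alternative, not part of the main script:

    import json

    with open("properties.json", mode="w", encoding="utf-8") as file:
        # ensure_ascii=False keeps accented property names readable
        json.dump(items, file, indent=2, ensure_ascii=False)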

    Step 9️⃣: Put It All Together

    Review the complete scraper.py:

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.by import By
    from selenium.common.exceptions import TimeoutException, NoSuchElementException
    
    import csv
    
    def handle_no_such_element_exception(data_extraction_task):
        try:
            return data_extraction_task()
        except NoSuchElementException:
            return None
    
    # create a Chrome web driver instance
    driver = webdriver.Chrome(service=Service())
    
    # connect to the target page
    driver.get("https://www.booking.com/searchresults.html?ss=New+York&ssne=New+York&ssne_untouched=New+York&label=gen173nr-1FCAEoggI46AdIM1gEaHGIAQGYATG4ARfIAQzYAQHoAQH4AQKIAgGoAgO4Aof767kGwAIB0gIkNGE2MTI1MjgtZjJlNC00YWM4LWFlMmQtOGIxZjM3NWIyNDlm2AIF4AIB&sid=b91524e727f20006ae00489afb379d3a&aid=304142&lang=en-us&sb=1&src_elem=sb&src=index&dest_id=20088325&dest_type=city&checkin=2025-11-18&checkout=2025-12-18&group_adults=2&no_rooms=1&group_children=0")
    
    # handle the sign-in alert
    try:
        # wait up to 20 seconds for the sign-in alert to appear
        close_button = WebDriverWait(driver, 20).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "[role=\"dialog\"] button[aria-label=\"Dismiss sign-in info.\"]"))
        )
        # click the close button
        close_button.click()
    except TimeoutException:
        print("Sign-in modal did not appear, continuing...")
    
    # where to store the scraped data
    items = []
    
    # select all property items on the page
    property_items = driver.find_elements(By.CSS_SELECTOR, "[data-testid=\"property-card\"]")
    
    # iterate over the property items and
    # extract data from them
    for property_item in property_items:
        # scraping logic...
        url = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "a[data-testid=\"property-card-desktop-single-image\"]").get_attribute("href"))
        image = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "img[data-testid=\"image\"]").get_attribute("src"))

        title = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"title\"]").text)
        address = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"address\"]").text)
        distance = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"distance\"]").text)

        review_score = None
        review_count = None
        review_text = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"review-score\"]").text)
        if review_text is not None:
            # split the review string by newline
            parts = review_text.split("\n")

            # process each part
            for part in parts:
                part = part.strip()
                # check if this part is a number (potential review score)
                if part.replace(".", "", 1).isdigit():
                    review_score = float(part)
                # check if it contains the "reviews" string
                elif "reviews" in part:
                    # extract the number before "reviews"
                    review_count = int(part.split(" ")[0].replace(",", ""))

        description = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"recommended-units\"]").text)

        # initialize both prices to None so the item dict below is always
        # valid, even when no price element is found for this card
        original_price = None
        price = None
        price_element = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"availability-rate-information\"]"))
        if price_element is not None:
            original_price = handle_no_such_element_exception(lambda: (
                price_element.find_element(By.CSS_SELECTOR, "[aria-hidden=\"true\"]:not([data-testid])").text.replace(",", "")
            ))
            price = handle_no_such_element_exception(lambda: (
                price_element.find_element(By.CSS_SELECTOR, "[data-testid=\"price-and-discounted-price\"]").text.replace(",", "")
            ))

        # populate a new item with the scraped data
        item = {
            "url": url,
            "image": image,
            "title": title,
            "address": address,
            "distance": distance,
            "review_score": review_score,
            "review_count": review_count,
            "description": description,
            "original_price": original_price,
            "price": price
        }
        # add the new item to the list of scraped items
        items.append(item)
    
    # specify the name of the output CSV file
    output_file = "properties.csv"
    
    # export the items list to a CSV file
    with open(output_file, mode="w", newline="", encoding="utf-8") as file:
        # create a CSV writer object
        writer = csv.DictWriter(file, fieldnames=["url", "image", "title", "address", "distance", "review_score", "review_count", "description", "original_price", "price"])
        # write the header row
        writer.writeheader()
    
        # write each item as a row in the CSV
        writer.writerows(items)
    
    # close the web driver and release its resources
    driver.quit()
    

    Run the script using the command python scraper.py, and you’ll find the results in properties.csv.
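
    Note that the script collects only the property cards present when the page first renders. Booking.com loads additional results dynamically, so if you need more listings, a simple (though site-dependent) extension is to scroll the page a few times before the find_elements call in Step 6:

    import time

    # scroll to the bottom a few times to trigger lazy loading of more cards
    for _ in range(3):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(2)  # give newly loaded cards time to render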

    Real-World Use Cases for Scraping Booking.com with Python

    Scraping Booking.com with Python has enabled companies and solo developers to build innovative solutions. Here are some real-world scenarios:

    • Price Optimization: A travel startup used Booking.com data to adjust its nightly rates based on city-level trends, increasing its average booking revenue by 18 percent.
    • Competitor Monitoring: Digital agencies track local competitor listings across cities to assess how often they appear in search results or gain new reviews.
    • Affiliate Automation: Webmasters automate the extraction of top listings and synchronize their affiliate marketing platforms without manual updates.
    • Review Sentiment Analysis: Companies collect guest review data, run NLP models to extract sentiment, and correlate that with pricing strategies.
    • Regional Availability Reports: Analytics teams at large OTAs analyze when regions are overbooked to recommend alternate destinations to users in real time.

    Next Steps After You Scrape Booking.com with Python

    If you now feel confident about how to scrape Booking.com with Python, you are ready to scale, analyze, and integrate your results. The goal is not to scrape and forget, but to build workflows that derive business intelligence or power real-time services. Be mindful of data usage policies and implement caching, retries, and proxy rotation as you grow your scraping infrastructure. Consider running your scraper via cloud functions or cron jobs for consistent, periodic updates.
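
    As a starting point for proxy rotation, Chrome accepts a proxy endpoint via its --proxy-server flag. The sketch below uses a hypothetical rotating-proxy gateway at gate.example.com:1234; substitute your provider’s actual host and port. Note that --proxy-server does not accept inline username/password credentials, so authenticated proxies need an extra layer (for example, a local forwarder or a tool such as Selenium Wire):

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    # hypothetical rotating-proxy gateway; replace with your provider's endpoint
    PROXY_SERVER = "http://gate.example.com:1234"

    options = Options()
    options.add_argument(f"--proxy-server={PROXY_SERVER}")

    driver = webdriver.Chrome(service=Service(), options=options)

    With a rotating gateway, each new connection can exit from a different IP, which spreads requests out and reduces the chance of rate limiting as you scale.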

    • Python
    • Web Scraping
