    How to Scrape Booking.com with Python Using ProxyTee

    April 23, 2025 Mike

    A Booking.com scraper is an automated tool designed to extract data from Booking.com pages. This tool retrieves essential details from property listings, including hotel names, prices, reviews, ratings, amenities, and availability. This data is invaluable for market analysis, price comparison, and creating travel-related datasets. ProxyTee can help in this process by providing the robust infrastructure needed for efficient scraping.

    In this post, you’ll learn how to build a Python scraper for Booking.com to efficiently extract hotel data, reviews, and prices, all while leveraging the power of ProxyTee.


    Data You Can Scrape From Booking.com

    Here’s a list of key data points that can be extracted from Booking.com:

    • Property details: Hotel name, address, distance from landmarks (e.g., city center).
    • Pricing information: Regular and discounted prices.
    • Reviews and ratings: Review score, number of reviews, and guest feedback.
    • Availability: Room types available, booking options, and dates with availability.
    • Media: Property and room images.
    • Amenities: Facilities offered (e.g., Wi-Fi, parking, pool) and room-specific amenities.
    • Promotions: Special offers, discounts, and limited-time deals.
    • Policies: Cancellation policy and check-in/check-out times.
    • Additional details: Property description and nearby attractions.
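
    To make the target concrete, here is a minimal sketch of one possible record schema for a scraped listing. The field names are illustrative choices, not part of any Booking.com API, and they match the fields extracted later in this tutorial:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class PropertyListing:
        url: Optional[str] = None
        title: Optional[str] = None
        address: Optional[str] = None
        distance: Optional[str] = None        # e.g., "0.5 km from center"
        image: Optional[str] = None
        description: Optional[str] = None
        review_score: Optional[float] = None
        review_count: Optional[int] = None
        # prices kept as strings, since currency symbols vary by locale
        original_price: Optional[str] = None
        price: Optional[str] = None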

    Scraping Booking.com in Python: Step-by-Step Guide

    This step-by-step guide will show you how to build a Booking.com scraper using Python with ProxyTee to ensure efficient and reliable data extraction.

    Step 1️⃣: Project Setup

    Make sure you have Python 3 installed. If not, download and install it from the official website.

    Create a folder for your project:

    mkdir booking-scraper

    Navigate into the project folder and initialize a virtual environment:

    cd booking-scraper
    python -m venv env

    Activate the virtual environment:

    • Linux/macOS: source env/bin/activate
    • Windows: env\Scripts\activate

    Finally, create a scraper.py file in the project directory to hold the scraping logic.

    Step 2️⃣: Select the Scraping Library

    Booking.com is a dynamic website: much of its content is rendered client-side with JavaScript, so plain HTTP requests won’t return the full page. The best approach for scraping dynamic sites is a browser automation tool. For this tutorial, we’ll use Selenium.

    Step 3️⃣: Install and Configure Selenium

    Install Selenium using pip:

    pip install selenium

    Import Selenium in scraper.py and initialize a WebDriver:

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    
    # create a Chrome web driver instance
    driver = webdriver.Chrome(service=Service())
    
    driver.quit()

    Remember to include driver.quit() at the end of the script to close the browser.
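
    If you don’t need to watch the browser while the script runs, you can launch Chrome headless. Here is a minimal sketch using Selenium’s standard Options API; the flags are common Chrome switches you can adjust or drop:

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.chrome.options import Options

    options = Options()
    # run Chrome without a visible window
    options.add_argument("--headless=new")
    # fix the viewport so layout-dependent selectors behave predictably
    options.add_argument("--window-size=1920,1080")

    driver = webdriver.Chrome(service=Service(), options=options)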

    Step 4️⃣: Visit the Target Page

    Manually perform a search on Booking.com and copy the resulting URL. Then, use Selenium to visit the target page:

    driver.get("https://www.booking.com/searchresults.html?ss=New+York&ssne=New+York&ssne_untouched=New+York&label=gen173nr-1FCAEoggI46AdIM1gEaHGIAQGYATG4ARfIAQzYAQHoAQH4AQKIAgGoAgO4Aof767kGwAIB0gIkNGE2MTI1MjgtZjJlNC00YWM4LWFlMmQtOGIxZjM3NWIyNDlm2AIF4AIB&sid=b91524e727f20006ae00489afb379d3a&aid=304142&lang=en-us&sb=1&src_elem=sb&src=index&dest_id=20088325&dest_type=city&checkin=2025-11-18&checkout=2025-12-18&group_adults=2&no_rooms=1&group_children=0")

    Step 5️⃣: Deal With the Login Alert

    Booking.com often shows a sign-in alert that blocks the page content, so you need Selenium to close it. Identify the close button with your browser’s developer tools, then wrap the dismissal in a try-except block:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.by import By
    from selenium.common.exceptions import TimeoutException
    
    try:
        close_button = WebDriverWait(driver, 20).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "[role=\"dialog\"] button[aria-label=\"Dismiss sign-in info.\"]"))
        )
        close_button.click()
    except TimeoutException:
        print("Sign-in modal did not appear, continuing...")
    

    Step 6️⃣: Select the Booking.com Items

    Initialize an empty list to hold scraped data:

    items = []
    

    Select all property card elements on the page using a CSS selector:

    property_items = driver.find_elements(By.CSS_SELECTOR, "[data-testid=\"property-card\"]")
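
    Keep in mind that Booking.com may lazy-load additional cards as you scroll, so find_elements can return fewer results than the page will eventually show. If you need more than the initially rendered cards, one simple sketch is to scroll before selecting; the fixed sleep below is a crude placeholder for a smarter wait:

    import time

    # scroll a few times to trigger lazy loading of more property cards
    for _ in range(3):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(2)  # crude wait for new cards to render

    property_items = driver.find_elements(By.CSS_SELECTOR, "[data-testid=\"property-card\"]")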
    

    Step 7️⃣: Scrape the Booking.com Items

    Use a custom exception handler function to gracefully deal with elements that are missing from some property cards:

    from selenium.common import NoSuchElementException
    
    def handle_no_such_element_exception(data_extraction_task):
        try:
            return data_extraction_task()
        except NoSuchElementException:
            return None
    

    Then, inside the loop, extract property data using CSS selectors and our error handler. For example:

    for property_item in property_items:
        # scraping logic...
        url = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "a[data-testid=\"property-card-desktop-single-image\"]").get_attribute("href"))
        image = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "img[data-testid=\"image\"]").get_attribute("src"))
        
        title = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"title\"]").text)
        address = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"address\"]").text)
        distance = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"distance\"]").text)
    
        review_score = None
        review_count = None
        review_text = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"review-score\"]").text)
        if review_text is not None:
            # split the review string by newline
            parts = review_text.split("\n")

            # process each part
            for part in parts:
                part = part.strip()
                # check if this part is a number (potential review score)
                if part.replace(".", "", 1).isdigit():
                    review_score = float(part)
                # check if it contains the "reviews" string
                elif "reviews" in part:
                    # extract the number before "reviews"
                    review_count = int(part.split(" ")[0].replace(",", ""))

        description = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"recommended-units\"]").text)

        # initialize prices to None so the item dict is always valid,
        # even for properties without rate information
        original_price = None
        price = None
        price_element = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"availability-rate-information\"]"))
        if price_element is not None:
            original_price = handle_no_such_element_exception(lambda: (
                price_element.find_element(By.CSS_SELECTOR, "[aria-hidden=\"true\"]:not([data-testid])").text.replace(",", "")
            ))
            price = handle_no_such_element_exception(lambda: (
                price_element.find_element(By.CSS_SELECTOR, "[data-testid=\"price-and-discounted-price\"]").text.replace(",", "")
            ))

        item = {
            "url": url,
            "image": image,
            "title": title,
            "address": address,
            "distance": distance,
            "review_score": review_score,
            "review_count": review_count,
            "description": description,
            "original_price": original_price,
            "price": price
        }
        items.append(item)
    

    Step 8️⃣: Export to CSV

    Import the csv library and write scraped data to CSV:

    import csv
    
    output_file = "properties.csv"
    with open(output_file, mode="w", newline="", encoding="utf-8") as file:
        writer = csv.DictWriter(file, fieldnames=["url", "image", "title", "address", "distance", "review_score", "review_count", "description", "original_price", "price"])
        writer.writeheader()
        writer.writerows(items)
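
    CSV is convenient for spreadsheets, but if you prefer structured output the same items list serializes to JSON with the standard library. A minimal sketch:

    import json

    with open("properties.json", mode="w", encoding="utf-8") as file:
        json.dump(items, file, indent=2, ensure_ascii=False)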
    

    Step 9️⃣: Put It All Together

    Review the complete scraper.py:

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.by import By
    from selenium.common.exceptions import TimeoutException
    from selenium.common import NoSuchElementException
    
    import csv
    
    def handle_no_such_element_exception(data_extraction_task):
        try:
            return data_extraction_task()
        except NoSuchElementException:
            return None
    
    # create a Chrome web driver instance
    driver = webdriver.Chrome(service=Service())
    
    # connect to the target page
    driver.get("https://www.booking.com/searchresults.html?ss=New+York&ssne=New+York&ssne_untouched=New+York&label=gen173nr-1FCAEoggI46AdIM1gEaHGIAQGYATG4ARfIAQzYAQHoAQH4AQKIAgGoAgO4Aof767kGwAIB0gIkNGE2MTI1MjgtZjJlNC00YWM4LWFlMmQtOGIxZjM3NWIyNDlm2AIF4AIB&sid=b91524e727f20006ae00489afb379d3a&aid=304142&lang=en-us&sb=1&src_elem=sb&src=index&dest_id=20088325&dest_type=city&checkin=2025-11-18&checkout=2025-12-18&group_adults=2&no_rooms=1&group_children=0")
    
    # handle the sign-in alert
    try:
        # wait up to 20 seconds for the sign-in alert to appear
        close_button = WebDriverWait(driver, 20).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "[role=\"dialog\"] button[aria-label=\"Dismiss sign-in info.\"]"))
        )
        # click the close button
        close_button.click()
    except TimeoutException:
        print("Sign-in modal did not appear, continuing...")
    
    # where to store the scraped data
    items = []
    
    # select all property items on the page
    property_items = driver.find_elements(By.CSS_SELECTOR, "[data-testid=\"property-card\"]")
    
    # iterate over the property items and
    # extract data from them
    for property_item in property_items:
        # scraping logic...
        url = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "a[data-testid=\"property-card-desktop-single-image\"]").get_attribute("href"))
        image = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "img[data-testid=\"image\"]").get_attribute("src"))
    
        title = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"title\"]").text)
        address = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"address\"]").text)
        distance = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"distance\"]").text)
    
        review_score = None
        review_count = None
        review_text = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"review-score\"]").text)
        if review_text is not None:
            # split the review string by newline
            parts = review_text.split("\n")

            # process each part
            for part in parts:
                part = part.strip()
                # check if this part is a number (potential review score)
                if part.replace(".", "", 1).isdigit():
                    review_score = float(part)
                # check if it contains the "reviews" string
                elif "reviews" in part:
                    # extract the number before "reviews"
                    review_count = int(part.split(" ")[0].replace(",", ""))

        description = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"recommended-units\"]").text)

        # initialize prices to None so the item dict is always valid,
        # even for properties without rate information
        original_price = None
        price = None
        price_element = handle_no_such_element_exception(lambda: property_item.find_element(By.CSS_SELECTOR, "[data-testid=\"availability-rate-information\"]"))
        if price_element is not None:
            original_price = handle_no_such_element_exception(lambda: (
                price_element.find_element(By.CSS_SELECTOR, "[aria-hidden=\"true\"]:not([data-testid])").text.replace(",", "")
            ))
            price = handle_no_such_element_exception(lambda: (
                price_element.find_element(By.CSS_SELECTOR, "[data-testid=\"price-and-discounted-price\"]").text.replace(",", "")
            ))

        # populate a new item with the scraped data
        item = {
            "url": url,
            "image": image,
            "title": title,
            "address": address,
            "distance": distance,
            "review_score": review_score,
            "review_count": review_count,
            "description": description,
            "original_price": original_price,
            "price": price
        }
        # add the new item to the list of scraped items
        items.append(item)
    
    # specify the name of the output CSV file
    output_file = "properties.csv"
    
    # export the items list to a CSV file
    with open(output_file, mode="w", newline="", encoding="utf-8") as file:
        # create a CSV writer object
        writer = csv.DictWriter(file, fieldnames=["url", "image", "title", "address", "distance", "review_score", "review_count", "description", "original_price", "price"])
        # write the header row
        writer.writeheader()
    
        # write each item as a row in the CSV
        writer.writerows(items)
    
    # close the web driver and release its resources
    driver.quit()
    

    Run the script using the command python scraper.py, and you’ll find the results in properties.csv.


    Taking Your Scraper Further

    This tutorial has demonstrated how to build a Booking.com scraper with Python. While the basic script covers the fundamentals, anti-scraping measures and dynamically loaded content can still make scraping challenging.

    To ensure a robust and reliable scraping process, consider using ProxyTee. ProxyTee provides unlimited residential proxies that rotate IPs, reducing the risk of being blocked (see the sketch after the list for one way to wire a proxy into Selenium). ProxyTee offers:

    • Unlimited bandwidth, allowing intensive data operations without worrying about overages.
    • Global IP coverage to access specific regions for location-based tasks.
    • Auto rotation, which rotates IPs frequently to prevent detection and bans from websites, including Booking.com. You can also customize the rotation interval to fit your needs.
    • Simple API integration, supporting automation and compatibility with different workflows.
    • A user-friendly interface.
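
    As an illustration, here is a minimal sketch of pointing Selenium’s Chrome at a rotating proxy through the standard --proxy-server flag. The gateway address below is a placeholder, not a real ProxyTee endpoint; substitute the host and port from your ProxyTee dashboard:

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.chrome.options import Options

    # hypothetical gateway address; replace with your ProxyTee endpoint
    PROXY = "http://gateway.example-proxy.com:8080"

    options = Options()
    # route all browser traffic through the proxy
    options.add_argument(f"--proxy-server={PROXY}")

    driver = webdriver.Chrome(service=Service(), options=options)
    driver.get("https://httpbin.org/ip")  # quick check: the reported IP should be the proxy's
    print(driver.page_source)
    driver.quit()

    Note that Chrome ignores credentials embedded in --proxy-server; for username/password proxy authentication you would typically use IP allowlisting or a helper such as Selenium Wire.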

    With ProxyTee behind your scraper, your web scraping tasks stay efficient, effective, and reliable. Start your data gathering journey with ProxyTee today.
