This workflow automates the process of scraping real estate listings from Idealista (or similar property portals), extracting structured property data using AI, and storing the results directly into Google Sheets.
It is designed to handle paginated listing pages, collect individual property URLs, extract detailed listing information, and continuously build a structured real estate database with minimal manual effort.
Automatically navigates through multiple listing pages, extracts property URLs, and retrieves detailed property information without manual browsing.
Uses ScrapeGraphAI to intelligently extract structured information such as price, area, bedrooms, bathrooms, and other listing details.
Dynamically generates paginated URLs, allowing the workflow to scrape hundreds or thousands of listings efficiently.
Automatically writes and updates extracted property data into Google Sheets, creating a centralized and continuously updated real estate database.
Uses the property URL as a unique identifier to append or update listings without creating duplicates.
The workflow can be adapted to other property portals, locations, and search filters.
Ensures consistent and reliable data formatting, making the output ready for analysis, reporting, or further automation.
Built entirely inside n8n with reusable modules, making maintenance and future upgrades simple.
This workflow automates the extraction of real estate listings from Idealista by performing two main phases: listing URL discovery and detailed data extraction.
Trigger and Pagination Setup
A Manual Trigger starts the workflow. A Set node defines the base search URL and the maximum number of pages to scrape. A Code node then generates the paginated URLs (e.g., .../lista-1.htm, .../lista-2.htm).
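A minimal sketch of that Code node ("Run Once for All Items" mode), assuming the Set node exposes the url and max_pages fields described in the setup steps below:

```javascript
// Pagination Code node ("Run Once for All Items") — a sketch.
// Reads `url` and `max_pages` from the upstream Set node and emits
// one item per paginated search URL (.../lista-1.htm, .../lista-2.htm, ...).
const { url, max_pages } = $input.first().json;
const base = url.replace(/\/$/, ''); // drop a trailing slash, if present

const pages = [];
for (let page = 1; page <= Number(max_pages); page++) {
  pages.push({ json: { url: `${base}/lista-${page}.htm` } });
}
return pages;
```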
Extract Listing URLs from Search Pages
The generated URLs are split into batches using a Split In Batches node. For each search page, a ScrapegraphAI node extracts all individual property URLs that match the pattern https://www.idealista.it/immobile/xxxx. The results are then aggregated and unified using an Aggregate and a Code node to remove duplicates and flatten the list.
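The dedupe and flatten step amounts to something like the following Code node sketch (the `urls` field name is an assumption; it depends on how the Aggregate node is configured):

```javascript
// Dedupe/flatten Code node ("Run Once for All Items") — a sketch.
// Assumes each incoming item carries an array of listing links under
// `json.urls` (the actual field name depends on the Aggregate node).
const seen = new Set();
const out = [];

for (const item of $input.all()) {
  for (const url of item.json.urls ?? []) {
    // Keep only individual property pages and skip repeats.
    if (url.startsWith('https://www.idealista.it/immobile/') && !seen.has(url)) {
      seen.add(url);
      out.push({ json: { url } });
    }
  }
}
return out;
```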
Process Each Property URL
The unified list of property URLs is split again into batches. For each property URL, a second ScrapegraphAI node extracts detailed information following a strict JSON schema (including title, description, price, area, bedrooms, bathrooms, floor, rooms, balcony, terrace, cellar, heating, air conditioning, and image URLs).
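As an illustration, a trimmed outputSchema covering those fields could look like this (exact field names and types in the shipped workflow may differ):

```json
{
  "type": "object",
  "properties": {
    "title": { "type": "string" },
    "description": { "type": "string" },
    "price": { "type": "number" },
    "area": { "type": "number" },
    "bedrooms": { "type": "integer" },
    "bathrooms": { "type": "integer" },
    "floor": { "type": "string" },
    "rooms": { "type": "integer" },
    "balcony": { "type": "boolean" },
    "terrace": { "type": "boolean" },
    "cellar": { "type": "boolean" },
    "heating": { "type": "string" },
    "air_conditioning": { "type": "boolean" },
    "image_urls": { "type": "array", "items": { "type": "string" } }
  }
}
```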
Store Data in Google Sheets
The extracted data is finally written to a Google Sheet using the Google Sheets node configured with appendOrUpdate mode, which avoids duplicates by matching the URL column.
Import and Configure Credentials
Import the workflow into n8n. Add the following credentials: a ScrapeGraphAI API key and a Google Sheets connection (OAuth2 or service account).
Prepare the Google Sheet
Clone this template sheet or create your own. Update the Google Sheets node with your Document ID and Sheet Name.
Configure the Search Parameters
In the Set params node, modify the url variable to target your desired search (location, filters, etc.) and set max_pages to control how many search result pages to scrape.
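For example (illustrative values; any Idealista search URL with the same .../lista-N.htm pagination pattern works):

```json
{
  "url": "https://www.idealista.it/vendita-case/milano-milano/",
  "max_pages": 5
}
```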
Adjust Extraction Logic (if needed)
Edit the ScrapegraphAI node's outputSchema (JSON schema) to match the fields you want to extract.
Enable and Execute
Activate the workflow. Click the Execute Workflow button to start scraping. The results will automatically populate the configured Google Sheet, appending new listing data without creating duplicates.
👉 Subscribe to my new YouTube channel. Here I’ll share videos and Shorts with practical tutorials and FREE templates for n8n.
Contact me for consulting and support, or add me on LinkedIn.