UK Legal Framework Overview
Web scraping in the United Kingdom operates within a complex legal landscape that has evolved significantly since the implementation of GDPR in 2018. Understanding this framework is crucial for any organisation engaged in automated data collection activities.
The primary legislation governing web scraping activities in the UK includes:
- Data Protection Act 2018 (DPA 2018) - Supplements and tailors the UK GDPR in domestic law
- UK General Data Protection Regulation (UK GDPR) - The EU GDPR as retained in UK law post-Brexit
- Computer Misuse Act 1990 - Criminalises unauthorised access to computer systems
- Copyright, Designs and Patents Act 1988 - Protects intellectual property rights
- Electronic Commerce (EC Directive) Regulations 2002 - Governs online commercial activities
⚖️ Legal Disclaimer
This guide provides general information about UK web scraping compliance and should not be considered as legal advice. For specific legal matters, consult with qualified legal professionals who specialise in data protection and technology law.
GDPR & Data Protection Act 2018 Compliance
The most significant legal consideration for web scraping activities is compliance with data protection laws. Under UK GDPR and DPA 2018, any processing of personal data must meet strict legal requirements.
What Constitutes Personal Data?
Personal data includes any information relating to an identified or identifiable natural person. In the context of web scraping, this commonly includes:
- Names and contact details
- Email addresses and phone numbers
- Social media profiles and usernames
- Professional information and job titles
- Online identifiers and IP addresses
- Behavioural data and preferences
Lawful Basis for Processing
Before scraping personal data, you must establish a lawful basis under Article 6 of the UK GDPR:
🔓 Legitimate Interests
Most commonly used for web scraping. Requires balancing your interests against data subjects' rights and freedoms.
✅ Consent
Requires explicit, informed consent from data subjects.
📋 Contractual Necessity
Processing necessary for contract performance.
Data Protection Principles
All web scraping activities must comply with the seven key data protection principles:
- Lawfulness, Fairness, and Transparency - Process data lawfully with clear purposes
- Purpose Limitation - Use data only for specified, explicit purposes
- Data Minimisation - Collect only necessary data
- Accuracy - Ensure data is accurate and up-to-date
- Storage Limitation - Retain data only as long as necessary
- Integrity and Confidentiality - Implement appropriate security measures
- Accountability - Demonstrate compliance with regulations
Website Terms of Service
A website's Terms of Service (ToS) is a contractual document that governs how users may interact with the site. In UK law, ToS terms are enforceable where the user has been given reasonable notice of them; clickwrap agreements, which require an affirmative act of acceptance, are more readily enforced than browsewrap notices that merely link to the terms. Courts have shown increasing willingness to uphold ToS restrictions on automated access, making them a primary compliance consideration before any web scraping project begins.
Reviewing Terms Before You Scrape
Before deploying a scraper, locate the target site's Terms of Service, Privacy Policy, and any Acceptable Use Policy. Search for keywords such as "automated", "scraping", "crawling", "robots", and "commercial use". Many platforms explicitly prohibit data extraction for commercial purposes or restrict the reuse of content in competing products.
Common Restrictive Clauses
- Prohibition on automated access or bots
- Restrictions on commercial use of extracted data
- Bans on systematic downloading or mirroring
- Clauses requiring prior written consent for data collection
- Prohibitions on circumventing technical access controls
robots.txt as a Signal of Intent
The robots.txt file is not legally binding in itself, but courts and regulators treat compliance with it as strong evidence of good faith. A website that explicitly disallows crawling in its robots.txt is communicating a clear intention to restrict automated access. Ignoring these directives significantly increases legal exposure.
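Checking robots.txt before each fetch is straightforward to automate with Python's standard library. In this sketch the robots.txt content and the bot name are illustrative assumptions, not taken from any real site:

```python
# Sketch: checking robots.txt directives before crawling a path.
from urllib.robotparser import RobotFileParser

def is_path_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Parse robots.txt content and check whether this agent may fetch the path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)

# Illustrative robots.txt: disallows /private/ for all agents.
robots = "User-agent: *\nDisallow: /private/\n"
```

Because compliance with robots.txt is weighed as evidence of good faith, running a check like this before every fetch, and logging the result, is a cheap safeguard.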
Safe Approach
Always read the ToS before scraping. Respect all Disallow directives in robots.txt. Never attempt to circumvent technical barriers such as rate limiting, CAPTCHAs, or login walls. If in doubt, seek written permission from the site owner or contact us for a compliance review.
Intellectual Property Considerations
Intellectual property law creates some of the most significant legal risks in web scraping. Two overlapping regimes apply in the UK: copyright under the Copyright, Designs and Patents Act 1988 (CDPA), and the sui generis database right retained from the EU Database Directive. Understanding both is essential before extracting content at scale.
Copyright in Scraped Content
Original literary, artistic, or editorial content on a website is automatically protected by copyright from the moment of creation. Scraping and reproducing such content — even temporarily in a dataset — may constitute copying under section 17 of the CDPA. This includes article text, product descriptions written by humans, photographs, and other creative works. The threshold for originality in UK law is low: if a human author exercised skill and judgement in creating the content, it is likely protected.
Database Rights
The UK retained the sui generis database right post-Brexit under the Copyright and Rights in Databases Regulations 1997. This right protects databases where there has been substantial investment in obtaining, verifying, or presenting the contents. Systematically extracting a substantial part of a protected database — even if individual records are factual and unoriginal — can infringe this right. Price comparison sites, property portals, and job boards are typical examples of heavily protected databases.
Permitted Acts
- Text and Data Mining (TDM): Section 29A CDPA permits TDM for non-commercial research without authorisation, provided lawful access to the source material exists.
- News Reporting: Fair dealing for reporting current events may permit limited use of scraped content with appropriate attribution.
- Research and Private Study: Fair dealing for non-commercial research and private study covers limited reproduction.
Safe Use
Confine scraping to factual data rather than expressive content. Rely on the TDM exception for non-commercial research. For commercial data scraping projects, obtain a licence or legal opinion before extracting from content-rich or database-heavy sites.
Computer Misuse Act 1990
The Computer Misuse Act 1990 (CMA) is the UK's primary legislation targeting unauthorised access to computer systems. While it was enacted before web scraping existed as a practice, its provisions are broad enough to apply where a scraper accesses systems in a manner that exceeds or circumvents authorisation. Criminal liability under the CMA carries custodial sentences, making it the most serious legal risk in aggressive scraping operations.
What Constitutes Unauthorised Access
Under section 1 of the CMA, it is an offence to cause a computer to perform any function with intent to secure unauthorised access to any program or data. Authorisation in this context is interpreted broadly. If a website's ToS prohibits automated access, a court may find that any automated access is therefore unauthorised, even if no technical barrier was overcome.
High-Risk Scraping Behaviours
- CAPTCHA bypass: Programmatically solving or circumventing CAPTCHAs is a strong indicator of intent to exceed authorisation and may constitute a CMA offence.
- Credential stuffing: Using harvested credentials to access accounts is clearly unauthorised access under section 1.
- Accessing password-protected content: Scraping behind a login wall without permission carries significant CMA risk.
- Denial of service through volume: Sending requests at a rate that degrades site performance could engage section 3 of the CMA (unauthorised impairment).
Rate Limiting and Respectful Access
Implementing considerate request rates is both a technical best practice and a legal safeguard. Scraping at a pace that mimics human browsing, honouring Crawl-delay directives, and scheduling jobs during off-peak hours all reduce the risk of CMA exposure and demonstrate good faith.
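One minimal way to enforce a considerate pace is a throttle that guarantees a minimum interval between requests. This sketch assumes a fixed interval (for example, one taken from a Crawl-delay directive):

```python
import time

class RequestThrottle:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(self, min_interval_seconds: float):
        self.min_interval = min_interval_seconds
        self._last_request = 0.0

    def wait(self) -> float:
        """Sleep until the minimum interval has elapsed; return time slept."""
        now = time.monotonic()
        elapsed = now - self._last_request
        delay = max(0.0, self.min_interval - elapsed)
        if delay > 0:
            time.sleep(delay)
        self._last_request = time.monotonic()
        return delay
```

Calling `wait()` before every request caps the crawl rate regardless of how fast pages are parsed, which is both respectful to the server and useful evidence of good-faith operation.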
Practical Safe-Scraping Checklist
- Never bypass CAPTCHAs or authentication mechanisms
- Do not scrape login-gated content without explicit permission
- Throttle requests to avoid server impact
- Stop immediately if you receive a cease-and-desist letter or HTTP 429 responses at scale
- Keep records of authorisation and access methodology
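The "stop on HTTP 429 at scale" item in the checklist above can be made mechanical. This sketch halts a crawl when rate-limit responses dominate a recent window of status codes; the 10% threshold is an illustrative assumption to tune for your own operation:

```python
def should_halt(recent_status_codes: list) -> bool:
    """Halt the crawl if rate-limit responses (429) exceed 10% of recent requests."""
    if not recent_status_codes:
        return False
    rate_limited = sum(1 for code in recent_status_codes if code == 429)
    return rate_limited / len(recent_status_codes) > 0.1
```

A scraper that checks this after each batch and shuts down automatically is far easier to defend than one that continues hammering a server that is visibly asking it to stop.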
Compliance Best Practices
Responsible web scraping is not only about avoiding legal liability — it is about operating in a manner that is sustainable, transparent, and respectful of the systems and people whose data you collect. The following practices form a baseline compliance framework for any web scraping operation in the UK.
Identify Yourself
Configure your scraper to send a descriptive User-Agent string that identifies your bot, your organisation, and a contact URL or email address. Masquerading as a standard browser undermines your good-faith defence.
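A sketch of such a request using the standard library; the bot name and contact address below are illustrative assumptions:

```python
from urllib.request import Request

def identified_request(url: str, bot_name: str, contact: str) -> Request:
    """Build a request whose User-Agent names the bot and a contact point."""
    user_agent = f"{bot_name}/1.0 (+{contact})"
    return Request(url, headers={"User-Agent": user_agent})

# Illustrative example: a research bot with a contact mailbox.
req = identified_request(
    "https://example.com/",
    "acme-research-bot",
    "mailto:data@example.com",
)
```

A site operator who can see who is crawling and how to reach them is far more likely to raise concerns informally than to escalate straight to legal action.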
Respect robots.txt
Parse and honour robots.txt before each crawl. Implement Crawl-delay directives where specified. Re-check robots.txt on ongoing projects as site policies change.
Rate Limiting
As a general rule, stay below one request per second for sensitive or consumer-facing sites. For large-scale projects, negotiate crawl access directly with the site operator or use official APIs where available.
Data Minimisation
Under UK GDPR, collect only the personal data necessary for your stated purpose. Do not harvest email addresses, names, or profile data speculatively. Filter personal data at the point of collection rather than post-hoc.
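Field-level filtering at the point of collection can be as simple as an allowlist. The field names here are illustrative assumptions for a price-monitoring project:

```python
# Illustrative allowlist of non-personal fields needed for the stated purpose.
ALLOWED_FIELDS = {"product_name", "price", "availability", "last_updated"}

def minimise(record: dict) -> dict:
    """Keep only the allowlisted fields; drop everything else at collection time."""
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}
```

Filtering this way means personal data such as seller contact details never enters your pipeline at all, which is a much stronger position than deleting it afterwards.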
Logging and Audit Trails
Maintain detailed logs of every scraping job: the target URL, date and time, volume of records collected, fields extracted, and the lawful basis relied upon. These logs are invaluable if your activities are later challenged by a site operator, a data subject, or a regulator.
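A minimal sketch of such a log entry, written as one JSON line per job so the log is easy to append to and audit later; the field set mirrors the items listed above:

```python
import json
from datetime import datetime, timezone

def audit_entry(url: str, records: int, fields: list, lawful_basis: str) -> str:
    """Serialise one scraping-job log entry as a single JSON line."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "target_url": url,
        "records_collected": records,
        "fields_extracted": fields,
        "lawful_basis": lawful_basis,
    }
    return json.dumps(entry)
```

Appending each line to a write-once log file gives you a tamper-evident record to produce if a site operator, data subject, or regulator later challenges the activity.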
Document Your Lawful Basis
Before each new scraping project, record in writing the lawful basis under UK GDPR (if personal data is involved), the IP assessment under CDPA, and the ToS review outcome. This documentation discipline is the hallmark of a GDPR-compliant data operation.
Legal Risk Assessment Framework
Not all scraping projects carry equal legal risk. A structured risk assessment before each project allows you to allocate appropriate resources to compliance review, obtain legal advice where necessary, and document your decision-making.
Four-Factor Scoring Matrix
Data Type
- Low: Purely factual, non-personal data (prices, statistics)
- Medium: Aggregated or anonymised personal data
- High: Identifiable personal data, special category data
Volume
- Low: Spot-check or sample extraction
- Medium: Regular scheduled crawls of a defined dataset
- High: Systematic extraction of substantially all site content
Website Sensitivity
- Low: Government open data, explicitly licensed content
- Medium: General commercial sites with permissive ToS
- High: Sites with explicit scraping bans, login walls, or technical barriers
Commercial Use
- Low: Internal research, academic study, non-commercial analysis
- Medium: Internal commercial intelligence not shared externally
- High: Data sold to third parties, used in competing products, or published commercially
Risk Classification
Score each factor 1–3 and sum the results. A score of 4–6 is low risk and may proceed with standard documentation. A score of 7–9 is medium risk and requires a written legal basis assessment and senior sign-off. A score of 10–12 is high risk and requires legal review before any data is collected.
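The scoring and classification rules above translate directly into code. This sketch mirrors the 4-6 / 7-9 / 10-12 thresholds, with each factor scored 1 (low), 2 (medium), or 3 (high):

```python
def classify_risk(data_type: int, volume: int, sensitivity: int, commercial: int) -> str:
    """Sum the four factor scores and classify the project's legal risk."""
    for score in (data_type, volume, sensitivity, commercial):
        if score not in (1, 2, 3):
            raise ValueError("each factor must be scored 1, 2, or 3")
    total = data_type + volume + sensitivity + commercial
    if total <= 6:
        return "low"      # proceed with standard documentation
    if total <= 9:
        return "medium"   # written legal basis assessment and senior sign-off
    return "high"         # legal review before any data is collected
```

Recording the four inputs alongside the output classification for every project turns the matrix into an auditable decision record rather than an informal judgement.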
Red Flags Requiring Immediate Legal Review
- The target site's ToS explicitly prohibits scraping
- The data includes health, financial, or biometric information
- The project involves circumventing any technical access control
- Extracted data will be sold or licensed to third parties
- The site has previously issued legal challenges to scrapers
Green-Light Checklist
- ToS reviewed and does not prohibit automated access
- robots.txt reviewed and target paths are not disallowed
- No personal data collected, or lawful basis documented
- Rate limiting and User-Agent configured
- Data minimisation principles applied
- Audit log mechanism in place
Documentation & Governance
Robust documentation is the foundation of a defensible scraping operation. Whether you face a challenge from a site operator, a subject access request from an individual, or an ICO investigation, your ability to produce clear records of what you collected, why, and how will determine the outcome.
Data Processing Register
Under UK GDPR Article 30, organisations that process personal data must maintain a Record of Processing Activities (ROPA). Each scraping activity that touches personal data requires a ROPA entry covering: the purpose of processing, categories of data subjects and data, lawful basis, retention period, security measures, and any third parties with whom data is shared.
Retention Policies and Deletion Schedules
Define a retention period for every dataset before collection begins. Scraped data should not be held indefinitely — establish a deletion schedule aligned with your stated purpose. Implement automated deletion or pseudonymisation of personal data fields once the purpose is fulfilled. Document retention decisions in your ROPA entry and review them annually.
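A deletion schedule reduces to a date comparison once each dataset records its collection date and retention period. This sketch assumes retention is expressed in days:

```python
from datetime import date, timedelta

def is_due_for_deletion(collected_on: date, retention_days: int, today: date) -> bool:
    """True once the dataset's retention period has elapsed."""
    return today >= collected_on + timedelta(days=retention_days)
```

A scheduled job that runs this check across the scraping register and deletes or pseudonymises expired datasets keeps retention decisions enforced rather than merely documented.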
Incident Response
If your scraper receives a cease-and-desist letter or formal complaint, have a response procedure in place before it happens: immediate suspension of the relevant crawl, preservation of logs, escalation to legal counsel, and a designated point of contact for external communications. Do not delete logs or data when challenged — this may constitute destruction of evidence.
Internal Approval Workflow
- Project owner completes a risk assessment using the four-factor matrix
- ToS review and robots.txt check documented in writing
- Data Protection Officer (or equivalent) signs off on GDPR basis where personal data is involved
- Legal review triggered for medium or high-risk projects
- Technical configuration (User-Agent, rate limits) reviewed and approved
- Project logged in the scraping register with start date and expected review date
Industry-Specific Considerations
While the legal principles covered in this guide apply across all sectors, certain industries present heightened risks that practitioners must understand before deploying a data scraping solution.
Financial Services
Scraping data from FCA-regulated platforms carries specific risks beyond general data protection law. Collecting non-public price-sensitive information could engage market abuse provisions under the UK Market Abuse Regulation (MAR). Even where data appears publicly available, the manner of collection and subsequent use may attract regulatory scrutiny. Use of official data vendors and licensed feeds is strongly preferred in this sector.
Property
Property portals such as Rightmove and Zoopla maintain detailed ToS that explicitly prohibit scraping and commercial reuse of listing data. Both platforms actively enforce these restrictions. For property data projects, consider HM Land Registry's Price Paid Data, published under the Open Government Licence and available for commercial use with attribution.
Learn more about our property data extraction services.
Healthcare
Health data is special category data under Article 9 of UK GDPR and attracts the highest level of protection. Scraping identifiable health information — including from patient forums, NHS-adjacent platforms, or healthcare directories — is effectively prohibited without explicit consent or a specific statutory gateway. Any project touching healthcare data requires specialist legal advice.
Recruitment and Professional Networking
LinkedIn's ToS explicitly prohibits scraping and the platform actively pursues enforcement. Scraping CVs, profiles, or contact details from recruitment platforms also risks processing special category data (health, ethnicity, religion) embedded in candidate profiles. Exercise extreme caution and seek legal advice before any recruitment data project.
E-commerce
Scraping publicly displayed pricing and product availability data is generally considered lower risk, as this information carries no personal data dimension and is deliberately made public by retailers. However, user-generated reviews may contain personal data and are often protected by database right. Extract aggregate pricing and availability data rather than full review text. Our web scraping service can help structure e-commerce data projects within appropriate legal boundaries.
Conclusion & Next Steps
Web scraping compliance in the UK requires careful consideration of multiple legal frameworks and ongoing attention to regulatory developments. The landscape continues to evolve with new case law and regulatory guidance. For businesses seeking professional data services, understanding these requirements is essential for sustainable operations.
Key Takeaways
- Proactive Compliance: Build compliance into your scraping strategy from the outset
- Risk-Based Approach: Tailor your compliance measures to the specific risks of each project
- Documentation: Maintain comprehensive records to demonstrate compliance
- Technical Safeguards: Implement respectful scraping practices
- Legal Review: Seek professional legal advice for complex or high-risk activities
Need Expert Legal Guidance?
Our legal compliance team provides specialist advice on web scraping regulations and data protection law. We work with leading UK law firms to ensure your data collection activities remain compliant with evolving regulations. Learn more about our GDPR compliance services and comprehensive case studies showcasing successful compliance implementations.
Request Legal Consultation