Disclaimer: This article is for general information purposes only and does not constitute legal advice. The legal landscape around web scraping is evolving and jurisdiction-specific. Businesses should seek qualified legal counsel before commencing any web scraping activity, particularly where personal data or cross-border data flows are involved.
Web scraping sits at the intersection of technology, intellectual property, data protection, and computer access law. Neither the UK nor the US has enacted legislation aimed specifically at web scraping, so businesses must understand how existing laws apply, and they apply differently on each side of the Atlantic. For UK organisations working with British or American data sources, understanding both frameworks is increasingly important.
UK Legal Framework
Computer Misuse Act 1990
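The Computer Misuse Act 1990 (CMA) is the primary piece of UK legislation that could render web scraping unlawful in certain circumstances. Its three core offences are unauthorised access to computer material (section 1), unauthorised access with intent to commit or facilitate further offences (section 2), and unauthorised acts intended to impair, or reckless as to impairing, the operation of a computer (section 3, which the Police and Justice Act 2006 substituted for the original "unauthorised modification of computer material" offence).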
Whether web scraping constitutes "unauthorised access" under the CMA depends on the circumstances. Scraping publicly accessible web pages that carry no access restrictions is unlikely to fall within the Act. However, scraping pages that require authentication, circumventing technical access controls, or deliberately overloading a server to obtain data could engage the CMA. The courts have not yet definitively ruled on the boundary, which means caution and legal advice remain essential for anything other than straightforward public data collection.
UK GDPR
The UK General Data Protection Regulation — retained and adapted from EU GDPR following Brexit — applies whenever scraped data includes personal data. Personal data is broadly defined under UK GDPR: it encompasses any information relating to an identified or identifiable living individual. This includes names, email addresses, phone numbers, IP addresses in certain contexts, and combinations of data points that could identify someone even if no single field does so alone.
Where web scraping involves personal data, the organisation undertaking the scraping (or commissioning it) must identify a lawful basis for processing. The most commonly applicable basis in a commercial scraping context is legitimate interests under Article 6(1)(f) of the UK GDPR, but this requires a documented balancing test demonstrating that the processing is necessary and that the individual's interests do not override the legitimate interest claimed.
ICO Guidance
The Information Commissioner's Office has published guidance relevant to web scraping in the context of training AI systems and data collection more broadly. The ICO's position emphasises that publicly available personal data does not become exempt from UK GDPR simply by virtue of being accessible online. Organisations scraping personal data from public sources must still satisfy the lawful basis requirements, provide appropriate transparency, and respect data subject rights including the right to object.
Publicly Available Data vs Protected Data
A practical distinction that informs UK compliance is between truly public data and data that is publicly accessible but protected by database rights or contractual restrictions. The database right, which derives from the EU Database Directive and is implemented in the UK by the Copyright and Rights in Databases Regulations 1997, protects substantial investment in obtaining, verifying, or presenting the contents of a database. A website that has assembled a comprehensive dataset, such as a property portal's listings database, may have database rights over the compiled collection even if individual listings are viewable by anyone. Extracting a substantial part of such a database without a licence, or repeatedly and systematically extracting smaller parts, may infringe those rights independently of any personal data considerations.
US Legal Framework
Computer Fraud and Abuse Act (CFAA)
The primary US statute that has been used to challenge web scraping is the Computer Fraud and Abuse Act (CFAA), a federal law originally enacted in 1986 to criminalise hacking. The CFAA prohibits accessing a computer "without authorization" or in a manner that "exceeds authorized access." For many years, website operators argued that scraping in violation of their terms of service constituted access without authorisation, potentially exposing scrapers to criminal liability.
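The scope of the CFAA as applied to scraping was substantially narrowed by the US Supreme Court's 2021 decision in Van Buren v United States. The Court held that a person "exceeds authorized access" only by entering parts of a computer system, such as files, folders, or databases, that are off-limits to them, not by misusing access they are otherwise entitled to. It expressly left open whether those limits must be technological or can also be set by contract or policy, but the decision is widely read as reducing the risk that scraping publicly accessible data in breach of terms of service alone could be prosecuted under the CFAA.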
hiQ v LinkedIn
The landmark case of hiQ Labs v LinkedIn Corporation has shaped the US legal position on scraping public data more directly. In a series of rulings from 2019 through to the Ninth Circuit's 2022 decision following the Van Buren ruling, US courts held that scraping data from publicly accessible web pages — pages that require no login to view — is unlikely to constitute a CFAA violation. LinkedIn's attempt to use the CFAA to prevent hiQ from scraping public profile data was ultimately unsuccessful at the Ninth Circuit level.
This does not mean scraping is unrestricted in the US. The hiQ decisions bind only courts within the Ninth Circuit and are merely persuasive elsewhere, and claims in tort, copyright, or breach of contract remain available to website operators regardless of the CFAA outcome.
State Laws: CCPA and Beyond
The United States lacks a federal equivalent to the UK GDPR, but state-level privacy laws are proliferating. The California Consumer Privacy Act (CCPA), as amended by the California Privacy Rights Act (CPRA), grants California residents rights over their personal data and imposes obligations on businesses processing that data. Organisations that scrape personal data including California residents' information, and that meet the Act's applicability thresholds, may have CCPA obligations, including providing privacy notices and honouring opt-out requests.
As of early 2026, more than a dozen US states have enacted comprehensive privacy legislation. The regulatory map is complex and changing rapidly.
robots.txt as Guidance, Not Law
In the US, as in the UK, a website's robots.txt file is a technical instruction rather than a legally binding prohibition. Courts have not uniformly treated violation of robots.txt as independently unlawful. However, ignoring explicit robots.txt disallow instructions can be relevant to arguments about whether access was authorised, and doing so knowingly may weaken a scraper's legal position in subsequent litigation.
Key Differences Between UK and US Frameworks
Personal Data: GDPR vs No Federal Standard
The most significant practical difference for businesses is the absence of a federal personal data protection law in the US comparable to the UK GDPR. UK organisations scraping personal data face clear, enforceable obligations: lawful basis, data minimisation, data subject rights, ICO accountability. US organisations face a patchwork of state laws that may or may not apply depending on whose personal data is involved and where that person resides.
For UK businesses scraping US-hosted sources that contain personal data, UK GDPR applies to the processing activity regardless of where the data originates. The obligation travels with the data controller, not with the data.
UK CMA vs CFAA: Scope and Application
The UK's Computer Misuse Act has been applied in far fewer scraping-specific contexts than the US CFAA, which has generated extensive case law. The post-Van Buren interpretation of the CFAA gives comparatively clear guidance that scraping publicly accessible pages is unlikely to violate the Act. The CMA's application to scraping remains less tested in UK courts.
Database Rights
The UK retains database rights derived from EU law that provide additional protection for substantial investments in database creation. The US provides no equivalent database right — in the US, facts are not copyrightable regardless of the effort invested in compiling them. This means UK-hosted databases enjoy a layer of protection against systematic extraction that US-hosted databases do not.
What This Means for UK Businesses Hiring a Scraping Provider
Questions to Ask Your Provider
- How do you assess whether a target source is legally accessible for scraping? A competent provider should have a documented pre-project compliance review process.
- What is your approach to personal data encountered during extraction? The answer should reference UK GDPR obligations, not just technical data handling.
- Do you maintain records of your legal basis for processing personal data? This is required under UK GDPR and should be a standard deliverable on any project touching personal data.
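- Where is extracted data stored and processed? UK GDPR does not require UK data residency, but it restricts transfers of personal data to countries not covered by UK adequacy regulations unless appropriate safeguards are in place, so the storage location matters.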
- How do you handle websites' robots.txt instructions and terms of service? Responsible providers respect these signals even where they are not strictly legally binding.
GDPR Compliance Checklist for Web Scraping Projects
- Identify all fields in the target dataset that constitute personal data
- Establish and document a lawful basis for processing each category of personal data
- Conduct a legitimate interests assessment or a Data Protection Impact Assessment (DPIA) as appropriate
- Apply data minimisation — do not collect personal data fields that are not required
- Ensure data is stored in the UK, in a country covered by UK adequacy regulations, or under appropriate transfer safeguards
- Define and document retention periods for scraped personal data
- Ensure data subject rights (access, erasure, objection) can be fulfilled
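The minimisation and retention items in this checklist can be enforced in code rather than left to policy. The following is a minimal Python sketch under stated assumptions: the field names, the allowlist, and the 90-day retention period are all hypothetical and would come from the project's own documented assessment.

```python
from datetime import date, timedelta

# Hypothetical allowlist: only the fields the stated purpose requires.
ALLOWED_FIELDS = {"listing_id", "price", "postcode_district"}

# Hypothetical retention period taken from the project's documentation.
RETENTION_DAYS = 90


def minimise(record: dict) -> dict:
    """Drop every field that is not explicitly allowlisted."""
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}


def is_expired(collected_on: date, today: date) -> bool:
    """Return True once a record has outlived the retention period."""
    return today - collected_on > timedelta(days=RETENTION_DAYS)


# Hypothetical scraped record containing fields the project does not need.
raw = {
    "listing_id": "A-1001",
    "price": 325000,
    "postcode_district": "SW1",
    "agent_name": "Jane Example",       # personal data, not required
    "agent_email": "jane@example.com",  # personal data, not required
}

print(minimise(raw))
```

An allowlist is used here instead of a blocklist so that any new field appearing on the source site is excluded by default until someone has assessed it.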
Best Practices That Keep You Compliant in Both Jurisdictions
Respect robots.txt
Honour disallow instructions in robots.txt files, particularly for URLs that clearly signal restricted access. Beyond the legal considerations, this is a mark of professional conduct that reduces the risk of dispute with website operators.
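As an illustration, a disallow check might look like the sketch below, which uses the Python standard library's urllib.robotparser. The robots.txt content, the crawler name ExampleBot, and the example.com URLs are hypothetical; a real crawler would typically load the live file with set_url() and read() instead of parsing an inline string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /account/
Disallow: /search
"""

USER_AGENT = "ExampleBot"  # hypothetical crawler name


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a reusable rule checker."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str) -> bool:
    """Return True only if the rules permit USER_AGENT to fetch the URL."""
    return parser.can_fetch(USER_AGENT, url)


parser = build_parser(ROBOTS_TXT)
for url in (
    "https://example.com/listings/123",
    "https://example.com/account/settings",
):
    print(url, "->", "fetch" if is_allowed(parser, url) else "skip")
```

Running the check before every request, and logging the result, also produces a record that the instructions were consulted.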
Do Not Scrape Personal Data Without Lawful Basis
Regardless of whether data is publicly accessible, establish and document your lawful basis before extracting personal data. Under UK GDPR, publicly available personal data is still personal data. Under US state laws, similar obligations increasingly apply.
Rate Limiting
Send requests at rates that replicate reasonable human browsing behaviour rather than maxing out your scraping infrastructure. Aggressive scraping that degrades a website's performance for other users creates legal exposure under the CMA (disruption of computer services) and CFAA (damage to a protected computer) and is ethically indefensible.
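One simple way to enforce this is a fixed minimum delay between consecutive requests to the same host. The sketch below is an assumption about how such a limiter might be written, not a description of any particular provider's tooling; the class name and the interval are hypothetical, and the appropriate interval depends on the target site.

```python
import time
from typing import Callable, Optional


class RateLimiter:
    """Enforce a minimum delay between consecutive requests to one host."""

    def __init__(
        self,
        min_interval_s: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval_s = min_interval_s
        self._clock = clock  # injectable so tests need no real delays
        self._sleep = sleep
        self._last_request_at: Optional[float] = None

    def wait(self) -> float:
        """Block until the next request is permitted; return seconds slept."""
        delay = 0.0
        if self._last_request_at is not None:
            elapsed = self._clock() - self._last_request_at
            delay = max(0.0, self.min_interval_s - elapsed)
        if delay > 0:
            self._sleep(delay)
        # Record the moment the request is released, after any sleep.
        self._last_request_at = self._clock()
        return delay
```

In use, a crawler would create one limiter per host, for example RateLimiter(min_interval_s=2.0), and call wait() immediately before each request. A production crawler would usually also back off when the server returns HTTP 429 or 503 responses.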
Terms of Service Review
Review the terms of service of any website you intend to scrape. Where a ToS explicitly prohibits scraping, the risk profile of the project increases — not because ToS violations are automatically unlawful, but because an explicit prohibition is relevant evidence in any subsequent dispute. In some cases, a commercial data licence may be the appropriate path.
Document Everything
Maintain records of your compliance assessments, lawful basis determinations, and technical measures. Documentation demonstrates good faith and is required under UK GDPR's accountability principle. It is also your primary defence if a question is ever raised about your scraping activities.
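Records are easier to maintain when they are structured and stored alongside the project. The sketch below shows one possible shape for a per-source compliance record in Python; UK GDPR does not prescribe this format, and every field name and value here is a hypothetical example.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date
from typing import List


@dataclass
class ComplianceRecord:
    """One entry in a hypothetical scraping compliance log."""

    source_url: str
    assessed_on: str                 # ISO 8601 date of the review
    personal_data_fields: List[str]  # fields identified as personal data
    lawful_basis: str                # documented basis, if personal data is in scope
    robots_txt_checked: bool
    tos_reviewed: bool
    notes: str = ""


def to_json(record: ComplianceRecord) -> str:
    """Serialise a record deterministically for storage or audit."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


# Hypothetical example entry.
record = ComplianceRecord(
    source_url="https://example.com/listings",
    assessed_on=date(2026, 1, 15).isoformat(),
    personal_data_fields=["agent_name", "agent_email"],
    lawful_basis="legitimate interests (Art. 6(1)(f))",
    robots_txt_checked=True,
    tos_reviewed=True,
    notes="Personal data fields excluded from extraction.",
)

print(to_json(record))
```

Keeping one such record per source, versioned with the scraper's configuration, ties each assessment to the extraction work it authorised.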
How UK Data Services Handles Compliance
Every engagement with UK Data Services begins with a compliance review before any extraction work commences. We assess the legal basis for the project under UK GDPR, identify any personal data in scope, review the terms of service of target sources, and produce a written compliance summary that forms part of the project documentation.
We operate exclusively on UK data infrastructure, apply data minimisation by default, and do not extract personal data fields that are not necessary for the client's stated purpose. Our team stays current with ICO guidance and case law developments in both the UK and US jurisdictions relevant to our clients' projects.
Where a project raises compliance questions that require legal advice beyond our internal review — complex cross-border data flows, novel legal questions, or high-risk processing — we will say so clearly and recommend that the client seeks specialist legal counsel before we proceed.
Navigate Compliance with a Provider That Takes It Seriously
The legal landscape around web scraping is not static, and the differences between UK and US frameworks are material for businesses operating across both. Working with a provider that treats compliance as an engineering constraint rather than an afterthought is the most effective way to manage this risk.
Have a scraping project with compliance questions? Our team will walk through the requirements with you and provide a clear compliance assessment as part of every proposal.