Beyond Privacy Friction: Making Google Search Data Sharing Both Safe and Useful
My Submission to the European Commission on DMA.100209, the Consultation on the Proposed Measures for Google Search data sharing (Article 6(11) of the DMA)
There is a very difficult problem sitting at the centre of the European Commission’s consultation in Case DMA.100209 on Google Search data sharing under Article 6(11) of the Digital Markets Act.
It is not a simple privacy-versus-competition problem. That framing is too crude. Article 6(11) exists because access to search data can matter for contestability. Query, ranking, click and view data are not incidental traces. They are part of the feedback infrastructure that improves search quality. A gatekeeper that benefits from scale in those signals should not be able to turn that scale into permanent insulation from competition. That is the force of the DMA obligation, and I support it.
A search query may reveal something about the searcher. It may also contain personal data that appears to concern another person. In practice, however, the system will often have no reliable way to distinguish between a query that contains personal data about the end user who issued it and a query that contains personal data about someone else. The safer approach is to treat personal data contained in the query text as part of the end-user anonymisation problem rather than assume it can be conceptually carved away.
This is why I have submitted a response to the Commission consultation. My concern is not that the Commission is wrong to pursue the Article 6(11) objective. On the contrary, the contestability objective is legitimate and important. My concern is that the final measures must not invent a privacy fiction to make the access obligation administratively convenient.
A useful but unsafe dataset fails data protection law. A dataset that reaches anonymity only by stripping out all meaningful contestability value may comply in form while failing the purpose of Article 6(11). The hard work is to respect the anonymisation requirement first and then ask how much useful access can still be preserved.
The danger, as I see it, is that search data might be treated as anonymised because obvious identifiers have been removed, certain thresholds have been applied, and recipients are contractually prohibited from re-identifying users. Contracts matter. Recipient obligations matter. Audit matters. Purpose limitation matters. But contracts cannot be asked to do the conceptual work of anonymisation if the underlying access model remains too risky. Protecting privacy by contract is not enough unless the technical, organisational and institutional controls genuinely change the realistic means of identification available in the recipient environment.
At the same time, privacy cannot become a veto over the DMA. If the answer to the privacy risk is simply to suppress local, fresh, entity-rich, sequence-based, and some rare-query signals rather than to ask whether they can be delivered through safer access modalities, then Article 6(11) risks becoming a formal right to degraded access.
So, my submission is solution-based. I argue that the Commission should move from a single transformed record-level dataset to a Safe Search Data Access Regime. That means recognising that “access” is not the same thing as unrestricted possession. Lower-risk, common-tail data may be suitable for export. Aggregate signals may be suitable for privacy-preserving release. Data that has been anonymised but remains unsuitable for ordinary download due to marginally elevated operational, combination, or output-leakage risk may require controlled API access, clean-room access, trusted execution, or regulator-supervised escrow. Data that cannot be anonymised to the Article 6(11) standard should not be disclosed through a clean room; it should be suppressed, aggregated, or used only to generate a privacy-tested model or to provide access to outputs. Some data may still need to be suppressed. But the decision should be evidence-led, reviewable and tied to both privacy risk and search utility.
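To make the tiering concrete, here is a minimal sketch of how the routing decision described above might be expressed. It is an illustration under stated assumptions, not a proposed implementation: the names `AccessMode`, `DataCategory` and `route`, and the three boolean inputs, are hypothetical, and in practice each input would be the outcome of an evidence-led assessment rather than a simple flag.

```python
from dataclasses import dataclass
from enum import Enum


class AccessMode(Enum):
    # Hypothetical labels for the access modalities named in the text.
    EXPORT = "ordinary export"
    AGGREGATE_RELEASE = "privacy-preserving aggregate release"
    CONTROLLED_ACCESS = "controlled API, clean room, trusted execution or escrow"
    SUPPRESS_OR_DERIVE = "suppress, aggregate, or expose only a tested model or outputs"


@dataclass(frozen=True)
class DataCategory:
    name: str
    # Assumed input: did the category meet the Article 6(11) anonymisation
    # standard in the relevant recipient environment?
    meets_anonymisation_standard: bool
    # Assumed input: is this an aggregate signal rather than record-level data?
    is_aggregate: bool
    # Assumed input: residual operational, combination or output-leakage risk.
    residual_risk_elevated: bool


def route(category: DataCategory) -> AccessMode:
    """Pick an access modality, checking anonymisation before anything else."""
    # Data that fails the anonymisation standard is never sent to a clean room.
    if not category.meets_anonymisation_standard:
        return AccessMode.SUPPRESS_OR_DERIVE
    # Anonymised, but too risky for ordinary download.
    if category.residual_risk_elevated:
        return AccessMode.CONTROLLED_ACCESS
    if category.is_aggregate:
        return AccessMode.AGGREGATE_RELEASE
    return AccessMode.EXPORT
```

The ordering of the checks is the point of the sketch: the anonymisation question is answered first, and the choice among export, aggregate release and controlled access only arises for data that has passed it.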
The central proposal is an Anonymisation and Utility Impact Assessment. That assessment should ask first whether the data has genuinely been anonymised in the relevant recipient environment, and then whether the resulting access still supports real search-improvement tasks: ranking, query understanding, local search, freshness, relevant head and torso learning, carefully controlled rare-query analysis, click modelling and evaluation of online search engine (OSE) functionality in AI-enabled interfaces.
The AI dimension makes this even more urgent. The preliminary measures appear to assume that a chatbot with OSE functionality is eligible. The key point is that eligibility should not be conflated with service-wide access. Article 6(11) must not become a route for general-purpose model training, fine-tuning of non-OSE systems, model grounding outside the OSE function, advertising enrichment, identity-graph improvement or unrelated product optimisation. Access should follow the genuine online search engine function: retrieval, indexing, ranking, query understanding, SERP interaction and evaluation of that function. It should not become a general AI data pipeline.
There is also a broader institutional issue here. Europe cannot afford another unstable meaning of anonymisation. We already have debates across the GDPR, the Digital Omnibus, DMA implementation, DPA guidance, pseudonymisation guidance and AI governance about when data should be treated as personal, pseudonymised, anonymous, or contextually outside the realistic scope of identification. If one regime treats anonymisation as almost unreachable, while another appears to accept a more flexible contract-dependent model, the EU digital rulebook will become even less coherent.
The better answer is not absolutism. It is disciplined contextualism. Personal data is not a mystical status attached to information forever. Identifiability depends on the actor, the context, the means reasonably likely to be used, the legal constraints, the technical safeguards and the factual access environment. But that contextual analysis has to be honest. It cannot become a convenient label for weak anonymisation. Nor can it be stretched so far that every dataset remains personal for everyone, everywhere, forever.
The Commission has an opportunity here to do something important. It can make Article 6(11) work as a serious tool for contestability. It can protect users from a reckless search-log release model. It can provide smaller search engines with useful signals. It can ring-fence AI use. It can clarify the responsibilities among Google, recipients, auditors, DPAs, and the Commission. And it can avoid creating yet another incoherent layer in the law of anonymisation.
That is the point of my submission. Do not weaken Article 6(11). Do not pretend that search logs become anonymous by formula. Do not lower anonymisation to preserve utility. Once anonymisation is achieved, preserve contestability through a tiered, evidence-led and independently reviewable access regime.
#DMA #DigitalMarketsAct #GoogleSearch #DataProtection #GDPR #Privacy #Anonymisation #Pseudonymisation #CompetitionLaw #DigitalRegulation #PlatformRegulation #AIRegulation #Search #EuropeanCommission #DigitalOmnibus #LegalCertainty #DataGovernance

