
Data clean room: A silver bullet to a post-cookie transition?

Posted December 30, 2022 · 13 min read


Ad IDs across platforms are falling like dominoes. Apple’s ATT framework has already curtailed ad targeting capabilities. The upcoming third-party cookie deprecation will further restrict access to audience knowledge.

In addition, emerging ad spaces like connected TVs, retail ads, and DOOH require a whole new set of cross-industry identifiers and data exchange mechanisms.

It looks like the ad industry is ready to enter a clean slate status, or more precisely — a data clean room state.

In this post, you’ll learn:

  • A quick data clean room definition
  • How data clean rooms work
  • What propels the demand for clean rooms
  • Which data clean room use cases pioneers pursue
  • The pros and cons of using data clean rooms
  • Which alternatives to data clean room technology exist

What is a data clean room?

A data clean room is an isolated, secured, and permissioned environment where publishers and advertisers can combine, match, analyze, and collaborate on anonymized datasets. An anonymized dataset contains first-party data cleansed of any personally identifiable information (PII).

Each party decides which type of data to disclose to the other party using the provided access controls.

Data clean room providers help brands and publishers broker a quid pro quo deal: you share your user identifiers and I’ll bring mine. Then we compare the parameters to learn more about our audiences and make more informed decisions.

A data clean room can serve a range of purposes, from ad frequency capping to customer lifetime value measurement.

[Figure: Data clean room use cases]

Sure, you can now do all of the above via SSP and DSP platforms. Yet, data abundance will soon come to an end. Data clean rooms can help the AdTech industry maintain the same high-precision marketing while relying primarily on first-party data.

[Quote image: insights by Puneet Gangrade, Customer Success Engineer at Habu]

How does a data clean room work?

Think of a data clean room as a secure chamber with two doors and a counter in the middle. Each side comes in carrying some papers (aka first-party data), sits down at the counter, and starts comparing the numbers they have (aka data matching).

This process can be governed by some underlying conditions (e.g., each side comes with an equal number of papers or can only submit whole numbers) to build a common understanding of the information each has.

For instance, a retail company comes in with its audience and transactional data and combines it with an audience dataset from a media publisher to figure out which shoppable content will best perform on its properties.

On the technology side, the mechanics of a data clean room work like this:

  1. Data ingestion. First-party data, stripped of PII and any other user identifiers, enters the clean room. This step requires fast and secure data cleansing, transformation, and encryption processes.
  2. Partner connection. Participants provide permission for data disclosures. Big data algorithms parse the ingested data to identify correlative patterns (e.g., detect the same users). User identity resolution can happen in three ways: using deterministic, probabilistic, or integrated matching. You can obtain direct matches, near matches, or lookalikes.
  3. Data enrichment. To improve matching accuracy, data clean room providers allow data enrichment. You can bring in third-party data sources or tools to augment supplied datasets.
  4. Data mining. Using various algorithms and machine learning solutions, you can explore different analytics use cases. For example, search for data overlaps to introduce better ad frequency capping. Or apply propensity scoring models to identify new lookalike audiences for targeting.
  5. Action. Participants can then use the mined insights for other tasks like campaign planning, channel-level attribution, audience building, media mix model enhancements, or advanced personalization.
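
To make the ingestion and matching steps above more concrete, here is a minimal Python sketch of deterministic matching. It rests on assumptions that are not part of any vendor's documented process: both parties agree on a shared secret key out of band, raw emails are replaced with keyed SHA-256 digests before upload, and a match means exact equality of those digests. The names `SHARED_KEY`, `pseudonymize`, `ingest`, and `match` are hypothetical, and the records are made up.

```python
import hashlib
import hmac

# Hypothetical key that both parties agree on outside the clean room.
SHARED_KEY = b"demo-key-agreed-by-both-parties"

def pseudonymize(email: str) -> str:
    """Normalize an email and replace it with a keyed SHA-256 digest."""
    normalized = email.strip().lower()
    return hmac.new(SHARED_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()

def ingest(records: dict) -> dict:
    """Step 1: swap the raw identifier for a pseudonymous key before upload."""
    return {pseudonymize(email): attrs for email, attrs in records.items()}

def match(left: dict, right: dict) -> set:
    """Step 2: deterministic matching, i.e. exact equality of pseudonymous keys."""
    return left.keys() & right.keys()

# Made-up sample data for a retailer and a publisher.
retailer = ingest({
    "Ann@example.com": {"purchases": 3},
    "bob@example.com": {"purchases": 1},
    "cat@example.com": {"purchases": 7},
})
publisher = ingest({
    " ann@example.com": {"segment": "sports"},
    "cat@example.com": {"segment": "cooking"},
    "dan@example.com": {"segment": "travel"},
})

overlap = match(retailer, publisher)
overlap_rate = len(overlap) / len(retailer)
print(f"matched users: {len(overlap)}, overlap rate: {overlap_rate:.0%}")
```

Neither side ever sees the other's raw emails in this sketch. Normalizing before hashing is what lets "Ann@example.com" and " ann@example.com" resolve to the same user. Probabilistic and integrated matching would need additional signals and are not covered here.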

In practice, however, data clean room technology is more fragmented.

The emerging data clean room ecosystem

The concept of partitioning access to data isn’t new. Google launched Ads Data Hub back in 2017 as a solution for cross-validating your first-party data against user-level data contained within Google’s ecosystem.

A year later, Facebook released a similar feature for its biggest advertisers. Amazon Marketing Cloud also lets you combine proprietary data with Amazon Ads signals.

But this was just the cusp of bigger developments.

Privacy regulations got tighter. New marketing environments emerged: connected TV, retail media, and the metaverse. User attribution and ad targeting became more challenging as marketers lacked sufficient data. This, in turn, hurt marketing efficacy:

[Figure: Effects of poor data governance on marketing performance, by Digital Commerce]

With third-party cookies no longer being a reliable (or lasting) industry identifier, the AdTech industry started evaluating alternative cookieless solutions.

Data clean rooms emerged as one of the strongest contenders.

Snowflake, Habu, LiveRamp, and InfoSum introduced new infrastructure for orchestrating secure first-party data exchanges between participants. With the supplied software, users can ensure that their ad targeting data is accurate, consistent, and reliable, and that it provides a multidimensional view of their audiences and campaigns.

Big brands and media companies with ample first-party data reserves were particularly hooked. So they teamed up with AdTech vendors to build private data clean room infrastructure. Disney, Omnicom Media, and Walgreens operate privately-owned data clean rooms, created in tandem with the infrastructure providers.

The major appeal of owning a data clean room is the full retention of data, paired with its enrichment by partners. Advertisers, in turn, are attracted by more in-depth information about customer journeys, spending behaviors, and cross-channel attribution capabilities.

That said, not every company can afford to operate a private data clean room. Or line up data exchange deals every other week.

Nascent data marketplace

To add extra value to users, some AdTech players decided to follow the lead of Big Tech firms: pre-populate their clean room ecosystem with extra data. Apart from bringing your own insights (and your partner’s), data clean room users can also benefit from the pre-loaded customer insights.

For instance, LiveRamp operates a data marketplace of anonymized user-level data from global media platforms, agencies, analytics environments, and TV partners. Users can both run private exchanges and augment their first-party data with third-party insights obtained via integrations with popular ad measurement solutions and identity resolution companies.

In January 2022, NBCUniversal launched NBCUnified — a unified data + identity platform advertisers can use to better track audiences across channels and market to them. General Motors (GM) was the first brand to merge its data with NBCUnified to improve its campaign performance.

Amazon and Habu, on the other hand, pursue a different strategy for scaling their data clean room offering. Habu is building out a cloud-agnostic platform that allows anyone to safely connect with another party without moving their data from the respective cloud.

Amazon, in turn, launched a clean room solution for AWS (in preview) in December 2022. Users can set up multiple data clean rooms using the available infrastructure to provide permission-based access to first-party data or integrate extra insights from third-party sources.

In short, we have three distinct types of players on the market:

  • Big Tech firms that allow data in (aka bringing your own data for cross-validation), but no data out.
  • Data clean room software providers that either provide the data clean room solution for brokering 1:1 deals or give access to their ecosystem of first-party data, collected through partners.
  • Brand-led data clean room projects built atop AdTech-supplied infrastructure.

What’s the difference between a CDP and a data clean room?

Customer data platforms (CDPs) share a lot of similarities with data clean rooms — both support secure, at-scale data collection. However, a CDP is primarily used for internal user data collection and aggregation. You can store user IDs and PII in there as long as they were compliantly collected. Full data is visible to all users, but you can’t share most (if any) of the data with third parties.

[Figure: Customer data platform vs. data clean room comparison]

Mostly, a CDP serves its owner; a data clean room is a neutral ground. But that isn’t always the case. As Allison Schiff, managing editor at AdExchanger, pointed out:

Not every company that has or calls itself a data clean room does the same thing or takes the same approach.

Big Tech players operate single-party centralized clean rooms where you can bring in your data but can’t extract any owned information. Then you have multi-party centralized clean rooms, where participants consolidate data and run cross-dataset analytics. Other vendors offer agnostic data clean rooms, where multiple brands can broker private deals without ever combining all the data together.

To establish common ground, IAB Tech Lab is set to release a new clean room standard in December 2022.

The gist:

  • Issue standards for data clean room security and privacy
  • Draw the line between neutral clean rooms, walled gardens, and CDPs
  • Establish interoperability between data clean rooms

[Quote image: insights from Bosko Milekic, CPO and Co-Founder at Optable, member of IAB Tech Lab]

CDPs will likely remain an internal solution for first-party data collection, while data clean rooms will be used collaboratively.

Today, you can use a CDP without a data clean room to run internal analytics or even invite partners for data cross-validation. But it’s hard to join a data clean room partnership without a CDP (or an equivalent first-party data management solution).


Why is there so much demand for data clean rooms?

As browser- and device-supplied knowledge is fading out, first-party data infrastructure and identity tech integrations are becoming the new staple of the industry.

In the post-cookie landscape, SSPs and DSPs capable of securely processing bid requests with first-party data will command a bigger advertising market share. Many advertisers also look for ways to curb ad budget waste and improve cross-channel measurement.

Gartner expects that 80% of advertisers with $1+ billion media budgets will be using data clean room technology in 2023.

That prediction looks plausible since many big-name brands are already in the game.

In 2022 Snowflake got a joint stake in OpenAP — a new connected TV ad measurement product — supported by Warner Bros. Discovery, NBCUniversal, Fox, and Paramount.

Roku, in turn, already has a clean room for CTV. Roku’s data clean room promises advertisers the freedom to query matched data and run their own analyses to understand potential campaign reach, audience delivery, and advertising impact on product sales and other outcomes. Omnicom Media Group, Dentsu, Horizon Media, Icon Media Direct, and Camelot are among the early adopters.

As browser cookie and mobile ID data disappear, first-party data from direct consumer relationships stands out as the gold standard for advertising. At the same time, there is growing demand and need by consumers to ensure that their data is protected. Roku’s clean room allows marketers to match their first-party data directly to Roku data in a secure environment.

Louqman Parampath, Roku VP of Product Management, Advertising

The retail media advertising market faces an attribution crisis similar to that of CTV. Multinationals need extra customer data to devise better campaigns and co-branding deals. But audience insights are either locked within walled gardens or siloed across multiple entities in the AdTech chain. Data clean rooms can provide better monetization capabilities for retailers without exposing them to compliance risks.

3 real-world use cases of data clean rooms

Data clean rooms emerged to fill a pressing market gap: the lack of common identifiers. Across the board, brands and publishers struggle to identify users who browse on different devices or come from different ad channels or locations.

Data clean rooms facilitate user identification and, subsequently, audience building. In short: You get access to a privately-built or partner-supplied data pool you can query in a privacy-friendly way. All the data, shared or integrated into the clean room, is stripped of any user identifiers, which minimizes compliance risks.
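
What "querying in a privacy-friendly way" can look like is easiest to show with a small example. The Python sketch below assumes one common approach: the clean room returns only aggregated counts and suppresses any cohort smaller than a minimum size. The threshold value, the `segment_counts` helper, and the sample rows are all hypothetical, and real platforms define their own rules.

```python
from collections import Counter

MIN_COHORT_SIZE = 3  # hypothetical threshold; real platforms choose their own

def segment_counts(matched_rows, min_size=MIN_COHORT_SIZE):
    """Return user counts per segment, suppressing cohorts below min_size."""
    counts = Counter(row["segment"] for row in matched_rows)
    return {segment: n for segment, n in counts.items() if n >= min_size}

# Made-up matched rows; "user" stands for a pseudonymous ID, not PII.
matched_rows = [
    {"user": "u1", "segment": "sports"},
    {"user": "u2", "segment": "sports"},
    {"user": "u3", "segment": "sports"},
    {"user": "u4", "segment": "cooking"},
    {"user": "u5", "segment": "travel"},
    {"user": "u6", "segment": "travel"},
]

report = segment_counts(matched_rows)
print(report)
```

Only the "sports" cohort clears the threshold in this sample, so the one- and two-person cohorts never leave the clean room. The analyst gets a usable audience size without being able to single out an individual.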

With data clean rooms, advertisers can improve:

  • Ad measurement and targeting
  • Frequency capping
  • Customer cohort analysis
  • New audience discovery
  • Campaign budget optimization

To give you some extra context, we’ve lined up three examples of how data clean rooms are deployed to solve the above challenges.

Audience insights and attribution for CTV advertising

Ad spending on connected TV ads is set to increase by 14.4% in 2023. Yet, CTV measurement remains a sore spot.

Due to missing unified IDs, brands can’t prevent duplication across all popular TV OS platforms. Buyers painfully lack data on ad viewability, validity, and sometimes delivery. Compared to programmatic ad platforms for other channels, CTV offers “basic” audience insights, rarely going beyond demographic data.

Big guns like Disney promise to change that with data clean room technology. Disney unveiled a proprietary clean room in October 2021, together with Habu, InfoSum, and Snowflake. A year later, it also signed up for the Unified ID 2.0 initiative, led by The Trade Desk, to further expand the platform’s interoperability.

Advertisers with their own data can leverage it within Disney’s Clean Room for better planning and campaign insights. For those wanting to supplement their own data, they can utilize third-party segments through Habu for activation and analytics in Disney’s advertising environments. To meet various industry-specific needs, we have pre-sourced data, queries, and visualizations built within a safe, clean room environment.

Matt Kilmartin, Co-founder & CEO of Habu

Disney’s data clean room supplies advertisers with 2,000+ granular audience segments, varying from “chief decision-maker for CPG goods” to “shopping for a new car in the next 6 months.” With content consumption on the rise, Disney can capture even more behavioral and psychographic customer information and pitch it to advertisers for targeting.

The adoption of Unified ID 2.0 will further help Disney with user attribution across platforms and provide platform users with tools for better cross-channel attribution.

Retail media offsite audience extension

Data clean rooms can help retailers better identify offline and online shoppers, plus recognize them in other environments — among TV watchers or mobile gamers. By combining offline and online owned data, retailers can also improve upsell and cross-sell deals to existing customers to boost CLTV.

In fact, that’s what Amazon has been doing for years. The giant has been using proprietary data clean rooms to better understand shopper behaviors and fuel its advertising business (which now makes more money than Amazon Prime, video, audio, and eBook subscriptions combined).

Other retailers can create similar “private” clean rooms to make more cents from retail media advertising and broker data exchanges with select partners. For example, run co-branding partnership deals with retailers, sell to lookalike audiences, or place shoppable content with target publishers.

Partnering up with an AdTech company is another option.

In August 2022, IRI and Epsilon signed a deal to create the first data clean room solution for CPG brands. Epsilon supplies the tech, while IRI gives transactional datasets collected using its CORE IDs across all brand touchpoints, devices, and customer groups.

If all goes well, CPG brands will be able to identify the highest-value consumer prospects in joint datasets, improve activation across owned and paid channels with partners, and significantly improve media planning.

Enhanced cross-platform ad measurement and optimization

Between programmatic ad fraud and the “AdTech tax” (fees DSPs/SSPs take from each ad sale), brands and publishers grow frustrated with mounting cross-channel ad waste. Ads are shown to the same person over and over again, resulting in high CPC and low conversions.

Unilever decided to throw a data clean room at this problem. The company is building a clean room to channel anonymized data from Google, Facebook, and Twitter ad campaigns, plus TV data, to neutral measurement partners (Kantar and Nielsen). Both will perform cross-validation and measurement to identify budget waste.

The cross-platform measurement initiative we’re building will allow us to see more clearly the connection between our communications and the outcomes we’re getting when it comes to attribution and sales.

Luis Di Como, SVP of Global Media at Unilever

Unilever’s data clean room serves as a “processing hub.” Sensitive data remains within the walled gardens. This setup reduces privacy risks for Unilever and its select publishers, who also get access to the clean room for placement optimization.
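
The over-exposure problem described in this section can be illustrated with a short Python sketch. It is not Unilever's method. It assumes that impression logs from several platforms are keyed by the same pseudonymous ID once they reach the clean room, and that the advertiser sets a frequency cap. The cap value, the platform names, and the `excess_impressions` helper are hypothetical.

```python
from collections import Counter

FREQUENCY_CAP = 3  # hypothetical cap: max useful impressions per user

# Made-up impression logs keyed by a pseudonymous ID shared across platforms.
impressions = {
    "platform_a": ["u1", "u1", "u2", "u3"],
    "platform_b": ["u1", "u1", "u2"],
    "tv": ["u1", "u3"],
}

def excess_impressions(logs, cap=FREQUENCY_CAP):
    """Count impressions served beyond the cap once all platforms are combined."""
    per_user = Counter(uid for log in logs.values() for uid in log)
    return {uid: n - cap for uid, n in per_user.items() if n > cap}

waste = excess_impressions(impressions)
total = sum(len(log) for log in impressions.values())
print(f"{sum(waste.values())} of {total} impressions exceeded the cap")
```

In this sample no single platform shows user "u1" more than twice, so each platform's own report looks fine. The excess only becomes visible once the logs are combined, which is the case for cross-platform measurement in a neutral environment.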

The pros and cons of using data clean rooms

The data clean room boom is only going to pick up speed. 87% of ad executives believe that data clean room technology will play a strategic role for their organizations in 2023-2024.

Should data clean rooms be part of your product development strategy? Here are the key pros and cons for AdTech companies and media agencies to consider.

Pros of using data clean rooms

  • Compliant access to more data, supplied through partners and verified by neutral third parties. It provides more advanced functionality for audience building and campaign planning without extra exposure to privacy risks.
  • Ad waste minimization. Prevent customer budget erosion due to repetitive ad displays, poor visibility, ad fraud, and poor measurement. Entice users with better measurement capabilities for established and new ad channels.
  • Data security. Attract more users from privacy-focused industries who are concerned about data disclosures. Create convenient tools for regulating, limiting, and tracking data usage on the platform.
  • Personalization. Supply new insights to boost customer loyalty, CLTV, and conversions with multichannel, multi-creative campaigns, rather than endless retargeting.
  • Data partnerships. Join forces with other players to improve user identification, access extra customer knowledge, discover lookalike audiences, and improve campaign personalization.

Cons of using data clean rooms

  • Interoperability issues. Data clean room standards are at the nascent stage (but we’re getting there). Advertisers, publishers, and software providers define identities and audiences differently, complicating attribution and measurement.
  • High data maturity is required. To benefit from clean room technology, brands and publishers need a clear internal identity taxonomy and good data management practices. This limits the pool of potential customers.
  • Substantial first-party data is required. Data clean room deals work between partners with a decent-sized pool of first-party data. Smaller publishers and brands will require data enrichment from third parties or risk being excluded from the ecosystem.
  • Compliance. Regulators are yet to issue definite rulings on data handling within clean rooms. Definitions of “private” or “anonymized” first-party data may change over time.
  • Security concerns. Data clean rooms need airtight cybersecurity protection to prevent accidental leaks or to withstand cyberattacks.

What are the alternatives to data clean rooms?

Data clean rooms are one of the “blocks” within the emerging first-party infrastructure. The AdTech industry also explores other cookieless ad solutions:

  • Universal IDs: a tokenized standard for cross-ecosystem user identification backed by user identity graphs. ID graphs include anonymized first-party data points collected for each profiled user.
  • Google Topics API: a collection of user segments generated based on on-device user behavior and known interests. It doesn’t provide a good way to do cross-channel attribution.
  • Contextual targeting: ad serving adapted to the user’s actions rather than known identity information. Contextual targeting in CTV provided great returns, but this solution doesn’t address the measurement issues.


First-party data will play a focal role in the cookieless future. The more reserves you have, the more alternatives you can explore. Data clean rooms offer AdTech players (and their partners) a mechanism for private user identity matching using accumulated first- and zero-party data reserves plus partner-supplied third-party data.

With this mechanism in place, you can significantly improve cross-platform measurement and extend your marketing mix modeling capabilities without worrying about budget waste or compliance.

Xenoss helps AdTech and MarTech companies navigate the changing industry landscape. Contact us to get a personalized consultation on data clean room technology.