With all the MarTech and AdTech tools out there, collecting customer data is easy, but operationalizing it is not.
Historically, marketing teams struggled to get a full picture of customer interests, let alone do so on a 1:1 basis. The odds changed with customer data platforms (CDPs), the technology that promised to consolidate a multitude of disparate customer insights into a 360-degree view, so that marketers could create better customer segments, identify lookalike audiences, resolve identity issues, and do so across any number of touchpoints.
However, this change didn’t happen immediately. The early CDPs failed to fulfill these promises, but the new breed of composable CDPs is succeeding.
In this post, you’ll learn:
- Why the first CDP vendors, known as “packaged solutions,” didn’t win over the market
- How the new cohort of composable vendors addresses the early industry mishaps
- What “warehouse-native” means, and why it may be the future of MarTech
The evolution of customer data platforms
Back in 2013, David Raab, then a marketing technology consultant and now founder of the CDP Institute, came up with a new term, customer data platform (CDP), to describe a now-defunct product, Causata.
Unlike other vendors active at that time, Causata offered more than just database management, data integration infrastructure, and predictive modeling. Each word in the term reflected the platform’s primary focus: customer (customer-related functions, not just marketing), data (data consolidation, not just various data manipulations), and platform (it could support other business systems, not just aggregate data).
Today, CDPs are broadly defined as marketing technology for unifying customer data from multiple sources in a centralized repository, accessible to multiple downstream systems like business intelligence tools, activation apps, customer engagement platforms, and others.
The promise of early CDP tech vendors was to provide a single customer view (SCV) and easy data activation through native platform tools or third-party integrations.
But that didn’t quite happen. Here’s why.
Off-the-shelf CDPs couldn’t handle integrations with existing MarTech stacks
An average marketing team has between five and ten tools in their marketing tech stack. Think of an assortment of marketing automation tools, dedicated marketing databases, data management platforms, social media tools, webinar platforms, etc. And some enterprises have over 120 tools for marketing.
But the more isn’t the merrier. Many MarTech apps come with a private database. Traditional integrations only support limited data exchanges. Initially, CDPs promised to create a centralized data store and exchange system for aggregating data from all those apps. However, CDP tech vendors were slow to implement API-level integrations for the myriad of apps out there. Moreover, once data entered a CDP, it remained there, making it harder to query with non-native tools. This leads us to the second issue.
Packaged CDPs didn’t account for users’ data infrastructure
Apart from first-party data, most B2B marketers use one to five sources of third-party data, 32% use between five and ten, and 15% use more, an Anteriad report says. In theory, more data should mean better outcomes. Yet, the same report says that those using 6+ third-party sources reported less targeting benefit and 13% less customer experience benefit.
To some extent, poor measurement or attribution is to blame, though infrastructure played a bigger role. Packaged CDPs require users to send data directly to their platform. As a result, companies often end up with two separate data stores: a CDP, housing event data, and a data warehouse (DWH), housing first-party audience data.
The lack of native data-sharing capabilities with other systems meant expensive data pipelines. A large MarTech stack also meant that data had to be queried across multiple databases, which made CDP implementation even more challenging and expensive.
Data engineers became frustrated with the need to use tools native to the CDP. Domain experts and analysts had to settle for fragmented audience segments instead of the promised ‘single source of truth.’
Early CDPs lacked a clear value proposition
Early CDP vendors often positioned themselves as ‘everything at once’: a data store plus activation and analytics tools. In practice, they most often did just one thing well, e.g., tag management and identity resolution or plain data warehousing.
So, users were (and still are) confused about the purpose of a CDP. In a 2021 survey by MessageGears, the majority (52%) of respondents named ‘ability to manage customer-marketing activities’ as the primary purpose of a CDP, but almost half selected vastly different answers. As for the primary purpose, 42% chose ‘to help a company increase customer satisfaction,’ but three of the other four options got between 15% and 18% of responses.
For a long time, packaged CDPs focused on user-facing features for activation but didn’t address the more complex problems of effective data integration. So, it’s not surprising that only 58% of companies using a CDP in 2021 said it delivered value.
The rise of composable customer data platforms
Early CDPs targeted an acute marketing problem: The desire to easily use all available customer data across all applications. But their solutions weren’t that elegant.
Many ancillary technologies also emerged in the 2010s: scalable cloud data warehouses (Amazon Redshift, Google BigQuery, ClickHouse), data clouds (Snowflake, Databricks, Dremio), new data pipeline orchestration tools (Airflow, Kubeflow, Hightouch), and an even bigger list of new MarTech apps.
With that, many companies started wondering: Why settle for a closed-loop, rigid CDP when we can assemble a composable CDP using a broad selection of available technologies?
Think of a composable CDP as a collection of separately acquired Lego-like blocks. Instead of overpaying for features your team doesn’t need, a composable CDP lets you build a ‘best of breed’ toolkit to support your bespoke use cases by combining platform-native and third-party components.
Composable CDPs can also integrate directly with your existing data infrastructure instead of operating as a separate entity. This improves data visibility (a.k.a. gives the single source of truth for all connected applications). Lastly, they enable a better degree of control over your data and more room for infrastructure optimization.
Why opt for a composable CDP architecture?
Early CDPs had a ‘boxed’ offering, where aggregated customer data resided outside the company’s data infrastructure and eventually became yet another silo. Effectively, the first CDPs were the equivalent of modern cloud data platforms, offering users the ability to warehouse their data and then leaving them to figure out how to connect it with the other tools in the chain.
Today, companies no longer need CDPs as a mere warehousing solution (and one with substantial integration constraints). Instead, brands like Warner Music Group, Chime, and Atlassian, among many others, choose to build composable architectures, combining different data warehousing technologies with preferred marketing tools and add-on services from composable CDP vendors (e.g., identity resolution or AI-powered customer segmentation).
Here are six substantial reasons why composable customer data platform architecture is worth considering. Let’s have a closer look at each.
Unified data view
Off-the-shelf CDPs promised single customer views—composable CDPs delivered that. In a composable CDP architecture, marketing features are decoupled from the data infrastructure layer. You can connect CDP tools directly to any type of data cloud platform you’re using and query it from that environment. No more expensive data duplication or elaborate data pipelines for data access.
Rather than funneling data into a CDP repository (which is expensive), you can resurface it from the connected storage layer. Using SQL queries, you can access data anytime instead of relying on limited proprietary APIs. So no more data fragmentation or privacy concerns to worry about either.
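To make the idea of querying data in place more concrete, here is a minimal sketch of building an audience segment with plain SQL. It assumes an in-memory SQLite database as a stand-in for a cloud data warehouse, and the `customers` and `orders` tables, their columns, and the `high_value_segment` helper are hypothetical names chosen for illustration, not part of any vendor’s product.

```python
import sqlite3

# In-memory SQLite stands in for a cloud data warehouse (an assumption made
# for illustration). Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL, ordered_at TEXT);
    INSERT INTO customers VALUES
        (1, 'ana@example.com'), (2, 'bo@example.com'), (3, 'cy@example.com');
    INSERT INTO orders VALUES
        (1, 120.0, '2024-05-01'), (1, 80.0, '2024-05-20'),
        (2, 30.0, '2024-05-03'),
        (3, 250.0, '2024-04-11');
""")

# The segment is defined as a query over data that never leaves the warehouse.
SEGMENT_SQL = """
    SELECT c.customer_id, c.email, SUM(o.amount) AS total_spend
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    WHERE o.ordered_at >= :since
    GROUP BY c.customer_id, c.email
    HAVING SUM(o.amount) >= :min_spend
    ORDER BY total_spend DESC
"""


def high_value_segment(conn, since, min_spend):
    """Return (customer_id, email, total_spend) rows for the segment."""
    params = {"since": since, "min_spend": min_spend}
    return conn.execute(SEGMENT_SQL, params).fetchall()


segment = high_value_segment(conn, since="2024-05-01", min_spend=100.0)
print(segment)
```

In a real deployment, the same query would run against the warehouse through its own driver, and the segment definition would stay a piece of SQL that both analysts and marketing tools can reuse.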
Composable CDPs integrate various data sources, in line with the unified data view concept. The diagram below shows how a composable CDP integrates data flows from websites, mobile apps, servers, SaaS tools, and advertising data through a modular system. The flexible architecture supports data collection, storage, and activation while avoiding data duplication and enabling efficient querying and direct access via AWS, Azure, and Google Cloud.
In this context, composable CDPs streamline data management and activation, ensuring all relevant data is included in the analysis without unnecessary complexity or costs.
Technology agnostic
Traditional CDPs hooked users into their marketing data warehouses. Composable CDPs, in contrast, are technology agnostic: you can connect any supported data warehouse, data lake, or lakehouse. You pay to access other capabilities, like reverse ETL or data enrichment services for customer activation, or propensity modeling and advanced personalization tools for customer engagement. With this approach, you avoid data-store vendor lock-in because you can easily switch providers or add new ones.
Likewise, you can easily add extra capabilities for customer data analytics and activation by funneling 360-degree data into other business applications—a mobile SSP, a data clean room, or a creative management platform.
Flexible use cases
On average, marketing teams use only 42% of the available MarTech stack capabilities (yet you get billed for 100% with a traditional CDP). Likewise, off-the-shelf offerings often lack industry-specific features. A retailer, for example, may want to use product catalog data for email campaign personalization, but a CDP doesn’t support this.
Composable architectures allow you to expand the range of supported use cases without purchasing additional technology (extra modules, new licenses, etc.). Apart from using platform-native tools, you can also push data to third-party business or marketing applications using a reverse ETL tool (e.g., Census ETL or Fivetran) to support more use cases.
Finally, composable CDPs allow your data science teams to embed custom business logic, run ad hoc data analysis, or perform advanced identity resolution. This way, marketers gain the data and features they need for their use cases while your data engineering team stays focused on optimizing existing infrastructure rather than building new integrations and pipelines.
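As one illustration of custom logic a data team might run on top of warehouse data, here is a minimal sketch of deterministic identity resolution. It assumes records are merged whenever they share an email, phone, or device ID, and it uses a union-find structure to group them. The field names and the `resolve_identities` function are hypothetical; production identity resolution typically adds fuzzy or probabilistic matching and survivorship rules that are not shown here.

```python
from collections import defaultdict


def resolve_identities(records):
    """Group records that share any identifier (deterministic matching).

    `records` is a list of dicts with a unique "record_id" plus optional
    "email", "phone", and "device_id" fields (hypothetical schema).
    Returns a sorted list of sorted record_id groups.
    """
    parent = {r["record_id"]: r["record_id"] for r in records}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Attach the larger root under the smaller one for determinism.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    first_seen = {}  # (field, normalized value) -> first record_id carrying it
    for r in records:
        for field in ("email", "phone", "device_id"):
            value = r.get(field)
            if not value:
                continue
            key = (field, value.strip().lower())
            if key in first_seen:
                union(r["record_id"], first_seen[key])
            else:
                first_seen[key] = r["record_id"]

    groups = defaultdict(list)
    for record_id in parent:
        groups[find(record_id)].append(record_id)
    return sorted(sorted(group) for group in groups.values())


records = [
    {"record_id": "crm-1", "email": "Ana@Example.com", "phone": "555-0100"},
    {"record_id": "web-1", "email": "ana@example.com", "device_id": "d-42"},
    {"record_id": "app-1", "device_id": "d-42"},
    {"record_id": "crm-2", "email": "bo@example.com"},
]
profiles = resolve_identities(records)
print(profiles)
```

Here the CRM, web, and app records end up in one profile because the web record shares an email with the CRM record and a device ID with the app record, while the second CRM record stays separate.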
Faster time-to-value
Traditional CDPs require substantial data transformations to match their proprietary data schema and significant integration efforts. Standard deployments take 6 to 12 months.
Composable CDPs support serverless architectures, so you can deploy new code without managing the infrastructure. Serverless architecture enables faster implementation of different engineering tasks—from data collection and transformation to automated customer segmentation and audience building.
Business users, in turn, get access to an intuitive interface for querying the available data or accessing it straight from their preferred applications, without waiting for the data science team to upload the data and provision access.
Better data governance
Composable CDPs minimize data spread across multiple isolated databases. Keeping unnecessary data duplication to a minimum reduces risks to data privacy and compliance. Data remains within your corporate security perimeter, which also makes it easier to manage access controls, data lineage, metadata generation, data quality, and interoperability, the critical elements of good data governance.
Your data teams can also implement appropriate controls to comply with data privacy regulations like GDPR, CCPA, and HIPAA, while cybersecurity professionals no longer have to worry about protecting data in transit to a CDP-owned data warehouse.
Lower total cost of ownership
Packaged CDPs make money by upselling extra data storage (going at a higher rate, compared to data clouds) and extra data transformation or activation features. Composable CDP vendors let you cherry-pick the features you need. To optimize costs, you can also go for open-source solutions (e.g., for ETL/ELT jobs or data modeling).
Since your company also remains in charge of data infrastructure management, you can implement extra optimization techniques to tame your bill. Xenoss data engineers, for example, helped an AdTech company substantially reduce its data storage costs while gaining extra performance efficiencies. We replaced a data model based on MongoDB servers with Aerospike’s node-local in-memory storage for real-time access. This improved server performance, which allowed the company to downscale from 450 servers to 10 while doubling traffic.
The future of CDPs is ‘warehouse native’
Early CDPs doubled as data warehouses. But businesses today have better options for building a modern data stack: data clouds, ETL/ELT and reverse ETL, event streaming services, and data build tools.
Reverse ETL, in particular, is improving data portability across systems. With it, data from the warehouse can be routed to third-party MarTech apps in near real-time. For example, Census reverse ETL lets you easily create bidirectional data flows between over 200 popular MarTech products, cloud data warehouses, data lakes, transactional databases, event streams, and cloud storage. Census handles pipeline performance, scalability, and security, guaranteeing SOC 2, GDPR, and HIPAA compliance. Twilio Segment also recently added a reverse ETL feature to its platform, which lets users automatically activate data from their warehouse in a connected business app. mParticle, in turn, launched warehouse sync, a more advanced solution that supports data mesh architectures.
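The core mechanic of a reverse ETL sync can be sketched in a few lines. The example below is an illustration under stated assumptions, not how Census, Segment, or mParticle implement it: it fingerprints each warehouse row, compares the fingerprints with those saved from the previous run, and plans only the changed rows for delivery. The `plan_sync` and `batched` functions and the `customer_id` key are hypothetical names.

```python
import hashlib
import json


def row_fingerprint(row):
    """Stable hash of a row's contents, used to detect changes between runs."""
    payload = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def plan_sync(warehouse_rows, last_synced, key="customer_id"):
    """Compare current warehouse rows with fingerprints from the last run.

    Returns (upserts, deletes, new_state):
      upserts   - rows that are new or changed since the last sync
      deletes   - keys synced previously but no longer in the warehouse result
      new_state - {key: fingerprint} mapping to persist for the next run
    """
    new_state = {row[key]: row_fingerprint(row) for row in warehouse_rows}
    upserts = [
        row for row in warehouse_rows
        if last_synced.get(row[key]) != new_state[row[key]]
    ]
    deletes = sorted(k for k in last_synced if k not in new_state)
    return upserts, deletes, new_state


def batched(rows, size):
    """Split rows into fixed-size batches, as destination APIs often require."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


# First run: nothing has been synced yet, so every row is an upsert.
rows_v1 = [
    {"customer_id": 1, "segment": "high_value"},
    {"customer_id": 2, "segment": "at_risk"},
]
upserts_1, deletes_1, state = plan_sync(rows_v1, last_synced={})

# Second run: customer 1 is unchanged, customer 2 left, customer 3 is new.
rows_v2 = [
    {"customer_id": 1, "segment": "high_value"},
    {"customer_id": 3, "segment": "new"},
]
upserts_2, deletes_2, state = plan_sync(rows_v2, last_synced=state)
print(upserts_2, deletes_2)
```

Sending only the difference keeps API usage at the destination low, which is one reason a warehouse can serve as the system of record while downstream tools receive small, frequent updates.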
And with that, the role of cloud data warehouses is changing. Not only do they offer a more cost-effective way for marketing data aggregation and storage, but they also have a growing degree of flexibility to connect with other solutions.
Over the past years, we’ve seen new ‘warehouse-native’ marketing products emerge, built on top of data clouds like Snowflake, BigQuery, Databricks, and Redshift. MessageGears offers a suite of customer segmentation, messaging, and personalization tools. Kubit provides a host of self-service tools for product analytics. LiveRamp created a Snowflake-native app for identity enrichment and activation. CDP tech vendors like Amplitude and ActionIQ upgraded their offerings to accommodate warehouse-native deployments.
What we see is a tidal shift: instead of relying on CDPs to get data into various MarTech and AdTech tools, companies are using data warehouses for the same purpose.
Because a warehouse-native approach aggregates all data in one place, companies eliminate the problems traditional CDPs cause, like vendor lock-in, data siloing, expensive duplication, and constrained data access.
Sure, a cloud data warehouse isn’t the best solution if you don’t have data engineers to operate the infrastructure and prep data to support activation use cases. Likewise, many CDPs still have attractive analytics suites for customer segmentation, personalization, or conversion optimization.
However, many teams are also choosing to operate their own data infrastructure and only acquire certain data activation capabilities from composable CDP vendors. Calendly, for example, uses Hightouch reverse ETL to funnel data to downstream tools like Optimizely, Braze, and various advertising platforms. Others, like Make, forgo CDPs altogether and rely on a custom architecture.
The point? We’re past the era of ‘boxed’ monolith CDPs and entering the new one of ‘mix-and-match’ data architectures, capable of supporting marketers through changes in customer behaviors, privacy regulations, and marketing technologies.