Blog | May 26, 2021

Why a CDP’s Database Architecture Matters

While there are myriad ways to divide up people, I think it’s fair to say that at least one dividing line is between puzzle people and non-puzzle people.

You know which one you are: You either love to do jigsaw puzzles — the more pieces the better — or you’d be happy to do basically anything else to pass time.

Whichever camp you’re in, indulge me in an analogy: customer data platforms (CDPs) are like jigsaw puzzles. Specifically, a CDP’s database architecture can be compared with how you solve a puzzle.

The 3 types of CDP database architectures to know when evaluating vendors

Before I explain the comparison between CDPs and jigsaw puzzles, let’s review the three kinds of databases, which determine possible CDP database architectures.

These databases are used by what Gartner calls “pure-play” CDPs. (The consulting firm goes into far greater detail about this in its Market Guide for Customer Data Platforms.)

1) Relational Databases

A customer data platform that uses a relational database is by far the least flexible when it comes to letting a technology user define their own taxonomy.

Relational databases are highly structured architectures whereby the database enforces the relationship(s) between objects. Once implemented, you must work within its structure.

An example: A relational-database CDP needs to pre-define a relationship between unidentified site visitors and a campaign in order to store that anonymous info.

The campaign is the organizing principle, which makes sense: most CDPs with a relational database foundation are actually more like campaign management tools than built-for-purpose CDPs like BlueConic.
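To make that concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The table and column names (campaign, anonymous_visit, cookie_id, favorite_color) are invented for illustration and are not taken from any vendor’s actual schema. It shows the database itself, rather than the application, rejecting an anonymous visit that has no pre-defined campaign to attach to, and rejecting an attribute the schema never declared.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical campaign-centric schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE campaign (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        """
        CREATE TABLE anonymous_visit (
            id INTEGER PRIMARY KEY,
            cookie_id TEXT NOT NULL,
            campaign_id INTEGER NOT NULL REFERENCES campaign(id)
        )
        """
    )
    return conn


def try_insert(conn: sqlite3.Connection, sql: str, params: tuple) -> str:
    """Run one INSERT and report 'ok' or the name of the error it raised."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.Error as exc:
        return type(exc).__name__


conn = build_db()
conn.execute("INSERT INTO campaign (id, name) VALUES (1, 'spring_sale')")

results = {
    # The visit points at a campaign that exists, so it is accepted.
    "visit_with_campaign": try_insert(
        conn,
        "INSERT INTO anonymous_visit (cookie_id, campaign_id) VALUES (?, ?)",
        ("abc", 1),
    ),
    # No campaign 99 was defined in advance, so the foreign key rejects it.
    "unknown_campaign": try_insert(
        conn,
        "INSERT INTO anonymous_visit (cookie_id, campaign_id) VALUES (?, ?)",
        ("abc", 99),
    ),
    # favorite_color was never declared, so the schema rejects it.
    "undeclared_column": try_insert(
        conn,
        "INSERT INTO anonymous_visit (cookie_id, campaign_id, favorite_color)"
        " VALUES (?, ?, ?)",
        ("abc", 1, "blue"),
    ),
}
print(results)
```

In this sketch, storing the new attribute would first require a schema change (for example an ALTER TABLE), which is the kind of up-front structural work the relational approach implies.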

2) Event-Stream Databases

A CDP with an event-stream database is the most likely to balloon in size and become costly to scale.

These databases are at the other end of the spectrum from relational databases because they collect massive amounts of raw data into a big data structure.

But it’s only after the fact that business technology users sort through the raw data and determine what should be mapped to the profile graph. The alternative is to be stuck with the event-to-graph schema the customer data platform ships with.

In this context, the profile graph is a set of attributes associated with an ID, but not consolidated into a separate entity such as a unified customer profile.

These types of customer data platforms lean heavily into web and mobile, since these channels deal primarily in events (e.g., clicks, swipes, and page views).

And, because of the volume of raw data they collect (and optionally store), they may be preferred by anyone who cares more about the breadth of data than about the marketing and CX segmentation and personalization use cases supported by a built-for-purpose CDP.
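The collect-first, map-later pattern described in this section can be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular event-stream CDP works: the event fields, the mapping table, and the project function are all hypothetical names. Raw events are kept as they arrive, and a mapping chosen afterward decides which of them become attributes hanging off an ID.

```python
from collections import defaultdict

# Hypothetical raw events, stored exactly as they were collected.
events = [
    {"id": "v1", "type": "page_view", "url": "/pricing"},
    {"id": "v1", "type": "click", "target": "signup"},
    {"id": "v2", "type": "page_view", "url": "/blog"},
    {"id": "v1", "type": "page_view", "url": "/docs"},
    {"id": "v1", "type": "swipe", "direction": "left"},
]

# A mapping decided after collection (an assumption for this sketch):
# event type -> (attribute name on the graph, event field to copy).
mapping = {
    "page_view": ("pages_viewed", "url"),
    "click": ("clicks", "target"),
}


def project(events, mapping):
    """Fold raw events into per-ID attribute lists; count events left unmapped."""
    graph = defaultdict(lambda: defaultdict(list))
    skipped = 0
    for event in events:
        rule = mapping.get(event["type"])
        if rule is None:
            skipped += 1  # still stored as raw data, but never mapped
            continue
        attribute, field = rule
        graph[event["id"]][attribute].append(event[field])
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {pid: dict(attrs) for pid, attrs in graph.items()}, skipped


graph, skipped = project(events, mapping)
print(graph)
print(skipped)
```

Note that the result is a set of attributes keyed by ID, not a consolidated profile, and that the unmapped swipe event still takes up storage even though nothing uses it.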

3) Profile Databases

A CDP with a profile database, such as BlueConic, sits somewhere between the other two by offering a balance of flexibility and activation intention when it comes to data collection and consolidation.

That’s because there aren’t related tables or stores, which means values can be added easily and scale without limit.

BlueConic can create and store true unified profiles at the individual level, as opposed to just creating a chaotic graph and its deconstructed events or enforcing an arbitrary data schema on the data set.

Since, by definition, CDPs must provide a “persistent” profile, BlueConic’s profile database provides both high-volume storage and fast read and write speeds so that data is unified and actionable for key tasks such as segmentation, personalization, and predictive analytics.
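As a rough illustration of the profile-database idea, here is a small in-memory sketch. It is not BlueConic’s implementation or API; the ProfileStore class and its methods are hypothetical. The point it tries to show is that each profile is a single record keyed by an ID, new attributes can be added without declaring them first, and two records for the same person can be merged into one unified profile that is ready for segmentation.

```python
class ProfileStore:
    """Hypothetical store: one record per individual, attributes added freely."""

    def __init__(self):
        self._profiles = {}

    def add_value(self, profile_id, attribute, value):
        """Add a value to an attribute, creating profile and attribute on demand."""
        profile = self._profiles.setdefault(profile_id, {})
        values = profile.setdefault(attribute, [])
        if value not in values:
            values.append(value)

    def merge(self, keep_id, drop_id):
        """Fold one profile into another, e.g. when an anonymous visitor logs in."""
        dropped = self._profiles.pop(drop_id, {})
        for attribute, values in dropped.items():
            for value in values:
                self.add_value(keep_id, attribute, value)

    def get(self, profile_id):
        """Return the stored profile, or an empty dict if there is none."""
        return self._profiles.get(profile_id, {})

    def segment(self, attribute, value):
        """Return the sorted IDs of profiles holding the given attribute value."""
        return sorted(
            pid
            for pid, profile in self._profiles.items()
            if value in profile.get(attribute, [])
        )


store = ProfileStore()
store.add_value("cookie-1", "pages_viewed", "/pricing")
store.add_value("cookie-1", "favorite_color", "blue")  # new attribute, no schema change
store.add_value("email-1", "email", "pat@example.com")

store.merge("email-1", "cookie-1")  # anonymous and known records become one profile
unified = store.get("email-1")
blue_fans = store.segment("favorite_color", "blue")
print(unified)
print(blue_fans)
```

A production system would add persistence, identity-resolution rules, and concurrency control, all of which this sketch leaves out.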

(Fun fact: BlueConic is built on top of the same database technology that both Netflix and Rackspace use. We’ll discuss that more in a separate blog post, though.)

Why choosing the right CDP matters

Here’s where the puzzle comes in.  

You are given a 100-piece puzzle. You solve it readily. Then you’re told that it’s actually only one part of a much bigger puzzle and you’re given another 500 pieces to add. Upon completion, again, you are given another 1,000 pieces to expand the puzzle with.    

Imagine that each puzzle piece is a data point about an individual person — prospect or customer — and your “image” of that person keeps expanding with every new piece.

The complete puzzle represents a unified profile for that person. The surface on which you are constructing the puzzle represents the CDP database. 

Now, let’s look at how much more difficult or easy it would be to keep adding new pieces — data points — depending on what type of database your CDP is using:

  • If you started with a relational database, the surface for your puzzle would have come with a space carved out in advance for each of the first 100 puzzle pieces. But as soon as you needed to expand to an additional 500 pieces, you’d have to take the whole thing apart and build a new surface from scratch with another 500 spaces carved out. That takes a lot of time, resources, and money. On the off chance you know exactly what you need ahead of time, this might not be a problem. But I’ve yet to find a company that can say that was the case.

  • If you started with an event-stream database, those first 100 pieces were manageable. But adding 500 more pieces actually means finding those 500 in a bucket holding 10x more pieces than you need. And then 20x. And 50x. Suddenly, in order to complete the puzzle, you have to sift through thousands of pieces for every one piece you need to find. Eventually, you run out of table space for all of these pieces. So you have to either 1) buy a much bigger table or 2) choose which puzzle pieces to throw away because there just isn’t room for all of them.

  • If you started with a profile database, you’d have an unblemished table that comes with some guidelines about where to start building and where to place puzzle pieces in anticipation of what’s to come. This table has an endless number of extending leaves within it. So when you need more room to account for the additional pieces, you can expand adjacently — and only as much as you need in that moment — as opposed to buying a bigger table. When you get a critical piece that changes the whole complexion of the puzzle, you can shift the entire work-in-progress to keep building in a new direction.

To be clear: Different database architectures underlying CDPs offer different strengths.

But for organizations that want to make the customer the central object of their entire tech stack, weighing the long-term implications and as-yet-unforeseen consequences is an exercise well worth undertaking before you buy a CDP.

Otherwise, you might end up like me and avoid puzzles altogether.
