What is Identity Resolution?

Identity resolution is the process of connecting fragmented data — anonymous sessions, multiple devices, different channels — into a single, unified customer profile. It's the foundation of every Customer Data Platform.

Identity resolution is the process of matching and merging disparate data records that belong to the same person into a single, unified customer profile. It answers the question: “Are these two data points from the same person?”

In practice, this means connecting an anonymous website visit on Monday, an email signup on Tuesday, and a purchase on a mobile app on Wednesday — all into one customer record with a complete history across every touchpoint.

Without identity resolution, your customer data is fragmented. One person looks like three separate visitors. Your analytics overcount users, your personalization can't recognize returning customers, and your marketing sends duplicate messages to the same person across channels.

Why it matters: Identity resolution is more than a technical capability; it is the prerequisite for every meaningful customer experience. You cannot personalize, segment, or attribute correctly if you don't know who your customers are across touchpoints.

How Identity Resolution Works

Identity resolution operates as a continuous, real-time process that runs on every incoming event. Here are the four stages:

Data Collection

Events stream in from every touchpoint — web pages, mobile apps, server-side APIs, CRM, support tools — each carrying whatever identifiers are available at the time.

Identifier Extraction

The system extracts all available identifiers from each event: anonymous IDs, cookies, device IDs, email addresses, user IDs, phone numbers, and more.

Graph Building

Identifiers are linked together in an identity graph. When two events share an identifier, the system connects them — progressively building a map of which identifiers belong to the same person.

Profile Unification

Linked identifiers are merged into a single unified profile. The anonymous session from last week, the email signup yesterday, and today's purchase all become one customer record.

This process is continuous. Every new event — a page view, a form submission, an API call — is an opportunity to strengthen the identity graph with new links between identifiers. Over time, profiles become richer and more complete as more touchpoints are connected.
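
The four stages above can be sketched with a small union-find (disjoint-set) structure over identifiers. This is a minimal illustration under stated assumptions, not a description of any particular platform's implementation: the `IdentityGraph` class, the `(kind, value)` identifier tuples, and the sample events are all hypothetical.

```python
from collections import defaultdict


class IdentityGraph:
    """Toy identity graph: identifiers are nodes, co-occurrence on an event is a link."""

    def __init__(self):
        self.parent = {}  # identifier -> parent identifier (roots point to themselves)

    def find(self, identifier):
        # Register unseen identifiers as their own root.
        self.parent.setdefault(identifier, identifier)
        root = identifier
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited node straight at the root.
        while self.parent[identifier] != root:
            self.parent[identifier], identifier = root, self.parent[identifier]
        return root

    def link(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def ingest(self, event):
        # Identifier extraction: keep whatever identifiers the event carries.
        ids = [(kind, value) for kind, value in event.items() if value is not None]
        # Graph building: identifiers seen on the same event belong together.
        for identifier in ids:
            self.link(ids[0], identifier)

    def profiles(self):
        # Profile unification: group identifiers by the root they resolve to.
        groups = defaultdict(set)
        for identifier in list(self.parent):
            groups[self.find(identifier)].add(identifier)
        return list(groups.values())


# Data collection: hypothetical events from different touchpoints.
events = [
    {"anonymous_id": "anon-1", "email": None},               # Monday web visit
    {"anonymous_id": "anon-1", "email": "ada@example.com"},  # Tuesday email signup
    {"device_id": "ios-9", "email": "ada@example.com"},      # Wednesday app purchase
    {"anonymous_id": "anon-2", "email": None},               # unrelated visitor
]

graph = IdentityGraph()
for event in events:
    graph.ingest(event)

profiles = graph.profiles()
print(len(profiles))  # 2
```

The first three events collapse into one profile because each shares an identifier with the one before it, while the fourth visitor stays separate. A production system would also need persistence, merge safeguards, and the ability to undo a link, none of which this sketch attempts.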

UserFlux's approach: UserFlux runs identity resolution natively on ClickHouse, processing identity stitching in real time as events stream in. Profiles are updated within milliseconds, so personalization and automation always work against the latest unified view.

Deterministic vs Probabilistic Resolution

There are two fundamental approaches to identity resolution, and most platforms use a combination of both:

Deterministic matching links records using exact identifier matches. If two events share the same email address, user ID, or phone number, they are treated as the same person. This is highly accurate, although shared logins and mistyped identifiers can still produce wrong links, and it only works when a known identifier is present.

Probabilistic matching uses statistical models to infer identity from signals like IP address, device type, browser fingerprint, and behavioral patterns. It's less certain but extends coverage to anonymous visitors who haven't identified themselves yet.

                 | Deterministic                     | Probabilistic
How it works     | Exact identifier match            | Statistical modeling and inference
Identifiers used | Email, user ID, phone, login      | IP address, device fingerprint, behavior
Accuracy         | Very high (near 100%)             | Moderate (70-90% typical)
Coverage         | Limited to known users            | Extends to anonymous visitors
Privacy impact   | Lower risk: explicit identifiers  | Higher risk: inferred identity
Best for         | Cross-device logged-in users      | Pre-login journey stitching

The best identity resolution systems use deterministic matching as the foundation and layer probabilistic signals on top for broader coverage. Deterministic links anchor the identity graph, while probabilistic signals help connect anonymous sessions that are likely — but not certainly — the same person.
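
One way to picture this layering is a matcher that tries exact identifiers first and only then falls back to a score over softer signals. The sketch below is illustrative only: the `Record` fields, the `SIGNAL_WEIGHTS`, and the `MATCH_THRESHOLD` are hypothetical values picked for the example, and a simple weighted sum stands in for the statistical models a real system would train.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical weights and threshold, chosen only for illustration.
SIGNAL_WEIGHTS = {"ip": 0.3, "device_type": 0.1, "fingerprint": 0.5}
MATCH_THRESHOLD = 0.7


@dataclass
class Record:
    email: Optional[str] = None
    user_id: Optional[str] = None
    ip: Optional[str] = None
    device_type: Optional[str] = None
    fingerprint: Optional[str] = None


def deterministic_match(a, b):
    """True when both records carry the same known identifier."""
    return any(
        getattr(a, field) is not None and getattr(a, field) == getattr(b, field)
        for field in ("email", "user_id")
    )


def probabilistic_score(a, b):
    """Sum the weights of the soft signals on which both records agree."""
    return sum(
        weight
        for signal, weight in SIGNAL_WEIGHTS.items()
        if getattr(a, signal) is not None and getattr(a, signal) == getattr(b, signal)
    )


def match(a, b):
    # Deterministic links anchor the decision; the score is only a fallback.
    if deterministic_match(a, b):
        return "deterministic"
    if probabilistic_score(a, b) >= MATCH_THRESHOLD:
        return "probabilistic"
    return "no_match"


laptop = Record(email="ada@example.com", ip="203.0.113.7", device_type="desktop")
phone = Record(email="ada@example.com", ip="198.51.100.4", device_type="mobile")
anon_a = Record(ip="203.0.113.7", device_type="desktop", fingerprint="fp-42")
anon_b = Record(ip="203.0.113.7", device_type="desktop", fingerprint="fp-42")
anon_c = Record(ip="203.0.113.7", device_type="desktop")

print(match(laptop, phone))   # deterministic
print(match(anon_a, anon_b))  # probabilistic
print(match(anon_a, anon_c))  # no_match
```

Keeping the two outcomes distinct lets downstream code treat a probabilistic link as provisional, for example by keeping it out of permanent merges until a deterministic identifier confirms it.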

Why Identity Resolution Matters

Identity resolution is the difference between fragmented data and actionable customer intelligence. Here are the most impactful outcomes:

Accurate Customer Counts

Without identity resolution, one person using three devices looks like three separate users. Resolution collapses these into one, giving you real customer counts and accurate analytics.

Personalized Experiences

You can't personalize for someone you don't recognize. Identity resolution connects the dots so you can deliver relevant content, recommendations, and offers across every channel.

Accurate Attribution

Understand the full customer journey — from first anonymous touchpoint to conversion — instead of fragmented sessions that look like separate, unrelated visits.

Privacy-Compliant Targeting

Build audiences from unified first-party profiles instead of relying on third-party cookies. Identity resolution on your own data is the privacy-safe path to effective targeting.

The business impact: Companies with strong identity resolution are better placed to raise engagement, lower acquisition costs, and improve retention, because they can deliver the right experience to the right person at the right time instead of treating known customers like strangers.

Common Challenges

Identity resolution is one of the hardest problems in customer data. Here are the challenges that every implementation must address:

Cross-Device Fragmentation

Users move between phone, tablet, laptop, and work computer. Without a login event, linking these devices to one person requires probabilistic methods that trade accuracy for coverage.

Privacy Constraints

GDPR, CCPA, and browser privacy features (ITP, ETP) limit the identifiers you can collect and how long you can store them. Resolution must work within these constraints.

Shared Devices and Accounts

Household members sharing a browser, or employees using a shared login, can cause incorrect merges. Resolution systems need safeguards to prevent over-merging profiles.

Data Quality

Typos in email addresses, temporary phone numbers, and inconsistent formatting can prevent correct matches or cause false merges. Data normalization is a prerequisite for good resolution.
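
As a rough sketch of what normalization before matching can look like, the helpers below canonicalize emails and phone numbers so that formatting differences do not block a match. The function names, the deliberately loose email pattern, and the default country code are assumptions made for this example; the code does not detect typos or temporary numbers.

```python
import re
import unicodedata


def normalize_email(raw):
    """Return a canonical email string, or None if it is clearly malformed."""
    email = unicodedata.normalize("NFKC", raw).strip().lower()
    # Loose shape check only: something@something.something, no spaces.
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        return None
    return email


def normalize_phone(raw, default_country_code="1"):
    """Reduce a phone number to '+' followed by digits (rough E.164 shape)."""
    digits = re.sub(r"\D", "", raw)
    if raw.strip().startswith("+"):
        return "+" + digits if digits else None
    if len(digits) == 10:
        # Assumption: bare 10-digit numbers belong to the default country.
        return "+" + default_country_code + digits
    return None


print(normalize_email("  Ada@Example.COM "))  # ada@example.com
print(normalize_phone("(415) 555-0132"))      # +14155550132
print(normalize_phone("+1 415-555-0132"))     # +14155550132
```

Returning `None` for values that cannot be normalized keeps questionable identifiers out of the identity graph instead of letting them cause a false merge.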

These challenges are why identity resolution is typically handled by a dedicated Customer Data Platform rather than built in-house. CDPs have spent years solving edge cases around merge logic, conflict resolution, and privacy compliance that would take an engineering team months to handle independently.

Identity Resolution in a CDP

Identity resolution is a core capability of any Customer Data Platform. It's the mechanism that transforms raw event data into unified customer profiles — the fundamental value proposition of a CDP.

How CDPs handle identity: When your SDK sends an event, the CDP extracts all identifiers, checks them against the identity graph, and either links the event to an existing profile or creates a new one. This happens in real time, so every downstream action — personalization, automation, analytics — works against the latest unified profile.
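
The lookup-or-create decision described above might look roughly like the following. This is a simplified in-memory sketch: `ProfileStore`, its `resolve` method, and the `"kind:value"` identifier strings are hypothetical, and a real CDP would back this with durable storage and merge safeguards.

```python
import itertools


class ProfileStore:
    """Toy in-memory store mapping identifiers to profile IDs."""

    def __init__(self):
        self.index = {}     # identifier -> profile_id
        self.profiles = {}  # profile_id -> set of identifiers
        self._next_id = itertools.count(1)

    def resolve(self, identifiers):
        """Return (profile_id, action) for the identifiers carried by one event."""
        matches = sorted({self.index[i] for i in identifiers if i in self.index})
        if not matches:
            # No identifier is known yet: create a new profile.
            profile_id, action = next(self._next_id), "created"
            self.profiles[profile_id] = set()
        else:
            # Keep the oldest matching profile and fold any others into it.
            profile_id = matches[0]
            action = "linked" if len(matches) == 1 else "merged"
            for other in matches[1:]:
                for identifier in self.profiles.pop(other):
                    self.index[identifier] = profile_id
                    self.profiles[profile_id].add(identifier)
        for identifier in identifiers:
            self.index[identifier] = profile_id
            self.profiles[profile_id].add(identifier)
        return profile_id, action


store = ProfileStore()
r1 = store.resolve(["anon:a1"])                                # (1, 'created')
r2 = store.resolve(["device:ios-9", "email:ada@example.com"])  # (2, 'created')
r3 = store.resolve(["anon:a1", "email:ada@example.com"])       # (1, 'merged')
r4 = store.resolve(["anon:a1"])                                # (1, 'linked')
print(r1, r2, r3, r4)
```

The third event is the interesting case: it carries identifiers from two existing profiles, so the store merges them rather than choosing one and leaving the other orphaned.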

The composable advantage: In a composable CDP, identity resolution runs directly on your data warehouse. This means the identity graph and unified profiles are accessible via SQL, custom ML models can query them directly, and you maintain full ownership of your identity data.
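
To make the SQL access concrete, here is a small sketch that joins raw events to an identity mapping and counts events per unified profile. It uses Python's built-in `sqlite3` module purely as a stand-in for a data warehouse; the `identity_map` and `events` tables and their columns are hypothetical, and the SQL dialect of an actual warehouse may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE identity_map (identifier TEXT PRIMARY KEY, profile_id TEXT);
    CREATE TABLE events (identifier TEXT, event_name TEXT, ts TEXT);

    INSERT INTO identity_map VALUES
        ('anon:a1', 'p1'),
        ('email:ada@example.com', 'p1'),
        ('anon:b2', 'p2');

    INSERT INTO events VALUES
        ('anon:a1', 'page_view', '2024-01-01'),
        ('email:ada@example.com', 'signup', '2024-01-02'),
        ('anon:b2', 'page_view', '2024-01-02');
""")

# Events per unified profile, resolved through the identity mapping.
rows = conn.execute("""
    SELECT m.profile_id, COUNT(*) AS event_count
    FROM events AS e
    JOIN identity_map AS m ON m.identifier = e.identifier
    GROUP BY m.profile_id
    ORDER BY m.profile_id
""").fetchall()

# Raw identifiers seen in events versus resolved profiles.
raw_count = conn.execute("SELECT COUNT(DISTINCT identifier) FROM events").fetchone()[0]

print(rows)                  # [('p1', 2), ('p2', 1)]
print(raw_count, len(rows))  # 3 2
```

Three raw identifiers resolve to two profiles here, the kind of correction to user counts that any SQL client or model can compute once the mapping lives alongside the event data.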

See it in action: UserFlux provides real-time identity resolution as part of its composable CDP platform. See how it compares to other CDPs for identity management: vs Segment, vs mParticle, vs RudderStack.
