What is Identity Resolution?
Identity resolution is the process of connecting fragmented data — anonymous sessions, multiple devices, different channels — into a single, unified customer profile. It's the foundation of every Customer Data Platform.
Identity resolution is the process of matching and merging disparate data records that belong to the same person into a single, unified customer profile. It answers the question: “Are these two data points from the same person?”
In practice, this means connecting an anonymous website visit on Monday, an email signup on Tuesday, and a purchase on a mobile app on Wednesday — all into one customer record with a complete history across every touchpoint.
Without identity resolution, your customer data is fragmented. One person looks like three separate visitors. Your analytics overcount users, your personalization can't recognize returning customers, and your marketing sends duplicate messages to the same person across channels.
How Identity Resolution Works
Identity resolution operates as a continuous, real-time process that runs on every incoming event. Here are the four stages:
Data Collection
Events stream in from every touchpoint — web pages, mobile apps, server-side APIs, CRM, support tools — each carrying whatever identifiers are available at the time.
Identifier Extraction
The system extracts all available identifiers from each event: anonymous IDs, cookies, device IDs, email addresses, user IDs, phone numbers, and more.
Graph Building
Identifiers are linked together in an identity graph. When two events share an identifier, the system connects them — progressively building a map of which identifiers belong to the same person.
Profile Unification
Linked identifiers are merged into a single unified profile. The anonymous session from last week, the email signup yesterday, and today's purchase all become one customer record.
This process is continuous. Every new event — a page view, a form submission, an API call — is an opportunity to strengthen the identity graph with new links between identifiers. Over time, profiles become richer and more complete as more touchpoints are connected.
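The four stages can be illustrated with a small in-memory sketch. It assumes each event arrives as a dictionary of identifiers and uses a union-find structure as the identity graph. The class name `IdentityGraph`, its methods, and the sample identifier values are hypothetical and chosen for illustration; this is not how any particular platform implements it.

```python
from collections import defaultdict


class IdentityGraph:
    """Toy identity graph: identifiers are nodes, shared events create links."""

    def __init__(self):
        self._parent = {}

    def _find(self, node):
        # Follow parent pointers to the root, shortening the path as we go.
        self._parent.setdefault(node, node)
        while self._parent[node] != node:
            self._parent[node] = self._parent[self._parent[node]]
            node = self._parent[node]
        return node

    def add_event(self, identifiers):
        """Extract the identifiers on one event and link them together."""
        nodes = [(kind, value) for kind, value in identifiers.items() if value]
        if not nodes:
            return
        first = self._find(nodes[0])
        for node in nodes[1:]:
            root = self._find(node)
            if root != first:
                self._parent[root] = first

    def profiles(self):
        """Group identifiers by root: one group per unified profile."""
        groups = defaultdict(set)
        for node in list(self._parent):
            groups[self._find(node)].add(node)
        return list(groups.values())


graph = IdentityGraph()
graph.add_event({"anonymous_id": "anon-1"})                            # anonymous visit
graph.add_event({"anonymous_id": "anon-1", "email": "a@example.com"})  # email signup
graph.add_event({"email": "a@example.com", "device_id": "ios-9"})      # mobile purchase
graph.add_event({"anonymous_id": "anon-2"})                            # unrelated visitor

for profile in graph.profiles():
    print(sorted(profile))
```

The first three events end up in one profile because each shares an identifier with an earlier event, while `anon-2` stays separate. A production system would also persist the graph and attach event history to each profile, which this sketch leaves out.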
Deterministic vs Probabilistic Resolution
There are two fundamental approaches to identity resolution, and most platforms use a combination of both:
Deterministic matching links records using exact identifier matches. If two events share the same email address, user ID, or phone number, they are treated as the same person. This is highly accurate but only works when a known identifier is present.
Probabilistic matching uses statistical models to infer identity from signals like IP address, device type, browser fingerprint, and behavioral patterns. It's less certain but extends coverage to anonymous visitors who haven't identified themselves yet.
| | Deterministic | Probabilistic |
|---|---|---|
| How it works | Exact identifier match | Statistical modeling and inference |
| Identifiers used | Email, user ID, phone, login | IP address, device fingerprint, behavior |
| Accuracy | Very high (near 100%) | Moderate (70-90% typical) |
| Coverage | Limited to known users | Extends to anonymous visitors |
| Privacy impact | Lower risk — explicit identifiers | Higher risk — inferred identity |
| Best for | Cross-device logged-in users | Pre-login journey stitching |
The best identity resolution systems use deterministic matching as the foundation and layer probabilistic signals on top for broader coverage. Deterministic links anchor the identity graph, while probabilistic signals help connect anonymous sessions that are likely — but not certainly — the same person.
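As a rough sketch of that layering, the function below tries an exact match on known identifiers first and only falls back to a weighted score over soft signals when none is shared. The signal names, weights, and the 0.6 threshold are made-up assumptions for illustration; a real system would typically derive them from a trained model and labeled data.

```python
DETERMINISTIC_KEYS = ("user_id", "email", "phone")

# Hypothetical weights for soft signals; not taken from any real platform.
SIGNAL_WEIGHTS = {"ip": 0.4, "user_agent": 0.2, "timezone": 0.1}


def match(a, b, threshold=0.6):
    """Return (decision, score) for two records given as dicts of identifiers."""
    # Deterministic pass: any shared known identifier counts as an exact match.
    for key in DETERMINISTIC_KEYS:
        if a.get(key) and a.get(key) == b.get(key):
            return "deterministic", 1.0

    # Probabilistic pass: add up the weights of the soft signals that agree.
    score = sum(
        weight
        for signal, weight in SIGNAL_WEIGHTS.items()
        if a.get(signal) and a.get(signal) == b.get(signal)
    )
    score = round(score, 2)  # avoid float noise when comparing to the threshold
    return ("probabilistic" if score >= threshold else "no_match"), score
```

Keeping the two decisions distinct in the return value lets downstream code treat probabilistic links more cautiously, for example by excluding them from merges that cannot be undone.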
Why Identity Resolution Matters
Identity resolution is the difference between fragmented data and actionable customer intelligence. Here are the most impactful outcomes:
Accurate Customer Counts
Without identity resolution, one person using three devices looks like three separate users. Resolution collapses these into one, giving you real customer counts and accurate analytics.
Personalized Experiences
You can't personalize for someone you don't recognize. Identity resolution connects the dots so you can deliver relevant content, recommendations, and offers across every channel.
Accurate Attribution
Understand the full customer journey — from first anonymous touchpoint to conversion — instead of fragmented sessions that look like separate, unrelated visits.
Privacy-Compliant Targeting
Build audiences from unified first-party profiles instead of relying on third-party cookies. Identity resolution on your own data is the privacy-safe path to effective targeting.
Common Challenges
Identity resolution is one of the hardest problems in customer data. Here are the challenges that every implementation must address:
Cross-Device Fragmentation
Users move between phone, tablet, laptop, and work computer. Without a login event, linking these devices to one person requires probabilistic methods that trade accuracy for coverage.
Privacy Constraints
GDPR, CCPA, and browser privacy features (ITP, ETP) limit the identifiers you can collect and how long you can store them. Resolution must work within these constraints.
Shared Devices and Accounts
Household members sharing a browser, or employees using a shared login, can cause incorrect merges. Resolution systems need safeguards to prevent over-merging profiles.
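One possible safeguard, sketched here under the assumption that a user ID belongs to exactly one person, is to refuse any merge that would leave a profile with more than one distinct user ID. The `can_merge` function and the `user_ids` field are hypothetical names; real platforms expose their own merge-protection rules.

```python
def can_merge(profile_a, profile_b, max_user_ids=1):
    """Allow a merge only if the combined profile keeps at most `max_user_ids` logins."""
    combined = profile_a.get("user_ids", set()) | profile_b.get("user_ids", set())
    return len(combined) <= max_user_ids


# Two logged-in accounts seen on the same family laptop should stay separate.
print(can_merge({"user_ids": {"u1"}}, {"user_ids": {"u2"}}))  # False
# An anonymous profile can still be folded into a known one.
print(can_merge({"user_ids": set()}, {"user_ids": {"u1"}}))   # True
```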
Data Quality
Typos in email addresses, temporary phone numbers, and inconsistent formatting can prevent correct matches or cause false merges. Data normalization is a prerequisite for good resolution.
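A minimal normalization sketch is shown below. It only handles whitespace, letter case, and phone formatting; it does not correct typos. The function names are hypothetical, lowercasing the whole address is a common simplification rather than a rule of the email standard, and the 10-digit default for numbers without a country code is an assumption that would need adjusting per market.

```python
import re


def normalize_email(raw):
    """Trim and lowercase an email; return None if it has no plausible shape."""
    email = raw.strip().lower()
    # Deliberately loose check: one "@", a non-empty local part, a dot in the domain.
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        return None
    return email


def normalize_phone(raw, default_country_code="1"):
    """Reduce a phone number to "+" followed by digits, or None if unusable."""
    digits = re.sub(r"\D", "", raw)
    if not digits:
        return None
    if raw.strip().startswith("+"):
        return "+" + digits
    # Assumption: numbers without "+" are national 10-digit numbers.
    if len(digits) == 10:
        return "+" + default_country_code + digits
    return None
```

With this in place, `"  Jane.Doe@Example.COM "` and `"jane.doe@example.com"` resolve to the same key, so a deterministic match is not missed because of formatting alone.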
These challenges are why identity resolution is typically handled by a dedicated Customer Data Platform rather than built in-house. CDPs have spent years solving edge cases around merge logic, conflict resolution, and privacy compliance that would take an engineering team months to handle independently.
Identity Resolution in a CDP
Identity resolution is a core capability of any Customer Data Platform. It's the mechanism that transforms raw event data into unified customer profiles — the fundamental value proposition of a CDP.
How CDPs handle identity: When your SDK sends an event, the CDP extracts all identifiers, checks them against the identity graph, and either links the event to an existing profile or creates a new one. This happens in real time, so every downstream action — personalization, automation, analytics — works against the latest unified profile.
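The link-or-create decision can be sketched in a few lines. This assumes the identity graph is reduced to a plain dictionary from identifier to profile ID; `resolve`, `identifier_index`, and the `profile-N` ID format are hypothetical.

```python
import itertools

_profile_ids = itertools.count(1)
identifier_index = {}  # identifier -> profile ID (stand-in for the identity graph)


def resolve(event_identifiers):
    """Return the profile ID for an event, creating a profile if none matches."""
    known = [identifier_index[i] for i in event_identifiers if i in identifier_index]
    profile_id = known[0] if known else f"profile-{next(_profile_ids)}"
    # Record every new identifier so later events can find this profile.
    for identifier in event_identifiers:
        identifier_index.setdefault(identifier, profile_id)
    return profile_id
```

This sketch deliberately skips the harder case where one event carries identifiers that already belong to two different profiles. There it simply picks the first match, whereas a real system would have to decide whether to merge the two.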
The composable advantage: In a composable CDP, identity resolution runs directly on your data warehouse. This means the identity graph and unified profiles are accessible via SQL, custom ML models can query them directly, and you maintain full ownership of your identity data.
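To make the SQL access concrete, here is a small sketch that uses an in-memory SQLite database as a stand-in for the warehouse. The `identity_links` and `events` tables, their columns, and the sample rows are invented for illustration; actual schemas vary by platform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE identity_links (identifier TEXT PRIMARY KEY, profile_id TEXT);
    CREATE TABLE events (identifier TEXT, event_name TEXT, ts TEXT);
    INSERT INTO identity_links VALUES
        ('anon-1', 'p1'), ('a@example.com', 'p1'), ('anon-2', 'p2');
    INSERT INTO events VALUES
        ('anon-1', 'page_view', '2024-01-01'),
        ('a@example.com', 'signup', '2024-01-02'),
        ('a@example.com', 'purchase', '2024-01-03'),
        ('anon-2', 'page_view', '2024-01-03');
""")

# Roll events up to the unified profile by joining through the identity graph.
rows = conn.execute("""
    SELECT l.profile_id, COUNT(*) AS event_count, MIN(e.ts) AS first_seen
    FROM events AS e
    JOIN identity_links AS l ON l.identifier = e.identifier
    GROUP BY l.profile_id
    ORDER BY l.profile_id
""").fetchall()

for row in rows:
    print(row)
```

Because the graph is just a table, the same join can feed a BI dashboard or a feature pipeline for a custom model without exporting data from the platform.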