How-to guide · Updated 2026

How to deduplicate a LinkedIn lead list

Lead lists from different sources almost always have overlaps. Here's how to merge and dedupe them without losing data — or the most valuable row.

Step-by-step guide

To deduplicate a LinkedIn lead list from multiple CSV exports, dedupe by LinkedIn profile URL (the only truly unique identifier for a person across sources), not by email or name. When the same person appears twice, keep the row with more complete data — not the first one. Scrupp handles this automatically when you bulk-upload multiple CSVs: it normalizes LinkedIn URLs (stripping query params), merges rows on URL match, and keeps the "best" version of each field across duplicates.

The full walkthrough

6 steps — about 10-15 minutes end-to-end.

  1. Combine all CSV sources into one file

    Copy rows from every source CSV into a single sheet. Keep a "source" column to track where each row came from.

  2. Normalize LinkedIn URLs

    Strip the scheme (https://), query parameters (?miniProfileUrn=...), and trailing slashes, then convert to lowercase. Example: https://linkedin.com/in/john-doe/ → linkedin.com/in/john-doe

  3. Sort by LinkedIn URL

    Sort the combined sheet alphabetically by the normalized URL column. This groups duplicates together.

  4. Merge duplicate rows

    For each group of duplicates, create a single merged row. For each field, keep the non-empty value (or the longest/most complete). This preserves data across sources.

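  5. Remove the duplicate rows

    Keep only the merged row for each URL. In Google Sheets, use =UNIQUE(A:A) on the normalized URL column (assuming it is column A) to list each URL once, then look up the merged values for each one. In Excel, use Remove Duplicates on the URL column; it keeps the first row in each group, so make sure the merged row comes first.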

  6. Verify row count + re-enrich if needed

    Compare the final row count with the sum of the source CSV row counts. Deduping typically removes 15-30% of rows when merging 2-3 sources. Re-run enrichment on the merged list if emails are stale.
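
If you would rather script steps 3 to 6 than do them in a spreadsheet, here is a minimal Python sketch. It is an illustration, not how Scrupp or any other tool does it. It assumes every row already carries a normalized URL in a column called url_key (a hypothetical name), treats "longest non-empty value" as the definition of "most complete", and joins the source labels so attribution survives the merge.

```python
from collections import defaultdict


def merge_group(rows, source_field="source"):
    """Merge all rows for one person into a single row."""
    merged = {}
    for row in rows:
        for field, value in row.items():
            value = (value or "").strip()
            # A strictly longer value wins, so ties keep the value seen first.
            if field not in merged or len(value) > len(merged[field]):
                merged[field] = value
    # List every contributing source rather than picking one.
    sources = {(row.get(source_field) or "").strip() for row in rows}
    merged[source_field] = ";".join(sorted(sources - {""}))
    return merged


def dedupe(rows, key="url_key"):
    """Group rows by normalized URL and merge each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return [merge_group(group) for group in groups.values()]


# Made-up sample rows: John appears in two sources with different fields filled.
rows = [
    {"url_key": "linkedin.com/in/john-doe", "email": "john@example.com", "phone": "", "source": "sales_nav"},
    {"url_key": "linkedin.com/in/john-doe", "email": "", "phone": "555-0100", "source": "apollo"},
    {"url_key": "linkedin.com/in/jane-roe", "email": "jane@example.com", "phone": "", "source": "apollo"},
]
merged = dedupe(rows)
print(len(rows), "rows in,", len(merged), "rows out")
```

Grouping with a dictionary replaces the sort in step 3: rows with the same key land in the same bucket regardless of their order in the file. "Longest wins" is a rough proxy for completeness, so swap in your own rule for fields where a longer value is not necessarily a better one.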

Things that trip people up

Dedupe by LinkedIn URL, not email. Emails change when people change jobs. URLs don't (except for rare renames). Name-based dedupe is the worst because different people share names.

Keep a "source" column. You'll want to know which source contributed each lead for attribution later.

Merge, don't discard. If Source A has the email and Source B has the phone, the merged row should have both.

Don't dedupe before enrichment. Enrich each source separately first, then merge. This maximizes data coverage per person.

Common questions

What if two people have the same name and company?

They'll have different LinkedIn URLs — URL-based dedupe keeps them separate. This is why URL-based dedupe is correct and name-based dedupe is dangerous.
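
A tiny Python illustration of the difference, using two made-up people and a hypothetical url_key column holding the normalized URL:

```python
# Two different people who share a name and an employer (sample data).
people = [
    {"name": "Alex Kim", "company": "Acme", "url_key": "linkedin.com/in/alex-kim"},
    {"name": "Alex Kim", "company": "Acme", "url_key": "linkedin.com/in/alex-kim-2b41"},
]

# Name-based key: both rows collapse into one entry.
by_name = {(p["name"].lower(), p["company"].lower()) for p in people}

# URL-based key: each person keeps their own entry.
by_url = {p["url_key"] for p in people}

print(len(by_name), "unique by name,", len(by_url), "unique by URL")
```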

How do I merge rows in Google Sheets automatically?

Use an array formula with IFERROR(INDEX(MATCH)) to pick non-empty values across duplicates. Or use a pivot table with "First" or "Last" aggregation. Scrupp handles this in one click on upload.

What about people who changed companies between exports?

The LinkedIn URL stays the same, but the company field will differ. Keep the most recent export's version as authoritative.
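
One way to apply that rule in code, sketched in Python. It assumes each row records its export date in an ISO-formatted column named exported_on; that column name is hypothetical, so add or rename it to match your files.

```python
from datetime import date


def latest_value(rows, field, date_field="exported_on"):
    """Return the non-empty value of `field` from the most recent export."""
    filled = [row for row in rows if (row.get(field) or "").strip()]
    if not filled:
        return ""
    newest = max(filled, key=lambda row: date.fromisoformat(row[date_field]))
    return newest[field].strip()


# Same person in two exports, five months apart (sample data).
duplicates = [
    {"url_key": "linkedin.com/in/john-doe", "company": "Acme", "exported_on": "2025-01-10"},
    {"url_key": "linkedin.com/in/john-doe", "company": "Globex", "exported_on": "2025-06-02"},
]
current_company = latest_value(duplicates, "company")
print(current_company)
```

Rows where the field is empty are skipped, so a newer export with a blank company does not erase an older known value.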

Is there a tool to automate this?

Yes — bulk upload tools accept multiple CSVs and deduplicate + merge on LinkedIn URL automatically. No manual spreadsheet work needed.

How many duplicates should I expect?

When merging 2-3 sources (e.g. Sales Navigator export + Apollo export + manual research), expect 15-30% overlap. The more targeted your ICP, the higher the overlap because multiple tools find the same people.
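
To see where your own merge lands against that range, a quick Python check works. The row counts below are made up for illustration.

```python
def removed_share(source_counts, merged_count):
    """Fraction of combined rows that the dedupe removed."""
    total = sum(source_counts)
    if total == 0:
        raise ValueError("source_counts must contain at least one row")
    return (total - merged_count) / total


# Three sources with 400, 350, and 250 rows merged down to 780 unique people.
share = removed_share([400, 350, 250], 780)
in_typical_range = 0.15 <= share <= 0.30
print(f"{share:.0%} of rows removed; within the typical range: {in_typical_range}")
```

A share far outside the range is not an error in itself, but it is a good prompt to spot-check the normalized URLs for formatting variants that slipped through.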

Should I dedupe before or after enrichment?

After. Enrich each source separately first — this maximizes data coverage per person. Then merge the enriched CSVs. During merge, keep the version of each field with the most complete data (e.g. Source A has email, Source B has phone → merged row has both).

What about duplicates in my CRM?

Before importing, cross-check the deduped list against existing CRM contacts by email address. Most CRMs (HubSpot, Salesforce, Pipedrive) have built-in import deduplication — set the merge key to email and choose "update existing" to fill empty fields without overwriting.
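
If you want to run the cross-check yourself before touching the CRM importer, here is a minimal Python sketch. It assumes you have exported your CRM contacts' email addresses into a plain list and that leads carry an email column; it does not call any CRM API.

```python
def split_by_crm_match(leads, crm_emails):
    """Split leads into (new, existing) by case-insensitive email match."""
    known = {e.strip().lower() for e in crm_emails if e and e.strip()}
    new, existing = [], []
    for lead in leads:
        email = (lead.get("email") or "").strip().lower()
        # Leads without an email cannot be matched, so they count as new.
        if email and email in known:
            existing.append(lead)
        else:
            new.append(lead)
    return new, existing


# Sample data: one lead is already in the CRM under different casing.
leads = [
    {"name": "John Doe", "email": "John@Example.com"},
    {"name": "Jane Roe", "email": "jane@example.com"},
    {"name": "Sam Poe", "email": ""},
]
crm_emails = ["john@example.com", "old.contact@example.com"]
new_leads, existing_leads = split_by_crm_match(leads, crm_emails)
print(len(new_leads), "new,", len(existing_leads), "already in CRM")
```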

Can I dedupe across different LinkedIn URL formats?

Yes — normalize URLs first by stripping query parameters, trailing slashes, and converting to lowercase. linkedin.com/in/John-Doe/ and linkedin.com/in/john-doe?miniProfileUrn=abc become the same key after normalization.
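
A minimal Python sketch of that normalization, using only the standard library. Dropping the scheme follows the example in step 2; dropping a leading "www." is an extra assumption on my part, and country subdomains (such as uk.linkedin.com) are left untouched here.

```python
from urllib.parse import urlsplit


def normalize_linkedin_url(raw):
    """Return a lowercase host+path key without scheme, query, or trailing slash."""
    text = raw.strip().lower()
    # urlsplit only recognizes the host when a scheme is present.
    if "://" not in text:
        text = "https://" + text
    parts = urlsplit(text)
    host = parts.netloc
    if host.startswith("www."):  # assumption: treat www and non-www as the same
        host = host[len("www."):]
    # parts.path excludes the query string and fragment.
    return host + parts.path.rstrip("/")


key_a = normalize_linkedin_url("linkedin.com/in/John-Doe/")
key_b = normalize_linkedin_url("linkedin.com/in/john-doe?miniProfileUrn=abc")
print(key_a == key_b)
```

Store the result in its own column and keep the original URL alongside it, so you can still open the profile link exactly as it was exported.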
