
Meta and Google custom audiences from Instagram-extracted lead lists

How to upload Instagram-extracted contacts into Meta Custom Audiences and Google Customer Match — match rates, compliance, refresh cadence, and pitfalls.

Teseo Calvente, Head of Growth Research, Scraphex

Published May 19, 2026 · Updated May 19, 2026 · 12 min read

Most teams that commission an Instagram-extracted lead list put it to work in exactly one channel: cold email. The CSV gets imported into an outreach tool, sequences fire, replies come in, and the file sits on a shared drive until the next campaign cycle. That is leaving 30–60% of the channel’s economic value on the table. The same list, uploaded as a custom audience inside Meta Ads and Google Ads, gives the same contacts a second and third touch — paid impressions on Instagram and Facebook, paid search retargeting on Google — for a fraction of the cost per contacted account that a fresh prospecting campaign would carry.

This post is the operational guide to making that work. It covers what you can actually upload to each platform, the match rates to expect when the source is Instagram bios rather than a CRM, the compliance posture EU advertisers need to keep clean, the refresh cadence that keeps audiences performing past month one, and the three-channel choreography that turns a single Scraphex deliverable into a coordinated outbound + retargeting + search system.

Why paid retargeting is the natural second step after cold email

Cold email reaches a recipient’s inbox. Whether the recipient sees it depends on inbox prioritisation, time of day, and whether the recipient happens to be reviewing their inbox in the 72 hours the message is fresh. For lists sourced from Instagram, the open-window economics typically look like this:

35–55% open rate on the first send (assuming the deliverability discipline in our deliverability playbook is in place)
18–35% open rate on the second touch
1.5–4% reply rate across the full sequence

The remaining 60–80% of the list never visibly engages with the email itself — but they have a real, mappable identity. Their email address resolves to an account inside Meta and, for a smaller subset, inside Google. Uploading the list as a custom audience puts paid impressions in front of those exact contacts on Instagram, Facebook, Messenger, the Audience Network, and (separately) on Google Search, YouTube, and Display. The contact does not need to open the email for the paid layer to function. That is what makes the two channels complementary rather than substitutes.

The numbers below are what we see across well-run campaigns. They are not promises — they are the working range when the data hygiene and compliance posture are right.

What you can actually upload to each platform

Both Meta and Google accept hashed contact data and use it to match against accounts on their respective platforms. The fields, format, and behaviour differ in ways that matter when the source is an Instagram extraction.

Meta Custom Audiences

Meta accepts the following identifiers from a customer file:

Identifier	Match value	Notes
Email	High match value	Primary identifier for Instagram-sourced contacts
Phone number	High match value	Adds 20–35% to match rate when available
First name + Last name	Modest	Helps disambiguate near-matches
City + Country	Modest	Used for tie-breaking
Date of birth	Marginal	Almost never present in Instagram extractions
Mobile advertiser ID (MAID)	Strong but rarely available	Not derivable from Instagram extractions

Meta’s upload tool hashes the identifiers client-side (SHA-256) before they are sent. You can submit either a pre-hashed or a plaintext file; hashing the file yourself is the cleaner posture for a list that leaves your laptop. The audience must contain at least 100 matched users before it can be used for targeting.
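A minimal sketch of pre-hashing a row before it leaves your machine, assuming the usual normalisation of trimmed, lowercased emails and digits-only phone numbers with a country code. The function names and the default country code of 34 (Spain) are illustrative, not part of either platform's tooling. The platforms differ on formatting details such as whether the leading + is kept, so check each one's current specification before uploading.

```python
import hashlib
import re


def sha256_hex(value: str) -> str:
    """Return the lowercase hex SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def normalize_email(raw: str) -> str:
    """Trim surrounding whitespace and lowercase the address."""
    return raw.strip().lower()


def normalize_phone(raw: str, default_country_code: str = "34") -> str:
    """Keep digits only; prepend a country code if the number lacks one.

    Assumption: numbers written with a leading "+" or "00" already carry
    a country code; anything else is treated as a national number.
    """
    stripped = raw.strip()
    digits = re.sub(r"\D", "", stripped)
    if stripped.startswith("+"):
        return digits
    if digits.startswith("00"):
        return digits[2:]
    return default_country_code + digits


def hash_row(email: str, phone: str = "") -> dict:
    """Hash only the identifiers that are actually present in the row."""
    hashed = {}
    if email.strip():
        hashed["email_sha256"] = sha256_hex(normalize_email(email))
    if phone.strip():
        hashed["phone_sha256"] = sha256_hex(normalize_phone(phone))
    return hashed
```

Normalising before hashing matters because SHA-256 is exact: an address with a trailing space or a capital letter produces a different digest and will not match.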

For a Scraphex-style Instagram-extracted list (email + sometimes phone + bio-derived first name), the typical Meta match rate is 25–45% of the uploaded rows. That sounds modest until you realise that the matched subset is, by definition, the segment that is actually findable inside Meta. The unmatched rows are largely either non-Meta accounts or accounts that registered with a different email than the one they published on their Instagram profile.

Google Customer Match

Google’s Customer Match has a different identifier set:

Identifier	Role or requirement	Match contribution
Email	Primary	Drives the bulk of matches
Phone number	Strong secondary	Adds 15–25% to match rate
Mailing address (full)	Optional	Marginal lift, rarely available
First/last name	Required when uploading address	—

Google requires a minimum of 1,000 matched users before the audience is eligible to serve ads on Search, YouTube, Gmail, and Discover. Display campaigns have similar minimums. For Instagram-extracted lists, the Customer Match rate is usually lower than Meta’s, typically 18–30%, because Google identity is anchored to Gmail addresses, and a portion of Instagram-published emails are non-Gmail addresses (.es business domains, ProtonMail, Outlook) that Google can only match if the same user has linked them to their Google account.

The upshot: if your list is 6,000 rows, expect a Meta custom audience of 1,500–2,700 matched users (25–45%) and a Google Customer Match audience of roughly 1,080–1,800 (18–30%). The remainder is unmatched, which is normal.
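The sizing arithmetic can be sketched in a few lines. This is only a back-of-envelope check using the match-rate ranges and platform minimums quoted in this section; the function names are illustrative.

```python
def matched_range(rows: int, low_rate: float, high_rate: float) -> tuple[int, int]:
    """Expected matched-user range for a list of `rows` uploaded contacts."""
    return round(rows * low_rate), round(rows * high_rate)


def clears_minimum(matched_low: int, minimum: int) -> bool:
    """True when even the pessimistic estimate clears the platform minimum."""
    return matched_low >= minimum


ROWS = 6_000
meta_low, meta_high = matched_range(ROWS, 0.25, 0.45)      # Meta: 25-45%
google_low, google_high = matched_range(ROWS, 0.18, 0.30)  # Google: 18-30%

print(f"Meta:   {meta_low}-{meta_high}, clears 100: {clears_minimum(meta_low, 100)}")
print(f"Google: {google_low}-{google_high}, clears 1,000: {clears_minimum(google_low, 1_000)}")
```

Running the same check before ordering a list is useful on the Google side: at an 18% match rate, a list needs about 5,600 rows to clear the 1,000-user minimum.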

What you cannot do

A small but recurring trap: Meta’s and Google’s terms of service prohibit uploading lists you do not have a defensible lawful basis to contact. Both platforms have audited customer file workflows over the past two years. Lists obtained without any lawful basis (purchased dumps, scraping outputs with no targeting logic, undocumented bulk acquisitions) risk having the audience disabled on review. The defensible posture is the same as for cold email: documented lawful basis (legitimate interest, properly balanced), a working opt-out, and a privacy notice. We walk through the LIA pattern for Instagram-sourced contacts in our legal post on cold email and Instagram leads.

Match rate in detail: why Instagram lists behave the way they do

Instagram-extracted contacts have a specific match profile because of how the email got into the bio in the first place.

Personal Gmail addresses match well in both platforms. A founder who put their personal Gmail address in the bio because that is also their work email matches at 75–90% inside Meta and almost always inside Google.

Business contact emails match unevenly. A generic contact address on a company domain will match in Meta only if the account holder has linked that email to a Facebook or Instagram account (often they have, because they created the Instagram business profile with it). The match rate drops to 35–55% for these rows.

ProtonMail, Tuta, and other privacy-focused providers match poorly. Users on these providers actively avoid being trackable. Meta match rate falls to 10–20%, Google match rate to near zero.

Country-code domain emails match best when paired with phone numbers. A Spanish .es business email alone will match modestly in Meta. The same row with a +34 phone number added has a significantly higher match probability because Meta uses phone as a strong secondary identifier.

The practical implication for list buyers: ask the data provider whether phone numbers are included alongside emails. A list with both runs much hotter in paid audiences than a list with email only. Scraphex deliverables include phone where it is publicly available in the profile or bio — typically 35–60% of rows depending on niche.
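To get a rough feel for how a specific list will behave before uploading it, the per-type Meta ranges above can be applied row by row. The sketch below is deliberately crude: the domain sets are illustrative and incomplete, and it assumes that any address outside them counts as a business contact email, which will misclassify personal Outlook or Yahoo addresses.

```python
GMAIL_DOMAINS = {"gmail.com", "googlemail.com"}
# Illustrative, incomplete set of privacy-focused providers.
PRIVACY_DOMAINS = {"protonmail.com", "proton.me", "tuta.com", "tutanota.com"}

# Meta match-rate ranges quoted in this section, per email type.
META_MATCH_RANGE = {
    "personal_gmail": (0.75, 0.90),
    "business": (0.35, 0.55),
    "privacy": (0.10, 0.20),
}


def classify_email(email: str) -> str:
    """Bucket an address by its domain (assumption: non-listed means business)."""
    domain = email.strip().lower().rpartition("@")[2]
    if domain in GMAIL_DOMAINS:
        return "personal_gmail"
    if domain in PRIVACY_DOMAINS:
        return "privacy"
    return "business"


def expected_meta_matches(emails: list[str]) -> tuple[float, float]:
    """Sum the low and high match probabilities across all rows."""
    low = high = 0.0
    for email in emails:
        low_rate, high_rate = META_MATCH_RANGE[classify_email(email)]
        low += low_rate
        high += high_rate
    return low, high
```

Dividing the two totals by the number of rows gives a blended range for the list, which is a quick way to see how the provider mix in a niche shifts the expected audience size.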

Lookalikes and similar audiences: when they are worth running

Once a matched custom audience exists, both platforms let you create lookalike (Meta) or similar-audience (Google) extrapolations. These find users on the platform who resemble the matched seed.

For Instagram-extracted seeds, lookalikes work surprisingly well, in many cases better than from a CRM contact list. The reason is that the seed is, by construction, behaviourally homogeneous: every member of the seed was a follower of a specific competitor, an engager on a specific niche post, or an account with a specific bio keyword pattern. That is a tighter behavioural signal than “everyone who bought from us in the last year”, which is the typical CRM seed.

Practical sizing for lookalikes off Instagram-extracted seeds:

Meta Lookalike 1% in the source country: usually the strongest, captures roughly 200k accounts in Spain
Meta Lookalike 2–3%: useful for prospecting at scale once 1% saturates
Meta Advantage Lookalike (post-2024, machine-learning-driven): often outperforms classic 1% when seed quality is high
Google Similar Audiences: deprecated for most placements in 2023; Google’s optimised-targeting and audience-signals replace it. Upload the custom match list as an audience signal for Performance Max or Demand Gen campaigns instead.

The audience-signal-into-Performance-Max workflow is the Google equivalent of a Meta lookalike in 2026. Treat the matched Customer Match list as a seed for the algorithm, not as the targeting itself.

Compliance posture for EU advertisers

The compliance bar for paid retargeting from cold-sourced lists is the same one as for cold email, plus one extra step. Five elements need to be in place:

Documented lawful basis for processing the contact data. Legitimate interest with a written balancing test (LIA) is the workable pattern for B2B Instagram-extracted contacts. See the linked legal post for the template.
Privacy notice accessible from any landing page the ads point to, naming the categories of data processed and the lawful basis.
Opt-out mechanism that works across both email and paid channels. A user who unsubscribes from the email sequence must be removed from the custom audience in Meta and Google at the same time.
Cross-channel suppression list maintained as a single source of truth. If you suppress in the cold email tool but not in Meta, the user keeps seeing ads after they asked to leave. That is the most common compliance failure for hybrid email + paid campaigns.
Hashing and minimal data submission. Submit only the fields the platform actually uses for matching. Do not submit notes fields, bio snippets, or anything else that is not required for the match. Both Meta and Google document the field set; stick to it.
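The suppression and data-minimisation elements can be enforced in one step when the upload file is built. The sketch below assumes rows are dictionaries keyed by column name, that email is the key shared by the suppression list and the deliverable, and that the allowed field names are the ones shown; all three are assumptions, so adapt them to your own schema and to the field set each platform documents.

```python
# Fields kept for matching; everything else (bio, notes, ...) is dropped.
ALLOWED_FIELDS = ("email", "phone", "first_name", "last_name", "city", "country")


def build_upload(rows: list[dict], suppressed_emails: set[str]) -> list[dict]:
    """Drop suppressed contacts and strip every field not needed for matching."""
    suppressed = {e.strip().lower() for e in suppressed_emails}
    upload = []
    for row in rows:
        email = (row.get("email") or "").strip().lower()
        # Rows without an email cannot be checked against the suppression
        # list in this sketch, so they are skipped rather than uploaded.
        if not email or email in suppressed:
            continue
        cleaned = {key: row[key] for key in ALLOWED_FIELDS if row.get(key)}
        cleaned["email"] = email
        upload.append(cleaned)
    return upload
```

Feeding the email tool and both ad platforms from the same `build_upload` output is one way to keep the suppression list a single source of truth, because no channel ever receives a file that bypassed it.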

For accounts where the audience is a list of EU recipients, the LIA should explicitly cover the secondary purpose of paid retargeting, not only the email use case. A single LIA that says “contact data will be used for email outreach” and is silent on paid audiences will not defend the paid use if challenged. The two purposes are technically distinct under GDPR Art. 6(1)(f) and should be named separately.

Refresh cadence: keeping the audience working past week three

Instagram-extracted custom audiences degrade in three ways across the first 90 days.

Match decay. Meta and Google periodically re-evaluate the user-to-identifier link. Accounts that change their primary email, deactivate, or churn from the platform drop out of the matched audience. Expect 10–20% match decay over 90 days even without doing anything.

Frequency fatigue. A matched audience of 1,800 served at 1.5 impressions per user per day exhausts attention quickly. Cap frequency at 1.0/day for cold audiences; 0.6/day is even safer for the first 30 days while creative is being tested.

Suppression growth. As the email sequence runs, opt-outs accumulate. Those names should be removed from the paid audience within a working day. If suppression sync runs weekly, you are over-serving for up to seven days per cycle.
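The first two effects are easy to put numbers on. The sketch below uses the figures from this section (an 1,800-user audience, 10–20% decay over 90 days, caps of 1.5, 1.0, and 0.6 impressions per user per day); the helper names are illustrative.

```python
def audience_after_decay(matched: int, decay_rate: float) -> int:
    """Matched users left after a period with the given total decay."""
    return round(matched * (1 - decay_rate))


def daily_impression_ceiling(matched: int, cap_per_user_per_day: float) -> int:
    """Maximum daily impressions implied by a per-user frequency cap."""
    return round(matched * cap_per_user_per_day)


MATCHED = 1_800

# Audience size after 90 days at the low and high ends of the decay range.
print(audience_after_decay(MATCHED, 0.10), audience_after_decay(MATCHED, 0.20))

# Daily impression ceiling at each frequency cap discussed above.
for cap in (1.5, 1.0, 0.6):
    print(cap, daily_impression_ceiling(MATCHED, cap))
```

Moving from 1.5 to 0.6 impressions per user per day cuts the daily ceiling from 2,700 to 1,080 impressions on this audience, which is also a useful sanity check on whether the daily budget is larger than the audience can absorb.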

The cadence we recommend to clients running this hybrid model:

Weekly: refresh the custom audience with the latest suppression sync, append any new opt-outs to the platform’s exclusion audience, and review frequency.
Monthly: upload an incremental delta from a fresh extraction (new followers of the same competitors, new accounts on the same hashtags) — this is the same monthly refresh pattern we describe in the agency competitor conquest playbook.
Quarterly: rebuild the lookalike seed if the seed has churned more than 25%.
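The quarterly check can be automated if you keep a snapshot of the seed at upload time. The sketch below assumes churn means the share of the original seed that is no longer present in the current list (through opt-outs, suppression, or removal); that definition is our reading, since the 25% threshold above does not specify one.

```python
def seed_churn(original_seed: set[str], current_seed: set[str]) -> float:
    """Share of the original seed that is missing from the current seed."""
    if not original_seed:
        return 0.0
    lost = original_seed - current_seed
    return len(lost) / len(original_seed)


def needs_rebuild(
    original_seed: set[str],
    current_seed: set[str],
    threshold: float = 0.25,
) -> bool:
    """True when churn exceeds the rebuild threshold (25% by default)."""
    return seed_churn(original_seed, current_seed) > threshold
```

Members added by the monthly delta do not count against churn here; only losses from the original snapshot do.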

Audiences that get this maintenance perform two to three times longer than audiences uploaded once and left to drift.

The three-channel choreography

The point of running email + Meta + Google in parallel is not redundancy. It is sequence. Each channel has a different role, and the order matters.

A practical 21-day choreography on a Scraphex-delivered list of 6,000 contacts:

Day 0–1: Upload the list to Meta and Google. Audiences process for 6–24 hours.

Day 2: Send cold email batch 1 to the top-quality 600-row segment. Open rate begins to register.

Day 3: Meta and Google start serving ads to the full matched audience (typically 1,500–2,700 in Meta, roughly 1,080–1,800 in Google). Creative leads with the same value proposition as the email but reframed for a paid context. The recipient now sees the brand in two places, which increases the reply rate on the email’s second touch.

Day 5: Cold email batch 2 to the middle segment. By this point the highest-engagement recipients from the paid layer are starting to click through to the landing page.

Day 10: Cold email third touch to the top segment. Paid audiences begin to show fatigue; frequency cap holds.

Day 14: Suppression sync — remove repliers and opt-outs from both paid audiences. Re-evaluate creative.

Day 21: First weekly refresh complete. Performance reporting compares email-only contribution, paid-only contribution, and the assisted-conversion bucket (saw paid and replied to email, or vice versa). The assist bucket is typically 30–55% of total replies — and it is the bucket that justifies running both channels.
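The Day 21 report comes down to sorting each converted contact into one of three buckets. The sketch below assumes you can build two sets of contact identifiers: those with a recorded email touch (for example an open or reply) and those with a recorded paid touch (for example an ad click identified on the landing page). Ad platforms do not generally expose per-user impression data, so how the paid set is built is an assumption you will need to adapt.

```python
def attribution_buckets(
    converted: set[str],
    email_touched: set[str],
    paid_touched: set[str],
) -> dict[str, int]:
    """Count conversions by which channels touched the contact."""
    buckets = {"email_only": 0, "paid_only": 0, "assisted": 0, "unattributed": 0}
    for contact in converted:
        by_email = contact in email_touched
        by_paid = contact in paid_touched
        if by_email and by_paid:
            buckets["assisted"] += 1
        elif by_email:
            buckets["email_only"] += 1
        elif by_paid:
            buckets["paid_only"] += 1
        else:
            buckets["unattributed"] += 1
    return buckets


def assist_share(buckets: dict[str, int]) -> float:
    """Assisted conversions as a share of all conversions."""
    total = sum(buckets.values())
    return buckets["assisted"] / total if total else 0.0
```

The unattributed bucket is kept separate on purpose: folding it into either channel would inflate that channel's contribution.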

A team running this workflow on its own without the data layer pre-built spends roughly 60% of cycle time on file hygiene and audience refresh. With a Scraphex-style delivery — pre-filtered, pre-enriched, with the suppression-friendly schema and the phone column where available — the same workflow runs in less than a third of that time.

Where the data layer makes or breaks the result

Three structural properties of the source list determine whether the paid layer is worth running at all.

Phone coverage. A list with 35–60% phone coverage produces Meta match rates 20–35% higher than a list with email only. If the data provider does not return phone, ask why; for Instagram public-profile sources, phone is recoverable from a meaningful fraction of accounts and should not be silently dropped from the deliverable.

Source signal preserved as a column. The source_competitor or source_hashtag column on a Scraphex deliverable doubles as the segmentation key inside Meta and Google. Building separate custom audiences per source signal — one audience per competitor seed, one per hashtag — lets you tune creative and frequency per segment rather than blending behaviour into a single blob. The hashtag and location targeting post covers how those source signals get attached at extraction time.

Country segmentation. Spain-only audiences serve at very different CPMs than Spain + Italy + Germany blended audiences. Split the upload by country before audience creation. Most teams forget this on the first run and overpay 25–40% on impressions for the first 30 days.
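Splitting by country and source signal before upload can be done with a simple grouping pass. The sketch below uses the source_competitor and source_hashtag columns mentioned above; the country column name, the fallback labels, and the audience naming scheme are assumptions for illustration.

```python
from collections import defaultdict


def split_audiences(rows: list[dict]) -> dict[tuple[str, str], list[dict]]:
    """Group rows into one bucket per (country, source signal) pair."""
    groups: dict[tuple[str, str], list[dict]] = defaultdict(list)
    for row in rows:
        country = (row.get("country") or "").strip().upper() or "UNKNOWN"
        source = (
            row.get("source_competitor")
            or row.get("source_hashtag")
            or "unknown_source"
        )
        groups[(country, source)].append(row)
    return dict(groups)


def audience_name(country: str, source: str) -> str:
    """Hypothetical naming scheme for the resulting custom audiences."""
    return f"ig_{country.lower()}_{source}"
```

Small buckets may fall below the platform minimums once matched, so it is worth checking each group's row count against the expected match rate before creating a separate audience for it.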

A practical sample if you want to test the workflow

An email-only export is a 60% asset for this workflow. A list with email, phone where available, source-signal column, country code, and the bio-derived notes column is a 100% asset.

If you want to see the column structure that supports the email + paid choreography end to end, request a free sample. We will deliver a 50-row segment from a niche of your choice with the full column set, including phone where public and source-signal attribution, so you can run the matching test inside your own Meta and Google accounts before committing to a full campaign. The match-rate result on the sample is usually within 5 percentage points of the match rate on the full extraction, which makes it a clean sanity check before scaling.