D2C / E-Commerce | Python • Pandas • Matplotlib | 2,847 Customer Journeys Analyzed

Marketing Attribution Framework

Built a multi-touch attribution model across 2,847 customer journeys. It showed that last-click attribution undervalued Instagram by 3.75×, and it projected a 2.3× ROAS improvement from data-driven budget reallocation.

2,847
Journeys Analyzed
2.3×
Projected ROAS Uplift
4
Attribution Models Compared
3.75×
Instagram Undervaluation
Python (Pandas) • Matplotlib / Seaborn • Google Colab • CSV data pipeline • U-Shaped Attribution • SQL • Excel
Last-Click Attribution Was Lying to the Business

The D2C brand was spending across 4 channels but using last-click attribution by default — the same broken model most brands inherit from Google Analytics. The result: Google Ads was getting 71% of the credit and 60% of the budget, while Instagram — the channel that discovered 65% of customers — was getting 8% credit and 15% of the budget.

Typical Customer Journey — What Actually Happens

Instagram Ad (Discovery) → Facebook Retarget (Nurture) → Email Reminder (Nudge) → Google Search (Converts)
Last-click says: Google gets 100% of the credit for this sale
Reality: Instagram discovered them. Facebook kept them warm. Email nudged. Google closed.
71%
Credit Google received under last-click — while only closing, not introducing customers
65%
Of customers first discovered the brand via Instagram — yet Instagram got only 8% of credit
2.9d
Average time from first touch to purchase, across an average of 1.9 touchpoints per journey
Journey Analysis & Attribution Charts

Three Python outputs from the analysis notebook. The journey analysis chart is the most important: it shows the first-touch vs last-touch distribution across channels, which exposes the gap between where customers come from and where they convert.

Channel Performance Analysis — Journey Analysis
Budget Reallocation: Current vs Proposed
ROAS Comparison Across Attribution Models
Attribution Model | ROAS | Monthly Revenue
Last Click (current) | 1.8× | $36,000
First Click | 2.4× | $48,000
Linear | 3.2× | $64,000
U-Shaped (recommended) | 4.1× | $82,000
Projected using per-channel conversion contribution weights derived from U-shaped attribution applied to 2,847 observed journeys. Monthly revenue estimated at $20k ad spend baseline.
Note on attribution_comparison.png: This chart rendered with empty axes due to a matplotlib data-binding issue in the notebook — the underlying CSV data (attribution_by_model.csv) is intact and the numbers are correct. The budget reallocation and journey charts fully represent the analysis output.
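The revenue column in the table above is ROAS multiplied by the $20k monthly spend baseline. A minimal sketch of that arithmetic, using only the figures from the table (the variable names are illustrative and not taken from the notebook):

```python
# Monthly ad spend baseline stated in the table note.
MONTHLY_SPEND = 20_000

# ROAS per attribution model, copied from the comparison table.
roas_by_model = {
    "Last Click": 1.8,
    "First Click": 2.4,
    "Linear": 3.2,
    "U-Shaped": 4.1,
}

# Revenue = ROAS x spend; round to whole dollars to avoid float noise.
revenue_by_model = {
    model: round(roas * MONTHLY_SPEND) for model, roas in roas_by_model.items()
}

# Relative uplift of the recommended model over the current one.
uplift = roas_by_model["U-Shaped"] / roas_by_model["Last Click"]

for model, revenue in revenue_by_model.items():
    print(f"{model:<12} {roas_by_model[model]:.1f}x  ${revenue:,}")
print(f"U-Shaped vs Last Click: {uplift:.2f}x")
```

The ratio works out to roughly 2.28, which rounds to the 2.3× headline figure.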
Four Attribution Models — One Clear Winner

Each model distributes credit differently across the customer journey. The key was finding which model matched the actual behavior shown in the journey data.

Model | How Credit Is Distributed | Problem
Last Click | 100% to final touchpoint | Ignores discovery entirely; systematically overfunds Google
First Click | 100% to first touchpoint | Ignores the close; would overfund Instagram instead
Linear | Equal split across all touches | Overvalues minor mid-funnel touches; blurs signal
U-Shaped (Position-Based) | 40% first touch, 40% last touch, 20% middle | ✓ Matches this brand's discovery → nurture → close journey
Why U-Shaped won: The journey analysis confirmed an average of 1.9 touchpoints with a clear discovery-then-convert pattern. U-shaped attribution rewards both the channel that introduced the customer and the channel that closed them — which matches the actual behavior in the data.
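To make the differences concrete, here is a minimal sketch of the three baseline models applied to the example journey from the diagram earlier on this page. The function names and the list-of-channel-names representation are my assumptions for illustration, not code lifted from the notebook:

```python
from collections import defaultdict


def last_click(journey):
    """All credit to the final touchpoint."""
    return {journey[-1]: 1.0}


def first_click(journey):
    """All credit to the first touchpoint."""
    return {journey[0]: 1.0}


def linear(journey):
    """Equal credit to every touch; repeated channels accumulate."""
    credits = defaultdict(float)
    for channel in journey:
        credits[channel] += 1.0 / len(journey)
    return dict(credits)


# Instagram discovers, Facebook nurtures, Email nudges, Google closes.
journey = ["Instagram", "Facebook", "Email", "Google"]

for model in (last_click, first_click, linear):
    print(model.__name__, model(journey))
```

Last click hands everything to Google, first click hands everything to Instagram, and linear gives each of the four touches a quarter.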
What Each Channel Actually Does
📸
Discovery
Instagram
First touch: 65% of journeys
Last touch: only 13%
🔵
Nurture / Retarget
Facebook
Middle touch: 52% of journeys
Last touch: 32%
✉️
Re-engagement
Email
Recovered 38% of abandoned carts
Last touch: 21%
🔍
Closer
Google Search
Last touch: 31% of journeys
But rarely discovers
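Role figures like these come from counting which channel opens and which channel closes each journey. A minimal sketch of that count in Pandas, assuming each row holds a list of channel names in a `touchpoint_sequence` column; the four journeys below are made up for illustration and are not the real data:

```python
import pandas as pd

# Hypothetical journeys for illustration; the real analysis covers 2,847 rows.
df = pd.DataFrame({
    "touchpoint_sequence": [
        ["Instagram", "Facebook", "Google"],
        ["Instagram", "Google"],
        ["Facebook", "Email"],
        ["Instagram", "Email", "Google"],
    ]
})

# First and last channel of each journey.
first_touch = df["touchpoint_sequence"].apply(lambda j: j[0])
last_touch = df["touchpoint_sequence"].apply(lambda j: j[-1])

# Share of journeys each channel opens and closes (0 where it never does).
roles = pd.DataFrame({
    "first_touch_share": first_touch.value_counts(normalize=True),
    "last_touch_share": last_touch.value_counts(normalize=True),
}).fillna(0.0)

print(roles)
```

A channel with a high first-touch share and a low last-touch share is acting as a discovery channel; the reverse pattern marks a closer.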
02_attribution_analysis.ipynb — U-shaped model
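import pandas as pd

def u_shaped_attribution(journey):
    """40% first, 40% last, 20% split evenly across middle touches."""
    n = len(journey)
    if n == 0:
        return {}   # nothing to credit
    if n == 1:
        return {journey[0]: 1.0}

    # Accumulate with += so a channel that appears more than once
    # (e.g. Instagram -> Instagram) keeps all of its credit.
    credits = {ch: 0.0 for ch in journey}
    if n == 2:
        credits[journey[0]] += 0.5
        credits[journey[1]] += 0.5
        return credits

    credits[journey[0]]  += 0.40   # first touch
    credits[journey[-1]] += 0.40   # last touch
    middle_credit = 0.20 / (n - 2)
    for ch in journey[1:-1]:
        credits[ch] += middle_credit
    return credits

# Apply to all 2,847 journeys.
# Assumes df was loaded earlier in the notebook and that
# touchpoint_sequence holds a list of channel names per journey.
df['u_shaped_credits'] = df['touchpoint_sequence'].apply(u_shaped_attribution)
attribution_df = pd.DataFrame(df['u_shaped_credits'].tolist()).fillna(0).sum()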
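One sanity check worth running on any position-based model: every journey should hand out exactly 1.0 credit, so the summed credits should equal the number of journeys. A minimal sketch, with a compact restatement of the U-shaped rule and a few made-up journeys (the names `position_based` and `sample_journeys` are illustrative and not from the notebook):

```python
import math
from collections import defaultdict


def position_based(journey, first=0.40, last=0.40):
    """U-shaped credit: `first` and `last` to the ends, remainder to the middle."""
    credits = defaultdict(float)
    n = len(journey)
    if n == 1:
        credits[journey[0]] += 1.0
    elif n == 2:
        credits[journey[0]] += 0.5
        credits[journey[1]] += 0.5
    elif n > 2:
        credits[journey[0]] += first
        credits[journey[-1]] += last
        for channel in journey[1:-1]:
            credits[channel] += (1.0 - first - last) / (n - 2)
    return dict(credits)


# Made-up journeys, including a repeated channel and a single-touch journey.
sample_journeys = [
    ["Instagram", "Facebook", "Email", "Google"],
    ["Instagram", "Instagram"],
    ["Google"],
]

totals = defaultdict(float)
for journey in sample_journeys:
    credits = position_based(journey)
    # Each journey must distribute exactly one conversion's worth of credit.
    assert math.isclose(sum(credits.values()), 1.0)
    for channel, credit in credits.items():
        totals[channel] += credit

# Normalize to each channel's share of total credit.
shares = {ch: credit / len(sample_journeys) for ch, credit in totals.items()}
print(shares)
```

If the per-journey assertion ever fails, credit is being lost or double-counted somewhere, and the channel totals can no longer be compared fairly.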
Budget Reallocation — Same Spend, Better Allocation

Under last-click, Google was receiving $12k/month (60%) while Instagram received $3k (15%). Aligning budget to U-shaped attribution credit narrows that gap to $7k (35%) for Google and $6k (30%) for Instagram: same $20k total, very different projected ROI.

Channel | Last-Click Budget | Proposed Budget | Change | Why
Instagram | $3,000 (15%) | $6,000 (30%) | +100% ↑ | Discovers 65% of customers; was starved
Facebook | $4,000 (20%) | $5,000 (25%) | +25% ↑ | Keeps 52% of prospects engaged
Email | $1,000 (5%) | $2,000 (10%) | +100% ↑ | Highest ROI channel; was nearly defunded
Google Ads | $12,000 (60%) | $7,000 (35%) | -42% ↓ | Still the closer, but was massively overfunded
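The Change column and the "same spend" claim are easy to verify. A short sketch using only the dollar figures from the table (the dictionary names are illustrative):

```python
# Monthly budgets in dollars, copied from the reallocation table.
current = {"Instagram": 3_000, "Facebook": 4_000, "Email": 1_000, "Google Ads": 12_000}
proposed = {"Instagram": 6_000, "Facebook": 5_000, "Email": 2_000, "Google Ads": 7_000}

# The reallocation is budget-neutral: both splits total the same spend.
total = sum(current.values())
assert total == sum(proposed.values()) == 20_000

# Percentage change per channel, rounded to whole percent.
change_pct = {
    ch: round((proposed[ch] - current[ch]) / current[ch] * 100) for ch in current
}

for ch in current:
    share = proposed[ch] / total * 100
    print(f"{ch:<11} ${proposed[ch]:>6,} ({share:.0f}%)  {change_pct[ch]:+d}%")
```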
1.8×
Current ROAS under last-click misallocation
4.1×
Projected ROAS after U-shaped budget reallocation
+$46k
Projected monthly revenue increase ($36k → $82k) on the same $20k ad spend
Transparency note: ROAS figures (1.8× → 4.1×) are modeled projections based on U-shaped credit redistribution applied to observed journey data. Real-world results depend on execution, creative quality, and market conditions. The attribution math and budget logic are fully reproducible from the analysis notebook.
30-Day Execution Plan

Analysis without action is just a report. Here's exactly what a brand should do the morning after getting this finding.

Week 1
Increase Instagram spend by 20% — hold everything else
Don't shift the full budget immediately. Test Instagram response first. Track CPM, CTR, and new visitor sessions daily. If discovery metrics improve, proceed to week 2.
Week 2
Double email budget, reduce Google Ads by 15%
Email has the highest ROI in the dataset — it's the most underfunded channel. Google still needs budget as the closer, just not 60% of total spend. Monitor CAC and conversion rate daily.
Week 4
Full reallocation to proposed split — measure total ROAS
Compare total revenue against the $20k baseline. If the U-shaped model is correct, you should see measurable ROAS improvement within 30 days. Recalibrate quarterly as journey patterns shift.
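The metrics named in the plan are all simple ratios, so they can be tracked daily from a channel export. A minimal sketch using the standard definitions; the helper names and the one-day numbers are made up for illustration and do not come from the dataset:

```python
def cpm(spend, impressions):
    """Cost per 1,000 impressions."""
    return spend / impressions * 1000


def ctr(clicks, impressions):
    """Click-through rate as a fraction of impressions."""
    return clicks / impressions


def cac(spend, new_customers):
    """Customer acquisition cost: spend per newly acquired customer."""
    return spend / new_customers


def roas(revenue, spend):
    """Return on ad spend: revenue per dollar spent."""
    return revenue / spend


# Hypothetical single day for one channel.
day = {"spend": 120.0, "impressions": 40_000, "clicks": 600,
       "new_customers": 8, "revenue": 480.0}

print(f"CPM:  ${cpm(day['spend'], day['impressions']):.2f}")
print(f"CTR:  {ctr(day['clicks'], day['impressions']):.2%}")
print(f"CAC:  ${cac(day['spend'], day['new_customers']):.2f}")
print(f"ROAS: {roas(day['revenue'], day['spend']):.1f}x")
```

Comparing these against the pre-change baseline each week is what decides whether to move on to the next step of the plan.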
What This Project Taught Me
01 / ATTRIBUTION BIAS
Last-click is the default and it's wrong for most brands
Any D2C brand running multi-channel marketing on last-click is very likely underfunding discovery channels and overfunding closers. It is one of the most common and most expensive analytics mistakes in e-commerce.
02 / MODEL SELECTION
No perfect model — pick the one that matches your funnel
U-shaped worked here because of the clear discover → consider → buy pattern. A brand with 5+ touchpoint journeys would need time-decay or data-driven models. Attribution model choice must follow the data, not convention.
03 / STAKEHOLDER BUY-IN
Show dollars, not percentages
"Shift 20% from Google to Instagram" gets pushback. "This reallocation projects +$46k/month revenue on the same spend" gets action. The analytics job doesn't end at the finding; it ends at the decision.
04 / DATA REALITY
Integration is always the hardest part
The attribution math took 2 days. Building clean, connected journey data from separate channel sources takes weeks in a real org. Most companies don't fail at analytics — they fail at data plumbing before they even get there.

Not Sure Which Channel Deserves Your Budget?

I build attribution frameworks that show you where your marketing money actually works — not just where your last sale happened to click.
