Everything you learned about causal inference in academia is true. It’s also not enough, and most of us doing applied causal inference experience it.
What’s different is the gravity of the decisions that lean on the analysis: not every decision deserves the same level of evidence. Match your rigour and causal inference to the gravity of the decision, or waste resources.
Take product discovery. Before building and shipping, many assumptions need validation at multiple steps. Aiming to nail each answer with perfect causal inference; for what? Moving up one square on a board of many relevant, even necessary, but on their own insufficient decisions. The risk is already spread, hedged, over many decisions, thanks to a process that values incremental evidence, learning and iterations.
At the same time, causal inference comes with a material opportunity cost: the rigour it requires delays time-to-impact, while there might have been a project waiting for you where this rigour was actually needed to improve the decision quality (reduce risk, improve accuracy and reliability).
Final vs. constructive decisions is my go-to framing to make this idea simple:
- Constructive decisions move you forward in a process. “Should we explore this feature further?”, “Is this user problem worth investigating?” Getting it wrong costs you a sprint, maybe two, while getting it right doesn’t change the company, yet.
- Final decisions commit resources or change direction, and getting it wrong is costly or hard to reverse: “Should we invest $2M in building this out?”, “Should we kill this product line?”, “Should we allocate more marketing budget to this or that channel?”
In tech, the volume and pace of decisions is unparalleled. Sometimes, these are final decisions. But much more common are constructive decisions.
As data scientists we’re involved in both types, and failing to recognise which one we are dealing with leads to posing the wrong questions or chasing the wrong answers, and ultimately to wasted resources.
In this article I want to surface three rules that I keep coming back to when embarking on causal inference projects:
- Start with the problem, not with the answer
- If you can solve it more easily without causal inference, do it
- Do 80/20 in your causal inference project too
Rules rarely sound fun. But these helped me improve my impact by tons, actually.
Let’s unpack that.
1. Start with the problem, not the answer
Every causal inference project starts with the problem you’re trying to solve; not with the identification strategy and the estimator. It’s the perfect example of doing the right thing, over doing things right. Your methods can be on point, but what’s the value if you are solving for the wrong thing? Nudge yourself to kick off a project with a crystal clear business problem backing it up, and 50% of the work is done before you even start.
If you’re highly technical, chances are you know the anatomy of a causal inference project: from DAG to model, to inference, to sensitivity analysis, and answers.
But do you know the anatomy of problem solving in organisations?
The problem behind the problem
Big problems get broken down into smaller ones. That’s just more workable for a team that needs to find solutions. And it allows us to mobilise multiple teams to solve different parts of the bigger (sub) problem. The same goes across roles within one team: you’re estimating churn drivers; your PM needs that to decide whether to invest in retention or acquisition.
That’s the issue: the problem you, the data scientist, are solving is often not the endgame.
Your problem is nested inside someone else’s. Other people, around you and above you, need your answer as one input to their decision. Recognise that dependency, and you can tailor your causal inference to what actually matters upstream. The wins are concrete: tighter alignment on the causal estimand of interest, or quicker discarding of causal inference altogether. Bottom line: shorter time-to-insight.
One time I was into network theory (Markov Random Fields were what made me understand DAGs back in 2018). Everything was a network in my head. So I went to make a network of our internal BI capability usage. All dashboards were nodes, and they would have thicker edges between them when they were used by the same users. I calculated all sorts of centrality metrics; I identified influential dashboards, the ones that brought departments together; and much more. I made a whole story around it, but actions never followed. The issue was that I had never paid attention to the problem my stakeholders were trying to solve. Perhaps I assumed the decision was of the final type, while it was a constructive one all along. A simple count of dashboard usage could’ve done the job, but I treated it as a research project.
That was me then. And it wasn’t the last time something like that happened. But the lesson learned is to start with the problem, not with the answers.
The anti-rule: looking at the wrong problems
If you want a quick way to throw away money, then go solve the wrong problems. Not only will the solutions have no material consequence, but the opportunity cost of not solving the right problem in that time will also add up.
So, in being eager to find the problem behind the problem, be critical about whether it’s the right one to begin with, when you find it.
In that sense, starting with the answers does offer the remedy. But it goes slightly differently. Ask yourself:
- If we do get these answers, what do we know that we didn’t know before?
- If we know that, then so-what?
If the answer to the so-what question makes a lot of sense, not only to you, but also to your manager and their manager (presumably), then you’re on the right problem.
Magical.
2. If you can solve it more easily without causal inference, then do it
There’s no cookie-cutter causal inference. Methods become canonical because we’ve mapped their assumptions well; not because using them is mechanical. Every situation can violate these assumptions in its own way, and each one deserves full rigour.
The issue with that, though, is that we can’t justify doing so for all of them, resource-wise.
That’s when applying causal inference becomes an economic exercise: how much of our resources shall we put in, so that we reach the desired outcome with the necessary level of confidence?
Ask yourself that question next time.
Luckily, not every analysis needs to be as rigorous as a full causal inference project for the return on investment to tip over to the positive side.
The alternatives (common sense, domain knowledge, and associative analysis) yield good-enough answers too.
It definitely hurts a bit to say this; principled and rigorous me hates me now. But I’ve learned that it pays to approach the trade-off as a strategic choice.
Here’s an example to bring it home:
The question is: should we invest further in feature A? Now, I can easily turn this around to: what’s the impact of feature A on user acquisition/retention? (a very common angle to take in a SaaS situation; and a causal question at its heart)
If it’s high, then we invest in it, otherwise not.
That word impact alone puts me straight into causal inference mode, because impact ≠ association. But we know that’s costly. Is the problem worth it? What’s the alternative?
One approach is to understand how many users are using this feature at all. How frequently do they use it, given that they chose to use it? That indicates how useful a feature might be, and signals whether we should invest further in it. No diff-in-diff, nor IPSW, nor A/B test: but if these answers come back negative, would a precise causal inference still matter?
The truth may be in the middle; answers to these questions may be more indicative than decisive, and the main question may still feel open. But surely, less open than when you started: if these answers ignite deeper research, then the product team is in motion, and likely in the right direction. Perhaps more rigorous causal inference follows.
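To make the two descriptive questions concrete, here is a minimal sketch of what that non-causal first pass could look like. The event format, the function name `feature_usage_summary` and the toy data are assumptions for illustration only; your tracking data will look different.

```python
from collections import Counter
from statistics import median


def feature_usage_summary(events, active_users, feature):
    """Adoption rate and usage frequency among adopters for one feature.

    events: iterable of (user_id, feature_name) tuples, one per usage event
            (a hypothetical log format).
    active_users: ids of all users active in the period of interest.
    """
    active = set(active_users)
    # Count usage events per active user for the feature of interest.
    uses = Counter(u for u, name in events if name == feature and u in active)
    adopters = len(uses)
    counts = list(uses.values())
    return {
        # Question 1: how many users use the feature at all?
        "adoption_rate": adopters / len(active) if active else 0.0,
        # Question 2: how often do they use it, given that they use it?
        "mean_uses_per_adopter": sum(counts) / adopters if adopters else 0.0,
        "median_uses_per_adopter": median(counts) if counts else 0.0,
    }


# Toy data, purely illustrative.
events = [("u1", "A"), ("u1", "A"), ("u1", "A"), ("u2", "A"), ("u3", "B"), ("u1", "B")]
active_users = {"u1", "u2", "u3", "u4"}
summary = feature_usage_summary(events, active_users, "A")
print(summary)
```

None of these numbers is an impact estimate; they are associations. Their job is only to tell you whether the causal question is worth asking yet.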
The anti-rule: skipping causal inference is dangerous
Say the product team picks up the signals from your analysis and makes some material “improvements” to the feature. The sample size is low and they are short on time, so they skip the A/B test and launch it directly.
Fanatic experimenters lose it at this point. I think that it may very well be the right decision, if somebody did the math and concluded there’s more at stake in running the experiment than in skipping it. Of course I kept the case so generic that no one can actually defend either side. That’d go beyond the point.
But then, while the team jumps onto the next sprint, product management still stresses how important it is to learn something from what they launched previously. They still want to a) get a feeling for the impact, and b) learn whether some segments were impacted more or less than others.
You’re happy because learnings -> iterations is exactly the mentality you are trying to foster. But you’re also in pain for at least three reasons:
- Lack of exchangeability: you know that the users that went on to use the feature are a highly self-selected set. Contrasting them against non-users? Really?
- Interacting effects: assume that one segment was indeed impacted more than others. Now recall the first point: we’re conditioning on highly engaged users. It may be that this segment displayed a higher impact simply because its users were also highly engaged. The same segment may not show that differential impact when we consider less engaged users. But you can’t know. Your working data is skewed towards highly engaged users only.
- Collider bias: in a worse case, conditioning on high engagement may flip the relationship between segments and the outcome of interest. The analysis would steer the team in the wrong direction.
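The first of these pains can be made tangible with a toy simulation. The sketch below is an illustration under made-up assumptions, not a model of any real product: engagement drives both feature adoption and retention, the feature’s true effect is set to zero, and engagement is assumed to be observed so that we can stratify on it. It only covers the exchangeability point, not the interaction or collider cases.

```python
import random


def simulate(n=20000, true_effect=0.0, seed=7):
    """Simulate users whose engagement drives both adoption and retention."""
    rng = random.Random(seed)
    users = []
    for _ in range(n):
        engagement = rng.random()              # latent engagement in [0, 1)
        adopted = rng.random() < engagement    # engaged users self-select into the feature
        p_retain = 0.2 + 0.6 * engagement + (true_effect if adopted else 0.0)
        retained = rng.random() < min(1.0, p_retain)
        users.append((engagement, adopted, retained))
    return users


def retention_gap(users):
    """Retention rate of adopters minus non-adopters; None if a group is empty."""
    adopters = [r for _, a, r in users if a]
    others = [r for _, a, r in users if not a]
    if not adopters or not others:
        return None
    return sum(adopters) / len(adopters) - sum(others) / len(others)


def stratified_gap(users, n_bins=10):
    """Average the adopter/non-adopter gap within engagement bins, weighted by bin size."""
    bins = {}
    for user in users:
        idx = min(int(user[0] * n_bins), n_bins - 1)
        bins.setdefault(idx, []).append(user)
    weighted, total = 0.0, 0
    for group in bins.values():
        gap = retention_gap(group)
        if gap is not None:
            weighted += gap * len(group)
            total += len(group)
    return weighted / total


users = simulate()
naive = retention_gap(users)      # users vs. non-users, no adjustment
adjusted = stratified_gap(users)  # same contrast within engagement strata
print(f"naive gap: {naive:.3f}, stratified gap: {adjusted:.3f}")
```

In this setup the naive contrast shows a sizeable retention “lift” even though the feature does nothing, while the stratified contrast shrinks towards zero. In real data you rarely observe engagement this cleanly, which is exactly why the post-launch comparison hurts.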
3. Do 80/20 in your causal inference project too
The title is a false friend. I’m not saying half-bake your analysis: when the question demands full rigour, give it. The 80/20 is about where your effort goes across a decision, not how deep you drill into the causal piece.
Recall the nested problems idea. Your causal inference project often sits inside a larger business decision, and it rarely is the only dimension that matters. The stakeholder has to weigh cost, timing, strategic fit and reversibility alongside your estimate. Causal inference is not everything we need to know.
If your causal answer carries 30% of the weight in that decision, treating it like 100% is a waste. Worse: it’s a waste with an opportunity cost, because the other 70% sits unanswered.
This is where the final-vs-constructive framing earns its keep. For constructive decisions, spreading effort across dimensions almost always beats drilling into one. For final decisions, the causal dimension often is the core, and the math tips the other way.
Rules 1, 2, and 3 overlap but they aren’t the same. Rule 1 asked whether you’re tackling the right problem. Rule 2 asked whether you need causal inference at all. Rule 3 assumes you’ve cleared both. Now the question is: within the project, are you answering the right questions, plural, and letting causal inference carry only the weight that’s actually on it?
Ship the decision, not the estimate
A recent project: estimate the effect of a new pricing tier on revenue per user. Instinctively, I reached for the cleanest identification strategy I could deploy. Difference-in-differences with parallel-trends sensitivity, placebo tests, maybe a synth control for good measure. A month’s work, easily.
But when I zoomed out, the PM had three open questions, not one:
- What’s the effect on revenue per user? (causal)
- Are we cannibalising the existing tier? (causal, different outcome)
- How reversible is this if it tanks? (not causal; an ops and product question)
Spending a month on question 1 would have left 2 and 3 half-answered. The decision needed all three to be roughly right, not one to be precisely right. So: a tighter diff-in-diff on question 1 in two weeks, with explicit caveats, and the remaining time on 2 and 3. The stakeholder walked into the decision meeting with a balanced picture rather than one number and two shrugs.
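For readers who have not met it, the “tighter diff-in-diff” boils down to a two-by-two comparison. The sketch below is not the analysis from that project; the group names and revenue figures are invented for illustration, and the estimate only has a causal reading if the parallel-trends assumption holds.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Two-period, two-group difference-in-differences on group means.

    Reads as a causal effect only under parallel trends: absent the change,
    the treated group would have moved like the control group.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical revenue per user, before and after the new pricing tier.
treated_pre = [10.0, 12.0, 11.0]    # users exposed to the new tier, before launch
treated_post = [14.0, 15.0, 16.0]   # same group, after launch
control_pre = [9.0, 10.0, 11.0]     # unexposed users, before launch
control_post = [10.0, 11.0, 12.0]   # unexposed users, after launch

estimate = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(f"diff-in-diff estimate: {estimate:.2f}")
```

The sensitivity checks, placebo tests and uncertainty intervals are what the extra weeks would buy you; the point of the rule is to decide deliberately whether this decision needs them.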
The anti-rule: when the causal question is the decision
If you 80/20 a causal inference project where the causal estimate is the whole decision, you’ve hollowed out the analysis.
This is the final-decision scenario. “Should we invest $2M in this channel?” “Does this treatment cause a meaningful reduction in churn?” When the other dimensions are either already nailed down or genuinely secondary, the causal estimate is not one of many inputs; it is the input. Cutting corners there to free up time for work that doesn’t change the decision inverts the original rule: now you’re misallocating the other way.
The skill is knowing which scenario you’re in. A quick test: if you can’t list three dimensions your stakeholder needs besides your estimate, your causal answer probably is the decision. Don’t 80/20 that one.
So, what now?
These rules apply across all analytical work, not just causal inference. But causal inference is where I’ve felt it the hardest in my past roles.
Whenever I feel the pull of a clean synth control for a question nobody asked, these are the reminders I tape to my own forehead.
The methods come from studying them. That’s something I won’t stop doing. But out there, on the battlefield, let’s be sharp on when applying them does good, and when it doesn’t.
If one of these rules saves you a sprint next time, or an argument with a PM, that’s already a win; and these wins compound. Rigour shows up when it matters. The rest of your time goes to things that also matter.
I’d be happy to have a healthy debate with you about all of the above. Connect with me on LinkedIn, or follow my personal website for content like this!



