
Calculate CATE by using Q and F models from DML #946

Open
vnwbd opened this issue Jan 17, 2025 · 1 comment

Comments

@vnwbd

vnwbd commented Jan 17, 2025

Hi Team,

I am working with an observational dataset and aim to estimate the causal impact. However, since I need to compute the causal effects for various data slices, re-training the DML (Double Machine Learning) model for each slice is not computationally feasible.

To address this, I am considering the following approach and would appreciate your feedback:

Train the DML model on the full dataset, passing all relevant features through the W argument. The X argument will be set to None, since no heterogeneity features (effect modifiers) are explicitly specified.
From the trained DML model, extract the q and f nuisance models as described in the formal methodology (https://econml.azurewebsites.net/spec/estimation/dml.html#overview-of-formal-methodology).

For each data slice, leverage the pre-trained q and f models to compute the Conditional Average Treatment Effect (CATE). Specifically, I plan to fit a simple regression, regressing Y_res on T_res for the respective slice to estimate θ (the causal effect).
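To make the intent concrete, here is a minimal self-contained sketch of the residualize-once, regress-per-slice idea, using plain scikit-learn cross-fitting rather than EconML internals (the data, the `RandomForestRegressor` nuisance choice, and the slice definition are illustrative assumptions, not the actual pipeline):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

# Synthetic data standing in for the observational dataset
rng = np.random.default_rng(0)
n = 4000
W = rng.normal(size=(n, 5))                  # controls passed via the W argument
T = W[:, 0] + rng.normal(size=n)             # treatment confounded by W
theta = 2.0                                  # true constant effect
Y = theta * T + W[:, 0] + rng.normal(size=n)

# Cross-fitted nuisance predictions, trained once on the full sample:
# q(W) ~= E[Y | W] and f(W) ~= E[T | W]
q_hat = cross_val_predict(
    RandomForestRegressor(n_estimators=100, random_state=0), W, Y, cv=5)
f_hat = cross_val_predict(
    RandomForestRegressor(n_estimators=100, random_state=0), W, T, cv=5)
Y_res, T_res = Y - q_hat, T - f_hat

# Per-slice theta: subset the cached residuals and run the final regression
mask = W[:, 1] > 0                           # an arbitrary example slice
theta_slice = LinearRegression().fit(
    T_res[mask].reshape(-1, 1), Y_res[mask]).coef_[0]
print(theta_slice)
```

The key point is that the nuisance models are fit only once; each slice just subsets the cached residuals before the final-stage regression.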
Does this workflow seem sound to you? Are there any potential issues or limitations with this approach?

Thank you in advance for your insights!

@kosiew

kosiew commented Jan 20, 2025

Answering on behalf of Solis V:

Your approach makes sense in terms of computational efficiency, but there are a few important considerations to keep in mind:

Strengths of Your Approach
✅ Computational Efficiency – Training the DML model once and reusing the nuisance models (q, f) across slices avoids redundant training, which is crucial when working with large datasets.
✅ Conceptual Validity – If the q (outcome) and f (treatment) nuisance models estimate the conditional means E[Y|W] and E[T|W] well, using their residuals for CATE estimation across slices should, in principle, work.

Potential Issues & Limitations
⚠ Misspecification Risk – If treatment effects vary significantly across slices and your pre-trained q and f models do not sufficiently capture slice-specific structure, you may introduce bias in the CATE estimates. The models are trained on the full dataset, so they might not adapt well to specific slices.

⚠ Dependence on Correct Residualization – The performance of your simple regression (Y_res ~ T_res) depends on how well the residualization process removes confounding. If the full-sample models don’t account well for slice-specific effects, you might get misleading CATE estimates.

⚠ Weak Treatment Variation in Some Slices – Some slices may have little residual treatment variation (e.g., low overlap in propensity scores), leading to an unstable final-stage regression. You may want to check the strength of residualized treatment variation within each slice before proceeding.
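A cheap diagnostic for the last point is to look at the spread of T_res inside each slice before running the final regression. A minimal sketch (the `min_std` threshold and the helper name are hypothetical, to be tuned to the treatment's scale):

```python
import numpy as np

def residual_treatment_strength(t_res, mask, min_std=0.1):
    """Return the std of residualized treatment within a slice and whether
    it clears a (hypothetical) minimum threshold; slices that fail have
    too little treatment variation left for a stable Y_res ~ T_res fit."""
    s = float(np.std(t_res[mask]))
    return s, s >= min_std

# Example: a slice where nearly all residual treatment variation is gone
rng = np.random.default_rng(1)
t_res = rng.normal(size=1000)
t_res[:500] *= 0.01                       # first half: almost deterministic treatment
print(residual_treatment_strength(t_res, np.arange(1000) < 500))   # weak slice
print(residual_treatment_strength(t_res, np.arange(1000) >= 500))  # healthy slice
```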

Suggestions for Improvement
🔹 Diagnostics & Validation – Before fully committing, you might want to compare the slice-specific CATE estimates obtained via your method to those from a fully re-trained DML model (for a few key slices) to check for discrepancies.
🔹 Flexible G & Q Models – If feasible, consider using models that allow for interactions between covariates and treatment effects to better capture heterogeneity.
🔹 Alternative Estimation Methods – Instead of simple regression (Y_res ~ T_res), you could explore non-parametric approaches like kernel regression or local linear regression to ensure robustness.
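As one concrete instance of the last suggestion, a kernel-weighted final-stage regression estimates a local slope θ(x0) along a continuous slicing variable instead of a single OLS slope per hard slice. This is an illustrative sketch, not an EconML API (function name, Gaussian kernel, and bandwidth are assumptions):

```python
import numpy as np

def local_linear_cate(y_res, t_res, x, x0, bandwidth=0.3):
    """Kernel-weighted regression of Y_res on T_res around x0:
    points with slicing variable x near x0 get higher weight,
    yielding a smooth estimate of theta(x0)."""
    w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)    # Gaussian kernel weights
    X = np.column_stack([np.ones_like(t_res), t_res])
    beta = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * y_res))
    return beta[1]                                    # local slope = theta(x0)

# Heterogeneous effect theta(x) = 1 + x, recovered pointwise from residuals
rng = np.random.default_rng(2)
n = 5000
x = rng.uniform(-2, 2, size=n)
t_res = rng.normal(size=n)
y_res = (1 + x) * t_res + 0.5 * rng.normal(size=n)
print(local_linear_cate(y_res, t_res, x, x0=0.0))
print(local_linear_cate(y_res, t_res, x, x0=1.0))
```

Compared with hard slices, this avoids the small-sample instability of narrow slices at the cost of a bandwidth choice.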

Final Verdict
Your approach is reasonable given computational constraints, but it comes with risks related to model generalization across slices. If you ensure that your nuisance models (q, f) sufficiently capture the confounding structure and validate your estimates against a baseline, it could be a practical solution. 🚀
