Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Errata: "Counterfactual Transportability: A Formal Approach" and "General Transportability of Soft Interventions: Completeness Results" #206

Open
rjc955 opened this issue Jan 28, 2024 · 0 comments

Comments

@rjc955
Copy link
Contributor

rjc955 commented Jan 28, 2024

We have given these two papers a rigorous read while implementing algorithms from them, and have uncovered the following errata.

Errata

A. Correa, Lee, and Bareinboim 2022: "Counterfactual Transportability: A Formal Approach"

  1. p. 4372: the second paragraph of Section 1.1 states, "We denote by $\mathbf{V}(\mathcal{G})$ the set of vertices (i.e., variables) in a graph $\mathcal{H}$." The sentence should finish with "in a graph $\mathcal{G}$."

  2. p. 4375: the first paragraph in Example 4.1 states, "the corresponding ancestral set is $\mathbf{D} = An(Y_{x},W_{x},X,Z)$..." The correct expression for $\mathbf{D}$ is just $\set{Y_{x},W_{x},X,Z}$, removing the $An()$ operator.

  3. p. 4376: Equation 18 states that $P^{\ast}(x_{0_{[z]}},z) = P^{2}(x_{0_{[z]}},z) = P_{wy}^{2}(x_{0},z)$, but we think it should read $P^{\ast}(x_{0_{[z]}},z) = P^{2}(x_{0_{[z]}},z) = P^{2}(x_{0},z)$. In Equation 16, applying Theorem 4.1 to second term of the equality, $P^{1}(y_{xwz},w_{x})$, produces the third term, $P_{xz}^{1}(y,w)$. Equivalently applying Theorem 4.1 to the second term of Equation 18, $P^{2}(x_{0_{[z]}},z)$, produces $P^{2}(x_{0},z)$ and not $P_{wy}^{2}(x_{0},z)$. The difference is that for Equation 16, $\set{X,Z} \in Pa({Y,W})\backslash\set{Y,W}$ (i.e., $\set{Y,W}\in W'$ in Theorem 4.1), and therefore $\set{x,z}$ belong to the intervention set applied to $(y,w)$ in the third term. But for Equation 18, $W'=Pa(X,Z) = \emptyset$ and not $\set{Y,W}$, so $P^{2}(x_{0},z)$ should contain no interventions in the third term.

  4. p. 4377: line 4 of section 4.2 refers to "Theorem 4" in Correa et al. 2021 (Correa, J. D., Lee, S., and Bareinboim, E. "Nested Counterfactual Identification from Arbitrary Surrogate Experiments." In Advances in Neural Information Processing Systems, volume 34, 2021.) "Theorem 4" should read "Theorem 1".

  5. p. 4377: Algorithms 2 and 3 do not explicitly handle the case in which the SIMPLIFY algorithm (Algorithm 1) returns 0 when called in line 1 of Algorithm 2. SIMPLIFY returns 0 if a counterfactual event is determined to have probability 0, due to inconsistent variable values. In such cases, Algorithm 2 does not specify what to do. Likely the algorithm should return FAIL in the same way that it returns FAIL in line 3 when counterfactual variables are inconsistent. That logic would be inserted between lines 1 and 2 of Algorithm 2.

  6. p. 4377: The specification of the output for Algorithm 3 states that Algorithm 3 should sometimes return FAIL, but the pseudocode never actually does so. Between lines 3 and 4 of Algorithm 3, there should likely be a provision that Algorithm 3 returns FAIL if the call to ctfTRu returns FAIL.

  7. pp. 4378-4379: The usage of $\mathbf{X_{\ast}}()$ as an operator in Example 4.5 is inconsistent with how it is defined and then used in Figure 6, implying a problem with the example or with the notation definitions. The second paragraph in section 4.1 states "Let $\mathbf{X_{\ast}}(W_{\mathbf{t}})=\mathbf{V}(\lvert\lvert \mathbf{X_{\ast}} \rvert\rvert \cap An(W_{\mathbf{t}}))$, that is, the primitive variables in $\mathbf{X_{\ast}}$ that are ancestors of $W_{\mathbf{t}}$." The second paragraph in Section 2 states, "Define $\mathbf{V}(\mathbf{W_{\ast}}) = \set{W \in \mathbf{V} | W_{\mathbf{\widehat{T}}} \in \mathbf{W_{\ast}}}$, that is, the set of observables that appear in $\mathbf{W_{\ast}}$." Those two definitions imply that $\mathbf{X_{\ast}}$, when used as an operator on a set of counterfactual variables, produces a set of vertices in a graph and could not also produce counterfactual variables that may have interventions. The notation and definitions are consistent with Figure 6, which shows ${\mathbf{X_{\ast}}(Y_{x})}$ used to produce the subgraph $\mathcal{G}_{\mathbf{X_{\ast}}(Y_{x})}$. Example 4.5, however, states that $\mathbf{X_{\ast}}(Y_{x})=\set{Z_{x}}$, implying that the operator $\mathbf{X_{\ast}}()$ may produce counterfactual variables containing interventions such as $Z_{x}$, contradicting the notation definitions. From this ambiguity, it is not clear to us what the correction should be.

  8. p. 4389: the first paragraph in section B.2 of the supplemental materials gives the reference for the IDENTIFY algorithm as "(Tian & Pearl, 2002a)." The reference is Tian and Pearl's 2002 publication, "A General Identification Condition for Causal Effects" from Proceedings of the Eighteenth National Conference on Artificial Intelligence (AAAI 2002). That publication does not list the algorithm, but a subsequent technical report ("On the Identification of Causal Effects") provides the IDENTIFY subroutine in Figure 7. That technical report should be cited and added to the list of references. (Note that Tian & Pearl's Technical Report R290-A from 2003 is in the list of references, but the correct one containing IDENTIFY is Technical Report R290-L from the same year.)

  9. p. 4390: Line 4 of Algorithm 5 reads:

    4: If A = C then

    The algorithm is adapted from page 26 of Tian and Pear's 2003 technical report. The line should read:

    4: If $C \subset A \subset T$ then

B. Correa and Bareinboim 2020: "General Transportability of Soft Interventions: Completeness Results"

  1. p. 7: Algorithm 1, Line 2 contains a for each loop that would be more clearly represented as two nested loops instead of one. An outer for each loop iterates over each $\mathbf{A}_{i} s.t. \mathbf{A}_{i} \cap \mathbf{X} = \emptyset$. An inner for loop iterates over each $\sigma_{\mathbf{Z}} \in \mathbb{Z}^{k} \in \mathbb{Z}$, for $k$ from 1 to the number of domains including the target domain.

  2. p. 7: Algorithm 1, Line 9 calls the subroutine $REPLACE(\mathbf{A}_{i},\sigma_{\mathbf{X}})$. The paper and its supplemental section do not provide pseudocode for the subroutine. The last sentence in the paragraph after Equation 12 describes the function of REPLACE as "to determine the factors of intervened variables according to the particular type of intervention," and the example that follows applies line 9 of Algorithm 1 as follows: "Next, $Q^{\ast}[X_{1};\sigma^{\ast}]$ and $Q^{\ast}[X_{2};\sigma^{\ast}]$ are given by policy $\sigma^{\ast}$ as $P^{\ast}[x_{1};\sigma^{\ast}]$ and $P^{\ast}[x_{2};\sigma^{\ast}]$." REPLACE appears to mean "compute the Q value for a C-component containing interventions by defining the parents of an intervention variable according to the intervention policy." That principle is sufficient for REPLACE to be codable because all C-components containing a variable affected by REPLACE must contain only one vertex as the intervention the policy produces breaks all double-headed arrows associated with the vertex. That said, the paper could still benefit from more rigorous pseudocode for REPLACE. Accordingly, line 9 could read as follows:

          9: for each $\mathbf{A}_{i}$ that is a C-component of size 1 containing a variable in $\mathbf{X}$ let $Q^{\ast}[\mathbf{A}_{i};\sigma_{\mathbf{X}}] = P^{\ast}(x_{j} | \mathbf{pa}_{x_{j}};\sigma^{\ast}_{\mathbf{X}})$, where $x_{j}$              denotes the jth variable in $\mathbf{X}$ that corresponds to the c-component $\mathbf{A}_i$.

  1. p. 7: the paragraph preceding Equation 12 contains the expression $Q^{1}[\mathbf{V};\sigma_{Z}]/Q^{1}[Z;\sigma_{Z}]=P^{1}(\mathbf{v};\sigma_{Z})/P^{1}(z|w,r,x;\sigma_{Z})$, implying that $Q^{1}[Z;\sigma_{Z}]=P^{1}(z|w,r,x;\sigma_{Z})$. Equation 7 of the paper defines $Q^{k}[\mathbf{C};\sigma_{\mathbf{X}}](\mathbf{v})=\Sigma_{\mathbf{u}(\mathbf{C})}{\prod_{{i|V_{i}\in\mathbf{C}}}{P^{k}(v_{i} | \mathbf{pa}_{i},\mathbf{u}_{i};\sigma_{\mathbf{X}})P^{k}(\mathbf{u}(\mathbf{C});\sigma_{\mathbf{X}})}}$ which is an adaptation of Equation 43 of Tian and Pearl 2002, but that implies $Q^{1}[Z;\sigma_{Z}]=P^{1}(z|r,x;\sigma_{Z})$ and not $P^{1}(z|w,r,x;\sigma_{Z})$ because the observable parents of $Z$ in Figure 2(d) are $R$ and $X$. We understand that the published expression for $Q^{1}[Z;\sigma_{Z}]$ is also correct, per Equation 37 of Tian and Pearl (2003) that states that $Q_{j}=\prod_{{i;V_{i}\in S_{j}}}{P(v_{i}|v^{(i-1)})}$ for the $Q$ value of the $j^{th}$ c-component of a graph. Applying Equation 37 to calculate $Q^{1}[Z;\sigma_{Z}]$ will condition on $w$ because $W$ is an ancestor of $X$ and must precede $X$ in a topological sort of $\mathcal{G}^{\Delta_{1}}_{\sigma^{1}_{Z}}$, so letting $Z$ be the ith vertex in the topological sort implies that $v^{(i-1)}={W,R,X}$ and therefore $Q^{1}[Z;\sigma_{Z}]=P^{1}(z|w,r,x;\sigma_{Z})$. Santu Tikka uses Equation 37 to calculate the Q value of a C-component in his Causal Effect R package and indeed that method of computing a C-component's Q value is simpler to implement in software than Equation 7, so the authors may have used a software application to check their computation for $Q^{1}[Z;\sigma_{Z}]$. Were $Q^{1}[Z;\sigma_{Z}]$ adjusted to equal $P^{1}(z|w,r,x;\sigma_{Z})$ to be consistent with Equation 7, we find that Equation 12 would simplify further than the published version to read $P^{*}(y|r,z;\sigma_{X})=\Sigma_{x}{P^{1}(y|z,r,x;\sigma_{Z})P^{1}(x|r;\sigma_{Z})}$.

  2. p. 7: The first paragraph after Equation 12 refers to the "subroutine IDENTIFY from [36]," and reference 36 is Tian and Pearl's 2002 publication, "A General Identification Condition for Causal Effects" from Proceedings of the Eighteenth National Conference on Artificial Intelligence (AAAI 2002). That publication does not list the algorithm, but the subsequent technical report ("On the Identification of Causal Effects") provides the IDENTIFY subroutine in Figure 7.

  3. p. 8: the last paragraph of Section 4 reads $\mathbb{Z}=\set{\set{\sigma_{\emptyset}},\set{\sigma_{X}=x},\set{\sigma_{W}=g(R),\sigma_{R}=P'(R)}}$. It should probably read $\mathbb{Z}=\set{\set{\sigma_{\emptyset}},\set{\sigma_{X}=x},\set{\sigma_{W}=g(R)},\set{\sigma_{R}=P'(R)}}$, because later in the paragraph $\sigma_{W}$ is stated to be in $\pi^{2}$ and $\sigma_{R}$ is stated to be in $\pi^{3}$.

C. Correa and Bareinboim 2020: "General Transportability of Soft Interventions: Completeness Results", supplemental material

  1. p. 27: Line 4 of Algorithm 2 reads:

    4: If A = C then

    The algorithm is adapted from page 26 of Tian and Pear's 2003 technical report. The line should read:

    4: If $C \subset A \subset T$ then

D. Inconsistencies in terminology or algorithm names across papers

  1. There seems to be an inconsistency in how different authors use the term "c-factor." Tian and Pearl (2002) define "c-factor" as follows: "We will call $Q[S_{i}]$ the c-factor corresponding to the c-component $S_{i}$." In the rest of the paper, each time they refer to a "c-factor" it is to apply the $Q$ function to a c-component. In "General Transportability of Soft Interventions: Completeness Results," (page 6), Correa and Bareinboim cite Tian and Pearl's paper in referencing "the concept of C-factors and C-components," and they use "C-factor" to denote the Q function applied to any set of vertices in a graph. In the "Causal Effect" R package, Santu Tikka denotes the compute.c.factor() function to only apply to c-components. The confusion may originate in the ambiguity of Tian and Pearl's original definition of "c-factor": did they intend for "c-factor" to denote the function $Q$, or did they intend for the $Q$ function to be a "c-factor" only when applied to a c-component?

  2. The $\sigma$-TR algorithm listed as Algorithm 1 in Correa and Bareinboim 2020 ("General Transportability of Soft Interventions: Completeness Results") is not the same algorithm as the $\sigma$-TR algorithm listed as Algorithm 4 in Correa, Lee, and Bareinboim 2022 ("Counterfactual Transportability: A Formal Approach"). The latter algorithm is a modification of the former that calls sections of the former algorithm for each C-component in a graph.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant