<!doctype html><html lang=en-uk><head><script data-goatcounter=https://ruivieira-dev.goatcounter.com/count async src=//gc.zgo.at/count.js></script><script src=https://unpkg.com/@alpinejs/[email protected]/dist/cdn.min.js></script><script src=https://unpkg.com/[email protected]/dist/cdn.min.js></script><script type=module src=https://ruivieira.dev/js/deeplinks/deeplinks.js></script><link rel=preload href=https://ruivieira.dev/lib/fonts/fa-brands-400.woff2 as=font type=font/woff2 crossorigin=anonymous><link rel=preload href=https://ruivieira.dev/lib/fonts/fa-regular-400.woff2 as=font type=font/woff2 crossorigin=anonymous><link rel=preload href=https://ruivieira.dev/lib/fonts/fa-solid-900.woff2 as=font type=font/woff2 crossorigin=anonymous><link rel=preload href=https://ruivieira.dev/fonts/firacode/FiraCode-Regular.woff2 as=font type=font/woff2 crossorigin=anonymous><link rel=preload href=https://ruivieira.dev/fonts/vollkorn/Vollkorn-Regular.woff2 as=font type=font/woff2 crossorigin=anonymous><link rel=stylesheet href=https://ruivieira.dev/css/kbd.css type=text/css><meta charset=utf-8><meta http-equiv=X-UA-Compatible content="IE=edge"><title>Counterfactual Fairness · Rui Vieira</title>
<link rel=canonical href=https://ruivieira.dev/counterfactual-fairness.html><meta name=viewport content="width=device-width,initial-scale=1"><meta name=robots content="all,follow"><meta name=googlebot content="index,follow,snippet,archive"><meta property="og:title" content="Counterfactual Fairness"><meta property="og:description" content="Building counterfactually fair modelsDataTo evaluate counterfactual fairness we will be using the “law school” dataset1.
The Law School Admission Council conducted a survey across 163 law schools in the United States. It contains information on 21,790 law students such as their entrance exam scores (LSAT), their grade-point average (GPA) collected prior to law school, and their first year average grade (FYA). Given this data, a school may wish to predict if an applicant will have a high FYA."><meta property="og:type" content="article"><meta property="og:url" content="https://ruivieira.dev/counterfactual-fairness.html"><meta property="article:section" content="posts"><meta property="article:modified_time" content="2023-09-02T17:28:34+01:00"><meta name=twitter:card content="summary"><meta name=twitter:title content="Counterfactual Fairness"><meta name=twitter:description content="Building counterfactually fair modelsDataTo evaluate counterfactual fairness we will be using the “law school” dataset1.
The Law School Admission Council conducted a survey across 163 law schools in the United States. It contains information on 21,790 law students such as their entrance exam scores (LSAT), their grade-point average (GPA) collected prior to law school, and their first year average grade (FYA). Given this data, a school may wish to predict if an applicant will have a high FYA."><link rel=stylesheet href=https://ruivieira.dev/css/styles.css><!--[if lt IE 9]><script src=https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js></script><script src=https://oss.maxcdn.com/respond/1.4.2/respond.min.js></script><![endif]--><link rel=icon type=image/png href=https://ruivieira.dev/images/favicon.ico></head><body class="max-width mx-auto px3 ltr" x-data="{currentHeading: undefined}"><div class="content index py4"><div id=header-post><a id=menu-icon href=#><i class="fas fa-eye fa-lg"></i></a>
<a id=menu-icon-tablet href=#><i class="fas fa-eye fa-lg"></i></a>
<a id=top-icon-tablet href=# onclick='$("html, body").animate({scrollTop:0},"fast")' style=display:none aria-label="Top of Page"><i class="fas fa-chevron-up fa-lg"></i></a>
<span id=menu><span id=nav><ul><li><a href=https://ruivieira.dev/>Home</a></li><li><a href=https://ruivieira.dev/blog/>Blog</a></li><li><a href=https://ruivieira.dev/draw/>Drawings</a></li><li><a href=https://ruivieira.dev/map/>All pages</a></li><li><a href=https://ruivieira.dev/search.html>Search</a></li></ul></span><br><div id=share style=display:none></div><div id=toc><h4>Contents</h4><nav id=TableOfContents><ul><li><a href=#building-counterfactually-fair-models :class="{'toc-h2':true, 'toc-highlight': currentHeading == '#building-counterfactually-fair-models' }">Building counterfactually fair models</a></li><li><a href=#data :class="{'toc-h3':true, 'toc-highlight': currentHeading == '#data' }">Data</a></li><li><a href=#pre-processing :class="{'toc-h3':true, 'toc-highlight': currentHeading == '#pre-processing' }">Pre-processing</a></li><li><a href=#protected-attributes :class="{'toc-h3':true, 'toc-highlight': currentHeading == '#protected-attributes' }">Protected attributes</a></li><li><a href=#training-and-testing-subsets :class="{'toc-h3':true, 'toc-highlight': currentHeading == '#training-and-testing-subsets' }">Training and testing subsets</a></li><li><a href=#models :class="{'toc-h2':true, 'toc-highlight': currentHeading == '#models' }">Models</a></li><li><a href=#unfair-model :class="{'toc-h3':true, 'toc-highlight': currentHeading == '#unfair-model' }">Unfair model</a></li><li><a href=#full-model :class="{'toc-h3':true, 'toc-highlight': currentHeading == '#full-model' }">Full model</a></li><li><a href=#fairness-through-unawareness-ftu :class="{'toc-h3':true, 'toc-highlight': currentHeading == '#fairness-through-unawareness-ftu' }">Fairness through unawareness (FTU)</a></li><li><a href=#latent-variable-model :class="{'toc-h3':true, 'toc-highlight': currentHeading == '#latent-variable-model' }">Latent variable model</a></li><li><a href=#additive-error-model :class="{'toc-h3':true, 'toc-highlight': currentHeading == '#additive-error-model' }">Additive error 
model</a></li><li><a href=#comparison :class="{'toc-h3':true, 'toc-highlight': currentHeading == '#comparison' }">Comparison</a></li><li><a href=#measuring-counterfactual-fairness :class="{'toc-h2':true, 'toc-highlight': currentHeading == '#measuring-counterfactual-fairness' }">Measuring counterfactual fairness</a></li><li><a href=#statistical-parity-difference--disparate-impact :class="{'toc-h3':true, 'toc-highlight': currentHeading == '#statistical-parity-difference--disparate-impact' }">Statistical Parity Difference / Disparate Impact</a></li><li><a href=#finding-sensitive-features :class="{'toc-h3':true, 'toc-highlight': currentHeading == '#finding-sensitive-features' }">Finding sensitive features</a></li></ul></nav><h4>Related</h4><nav><ul><li class="header-post toc"><span class=backlink-count>3</span>
<a href=https://ruivieira.dev/counterfactual-fairness-in-java.html>Counterfactual Fairness in Java</a></li><li class="header-post toc"><span class=backlink-count>1</span>
<a href=https://ruivieira.dev/fairness-in-machine-learning.html>Fairness in Machine Learning</a></li></ul></nav></div></span></div><article class=post itemscope itemtype=http://schema.org/BlogPosting><header><h1 class=posttitle itemprop="name headline">Counterfactual Fairness</h1><div class=meta><div class=postdate>Updated <time datetime="2023-09-02 17:28:34 +0100 BST" itemprop=datePublished>2023-09-02</time>
<span class=commit-hash>(<a href=https://ruivieira.dev/log/index.html#d64c4a5>d64c4a5</a>)</span></div></div></header><div class=content itemprop=articleBody><h2 id=building-counterfactually-fair-models x-intersect="currentHeading = '#building-counterfactually-fair-models'">Building counterfactually fair models</h2><h3 id=data x-intersect="currentHeading = '#data'">Data</h3><p>To evaluate <em>counterfactual fairness</em> we will be using the “law school” dataset<sup id=fnref:1><a href=#fn:1 class=footnote-ref role=doc-noteref>1</a></sup>.</p><p>The Law School Admission Council conducted a survey across 163 law schools in the United States. It contains information on 21,790 law students such as their entrance exam scores (<code>LSAT</code>), their grade-point average (<code>GPA</code>) collected prior to law school, and their first year average grade (<code>FYA</code>).
Given this data, a school may wish to predict if an applicant will have a high <code>FYA</code>. The school would
also like to make sure these predictions are not biased by an individual’s race and sex.
However, the <code>LSAT</code>, <code>GPA</code>, and <code>FYA</code> scores may be biased due to social factors.</p><p>We start by importing the data into a Pandas <code>DataFrame</code>.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span><span style=font-weight:700>import</span> <span style=color:#555>warnings</span>
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>warnings<span style=font-weight:700>.</span>filterwarnings(<span style=color:#b84>"ignore"</span>)
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span><span style=font-weight:700>import</span> <span style=color:#555>pandas</span> <span style=font-weight:700>as</span> <span style=color:#555>pd</span>
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>df <span style=font-weight:700>=</span> pd<span style=font-weight:700>.</span>read_csv(<span style=color:#b84>"data/law_data.csv"</span>, index_col<span style=font-weight:700>=</span><span style=color:#099>0</span>)
</span></span><span style=display:flex><span>df<span style=font-weight:700>.</span>head()
</span></span></code></pre></div><h3 id=pre-processing x-intersect="currentHeading = '#pre-processing'">Pre-processing</h3><p>We now pre-process the data. We start by creating categorical “dummy” variables according to the <code>race</code> variable.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>df <span style=font-weight:700>=</span> pd<span style=font-weight:700>.</span>get_dummies(df, columns<span style=font-weight:700>=</span>[<span style=color:#b84>"race"</span>], prefix<span style=font-weight:700>=</span><span style=color:#b84>""</span>, prefix_sep<span style=font-weight:700>=</span><span style=color:#b84>""</span>)
</span></span><span style=display:flex><span>df<span style=font-weight:700>.</span>iloc[:, : <span style=color:#099>7</span>]<span style=font-weight:700>.</span>head()
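# Aside: a minimal, self-contained illustration (toy frame, not the law school
# data) of what the call above does. With prefix="" and prefix_sep="",
# get_dummies expands a categorical column into one 0/1 column per category,
# named after the category itself.
import pandas as pd

toy = pd.DataFrame({"race": ["Asian", "White", "Black"]})
toy_dummies = pd.get_dummies(toy, columns=["race"], prefix="", prefix_sep="")
assert sorted(toy_dummies.columns) == ["Asian", "Black", "White"]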
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>df<span style=font-weight:700>.</span>iloc[:, <span style=color:#099>7</span> :]<span style=font-weight:700>.</span>head()
</span></span></code></pre></div><p>We also want to expand the <code>sex</code> variable into <code>male</code> / <code>female</code> categorical variables and remove the original.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>df[<span style=color:#b84>"male"</span>] <span style=font-weight:700>=</span> df[<span style=color:#b84>"sex"</span>]<span style=font-weight:700>.</span>map(<span style=font-weight:700>lambda</span> x: <span style=color:#099>1</span> <span style=font-weight:700>if</span> x <span style=font-weight:700>==</span> <span style=color:#099>2</span> <span style=font-weight:700>else</span> <span style=color:#099>0</span>)
</span></span><span style=display:flex><span>df[<span style=color:#b84>"female"</span>] <span style=font-weight:700>=</span> df[<span style=color:#b84>"sex"</span>]<span style=font-weight:700>.</span>map(<span style=font-weight:700>lambda</span> x: <span style=color:#099>1</span> <span style=font-weight:700>if</span> x <span style=font-weight:700>==</span> <span style=color:#099>1</span> <span style=font-weight:700>else</span> <span style=color:#099>0</span>)
</span></span><span style=display:flex><span>df <span style=font-weight:700>=</span> df<span style=font-weight:700>.</span>drop(axis<span style=font-weight:700>=</span><span style=color:#099>1</span>, columns<span style=font-weight:700>=</span>[<span style=color:#b84>"sex"</span>])
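# Aside: a self-contained check of the mapping above on a toy frame (encoding
# assumed from the lambdas: sex == 2 is male, sex == 1 is female). The two
# dummy columns are complementary, so they always sum to one.
import pandas as pd

toy = pd.DataFrame({"sex": [1, 2, 2, 1]})
toy["male"] = toy["sex"].map(lambda x: 1 if x == 2 else 0)
toy["female"] = toy["sex"].map(lambda x: 1 if x == 1 else 0)
assert (toy["male"] + toy["female"] == 1).all()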
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>df<span style=font-weight:700>.</span>iloc[:, <span style=color:#099>0</span>:<span style=color:#099>7</span>]<span style=font-weight:700>.</span>head()
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>df<span style=font-weight:700>.</span>iloc[:, <span style=color:#099>7</span>:]<span style=font-weight:700>.</span>head()
</span></span></code></pre></div><p>We will also convert the entrance exam scores (<code>LSAT</code>) to a discrete variable.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>df[<span style=color:#b84>"LSAT"</span>] <span style=font-weight:700>=</span> df[<span style=color:#b84>"LSAT"</span>]<span style=font-weight:700>.</span>astype(<span style=color:#999>int</span>)
</span></span><span style=display:flex><span>df<span style=font-weight:700>.</span>iloc[:, :<span style=color:#099>6</span>]<span style=font-weight:700>.</span>head()
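# Note (toy values): astype(int) truncates the fractional part towards zero,
# e.g. a raw score of 37.9 becomes 37.
import pandas as pd

toy_scores = pd.Series([37.9, 44.0, 29.5]).astype(int)
assert toy_scores.tolist() == [37, 44, 29]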
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>df<span style=font-weight:700>.</span>iloc[:, <span style=color:#099>6</span>:]<span style=font-weight:700>.</span>head()
</span></span></code></pre></div><h3 id=protected-attributes x-intersect="currentHeading = '#protected-attributes'">Protected attributes</h3><p><em>Counterfactual fairness</em> enforces that a distribution over possible predictions for an individual should
remain unchanged in a world where an individual’s protected attributes $A$ had been different in a causal sense.
Let’s start by defining the <em>protected attributes</em>. Obvious candidates are the different categorical variables for ethnicity (<code>Asian</code>, <code>White</code>, <code>Black</code>, <em>etc</em>) and gender (<code>male</code>, <code>female</code>).</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>A <span style=font-weight:700>=</span> [
</span></span><span style=display:flex><span> <span style=color:#b84>"Amerindian"</span>,
</span></span><span style=display:flex><span> <span style=color:#b84>"Asian"</span>,
</span></span><span style=display:flex><span> <span style=color:#b84>"Black"</span>,
</span></span><span style=display:flex><span> <span style=color:#b84>"Hispanic"</span>,
</span></span><span style=display:flex><span> <span style=color:#b84>"Mexican"</span>,
</span></span><span style=display:flex><span> <span style=color:#b84>"Other"</span>,
</span></span><span style=display:flex><span> <span style=color:#b84>"Puertorican"</span>,
</span></span><span style=display:flex><span> <span style=color:#b84>"White"</span>,
</span></span><span style=display:flex><span> <span style=color:#b84>"male"</span>,
</span></span><span style=display:flex><span> <span style=color:#b84>"female"</span>,
</span></span><span style=display:flex><span>]
</span></span></code></pre></div><h3 id=training-and-testing-subsets x-intersect="currentHeading = '#training-and-testing-subsets'">Training and testing subsets</h3><p>We will now divide the dataset into training and testing subsets.
We will use the same test-set ratio as in <sup id=fnref:2><a href=#fn:2 class=footnote-ref role=doc-noteref>2</a></sup>, that is, 20%.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span><span style=font-weight:700>from</span> <span style=color:#555>sklearn.model_selection</span> <span style=font-weight:700>import</span> train_test_split
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>df_train, df_test <span style=font-weight:700>=</span> train_test_split(df, random_state<span style=font-weight:700>=</span><span style=color:#099>23</span>, test_size<span style=font-weight:700>=</span><span style=color:#099>0.2</span>);
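# Sanity check (toy frame, illustrative only): with test_size=0.2 the test
# subset holds 20% of the rows, and a fixed random_state makes the split
# reproducible across runs.
import pandas as pd
from sklearn.model_selection import train_test_split

toy = pd.DataFrame({"x": range(10)})
toy_train, toy_test = train_test_split(toy, random_state=23, test_size=0.2)
assert len(toy_train) == 8 and len(toy_test) == 2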
</span></span></code></pre></div><h2 id=models x-intersect="currentHeading = '#models'">Models</h2><h3 id=unfair-model x-intersect="currentHeading = '#unfair-model'">Unfair model</h3><p>As detailed in <sup id=fnref1:2><a href=#fn:2 class=footnote-ref role=doc-noteref>2</a></sup>, the concept of counterfactual fairness holds
under three levels of assumptions of increasing strength.</p><p>The first such level is <em>Level 1</em>, where $\hat{Y}$ is built using only the observable non-descendants of $A$.
This only requires <em>partial</em> causal ordering and no further causal assumptions, but in many problems there will be few, if any,
observables which are not descendants of protected demographic factors.</p><p>For this dataset, since <code>LSAT</code>, <code>GPA</code>, and <code>FYA</code> are all biased by ethnicity and gender, we cannot use any observed
features to construct a Level 1 counterfactually fair predictor.</p><p>Instead (and in order to compare the performance with Level 2 and 3 models) we will build two <em>unfair baselines</em>:</p><ul><li>A <em>Full</em> model, which will be trained with the totality of the variables</li><li>An <em>Unaware</em> model (FTU), which will be trained with all the variables, except the protected attributes $A$.</li></ul><p>Let’s proceed with calculating the <em>Full</em> model.</p><h3 id=full-model x-intersect="currentHeading = '#full-model'">Full model</h3><p>As mentioned previously, the full model will be a simple linear regression predicting <code>ZFYA</code> from all of the variables.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span><span style=font-weight:700>from</span> <span style=color:#555>sklearn.linear_model</span> <span style=font-weight:700>import</span> LinearRegression
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>linreg_unfair <span style=font-weight:700>=</span> LinearRegression()
</span></span></code></pre></div><p>The inputs will then be the totality of the variables (the protected attributes $A$, as well as <code>UGPA</code> and <code>LSAT</code>).</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span><span style=font-weight:700>import</span> <span style=color:#555>numpy</span> <span style=font-weight:700>as</span> <span style=color:#555>np</span>
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>X <span style=font-weight:700>=</span> np<span style=font-weight:700>.</span>hstack(
</span></span><span style=display:flex><span> (
</span></span><span style=display:flex><span> df_train[A],
</span></span><span style=display:flex><span> np<span style=font-weight:700>.</span>array(df_train[<span style=color:#b84>"UGPA"</span>])<span style=font-weight:700>.</span>reshape(<span style=font-weight:700>-</span><span style=color:#099>1</span>, <span style=color:#099>1</span>),
</span></span><span style=display:flex><span> np<span style=font-weight:700>.</span>array(df_train[<span style=color:#b84>"LSAT"</span>])<span style=font-weight:700>.</span>reshape(<span style=font-weight:700>-</span><span style=color:#099>1</span>, <span style=color:#099>1</span>),
</span></span><span style=display:flex><span> )
</span></span><span style=display:flex><span>)
</span></span><span style=display:flex><span><span style=color:#999>print</span>(X)
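# Aside (toy arrays): reshape(-1, 1) turns a 1-D array into a single column,
# so np.hstack can stack it next to other column blocks, exactly as done
# above with the UGPA and LSAT columns.
import numpy as np

col_a = np.array([1, 2, 3]).reshape(-1, 1)
col_b = np.array([4, 5, 6]).reshape(-1, 1)
assert np.hstack((col_a, col_b)).shape == (3, 2)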
</span></span></code></pre></div><p>As for our target, we are trying to predict <code>ZFYA</code> (first year average grade).</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>y <span style=font-weight:700>=</span> df_train[<span style=color:#b84>"ZFYA"</span>]
</span></span><span style=display:flex><span>y[:<span style=color:#099>10</span>]
</span></span></code></pre></div><p>We fit the model:</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>linreg_unfair <span style=font-weight:700>=</span> linreg_unfair<span style=font-weight:700>.</span>fit(X, y)
</span></span></code></pre></div><p>And perform some predictions on the test subset.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>X_test <span style=font-weight:700>=</span> np<span style=font-weight:700>.</span>hstack(
</span></span><span style=display:flex><span> (
</span></span><span style=display:flex><span> df_test[A],
</span></span><span style=display:flex><span> np<span style=font-weight:700>.</span>array(df_test[<span style=color:#b84>"UGPA"</span>])<span style=font-weight:700>.</span>reshape(<span style=font-weight:700>-</span><span style=color:#099>1</span>, <span style=color:#099>1</span>),
</span></span><span style=display:flex><span> np<span style=font-weight:700>.</span>array(df_test[<span style=color:#b84>"LSAT"</span>])<span style=font-weight:700>.</span>reshape(<span style=font-weight:700>-</span><span style=color:#099>1</span>, <span style=color:#099>1</span>),
</span></span><span style=display:flex><span> )
</span></span><span style=display:flex><span>)
</span></span><span style=display:flex><span>X_test
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>predictions_unfair <span style=font-weight:700>=</span> linreg_unfair<span style=font-weight:700>.</span>predict(X_test)
</span></span><span style=display:flex><span>predictions_unfair
</span></span></code></pre></div><p>We will also calculate the <em>unfair model</em> score for future use.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>score_unfair <span style=font-weight:700>=</span> linreg_unfair<span style=font-weight:700>.</span>score(X_test, df_test[<span style=color:#b84>"ZFYA"</span>])
</span></span><span style=display:flex><span><span style=color:#999>print</span>(score_unfair)
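# Note: LinearRegression.score returns the coefficient of determination R^2.
# A self-contained toy check: with a perfect linear fit, R^2 is exactly 1.0.
import numpy as np
from sklearn.linear_model import LinearRegression

toy_X = np.array([[0.0], [1.0], [2.0]])
toy_y = np.array([1.0, 3.0, 5.0])
toy_r2 = LinearRegression().fit(toy_X, toy_y).score(toy_X, toy_y)
assert np.isclose(toy_r2, 1.0)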
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span><span style=font-weight:700>from</span> <span style=color:#555>sklearn.metrics</span> <span style=font-weight:700>import</span> mean_squared_error
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>RMSE_unfair <span style=font-weight:700>=</span> np<span style=font-weight:700>.</span>sqrt(mean_squared_error(df_test[<span style=color:#b84>"ZFYA"</span>], predictions_unfair))
</span></span><span style=display:flex><span><span style=color:#999>print</span>(RMSE_unfair)
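# Aside (toy numbers, not model output): RMSE is simply
# sqrt(mean((y_true - y_pred)**2)), which is what the line above computes.
import numpy as np
from sklearn.metrics import mean_squared_error

toy_true = np.array([1.0, 2.0, 3.0])
toy_pred = np.array([1.0, 2.0, 5.0])
toy_rmse = np.sqrt(mean_squared_error(toy_true, toy_pred))
assert np.isclose(toy_rmse, np.sqrt(4.0 / 3.0))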
</span></span></code></pre></div><h3 id=fairness-through-unawareness-ftu x-intersect="currentHeading = '#fairness-through-unawareness-ftu'">Fairness through unawareness (FTU)</h3><p>As also mentioned in <sup id=fnref2:2><a href=#fn:2 class=footnote-ref role=doc-noteref>2</a></sup>, the second baseline we will use is an <strong>Unaware</strong> model (FTU), which will be trained with all the variables, except the protected attributes $A$.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>linreg_ftu <span style=font-weight:700>=</span> LinearRegression()
</span></span></code></pre></div><p>We will create the inputs as previously, but without using the protected attributes, $A$.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>X_ftu <span style=font-weight:700>=</span> np<span style=font-weight:700>.</span>hstack(
</span></span><span style=display:flex><span> (
</span></span><span style=display:flex><span> np<span style=font-weight:700>.</span>array(df_train[<span style=color:#b84>"UGPA"</span>])<span style=font-weight:700>.</span>reshape(<span style=font-weight:700>-</span><span style=color:#099>1</span>, <span style=color:#099>1</span>),
</span></span><span style=display:flex><span> np<span style=font-weight:700>.</span>array(df_train[<span style=color:#b84>"LSAT"</span>])<span style=font-weight:700>.</span>reshape(<span style=font-weight:700>-</span><span style=color:#099>1</span>, <span style=color:#099>1</span>),
</span></span><span style=display:flex><span> )
</span></span><span style=display:flex><span>)
</span></span><span style=display:flex><span>X_ftu
</span></span></code></pre></div><p>And we fit the model:</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>linreg_ftu <span style=font-weight:700>=</span> linreg_ftu<span style=font-weight:700>.</span>fit(X_ftu, y)
</span></span></code></pre></div><p>Again, let’s perform some predictions on the test subset.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>X_ftu_test <span style=font-weight:700>=</span> np<span style=font-weight:700>.</span>hstack(
</span></span><span style=display:flex><span> (np<span style=font-weight:700>.</span>array(df_test[<span style=color:#b84>"UGPA"</span>])<span style=font-weight:700>.</span>reshape(<span style=font-weight:700>-</span><span style=color:#099>1</span>, <span style=color:#099>1</span>), np<span style=font-weight:700>.</span>array(df_test[<span style=color:#b84>"LSAT"</span>])<span style=font-weight:700>.</span>reshape(<span style=font-weight:700>-</span><span style=color:#099>1</span>, <span style=color:#099>1</span>))
</span></span><span style=display:flex><span>)
</span></span><span style=display:flex><span>X_ftu_test
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>predictions_ftu <span style=font-weight:700>=</span> linreg_ftu<span style=font-weight:700>.</span>predict(X_ftu_test)
</span></span><span style=display:flex><span>predictions_ftu
</span></span></code></pre></div><p>As previously, let’s calculate this model’s score.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>ftu_score <span style=font-weight:700>=</span> linreg_ftu<span style=font-weight:700>.</span>score(X_ftu_test, df_test[<span style=color:#b84>"ZFYA"</span>])
</span></span><span style=display:flex><span><span style=color:#999>print</span>(ftu_score)
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>RMSE_ftu <span style=font-weight:700>=</span> np<span style=font-weight:700>.</span>sqrt(mean_squared_error(df_test[<span style=color:#b84>"ZFYA"</span>], predictions_ftu))
</span></span><span style=display:flex><span><span style=color:#999>print</span>(RMSE_ftu)
</span></span></code></pre></div><h3 id=latent-variable-model x-intersect="currentHeading = '#latent-variable-model'">Latent variable model</h3><p>Still according to <sup id=fnref3:2><a href=#fn:2 class=footnote-ref role=doc-noteref>2</a></sup>, a <strong>Level 2</strong> approach will model latent ‘fair’ variables which are parents of observed variables.</p><p>If we consider a predictor parameterised by $\theta$, such as:</p><p>$$
\hat{Y} \equiv g_\theta (U, X_{\nsucc A})
$$</p><p>where $X_{\nsucc A} \subseteq X$ denotes the non-descendants of $A$.
Assuming a loss function $l(\cdot,\cdot)$ and training data $\mathcal{D}\equiv\{(A^{(i)}, X^{(i)}, Y^{(i)})\}$, for $i=1,2,\dots,n$, the empirical loss is defined as</p><p>$$
L(\theta)\equiv \sum_{i=1}^n \mathbb{E}[l(y^{(i)},g_\theta(U^{(i)}, x^{(i)}_{\nsucc A}))]/n
$$</p><p>which has to be minimised with respect to $\theta$. Each of the $n$ expectations is with respect to the random variable $U^{(i)}$ such that</p><p>$$
U^{(i)}\sim P_{\mathcal{M}}(U|x^{(i)}, a^{(i)})
$$</p><p>where $P_{\mathcal{M}}(U|x,a)$ is the conditional distribution of the background variables as given by a causal model $M$ that is available by assumption.</p><p>If this expectation cannot be calculated analytically, Markov chain Monte Carlo (MCMC) can be used to approximate it as in the following algorithm.</p><p>We will follow the model specified in the original paper, where the latent variable considered is $K$, which represents a student’s <strong>knowledge</strong>.
$K$ will affect <code>GPA</code>, <code>LSAT</code> and the outcome, <code>FYA</code>.
The model can be defined by:</p><p>$$
\begin{aligned}
GPA &\sim \mathcal{N}(GPA_0 + w_{GPA}^K K + w_{GPA}^R R + w_{GPA}^S S, \sigma_{GPA}) \\
LSAT &\sim \text{Po}(\exp(LSAT_0 + w_{LSAT}^K K + w_{LSAT}^R R + w_{LSAT}^S S)) \\
FYA &\sim \mathcal{N}(w_{FYA}^K K + w_{FYA}^R R + w_{FYA}^S S, 1) \\
K &\sim \mathcal{N}(0,1)
\end{aligned}
$$</p><p>The priors used will be:</p><p>$$
\begin{aligned}
GPA_0 &\sim \mathcal{N}(0, 1) \
LSAT_0 &\sim \mathcal{N}(0, 1) \
w &\sim \mathcal{N}(0, 1) \
\sigma^2_{GPA} &\sim \text{InvGamma}(1, 1)
\end{aligned}
$$</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span><span style=font-weight:700>import</span> <span style=color:#555>pymc3</span> <span style=font-weight:700>as</span> <span style=color:#555>pm</span>
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>K <span style=font-weight:700>=</span> <span style=color:#999>len</span>(A)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span><span style=font-weight:700>def</span> <span style=color:#900;font-weight:700>MCMC</span>(data, samples<span style=font-weight:700>=</span><span style=color:#099>1000</span>):
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span> N <span style=font-weight:700>=</span> <span style=color:#999>len</span>(data)
</span></span><span style=display:flex><span> a <span style=font-weight:700>=</span> np<span style=font-weight:700>.</span>array(data[A])
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span> model <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>Model()
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span> <span style=font-weight:700>with</span> model:
</span></span><span style=display:flex><span> <span style=color:#998;font-style:italic># Priors</span>
</span></span><span style=display:flex><span> k <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>Normal(<span style=color:#b84>"k"</span>, mu<span style=font-weight:700>=</span><span style=color:#099>0</span>, sigma<span style=font-weight:700>=</span><span style=color:#099>1</span>, shape<span style=font-weight:700>=</span>(<span style=color:#099>1</span>, N))
</span></span><span style=display:flex><span> gpa0 <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>Normal(<span style=color:#b84>"gpa0"</span>, mu<span style=font-weight:700>=</span><span style=color:#099>0</span>, sigma<span style=font-weight:700>=</span><span style=color:#099>1</span>)
</span></span><span style=display:flex><span> lsat0 <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>Normal(<span style=color:#b84>"lsat0"</span>, mu<span style=font-weight:700>=</span><span style=color:#099>0</span>, sigma<span style=font-weight:700>=</span><span style=color:#099>1</span>)
</span></span><span style=display:flex><span> w_k_gpa <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>Normal(<span style=color:#b84>"w_k_gpa"</span>, mu<span style=font-weight:700>=</span><span style=color:#099>0</span>, sigma<span style=font-weight:700>=</span><span style=color:#099>1</span>)
</span></span><span style=display:flex><span> w_k_lsat <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>Normal(<span style=color:#b84>"w_k_lsat"</span>, mu<span style=font-weight:700>=</span><span style=color:#099>0</span>, sigma<span style=font-weight:700>=</span><span style=color:#099>1</span>)
</span></span><span style=display:flex><span> w_k_zfya <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>Normal(<span style=color:#b84>"w_k_zfya"</span>, mu<span style=font-weight:700>=</span><span style=color:#099>0</span>, sigma<span style=font-weight:700>=</span><span style=color:#099>1</span>)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span> w_a_gpa <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>Normal(<span style=color:#b84>"w_a_gpa"</span>, mu<span style=font-weight:700>=</span>np<span style=font-weight:700>.</span>zeros(K), sigma<span style=font-weight:700>=</span>np<span style=font-weight:700>.</span>ones(K), shape<span style=font-weight:700>=</span>K)
</span></span><span style=display:flex><span> w_a_lsat <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>Normal(<span style=color:#b84>"w_a_lsat"</span>, mu<span style=font-weight:700>=</span>np<span style=font-weight:700>.</span>zeros(K), sigma<span style=font-weight:700>=</span>np<span style=font-weight:700>.</span>ones(K), shape<span style=font-weight:700>=</span>K)
</span></span><span style=display:flex><span> w_a_zfya <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>Normal(<span style=color:#b84>"w_a_zfya"</span>, mu<span style=font-weight:700>=</span>np<span style=font-weight:700>.</span>zeros(K), sigma<span style=font-weight:700>=</span>np<span style=font-weight:700>.</span>ones(K), shape<span style=font-weight:700>=</span>K)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span> sigma_gpa_2 <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>InverseGamma(<span style=color:#b84>"sigma_gpa_2"</span>, alpha<span style=font-weight:700>=</span><span style=color:#099>1</span>, beta<span style=font-weight:700>=</span><span style=color:#099>1</span>)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span> mu <span style=font-weight:700>=</span> gpa0 <span style=font-weight:700>+</span> (w_k_gpa <span style=font-weight:700>*</span> k) <span style=font-weight:700>+</span> pm<span style=font-weight:700>.</span>math<span style=font-weight:700>.</span>dot(a, w_a_gpa)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span> <span style=color:#998;font-style:italic># Observed data</span>
</span></span><span style=display:flex><span> gpa <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>Normal(
</span></span><span style=display:flex><span> <span style=color:#b84>"gpa"</span>,
</span></span><span style=display:flex><span> mu<span style=font-weight:700>=</span>mu,
</span></span><span style=display:flex><span> sigma<span style=font-weight:700>=</span>pm<span style=font-weight:700>.</span>math<span style=font-weight:700>.</span>sqrt(sigma_gpa_2),
</span></span><span style=display:flex><span> observed<span style=font-weight:700>=</span><span style=color:#999>list</span>(data[<span style=color:#b84>"UGPA"</span>]),
</span></span><span style=display:flex><span> shape<span style=font-weight:700>=</span>(<span style=color:#099>1</span>, N),
</span></span><span style=display:flex><span> )
</span></span><span style=display:flex><span> lsat <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>Poisson(
</span></span><span style=display:flex><span> <span style=color:#b84>"lsat"</span>,
</span></span><span style=display:flex><span> pm<span style=font-weight:700>.</span>math<span style=font-weight:700>.</span>exp(lsat0 <span style=font-weight:700>+</span> w_k_lsat <span style=font-weight:700>*</span> k <span style=font-weight:700>+</span> pm<span style=font-weight:700>.</span>math<span style=font-weight:700>.</span>dot(a, w_a_lsat)),
</span></span><span style=display:flex><span> observed<span style=font-weight:700>=</span><span style=color:#999>list</span>(data[<span style=color:#b84>"LSAT"</span>]),
</span></span><span style=display:flex><span> shape<span style=font-weight:700>=</span>(<span style=color:#099>1</span>, N),
</span></span><span style=display:flex><span> )
</span></span><span style=display:flex><span> zfya <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>Normal(
</span></span><span style=display:flex><span> <span style=color:#b84>"zfya"</span>,
</span></span><span style=display:flex><span> mu<span style=font-weight:700>=</span>w_k_zfya <span style=font-weight:700>*</span> k <span style=font-weight:700>+</span> pm<span style=font-weight:700>.</span>math<span style=font-weight:700>.</span>dot(a, w_a_zfya),
</span></span><span style=display:flex><span> sigma<span style=font-weight:700>=</span><span style=color:#099>1</span>,
</span></span><span style=display:flex><span> observed<span style=font-weight:700>=</span><span style=color:#999>list</span>(data[<span style=color:#b84>"ZFYA"</span>]),
</span></span><span style=display:flex><span> shape<span style=font-weight:700>=</span>(<span style=color:#099>1</span>, N),
</span></span><span style=display:flex><span> )
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span> step <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>Metropolis()
</span></span><span style=display:flex><span> trace <span style=font-weight:700>=</span> pm<span style=font-weight:700>.</span>sample(samples, step, progressbar <span style=font-weight:700>=</span> <span style=font-weight:700>False</span>)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span> <span style=font-weight:700>return</span> trace
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>train_estimates <span style=font-weight:700>=</span> MCMC(df_train)
</span></span></code></pre></div><p>Let’s plot a single trace for $k^{(i)}$.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span><span style=font-weight:700>import</span> <span style=color:#555>matplotlib.pyplot</span> <span style=font-weight:700>as</span> <span style=color:#555>plt</span>
</span></span><span style=display:flex><span><span style=font-weight:700>import</span> <span style=color:#555>seaborn</span> <span style=font-weight:700>as</span> <span style=color:#555>sns</span>
</span></span><span style=display:flex><span><span style=font-weight:700>from</span> <span style=color:#555>plotutils</span> <span style=font-weight:700>import</span> <span style=font-weight:700>*</span>
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span><span style=color:#998;font-style:italic># Thin the samples before plotting</span>
</span></span><span style=display:flex><span>k_trace <span style=font-weight:700>=</span> train_estimates[<span style=color:#b84>"k"</span>][:, <span style=color:#099>0</span>]<span style=font-weight:700>.</span>reshape(<span style=font-weight:700>-</span><span style=color:#099>1</span>, <span style=color:#099>1</span>)[<span style=color:#099>0</span>::<span style=color:#099>100</span>]
</span></span><span style=display:flex><span>plt<span style=font-weight:700>.</span>subplot(<span style=color:#099>1</span>, <span style=color:#099>2</span>, <span style=color:#099>1</span>)
</span></span><span style=display:flex><span>plt<span style=font-weight:700>.</span>hist(k_trace, color<span style=font-weight:700>=</span>colours[<span style=color:#099>0</span>], bins<span style=font-weight:700>=</span><span style=color:#099>100</span>)
</span></span><span style=display:flex><span>plt<span style=font-weight:700>.</span>subplot(<span style=color:#099>1</span>, <span style=color:#099>2</span>, <span style=color:#099>2</span>)
</span></span><span style=display:flex><span>plt<span style=font-weight:700>.</span>scatter(<span style=color:#999>range</span>(<span style=color:#999>len</span>(k_trace)), k_trace, s<span style=font-weight:700>=</span><span style=color:#099>1</span>, c<span style=font-weight:700>=</span>colours[<span style=color:#099>0</span>])
</span></span><span style=display:flex><span>plt<span style=font-weight:700>.</span>show()
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>train_k <span style=font-weight:700>=</span> np<span style=font-weight:700>.</span>mean(train_estimates[<span style=color:#b84>"k"</span>], axis<span style=font-weight:700>=</span><span style=color:#099>0</span>)<span style=font-weight:700>.</span>reshape(<span style=font-weight:700>-</span><span style=color:#099>1</span>, <span style=color:#099>1</span>)
</span></span><span style=display:flex><span>train_k
</span></span></code></pre></div><p>We can now estimate $k$ using the test data:</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>test_map_estimates <span style=font-weight:700>=</span> MCMC(df_test)
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>test_k <span style=font-weight:700>=</span> np<span style=font-weight:700>.</span>mean(test_map_estimates[<span style=color:#b84>"k"</span>], axis<span style=font-weight:700>=</span><span style=color:#099>0</span>)<span style=font-weight:700>.</span>reshape(<span style=font-weight:700>-</span><span style=color:#099>1</span>, <span style=color:#099>1</span>)
</span></span><span style=display:flex><span>test_k
</span></span></code></pre></div><p>We now build the Level 2 predictor, using $k$ as the input.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>linreg_latent <span style=font-weight:700>=</span> LinearRegression()
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>linreg_latent <span style=font-weight:700>=</span> linreg_latent<span style=font-weight:700>.</span>fit(train_k, df_train[<span style=color:#b84>"ZFYA"</span>])
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>predictions_latent <span style=font-weight:700>=</span> linreg_latent<span style=font-weight:700>.</span>predict(test_k)
</span></span><span style=display:flex><span>predictions_latent
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>latent_score <span style=font-weight:700>=</span> linreg_latent<span style=font-weight:700>.</span>score(test_k, df_test[<span style=color:#b84>"ZFYA"</span>])
</span></span><span style=display:flex><span><span style=color:#999>print</span>(latent_score)
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>RMSE_latent <span style=font-weight:700>=</span> np<span style=font-weight:700>.</span>sqrt(mean_squared_error(df_test[<span style=color:#b84>"ZFYA"</span>], predictions_latent))
</span></span><span style=display:flex><span><span style=color:#999>print</span>(RMSE_latent)
</span></span></code></pre></div><h3 id=additive-error-model x-intersect="currentHeading = '#additive-error-model'">Additive error model</h3><p>Finally, in <strong>Level 3</strong>, we model <code>GPA</code>, <code>LSAT</code>, and <code>FYA</code> as continuous variables with additive error terms
independent of race and sex<sup id=fnref:3><a href=#fn:3 class=footnote-ref role=doc-noteref>3</a></sup>.</p><p>This corresponds to</p><p>$$
\begin{aligned}
GPA &= b_G + w^R_{GPA}R + w^S_{GPA}S + \epsilon_{GPA}, \epsilon_{GPA} \sim p(\epsilon_{GPA}) \
LSAT &= b_L + w^R_{LSAT}R + w^S_{LSAT}S + \epsilon_{LSAT}, \epsilon_{LSAT} \sim p(\epsilon_{LSAT}) \
FYA &= b_{FYA} + w^R_{FYA}R + w^S_{FYA}S + \epsilon_{FYA} , \epsilon_{FYA} \sim p(\epsilon_{FYA})
\end{aligned}
$$</p><p>We estimate the error terms $\epsilon_{GPA}, \epsilon_{LSAT}$ by first fitting two models that each use race and sex to individually
predict <code>GPA</code> and <code>LSAT</code>. We then compute the residuals of each model (<em>e.g.</em>, $\epsilon_{GPA} = GPA - \hat{Y}_{GPA}(R, S)$).
We use these residual estimates of $\epsilon_{GPA}, \epsilon_{LSAT}$ to predict $FYA$. In <sup id=fnref4:2><a href=#fn:2 class=footnote-ref role=doc-noteref>2</a></sup> this is called <em>Fair Add</em>.</p><p>Since the process is similar for the individual predictions for <code>GPA</code> and <code>LSAT</code>, we will write a method to avoid repetition.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span><span style=font-weight:700>def</span> <span style=color:#900;font-weight:700>calculate_epsilon</span>(data, var_name, protected_attr):
</span></span><span style=display:flex><span> X <span style=font-weight:700>=</span> data[protected_attr]
</span></span><span style=display:flex><span> y <span style=font-weight:700>=</span> data[var_name]
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span> linreg <span style=font-weight:700>=</span> LinearRegression()
</span></span><span style=display:flex><span> linreg <span style=font-weight:700>=</span> linreg<span style=font-weight:700>.</span>fit(X, y)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span> predictions <span style=font-weight:700>=</span> linreg<span style=font-weight:700>.</span>predict(X)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span> <span style=font-weight:700>return</span> data[var_name] <span style=font-weight:700>-</span> predictions
</span></span></code></pre></div><p>Let’s apply it to each variable individually.
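Before that, a quick sanity check of this residual construction on synthetic data (the helper is re-defined so the snippet is self-contained; column names and coefficients are illustrative): with an intercept, OLS residuals have zero mean and are uncorrelated with the protected attributes they were regressed on.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
n = 500
toy = pd.DataFrame({
    "race": rng.integers(0, 2, size=n),
    "sex": rng.integers(0, 2, size=n),
})
# The outcome depends on the protected attributes plus noise
toy["UGPA"] = 0.3 * toy["race"] - 0.2 * toy["sex"] + rng.normal(size=n)

def calculate_epsilon(data, var_name, protected_attr):
    # Regress the variable on the protected attributes and return the residuals
    linreg = LinearRegression().fit(data[protected_attr], data[var_name])
    return data[var_name] - linreg.predict(data[protected_attr])

eps = calculate_epsilon(toy, "UGPA", ["race", "sex"])

print(abs(eps.mean()))                           # numerically zero
print(abs(np.corrcoef(eps, toy["race"])[0, 1]))  # numerically zero
```

This is what makes the residuals usable as ‘fair’ inputs: by construction they carry no linear information about race or sex.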
First we calculate $\epsilon_{GPA}$:</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>epsilons_gpa <span style=font-weight:700>=</span> calculate_epsilon(df, <span style=color:#b84>"UGPA"</span>, A)
</span></span><span style=display:flex><span>epsilons_gpa
</span></span></code></pre></div><p>Next, we calculate $\epsilon_{LSAT}$:</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>epsilons_LSAT <span style=font-weight:700>=</span> calculate_epsilon(df, <span style=color:#b84>"LSAT"</span>, A)
</span></span><span style=display:flex><span>epsilons_LSAT
</span></span></code></pre></div><p>Let’s visualise the $\epsilon$ distribution quickly:</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span><span style=font-weight:700>import</span> <span style=color:#555>matplotlib.pyplot</span> <span style=font-weight:700>as</span> <span style=color:#555>plt</span>
</span></span><span style=display:flex><span><span style=font-weight:700>import</span> <span style=color:#555>seaborn</span> <span style=font-weight:700>as</span> <span style=color:#555>sns</span>
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>plt<span style=font-weight:700>.</span>subplot(<span style=color:#099>1</span>, <span style=color:#099>2</span>, <span style=color:#099>1</span>)
</span></span><span style=display:flex><span>plt<span style=font-weight:700>.</span>hist(epsilons_gpa, color<span style=font-weight:700>=</span>colours[<span style=color:#099>0</span>], bins<span style=font-weight:700>=</span><span style=color:#099>100</span>)
</span></span><span style=display:flex><span>plt<span style=font-weight:700>.</span>title(<span style=color:#b84>"$\epsilon_</span><span style=color:#b84>{GPA}</span><span style=color:#b84>$"</span>)
</span></span><span style=display:flex><span>plt<span style=font-weight:700>.</span>xlabel(<span style=color:#b84>"$\epsilon_</span><span style=color:#b84>{GPA}</span><span style=color:#b84>$"</span>)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>plt<span style=font-weight:700>.</span>subplot(<span style=color:#099>1</span>, <span style=color:#099>2</span>, <span style=color:#099>2</span>)
</span></span><span style=display:flex><span>plt<span style=font-weight:700>.</span>hist(epsilons_LSAT, color<span style=font-weight:700>=</span>colours[<span style=color:#099>1</span>], bins<span style=font-weight:700>=</span><span style=color:#099>100</span>)
</span></span><span style=display:flex><span>plt<span style=font-weight:700>.</span>title(<span style=color:#b84>"$\epsilon_</span><span style=color:#b84>{LSAT}</span><span style=color:#b84>$"</span>)
</span></span><span style=display:flex><span>plt<span style=font-weight:700>.</span>xlabel(<span style=color:#b84>"$\epsilon_</span><span style=color:#b84>{LSAT}</span><span style=color:#b84>$"</span>)
</span></span><span style=display:flex><span>plt<span style=font-weight:700>.</span>show()
</span></span></code></pre></div><p>We finally use the calculated $\epsilon$ to train a model in order to predict <code>FYA</code>.
We start by getting the subset of the $\epsilon$ which match the training indices.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>X <span style=font-weight:700>=</span> np<span style=font-weight:700>.</span>hstack(
</span></span><span style=display:flex><span> (
</span></span><span style=display:flex><span> np<span style=font-weight:700>.</span>array(epsilons_gpa[df_train<span style=font-weight:700>.</span>index])<span style=font-weight:700>.</span>reshape(<span style=font-weight:700>-</span><span style=color:#099>1</span>, <span style=color:#099>1</span>),
</span></span><span style=display:flex><span> np<span style=font-weight:700>.</span>array(epsilons_LSAT[df_train<span style=font-weight:700>.</span>index])<span style=font-weight:700>.</span>reshape(<span style=font-weight:700>-</span><span style=color:#099>1</span>, <span style=color:#099>1</span>),
</span></span><span style=display:flex><span> )
</span></span><span style=display:flex><span>)
</span></span><span style=display:flex><span>X
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>linreg_fair_add <span style=font-weight:700>=</span> LinearRegression()
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>linreg_fair_add <span style=font-weight:700>=</span> linreg_fair_add<span style=font-weight:700>.</span>fit(
</span></span><span style=display:flex><span> X,
</span></span><span style=display:flex><span> df_train[<span style=color:#b84>"ZFYA"</span>],
</span></span><span style=display:flex><span>)
</span></span></code></pre></div><p>We now use this model to calculate the predictions:</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>X_test <span style=font-weight:700>=</span> np<span style=font-weight:700>.</span>hstack(
</span></span><span style=display:flex><span> (
</span></span><span style=display:flex><span> np<span style=font-weight:700>.</span>array(epsilons_gpa[df_test<span style=font-weight:700>.</span>index])<span style=font-weight:700>.</span>reshape(<span style=font-weight:700>-</span><span style=color:#099>1</span>, <span style=color:#099>1</span>),
</span></span><span style=display:flex><span> np<span style=font-weight:700>.</span>array(epsilons_LSAT[df_test<span style=font-weight:700>.</span>index])<span style=font-weight:700>.</span>reshape(<span style=font-weight:700>-</span><span style=color:#099>1</span>, <span style=color:#099>1</span>),
</span></span><span style=display:flex><span> )
</span></span><span style=display:flex><span>)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>predictions_fair_add <span style=font-weight:700>=</span> linreg_fair_add<span style=font-weight:700>.</span>predict(X_test)
</span></span><span style=display:flex><span>predictions_fair_add
</span></span></code></pre></div><p>And as previously, we calculate the model’s score:</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>fair_add_score <span style=font-weight:700>=</span> linreg_fair_add<span style=font-weight:700>.</span>score(X_test, df_test[<span style=color:#b84>"ZFYA"</span>])
</span></span><span style=display:flex><span><span style=color:#999>print</span>(fair_add_score)
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>RMSE_fair_add <span style=font-weight:700>=</span> np<span style=font-weight:700>.</span>sqrt(mean_squared_error(df_test[<span style=color:#b84>"ZFYA"</span>], predictions_fair_add))
</span></span><span style=display:flex><span><span style=color:#999>print</span>(RMSE_fair_add)
</span></span></code></pre></div><h3 id=comparison x-intersect="currentHeading = '#comparison'">Comparison</h3><p>The scores, so far, are:</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span><span style=color:#999>print</span>(<span style=color:#b84>f</span><span style=color:#b84>"Unfair score:</span><span style=color:#b84>\t</span><span style=color:#b84>{</span>score_unfair<span style=color:#b84>}</span><span style=color:#b84>"</span>)
</span></span><span style=display:flex><span><span style=color:#999>print</span>(<span style=color:#b84>f</span><span style=color:#b84>"FTU score:</span><span style=color:#b84>\t</span><span style=color:#b84>{</span>ftu_score<span style=color:#b84>}</span><span style=color:#b84>"</span>)
</span></span><span style=display:flex><span><span style=color:#999>print</span>(<span style=color:#b84>f</span><span style=color:#b84>"L2 score:</span><span style=color:#b84>\t</span><span style=color:#b84>{</span>latent_score<span style=color:#b84>}</span><span style=color:#b84>"</span>)
</span></span><span style=display:flex><span><span style=color:#999>print</span>(<span style=color:#b84>f</span><span style=color:#b84>"Fair add score:</span><span style=color:#b84>\t</span><span style=color:#b84>{</span>fair_add_score<span style=color:#b84>}</span><span style=color:#b84>"</span>)
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span><span style=color:#999>print</span>(<span style=color:#b84>f</span><span style=color:#b84>"Unfair RMSE:</span><span style=color:#b84>\t</span><span style=color:#b84>{</span>RMSE_unfair<span style=color:#b84>}</span><span style=color:#b84>"</span>)
</span></span><span style=display:flex><span><span style=color:#999>print</span>(<span style=color:#b84>f</span><span style=color:#b84>"FTU RMSE:</span><span style=color:#b84>\t</span><span style=color:#b84>{</span>RMSE_ftu<span style=color:#b84>}</span><span style=color:#b84>"</span>)
</span></span><span style=display:flex><span><span style=color:#999>print</span>(<span style=color:#b84>f</span><span style=color:#b84>"L2 RMSE:</span><span style=color:#b84>\t</span><span style=color:#b84>{</span>RMSE_latent<span style=color:#b84>}</span><span style=color:#b84>"</span>)
</span></span><span style=display:flex><span><span style=color:#999>print</span>(<span style=color:#b84>f</span><span style=color:#b84>"Fair add RMSE:</span><span style=color:#b84>\t</span><span style=color:#b84>{</span>RMSE_fair_add<span style=color:#b84>}</span><span style=color:#b84>"</span>)
</span></span></code></pre></div><h2 id=measuring-counterfactual-fairness x-intersect="currentHeading = '#measuring-counterfactual-fairness'">Measuring counterfactual fairness</h2><p>First, we will measure two quantities, the <strong>Statistical Parity Difference</strong> (SPD)<sup id=fnref:4><a href=#fn:4 class=footnote-ref role=doc-noteref>4</a></sup> and <strong>Disparate Impact</strong> (DI)<sup id=fnref:5><a href=#fn:5 class=footnote-ref role=doc-noteref>5</a></sup>.</p><h3 id=statistical-parity-difference--disparate-impact x-intersect="currentHeading = '#statistical-parity-difference--disparate-impact'">Statistical Parity Difference / Disparate Impact</h3><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span><span style=font-weight:700>from</span> <span style=color:#555>fairlearn.metrics</span> <span style=font-weight:700>import</span> demographic_parity_difference, demographic_parity_ratio</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>parities <span style=font-weight:700>=</span> []
</span></span><span style=display:flex><span>impacts <span style=font-weight:700>=</span> []
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span><span style=font-weight:700>for</span> a <span style=font-weight:700>in</span> A:
</span></span><span style=display:flex><span> parity <span style=font-weight:700>=</span> demographic_parity_difference(df_train[<span style=color:#b84>"ZFYA"</span>], df_train[<span style=color:#b84>"ZFYA"</span>],
</span></span><span style=display:flex><span> sensitive_features <span style=font-weight:700>=</span> df_train[a])
</span></span><span style=display:flex><span> di <span style=font-weight:700>=</span> demographic_parity_ratio(df_train[<span style=color:#b84>"ZFYA"</span>], df_train[<span style=color:#b84>"ZFYA"</span>],
</span></span><span style=display:flex><span> sensitive_features <span style=font-weight:700>=</span> df_train[a])
</span></span><span style=display:flex><span> parities<span style=font-weight:700>.</span>append(parity)
</span></span><span style=display:flex><span> impacts<span style=font-weight:700>.</span>append(di)
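For intuition, with a binary decision these two metrics reduce to a difference and a ratio of group selection rates. A hand-rolled sketch on synthetic data (illustrative names, not the fairlearn implementation):

```python
import numpy as np

rng = np.random.default_rng(7)
y_pred = rng.integers(0, 2, size=1000)  # binary decisions
group = rng.integers(0, 2, size=1000)   # a single binary protected attribute

rate_0 = y_pred[group == 0].mean()      # selection rate for group 0
rate_1 = y_pred[group == 1].mean()      # selection rate for group 1

spd = abs(rate_0 - rate_1)                      # statistical parity difference
di = min(rate_0, rate_1) / max(rate_0, rate_1)  # disparate impact ratio
print(spd, di)
```

Perfect parity gives an SPD of 0 and a DI of 1; the fairlearn helpers generalise this to multi-valued sensitive features.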
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span>df_parities <span style=font-weight:700>=</span> pd<span style=font-weight:700>.</span>DataFrame({<span style=color:#b84>'protected'</span>:A,<span style=color:#b84>'parity'</span>:parities,<span style=color:#b84>'impact'</span>:impacts})
</span></span></code></pre></div><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span><span style=font-weight:700>import</span> <span style=color:#555>matplotlib.pyplot</span> <span style=font-weight:700>as</span> <span style=color:#555>plt</span>
</span></span><span style=display:flex><span><span style=font-weight:700>from</span> <span style=color:#555>plotutils</span> <span style=font-weight:700>import</span> <span style=font-weight:700>*</span>
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>fig <span style=font-weight:700>=</span> plt<span style=font-weight:700>.</span>figure()
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>ax <span style=font-weight:700>=</span> fig<span style=font-weight:700>.</span>add_subplot(<span style=color:#099>111</span>)
</span></span><span style=display:flex><span>ax2 <span style=font-weight:700>=</span> ax<span style=font-weight:700>.</span>twinx()
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>fig<span style=font-weight:700>.</span>suptitle(<span style=color:#b84>'Statistical Parity Difference and Disparate Impact'</span>)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>width <span style=font-weight:700>=</span> <span style=color:#099>0.4</span>
</span></span><span style=display:flex><span>df_parities<span style=font-weight:700>.</span>plot(x <span style=font-weight:700>=</span><span style=color:#b84>'protected'</span>, y <span style=font-weight:700>=</span> <span style=color:#b84>'parity'</span>, kind <span style=font-weight:700>=</span> <span style=color:#b84>'bar'</span>, ax <span style=font-weight:700>=</span> ax, width <span style=font-weight:700>=</span> width,
</span></span><span style=display:flex><span> position<span style=font-weight:700>=</span><span style=color:#099>1</span>, color<span style=font-weight:700>=</span>colours[<span style=color:#099>0</span>], legend<span style=font-weight:700>=</span><span style=font-weight:700>False</span>)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>df_parities<span style=font-weight:700>.</span>plot(x <span style=font-weight:700>=</span><span style=color:#b84>'protected'</span>, y <span style=font-weight:700>=</span> <span style=color:#b84>'impact'</span>, kind <span style=font-weight:700>=</span> <span style=color:#b84>'bar'</span>, ax <span style=font-weight:700>=</span> ax2, width <span style=font-weight:700>=</span> width,
</span></span><span style=display:flex><span> position <span style=font-weight:700>=</span> <span style=color:#099>0</span>, color <span style=font-weight:700>=</span> colours[<span style=color:#099>1</span>], legend <span style=font-weight:700>=</span> <span style=font-weight:700>False</span>)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>ax<span style=font-weight:700>.</span>axhline(y <span style=font-weight:700>=</span> <span style=color:#099>0.1</span>, linestyle <span style=font-weight:700>=</span> <span style=color:#b84>'dashed'</span>, alpha <span style=font-weight:700>=</span> <span style=color:#099>0.7</span>, color <span style=font-weight:700>=</span> colours[<span style=color:#099>0</span>])
</span></span><span style=display:flex><span>ax2<span style=font-weight:700>.</span>axhline(y <span style=font-weight:700>=</span> <span style=color:#099>0.55</span>, linestyle <span style=font-weight:700>=</span> <span style=color:#b84>'dashed'</span>, alpha <span style=font-weight:700>=</span> <span style=color:#099>0.7</span>, color <span style=font-weight:700>=</span> colours[<span style=color:#099>1</span>])
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>patches, labels <span style=font-weight:700>=</span> ax<span style=font-weight:700>.</span>get_legend_handles_labels()
</span></span><span style=display:flex><span>ax<span style=font-weight:700>.</span>legend(patches, [<span style=color:#b84>'Stat Parity Diff'</span>], loc <span style=font-weight:700>=</span> <span style=color:#b84>'upper left'</span>)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>patches, labels <span style=font-weight:700>=</span> ax2<span style=font-weight:700>.</span>get_legend_handles_labels()
</span></span><span style=display:flex><span>ax2<span style=font-weight:700>.</span>legend(patches, [<span style=color:#b84>'Disparate Impact'</span>], loc <span style=font-weight:700>=</span> <span style=color:#b84>'upper right'</span>)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>labels <span style=font-weight:700>=</span> [item<span style=font-weight:700>.</span>get_text() <span style=font-weight:700>for</span> item <span style=font-weight:700>in</span> ax<span style=font-weight:700>.</span>get_xticklabels()]
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span><span style=font-weight:700>for</span> i <span style=font-weight:700>in</span> <span style=color:#999>range</span>(<span style=color:#999>len</span>(A)):
</span></span><span style=display:flex><span> labels[i] <span style=font-weight:700>=</span> A[i]
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>ax<span style=font-weight:700>.</span>set_xticklabels(labels)
</span></span><span style=display:flex><span>ax<span style=font-weight:700>.</span>set_xlabel(<span style=color:#b84>'Protected Features'</span>)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>ax<span style=font-weight:700>.</span>set_ylabel(<span style=color:#b84>'Statistical Parity Difference'</span>)
</span></span><span style=display:flex><span>ax2<span style=font-weight:700>.</span>set_ylabel(<span style=color:#b84>'Disparate Impact'</span>)
</span></span><span style=display:flex><span>
</span></span><span style=display:flex><span>plt<span style=font-weight:700>.</span>show()
</span></span></code></pre></div><h3 id=finding-sensitive-features x-intersect="currentHeading = '#finding-sensitive-features'">Finding sensitive features</h3><p>Typically, an $SPD > 0.1$ or a $DI < 0.9$ might indicate discrimination on a feature.
All <em>protected attributes</em> fail the SPD test and, in our dataset, two features (<code>Hispanic</code> and <code>Mexican</code>) clearly fail the DI test.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python><span style=display:flex><span><span style=font-weight:700>for</span> a <span style=font-weight:700>in</span> [<span style=color:#b84>"Mexican"</span>, <span style=color:#b84>"Hispanic"</span>]:
</span></span><span style=display:flex><span> spd <span style=font-weight:700>=</span> demographic_parity_difference(y_true<span style=font-weight:700>=</span>df_train[<span style=color:#b84>"ZFYA"</span>],
</span></span><span style=display:flex><span> y_pred<span style=font-weight:700>=</span>df_train[<span style=color:#b84>"ZFYA"</span>],
</span></span><span style=display:flex><span> sensitive_features <span style=font-weight:700>=</span> df_train[a])
</span></span><span style=display:flex><span> <span style=color:#999>print</span>(<span style=color:#b84>f</span><span style=color:#b84>"SPD(</span><span style=color:#b84>{</span>a<span style=color:#b84>}</span><span style=color:#b84>) = </span><span style=color:#b84>{</span>spd<span style=color:#b84>}</span><span style=color:#b84>"</span>)
</span></span><span style=display:flex><span> di <span style=font-weight:700>=</span> demographic_parity_ratio(y_true<span style=font-weight:700>=</span>df_train[<span style=color:#b84>"ZFYA"</span>],
</span></span><span style=display:flex><span> y_pred<span style=font-weight:700>=</span>df_train[<span style=color:#b84>"ZFYA"</span>],
</span></span><span style=display:flex><span> sensitive_features <span style=font-weight:700>=</span> df_train[a])
</span></span><span style=display:flex><span> <span style=color:#999>print</span>(<span style=color:#b84>f</span><span style=color:#b84>"DI(</span><span style=color:#b84>{</span>a<span style=color:#b84>}</span><span style=color:#b84>) = </span><span style=color:#b84>{</span>di<span style=color:#b84>}</span><span style=color:#b84>"</span>)
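</span></span></code></pre></div><p>For intuition, both metrics reduce to per-group selection rates: the statistical parity difference is the largest gap between group-level rates, and disparate impact is the ratio of the lowest to the highest rate. The sketch below computes them by hand on made-up toy data (the column names are illustrative, not from the dataset above); fairlearn's <code>demographic_parity_difference</code> and <code>demographic_parity_ratio</code> return the same quantities.</p><div class=highlight><pre tabindex=0 style=background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-python data-lang=python>
```python
import pandas as pd

# Hypothetical toy data: a binary decision and one binary protected attribute.
df = pd.DataFrame({
    "decision":  [1, 0, 1, 1, 0, 1, 0, 0],
    "protected": [1, 1, 1, 1, 0, 0, 0, 0],
})

# Selection rate per group, P(decision = 1 | protected)
rates = df.groupby("protected")["decision"].mean()

# Statistical parity difference: largest gap between group selection rates
spd = rates.max() - rates.min()

# Disparate impact: ratio of the lowest to the highest selection rate
di = rates.min() / rates.max()

print(f"SPD = {spd:.2f}, DI = {di:.2f}")  # SPD = 0.50, DI = 0.33
```
<span style=display:flex><span>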
</span></span></code></pre></div><div class=footnotes role=doc-endnotes><hr><ol><li id=fn:1><p>McIntyre, Frank, and Michael Simkovic. “Are law degrees as valuable to minorities?” International Review of Law and Economics 53 (2018): 23-37. <a href=#fnref:1 class=footnote-backref role=doc-backlink>↩︎</a></p></li><li id=fn:2><p>Kusner, Matt J., Joshua Loftus, Chris Russell, and Ricardo Silva. “Counterfactual fairness.” In Advances in Neural Information Processing Systems, pp. 4066-4076. 2017. <a href=#fnref:2 class=footnote-backref role=doc-backlink>↩︎</a> <a href=#fnref1:2 class=footnote-backref role=doc-backlink>↩︎</a> <a href=#fnref2:2 class=footnote-backref role=doc-backlink>↩︎</a> <a href=#fnref3:2 class=footnote-backref role=doc-backlink>↩︎</a> <a href=#fnref4:2 class=footnote-backref role=doc-backlink>↩︎</a></p></li><li id=fn:3><p>These may in turn be correlated with one another. <a href=#fnref:3 class=footnote-backref role=doc-backlink>↩︎</a></p></li><li id=fn:4><p>See <code>fairness:demographic-parity-difference</code>. <a href=#fnref:4 class=footnote-backref role=doc-backlink>↩︎</a></p></li><li id=fn:5><p>See <code>fairness:disparate-impact</code>.
<a href=#fnref:5 class=footnote-backref role=doc-backlink>↩︎</a></p></li></ol></div></div></article><div id=footer-post-container><div id=footer-post><div id=nav-footer style=display:none><ul><li><a href=https://ruivieira.dev/>Home</a></li><li><a href=https://ruivieira.dev/blog/>Blog</a></li><li><a href=https://ruivieira.dev/draw/>Drawings</a></li><li><a href=https://ruivieira.dev/map/>All pages</a></li><li><a href=https://ruivieira.dev/search.html>Search</a></li></ul></div><div id=toc-footer style=display:none><nav id=TableOfContents><ul><li><a href=#building-counterfactually-fair-models>Building counterfactually fair models</a><ul><li><a href=#data>Data</a></li><li><a href=#pre-processing>Pre-processing</a></li><li><a href=#protected-attributes>Protected attributes</a></li><li><a href=#training-and-testing-subsets>Training and testing subsets</a></li></ul></li><li><a href=#models>Models</a><ul><li><a href=#unfair-model>Unfair model</a></li><li><a href=#full-model>Full model</a></li><li><a href=#fairness-through-unawareness-ftu>Fairness through unawareness (FTU)</a></li><li><a href=#latent-variable-model>Latent variable model</a></li><li><a href=#additive-error-model>Additive error model</a></li><li><a href=#comparison>Comparison</a></li></ul></li><li><a href=#measuring-counterfactual-fairness>Measuring counterfactual fairness</a><ul><li><a href=#statistical-parity-difference--disparate-impact>Statistical Parity Difference / Disparate Impact</a></li><li><a href=#finding-sensitive-features>Finding sensitive features</a></li></ul></li></ul></nav></div><div id=share-footer style=display:none></div><div id=actions-footer><a id=menu-toggle class=icon href=# onclick='return $("#nav-footer").toggle(),!1' aria-label=Menu><i class="fas fa-bars fa-lg" aria-hidden=true></i> Menu</a>
<a id=toc-toggle class=icon href=# onclick='return $("#toc-footer").toggle(),!1' aria-label=TOC><i class="fas fa-list fa-lg" aria-hidden=true></i> TOC</a>
<a id=share-toggle class=icon href=# onclick='return $("#share-footer").toggle(),!1' aria-label=Share><i class="fas fa-share-alt fa-lg" aria-hidden=true></i> share</a>
<a id=top style=display:none class=icon href=# onclick='$("html, body").animate({scrollTop:0},"fast")' aria-label="Top of Page"><i class="fas fa-chevron-up fa-lg" aria-hidden=true></i> Top</a></div></div></div><footer id=footer><div class=footer-left>Copyright © 2024 Rui Vieira</div><div class=footer-right><nav><ul><li><a href=https://ruivieira.dev/>Home</a></li><li><a href=https://ruivieira.dev/blog/>Blog</a></li><li><a href=https://ruivieira.dev/draw/>Drawings</a></li><li><a href=https://ruivieira.dev/map/>All pages</a></li><li><a href=https://ruivieira.dev/search.html>Search</a></li></ul></nav></div></footer></div></body><link rel=stylesheet href=https://ruivieira.dev/css/fa.min.css><script src=https://ruivieira.dev/js/jquery-3.6.0.min.js></script><script src=https://ruivieira.dev/js/mark.min.js></script><script src=https://ruivieira.dev/js/main.js></script><script>MathJax={tex:{inlineMath:[["$","$"],["\\(","\\)"]]},svg:{fontCache:"global"}}</script><script type=text/javascript id=MathJax-script async src=https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js></script></html>