Commit

fixed images in proposal
John Shiu committed Jun 22, 2024
1 parent 4488e04 commit 21ef07d
Showing 4 changed files with 36 additions and 24 deletions.
46 changes: 23 additions & 23 deletions report/docs/report/final_report.html
@@ -402,7 +402,7 @@ <h3 class="anchored" data-anchor-id="evaluation-results">Evaluation Results</h3>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a>gt <span class="op">=</span> pd.read_csv(<span class="st">'../data/processed/ground_truth.csv'</span>)</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a>gt</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
-<div class="cell-output cell-output-display" data-execution_count="1">
+<div class="cell-output cell-output-display" data-execution_count="6">
<div>


@@ -540,26 +540,26 @@ <h3 class="anchored" data-anchor-id="evaluation-results">Evaluation Results</h3>
<span id="cb2-52"><a href="#cb2-52" aria-hidden="true" tabindex="-1"></a> titleFontSize<span class="op">=</span><span class="dv">12</span></span>
<span id="cb2-53"><a href="#cb2-53" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
-<div class="cell-output cell-output-display" data-execution_count="2">
+<div class="cell-output cell-output-display" data-execution_count="7">

<style>
-#altair-viz-2e51836bde5f498c91a72176f891a9a9.vega-embed {
+#altair-viz-1999d76b84e542058601d1322c45e9f7.vega-embed {
width: 100%;
display: flex;
}

-#altair-viz-2e51836bde5f498c91a72176f891a9a9.vega-embed details,
-#altair-viz-2e51836bde5f498c91a72176f891a9a9.vega-embed details summary {
+#altair-viz-1999d76b84e542058601d1322c45e9f7.vega-embed details,
+#altair-viz-1999d76b84e542058601d1322c45e9f7.vega-embed details summary {
position: relative;
}
</style>
-<div id="altair-viz-2e51836bde5f498c91a72176f891a9a9"></div>
+<div id="altair-viz-1999d76b84e542058601d1322c45e9f7"></div>
<script type="text/javascript">
var VEGA_DEBUG = (typeof VEGA_DEBUG == "undefined") ? {} : VEGA_DEBUG;
(function(spec, embedOpt){
let outputDiv = document.currentScript.previousElementSibling;
-if (outputDiv.id !== "altair-viz-2e51836bde5f498c91a72176f891a9a9") {
-outputDiv = document.getElementById("altair-viz-2e51836bde5f498c91a72176f891a9a9");
+if (outputDiv.id !== "altair-viz-1999d76b84e542058601d1322c45e9f7") {
+outputDiv = document.getElementById("altair-viz-1999d76b84e542058601d1322c45e9f7");
}
const paths = {
"vega": "https://cdn.jsdelivr.net/npm/vega@5?noext",
@@ -631,7 +631,7 @@ <h3 class="anchored" data-anchor-id="evaluation-results">Evaluation Results</h3>
<span id="cb3-13"><a href="#cb3-13" aria-hidden="true" tabindex="-1"></a>contingency_table.index.names <span class="op">=</span> [<span class="st">'Repository'</span>, <span class="st">'Checklist Item'</span>, <span class="st">'Ground Truth'</span>]</span>
<span id="cb3-14"><a href="#cb3-14" aria-hidden="true" tabindex="-1"></a>contingency_table.sort_index(level<span class="op">=</span>[<span class="dv">0</span>, <span class="dv">2</span>])</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
-<div class="cell-output cell-output-display" data-execution_count="3">
+<div class="cell-output cell-output-display" data-execution_count="8">
<div>


@@ -816,26 +816,26 @@ <h3 class="anchored" data-anchor-id="evaluation-results">Evaluation Results</h3>
<span id="cb4-42"><a href="#cb4-42" aria-hidden="true" tabindex="-1"></a> title<span class="op">=</span><span class="st">"30 Runs on Openja's Repositories for each Checklist Item"</span></span>
<span id="cb4-43"><a href="#cb4-43" aria-hidden="true" tabindex="-1"></a>) </span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
-<div class="cell-output cell-output-display" data-execution_count="4">
+<div class="cell-output cell-output-display" data-execution_count="9">

<style>
-#altair-viz-76f736c12da445dbb517ce132d68e98d.vega-embed {
+#altair-viz-a5a0777d203546aeaa42f821ea918c6b.vega-embed {
width: 100%;
display: flex;
}

-#altair-viz-76f736c12da445dbb517ce132d68e98d.vega-embed details,
-#altair-viz-76f736c12da445dbb517ce132d68e98d.vega-embed details summary {
+#altair-viz-a5a0777d203546aeaa42f821ea918c6b.vega-embed details,
+#altair-viz-a5a0777d203546aeaa42f821ea918c6b.vega-embed details summary {
position: relative;
}
</style>
-<div id="altair-viz-76f736c12da445dbb517ce132d68e98d"></div>
+<div id="altair-viz-a5a0777d203546aeaa42f821ea918c6b"></div>
<script type="text/javascript">
var VEGA_DEBUG = (typeof VEGA_DEBUG == "undefined") ? {} : VEGA_DEBUG;
(function(spec, embedOpt){
let outputDiv = document.currentScript.previousElementSibling;
-if (outputDiv.id !== "altair-viz-76f736c12da445dbb517ce132d68e98d") {
-outputDiv = document.getElementById("altair-viz-76f736c12da445dbb517ce132d68e98d");
+if (outputDiv.id !== "altair-viz-a5a0777d203546aeaa42f821ea918c6b") {
+outputDiv = document.getElementById("altair-viz-a5a0777d203546aeaa42f821ea918c6b");
}
const paths = {
"vega": "https://cdn.jsdelivr.net/npm/vega@5?noext",
@@ -960,26 +960,26 @@ <h4 class="anchored" data-anchor-id="comparison-of-gpt-3.5-turbo-and-gpt-4o">Com
<span id="cb5-55"><a href="#cb5-55" aria-hidden="true" tabindex="-1"></a> titleFontSize<span class="op">=</span><span class="dv">12</span></span>
<span id="cb5-56"><a href="#cb5-56" aria-hidden="true" tabindex="-1"></a>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
-<div class="cell-output cell-output-display" data-execution_count="5">
+<div class="cell-output cell-output-display" data-execution_count="10">

<style>
-#altair-viz-7bb302ef75b745b78e688e47a21a7c6a.vega-embed {
+#altair-viz-8cc4c4e5769643588ef22330585b69e2.vega-embed {
width: 100%;
display: flex;
}

-#altair-viz-7bb302ef75b745b78e688e47a21a7c6a.vega-embed details,
-#altair-viz-7bb302ef75b745b78e688e47a21a7c6a.vega-embed details summary {
+#altair-viz-8cc4c4e5769643588ef22330585b69e2.vega-embed details,
+#altair-viz-8cc4c4e5769643588ef22330585b69e2.vega-embed details summary {
position: relative;
}
</style>
-<div id="altair-viz-7bb302ef75b745b78e688e47a21a7c6a"></div>
+<div id="altair-viz-8cc4c4e5769643588ef22330585b69e2"></div>
<script type="text/javascript">
var VEGA_DEBUG = (typeof VEGA_DEBUG == "undefined") ? {} : VEGA_DEBUG;
(function(spec, embedOpt){
let outputDiv = document.currentScript.previousElementSibling;
-if (outputDiv.id !== "altair-viz-7bb302ef75b745b78e688e47a21a7c6a") {
-outputDiv = document.getElementById("altair-viz-7bb302ef75b745b78e688e47a21a7c6a");
+if (outputDiv.id !== "altair-viz-8cc4c4e5769643588ef22330585b69e2") {
+outputDiv = document.getElementById("altair-viz-8cc4c4e5769643588ef22330585b69e2");
}
const paths = {
"vega": "https://cdn.jsdelivr.net/npm/vega@5?noext",
1 change: 1 addition & 0 deletions report/docs/robots.txt
@@ -0,0 +1 @@
+Sitemap: https://UBC-MDS.github.io/fixml/sitemap.xml
11 changes: 11 additions & 0 deletions report/docs/sitemap.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
+<url>
+<loc>https://UBC-MDS.github.io/fixml/report/proposal.html</loc>
+<lastmod>2024-06-22T00:11:21.453Z</lastmod>
+</url>
+<url>
+<loc>https://UBC-MDS.github.io/fixml/report/final_report.html</loc>
+<lastmod>2024-06-22T00:11:20.477Z</lastmod>
+</url>
+</urlset>
2 changes: 1 addition & 1 deletion report/proposal.qmd
@@ -36,7 +36,7 @@ We propose to develop testing suites diagnostic tools based on Large Language Mo

Our solution offers an end-to-end application for evaluating and enhancing the robustness of users' ML systems.

-![Main components and workflow of the proposed system. The checklist would be written in [YAML](https://yaml.org/) to maximize readability for both humans and machines. We hope this will encourage researchers/users to read, understand and modify the checklist items, while keeping the checklist closely integrated with other components in our system.](img/proposed_system_overview.png)
+![Main components and workflow of the proposed system. The checklist would be written in [YAML](https://yaml.org/) to maximize readability for both humans and machines. We hope this will encourage researchers/users to read, understand and modify the checklist items, while keeping the checklist closely integrated with other components in our system.](../img/proposed_system_overview.png)

One big challenge in utilizing LLMs to reliably and consistently evaluate ML systems is their tendency to generate illogical and/or factually wrong information known as hallucination [@zhang2023sirens].
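
Note on the image path change in this hunk: the only difference between the two figure lines is the target, `img/proposed_system_overview.png` versus `../img/proposed_system_overview.png`. A minimal sketch of how each target resolves, assuming (this is an assumption, not stated in the commit) that the renderer resolves relative image paths against the directory of `report/proposal.qmd`. The helper name `resolve_from_document` is hypothetical and only for illustration.

```python
import posixpath

# Path of the edited document, as shown in the diff header.
QMD_PATH = "report/proposal.qmd"


def resolve_from_document(document_path: str, reference: str) -> str:
    """Resolve a relative reference against the directory of the document.

    Assumption: relative image paths are interpreted relative to the
    directory that contains the document itself.
    """
    base_dir = posixpath.dirname(document_path)
    return posixpath.normpath(posixpath.join(base_dir, reference))


# Target used before the commit.
old_target = resolve_from_document(QMD_PATH, "img/proposed_system_overview.png")
# Target used after the commit.
new_target = resolve_from_document(QMD_PATH, "../img/proposed_system_overview.png")

print(old_target)
print(new_target)
```

Under that assumption, the old reference looks for the image inside `report/img/`, while the new one looks in an `img/` directory one level above `report/`, which is consistent with the commit message "fixed images in proposal".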

