hazelgrove · mirryi · Jan 21, 2022 · Jan 23, 2022 · Jan 30, 2022 · Jan 30, 2022
diff --git a/14-compiler/.gitignore b/14-compiler/.gitignore
@@ -0,0 +1,2 @@
+*.pdf
+target/
diff --git a/14-compiler/01-motivation.tex b/14-compiler/01-motivation.tex
@@ -0,0 +1,47 @@
+\documentclass[index.tex]{subfiles}
+
+\begin{document}
+\section{Motivation}
+\label{motivation}
+Right now, \Hazel{} has a simple tree-walk evaluator based on environments, but not a compiler that
+can output fast, optimized executables. This PHI proposes such a compiler.
+
+\subsection{Objectives}
+\label{objectives}
+There are a number of criteria we would like the compiler to satisfy:
+%
+\begin{itemize}
+  \item Compiled programs should match the semantics and output of the existing evaluator. This
+    means the compiler should
+    %
+    \begin{itemize}
+      \item support incomplete programs, i.e. programs with holes.
+      \item produce programs whose final result is the same as the evaluator's.
+    \end{itemize}
+
+  \item Compiled programs should have faster execution speed than evaluated ones.
+  \item Complete \Hazel{} programs should execute as normal functional programs; they should require
+    no extra machinery related to holes.
+\end{itemize}
+%
+In addition, it may be interesting to explore interoperability with the evaluator and live
+programming environment:
+%
+\begin{itemize}
+  \item Dynamic compilation with hand-off between evaluator and compiler.
+  \item Incremental compilation, to reduce overhead of compilation in a live environment.
+  \item Incremental execution, via fill-and-resume-like behaviour.
+\end{itemize}
+
+\subsection{Challenges}
+\label{challenges}
+The presence of holes poses fundamental challenges for compilers, as \emph{indeterminate results}
+produced by \Hazel's semantics are not values. Hence, we are concerned with
+%
+\begin{itemize}
+  \item Speed- and space-efficient representations for holes and indeterminate results, as well as
+    performant operations on them.
+  \item Static analyses for optimizing incomplete programs.
+  \item Incrementality and liveness; see \cref{objectives} above.
+\end{itemize}
+\end{document}
diff --git a/14-compiler/02-approach.tex b/14-compiler/02-approach.tex
@@ -0,0 +1,81 @@
+\documentclass[index.tex]{subfiles}
+
+\begin{document}
+\section{Approach}
+\label{sec:approach}
+This section outlines the approaches we take towards various aspects of compilation and execution
+related to holes.
+
+\subsection{Runtime representation}
+\label{sec:runtime-representation}
+We represent indeterminate results as syntax trees in the runtime, and operations taking
+indeterminate results merely accumulate a new syntax tree.
+%
+\begin{example}
+  In the program $1 + (5 * 6 + \SyEHole{1}{})$, $5 * 6$ produces an ordinary number $30$, but the
+  subsequent $+ \SyEHole{1}{}$ gives an indeterminate syntax tree with root $+$, left child $30$, and
+  right child $\SyEHole{1}{}$. The final result is an indeterminate syntax tree with root $+$, left
+  child $1$, and right child that is the previous tree.
+\end{example}
+%
+\noindent This necessitates dynamic checks before each operation to determine whether operands are
+values or indeterminate results; therefore, there must be some runtime data that discriminates
+between values and indeterminate results.
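+
+One way to picture this check, as a sketch only: tag every runtime result as either
+$\mathsf{Val}(v)$ or $\mathsf{Indet}(d)$, where $d$ is a syntax tree. The tags $\mathsf{Val}$ and
+$\mathsf{Indet}$, the runtime addition $\oplus$, and the function $\lfloor r \rfloor$ that returns
+the syntax tree of a result are notation assumed here, not part of the proposal. Addition then
+behaves as
+\[
+  r_1 \oplus r_2 =
+  \left\{
+  \begin{array}{ll}
+    \mathsf{Val}(n_1 + n_2)
+      & \mbox{if } r_1 = \mathsf{Val}(n_1) \mbox{ and } r_2 = \mathsf{Val}(n_2), \\
+    \mathsf{Indet}(\lfloor r_1 \rfloor + \lfloor r_2 \rfloor)
+      & \mbox{otherwise.}
+  \end{array}
+  \right.
+\]
+In the example above, $\mathsf{Val}(30) \oplus \mathsf{Indet}(\SyEHole{1}{})$ gives
+$\mathsf{Indet}(30 + \SyEHole{1}{})$.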
+
+\subsection{Casting}
+\label{sec:casting}
+
+\subsubsection{Embedding/projection}
+In a first pass, we adopt a type-indexed embedding/projection pair approach \cite{benton2005,
+new2018}: casting $x$ from type $\tau$ to the hole type \emph{embeds} $x$ with type information
+about $\tau$ into a proxy; casting to the type $\tau'$ \emph{projects} the proxy, dynamically
+checking if $\tau = \tau'$.
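+
+As a sketch only, with the pair notation and the names $\mathsf{embed}$ and $\mathsf{project}$
+assumed here rather than taken from the cited work, the proxy can be pictured as a value tagged
+with its source type:
+\[
+  \mathsf{embed}_{\tau}(x) = (\tau, x),
+  \qquad
+  \mathsf{project}_{\tau'}((\tau, x)) = x \quad \mbox{if } \tau = \tau'.
+\]
+When $\tau \neq \tau'$ the projection fails; how a failed cast is represented is not fixed here.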
+
+\subsubsection{Coercions}
+In the future, a coercion-based approach should be taken, which has some space efficiency guarantees
+\cite{herman2010, kuhlenschmidt2019}. A ``coercion calculus with holes'' is probably a whole
+research topic in itself.
+
+\subsubsection{Static analysis}
+As with determining whether an expression may be indeterminate at runtime
+(\Cref{sec:completeness-analysis}), it might be possible to statically determine where casts can
+appear. This could enable some optimizations, particularly when casts appear as scrutinees of
+pattern matching (\Cref{sec:pattern-matching}).
+
+\subsection{Pattern matching}
+\label{sec:pattern-matching}
+It should be possible to compile pattern matching with holes into ordinary functional pattern
+matching:
+\begin{itemize}
+  \item \emph{Hole patterns} may be compiled into wildcard patterns that immediately stop and return
+    the entire $\textsf{match}$ expression as an indeterminate result.
+  \item In the presence of casts, types must be matched on as data. This is something to consider
+    when designing the runtime representation of indeterminate results
+    (\Cref{sec:runtime-representation}).
+\end{itemize}
+
+\subsection{Completeness analysis}
+\label{sec:completeness-analysis}
+Since we want complete portions of a program (i.e. those without holes) to run as ordinary
+functional programs without the machinery necessary for handling holes, we perform a static
+\emph{completeness analysis} that determines whether an expression is guaranteed to be hole-free.
+To do this, we define a notion of \emph{completeness}:
+%
+\begin{definition}[name=Completeness, label=completeness]
+  An expression may be (1) \emph{necessarily complete}, i.e. it must be a value at runtime; (2)
+  \emph{necessarily incomplete}, i.e. it must be an indeterminate result at runtime; or (3)
+  \emph{indeterminately incomplete}, i.e. it may be either a value or an indeterminate result at
+  runtime.
+\end{definition}
+
+\subsubsection{Function-local completeness}
+A basic function-local analysis which treats function parameters as indeterminately
+incomplete by default may be implemented in a single pass over the expression tree. See
+\Cref{fig:lir-completeness-analysis-local} for the formalization.
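+
+As an illustration only (the rules in the referenced figure are authoritative), a local rule for
+addition might combine the completeness of its operands as follows. The abbreviations
+$\mathsf{NC}$, $\mathsf{NI}$ and $\mathsf{II}$ for the three cases of the definition above, and
+the function $c(\cdot)$, are notation assumed for this sketch:
+\[
+  c(e_1 + e_2) =
+  \left\{
+  \begin{array}{ll}
+    \mathsf{NI} & \mbox{if } c(e_1) = \mathsf{NI} \mbox{ or } c(e_2) = \mathsf{NI}, \\
+    \mathsf{NC} & \mbox{if } c(e_1) = \mathsf{NC} \mbox{ and } c(e_2) = \mathsf{NC}, \\
+    \mathsf{II} & \mbox{otherwise.}
+  \end{array}
+  \right.
+\]
+For a function parameter $x$, which defaults to $\mathsf{II}$, and assuming numeric literals are
+$\mathsf{NC}$, this gives $c(x + 1) = \mathsf{II}$ and $c(2 + 3) = \mathsf{NC}$.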
+
+\subsubsection{CFA-based completeness}
+A more complex analysis may use control-flow analysis techniques on higher-order functions
+\cite{shivers1991, nielson1999}. Other work to consider: \textcite{vardoulaskis2011} (though
+apparently with $2^{n}$ complexity) and \textcite{gilray2016}.
+
+\end{document}
diff --git a/14-compiler/03-01-middle-ir.tex b/14-compiler/03-01-middle-ir.tex
@@ -0,0 +1,10 @@
+\documentclass[index.tex]{subfiles}
+
+\begin{document}
+\subsection{Middle intermediate representation (MIR)}
+\label{sec:mir}
+
+\subsection{Sequentialization}
+\label{sec:sequentialization}
+
+\end{document}