Bluespec, a high-hevel HDL, offers a simple concurrency model that enables functional reasoning without compromising performance.
Unfortunately, its cost model is hard to formalize: performance depends on user hints and static analysis of conflicts within a design.
We present Kôika, a Bluespec derivative that gives direct control over scheduling decisions that determine performance, while using dynamic analysis to avoid concurrency anomalies.
Our implementation includes formal semantics, mechanized theorems, and a verified compiler.
State function = Rules + Explicit schedule
Core calculus for rule-based language amenable to formal reasoning about functionality and performance
Cycle-accurate operational semantics
Proof of one-rule-at-a-time abstraction
Full Coq mechanization of the semantics
Verified compilation to RTL
Performance case-study of a pipelined processor
HW spec =
Functional spec (ORAAT) + Performance spec (Not in previous work)
in
→ f
→
→ g
→ out
Functional spec : out = g (f (in))
Performance spec : out[t + 2] = g (f (in[t]))
Precise semantics allows for performance proofs
rule divide =
let v = r.rd_0() in
if iseven(v) then
r.wr_0(v >> 1)
rule multiply =
let v = r.rd_1() in
if isodd(v) then
r.wr_1(3 * v + 1)
schedule collatz = [divide; multiply]
rule swap =
s.wr_0(r.rd_0());
r.wr_0(s.rd_0())
rule dyn_abort =
if r.rd_0() == 0 then
t.wr_0(0b1);
if s.rd_0() == 0 then
t.wr_0(0b1)
Registers r
Variables x
External functions f
Constants b
Ports p ∈ {0, 1}
Actions a ⩴
$b
$x
$f($a, …, $a)
skip
let $x := $a in $a
$r.rd_$p()
$r.wr_$p($a)
if $a then $a else $a
…
$$
\newcommand{\app}{\mathbin{+\!\!+}}
\newcommand{\hypsep}{\hspace{1.25em}}
\newcommand{\deduction}[3][]{\frac{#2}{#3}{\small{#1}}}
\newcommand{\logRead}[2]{\left(\textsf{rd}_{#1}, #2\right)}
\newcommand{\logWrite}[3]{\left(\textsf{wr}_{#1}, #2, #3\right)}
\newcommand{\logAppend}[2]{#2 \app [#1]}
\newcommand{\judgeExec}[3]{\Gamma \vdash \computesTo{(\ell, #1)}{\left(#2, #3\right)}}
\newcommand{\sgaRead}[3]{#2.\texttt{rd}_{\texttt{#1}}}
\newcommand{\sgaWrite}[4]{#2.\texttt{wr}_{\texttt{#1}}\left(#4\right)}
\newcommand{\computesTo}[2]{#1 \downarrow #2}
$$
$$
\deduction[\text{RD}{_0}]
{\logWrite{1}{r}{*} \notin L \hypsep
\logWrite{0}{r}{*} \notin L}
{\judgeExec{\sgaRead{0}{r}}{\logAppend{\logRead{0}{r}}{\ell}}{\mathcal{R}\left[r\right]}}
$$
$$
\deduction[\text{WR}{_1}]
{\judgeExec{a}{\ell^\prime}{v} \hypsep
\logWrite{1}{r}{*} \notin L \app \ell^\prime }
{\judgeExec{\sgaWrite{1}{r}{a}}{\logAppend{\logWrite{1}{r}{v}}{\ell^\prime}}{\textsf{tt}}}
$$
Rich specifications serve as a contract between hardware designers and compiler writers and enable verified designs and verified compilation.
One-rule-at-a-time abstraction
Theorem OneRuleAtATime init schedule:
exists rules ⊂ schedule,
interp_s init schedule =
foldl interp_r init rules.
∈
ORAAT property enables local, modular reasoning
Proof mechanization enables semantics exploration
Sequential semantics
↓
Concurrent RTL with deferred checks
Verified circuit optimizations (constant propagation, mux elimination, partial evaluation) reduce the overhead of dynamic scheduling & conflict resolution.
Theorem CompilerCorrectness init schedule:
interp_s init schedule =
interp_rtl init (compile schedule).
≡
Output and downstream synthesis
Kôika outputs code in a safe subset of Verilog.
Downstream tools perform further optimizations and can generate FPGA and ASICs designs.
Hardware artifacts so far
Arithmetic-pipeline performance proof (on paper)
Kôika port of a simple BSV RISCV core (most of RV32i&e, critical path and area overhead ≤ 10%)
Kôika-level simulation, leveraging high-level structure for performance
Verification of performance (pipelining behavior) and timing properties of the RISCV core
Multi-core systems & enclaves, with proofs of safety from timing side-channels
Further language design, including native modularity