Hyperproperties and why they're hard

Computer security as a field is met with a persistent criticism: flashy attacks are hot, boring (hard-fought) defenses are not. Compared to security, programming language theory seems relatively adept at transitioning research to mainstream practice (e.g., Rust and the popularity of Haskell). By contrast, it would appear many of the innovations in security have been relatively engineering-focused in nature: W^X, ASLR, and the like. As someone who does research in this space myself, I have wondered why this is; while I think the reasons are myriad, I think a central challenge is the intrinsic need in security to reason about the ever-elusive hyperproperties.

Properties

The properties of a program are mathematical statements we can make about it’s traces. For example, consider the following program:

while true:          # 0
  if (n % 2 == 0):   # 1
    n := n/2         # 2
  else:              # 3
    n := 3*n + 1     # 4

What could traces of this program’s execution look like? If n is known, we might say it is the sequence of n paired with the appropriate line number. For example, if n starts at 5:

⟪0, n=5⟫, ⟪1, n=5⟫, ⟪4, n=5⟫, ⟪0, n=16⟫, ⟪1, n=16⟫, ⟪2, n=16⟫, ⟪0, n=8⟫,
⟪1, n=8⟫, ⟪2, n=8⟫, ⟪0, n=4⟫, ⟪1, n=4⟫, ⟪2, n=4⟫, ⟪0, n=2⟫, ⟪1, n=2⟫, 
⟪2, n=2⟫, ⟪0, n=1⟫, ...

Each trace is (countably) infinite in size, without loss of generality (as a terminating program may be thought of as infinitely repeating a final skip state). Quantifying over the input is possible by producing an infinite set of (infinite) traces. For example, one property of the above (set of) traces is this: the suffix of every trace is the (infinite loop) ⟪0, n=4⟫, ⟪1, n=4⟫, ⟪2, n=4⟫, ⟪0, n=2⟫, ⟪1, n=2⟫, ⟪2, n=2⟫, ⟪0, n=1⟫, ....

The trace-based nature inherent to this style of semantics is a core challenge in checking properties of the associated programs, because computers may only materialize a finite approximation of infinite objects. Unfortunately, the exact nature of this approximation can deeply impact the kind of properties which may be checked. This is why the different approaches to program modeling and verification often pair a representation with an intended logic:

Model checking allow the user to specify the program using a domain-specific languages (see, e.g., Chapter 2.3 of the TLA+ book) which compiles into an automaton. This is no coincidence: statements in temporal logics also map cleanly to the same automata (an extension of finite automata called Büchi automata), at which point automata-theoretic constructions (intersection, complementation, etc…) may be used.
Abstract interpretation recognizes that the infinite behavior may be finitized by approximating values in the semantics as elements of a lattice allow using a fixed-point theorem to obtain tractability. In this setting, a set of traces is often modeled as an abstract state graph, the deterministic nature of traces often being lost in the (necessary) finite approximation; paths through the abstract state graph then correspond to traces in the (infinite) semantics. Using this style, an abstract interpreter materializes a state graph and then checks a property (e.g., the inclusion of a specific data flow) by examining this abstract representation.
Symbolic execution keeps a symbolic representation of the program state, and thus produces symbolic traces. The problem is that symbolic execution can only materialize finite portions of these traces, and thus it is impossible to reason about infinite behavior. Thus, a symbolic executor includes a “scheduler” which picks how to explore paths to avoid unproductive exploration. A central challenge in symbolic execution is how to avoid getting stuck early on in the trace. This problem is avoided in (say) abstract interpretation by allowing the abstract state graph to discover spurious edges, allowing reasoning from the “middle” of the program, but at the cost of adding traces not present in the original program.
Axiomatic semantics (e.g., Hoare triples) allows the user to annotate the program with specifications, which are subsequently enumerated to “verification conditions.” The semantics, and thus the set of traces, is implicitly represented as the set of valid program states which would satisfy the necessary verification conditions at any specific point in the program. Metatheoretic results justify the way in which we may stitch these verification conditions together to obtain an end-to-end correct program. Still no free lunch here, however, as the programmer often needs to articulate tedious loop invariants in an iterative process with the verification tool.

Safety and Liveness

Surprisingly, all properties of programs can be broken down into a combination of two things: a “safety” property, and a “liveness” property.

Safety properties are things which can be refuted using finite counterexamples. For example, if a program has a use-after-free vulnerability, you will be able to demonstrate some exploit wherein memory is freed and then subsequently (but in a finite time horizon) accessed. I.e., “bad things don’t happen.”
Liveness properties are things which require some finite proof of progress. The quintessential liveness examples are guaranteed service or starvation freedom.

For example, consider the following concurrent code, wherein process 1 and 2 operate concurrently:

n := 0
sem := 1

Process 1:
  while true:
    take(sem)
    i := 0
    while (i != n):
      skip
    n := n + 1
    rel(sem)

Process 2:
  while true:
    take(sem)
    if (n >= 7)
      *NULL; // crash
    rel(sem)

Assuming a fair scheduler, the program will eventually crash. Demonstrating this requires only that we enumerate a thread schedule wherein we choose process 1 to take sem (at least) seven times. A similar argument holds for any arbitrarily-picked natural number, thus the statement we’re making is one of safety.

Things that are crash-like (vulnerabilities, memory errors, etc…) are often safety properties. In fact, safety properties are so common it is not obvious that there are other kinds of properties. At least, this is probably true for computer scientists, pained by their hours of debugging safety properties. By contrast, liveness properties say that “good things” happen eventually. Examples include guaranteed availability and (surprisingly, perhaps) termination.

Formally, a liveness property has the form: “for all traces t0, there exists an extension t1 such that t0⋅t1 ⊨ P.” Termination is a simple instance of this liveness template, where P states that t1 is a stream of skip expressions. The fact that any trace property of a program may be factored into a combination of a safety property and liveness property follows from the topological interpretation where (a) safety properties represent closed sets (of traces) and (b) liveness properties represent open sets. I am not an expert on liveness, and will instead point you at the main paper I’ve read on it.

Properties and Their Proofs

It is hard to understate how much work has been done in checking safety and liveness properties of programs. Because safety properties feel like crashes, and programmers really hate crashes, we have naturally developed a huge amount of our programming culture around methods designed to help us understand and avoid crashes. Broadly, this breaks down into static checking / verification and dynamic auditing / enforcement / logging. For example, the central aim of most program analyses is to check safety properties of programs, including avoidance of crashes, errors, and similar faults, and reifies in vastly-different ways from printf-style debugging to abstract interpretation and program analysis tools (like Astrée, Soot, and DOOP), to program verification frameworks (like CertiCrypt). This is not to say there is no great work in the area (e.g., the Terminator project), but rather that I think most of us have a natural intuition for safety properties as programmers.

Notice that one I have left out until now is type systems. Type systems feel very different than the methods I described above, because they are prescriptive in nature and do not so obviously lead to a trace-based semantics. The semantics of a type system is designed to be compositional in the manner of natural deduction, which feels very different than the trace-based nature of safety and liveness properties.

Hyperproperties

Hyperproperties allow us to talk about multiple traces of a program at the same time. It is not immediately clear why this would be useful, but some of the most motivating examples are from computer security. For example, consider the following program:

Process 1:
  while true:
    s = read(secret)
    if (s == 0):
	  o = 1
    else:
      o = 0

Process 2:
  while true:
    out(o)

secret is a stream of secret (binary) values. Is it possible to infer secret from the (buffered) output? Of course: we just change the 1s to 0s. Formally, though, it’s a bit hard to think about a general property that would allow us to speak about the relationship between the input and the output that would avoid this.

It turns out this kind of obliviousness or “noninterference” is a more general kind of program property, a proper that can simultaneously quantify over two traces at once. The original formulation applied to sequential programs, for example:

if (secret == 0):
  out = 1
else:
  out = 1

The program does not leak information because for every trace t0, there exists another trace t1 such that the outputs are equivalent but the inputs are not. Informally: the output of the program does not depend on the secret input. Extending the definition of noninterference to concurrent systems proved hard because of a basic challenge: once the thread schedule is tainted by a secret value (e.g., by branching on a secret and then doing something schedule-specific), non-local reasoning must be done to ensure the thread schedule’s safety–all the hell of concurrent programming, doubled.

This is the point at which I would say, off-hand, “a particularly cool idea by Zdancewic and Myers is to statically-ensure observational determinism.” But, as thinking over this, I realized: the coolest results for checking hyperproperties I can find are intensional systems (a type system, in this case) to ensure the extensional guarantees necessitated by hyperproperties.

Proofs of Hyperproperties

Unlike proofs of safety and liveness, relatively little work has been done on proofs for hyperproperties. This is no insult to people working in this research direction; finding bugs in one trace at a time is proving hard enough, it seems.

I speculate that this is because each new advance in logics for safety checking corresponded with advances to the intensional representation used to check those properties. Model checking became a huge deal once we realized there were relatively simple automata-theoretic algorithms to do it, but also because of the advent of CEGAR-style methods. By contrast, I think similar advances happen in security coinciding with advances in intensional checking mechanisms for hyperproperties. For example, the aforementioned CertiCrypt and Jif.

A unifying trick to much of this work is to make hyperproperties become just regular properties by having the semantics generate pairs of traces, and then cooking up a version of a hypersafety statement of interest to speak about these pairs. This draws a connection between hyperproperties and the large body of literature on relational program verification. In some of my own work, we tried to use ideas from faceted execution to push the pairs from the trace-level to the value level. In other work I did on model checking for hyperproperties, we just used a product construction—the CEGAR algorithm would need to be updated to refine states appropriately, but we did not attempt that. These efforts are rooted in the intuition that we may exploit the rich body of work from program analyses and the like to immediately reason about these pairs of traces, but there is no free lunch, and making the abstractions work in practice is seriously challenging.

However, I have come think many of these efforts may be ultimately fruitless compared to using secure-by-construction building blocks. Once you’re at the level of needing to reason about noninterference, you’re already in a pretty serious threat model—timing channels and the like are going to be ever-present, and will necessitate increasingly-advanced program logics, CEGAR-style reasoning, and similar technologies, I believe. Any state-based abstraction of the program will be destined to imprecision which will eventually fall apart if the abstraction is not tightly tied to (and derived from) the program’s semantics in a predictable way, but the necessary explosion of the state graph (from the product construction) implies serious complexity-theoretic challenges here, I worry.

Thus, I claim we haven’t and won’t see hyperproperties (as they are now) take off as a general way to give us new innovations in proofs of security properties for programs. Instead, like the innovations in safety properties, I think we’ll see more and more specialized hyperproperties gain steam hand-in-hand with new innovations in checking those properties. But crucially, I think these will be intensional in nature: more security type systems, popular libraries which provide secure communication channels whose API satisfies hypersafety guarantees when used properly, “secure linking” DSLs at the binary layer which ensure specific invariants to guarantee end-to-end correctness. I claim the most useful security innovations (bounds checking, bytecode verification, etc..) have truly been safety properties, constructed and used in a way to guarantee (a specific instance of) hypersafety in practice.