Acknowledgements
This material is taken verbatim from Software Foundations V4.0
Copyright (c) 2016
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
Assumed Knowledge:
- Some experience with Functional Programming
Learning Outcomes:
- Become familiar with the basic constructs of proofs in Coq
Welcome
This electronic book is a course on
Software Foundations, the
mathematical underpinnings of reliable software. Topics include
basic concepts of logic, computer-assisted theorem proving, the
Coq proof assistant, functional programming, operational
semantics, Hoare logic, and static type systems. The exposition
is intended for a broad range of readers, from advanced
undergraduates to PhD students and researchers. No specific
background in logic or programming languages is assumed, though a
degree of mathematical maturity will be helpful.
The principal novelty of the course is that it is one hundred
percent formalized and machine-checked: the entire text is
literally a script for Coq. It is intended to be read alongside
an interactive session with Coq. All the details in the text are
fully formalized in Coq, and the exercises are designed to be
worked using Coq.
The files are organized into a sequence of core chapters, covering
about one semester's worth of material and organized into a
coherent linear narrative, plus a number of "appendices" covering
additional topics. All the core chapters are suitable for both
upper-level undergraduate and graduate students.
Overview
Building reliable software is hard. The scale and complexity of
modern systems, the number of people involved in building them,
and the range of demands placed on them render it extremely
difficult to build software that is even more-or-less correct,
much less 100% correct. At the same time, the increasing degree
to which information processing is woven into every aspect of
society continually amplifies the cost of bugs and insecurities.
Computer scientists and software engineers have responded to these
challenges by developing a whole host of techniques for improving
software reliability, ranging from recommendations about managing
software projects and organizing programming teams (e.g., extreme
programming) to design philosophies for libraries (e.g.,
model-view-controller, publish-subscribe, etc.) and programming
languages (e.g., object-oriented programming, aspect-oriented
programming, functional programming, ...) to mathematical
techniques for specifying and reasoning about properties of
software and tools for helping validate these properties.
The present course is focused on this last set of techniques. The
text weaves together five conceptual threads:
(1) basic tools from
logic for making and justifying precise
claims about programs;
(2) the use of
proof assistants to construct rigorous logical
arguments;
(3) the idea of
functional programming, both as a method of
programming that simplifies reasoning about programs and as a
bridge between programming and logic;
(4) formal techniques for
reasoning about the properties of
specific programs (e.g., the fact that a sorting function or
a compiler obeys some formal specification); and
(5) the use of
type systems for establishing well-behavedness
guarantees for
all programs in a given programming
language (e.g., the fact that well-typed Java programs cannot
be subverted at runtime).
Each of these topics is easily rich enough to fill a whole course
in its own right, so tackling all of them together naturally means
that much will be left unsaid. Nevertheless, we hope readers will
find that the themes illuminate and amplify each other and that
bringing them together creates a foundation from which it will be
easy to dig into any of them more deeply. Some suggestions for
further reading can be found in the
Postscript chapter.
Bibliographic information for all cited works can be found in the
Bib chapter.
Logic
Logic is the field of study whose subject matter is
proofs —
unassailable arguments for the truth of particular propositions.
Volumes have been written about the central role of logic in
computer science. Manna and Waldinger called it "the calculus of
computer science," while Halpern et al.'s paper
On the Unusual
Effectiveness of Logic in Computer Science catalogs scores of
ways in which logic offers critical tools and insights. Indeed,
they observe that "As a matter of fact, logic has turned out to be
significiantly more effective in computer science than it has been
in mathematics. This is quite remarkable, especially since much
of the impetus for the development of logic during the past one
hundred years came from mathematics."
In particular, the fundamental notion of inductive proofs is
ubiquitous in all of computer science. You have surely seen them
before, in contexts from discrete math to analysis of algorithms,
but in this course we will examine them much more deeply than you
have probably done so far.
Proof Assistants
The flow of ideas between logic and computer science has not been
in just one direction: CS has also made important contributions to
logic. One of these has been the development of software tools
for helping construct proofs of logical propositions. These tools
fall into two broad categories:
- Automated theorem provers provide "push-button" operation:
you give them a proposition and they return either true,
false, or ran out of time. Although their capabilities
are limited to fairly specific sorts of reasoning, they have
matured tremendously in recent years and are used now in a
huge variety of settings. Examples of such tools include SAT
solvers, SMT solvers, and model checkers.
- Proof assistants are hybrid tools that automate the more
routine aspects of building proofs while depending on human
guidance for more difficult aspects. Widely used proof
assistants include Isabelle, Agda, Twelf, ACL2, PVS, and Coq,
among many others.
This course is based around Coq, a proof assistant that has been
under development, mostly in France, since 1983 and that in recent
years has attracted a large community of users in both research
and industry. Coq provides a rich environment for interactive
development of machine-checked formal reasoning. The kernel of
the Coq system is a simple proof-checker, which guarantees that
only correct deduction steps are performed. On top of this
kernel, the Coq environment provides high-level facilities for
proof development, including powerful tactics for constructing
complex proofs semi-automatically, and a large library of common
definitions and lemmas.
Coq has been a critical enabler for a huge variety of work across
computer science and mathematics:
- As a platform for modeling programming languages, it has become
a standard tool for researchers who need to describe and reason
about complex language definitions. It has been used, for
example, to check the security of the JavaCard platform,
obtaining the highest level of common criteria certification,
and for formal specifications of the x86 and LLVM instruction
sets and programming languages such as C.
- As an environment for developing formally certified software,
Coq has been used, for example, to build CompCert, a
fully-verified optimizing compiler for C, for proving the
correctness of subtle algorithms involving floating point
numbers, and as the basis for CertiCrypt, an environment for
reasoning about the security of cryptographic algorithms.
- As a realistic environment for functional programming with
dependent types, it has inspired numerous innovations. For
example, the Ynot project at Harvard embedded "relational Hoare
reasoning" (an extension of the Hoare Logic we will see later
in this course) in Coq.
- As a proof assistant for higher-order logic, it has been used
to validate a number of important results in mathematics. For
example, its ability to include complex computations inside
proofs made it possible to develop the first formally verified
proof of the 4-color theorem. This proof had previously been
controversial among mathematicians because part of it included
checking a large number of configurations using a program. In
the Coq formalization, everything is checked, including the
correctness of the computational part. More recently, an even
more massive effort led to a Coq formalization of the
Feit-Thompson Theorem — the first major step in the
classification of finite simple groups.
By the way, in case you're wondering about the name, here's what
the official Coq web site says: "Some French computer scientists
have a tradition of naming their software as animal species: Caml,
Elan, Foc or Phox are examples of this tacit convention. In French,
'coq' means rooster, and it sounds like the initials of the
Calculus of Constructions (CoC) on which it is based." The rooster
is also the national symbol of France, and C-o-q are the first
three letters of the name of Thierry Coquand, one of Coq's early
developers.
Functional Programming
The term
functional programming refers both to a collection of
programming idioms that can be used in almost any programming
language and to a family of programming languages designed to
emphasize these idioms, including Haskell, OCaml, Standard ML,
F#, Scala, Scheme, Racket, Common Lisp, Clojure, Erlang, and Coq.
Functional programming has been developed over many decades —
indeed, its roots go back to Church's lambda-calculus, which was
invented in the 1930s, before there were even any computers! But
since the early '90s it has enjoyed a surge of interest among
industrial engineers and language designers, playing a key role in
high-value systems at companies like Jane St. Capital, Microsoft,
Facebook, and Ericsson.
The most basic tenet of functional programming is that, as much as
possible, computation should be
pure, in the sense that the only
effect of execution should be to produce a result: the computation
should be free from
side effects such as I/O, assignments to
mutable variables, redirecting pointers, etc. For example,
whereas an
imperative sorting function might take a list of
numbers and rearrange its pointers to put the list in order, a
pure sorting function would take the original list and return a
new list containing the same numbers in sorted order.
One significant benefit of this style of programming is that it
makes programs easier to understand and reason about. If every
operation on a data structure yields a new data structure, leaving
the old one intact, then there is no need to worry about how that
structure is being shared and whether a change by one part of the
program might break an invariant that another part of the program
relies on. These considerations are particularly critical in
concurrent programs, where every piece of mutable state that is
shared between threads is a potential source of pernicious bugs.
Indeed, a large part of the recent interest in functional
programming in industry is due to its simpler behavior in the
presence of concurrency.
Another reason for the current excitement about functional
programming is related to the first: functional programs are often
much easier to parallelize than their imperative counterparts. If
running a computation has no effect other than producing a result,
then it does not matter
where it is run. Similarly, if a data
structure is never modified destructively, then it can be copied
freely, across cores or across the network. Indeed, the
"Map-Reduce" idiom, which lies at the heart of massively
distributed query processors like Hadoop and is used by Google to
index the entire web is a classic example of functional
programming.
For this course, functional programming has yet another
significant attraction: it serves as a bridge between logic and
computer science. Indeed, Coq itself can be viewed as a
combination of a small but extremely expressive functional
programming language plus with a set of tools for stating and
proving logical assertions. Moreover, when we come to look more
closely, we find that these two sides of Coq are actually aspects
of the very same underlying machinery — i.e.,
proofs are
programs.
Program Verification
Approximately the first third of the book is devoted to developing
the conceptual framework of logic and functional programming and
gaining enough fluency with Coq to use it for modeling and
reasoning about nontrivial artifacts. From this point on, we
increasingly turn our attention to two broad topics of critical
importance to the enterprise of building reliable software (and
hardware): techniques for proving specific properties of
particular
programs and for proving general properties of whole
programming
languages.
For both of these, the first thing we need is a way of
representing programs as mathematical objects, so we can talk
about them precisely, together with ways of describing their
behavior in terms of mathematical functions or relations. Our
tools for these tasks are
abstract syntax and
operational
semantics, a method of specifying programming languages by
writing abstract interpreters. At the beginning, we work with
operational semantics in the so-called "big-step" style, which
leads to somewhat simpler and more readable definitions when it is
applicable. Later on, we switch to a more detailed "small-step"
style, which helps make some useful distinctions between different
sorts of "nonterminating" program behaviors and is applicable to a
broader range of language features, including concurrency.
The first programming language we consider in detail is
Imp, a
tiny toy language capturing the core features of conventional
imperative programming: variables, assignment, conditionals, and
loops. We study two different ways of reasoning about the
properties of Imp programs.
First, we consider what it means to say that two Imp programs are
equivalent in the intuitive sense that they yield the same
behavior when started in any initial memory state. This notion of
equivalence then becomes a criterion for judging the correctness
of
metaprograms — programs that manipulate other programs, such
as compilers and optimizers. We build a simple optimizer for Imp
and prove that it is correct.
Second, we develop a methodology for proving that particular Imp
programs satisfy formal specifications of their behavior. We
introduce the notion of
Hoare triples — Imp programs annotated
with pre- and post-conditions describing what should be true about
the memory in which they are started and what they promise to make
true about the memory in which they terminate — and the reasoning
principles of
Hoare Logic, a "domain-specific logic" specialized
for convenient compositional reasoning about imperative programs,
with concepts like "loop invariant" built in.
This part of the course is intended to give readers a taste of the
key ideas and mathematical tools used in a wide variety of
real-world software and hardware verification tasks.
Type Systems
Our final major topic, covering approximately the last third of
the course, is
type systems, a powerful set of tools for
establishing properties of
all programs in a given language.
Type systems are the best established and most popular example of
a highly successful class of formal verification techniques known
as
lightweight formal methods. These are reasoning techniques
of modest power — modest enough that automatic checkers can be
built into compilers, linkers, or program analyzers and thus be
applied even by programmers unfamiliar with the underlying
theories. Other examples of lightweight formal methods include
hardware and software model checkers, contract checkers, and
run-time property monitoring techniques for detecting when some
component of a system is not behaving according to specification.
This topic brings us full circle: the language whose properties we
study in this part, the
simply typed lambda-calculus, is
essentially a simplified model of the core of Coq itself!
Further Reading
This text is intended to be self contained, but readers looking
for a deeper treatment of a particular topic will find suggestions
for further reading in the
Postscript chapter.
Practicalities
System Requirements
Coq runs on Windows, Linux, and OS X. You will need:
- A current installation of Coq, available from the Coq home
page. Everything should work with version 8.4 (or 8.5).
- An IDE for interacting with Coq. Currently, there are two
choices:
- Proof General is an Emacs-based IDE. It tends to be
preferred by users who are already comfortable with
Emacs. It requires a separate installation (google
"Proof General").
- CoqIDE is a simpler stand-alone IDE. It is distributed
with Coq, so it should "just work" once you have Coq
installed. It can also be compiled from scratch, but on
some platforms this may involve installing additional
packages for GUI libraries and such.
Exercises
Each chapter includes numerous exercises. Each is marked with a
"star rating," which can be interpreted as follows:
- One star: easy exercises that underscore points in the text
and that, for most readers, should take only a minute or two.
Get in the habit of working these as you reach them.
- Two stars: straightforward exercises (five or ten minutes).
- Three stars: exercises requiring a bit of thought (ten
minutes to half an hour).
- Four and five stars: more difficult exercises (half an hour
and up).
Also, some exercises are marked "advanced", and some are marked
"optional." Doing just the non-optional, non-advanced exercises
should provide good coverage of the core material. Optional
exercises provide a bit of extra practice with key concepts and
introduce secondary themes that may be of interest to some
readers. Advanced exercises are for readers who want an extra
challenge (and, in return, a deeper contact with the material).
Please do not post solutions to the exercises in any public place:
Software Foundations is widely used both for self-study and for
university courses. Having solutions easily available makes it
much less useful for courses, which typically have graded homework
assignments. The authors especially request that readers not post
solutions to the exercises anyplace where they can be found by
search engines.
Introduction
The functional programming style brings programming closer to
simple, everyday mathematics: If a procedure or method has no side
effects, then (ignoring efficiency) all we need to understand
about it is how it maps inputs to outputs — that is, we can think
of it as just a concrete method for computing a mathematical
function. This is one sense of the word "functional" in
"functional programming." The direct connection between programs
and simple mathematical objects supports both formal correctness
proofs and sound informal reasoning about program behavior.
The other sense in which functional programming is "functional" is
that it emphasizes the use of functions (or methods) as
first-class values — i.e., values that can be passed as
arguments to other functions, returned as results, included in
data structures, etc. The recognition that functions can be
treated as data in this way enables a host of useful and powerful
idioms.
Other common features of functional languages include
algebraic
data types and
pattern matching, which make it easy to
construct and manipulate rich data structures, and sophisticated
polymorphic type systems supporting abstraction and code reuse.
Coq shares all of these features.
The first half of this chapter introduces the most essential
elements of Coq's functional programming language. The second
half introduces some basic
tactics that can be used to prove
simple properties of Coq programs.
Enumerated Types
One unusual aspect of Coq is that its set of built-in
features is
extremely small. For example, instead of providing
the usual palette of atomic data types (booleans, integers,
strings, etc.), Coq offers a powerful mechanism for defining new
data types from scratch, from which all these familiar types arise
as instances.
Naturally, the Coq distribution comes with an extensive standard
library providing definitions of booleans, numbers, and many
common data structures like lists and hash tables. But there is
nothing magic or primitive about these library definitions. To
illustrate this, we will explicitly recapitulate all the
definitions we need in this course, rather than just getting them
implicitly from the library.
To see how this definition mechanism works, let's start with a
very simple example.
Days of the Week
The following declaration tells Coq that we are defining
a new set of data values — a
type.
The type is called
day, and its members are
monday,
tuesday, etc. The second and following lines of the definition
can be read "
monday is a
day,
tuesday is a
day, etc."
Having defined
day, we can write functions that operate on
days.
One thing to note is that the argument and return types of
this function are explicitly declared. Like most functional
programming languages, Coq can often figure out these types for
itself when they are not given explicitly — i.e., it performs
type inference — but we'll include them to make reading
easier.
Having defined a function, we should check that it works on
some examples. There are actually three different ways to do this
in Coq.
First, we can use the command
Compute to evaluate a compound
expression involving
next_weekday.
(We show Coq's responses in comments, but, if you have a
computer handy, this would be an excellent moment to fire up the
Coq interpreter under your favorite IDE — either CoqIde or Proof
General — and try this for yourself. Load this file,
Basics.v,
from the book's accompanying Coq sources, find the above example,
submit it to Coq, and observe the result.)
Second, we can record what we
expect the result to be in the
form of a Coq example:
This declaration does two things: it makes an
assertion (that the second weekday after
saturday is
tuesday),
and it gives the assertion a name that can be used to refer to it
later.
Having made the assertion, we can also ask Coq to verify it, like
this:
Proof. simpl. reflexivity. Qed.
The details are not important for now (we'll come back to
them in a bit), but essentially this can be read as "The assertion
we've just made can be proved by observing that both sides of the
equality evaluate to the same thing, after some simplification."
Third, we can ask Coq to
extract, from our
Definition, a
program in some other, more conventional, programming
language (OCaml, Scheme, or Haskell) with a high-performance
compiler. This facility is very interesting, since it gives us a
way to construct
fully certified programs in mainstream
languages. Indeed, this is one of the main uses for which Coq was
developed. We'll come back to this topic in later chapters.
Booleans
In a similar way, we can define the standard type
bool of
booleans, with members
true and
false.
Although we are rolling our own booleans here for the sake
of building up everything from scratch, Coq does, of course,
provide a default implementation of the booleans in its standard
library, together with a multitude of useful functions and
lemmas. (Take a look at
Coq.Init.Datatypes in the Coq library
documentation if you're interested.) Whenever possible, we'll
name our own definitions and theorems so that they exactly
coincide with the ones in the standard library.
Functions over booleans can be defined in the same way as
above:
The last two illustrate Coq's syntax for multi-argument
function definitions. The corresponding multi-argument
application syntax is illustrated by the following four "unit
tests," which constitute a complete specification — a truth
table — for the orb function:
We can also introduce some familiar syntax for the boolean
operations we have just defined. The Infix command defines new,
infix notation for an existing definition.
A note on notation: In
.v files, we use square brackets to
delimit fragments of Coq code within comments; this convention,
also used by the
coqdoc documentation tool, keeps them visually
separate from the surrounding text. In the html version of the
files, these pieces of text appear in a
different font.
The special phrases
Admitted and
admit can be used as a
placeholder for an incomplete definition or proof. We'll use them
in exercises, to indicate the parts that we're leaving for you —
i.e., your job is to replace
admit or
Admitted with real
definitions or proofs.
Exercise: 1 star (nandb)
Remove
admit and complete the definition of the following
function; then make sure that the
Example assertions below can
each be verified by Coq. (Remove "
Admitted." and fill in each
proof, following the model of the
orb tests above.) The function
should return
true if either or both of its inputs are
false.
☐
Exercise: 1 star (andb3)
Do the same for the
andb3 function below. This function should
return
true when all of its inputs are
true, and
false
otherwise.
☐
Function Types
Every expression in Coq has a type, describing what sort of
thing it computes. The
Check command asks Coq to print the type
of an expression.
For example, the type of
negb true is
bool.
Functions like negb itself are also data values, just like
true and false. Their types are called function types, and
they are written with arrows.
The type of negb, written bool → bool and pronounced
"bool arrow bool," can be read, "Given an input of type
bool, this function produces an output of type bool."
Similarly, the type of andb, written bool → bool → bool, can
be read, "Given two inputs, both of type bool, this function
produces an output of type bool."
Modules
Coq provides a
module system, to aid in organizing large
developments. In this course we won't need most of its features,
but one is useful: If we enclose a collection of declarations
between
Module X and
End X markers, then, in the remainder of
the file after the
End, these definitions are referred to by
names like
X.foo instead of just
foo. Here, we use this
feature to introduce the definition of the type
nat in an inner
module so that it does not interfere with the one from the
standard library, which comes with a bit of special notational
magic.
Numbers
The types we have defined so far are examples of "enumerated
types": their definitions explicitly enumerate a finite set of
elements. A more interesting way of defining a type is to give a
collection of
inductive rules describing its elements. For
example, we can define the natural numbers as follows:
The clauses of this definition can be read:
- O is a natural number (note that this is the letter "O,"
not the numeral "0").
- S is a "constructor" that takes a natural number and yields
another one — that is, if n is a natural number, then S n
is too.
Let's look at this in a little more detail.
Every inductively defined set (
day,
nat,
bool, etc.) is
actually a set of
expressions. The definition of
nat says how
expressions in the set
nat can be constructed:
- the expression O belongs to the set nat;
- if n is an expression belonging to the set nat, then S n
is also an expression belonging to the set nat; and
- expressions formed in these two ways are the only ones belonging
to the set nat.
The same rules apply for our definitions of
day and
bool. The
annotations we used for their constructors are analogous to the
one for the
O constructor, indicating that they don't take any
arguments.
These three conditions are the precise force of the
Inductive
declaration. They imply that the expression
O, the expression
S O, the expression
S (S O), the expression
S (S (S O)), and
so on all belong to the set
nat, while other expressions like
true,
andb true false, and
S (S false) do not.
We can write simple functions that pattern match on natural
numbers just as we did above — for example, the predecessor
function:
Definition pred (
n :
nat) :
nat :=
match n with
|
O ⇒
O
|
S n' ⇒
n'
end.
The second branch can be read: "if n has the form S n'
for some n', then return n'."
Because natural numbers are such a pervasive form of data,
Coq provides a tiny bit of built-in magic for parsing and printing
them: ordinary arabic numerals can be used as an alternative to
the "unary" notation defined by the constructors S and O. Coq
prints numbers in arabic form by default:
The constructor S has the type nat → nat, just like the
functions minustwo and pred:
These are all things that can be applied to a number to yield a
number. However, there is a fundamental difference between the
first one and the other two: functions like
pred and
minustwo
come with
computation rules — e.g., the definition of
pred
says that
pred 2 can be simplified to
1 — while the
definition of
S has no such behavior attached. Although it is
like a function in the sense that it can be applied to an
argument, it does not
do anything at all!
For most function definitions over numbers, just pattern matching
is not enough: we also need recursion. For example, to check that
a number
n is even, we may need to recursively check whether
n-2 is even. To write such functions, we use the keyword
Fixpoint.
We can define oddb by a similar Fixpoint declaration, but here
is a simpler definition that is a bit easier to work with:
(You will notice if you step through these proofs that
simpl actually has no effect on the goal — all of the work is
done by
reflexivity. We'll see more about why that is shortly.)
Naturally, we can also define multi-argument functions by
recursion.
Adding three to two now gives us five, as we'd expect.
The simplification that Coq performs to reach this conclusion can
be visualized as follows:
As a notational convenience, if two or more arguments have
the same type, they can be written together. In the following
definition, (n m : nat) means just the same as if we had written
(n : nat) (m : nat).
You can match two expressions at once by putting a comma
between them:
The _ in the first line is a wildcard pattern. Writing _ in a
pattern is the same as writing some variable that doesn't get used
on the right-hand side. This avoids the need to invent a bogus
variable name.
Exercise: 1 star (factorial)
Recall the standard mathematical factorial function:
factorial(0) = 1
factorial(n) = n * factorial(n-1) (if n>0)
Translate this into Coq.
☐
We can make numerical expressions a little easier to read and
write by introducing
notations for addition, multiplication, and
subtraction.
Notation "x + y" := (
plus x y)
(
at level 50,
left associativity)
:
nat_scope.
Notation "x - y" := (
minus x y)
(
at level 50,
left associativity)
:
nat_scope.
Notation "x * y" := (
mult x y)
(
at level 40,
left associativity)
:
nat_scope.
Check ((0 + 1) + 1).
(The
level,
associativity, and
nat_scope annotations
control how these notations are treated by Coq's parser. The
details are not important, but interested readers can refer to the
optional "More on Notation" section at the end of this chapter.)
Note that these do not change the definitions we've already made:
they are simply instructions to the Coq parser to accept
x + y
in place of
plus x y and, conversely, to the Coq pretty-printer
to display
plus x y as
x + y.
When we say that Coq comes with nothing built-in, we really mean
it: even equality testing for numbers is a user-defined
operation!
The
beq_nat function tests
natural numbers for
equality,
yielding a
boolean. Note the use of nested
matches (we could
also have used a simultaneous match, as we did in
minus.)
The leb function tests whether its first argument is less than or
equal to its second argument, yielding a boolean.
Exercise: 1 star (blt_nat)
The
blt_nat function tests
natural numbers for
less-
than,
yielding a
boolean. Instead of making up a new
Fixpoint for
this one, define it in terms of a previously defined function.
☐
Proof by Simplification
Now that we've defined a few datatypes and functions, let's
turn to stating and proving properties of their behavior.
Actually, we've already started doing this: each
Example in the
previous sections makes a precise claim about the behavior of some
function on some particular inputs. The proofs of these claims
were always the same: use
simpl to simplify both sides of the
equation, then use
reflexivity to check that both sides contain
identical values.
The same sort of "proof by simplification" can be used to prove
more interesting properties as well. For example, the fact that
0 is a "neutral element" for
+ on the left can be proved just
by observing that
0 + n reduces to
n no matter what
n is, a
fact that can be read directly off the definition of
plus.
Theorem plus_O_n :
∀n :
nat, 0 +
n =
n.
Proof.
intros n.
simpl.
reflexivity.
Qed.
(You may notice that the above statement looks different in
the
.v file in your IDE than it does in the HTML rendition in
your browser, if you are viewing both. In
.v files, we write the
∀ universal quantifier using the reserved identifier
"forall." When the
.v files are converted to HTML, this gets
transformed into an upside-down-A symbol.)
This is a good place to mention that
reflexivity is a bit
more powerful than we have admitted. In the examples we have seen,
the calls to
simpl were actually not needed, because
reflexivity can perform some simplification automatically when
checking that two sides are equal;
simpl was just added so that
we could see the intermediate state — after simplification but
before finishing the proof. Here is a shorter proof of the
theorem:
Theorem plus_O_n' :
∀n :
nat, 0 +
n =
n.
Proof.
intros n.
reflexivity.
Qed.
Moreover, it will be useful later to know that
reflexivity
does somewhat
more simplification than
simpl does — for
example, it tries "unfolding" defined terms, replacing them with
their right-hand sides. The reason for this difference is that,
if reflexivity succeeds, the whole goal is finished and we don't
need to look at whatever expanded expressions
reflexivity has
created by all this simplification and unfolding; by contrast,
simpl is used in situations where we may have to read and
understand the new goal that it creates, so we would not want it
blindly expanding definitions and leaving the goal in a messy
state.
The form of the theorem we just stated and its proof are
almost exactly the same as the simpler examples we saw earlier;
there are just a few differences.
First, we've used the keyword
Theorem instead of
Example.
This difference is purely a matter of style; the keywords
Example and
Theorem (and a few others, including
Lemma,
Fact, and
Remark) mean exactly the same thing to Coq.
Second, we've added the quantifier
∀ n:nat, so that our
theorem talks about
all natural numbers
n. In order to prove
theorems of this form, we need to to be able to reason by
assuming the existence of an arbitrary natural number
n. This
is achieved in the proof by
intros n, which moves the quantifier
from the goal to a
context of current assumptions. In effect, we
start the proof by saying "Suppose
n is some arbitrary
number..."
The keywords
intros,
simpl, and
reflexivity are examples of
tactics. A tactic is a command that is used between
Proof and
Qed to guide the process of checking some claim we are making.
We will see several more tactics in the rest of this chapter and
yet more in future chapters.
Other similar theorems can be proved with the same pattern.
Theorem plus_1_l :
∀n:
nat, 1 +
n =
S n.
Proof.
intros n.
reflexivity.
Qed.
Theorem mult_0_l :
∀n:
nat, 0 *
n = 0.
Proof.
intros n.
reflexivity.
Qed.
The
_l suffix in the names of these theorems is
pronounced "on the left."
It is worth stepping through these proofs to observe how the
context and the goal change. You may want to add calls to
simpl before
reflexivity to
see the simplifications that Coq performs on the terms before
checking that they are equal.
Although simplification is powerful enough to prove some fairly
general facts, there are many statements that cannot be handled by
simplification alone. For instance, we cannot use it to prove
that
0 is also a neutral element for
+ on the right.
Theorem plus_n_O :
∀n,
n =
n + 0.
Proof.
intros n.
simpl.
(Can you explain why this happens? Step through both proofs
with Coq and notice how the goal and context change.)
When stuck in the middle of a proof, we can use the
Abort
command to give up on it for the moment.
Abort.
The next chapter will introduce induction, a powerful
technique that can be used for proving this goal. For the moment,
though, let's look at a few more simple tactics.
Proof by Rewriting
This theorem is a bit more interesting than the others we've
seen:
Instead of making a universal claim about all numbers
n and
m,
it talks about a more specialized property that only holds when
n
= m. The arrow symbol is pronounced "implies."
As before, we need to be able to reason by assuming the existence
of some numbers
n and
m. We also need to assume the hypothesis
n = m. The
intros tactic will serve to move all three of these
from the goal into assumptions in the current context.
Since
n and
m are arbitrary numbers, we can't just use
simplification to prove this theorem. Instead, we prove it by
observing that, if we are assuming
n = m, then we can replace
n with
m in the goal statement and obtain an equality with the
same expression on both sides. The tactic that tells Coq to
perform this replacement is called
rewrite.
Proof.
intros n m.
intros H.
rewrite → H.
reflexivity. Qed.
The first line of the proof moves the universally quantified
variables
n and
m into the context. The second moves the
hypothesis
n = m into the context and gives it the name
H.
The third tells Coq to rewrite the current goal (
n + n = m + m)
by replacing the left side of the equality hypothesis
H with the
right side.
(The arrow symbol in the
rewrite has nothing to do with
implication: it tells Coq to apply the rewrite from left to right.
To rewrite from right to left, you can use
rewrite ←. Try
making this change in the above proof and see what difference it
makes.)
Exercise: 1 star (plus_id_exercise)
Remove "
Admitted." and fill in the proof.
☐
The
Admitted command tells Coq that we want to skip trying
to prove this theorem and just accept it as a given. This can be
useful for developing longer proofs, since we can state subsidiary
lemmas that we believe will be useful for making some larger
argument, use
Admitted to accept them on faith for the moment,
and continue working on the main argument until we are sure it
makes sense; then we can go back and fill in the proofs we
skipped. Be careful, though: every time you say
Admitted (or
admit) you are leaving a door open for total nonsense to enter
Coq's nice, rigorous, formally checked world!
We can also use the
rewrite tactic with a previously proved
theorem instead of a hypothesis from the context. If the statement
of the previously proved theorem involves quantified variables,
as in the example below, Coq tries to instantiate them
by matching with the current goal.
Exercise: 2 stars (mult_S_1)
☐
Proof by Case Analysis
Of course, not everything can be proved by simple
calculation and rewriting: In general, unknown, hypothetical
values (arbitrary numbers, booleans, lists, etc.) can block
simplification. For example, if we try to prove the following
fact using the
simpl tactic as above, we get stuck.
The reason for this is that the definitions of both
beq_nat and
+ begin by performing a
match on their first
argument. But here, the first argument to
+ is the unknown
number
n and the argument to
beq_nat is the compound
expression
n + 1; neither can be simplified.
To make progress, we need to consider the possible forms of
n
separately. If
n is
O, then we can calculate the final result
of
beq_nat (n + 1) 0 and check that it is, indeed,
false. And
if
n = S n' for some
n', then, although we don't know exactly
what number
n + 1 yields, we can calculate that, at least, it
will begin with one
S, and this is enough to calculate that,
again,
beq_nat (n + 1) 0 will yield
false.
The tactic that tells Coq to consider, separately, the cases where
n = O and where
n = S n' is called
destruct.
The
destruct generates
two subgoals, which we must then
prove, separately, in order to get Coq to accept the theorem. The
annotation "
as [| n']" is called an
intro pattern. It tells
Coq what variable names to introduce in each subgoal. In general,
what goes between the square brackets is a
list of lists of
names, separated by
|. In this case, the first component is
empty, since the
O constructor is nullary (it doesn't have any
arguments). The second component gives a single name,
n', since
S is a unary constructor.
The
- signs on the second and third lines are called
bullets,
and they mark the parts of the proof that correspond to each
generated subgoal. The proof script that comes after a bullet is
the entire proof for a subgoal. In this example, each of the
subgoals is easily proved by a single use of
reflexivity, which
itself performs some simplification — e.g., the first one
simplifies
beq_nat (S n' + 1) 0 to
false by first rewriting
(S n' + 1) to
S (n' + 1), then unfolding
beq_nat, and then
simplifying the
match.
Marking cases with bullets is entirely optional: if bullets are
not present, Coq simply asks you to prove each subgoal in
sequence, one at a time. But it is a good idea to use bullets.
For one thing, they make the structure of a proof apparent, making
it more readable. Also, bullets instruct Coq to ensure that a
subgoal is complete before trying to verify the next one,
preventing proofs for different subgoals from getting mixed
up. These issues become especially important in large
developments, where fragile proofs lead to long debugging
sessions.
There are no hard and fast rules for how proofs should be
formatted in Coq — in particular, where lines should be broken
and how sections of the proof should be indented to indicate their
nested structure. However, if the places where multiple subgoals
are generated are marked with explicit bullets at the beginning of
lines, then the proof will be readable almost no matter what
choices are made about other aspects of layout.
This is also a good place to mention one other piece of somewhat
obvious advice about line lengths. Beginning Coq users sometimes
tend to the extremes, either writing each tactic on its own line
or writing entire proofs on one line. Good style lies somewhere
in the middle. One reasonable convention is to limit yourself to
80-character lines.
The
destruct tactic can be used with any inductively defined
datatype. For example, we use it next to prove that boolean
negation is involutive — i.e., that negation is its own
inverse.
Note that the
destruct here has no
as clause because
none of the subcases of the
destruct need to bind any variables,
so there is no need to specify any names. (We could also have
written
as [|], or
as [].) In fact, we can omit the
as
clause from
any destruct and Coq will fill in variable names
automatically. This is generally considered bad style, since Coq
often makes confusing choices of names when left to its own
devices.
It is sometimes useful to invoke
destruct inside a subgoal,
generating yet more proof obligations. In this case, we use
different kinds of bullets to mark goals on different "levels."
For example:
Theorem andb_commutative :
∀b c,
andb b c =
andb c b.
Proof.
intros b c.
destruct b.
-
destruct c.
+
reflexivity.
+
reflexivity.
-
destruct c.
+
reflexivity.
+
reflexivity.
Qed.
Each pair of calls to reflexivity corresponds to the
subgoals that were generated after the execution of the destruct
c line right above it. Besides - and +, Coq proofs can also
use * (asterisk) as a third kind of bullet. If we ever encounter
a proof that generates more than three levels of subgoals, we can
also enclose individual subgoals in curly braces ({ ... }):
Theorem andb_commutative' :
∀b c,
andb b c =
andb c b.
Proof.
intros b c.
destruct b.
{
destruct c.
{
reflexivity. }
{
reflexivity. } }
{
destruct c.
{
reflexivity. }
{
reflexivity. } }
Qed.
Since curly braces mark both the beginning and the end of a
proof, they can be used for multiple subgoal levels, as this
example shows. Furthermore, curly braces allow us to reuse the
same bullet shapes at multiple levels in a proof:
Theorem andb3_exchange :
∀b c d,
andb (
andb b c)
d =
andb (
andb b d)
c.
Proof.
intros b c d.
destruct b.
-
destruct c.
{
destruct d.
-
reflexivity.
-
reflexivity. }
{
destruct d.
-
reflexivity.
-
reflexivity. }
-
destruct c.
{
destruct d.
-
reflexivity.
-
reflexivity. }
{
destruct d.
-
reflexivity.
-
reflexivity. }
Qed.
Before closing the chapter, let's mention one final
convenience. As you may have noticed, many proofs perform case
analysis on a variable right after introducing it:
intros x y.
destruct y as [|
y].
This pattern is so common that Coq provides a shorthand for it: we
can perform case analysis on a variable when introducing it by
using an intro pattern instead of a variable name. For instance,
here is a shorter proof of the
plus_1_neq_0 theorem above.
If there are no arguments to name, we can just write [].
Exercise: 2 stars (andb_true_elim2)
Prove the following claim, marking cases (and subcases) with
bullets when you use
destruct.
☐
Exercise: 1 star (zero_nbeq_plus_1)
☐
More on Notation (Optional)
(In general, sections marked Optional are not needed to follow the
rest of the book, except possibly other Optional sections. On a
first reading, you might want to skim these sections so that you
know what's there for future reference.)
Recall the notation definitions for infix plus and times:
Notation "x + y" := (
plus x y)
(
at level 50,
left associativity)
:
nat_scope.
Notation "x * y" := (
mult x y)
(
at level 40,
left associativity)
:
nat_scope.
For each notation symbol in Coq, we can specify its
precedence
level and its
associativity. The precedence level
n is
specified by writing
at level n; this helps Coq parse compound
expressions. The associativity setting helps to disambiguate
expressions containing multiple occurrences of the same
symbol. For example, the parameters specified above for
+ and
* say that the expression
1+2*3*4 is shorthand for
(1+((2*3)*4)). Coq uses precedence levels from 0 to 100, and
left,
right, or
no associativity. We will see more examples
of this later, e.g., in the
Lists
chapter.
Each notation symbol is also associated with a
notation scope.
Coq tries to guess what scope is meant from context, so when it
sees
S(O*O) it guesses
nat_scope, but when it sees the
cartesian product (tuple) type
bool*bool it guesses
type_scope. Occasionally, it is necessary to help it out with
percent-notation by writing
(x*y)%nat, and sometimes in what Coq
prints it will use
%nat to indicate what scope a notation is in.
Notation scopes also apply to numeral notation (
3,
4,
5,
etc.), so you may sometimes see
0%nat, which means
O (the
natural number
0 that we're using in this chapter), or
0%Z,
which means the Integer zero (which comes from a different part of
the standard library).
Fixpoints and Structural Recursion (Optional)
Here is a copy of the definition of addition:
When Coq checks this definition, it notes that
plus' is
"decreasing on 1st argument." What this means is that we are
performing a
structural recursion over the argument
n — i.e.,
that we make recursive calls only on strictly smaller values of
n. This implies that all calls to
plus' will eventually
terminate. Coq demands that some argument of
every Fixpoint
definition is "decreasing."
This requirement is a fundamental feature of Coq's design: In
particular, it guarantees that every function that can be defined
in Coq will terminate on all inputs. However, because Coq's
"decreasing analysis" is not very sophisticated, it is sometimes
necessary to write functions in slightly unnatural ways.
Exercise: 2 stars, optional (decreasing)
To get a concrete sense of this, find a way to write a sensible
Fixpoint definition (of a simple function on numbers, say) that
does terminate on all inputs, but that Coq will reject because
of this restriction.
☐
More Exercises
Exercise: 2 stars (boolean_functions)
Use the tactics you have learned so far to prove the following
theorem about boolean functions.
Now state and prove a theorem negation_fn_applied_twice similar
to the previous one but where the second hypothesis says that the
function f has the property that f x = negb x.
☐
Exercise: 2 stars (andb_eq_orb)
Prove the following theorem. (You may want to first prove a
subsidiary lemma or two. Alternatively, remember that you do
not have to introduce all hypotheses at the same time.)
☐
Exercise: 3 stars (binary)
Consider a different, more efficient representation of natural
numbers using a binary rather than unary system. That is, instead
of saying that each natural number is either zero or the successor
of a natural number, we can say that each binary number is either
- zero,
- twice a binary number, or
- one more than twice a binary number.
(a) First, write an inductive definition of the type
bin
corresponding to this description of binary numbers.
(Hint: Recall that the definition of
nat from class,
Inductive nat :
Type :=
|
O :
nat
|
S :
nat → nat.
says nothing about what
O and
S "mean." It just says "
O is
in the set called
nat, and if
n is in the set then so is
S
n." The interpretation of
O as zero and
S as successor/plus
one comes from the way that we
use nat values, by writing
functions to do things with them, proving things about them, and
so on. Your definition of
bin should be correspondingly simple;
it is the functions you will write next that will give it
mathematical meaning.)
(b) Next, write an increment function
incr for binary numbers,
and a function
bin_to_nat to convert binary numbers to unary numbers.
(c) Write five unit tests
test_bin_incr1,
test_bin_incr2, etc.
for your increment and binary-to-unary functions. Notice that
incrementing a binary number and then converting it to unary
should yield the same result as first converting it to unary and
then incrementing.