What is Formal Specification
Formal specification is simply a matter of being more explicit and specific in defining the requirements of software. At the simplest level this can take the form of Design by Contract, where functions and procedures specify pre- and post-conditions and loops and objects include a set of invariant properties. In the more rigorous case formal specification involves building an explicit mathematical definition of the requirements of the software. Using such a definition one can prove the correctness of the system, or simply prove theorems about properties of the system. An implementation can also be checked against such a formalised specification, verifying that the implemented code does indeed do precisely what the requirements claim. At the most rigorous level the initial formal requirement specification can be expanded, through (mathematically rigorous) refinement, to ever more specific and detailed specifications resulting eventually in executable code.
All of these different levels allow a significantly greater degree of analysis of the software, be it improved static and runtime checking with contracts, to more complex data flow analysis and proof with more complete specifications. In the same way that static types allow more rigorous checking at compile time, catching a lot of simple errors, contracts and specifications allow even more analysis and checking, catching even more errors at the early stages of development when they are most easily and efficiently fixed.
A Simple Example
In order to be both short and easily comprehensible, examples of formal specification are necessarily simplistic. This example should at least provide an idea of what a formal specification might look like however.
The specification for a simple stack, in SPARK, a version of Ada that adds formal specification semantics, would look something like the following:
package Stack
--# own State: Stack_Type;
--# initializes State;
is
--# type Stack_Type is abstract
--# function Not_Full(S: Stack_Type) return Boolean;
--# function Not_Empty(S: Stack_Type) return Boolean;
--# function Append(S: Stack_Type; X: Integer) return Stack_Type;
procedure Push(X: in Integer);
--# global in out State;
--# derives State from State, X;
--# pre Not_Full(State);
--# post State = Append(State~, X);
procedure Pop(X: out Integer);
--# global in out State;
--# derives State, X from State;
--# pre Not_Empty(State)
--# post State~ = Append(State, X);
end Stack;
The package body, containing the implementation, can then follow.
The own simply declares the variable State to be a package global variable, and initializes means that the variable State must be initialized internally in this package. We then declare an abstract type (to be defined in implementation) and some simple functions. The in and out keywords in the procedure declarations tag the parameters: in means that the parameters current value will be read in the procedure, and out means the parameter will be written to in the procedure. The global keyword declares the the package global variable State will be used in the procedure (and both read from and written to). The derives keyword provides explicit declarations of which input will be used in determining variables that will be output or written to. The pre- and post-conditions should be largely self explanatory.
As you can see, mostly all we are doing is making explicit exactly how we intend the procedures to operate. This specificity provides automated verification tools the information necessary to properly analyse and validate implemented code: Prior to a compile step you can run a verification and flow analysis tool that catches many subtle errors.
A similar object, specified in an algebraic specification language like CASL might look something like this:
spec
Stack[sort Elem] =
sort
stack
ops
empty: stack
push: stack * Elem -> stack
pop: stack ->? stack
top: stack ->? Elem
axioms
not def pop(empty)
not def top(empty)
forall s : stack; e : Elem
· pop(push(s,e)) = s
· top(push(s,e)) = e
end
CASL offers syntax that can compact this considerably, but this longer version is probably more clear.
We declare a specification for a stack of generic elements, of sort stack; and with operations empty (which is in a sense the constructor creating and empty stack); push which maps a stack and an element to a "new" stack; pop a partial function (denoted by the question mark) which maps a stack to "new" stack; and top another partial function which maps a stack to an element. We then define axioms: first, as pop and top are partial functions, we say they are not defined on empty stacks; secondly we declare that pop and top should behave as (partial) inverses for push for all stacks and elements.
This is sufficient information to completely specify stacks as a universal algebra and brings a great deal of powerful mathematical machinery to bear on the problem, much of which has been coded into a set of analysis tools for CASL.
While these examples are very simple, they should give you the flavour of how formal specification can work. Both SPARK and CASL support structured specifications, allowing you to build up libraries of specification, making them scalable to very large problems.
Why Use Formal Specification?
Why use static types? Why do any compile time checking? Why produce design documents for your software? Not every project is worth writing design documents for, and some projects are better off being developed quickly using a dynamically typed language and runtime checking. Not every project really needs formal specification, but there are a great many software projects that could benefit greatly from some level of formal specification - a great many more than make use of it at present.
I would suggest that it is probably unimportant whether your paint program, music player, word processor or desktop weather applet uses formal specification. What about your spreadsheet application? Bothering with formal specification for the GUI might be a waste of time. Bothering to do some specification for the various mathematical routines, ensuring their correctness, would potentially be worth the extra trouble. Formal specification doesn't need to be used for a whole project, only those parts of it that are sensitive to error. Likewise any network services could easily benefit from formal specification on the network facing portions of the code to significantly reduce the possibility of exploits: it is far easier to audit and verify code that has been properly specified. Security software, and implementations of cryptographic protocols, are far safer if formally specified: with cryptography the protocols are often rigorously checked, and many exploits relate to errors where the implementation fails to correctly follow the protocol. Finally mission critical business software, where downtime can costs millions of dollars, could most assuredly benefit form the extra assurances that formal specification and verification can provide.
None the less, barring a few industries, formal specification has seen little use in the past 25 years. Developers and software engineers offer many excuses and complaints as to why formal specification isn't suitable, isn't required, or is too hard. Many of these complaints are founded in a poor understanding of how formal specification works and how it can be used. Let's consider some of the common complaints:
But Formal Specification is too much Work...
Formal specification can be as much work as you choose to make it. You can do as little as adding contracts to critical functions or classes, or you can write the entire project from the top down by progressive refinement of (and verification against) a specification for which you have created formal proofs of all important properties. There is a sliding scale from a dynamically typed script with no documentation or comments, all the way up to a completely explicitly specified and proven system. You can choose the level of specificity and verification that the given component requires.
Writing contracts is only a little more work than static typing: you are simply declaring formally what you intend the function to do, something you ought to have a clear idea of before you write the function anyway. Writing a basic specification is only a little more work that writing a design document: you are simply formalising the diagrams and comments into explicit statements. For a little extra work in the initial stages you gain considerably in debugging and testing. Extended static checking based on contracts can catch many simple errors that would otherwise go unnoticed, and debugging is significantly accelerated by the added information contracts supply about runtime errors. If you've gone to the trouble of writing a formal specification you can statically verify various behaviours and properties of the system before it is even implemented. Once implemented you can validate the implementation against the specification. When performing testing you can use the specification to help determine the best coverage with the least testing.
Formal specification does work in practice, and the gains in efficiency in testing and debugging can often outweigh the potential extra overhead during design. Design by Contract, for instance, is often cited as saving time in the development process, and there are many real world success stories where systems were developed faster by using contracts. A 20 person team completed in 5 months what took a team of 100 people using standard C++ almost a year. For projects where robustness is critical, full formal specification has been used successfully. Companies like Praxis High Integrity Systems use formal methods successfully in the rail, aerospace, nuclear, finance and telecommunications industries (among others), and have sold their formal method tool-set to a variety of large companies. Reading through some of their recent work and case studies makes it clear that formal specification and verification can prove to be more efficient and faster to use.
But Formal Specification is Not Practical...
On the contrary, formal specification is used all the time in many industries, and there are various companies like Praxis High Integrity Systems, B-Core, Eiffel Software, Kind Software, and Escher Technologies who provide formal specification tools as a major part of their business.
Formal specification has proved to be practical in the real world. What it has lacked is mature tools to make development using formal specification faster and easier (try writing code without a good compiler, try doing formal specification without good static checking tools), developer awareness, and developers skilled in the languages and techniques of formal specification. It doesn't take much to learn - learning a specification language is not much harder than learning a programming language. Just as programming appears to be a difficult and daunting task to a person who doesn't know any programming languages, specification looks difficult to people who don't know specification languages.
But Formal Specification Still Can't Guarantee Perfect Software...
Correct, there are no guarantees. Complete proofs are often too complicated to perform, so theorem proving for specific properties is usually undertaken instead. Static checking can be enhanced significantly with suitable specifications, but not all errors can be caught. Your software is only going to be as stable and secure as the system it is running on. If you can't be perfect why try at all? Because significant gains can be made and the costs may be smaller than you think. Even if the whole system can't be proved, managing to prove key properties may well be all the assurance of correctness that is required for the application at hand. Even if you can't catch all the errors statically, reducing the number and severity of runtime errors can bring significant efficiency gains in the testing phase. Furthermore the cost of errors in software found after release can be sufficiently high that any reduction is worthwhile. Finally if you can't trust your compiler and operating system, then everything is compromised. At some level you have to trust something - we all trust a great deal of software right now that we expect to have flaws - reducing the source of errors to trusted components is still a significant gain.
But Formal Specification Is Only Viable for Trivial Examples...
Not even close to the truth. Formal specification and formal methods have been used for large projects involving hundreds of thousands of lines of code. Just look at many of the examples provided above: these were not small projects, and they were not toy trivial examples; these were large scale real world software systems. Examples of formal specification tend to be small simplistic cases because they are meant to be just that: easily comprehensible examples. In the same way that programming languages often use "Hello World" and quicksort as examples, formal specification languages use equally simple but demonstrative examples. Don't take this to mean that the language and method doesn't scale.
Conclusions
Formal specification isn't a silver bullet, and it isn't the right choice for every project. On the other hand you can use as much specification and verification as is suitable for your project. You can apply formal specification only to the portions that are critical, and develop the rest normally. More importantly, the techniques, languages and tools for formal specification continue to improve. The more powerful the methods and the tools, the more projects for which formal specification becomes a viable option. In an increasingly network oriented computing world where security and software assurance is becoming increasingly important, the place for formal specification is growing. If you are a software engineer it makes sense to be aware of options available to you.