Exception handling

From Wikipedia, the free encyclopedia

Jump to: navigation, search

Exception handling is a programming language construct or computer hardware mechanism designed to handle the occurrence of some condition that changes the normal flow of execution. For signaling conditions that are part of the normal flow of execution see the concepts of signal and event handler.

In general, current state will be saved in a predefined location and execution will switch to a predefined handler. Depending on the situation, the handler may later resume the execution at the original location, using the saved information to restore the original state. For example, an exception which will usually be resumed is a page fault, while a division by zero usually cannot be resolved transparently.

From the processing point of view, hardware interrupts are similar to resumable exceptions, although they are usually not related to the current program flow.

From the point of view of the author of a routine, raising an exception is a useful way to signal that the routine could not execute normally. For example, when an input argument is invalid (a zero denominator in division) or when a resource it relies on is unavailable (a missing file, or a hard disk error). In systems without exceptions, the routine would need to return some special error code. However, this is sometimes complicated by the semi predicate problem, in which users of the routine need to write extra code to distinguish normal return values from erroneous ones.

In runtime engine environments such as Java or .NET there exist tools that attach to the runtime engine and every time that an exception of interest occurs they record debugging information that existed in memory at the time the exception was thrown (call stack and heap values). These tools are called Automated Exception Handling or Error Interception tools and they provide 'root-cause' information for exceptions.

Contemporary applications face many design challenges when considering exception handling strategies. Particularly in modern enterprise level applications, exceptions must often cross process boundaries and machine boundaries. Part of designing a solid exception handling strategy is recognizing when a process has failed to the point where it cannot be economically handled by the software portion of the process. At such times, it is very important to present exception information to the appropriate stakeholders. [1]

Contents

[edit] Exception safety

A piece of code is said to be exception-safe if run-time failures within the code will not produce ill effects, such as memory leaks, garbled stored data or invalid output. Exception-safe code must satisfy invariants placed on the code even if exceptions occur. There are several levels of exception safety:

  1. Failure transparency, also known as the no throw guarantee: Operations are guaranteed to succeed and satisfy all requirements even in presence of exceptional situations. If an exception occurs, it will not throw the exception further up. (Best level of exception safety)
  2. Commit or rollback semantics, also known as strong exception safety or no-change guarantee: Operations can fail, but failed operations are guaranteed to have no side effects so all data retain original values.[2]
  3. Basic exception safety: Partial execution of failed operations can cause side effects, but invariants on the state are preserved. Any stored data will contain valid values even if data has different values now from before the exception.
  4. Minimal exception safety also known as no-leak guarantee: Partial execution of failed operations may store invalid data but will not cause a crash, and no resources get leaked.
  5. No exception safety: No guarantees are made. (Worst level of exception safety)

For instance, consider a smart vector type, such as C++'s std::vector or Java's ArrayList. When an item x is added to a vector v, the vector must actually add x to the internal list of objects and also update a count field that says how many objects are in v. It may also need to allocate new memory if the existing capacity isn't large enough. This memory allocation may fail and throw an exception. Because of this, a vector that provides failure transparency would be very difficult or impossible to write. However, the vector may be able to fairly easily offer the strong exception guarantee; in this case, either the insertion of x into v will succeed, or v will remain unchanged. If the vector provides only the basic exception safety guarantee, if the insertion fails, v may or may not contain x, but at least it will be in a consistent state. However, if the vector makes only the minimal guarantee, it's possible that the vector may be invalid. For instance, perhaps the size field of v was incremented but x wasn't actually inserted, making the state inconsistent. Of course, with no guarantee, the program may crash; perhaps the vector needed to expand but couldn't allocate the memory and blindly ploughs ahead as if the allocation succeeded, touching memory at an invalid address.

Usually at least basic exception safety is required. Failure transparency is difficult to implement, and is usually not possible in libraries where complete knowledge of the application is not available.

[edit] Exception support in programming languages

See also: Exception handling syntax

Many computer languages, such as Ada, Object Pascal (e.g. FreePascal, Delphi, and the like), C++, D, ECMAScript, Eiffel, Java, Objective-C, Ocaml, PHP (as of version 5), PL/1, Prolog, Python, REALbasic, ML, Ruby, Actionscript and most .NET languages, have built-in support for exceptions and exception handling. In those languages, the advent of an exception (more precisely, an exception handled by the language) unwinds the stack of function calls until an exception handler is found. That is, if function f contains a handler H for exception E, calls function g, which in turn calls function h, and an exception E occurs in h, then functions h and g will be terminated and H in f will handle E.

Excluding minor syntactic differences, there are only a couple of exception handling styles in use. In the most popular style, an exception is initiated by a special statement (throw, or raise) with an exception object (e.g. with Java or Object Pascal) or a value of a special extendable enumerated type (e.g. with Ada). The scope for exception handlers starts with a marker clause (try, or the language's block starter such as begin) and ends in the start of the first handler clause (catch, except, rescue). Several handler clauses can follow, and each can specify which exception types it handles and what name it uses for the exception object.

A few languages also permit a clause (else) that is used in case no exception occurred before the end of the handler's scope was reached. More common is a related clause (finally, or ensure) that is executed whether an exception occurred or not, typically to release resources acquired within the body of the exception-handling block. Notably, C++ does not need and does not provide this construct, and the Resource Acquisition Is Initialization technique is used to free such resources instead.

In its whole, exception handling code might look like this (in Java-like pseudocode; note that an exception type called EmptyLineException would need to be declared somewhere):

 try {
     line = console.readLine();
     if (line.length() == 0) {
         throw new EmptyLineException("The line read from console was empty!");
     }
     console.printLine("Hello %s!" % line);
     console.printLine("The program ran successfully");
 } catch (EmptyLineException e) {
     console.printLine("Hello!");
 } catch (Exception e) {
     console.printLine("Error: " + e.message());
 } finally {
     console.printLine("The program terminates now");
 }

As a minor variation, some languages use a single handler clause, which deals with the class of the exception internally.

Languages such as Perl and C don't use the term exception handling, but include facilities that allow implementing similar functionality.

The C++ derivative Embedded C++ excludes exception handling support as it can substantially increase the size of the object code.

[edit] Exception handling based on Design by Contract

A different view of exceptions is based on the principles of Design by Contract and is supported in particular by the Eiffel language. The idea is to provide a more rigorous basis for exception handling by defining precisely what is "normal" and "abnormal" behavior. Specifically, the approach is based on two concepts:

  • Failure: the inability of an operation to fulfill its contract. For example an addition may produce an arithmetic overflow (it does not fulfill its contract of computing a good approximation to the mathematical sum); or a routine may fail to meet its postcondition.
  • Exception: an abnormal event occurring during the execution of a routine (that routine is the "recipient" of the exception) during its execution. Such an abnormal event results from the failure of an operation called by the routine.

The "Safe Exception Handling principle" as introduced by Bertrand Meyer in Object-Oriented Software Construction then holds that there are only two meaningful ways a routine can react when an exception occurs:

  • Failure, or "organized panic": the routine fails, triggering an exception in its caller (so that the abnormal event is not ignored!), after fixing the object's state by re-establishing the invariant (the "organized" part).
  • Retry: try the algorithm again, usually after changing some values so that the next attempt will have a better chance to succeed.

Here is an example expressed in Eiffel syntax. It assumes that a routine send_fast is normally the better way to send a message, but it may fail, triggering an exception; if so, the algorithm next uses send_slow, which will fail less often. If send_slow fails, the routine send as a whole should fail, causing the caller to get an exception.

send (m: MESSAGE)
      -- Send m through fast link if possible, otherwise through slow link.
   local
      tried_fast, tried_slow: BOOLEAN
   do
      if tried_fast then
         tried_slow := True
         send_slow (m)
      else
         tried_fast := True
         send_fast (m)
      end
   rescue
      if not tried_slow then
         retry
      end
   end

The boolean local variables are initialized to False at the start. If send_fast fails, the body (do clause) will be executed again, causing execution of send_slow. If this execution of send_slow fails, the rescue clause will execute to the end with no retry (no else clause in the final if), causing the routine execution as a whole to fail.

This approach has the merit of defining clearly what a "normal" and "abnormal" cases are: an abnormal case, causing an exception, is one in which the routine is unable to fulfill its contract.

It defines a clear distribution of roles: the do clause (normal body) is in charge of achieving, or attempting to achieve, the routine's contract; the rescue clause is in charge of reestablishing the context and restarting the process if this has a chance of succeeding, but not of performing any actual computation.

[edit] Checked exceptions

The designers of Java devised[3][4] checked exceptions[5], which are a special set of exceptions. The checked exceptions that a method may raise are part of the method's signature. For instance, if a method might throw an IOException, it must declare this fact explicitly in its method signature. Failure to do so raises a compile-time error.

This is related to exception checkers that exist at least for OCaml[6]. The external tool for OCaml is both transparent (i.e. it does not require any syntactic annotations) and facultative (i.e. it is possible to compile and run a program without having checked the exceptions, although this is not suggested for production code).

The CLU programming language had a feature with the interface closer to what Java has introduced later. A function could raise only exceptions listed in its type, but any leaking exceptions from called functions would automatically be turned into the sole runtime exception, failure, instead of resulting in compile-time error. Later, Modula-3 had a similar feature.[7] These features don't include the compile time checking which is central in the concept of checked exceptions, and hasn't (as of 2006) been incorporated into major programming languages other than Java.[8]

[edit] Pros and cons

Checked exceptions can, at compile time, greatly reduce (but not entirely eliminate) the incidence of unhandled exceptions surfacing at runtime in a given application[citation needed]; the unchecked exceptions (RuntimeExceptions and Errors) can still go unhandled.

However, some see checked exceptions as a nuisance, syntactic salt that either requires large throws declarations, often revealing implementation details and reducing encapsulation, or encourages the (ab)use of poorly-considered try/catch blocks that can potentially hide legitimate exceptions from their appropriate handlers. The problem is more evident if you consider what happens to code over time. An interface may be declared to throw exceptions X & Y. In a later version of the code you may find you detect a new kind of error, and want to throw exception Z, but this is an interface change making the new code incompatible with the earlier uses. Furthermore, consider the adapter pattern where one body of code declares an interface that is then implemented by a different body of code so that code can be plugged in and called by the first. The adapter code may have a rich set of exceptions to describe problems, but is forced to use the exception types declared in the interface.

Others do not consider this a nuisance as it is possible to reduce the number of declared exceptions by either declaring a superclass of all potentially thrown exceptions or by defining and declaring exception types that are suitable for the level of abstraction of the called method,[9] and mapping lower level exceptions to these types, preferably wrapped using the exception chaining in order to preserve the root cause. In addition, it's very possible that in the example above of the changing interface that the calling code would need to be modified as well, since in some sense the exceptions a method may throw are part of the method's implicit interface anyway.

A simple throws Exception declaration or catch (Exception e) is always sufficient to satisfy the checking. While this technique is sometimes useful, it effectively circumvents the checked exception mechanism, so it should only be used after careful consideration. Additionally, throws Exception forces all calling code to do the same.

One prevalent view is that unchecked exception types should not be handled, except maybe at the outermost levels of scope, as they often represent scenarios that do not allow for recovery: RuntimeExceptions frequently reflect programming defects,[10] and Errors generally represent unrecoverable JVM failures. The view is that, even in a language that supports checked exceptions, there are cases where the use of checked exceptions is not appropriate.

[edit] Exception synchronity

Somewhat related with the concept of checked exceptions is exception synchronity. Synchronous exceptions happen at a specific program statement whereas asynchronous exceptions can raise practically anywhere.[11][12] It follows that asynchronous exception handling can't be required by the compiler. They are also difficult to program with. Examples of naturally asynchronous events include pressing Ctrl-C to interrupt a program, and receiving a signal such as "stop" or "suspend" from another thread of execution.

Programming languages typically deal with this by limiting asynchronity, for example Java has lost thread stopping and resuming.[13] Instead, there can be semi-asynchronous exceptions that only raise in suitable locations of the program or synchronously.

[edit] Condition systems

Common Lisp, Dylan and Smalltalk have a Condition system which encompasses the aforementioned exception handling systems. In those languages or environments the advent of a condition (a "generalisation of an error" according to Kent Pitman) implies a function call, and only late in the exception handler the decision to unwind the stack may be taken.

Conditions are a generalization of exceptions. When a condition arises, an appropriate condition handler is searched for and selected, in stack order, to handle the condition. Conditions which do not represent errors may safely go unhandled entirely; their only purpose may be to propagate hints or warnings toward the user.[14]

[edit] Continuable exceptions

This is related to the so-called resumption model of exception handling, in which some exceptions are said to be continuable: it is permitted to return to the expression that signaled an exception, after having taken corrective action in the handler. The condition system is generalized thus: within the handler of a non-serious condition (a.k.a. continuable exception), it is possible to jump to predefined restart points (a.k.a. restarts) that lie between the signaling expression and the condition handler. Restarts are functions closed over some lexical environment, allowing the programmer to repair this environment before exiting the condition handler completely or unwinding the stack even partially.

[edit] Restarts separate mechanism from policy

Condition handling moreover provides a separation of mechanism from policy. Restarts provide various possible mechanisms for recovering from error, but do not select which mechanism is appropriate in a given situation. That is the province of the condition handler, which (since it is located in higher-level code) has access to a broader view.

An example: Suppose there is a library function whose purpose is to parse a single syslog file entry. What should this function do if the entry is malformed? There is no one right answer, because the same library could be deployed in programs for many different purposes. In an interactive log-file browser, the right thing to do might be to return the entry unparsed, so the user can see it -- but in an automated log-summarizing program, the right thing to do might be to supply null values for the unreadable fields, but abort with an error if too many entries have been malformed.

That is to say, the question can only be answered in terms of the broader goals of the program, which are not known to the general-purpose library function. Nonetheless, exiting with an error message is only rarely the right answer. So instead of simply exiting with an error, the function may establish restarts offering various ways to continue -- for instance, to skip the log entry, to supply default or null values for the unreadable fields, to ask the user for the missing values, or to unwind the stack and abort processing with an error message. The restarts offered constitute the mechanisms available for recovering from error; the selection of restart by the condition handler supplies the policy.

[edit] See also

[edit] References

[edit] External links

Personal tools