The imperative programming paradigm is an abstraction of real computers which in turn are based on the Turing machine and the Von Neumann machine with its registers and store (memory). At the heart of these machines is the concept of a modifiable store. Variables and assignments are the programming language analog of the modifiable store. The store is the object that is manipulated by the program. Imperative programming languages provide a variety of commands to provide structure to code and to manipulate the store. Each imperative programming language defines a particular view of hardware. These views are so distinct that it is common to speak of a Pascal machine, C machine or a Java machine. A compiler implements the virtual machine defined by the programming language in the language supported by the actual hardware and operating system.
In imperative programming, a name may be assigned to a value and later reassigned to another value. The collection of names and the associated values and the location of control in the program constitute the state. The state is a logical model of storage, which is an association between memory locations and values. A program in execution generates a sequence of states (see Figure N.1). The transition from one state to the next is determined by assignment operations and sequencing commands.
Figure N.1: S0 --O0--> S1 --> ... --> Sn-1 --On-1--> Sn
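The sequence of states can be made concrete by recording the store after each operation. The following is a minimal sketch in C, not a definitive model: the names State, step and run_trace, the two variables x and y, and the three operations are all hypothetical, chosen only for illustration.

```c
#include <stdio.h>

/* A hypothetical two-variable store; pc models the location of control. */
struct State {
    int pc;   /* index of the next operation */
    int x;
    int y;
};

/* Apply operation O_pc to state s and return the successor state. */
struct State step(struct State s)
{
    switch (s.pc) {
    case 0: s.x = 3;         break;  /* O0: x := 3     */
    case 1: s.y = s.x + 1;   break;  /* O1: y := x + 1 */
    case 2: s.x = s.x * s.y; break;  /* O2: x := x * y */
    default: return s;               /* no further operations */
    }
    s.pc += 1;
    return s;
}

/* Run from S0, printing each state; returns the final state Sn. */
struct State run_trace(void)
{
    struct State s = {0, 0, 0};
    printf("S0: pc=%d x=%d y=%d\n", s.pc, s.x, s.y);
    for (int i = 1; s.pc < 3; i++) {
        s = step(s);
        printf("S%d: pc=%d x=%d y=%d\n", i, s.pc, s.x, s.y);
    }
    return s;
}
```

Each call to step corresponds to one arrow in the figure: it takes a state and yields the next one.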
Unless carefully written, an imperative program can only be understood in terms of its execution behavior. The reason is that during the execution of the code, any variable may be referenced, control may be transferred to any arbitrary point, and any variable binding may be changed. Thus, the whole program may need to be examined in order to understand even a small portion of code.
Since the syntax of C, C++ and Java is similar, in what follows, comments made about C apply also to C++ and Java.
Aside. Most descriptions of imperative programming languages are tied to hardware and implementation considerations where a name is bound to an address, a variable to a storage cell, and a value to a bit pattern. Thus, a name is tied to two bindings, a binding to a location and a binding to a value. The location is called the l-value and the value is called the r-value. The necessity for this distinction follows from the implementation of the assignment. For example, in the assignment X := X+2 the X on the left of the assignment denotes a location while the X on the right hand side denotes the value. Assignment changes the value at a location.
A variable may be bound to a hardware location at various times. It may be bound at compile time (rarely), at load time (for languages with static allocation) or at run time (for languages with dynamic allocation). From the implementation point of view, variable declarations are used to determine the amount of storage required by the program.
The following examples illustrate the general form for variable declarations in imperative programming languages.
Pascal style declaration: | var name : Type; |
C style declaration: | Type name; |
The assignment command is what distinguishes imperative programming languages from other programming languages. The assignment typically has the form:
V := E
The command is read ``assign the name V to the value of the expression E until the name V is reassigned to another value''. The assignment binds a name and a value. The notation varies from language to language:
Pascal | V := E |
C | V = E |
APL | V <-- E |
Scheme | (set! V E) |
Aside. The use of the assignment symbol, =, in C confuses the distinction between definition, equality and assignment. The equal symbol, =, is used in mathematics in two distinct ways. It is used to define and to assert the equality between two values. In C it means neither definition nor equality but assignment. In C the double equality symbol, ==, is used for equality, while the form: type variable; is used for definitions.
Aside. The word ``assign'' is used in accordance with its English meaning; a name is assigned to an object, not the reverse. The name then stands for the object. The name is the assignee. This is in contrast to widespread programming usage in which a value is assigned to a variable.
The assignment is not the same as a constant definition because it permits redefinition. For example, the two assignments:
X := 3; X := X + 1
first assign X to 3 and then reassign X to 4.
Several kinds of assignments are possible. Because of the frequent occurrence of assignments of the form: X := X op E, C provides an alternative notation of the form: X op= E. A multiple assignment of the form:
V0 := V1 := ... := Vn := E
causes several names to be assigned to the same value. This form of the assignment is found in C. A simultaneous assignment of the form:
V0, ..., Vn := E0, ..., En
causes several assignments of names to values to occur simultaneously. The simultaneous assignment permits the swapping of values without the explicit use of an auxiliary variable.
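The kinds of assignment described above can be sketched in C. This is an illustrative sketch only; the function names add_twice, set_both and swap_ints are hypothetical. Since C has no simultaneous assignment, the swap is written with an explicit auxiliary variable.

```c
/* Compound assignment: x op= e abbreviates x = x op e. */
int add_twice(int x, int e)
{
    x += e;      /* same as x = x + e */
    x += e;
    return x;
}

/* Multiple assignment: the value of e is given to both names. */
void set_both(int *a, int *b, int e)
{
    *a = *b = e; /* groups as *a = (*b = e) */
}

/* C has no simultaneous assignment, so a swap needs an auxiliary variable. */
void swap_ints(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}
```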
From the point of view of axiomatic semantics, the assignment is a predicate transformer. It is a function from predicates to predicates. From the point of view of denotational semantics, the assignment is a function from states to states. From the point of view of operational semantics, the assignment changes the state of an abstract machine.
Figure N.M: A set of unstructured commands
command ::= identifier := expression
          | command; command
          | label : command
          | GOTO label
          | IF bool_exp THEN GOTO label
The unstructured commands contain the assignment command, sequential composition of commands, a provision to identify a command with a label, and unconditional and conditional GOTO commands. The unstructured commands have the advantage that they have direct hardware support and are completely general purpose. However, the programs are flat, without hierarchical structure, with the result that the code may be difficult to read and understand. The set of unstructured commands contains one of the most powerful commands, the GOTO. It is also the most criticized. The GOTO can make a program difficult to understand by producing `spaghetti' code, so named because control seems to wander around in the code like strands of spaghetti.
The GOTO commands are explicit transfers of control from one point in a program to another program point. These jump commands come in unconditional and conditional forms:
goto LABELi
if conditional expression goto LABELi
In the unconditional form, the sequence of instructions executed next begins with the command labeled with LABELi.
In the conditional form, if the conditional expression is true then execution transfers to the sequence of commands headed by the command labeled with LABELi; otherwise it continues with the command following the conditional goto.
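C retains both forms of the jump, so the unstructured style can be illustrated directly. The following sketch (the function name sum_to and the labels top and done are hypothetical) computes a sum using only assignment, labels, and the two kinds of goto.

```c
/* Sum 1..n using only assignment, labels, and conditional/unconditional jumps. */
int sum_to(int n)
{
    int i = 1;
    int sum = 0;
top:
    if (i > n) goto done;   /* conditional goto */
    sum = sum + i;
    i = i + 1;
    goto top;               /* unconditional goto */
done:
    return sum;
}
```

The loop exists only implicitly, in the pattern of jumps; nothing in the text of the function marks where it begins or ends.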
Figure N.M: A set of structured commands
command ::= SKIP
          | identifier := expression
          | IF guarded_command [ [] guarded_command ]* FI
          | DO guarded_command [ [] guarded_command ]* OD
          | command ; command
guarded_command ::= guard --> command
guard ::= boolean expression
The IF and DO commands which are defined in terms of
guarded commands require some explanation. The IF command allows for
a choice between alternatives while the DO command provides for iteration.
In their simplest forms, an IF statement corresponds to an If condition then
command and a DO statement corresponds to a While condition Do command.
IF guard --> command FI | = | if guard then command |
DO guard --> command OD | = | while guard do command |
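A DO command with more than one guarded command can be translated into C as a while loop whose body selects a command with a true guard. The example below is an assumption made for illustration, not taken from the text: it uses the well-known subtraction form of Euclid's algorithm, and the C version resolves the choice between guards deterministically.

```c
/* DO x > y --> x := x - y
   [] y > x --> y := y - x
   OD
   Assumes x > 0 and y > 0. */
int gcd(int x, int y)
{
    while (x > y || y > x) {      /* DO repeats while some guard is true */
        if (x > y) {
            x = x - y;
        } else {                  /* here y > x must hold */
            y = y - x;
        }
    }
    return x;                     /* all guards false: x == y */
}
```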
Control structures are syntactic structures that define the order in which assignments are performed. Imperative programming languages provide a rich assortment of sequence control mechanisms. Three control structures are found in traditional imperative languages: sequential composition, alternation, and iteration.
Selection: Selection permits the specification of a sequence of commands by cases. The selection of a particular sequence is based on the value of an expression. The if and case commands are the most common representatives of alternation.
Iteration: Iteration specifies that a sequence of commands may be executed zero or more times. At run time the sequence is repeatedly composed with itself. There is an expression whose value at run time determines the number of compositions. The while, repeat and for commands are the most common representatives of iteration.
Abstraction: A sequence of commands may be named and the name used to invoke the sequence of commands. Subprograms, procedures, and functions are the most common representatives of abstraction.
-- Ada
if condition then
   commands
{ elsif condition then
   commands }
[ else
   commands ]
end if;
case expression is
   when choice | choice => commands
   when choice | choice => commands
   [ when others => commands ]
end case;
while-do | while condition do body |
repeat-until | repeat body until condition |
for-do | for index := lowerBound, upperBound, step do
body |
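The three loop forms have direct counterparts in C. The sketch below (function names are hypothetical) computes the same sum three ways. Note the assumption about repeat-until: C spells it do-while with the condition negated.

```c
/* while-do: the condition is tested before each execution of the body. */
int sum_while(int n)
{
    int i = 1, sum = 0;
    while (i <= n) { sum += i; i++; }
    return sum;
}

/* repeat-until: the body runs once before the first test. C writes
   "repeat B until C" as "do B while (!C)". */
int sum_repeat(int n)
{
    int i = 1, sum = 0;
    do { sum += i; i++; } while (!(i > n));
    return sum;
}

/* for-do: index, bounds, and step are gathered in the loop header. */
int sum_for(int n)
{
    int sum = 0;
    for (int i = 1; i <= n; i += 1) { sum += i; }
    return sum;
}
```

The three agree for n >= 1. For n = 0 the repeat form differs, because its body is executed once before the condition is examined.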
The iterative commands are often used to traverse the elements of a data structure, for example to search for an item. This insight leads to the concept of generators and iterators.
Definition: An iterator is a generalized looping structure whose iterations are determined by a generator.
An iterator is used with an extended form of the for loop where the iterator replaces the initial and final values of the loop index. For example, given a binary search tree and a generator which performs inorder tree traversal, an iterator would iterate over each item in the tree following the inorder tree traversal.
FOR Item in Tree DO S; |
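C has no built-in iterator construct, but the idea can be approximated with a generator that keeps its own traversal state. The following is a sketch under stated assumptions: the types Node and InorderIter, the functions iter_init and iter_next, and the bound MAX_DEPTH on the height of the tree are all hypothetical.

```c
#include <stddef.h>

struct Node {
    int item;
    struct Node *left, *right;
};

#define MAX_DEPTH 64  /* assumed bound on tree height for this sketch */

/* Generator state: a stack of nodes whose items are still to be produced. */
struct InorderIter {
    struct Node *stack[MAX_DEPTH];
    int top;
};

/* Push n and all of its left descendants (up to the assumed bound). */
static void push_left(struct InorderIter *it, struct Node *n)
{
    while (n != NULL && it->top < MAX_DEPTH) {
        it->stack[it->top++] = n;
        n = n->left;
    }
}

void iter_init(struct InorderIter *it, struct Node *root)
{
    it->top = 0;
    push_left(it, root);
}

/* Store the next item in *out and return 1, or return 0 when exhausted. */
int iter_next(struct InorderIter *it, int *out)
{
    if (it->top == 0) return 0;
    struct Node *n = it->stack[--it->top];
    *out = n->item;
    push_left(it, n->right);
    return 1;
}
```

With these definitions, the loop FOR Item in Tree DO S corresponds to calling iter_init once and then executing S for as long as iter_next returns 1.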
i := 0; while (i < length) and (list[i] <> value) do i := i+1 |
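When the value is not in the list, the loop above eventually evaluates its condition with i = length. If both operands of and are always evaluated, list[i] is then referenced outside the bounds of the list; with short-circuit evaluation the second operand is skipped once the first is false. The Ada language provides the special operators and then and or else so that the programmer can specify short-circuit evaluation.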
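In C the operators && and || always use short-circuit evaluation, so the linear search can be written safely. A minimal sketch, with the hypothetical function name find:

```c
/* Return the index of value in list[0..length-1], or length if absent. */
int find(const int list[], int length, int value)
{
    int i = 0;
    /* && evaluates list[i] only when i < length is true, so the
       loop never reads past the end of the array. */
    while (i < length && list[i] != value) {
        i = i + 1;
    }
    return i;
}
```

Reversing the operands of && would reintroduce the out-of-bounds reference, so the order of the two tests matters.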
Procedure definition: | name( parameter list) { body } |
Procedure invocation: | name( argument list ) |
The semantics of the procedure call is determined by the semantics of the
procedure body. For many languages with non-recursive procedures, the
semantics may be viewed as simple textual substitution.
Parameters and arguments have a simple syntax:
Parameter list: | t0 name0, ..., tn-1 namen-1 |
Argument list: | expression0, ..., expressionn-1 |
An in parameter designates that the body of the procedure may not modify the value of the argument (often implemented as a copy of the argument). An out parameter designates that the value of the argument is undefined on entry to the procedure and that, when the procedure terminates, the argument is assigned to a value (often copied to the argument). An in-out parameter designates that the value of the argument may be defined on entry to the procedure and may have been modified when the procedure terminates.
Language | Mode | Parameter | Argument |
Pascal | in | name : type | expression |
Pascal | in-out | var name : type | name |
Ada | in | name : in type | expression |
Ada | out | name : out type | name |
Ada | in-out | name : in out type | name |
C | in | type name | expression |
C | in-out | type *name (internal reference to the in-out parameter must be *name) | &name |
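The two C rows of the table can be illustrated as follows. This is a small sketch with hypothetical function names twice and increment.

```c
/* in parameter: n is a copy, so the caller's argument is unchanged. */
int twice(int n)
{
    n = n * 2;          /* modifies only the local copy */
    return n;
}

/* in-out parameter: the caller passes &name and the body uses *name. */
void increment(int *name)
{
    *name = *name + 1;  /* modifies the caller's variable */
}
```

A call twice(a) leaves a as it was, while increment(&a) changes it.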
Coroutine C1 ... resume C2 ... |
Coroutine C2 ... resume C3 ... |
Coroutine C3 ... resume C1 ... |
There is a single thread of control that moves from coroutine to coroutine. The multiple calls to a coroutine do not necessarily require multiple activation records.
In addition to coroutines there are concurrent or parallel processes.
A sequencer is a construct that allows more general control flows to be programmed.
In Ada, the exit sequencer terminates an enclosing loop. All enclosing loops up to and including the named loop are exited and execution continues with the command following the named loop.
Ada uses the return sequencer to terminate the execution of the body of a procedure or function and, in the case of a function, to return the result of the computation.
Exception handlers are sequencers that take control when an exception is raised.
A return of the form:
return expression;
is used in C to exit a function call and return the value computed by the function.
An escape that names a count n is used to exit n enclosing constructs. The exit command can be used in conjunction with a general loop command to produce while and repeat as well as more general looping constructs.
In C a break command sends control out of the enclosing loop to the command following the loop, while the continue command transfers control to the next iteration of the enclosing loop.
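The C sequencers break, continue and return can be seen together in one small function. The function name and the task it performs are hypothetical, chosen only to exercise each sequencer once.

```c
/* Sum the even elements of xs, stopping at the first negative element. */
int sum_even_until_negative(const int xs[], int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        if (xs[i] < 0) break;          /* leave the loop entirely */
        if (xs[i] % 2 != 0) continue;  /* skip to the next iteration */
        sum += xs[i];
    }
    return sum;                        /* return exits the function */
}
```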
There are two basic types of exceptions which arise during program execution: domain failure and range failure.
Definition: An exception condition is a condition that prevents the completion of an operation. The recognition of the exception is called raising the exception.
Once an exception is raised it must be handled. Handling exceptions is important for the construction of robust programs. A program is said to be robust if it recovers from exceptional conditions.
Definition: The action to resolve the exception is called handling the exception. The propagation of an exception is the passing of the exception to the context where it can be handled.
The simplest method of handling an exception is to ignore it and continue execution with the next instruction. This prevents the programmer from learning about the exception and may lead to erroneous results.
The most common method of handling exceptions is to abort execution. This is not acceptable for file I/O but may be acceptable for an array index being out of bounds or for division by zero.
The next level of error handling is to return a value outside the range of the operation. This could be a global variable, a result parameter or a function result. This approach requires explicit checking by the programmer for the error values. For example, the eof boolean is set to true when the program has read the last item in a file. The eof condition can then be checked before attempting to read from a file. The disadvantage of this approach is that a program tends to get cluttered with code to test the results. A more serious consequence is that a programmer may forget to include a test with the result that the exception is ignored.
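The error-value approach is the usual idiom in C. The sketch below is one possible arrangement, not a prescribed one: the function result carries a status code and the quotient is delivered through a result parameter. The names checked_div and average_or_default are hypothetical, and only division by zero is detected.

```c
/* Returns 0 on success, -1 on a domain failure (division by zero).
   On success the quotient is stored through the result parameter. */
int checked_div(int a, int b, int *quotient)
{
    if (b == 0) return -1;
    *quotient = a / b;
    return 0;
}

/* The caller must remember to test the status after every call. */
int average_or_default(int total, int count, int fallback)
{
    int q;
    if (checked_div(total, count, &q) != 0) {
        return fallback;    /* handle the exception locally */
    }
    return q;
}
```

Nothing forces the caller to write the test in average_or_default; omitting it would leave q uninitialized, which is exactly the risk described above.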
One approach is to treat exception handlers as subroutines: control is passed to the handler and, after the handler executes, returns to the point following the call to the handler. This is the approach taken in PL/I. It implies that the handler ``fixed'' the state that raised the condition.
Another approach is that the exception handler's function is to provide a clean-up operation prior to termination. This is the approach taken in Ada. The unit in which the exception occurred terminates and control passes to the calling unit. Exceptions are propagated until an exception handler is found.
At the root of the differences between mathematical notations and imperative programs is the notion of referential transparency (substitutivity of equals for equals). Manipulation of formulas in algebra, arithmetic, and logic relies on the principle of referential transparency. Imperative programming languages violate the principle. For example:
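integer f(x:integer) { y := y+1; f := y + x }
Here y is a non-local variable, so each call of f changes the state, and f(1) + f(1) need not equal 2 * f(1).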
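The function f can be transcribed into C to observe the effect. In this sketch y is assumed to be a global variable with initial value 0, and the two helper functions, whose names are hypothetical, reset y before evaluating each expression.

```c
int y = 0;   /* non-local variable modified by f as a side effect */

int f(int x)
{
    y = y + 1;
    return y + x;
}

/* Evaluate f(1) + f(1) starting from y = 0: the calls return 2 and 3. */
int sum_of_two_calls(void)
{
    y = 0;
    return f(1) + f(1);
}

/* Evaluate 2 * f(1) starting from y = 0: the single call returns 2. */
int twice_one_call(void)
{
    y = 0;
    return 2 * f(1);
}
```

The two expressions are algebraically equal, yet the first yields 5 and the second 4, so f(1) cannot be replaced by its value.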
I/O functions of necessity involve side effects. The following
expressions involving the C function getint may return different values
even though algebraically they appear to have the same value.
2 * getint()
getint() + getint()
One way aliases occur is when two or more arguments to a subprogram are the
same. When a data object is passed by ``reference'' it is referenced both
by its name in the calling environment and its parameter name in the called
environment. In the following subprogram, the parameters are in-out
parameters.
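aliasingExample (m, n : in out integer); { n := 1; n := m + n }
If the parameters are passed by reference, the call aliasingExample(i, i) sets i to 2 regardless of the original value of i.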
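The same subprogram can be sketched in C, where in-out parameters are written as pointers. The function name aliasing_example is a hypothetical transcription of the one above.

```c
/* m and n are in-out parameters, passed as pointers. */
void aliasing_example(int *m, int *n)
{
    *n = 1;
    *n = *m + *n;   /* if m and n alias, *m is already 1 here */
}
```

Called with two distinct variables holding 5 and 0, the second ends up as 6. Called with the same variable for both arguments, that variable ends up as 2 whatever it held before.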
Aliasing interferes with the optimizing phase of a compiler. Optimization sometimes requires the reordering of steps or the deletion of unnecessary steps. The following assignments, which appear to be independent of each other, illustrate an order dependency.
x := a + b
y := c + d
If x and c are aliases for the same data object, the second assignment depends on the first and the two cannot be reordered.
The purpose of the equivalence command in FORTRAN is the creation of aliases. It permits the efficient use of memory (historically a scarce commodity) and can be used as a crude form of a variant record. Another way in which aliasing can occur is when a data object may be a component of several data objects (referenced through pointer linkages).
var p, q : ^T; ... new(p); q := p |
type Pointer = ^Integer;
var p : Pointer;
procedure Dangling;
var q : Pointer;
begin
   new(q); q^ := 23; p := q;
   dispose(q)   { the cell is released; p still refers to it }
end;
begin
   new(p); Dangling
end.
The problem of aliasing arises as soon as a language supports variables and assignment. If more than one assignment is permitted on the same variable x, the fact that x=a cannot be used at any other point in the program to infer a property of x from a property of a. Aliasing and global variables only magnify the issue.
COBOL (COmmon Business Oriented Language) was designed (by a committee of representatives of computer manufacturers and the Department of Defense) at the initiative of the U.S. Department of Defense in 1959 and implemented in 1960 to meet the need for business data processing applications. COBOL featured records, files and fixed decimal data. It also provided a ``natural language'' like syntax so that programs would be able to be read and understood by non-programmers. COBOL won wide acceptance in the business data processing community and continues to be in wide use.
ALGOL 60 (ALGOrithmic Language) was designed in 1960 by an international committee for use in scientific problem solving. Unlike FORTRAN it was designed independently of an implementation, a choice which led to an elegant language. The description of ALGOL 60 introduced the BNF notation for the definition of syntax and is a model of clarity and completeness. Although ALGOL 60 failed to win wide acceptance, it introduced block structure, structured control statements and recursive procedures into the imperative programming paradigm.
ALGOL 68 was designed to be a general purpose language which remedied PL/I's defects by using a small number of constructs and rules for combining any of the constructs with predictable results (orthogonality). The description of ALGOL 68 issued in 1969 was difficult to understand since it introduced a notation and terminology that was foreign to the computing community. ALGOL 68 introduced orthogonality and data extensibility as a way to produce a compact but powerful language. The ``ALGOL 68 Report'', considered to be one of the most unreadable documents ever printed, and implementation difficulties prevented ALGOL 68's acceptance.
Pascal was developed by Niklaus Wirth partly as a reaction to the problems encountered with ALGOL 68 and as an attempt to provide a small and efficient implementation of a language suitable for teaching good programming style. C, which was developed about the same time, was an attempt to provide an efficient language for systems programming.
Modula-2 and Ada extended Pascal with features to support module based program development and abstract data types. Ada was developed as the result of a Department of Defense initiative while Modula-2 was developed by Niklaus Wirth. Like PL/I and ALGOL 68, Ada represents an attempt to produce a complete language covering the full range of programming tasks.
Simula 67 added coroutines and classes to ALGOL 60 to provide a language more suited to solving simulation problems. The concept of classes in object-oriented programming can be traced back to Simula's classes. Smalltalk combined classes, inheritance, and ease of use to provide an integrated object-oriented development environment. C++ is an object-oriented programming language derived from C. Java, a simplified C++, is an object-oriented language designed to dynamically load modules at runtime and to reduce programming errors.