Modula-3: Group Project

Chapter IV

Other Features of Modula-3

There are several features in Modula-3 that support structuring of large systems. First is the separation of interface from implementation. This allows for system evolution as implementations evolve without affecting the clients of those interfaces; no one is dependent on *how* you implement something, only *what* you implement. As long as the *what* stays constant, the *how* can change as much as is needed. Secondly, it provides a simple single-inheritance object system. There is a fair amount of controversy over what the proper model for multiple inheritance (MI) is. I have built systems that use multiple-inheritance extensively and have implemented programming environments for a language that supports MI. Experience has [1][1] ```1 [i][i] taught me that MI can complicate a language tremendously (both conceptually and in terms of implementation) and can also complicate applications. Modula-3 has a particularly simple definition of an object. In Modula-3, an object is a record on the heap with an associated method suite. The data fields of the object define the state and the method suite defines the behavior. The Modula-3 language allows the state of an object to be hidden in an implementation module with only the behavior visible in the interface. This is different than C++ where a class definition lists both the member data and member function. The C++ model reveals what is essentially private information (namely the state) to the entire world. With Modula-3 objects, what should be private can be really be private. Programmers wind up adopting such conventions, as the explicit use of reference counts to determine when it is safe to deallocate storage. Unfortunately, programmers are not very good about following conventions. The net result is that programs develop storage leaks or the same piece of storage is mistakenly used for two different purposes. Also, in error situations, it may be difficult to free the storage. In 'C', a 'longjmp' may cause storage to be lost if the procedure being unwound doesn't get a chance to clean up. Exception handling in C++ has the same problems. In general, it is very difficult to manually reclaim storage in the face of failure. Having garbage collection in the language removes all of these problems. Better yet, the garbage collector that is provided with SRC implementation of Modula-3 has excellent performance. .It is the result of several years of production use and tuning. Most modern systems and applications have some flavor of asynchrony in them. Certainly all GUI-based applications are essentially asynchronous. The user drives inputs to a GUI-based application. Multiprocessor and multi-machine applications are essentially asynchronous as well. Given this, it is surprising that very few languages provide any support at all for managing concurrency. Instead, they "leave it up to the programmer". More often then not, programmers do this through the use of timers and signal handlers. While this approach suffices for fairly simple applications, it quickly falls apart as applications grow in complexity or when an application uses two different libraries, both of which try to implement concurrency in their own way. If you have ever programmed with TX or Motif, then you are aware of the problems with nested event loops. There needs to be some standard mechanism for concurrency. Modula-3 provides such a standard interface for creating threads. In addition, the language itself includes support for managing locks. The standard libraries provided in the SRC implementation are all thread-safe. Trestle, which is a library providing an interface to X is not only thread-safe, but also itself, uses threads to carry out long operations in the background. With a Trestle-based application, you can create a thread to carry out some potentially long-running operation in response to a mouse-button click. This thread runs in the background without tying up the user interface. It is a lot simpler and error prone than trying to accomplish the same thing with signal handlers and timers. Generic interfaces and modules are a key to reuse. One of the principal uses is in defining container types such as stacks, lists, and queues. They allow container objects to be independent of the type of entity contained. Thus, one needs to define only a single "Table" interface that is then instantiated to provide the needed kind of "Table", whether an integer table or a floating-point table or some other type of table is needed. Modula-3 generics are cleaner than C++ parameterized types, but provide much of the same flexibility. Below is a detailed description of these features that are embedded in modula-3 to make it the exceptional language that it is today.

Interfaces

One of Modula-2's most successful features is the provision for explicit interfaces between modules. Interfaces are retained with essentially no changes in Modula-3. An interface to a module is a collection of declarations that reveal the public parts of a module; things in the module that are not declared in the interface are private. A module imports the interfaces it depends on and exports the interface (or, in Modula-3, the interfaces) that it implements. Interfaces make separate compilation type-safe; but it does them an injustice to look at them in such a limited way. Interfaces make it possible to think about large systems without holding the whole system in your head at once. Programmers who have never used Modula-style interfaces tend to underestimate them, observing, for example, that anything that can be done with interfaces can also be done with C-style include files. This misses the point: many things can be done with include files that cannot be done with interfaces. For example, the meaning of an include file can be changed by defining macros in the environment into which it is included. Include files tempt programmers into shortcuts across abstraction boundaries. To keep large programs well structured, you either need super-human will power, or proper language support for interfaces.

Supporting Machine-Level Programming

One of the important lessons from C was that there are times that real systems need to be programmed essentially at the machine level. This power has been nicely integrated into the Modula-3. Any module that is marked as 'unsafe' has full access to machine-dependent operations such as pointer arithmetic, unconstrained allocation and deallocation of memory, and machine-dependent arithmetic. These capabilities are exploited in the implementation of the Modula-3 IO system. The lowest-levels of the IO system are written to make heavy use of machine-dependent operations to eliminate bottlenecks. In addition, existing (non-Modula-3) libraries can be imported. Many existing C libraries make extensive use of machine-dependent operations. These can be imported as "unsafe" interfaces. Then, safer interfaces can be built on top of these while still allowing access to the unsafe features of the libraries for those applications that need them.

Objects

The better we understand our programs, the bigger the building blocks we use to structure them. After the instruction came the statement, after the statement came the procedure, after the procedure came the interface. The next step seems to be the abstract type. At the theoretical abstract type is a type defined by the specifications of its operations instead of by the representation of its data. As realized in modern programming languages, a value of an abstract type is represented by an "object" whose operations are implemented by a suite of procedure values called the object's "methods". A new object type can be defined as a subtype of an existing type, in which case the new type has all the methods of the old type, and possibly new ones as well (inheritance). The new type can provide new implementations for the old methods (overriding). The farsighted designers of Simula invented objects in the mid-sixties . Objects in Modula-3 are very much like objects in Simula: they are always references, they have both data fields and methods, and they have single inheritance but not multiple inheritances. Small examples are often used to get across the basic idea: truck as a subtype of vehicle; rectangle as a subtype of polygon. Modula-3 aims at larger systems that illustrate how object types provide structure for large programs. In Modula-3 the main design effort is concentrated into specifying the properties of a single abstract type---a stream of characters, a window on the screen. Then dozens of interfaces and modules are coded that provide useful subtypes of the central abstraction. The abstract type provides the blueprint for a whole family of interfaces and modules. If the central abstraction is well designed then useful subtypes can be produced easily, and the original design cost will be repaid with interest. The combination of object types with Modula-2 opaque types produces something new: the partially opaque type, where some of an object's fields are visible in a scope and others are hidden. Because the committee had no experience with partially opaque types, the first version of Modula-3 restricted them severely; but after a year of experience it was clear that they were a good thing, and the language was revised to remove the restrictions. It is possible to use object-oriented techniques even in languages that were not designed to support them, by explicitly allocating the data records and method suites. This approach works reasonably smoothly when there are no subtypes; however it is through sub typing that object-oriented techniques offer the most leverage. The approach works badly when sub typing is needed: either you allocate the data records for the different parts of the object individually (which is expensive and notationally cumbersome) or you must rely on unchecked type transfers, which is unsafe. Whichever approach is taken, the subtype relations are all in the programmer's head: only with an object-oriented language is it possible to get object-oriented static type checking.

Example-2 defines two types: "File.T", which is an object with two methods to get and put a single character resp; and "FileServer.T", an object that manages file objects. A server someplace defines a concrete implementation of these abstract types.

INTERFACE File;

                    TYPE T = NetObj.T OBJECT

                    METHODS

                      getChar(): CHARACTER;

                      putChar(c: CHARACTER);

        END;

                    END File;

                    INTERFACE FileServer;

                    IMPORT File;

                    TYPE T = NetObj.T OBJECT

                    METHODS

                      create(name: Text): File.T;

                      open(name: Text): File.T;

                    END;

                    END FileServer;

Generics

A generic module is a template in which some of the imported interfaces are regarded as formal parameters, to be bound to actual interfaces when the generic is instantiated. For example, a generic hash table module could be instantiated to produce tables of integers, tables of text strings, or tables of any desired type. The different generic instances are compiled independently: the source program is reused, but the compiled code will generally be different for different instances. To keep Modula-3 generics simple, they are confined to the module level: generic procedures and types do not exist in isolation, and generic parameters must be entire interfaces. In the same spirit of simplicity, there is no separate type checking associated with generics. Implementations are expected to expand the generic and type checks the result. The alternative would be to invent a polymorphism type system flexible enough to express the constraints on the parameter interfaces that are necessary in order for the generic body to compile. This has been achieved for ML and CLU, but it has not yet been achieved satisfactorily in the Algol family of languages, where the type systems are less uniform. (The rules associated with Ada generics are too complicated for our taste.) Modula-3 generics are cleaner than C++ parameterized types, but provide much of the same flexibility.

Threads

Dividing a computation into concurrent processes (or threads of control) is a fundamental method of separating concerns. For example, suppose you are programming a terminal emulator with a blinking cursor: the most satisfactory way to separate the cursor blinking code from the rest of the program is to make it a separate thread. Or suppose you are augmenting a program with a new module that communicates over a buffered channel. Without threads, the rest of the program will be blocked whenever the new module blocks on its buffer, and conversely, the new module will be unable to service the buffer whenever any other part of the program blocks. If this is unacceptable (as it almost always is) there is no way to add the new module without finding and modifying every statement of the program that might block. These modifications destroy the structure of the program by introducing undesirable dependencies between what would otherwise be independent modules.The provisions for threads in Modula-2 are weak, amounting essentially to coroutines. Hoare's monitors are a sounder basis for concurrent programming. Monitors were used in Mesa, where they worked well; except that the requirement that a monitored data structure be an entire module was irksome. For example, it is often useful for a monitored data structure to be an object instead of a module. Mesa relaxed this requirement, made a slight change in the details of the semantics of Hoare's Signal primitive, and introduced the Broadcast primitive as a convenience [Lampson]. The Mesa primitives were simplified in the Modula-2+ design, and the result was successful enough to be incorporated with no substantial changes in Modula-3. A threads package is a tool with a very sharp edge. A common programming error is to access a shared variable without obtaining the necessary lock. This introduces a race condition that can lie dormant throughout testing and strike after the program is shipped. Theoretical work on process algebra has raised hopes that the rendezvous model of concurrency may be safer than the shared memory model, but the experience with Ada, which adopted the rendezvous, lends at best equivocal support for this hope, Ada still allows shared variables, and apparently they are widely used.

Safety

A language feature is unsafe if its misuse can corrupt the runtime system so that further execution of the program is not faithful to the language semantics. An example of an unsafe feature is array assignment without bounds checking: if the index is out of bounds, then an arbitrary location can be clobbered and the address space can become fatally corrupted. An error in a safe program can cause the computation to abort with a run-time error message or to give the wrong answer, but it can't cause the computation to crash in rubble of bits. Safe programs can share the same address space, each safe from corruption by errors in the others. To get similar protection for unsafe programs requires placing them in separate address spaces. As large address spaces become available, and programmers use them to produce tightly coupled applications, safety becomes more and more important. Unfortunately, it is generally impossible to program the lowest levels of a system with complete safety. Neither the compiler nor the runtime system can check the validity of a bus address for an I/O controller, nor can they limit the ensuing havoc if it is invalid. This presents the language designer with a dilemma. If he holds out for safety, then low-level code will have to be programmed in another language. But if he adopts unsafe features, then his safety guarantee becomes void everywhere. The languages of the BCPL family are full of unsafe features; the languages of the Lisp family generally have none (or none that are documented). In this area Modula-3 follows the lead of Cedar by adopting a small number of unsafe features that are allowed only in modules explicitly labeled unsafe. In a safe module, the compiler prevents any errors that could corrupt the runtime system; in an unsafe module, it is the programmer's responsibility to avoid them.

Garbage Collection

One of the most important features in Modula-3 is garbage collection. Garbage collection really enables robust, long-lived systems. Without garbage collection, you need to define conventions about who owns a piece of storage. For instance, if I pass you a pointer to a structure, are you allowed to store that pointer somewhere? A classic unsafe runtime error is to free a data structure that is still reachable by active references (or "dangling pointers"). The error plants a time bomb that explodes later, when the storage is reused. If on the other hand the programmer fails to free records that have become unreachable, the result will be a "storage leak" and the computation space will grow without bound. Problems due to dangling pointers and storage leaks tend to persist long after other errors have been found and removed. The only sure way to avoid these problems is the automatic freeing of unreachable storage, or garbage collection. Modula-3 therefore provides "traced references", which are like Modula-2 pointers except that the storage they point to is kept in the "traced heap" where it will be freed automatically when all references to it are gone. Another great benefit of garbage collection is that it simplifies interfaces. Without garbage collection, an interface must specify whether the client or the implementation has the responsibility for freeing each allocated reference, and the conditions under which it is safe to do so. This can swamp the interface in complexity. For example, Modula-3 supports text strings by a simple required interface Text, rather than with a built-in type. Without garbage collection, this approach would not be nearly as attractive. New refinements in garbage collection have appeared continually for more than twenty years, but it is still difficult to implement efficiently. For many programs, the programming time saved by simplifying interfaces and eliminating storage leaks and dangling pointers makes garbage collection a bargain, but the lowest levels of a system may not be able to afford it. For example, in SRC's Topaz system, the part of the operating system that manages files and heavyweight processes relies on garbage collection, but the inner "nub" that implements virtual memory and thread context switching does not. Essentially all Topaz application programs rely on garbage collection. For programs that cannot afford garbage collection, Modula-3 provides a set of reference types that are not traced by the garbage collector. In most other respects, traced and untraced references behave identically.

Exceptions

An exception is a control construct that exits many scopes at once. Raising an exception exits active scopes repeatedly until a handler is found for the exception, and transfers control to the handler. If there is no handler, the computation terminates in some system-dependent way---for example, by entering the debugger. There are many arguments for and against exceptions, most of which revolve around inconclusive issues of style and taste. One argument in their favor that has the weight of experience behind it is that exceptions are a good way to handle any runtime error that is usually, but not necessarily, fatal. If exceptions are not available, each procedure that might encounter a runtime error must return an additional code to the caller to identify whether an error has occurred. This can be clumsy, and has the practical drawback that even careful programmers may inadvertently omit the test for the error return code. The frequency with which returned error codes are ignored has become something of a standing joke in the Unix/C world. Raising an exception is more robust, since it stops the program unless there is an explicit handler for it. Like all languages in the Algol family, Modula-3 is strongly typed. The basic idea of strong typing is to partition the value space into types, restrict variables to hold values of a single type, and restrict operations to apply to operands of fixed types. In actuality, strong typing is rarely so simple. For example, each of the following complications is present in at least one language of the Algol family: a variable of type [0.9] may be safely assigned to an INTEGER, but not vice-versa (sub typing). Operations like absolute value may apply both to REALs and to INTEGERs instead of to a single type (overloading). The types of literals (for example, NIL) can be ambiguous. The type of an expression may be determined by how it is used (target-typing). Type mismatches may cause automatic conversions instead of errors (as when a fractional real is rounded upon assignment to an integer). We adopted several principles in order to make Modula-3's type system as uniform as possible. First, the+-re are no ambiguous types or target typing: the type of every expression is determined by its sub expressions, not by its use. Second, there are no automatic conversions. In some cases the representation of a value changes when it is assigned (for example, when assigning to a packed field of a record type) but the abstract value itself is transferred without change. Third, the rules for type compatibility are defined in terms of a single subtype relation. The subtype relation is required for treating objects with inheritance, but it is also useful for defining the type compatibility rules for conventional types.

Simplicity

In the early days of the Ada project, a general in the Ada Program Office opined that "obviously the Department of Defense is not interested in an artificially simplified language such as Pascal". Modula-3 represents the opposite point of view. We used every artifice that we could find or invent to make the language simple. C. A. R. Hoare has suggested that as a rule of thumb a language is too complicated if it can't be described precisely and readably in fifty pages. The Modula-3 committee elevated this to a design principle: we gave ourselves a "complexity budget" of fifty pages, and chose the most useful features that we could accommodate within this budget. In the end, we were over budget by six lines plus the syntax equations. This policy is a bit arbitrary, but there are so many good ideas in programming language design that some kind of arbitrary budget seems necessary to keep a language from getting too complicated.

To bring to a close, the features were directed toward two main goals. Interfaces, objects, generics, and threads provide fundamental patterns of abstraction that help to structure large programs. The isolation of unsafe code, garbage collection, and exceptions help make programs safer and more robust. Of the techniques that we used to keep the language internally consistent, the most important was the definition of a clean type system based on a subtype relation. There is no special novelty in any one of these features individually, but there is simplicity and power in their combination

The links to this information are located at these sites:

        1.http://www.research.digital.com/SRC/modula-3/html/home.html

        2ftp://ftp.vlsi.polymtl.ca/pub/m3/m3-faq.ps

        3[Modula-3 home page]

[1][1] All information was obtain from www.modula-3.com

[i][i] All iinf ormation was cited from www. Modula-3 .com