THE STORED PROGRAM CONCEPT

The use of a computer's memory for holding both programs and data is called the stored program concept. It was not until the late 1940s that the sharing of memory between programs and data, an idea usually credited to John von Neumann, was incorporated into computer systems.

COMPILERS AND INTERPRETERS

The computer's Central Processing Unit (CPU for short) can only execute programs written in machine code. It cannot directly execute programs written in high-level languages such as BASIC, Pascal, COBOL or FORTRAN. Therefore, special systems software has to be written for the computer to enable it to 'understand' programs written in the high-level languages which people prefer to use.

One way of getting the computer to execute high-level language programs is by means of a program called an interpreter. This program reads in a high-level language statement, executes it, and then moves on to the next statement and repeats the process. This is normally how BASIC programs run on microcomputers. The problem with using an interpreter is that interpreted programs run much more slowly than programs written in machine code.

The way to get high-level language programs to run quickly is to translate them into machine code. This translation is performed by a program called a compiler. A compiler takes each high-level language statement in turn, and translates it into a sequence of machine code instructions. The machine code instructions generated are called the object code, and the high-level language statements from which they were generated are called the source code. The object code can, with a little further processing, be directly executed by the processor. Translating programs into machine code is normally preferred to interpreting them, because of the speedup in execution time.
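As a rough illustration of the difference between interpreting and compiling, the following Python sketch runs the same three-statement program both ways. The toy source language (LET, ADD, PRINT), the instruction names (LOADI, LOAD, ADDI, STORE, OUT) and the single-accumulator "processor" are all invented for this example and do not correspond to any real language or machine.

```python
# Toy illustration only: the language and instruction set are hypothetical.

# Source code: statements in a made-up high-level language.
SOURCE = [
    "LET x 5",    # x := 5
    "ADD x 3",    # x := x + 3
    "PRINT x",    # output the value of x
]


def interpret(source):
    """Interpreter: read a statement, execute it at once, move to the next."""
    variables, output = {}, []
    for statement in source:
        op, name, *rest = statement.split()
        if op == "LET":
            variables[name] = int(rest[0])
        elif op == "ADD":
            variables[name] += int(rest[0])
        elif op == "PRINT":
            output.append(variables[name])
        else:
            raise ValueError(f"unknown statement: {statement}")
    return output


def compile_source(source):
    """Compiler: translate each statement into a sequence of instructions."""
    object_code = []
    for statement in source:
        op, name, *rest = statement.split()
        if op == "LET":
            object_code += [("LOADI", int(rest[0])), ("STORE", name)]
        elif op == "ADD":
            object_code += [("LOAD", name), ("ADDI", int(rest[0])), ("STORE", name)]
        elif op == "PRINT":
            object_code += [("LOAD", name), ("OUT",)]
        else:
            raise ValueError(f"unknown statement: {statement}")
    return object_code


def run_object_code(object_code):
    """Stand-in for the processor: one accumulator plus named memory cells."""
    accumulator, memory, output = 0, {}, []
    for instruction in object_code:
        opcode = instruction[0]
        if opcode == "LOADI":
            accumulator = instruction[1]
        elif opcode == "LOAD":
            accumulator = memory[instruction[1]]
        elif opcode == "ADDI":
            accumulator += instruction[1]
        elif opcode == "STORE":
            memory[instruction[1]] = accumulator
        elif opcode == "OUT":
            output.append(accumulator)
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return output


if __name__ == "__main__":
    print(interpret(SOURCE))             # executed statement by statement
    object_code = compile_source(SOURCE)
    print(object_code)                   # the generated "object code"
    print(run_object_code(object_code))  # executed after translation
```

Because the "processor" here is itself simulated in Python, the sketch shows only how the work is divided between translation and execution, not the difference in speed.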
However, once a program is translated into the machine code for one machine, its object code cannot be executed on a machine with a different type of processor. Normally, to execute a program on two different types of machine, the source program is transferred from one machine to the other, and then compiled separately on each. The object code, though different, will then appear to execute identically on both machines.

A different approach to program portability is to translate the source program, not into machine code, but into an intermediate code (sometimes called S-code). This S-code is then interpreted by an S-code interpreter (often called a soft machine or virtual machine). The S-code can then be run on any computer with an appropriate soft machine. Programs compiled into S-code can run almost as efficiently as programs compiled into machine code, for the following reasons:

1) They take up less space in the computer's memory, and so can be handled more efficiently by the operating system (see below for a brief description of operating systems). This is because a high-level language statement normally translates to one S-code instruction instead of several machine code instructions.

2) It is much easier for a systems programmer to write an optimised S-code interpreter than to write an optimising compiler (one that generates efficient machine code).

Associated with every processor's instruction set, there is a special kind of translation program called an assembler. This translates programs written in assembly language into the machine code for that processor. Each statement in assembly language normally translates into one machine code instruction.

OPERATING SYSTEMS

The operating system is what makes the computer usable. It allocates the computer's resources and performs various housekeeping functions.
Operating systems for large computers (minicomputers and mainframes) typically allow several users each to run a number of processes simultaneously, with two or more processes possibly competing for the same devices. A process is a program in the course of being executed. There may be several users running the same program. Although only one copy of the program is loaded into memory, there is a separate process for each user who is running that program. The operating system determines which process is currently being executed, and holds each process in the form of a record containing, among other things, the current program counter of the process, and the addresses of the data areas used by that process.

Operating systems are normally required to perform the following functions:

1) Processor management. There are likely to be several programs in memory, all of which are in a runnable state. Instead of executing one process to completion after another (as in a batch system), each process is given a share of the processor's time, and is temporarily suspended when that share is used up, or when the process requests the use of a device which is already in use by another process. This ensures that no process gains exclusive use of the processor for long periods of time. Another consideration in processor management is that the processor should not be left idle for long periods of time.

2) Memory management. There may be several different programs residing in memory at the same time. A new program loaded into memory must either be loaded into space not already occupied by a program or data, or overwrite a program or data. Before data are overwritten, they must be saved to disk, as they are likely to have changed during the execution of the program which uses them, whereas programs will (or should!) not change. In any event, the operating system must keep track of where the programs and data associated with a suspended process are stored.
Similar problems arise when data are brought into memory from disk. A memory management system which allows disk storage to be used as an extension to main memory is called a virtual memory system.

3) File handling. The operating system must also handle the storage of files on disk. The files are typically organised into a directory structure (a directory contains information on a group of files: for example, their location on disk, their size, and their creation date). Directories often have a hierarchical structure, with their owners being able to create subdirectories.

4) Device handling. The operating system also handles the usage of peripheral devices (such as line printers, tape drives and disk drives) by processes running on the computer.

TEXT EDITORS, LINKERS, AND DEBUGGERS

The text editor is used for entering programs and data into the computer, and for modifying them. There are two types of editor: screen editors and line editors. Line editors only allow you to modify lines one at a time. Screen editors allow you to modify any line currently on the screen, simply by moving the cursor there. Good screen editors can also be used as word processors.

The output from the compiler (the object code) usually contains references to addresses (e.g. in goto statements, and load and store instructions) which are relative to the beginning of the program. The object code can also contain references to data outside itself. If the program were always loaded from memory location zero upward and there were no external references, linking would not be necessary. The linker resolves the external references and inserts the correct (or absolute) addresses. The output from the linker is directly executable by the processor.

Programs seldom run correctly the first time. To help the programmer debug them, a debugger is often provided. This enables the programmer to monitor the execution of the program.
Debuggers allow the programmer to inspect the memory locations used by the program, modify memory locations, insert breakpoints into the program, and step through the program one instruction at a time.

There are other items of system software sometimes associated with computer systems. These include data communication networks and database management systems. As these are not always present in computer systems, they will not be discussed here.

ANALYSIS, DESIGN AND PROGRAMMING OF COMPUTER SYSTEMS

The development of a computer system normally involves the following stages:

1) Defining what is required of the system. This stage is often called requirements analysis.

2) Determining the solution method. This may involve deciding which programming language to use, which programming techniques to employ, etc. It may be decided at this stage not to use a computer system at all.

3) Specification of the input data and the output results.

4) Design of the computer program. There are a variety of techniques (or methodologies) used in this stage.

5) Coding the design in a computer language.

6) Testing and debugging the computer program.

7) Maintaining the program. At some later stage, it may be necessary to make modifications to the program. Often, at this stage, obscure bugs not detected in stage 6 manifest themselves.

Requirements for good software are that it run efficiently, be easy to understand, and be easy to modify. To achieve this, it is necessary to document the program well, and to use a good design methodology.

The items of documentation normally required for a large computer system are:

1) The user specification and user guide. These enable the user to learn how to use the system, and provide a definition of what the system does.

2) The support documentation. This describes how the system is designed and implemented. It allows the programmers maintaining the system to understand how the system works and why the various design decisions were taken.

3) Comments in the code. These are essential.
How many of them there should be depends to a large extent on what language the system is written in. In assembly languages, one comment per line of code is normal. Programs written in unstructured languages such as BASIC normally require more comments than those written in structured languages such as Pascal.

There are several program design methodologies. The two most popular are probably Jackson Structured Design and Stepwise Refinement (also known as the top-down approach).

Jackson Structured Design. This methodology involves basing the program structure around the data structures and the relationships between them. There are five stages.

1) Define the data structures. (The data structures allowed are sequences of different data structures (records), repetitions of the same data structure (arrays), and alternatives between several different data structures (variant records).)

2) Determine the relationships between the data structures.

3) Determine the program structures. (The program structures allowed are instruction sequences, iteration, and conditional statements. Notice how they correspond to the three different data structure types.)

4) Write down the executable steps, numbering each. Then determine where they fit into the program structure.

5) Form schematic logic. Schematic logic corresponds very closely to the eventual program structure.

Stepwise refinement is described by the following pseudocode:

    stepwise refinement(module) =
        if module cannot be broken down further then
            code it in the chosen programming language;
            return(module)
        else
            define what the module does as a sequence of different operations
            (and note these for documentation purposes);
            for each operation do
                stepwise refinement(operation);
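The recursion in this pseudocode can be made concrete with a short Python sketch. Deciding whether a module can be broken down further, and into which operations, is a design judgement made by the programmer. Here it is assumed that those decisions have already been written down in a table. The table BREAKDOWN, the payroll example and the function signature are all hypothetical, chosen only to show the shape of the process.

```python
# Hypothetical breakdown table: each module maps to the sequence of
# operations it consists of. A module that does not appear as a key is
# treated as one that cannot be broken down further.
BREAKDOWN = {
    "process payroll": ["read timesheets", "compute pay", "print payslips"],
    "compute pay": ["compute gross pay", "deduct tax"],
}


def stepwise_refinement(module, breakdown, notes=None, depth=0):
    """Return (modules to be coded, in order; documentation notes)."""
    if notes is None:
        notes = []
    operations = breakdown.get(module)
    if not operations:
        # Cannot be broken down further: this is the point at which the
        # module would be coded in the chosen programming language.
        return [module], notes
    # Note the sequence of operations for documentation purposes.
    notes.append(f"{'  ' * depth}{module}: {', '.join(operations)}")
    to_code = []
    for operation in operations:
        leaves, _ = stepwise_refinement(operation, breakdown, notes, depth + 1)
        to_code.extend(leaves)
    return to_code, notes


if __name__ == "__main__":
    to_code, notes = stepwise_refinement("process payroll", BREAKDOWN)
    print("Modules to code, in order:")
    for name in to_code:
        print("  " + name)
    print("Design notes:")
    for line in notes:
        print("  " + line)
```

The notes collected on the way down form an outline of the design, which is the kind of record the support documentation is meant to keep.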