Development History

Bruce I. Blum

Introduction

This chapter presents a historical overview of the OCIS. Because the OCIS is a software system, this history is necessarily one of software development. There are two basic themes in this chapter. The first, which is consistent with the central focus of this book, concentrates on what the OCIS does and how it is used. The second is that of software implementation —especially in the context of clinical information systems. In this view the OCIS is considered a case study.

The development of the OCIS did not occur in a vacuum; there were many interacting and overlapping events that affected its design and application. For example, a new development environment TEDIUM1 was used in the second phase of OCIS implementation. This impacted the development process and also provided new insights into the process. Also, experience with the OCIS was carried over to the implementation of clinical information systems elsewhere in the hospital.

As one of the lead designers, I clearly left my mark on the OCIS. At the same time, the development of the OCIS taught me much about medical informatics and software engineering. I would like to capture some of that exchange in this chapter. Therefore, I shall use the first person singular to indicate my recollections and opinions, and I shall present the history from a very personal (as opposed to institutional) perspective.

The remainder of the chapter is divided into four sections. The first contains an overview of the software development process (or software process as it is now called). The goal is to establish a common understanding of how systems are implemented. The next section presents a summary of the OCIS development history. This is described from a functional perspective, that is, what did the OCIS do and when did it do it? The third section contains an overview of the tools that the development team used: MUMPS and TEDIUM. The final section evaluates the OCIS with a special emphasis on the development activity.

1TEDIUM is a trademark of Tedious Enterprises, Inc.

For persons with a limited computer background, reading the first two sections may be sufficient. The section on the software tools is oriented to technical readers and is not necessary for an understanding of the project history. The final evaluation section reports on data collected to help us calibrate the development effort. It is technically oriented, and we believe that it will be of interest to many of the readers; most of the clinical evaluations, however, are included in the chapters in Part II.

The Software Process

Software engineering is the discipline concerned with the process of implementing and maintaining large computer applications. The goals of software engineering are to deliver and maintain the necessary functionality and quality in the required time period within the budgeted costs. There are several “life cycle models” that have been designed to facilitate the management of the process. The first subsection describes the most commonly used development approach, the “waterfall model.” The next subsection examines the software process from a different “essential” perspective, and the final subsection concludes with some observations about software development and the OCIS.

The Waterfall Software Life Cycle Model

This model gets its name from the “waterfall diagram” presented in Figure 1. As shown there, the model organizes the development process into a series of steps or phases. The output of one phase becomes the input to the next. In this way the phases are managed serially; one cannot begin a new phase until all the issues in the previous one have been resolved. Naturally, as development progresses, changes to earlier assumptions are made, and feedback is required. This feedback is shown as the dotted lines in the figure. The phases in the model include all activities from the initial analysis through the product’s retirement. Hence it is called a life cycle model.

The labels in the boxes (i.e., the names of the phases) vary from diagram to diagram. However, they always follow the flow used in any intellectual or construction process: Decide what is to be done, decide how to do it, do it, and evaluate the results. This is how we build bridges and airplanes, conduct scientific research, and implement computer applications.

Using the terms of Figure 1, the process begins with an analysis step in which it is determined what the software product should support. The result of this step is a requirements document that describes what the system should do; it does not indicate how the system should achieve these goals. That is, it tells “what,” but not “how.”

Figure 1. The traditional software life cycle model.

The design describes how the software product should be implemented. At the highest level it defines how the functions are to be distributed among the modules (the functional specification). Once this has been determined, a detailed design follows. (It is documented as design specifications.) As the designs are refined, implementation (i.e., coding and debugging) follows. Naturally, some portions of the system are implemented before others; that is, implementation begins before the entire design phase is completed. However, in this model one cannot begin the implementation of a module until its design is complete.

After products are implemented, they are tested. First the programs are tested as individual entities (unit testing); then the tested products are integrated and retested in larger components. Finally, the entire system is tested to see if it conforms to the initial requirements statement. In effect, the design process is one of decomposing the whole into smaller and smaller parts, while the testing is the inverse of this process. Design begins with a single requirements document and ends with a large collection of computer programs; testing begins with these programs and ends with a single tested system.

This waterfall model was adapted from the hardware development life cycle model. It has been used with software for almost two decades. It is particularly effective for large projects in which it is necessary to coordinate the implementation of hardware and software components; similar management tools can be applied to all components. The model is also interesting because we have a great deal of empirical data that describe the process.

In planning for a software project, one typically uses a 40-20-40 rule that states that

40% of the effort should be allocated for analysis of the requirements and design;

20% of the effort should be allocated to the programming process, that is, coding and debugging;

40% of the effort should be allocated to testing of the finished product.

Notice that the process of writing code is a very small portion of the entire project.

As the result of studying many large projects we also have found that:

Most errors are design errors; relatively few errors are made in translating a design into code.

The later in the life cycle that an error is identified, the more expensive it will be to correct it. The cost ratio can be as high as 1 : 100.

The more persons working on a project, the lower the productivity of each individual. That is, a programmer working alone may be able to produce ten times more product in a given time than when working as a member of a very large team. Communication among team members and the management of large development efforts both reduce individual performance.

There is evidence to suggest that the productivity of an individual is the same regardless of the programming language used. That is, the same number of lines of output will be produced per effort day for assembly language, COBOL, or MUMPS. An industry average of two delivered lines of code per effort hour is common; this effort includes all analysis, design, testing, and documentation.

Again, these results indicate that the writing of programs is a very small part of the process.

The discussion so far has been limited to the development phases of the life cycle. Once the product has been accepted, it is placed in operational use. Any changes to the product that follow are considered maintenance —or as indicated in the figure, evolution. Evolution may represent anywhere from half to three-quarters of the total life cycle cost. That is, an organization will spend, on the average, twice as much to maintain a software product as it did to implement it.

A naive view of maintenance is that it is an activity devoted to the correction of errors. However, studies of project histories have shown that

20% of the maintenance effort is to correct errors, that is, corrective maintenance;

25% of the maintenance effort is to adapt the software to meet changed requirements, that is, adaptive maintenance;

55% of the maintenance effort is to enhance existing features or to add new functions, that is, perfective maintenance.

Thus, maintenance (or evolution) is a positive sign that the software product is used and is evolving with the organization’s needs.

Most of the results just reported are based on experience with the waterfall process model. One of the major advantages of that model is that we have experience in its application. However, it should be recognized that this process model assumes the writing of a requirements document that can serve as a contract for the development stages. In effect, once the requirements are defined, they will be fixed until the finished system is available to be tested against these requirements.

When the hardware and software development are conducted in parallel, this is a reasonable approach. Clearly, one cannot start to fabricate hardware until the design drawings are complete. But software is not hardware. It can be changed easily; the cost of fabrication is negligible. Thus, although it may be convenient to manage software in this waterfall model style, there are practical, alternative process models. This is especially true for applications in which the requirements are understood poorly or are subject to change. To see how the software process can be modified to accommodate such applications, the next subsection reexamines the software process from another perspective.

An Essential Software Process Model

One of the major concerns of the waterfall model is the management of the process. It allocates what has to be done into well-defined tasks that can be monitored and controlled. One danger of this model is that one may lose sight of the essential steps of the process by concentrating on the objects being managed. In this subsection I present what I call an essential process model. I view the software process as an intellectual activity in which one identifies needs and builds programs to produce a software system that meets those needs.

In this essential model the emphasis is on the process of determining what the software is to do. The process of building the programs is not considered very important. Recall that most errors are design—and not programming—errors. Thus, if we know what it is we want to do, we should be able to implement a system that meets our needs. In this context implementation is not a problem-solving activity in the domain of application; it simply represents housekeeping, that is, a delay between the time that a solution has been postulated and the time that a realization of that solution is available for evaluation.

What I have written in the previous paragraphs reflects a personal view. That is why I use the first person singular; one may not find other computer scientists referring to this essential model. Having warned the reader of my personal biases, I now return to reporting on concepts that are accepted throughout the software engineering community. As a framework for the discussion, I use the model shown in Figure 2.

This essential model concentrates on the problem-solving activities of the software process. It follows the canonical problem-solving process of deciding what to do, deciding how to do it, doing it, and evaluating the results. In this case the software process is described as a sequence of transformations that go from a real

Figure 2. An essential software process model.

-world need to a system that satisfies that need. The transformation from the real world to the problem statement is the decision of what to do, and the transformation into the implementation statement involves the next three steps. (The perspective of this model is the application domain; the aggregation of the implementation steps as a single transformation reflects my application view of program development as housekeeping.)

In this model the real world is viewed as a poorly understood dynamic object with operations that can be improved through the use of a software system. Naturally, this software system includes software plus hardware and people. Recognizing this need and identifying an appropriate set of responses to it are subjective processes. There are no “correct” requirements documents. Thus, this model starts with a transformation from the real world to a statement of the problem to be solved with a software system. This problem statement is what previously has been called the requirements specification.

Once the problem statement exists, it is used as the source document for the implementation. The transformation from the problem statement to the implementation statement (i.e., the code and documentation) represents the bulk of the development process: deciding how to do it, doing it, and evaluating the final result. The final transformation in this model is the creation of a system (as opposed to a collection of computer programs). This involves training, facility modifications, changes to operating procedures, and so on.

After the system is complete, it is embedded in the real world, thereby changing the real world, its needs, and the problem statement. Thus, this model is not a “life cycle” model; rather it is the description of a single iteration of the process. For real-world problems in which the problem is poorly understood, small problems are solved in early iterations, and the experience gained is used in defining subsequent problem statements. This is sometimes referred to as evolutionary development, and the preliminary implementations are often called prototypes.

The model in Figure 2 also indicates two quality measures with their associated evaluation processes:

Correctness. This is a logical property that describes how well a product meets its parent specification. Correctness is always with respect to some formal document. One can measure the correctness of the system with respect to the requirements specification; that is, does the system perform all the functions as they were described in the specification? In the same way, a computer program can be correct only with respect to its design specification; program correctness is never a property of the program alone. The process of determining correctness is called verification. (The root is the Latin veritas meaning truth.)

Correspondence. This is a subjective property of the final system that measures how well the product meets the needs of the environment. Correspondence can be determined only after the system is complete. The process of predicting correspondence is called validation. (The root is the Latin validus meaning strong or worthy.)

Note that the two properties are independent of each other. One can have a system that is correct but does not correspond (e.g., the product was delivered as specified, but it does not do what is needed), or one can have a system that corresponds but is incorrect (e.g., the delivered system has some useful reports that were not included in the requirements specification.)

As shown in Figure 2, validation begins as the project starts. All validation activities are subjective (cognitive), and the validation techniques are built upon examinations of the design concepts in order to discover whether the right product is being implemented. Verification, on the other hand, can be objective, but it cannot begin until there are documents against which to test the correctness. Testing, by the way, simply implies the discovery of errors. A good test will identify errors; a bad test will not. Thus, the absence of errors during testing may give little insight into the quality of the software. One can only determine correctness by means of a proof with respect to some previous statement. And, of course, a correct system may not correspond.

The advantages of this essential model are that:

It presents a problem-solving perspective. The resultant system is viewed as a tool that provides some useful function to its users. The fact that it is a software product is obscured.

The implementation of the software is considered to be a simple transformation from the application perspective; naturally, this may not be the case when considered from the perspective of the software development team.

It is not a life cycle model. The life cycle is viewed as a series of iterations of this model. Some early iterations may produce prototypes that are discarded; later iterations represent the maintenance activities.

The obvious shortcoming of this model is that it does not address any of the management issues.

Developing Clinical Information Systems

The previous two subsections on the software process are valid for all application domains. This subsection makes some observations that are specific to the domain of clinical information systems (CIS). This class of system maintains clinical data in a large, permanent database to support patient care and medical decision making. The applications typically use off-the-shelf hardware and software systems, and they have few real-time or computational demands. Relative to other application domains, the computer technology for the CIS is mature. This implies that, once the requirements for a CIS have been established, it should not be difficult to implement a product that meets these requirements.

Another way of considering the relative maturity of the CIS technology is to view it in the context of risk. Figure 3 presents two dimensions of risk. The application risk is a measure of the certainty that a valid requirements specification can be written, that is, one that will define a system that corresponds to the environment’s needs. Technical risk, on the other hand, measures the certainty that it is possible to produce a product that will meet the requirements, that is, that a correct implementation can be produced. Because the CIS uses a mature technology, almost all applications have low technical risk.

One lowers application risk through experience in the specific domain. If the users or developers have a clear understanding of their needs, the risk will be low. If, on the other hand, there is no experience with this particular application, then the risk can be lowered only by gaining experience. This may be done by reading about the experience of others with similar applications (the justification for writing this book) or by learning from hands-on experience with prototypes and similar applications.

Figure 3. Two dimensions of risk.

Note that two groups must learn about the application in order to lower the risk. The developers must understand the organization’s needs in order to specify a useful product. The users also need to understand what the technology can do and how it will alter their work. Thus, there is a dual problem-solving activity. Each of the problem solvers begins with mental models of what the “ultimate system” will do. It is only after experience with using the CIS that these mental models are refined. Recall that, in the essential model, the implemented system is embedded in the real world, thereby altering it and the problem statement; we seldom are farsighted enough to anticipate the changes introduced by the new system.

Thus, we may conclude that the development of a complex CIS in a relatively new domain will have high application risk. Implementation will require learning by trial and iteration. The developers will need to have experience in both computer systems and their applications in a medical setting. Because of the high cost of failure, evolutionary development is mandated. Small failures can be accepted as learning experiences, but a major failure would be recognized for what it is.

The OCIS, of course, is an example of a CIS. It is a complex system that charts new waters. Although its developers had a general understanding of what it was to do, there was much to learn. Therefore, the OCIS development history followed the guidelines outlined in the previous paragraphs. How it did this is the subject of the next section.

OCIS Development History

I begin this section with some personal observations. In 1974 the Applied Physics Laboratory (APL), that part of Johns Hopkins University for which I work, offered to help The Johns Hopkins Hospital design and implement a small clinical information system for their outpatient medical clinic. This system would manage a minimal medical record (the minirecord) that would provide an on-line summary problem list and medication summary for use when the full chart was not available. APL provided a programmer plus dial-up computer support for this prototype Minirecord System.

Of course, I was that programmer. At the time, I had over a dozen years of experience with the development of moderately large information systems. But this was my first application in a medical environment, and I did not know this application domain. I was lucky, and I learned rapidly. Fortunately, the Minirecord System prototype was a success. Consequently, when Dr. Donald Simborg, then the director of the Clinical Information Systems Division of the Department of Biomedical Engineering, left the School of Medicine to move to California, I was invited by Dr. Richard Johns, the department chairman, to assume that position.

At the same time, work was proceeding on the construction of a new building for a comprehensive cancer center at Johns Hopkins. The Johns Hopkins Oncology Center (JHOC), as it came to be called, would contain 60 inpatient beds, large outpatient facilities, a major radiation therapy complex, and laboratory space for research. In 1975 a scaled-down version of the JHOC was operating out of a Johns Hopkins clinic for outpatients and a unit in the Baltimore City Hospitals for inpatients.

As a result of their experience in these units, the medical oncologists had a good model of the type of care they intended to provide. Most of the antitumor drugs were given in precisely timed sequences, toxicities and complications were anticipated and monitored, and there was experience in the use of plots and displays to suggest trends. The goal was to formalize the care of all medical oncology patients by (a) establishing protocols—some of which were for research and others specialized for individualized care—and (b) using modern data management techniques to monitor the patients’ status.

It was immediately obvious that these goals could be met only if the care process was linked to an automated system. The manual preparation of graphs had demonstrated their utility; the use of a manual approach, however, would not scale up to a 60-bed center. Thus, the JHOC directorship had two choices: constrain the care to the level that could be supported by a manual system, or apply automation to the care process. Notice that, to a certain extent, this automation would eliminate the potential labor costs associated with the data technicians required to manage and plot all the clinical data. However, because it would not be practical to maintain this type of data management manually, automation was not really a labor-saving tool. Rather, it was an enabling device; it enabled the JHOC to provide the data management support that it considered necessary.

Because it was obvious that patient care in the JHOC would require automated support, Dr. Albert Owens, the director, planned for a computer room with cables to each patient room and clinical area. Dr. Raymond Lenhard, the chief of medical oncology, was given responsibility for the information management activities. And, in 1975, I was given an appointment in the Oncology Center and asked to guide in the development of what we came to call the Oncology Clinical Information System (OCIS).

Thus, in 1975, with no domain knowledge, I was responsible for a division in the School of Medicine plus the implementation of the OCIS. Fortunately, I had the assistance of two domain specialists — Drs. Johns and Lenhard —to guide me in the various development activities. I had two titles, no computer, no staff, and considerable optimism. The latter, of course, is a prerequisite for the computing profession.

Before continuing with the narrative, it will be helpful to present an organizational overview. As already stated, I worked for APL; I was given an interdivisional assignment with the School of Medicine that continued until 1983. Within the School of Medicine I was responsible for a division in the Department of Biomedical Engineering (BME). The division staff doubled in 1977 when Elizabeth McColligan joined us after completing a master’s degree in medical computing. She also succeeded me as the division director when I decided to return to APL. In 1978, as work on the OCIS expanded, we began to add more computer professionals to the BME staff.

Most of the people hired to work in the JHOC Information Center were Hospital employees. (The Johns Hopkins Hospital is administratively and legally distinct from the School of Medicine.) All jobs that were considered to be operational and/or long term were assigned to the JHOC staff (i.e., hospital employees). Short-term software development activities were assigned to the BME staff (i.e., university employees). The goal was to have the people who desired a long-term association with the JHOC remain with it by means of either their initial hire or a later transfer. In actual practice, work assignments were similar for the JHOC and BME development team members. However, few of the OCIS staff members had any prior computer experience when hired.

It is important to recognize that several organizations cooperated in the OCIS development because this fact affected the development process. The OCIS implementation was only one of my responsibilities; my other assignments allowed me time to work out the TEDIUM concepts, participate in professional meetings, and consult in the development of other clinical systems. As time went on, the BME staff members were given new assignments: The Minirecord System became the Core Record System, and new systems were prototyped or implemented for the Department of Anesthesiology and Critical Care, the Department of Social Work, and the University Health Service. Each of these activities borrowed from the OCIS experience and, at the same time, contributed to it.

Finally, it is useful to recognize that the informality of this multiorganizational development also contributed to the OCIS success. There were no rigid contracts, and all staff members were peers independent of their home organization. Because I recognized that I would return to APL, operational responsibility for the OCIS was transferred to the JHOC during a three-year phaseover. Because the BME staff members perceived that new projects would always materialize, they were comfortable in doing the development and then providing consultation as they moved on to another assignment.

To conclude this introduction with an epilogue, the Clinical Information System Division of BME no longer exists. I left the School of Medicine because I chose to concentrate on software engineering. APL supported me in the culminating activity of writing Clinical Information Systems, and now I am active in computer science research. Elizabeth McColligan left BME to take a position with the Ohio State University. The rest of the division’s BME staff has moved on to other positions; several staff members now have jobs within the Hospital. Thus, even though the division had great success, it no longer exists.

The Oncology Center was not affected by the changes in either the BME staff or its own. The Center’s director, Dr. Owens, became President of The Johns Hopkins Hospital in 1987. Dr. Lenhard left the Oncology Center in 1984 to head a hospital-wide clinical information system organization; he is now Vice President of Information Systems. Each brought to his present position an awareness of how the effective management of clinical information can improve the delivery of care. They all left behind a well-managed operational system.

John Enterline succeeded me as Technical Director of the Information Center. His background as a statistician brought a different perspective that added significantly to the growth of the OCIS and computer applications within the JHOC. OCIS is now a permanent part of the JHOC, and the remainder of this section details its early history as summarized in Figure 4. The Phase III history and status, together with a discussion of the plans for further development, are presented in Chapter 9.

The Prototype System

Recall that work on the OCIS began in 1975 before the building was available or there was a computer. The first objective was to determine what the system was to do and how it could be implemented. It was decided that a prototype would be programmed on the APL computer. The goal was to gain some insight into how the OCIS should operate. Ideally, the prototype would also provide some useful functionality to the JHOC.

Dr. Lenhard, in addition to his responsibilities as the chief of Medical Oncology, was responsible for all clinical information management within the JHOC. One of the largest of these tasks was the maintenance of The Johns Hopkins Hospital (JHH) Tumor Registry. The Tumor Registry is a registry of all patients diagnosed at JHH as having cancer. It is a hospital-wide registry, required by the American College of Surgeons (ACS), and is not restricted to patients treated at The JHOC. (Only a small subset of the cancer patients treated at JHH are cared for in The JHOC; The JHOC, however, provides consultation services for all such patients. Patients treated in The JHOC generally are limited to those with diagnoses actively being studied by the Center’s faculty.)

The JHH Tumor Registry was managed by Anne Kammer. In 1975 the registry was maintained as paper records with an automated summary stored on magnetic tape. This automated file contained the information required by the ACS, along with some additional data. It was organized in the form of 80 character unit records. Each record contained the patient identifier, the history (dates and status), topography (location of the tumor), and morphology (cell type and stage) for each primary tumor site. (In a tumor registry each primary site is independent, and a patient may have more than one primary tumor.)

Figure 4. Overview of OCIS development history.

The first task was to define a format for an extended tumor registry that would meet all the ACS requirements and also satisfy the anticipated reporting needs for both JHH and the JHOC researchers. A compact coded system would meet the needs of the first two groups, but it was doubtful that it could satisfy the needs of the faculty. Thus, after several iterations, a tumor registry abstract, essentially the same as the OCIS abstract shown in Chapter 2, was designed. The next step was to write the programs for entering and updating these abstracts and to convert the existing tumor registry tape to the new format.

The prototype tumor registry program was written in PL/1 and run on an IBM mainframe located in Laurel, Maryland, some 25 miles away from the JHOC. Since this was 1975, punch cards were used as the input medium. An APL/JHU/JHH courier picked up the punch cards from the Tumor Registry each day, delivered them to the APL computing center for a run, and returned a listing of update errors to the Tumor Registry to be corrected in time for the next day’s processing. Routine reports were printed, and occasional searches were run.

From today’s perspective this was a very primitive system. However, it did several things. First, it did something useful. Before the prototype, the tumor registry was on a single magnetic tape. There were no reports or error-checking tools. The only output was a tape listing. The new prototype replaced a manual system with an automated system that offered some reporting facilities. It also verified the old data and provided a limited search facility for data going back to 1964. Although there was a small user community for the tumor registry, that small group was satisfied that a computer system would be of value.

The second thing that the prototype did was to establish the format for the patient identification node of the OCIS. The format chosen for this prototype defined what information—coded and textual—was considered necessary for all patients treated at the JHOC. For these patients, the complete abstract would be filled out; for the other JHH Tumor Registry patients, only the basic (i.e., readily available or ACS required) fields needed to be entered.

Work on the OCIS started in July 1975; by the end of the year the tumor registry system was operational. A perception of progress had been planted. Once the abstract was in production, work began on the design of the formats for the flow sheets and plots. Several iterations were tried, and we gradually converged on the formats shown in Chapter 2. Clearly, the processing cycle was so slow that none of these displays could be of any value in a clinical setting.

But we did learn from our prototype. Even though the JHOC did not get its computer until August 1977, we nevertheless felt that we had enough understanding of the problem to present three papers on the OCIS in 1977. Of course, these publications reflect the then state of the art in refereeing papers on the subject; there were not too many reports with evaluated results to choose from.

The Phase I System

Once we felt that we had learned as much as we could from the prototype system, we began to work on the Phase I system that would operate from a dedicated computer located in the JHOC. The new facilities for the Center were opened mid-1976, and by October of that year all units were in operation.

An Information Center was established to manage the data processing within the JHOC. Hospital systems were used for billing and administrative purposes; the Information Center was concerned only with clinical applications and the Tumor Registry. It was located on the main floor of the new building and had a small computer room, cables from that room to 250 locations throughout the JHOC, but no computer.

Financial support for the Information Center operations came out of the patient care income. Some research funds were available to support special applications, but—as one might expect—after moving into the new building there was little money available for a computer. Fortunately, a gift from the Educational Foundation of America freed the funds to purchase the Phase I computer.

A PDP-11/70 was ordered with 256K words of memory, support for 32 terminals, and 44 million bytes of mass storage. The cost was on the order of a quarter of a million dollars, a fact that helps explain why there were so few clinical information systems in the 1970s. In any event, a computer was on order, a programming language was chosen, and implementation of the Phase I system began.

It was decided to develop the OCIS in MUMPS. There were several reasons for this choice. First, an examination of the language suggested that it would be easy to learn and use. Second, the Clinical Laboratory was developing its system using MUMPS, and the director of that activity, Dr. Robert Miller, offered us computer time and consulting assistance.

The commitment to MUMPS proved to be a good choice. The OCIS computer was installed in August 1977, the data from the prototype system were converted to the new system by December, and in April 1978, the terminals at the clinical stations provided on-line access to the patient data. The Phase I system continued to grow in functionality until 1979, when it was decided to freeze the system so that everyone could concentrate on a Phase II system. However, it was not until June 1980 that work stopped on Phase I.

The Phase I system relied on the insights gained from working with the prototype. In theory, we knew what we wanted to do. Briefly, we expected to capture all tabularly represented data—test results, drug doses, tumor measurements—and list them in flow sheets or plots. Textual information would be printed as abstracts, and special types of data, such as the microbiology reports, would require custom programs.

The goal was to present all this information to the clinical staff in formats that were appropriate for the patient’s disease, therapy, and status. In addition, because most patients were treated using a preestablished therapy plan (protocol), we wanted to have the patient database interact with the plans in printing out daily care plans that listed the tests, recommended therapies, and follow-up for each patient.

We designed the system to operate in a “cafeteria” style. That is, the staff could use whatever features of the system they found helpful. The Information Center staff would be available to insulate the clinical users from the system details. Not that the system was hard to use; it was just a question of familiarity with computer applications.

The Phase I system in 1979 was almost the same, functionally, as the later Phase II system. The following features were available on line from any inpatient unit or outpatient clinic:

Abstracts and the Tumor Registry. A database of 25,000 patients was available on line; queries generally were managed by modifying report programs.

Clinical Data Display. The data were available at CRTs in three formats: horizontal and vertical flow sheets and listings of the day’s data. (The printed output included the flow sheets plus plots.) There also were special programs for microbacteriology reporting.

Patient Census. Summary patient data for each admission, protocol start, and outpatient treatment were recorded.

Daily Care Plans. Starting in April 1979, two units used protocol-directed care for the ordering of all morning tests, as well as for the printing of therapy reminders.

Transfusion Summary. This was the first increment in the hemopheresis system. A summary of each patient’s transfusion history was available.

The outpatient units also had access to scheduling programs. Other management and administrative applications were implemented, and—on paper—the Phase I OCIS was indeed a very powerful system. However, what was installed was really a set of programs and not yet a system. It would take years to transform the OCIS into an effective clinical system. This last observation requires some explanation.

For a clinical information system to be effective, it must provide timely access to the data and it must be available without fail. Obviously, no one-computer system can meet the availability requirement. The timeliness condition, however, depends on more than equipment; it relies on people and procedures.

When the Information Center was in its planning stages, Dr. Lenhard recognized that the clinical staff would need the assistance of clinically oriented data technicians who could serve as the links between the users and the OCIS. The title of “clinical data coordinator” was descriptive, but it represented a concept that required several years of experimentation to refine.

In addition to deciding what was reasonable to expect of the clinical data coordinators, we also had to establish procedures for the operational flow. Because the MUMPS system required little operator maintenance, the computer operators doubled as the data entry staff. The system was up six days a week, two shifts a day. Within its equipment constraints, it was reliable. It could process up to 20 concurrent tasks, but as the number of tasks approached 20, the response time degraded. In effect, the Phase I system was a victim of its success. It did many things well, showed great promise, but still fell short of the performance required.

Therefore, a second computer was ordered in 1978. This two-computer system would add reliability to the functionality of the Phase I system. Naturally, it would be called the Phase II system.

The Phase II System

The goal of the Phase II system was to provide the same functionality as the earlier system in a two-computer configuration. In preparing the prototype, we had gone from nothing to a demonstration. We did this quickly, and the potential users were impressed. When we prepared the Phase I system, we went from nothing to an operational system, and this too we did quickly. As work began, the users felt that this would be a very good system once the equipment limitations were overcome. They looked forward to having a Phase II system that would meet their needs.

However, in implementing a Phase II system, we would be going from something that existed to something that worked even better. Surprisingly, this turned out to be a much more difficult task than I had expected. When we started, I told the users that development would take one year and that, when we were done, the new system would look just like the old one. I was wrong. It took three years from the time the second computer was installed (August 1979) until the Phase I system was retired and the new two-computer system was operational (August 1982). What was installed had greater functionality, but its lateness was its key attribute.

This was a hard lesson to learn. The new system consisted of 3500 programs with a data model of 850 relations. A data-processing group would be impressed to know that this was done in three years by a staff of under eight full-time equivalents, who also had to maintain the Phase I system, convert a database with 37,000 patients to a new format, and provide almost uninterrupted service to the clinical staff. But the users went through two years of wondering when the system would be finished, and —no matter how heroic our EDP efforts were—the users were right to complain.

What made the problem so difficult was that the language used for the Phase I system (MUMPS-11) was no longer being supported. It was replaced by the new Standard MUMPS. There were three choices: translate from MUMPS-11 to Standard MUMPS and thereby lose some of the benefits of the new standard, reprogram the entire system in Standard MUMPS, or do something else. We chose the last option.

At the time the decision was made, we were involved in the maintenance of the Phase I system. MUMPS was designed to operate in very small computers, and it therefore used a compact notation to save space. This notation made programs difficult to read, allowed the use of clever tricks, and was a barrier to the maintenance of any code not authored by the maintaining programmer. We have seen from an earlier section that the long-term cost of maintenance is much greater than that of the initial development. By the age of only two years, the Phase I OCIS already had become very difficult to maintain.

Another problem was that the system seemed to have been built as a collection of individual clusters with various levels of completeness. In an information system, one expects that every input will be validated, that user prompts will be based on common conventions, and that help messages will be available. However, this was not always the case with the Phase I system. One could guess from the interface who wrote the code. Moreover, the temptation to introduce 95% complete code into production often led to the acceptance of invalid inputs and a resulting degradation of the database.

In parallel with this conversion decision, some work that offered new insights was underway in BME. Through a generous contract with IBM, BME was lent a Series/1 computer so that we could experiment with it for a patient management application. Recognizing the repetition involved in developing such an application, we began to build a specialized tool called SIMPLE. (The “we” here were Ken Bakalar, Martin Trocki, and myself.) It turned out that what was simple to do was not very interesting. Nevertheless, the result was a different approach to software development that I later came to call TEDIUM.

Thus, in 1979, as we began to plan for the Phase II system, I was convinced that reprogramming the system in MUMPS would be difficult to control and would result in a system that would be hard to maintain. I also had the germ of an idea— TEDIUM—that would make everything much easier (someday). And so, in 1979 we set out to maintain a large system (Phase I), develop a new programming environment (TEDIUM) for a new language (Standard MUMPS), and implement a two-computer version of that system (Phase II). All in one year. No wonder I was wrong.

Fortunately, we knew what we wanted to do; we had our Phase I experience. Among the programming staff, Alan Sacker served as the system manager and kept the systems going. Liz McColligan led the effort on the daily care plans, and Ken Bakalar designed the scheduling language. (Incidentally, we received a great deal of help from Dr. Robert Friedman and Jacki Horowitz of Boston University in the design of the protocol system.) Ping Chang and Sue Powell worked on the plots and flows, Martin Trocki did the microbacteriology reports, Chris Brunn designed the outpatient system, and Jean Causey developed the pharmacy system.

In 1982, as the Phase II system went into operation and the BME staff was phased out, new people were added: Farideh Momeni (who is now the lead OCIS designer) contributed a variety of new features, Mark Borinski helped develop the blood product system, and Jerry Rouch maintained and augmented the abstract and tumor registry system.

This list includes only the development staff. Of equal importance were the contributions of the OCIS users, who made us understand the problems that we had to solve, and of the Information Center staff—clinical data coordinators, computer operators, and administrative support—who made OCIS work. However, in the interest of not making a long chapter even longer, I shall simply acknowledge their collective help.

What are the lessons from the Phase II history? First, the programming profession is always optimistic. An often cited rule of thumb for estimating is to get the best estimate and then double it. (Of course, if the best estimate already has been doubled. . . .) Second, there are two things that we must understand before we undertake a development effort: what we are to do, and how we will implement it. In this case we understood only the former, and the cost of developing TEDIUM and learning to use it impacted the schedule considerably. If we did not have a solid understanding of the first, then I am sure that we would have failed completely.

Finally, one should never underestimate the cost of maintaining and converting an ongoing system. Even though we nominally began work on Phase II in 1979, it was not until mid-1980 that most of the staff was freed from the maintenance activities and available to work on the new system. Also, because the Phase I system was new, it could be installed incrementally. Although some new Phase II features were operational in early 1982, its final installation required the complete replacement of the Phase I system.

Beyond Phase II

Obviously, the Phase II system was completed, and it is successful. It is the topic of the remainder of this book. To conclude the historical narrative, in 1982 we began to interface the OCIS with the clinical laboratory so that results could be batched and transmitted electronically. Later, computer interfaces were developed between the JHOC laboratories and the OCIS. Once this was done, clinicians used the OCIS as the primary mechanism for viewing results; it was the most complete and timely source.

The pharmacy system was developed using the unit dose model first designed by Dr. Simborg. (Its implementation history is contained in Chapter 6.) Links were installed between the OCIS and the associated hospital systems. The OCIS computers were upgraded, new and more powerful Standard MUMPS versions were installed, and both mass memory and the number of on-line terminals grew. A listing of the current configuration is shown in Figure 1 of Chapter 1. In parallel with this, personal computers proliferated throughout the Oncology Center, and small networks were installed in the laboratories and the administrative areas. The Phase III system, which integrates these resources with the Hospital network and the School of Hygiene computer, is described in Chapter 9, along with a brief overview of the future plans for the OCIS and computing in the Oncology Center.

The Software Tools

The history of the OCIS development is closely associated with the tools used in its implementation. The speed with which the early versions were produced and the ease with which the current system is maintained cannot be understood without some introduction to the tools used in its development.

The OCIS is implemented as a set of Standard MUMPS programs. These programs run under a MUMPS system, which may operate either as a stand-alone operating system or under a variety of host operating systems, such as DEC VMS, IBM VM, UNIX,2 and MS-DOS. The OCIS was designed using TEDIUM. TEDIUM is a program generator, written in Standard MUMPS, that generates Standard MUMPS code. The generated code does not require TEDIUM to operate.

In the following two subsections, I present a brief introduction to MUMPS and TEDIUM. The discussion contains some technical detail, and readers who do not have an interest in computing and system development may wish to move directly to the section on evaluation.

An Introduction to MUMPS

The Massachusetts General Hospital Utility Multi-Programming System (MUMPS) was developed in the late 1960s when computers were expensive and resources were limited. Targeted for the new breed of minicomputers, MUMPS was intended to provide interactive support and data management facilities for the kinds of complex file structures that are common in medical applications.

In the 1970s most MUMPS systems used the DEC PDP 11 computers, and there were a variety of proprietary MUMPS and MUMPS-like systems. (The DEC version for the PDP 11 was called MUMPS-11.) In the mid-1970s the MUMPS Users’ Group (MUG) organized a standardization activity, and the new Standard MUMPS was available in time for the Phase II conversion. In what follows, we consider only that standard version and describe only those features that contribute to an understanding of the OCIS.

MUMPS is an interpretative language. With this type of language there is a program, called the interpreter, that reads the program as data and executes its commands. An interpretive language can be contrasted with a compiled language, in which the program is transformed into a machine-processible form by a program called a compiler. This object code is loaded into the computer, which then executes each instruction directly.

2UNIX is a trademark of AT&T Bell Laboratories.

With an interpretative language one is essentially a level away from the machine instructions. The program interacts with the interpreter, and the interpreter interacts with the computer. Thus it is possible to stop a program, examine the values being processed (i.e., the machine state), alter the program, and continue execution. In fact, because the interpreter treats the program as data, one even can pass data to the interpreter for execution. In MUMPS this is called indirection.
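
As a minimal sketch of what indirection looks like (the variable names here are hypothetical and not drawn from the OCIS), the first two lines below build a command as a character string and hand it to the interpreter with the XECUTE command; the last three lines use name indirection, in which the name of the variable to be set is itself held as data.

<tab>S CMD="W !,""Hello from indirection"""
<tab>X CMD
<tab>S NAME="COUNT"
<tab>S @NAME=10
<tab>W !,COUNT

The final WRITE displays 10, because the SET through @NAME assigned the value to the variable named COUNT.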

The advantage of a compiled language is that all processes are converted to efficient machine code. Thus, performance with a compiled language should be better. An interpreter must parse, verify, and execute each line of code every time it is processed. In general, one uses an interpretative language when testing or debugging a program or when one wants to delay a decision about what a program is to do (or how it is to do it). The latter is called delayed, or run-time, binding. Compiled programs always rely on early, or compile-time, binding. That is, all control paths are determined when the programs are compiled, and no changes can be made to take advantage of run-time knowledge.

Except for situations in which delayed binding is necessary, as is the case with some artificial intelligence applications, the efficiency of compilation is preferred. Therefore, some languages offer an interpretative debugger, along with a compiler for the operational system. Most MUMPS systems now use a form of compilation. The system converts each program into a parsed and verified intermediate form called pseudocode or p-code. This code can be processed rapidly by the interpreter, which is also available to support the user’s interactive needs, such as debugging support.

The command structure of MUMPS is organized around the line. Each command can be abbreviated as a single letter. For example,

WRITE "Hello world"

could be written

W "Hello world"

Both will write the message, “Hello world.” Assignment uses a SET command (similar to the BASIC LET), and many variable/expression pairs may be supplied. For example,

S X=1,Y=2*X,Z=(X+Y)/Y

MUMPS commands allow the user to append a conditional to a command. That command will be executed only if the condition has been met. Each line has an optional label separated by a tab character (<tab>) from the command string. The GO command transfers control to another line (or program). For example,

<tab>S I=0
LINE<tab>S I=I+1 D ^PROGRAM G:I<10 LINE

will cause ^PROGRAM to be executed 10 times. The DO command is similar to a call command, and ^PROGRAM is a program, stored on disk, to be called. The same loop can be more succinctly written with the FOR instruction:

F I=1:1:10 D ^PROGRAM

The syntax of the FOR reads, for I going from 1 by 1 to 10, do what follows through the end of this line.

The circumflex (which is referred to as an up arrow by MUMPS programmers) indicates that the object referenced is on the disk. The above statements without the circumflex, that is, D PROGRAM, would transfer control to the label PROGRAM (assumed to be already in working memory). In either case control returns to the next command when the called section of code encounters a QUIT command or reaches the end of the program text.
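
To make this flow of control concrete, the following minimal sketch (the labels are hypothetical, not OCIS code) calls a local section of code with the DO command; the QUIT at the end of SUB returns control to the command that follows the call.

START<tab>W !,"Before the call"
<tab>D SUB
<tab>W !,"After the call"
<tab>Q
SUB<tab>W !,"Inside the subroutine"
<tab>Q

Executing the code from START writes “Before the call,” then “Inside the subroutine,” and finally “After the call.”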

The same convention is used to differentiate between data in working memory (X(1)) and data stored on disk (^X(1)). The latter is called a global. Thus, one can read or write from disk using the same commands that one uses when dealing with data in working memory. For example,

S X=1,Y=^Z(1)*X,^Z(Y)=X

sets the local variable X to 1, sets the local variable Y to the value of the global Z with the index 1 (multiplied by X), and stores in the global Z with the index Y the value of X. Notice that the line includes one read from disk and one write to disk.

Although the syntax and commands are interesting, the real power of MUMPS is in its data manager. Data (in both working memory and permanent storage) are organized as a sparse array. In a sparse array, storage is allocated only when an array value exists. For example, in a traditional language, a 3 by 4 matrix, say M(3,4), will be allocated 12 storage locations. In the sparse array of MUMPS, however, storage is allocated only when a value is assigned through either a SET or READ command. Thus,

S M(1,1)=1,M(3,2)=2

will set only two locations in the matrix M. The remaining 10 matrix elements are undefined, unless, of course, they had been set prior to this line.

The sparse array concept allocates storage only for those array elements to be stored. Because MUMPS is a nondeclarative language, the programmer need not declare how much storage is to be allocated for an array (or global). The statement

S M(1)=1,M(1,1)=2,M(1,2,3)=3

establishes M as a hierarchy with entries at three levels. This statement is similar to the previous statement in which M was viewed as a matrix. The difference is in how the programmer will use M. (Note that in the second case M(1,1) has a value associated with it, whereas M(1,2) points to the node M(1,2,3) but may not have a value associated with it.)
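
Standard MUMPS provides the $DATA function for asking exactly this question about a node. A minimal sketch, reusing the array M from the statement above, shows the four values it can return.

<tab>S M(1)=1,M(1,1)=2,M(1,2,3)=3
<tab>W !,$DATA(M(1,1)) ; writes 1: the node has a value but no descendants
<tab>W !,$DATA(M(1,2)) ; writes 10: no value, but a descendant exists
<tab>W !,$DATA(M(1)) ; writes 11: both a value and descendants
<tab>W !,$DATA(M(2)) ; writes 0: the node is undefined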

As we have seen, the sparse array is a very powerful and flexible structure. Its greatest strength, however, lies in the fact that index terms may be character strings. In this way, one can extend the sparse array to implement a hierarchical patient file:

"PAT (“ 1234567”)=“J ones, Mary #WF35”

"PAT(“ 1234567”, “ADR”, 1)=“10 Main Street”

"PAT (“1234567”, “ADR”, 2)=“ Any to wn, MW 12345”

"PAT(‘T234567”, “PROB, “87/01/12”) = “Head cold”

In the first node, the # is used as a separator; the patient name field is of arbitrary length, and the # indicates the end of the name and the start of the race/sex/age field. (There are many MUMPS commands and functions that facilitate operating on character strings like this.)

The circumflex indicates that the data will be stored on disk. All entries (records) will be stored in lexical or numerical order. For that reason, the date is shown in the form of year/month/day. Given this ordering of the globals, one can construct a name index as follows:

"PATNAME(“Jones,Mary”,“1234567”)=””

In this case there are two index terms; the second is required to allow for two different Mary Jones entries. Because one is interested only in the index terms, the value assigned to this sparse array node is the null string, that is, a character string with zero length. (Because all globals are stored in lexical key order, MUMPS does not need a sort routine. Sorting is done by writing a global with the desired index.)
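As an illustration of how such an index might be used, the standard $ORDER function (again, not otherwise introduced in this chapter) returns the next subscript of a node in collating order. The following minimal sketch, with an invented label, lists every chart number filed under the name Jones,Mary:

S ID=""
NEXT S ID=$ORDER(^PATNAME("Jones,Mary",ID))
Q:ID=""
W !,ID
G NEXT

Because the index entries are already stored in key order, the chart numbers appear in ascending order without any explicit sorting step.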

The global notation can be extended to identify which partition in the MUMPS system stores a global. In the OCIS case, development is done in a partition called DEV, and the production system is maintained in a partition called PRD. To test programs in DEV, one can read production globals with the notation

"[“PRD”]GLOBAL(... .

This notation can be extended to multiple machines as well. In the OCIS, some globals are stored on the OLD machine and some on the NEW machine. Thus, the OCIS tends to use the general form for a MUMPS distributed system of

"[“OLD”][“PRD”]GLOBAL(....

From this very brief introduction to MUMPS it is clear that it is a very powerful and flexible language. What are its shortcomings? First, the ability to do so much so easily is an invitation to create programs that are difficult to maintain. For example, in the previous illustrations, what does the sparse array M stand for? It would be almost impossible to understand its structure without documentation. Yet the dynamic, friendly MUMPS development environment does not encourage the creation and maintenance of good documentation. Of course, this is a human problem that is not limited to the MUMPS programming language.

The problem is compounded, however, by the terse MUMPS style; it is difficult to manage a large system and understand the code of other programmers.

Because of the limitations of MUMPS, the difficulty in maintaining a consistent system style, and the need to provide tools that would facilitate the evolution of the OCIS, I chose to implement the Phase II system using a development environment called TEDIUM. Before describing TEDIUM, let me observe that this was a very bold, perhaps even foolish, high-risk decision that turned out to be a good one.

One objective of this chapter is to offer lessons learned. Consequently, I conclude this subsection with the following observation. MUMPS, by itself or with some of the available productivity tools, offers an excellent environment for some types of system development. As with most languages and environments, there are deficiencies. Yet one should not rush to design new environments that correct those deficiencies. As the stunt driver tells the enraptured teenagers in his audience, please don't go home and try this with the family sedan. With that word of warning, let me now tell you how I "solved" the problem of developing software.

An Introduction to TEDIUM

The Environment for Developing Information Utility Machines (TEDIUM) was first implemented in 1979. It was written in TEDIUM and MUMPS, and it went through several iterations. The version of the system that I report on here was frozen in 1982. I shall not try to describe the history of TEDIUM or how I envisioned its use in the Phase II development activity. Rather, I shall concentrate on how I understand TEDIUM today. The goal is to provide a background for understanding the OCIS evaluation of the next section.

In several places I have made the point that, in software development, there are two separate problems to be solved: what the application is to do, and how to detail its implementation. In theory, if one can specify fully what the application is to do and also define all the programming conventions to be followed, then one should be able to give these instructions to a programmer and, without further dialogue, receive a correct program. In this view, the designer decides what is to be done, and the programmer translates this decision into code.

TEDIUM follows this division of the problem domains. The designer works within the context of a system style. This defines all the standards and conventions to be used by the programmer; for example, every input procedure must perform validity checking, and every user interaction must offer a help message. Within this style, the designer specifies what is to be done. These specifications are very compact; they reuse information already specified and constructs already defined. They are called, therefore, minimal specifications.

The minimal specifications describe what the program is to do from the perspective of the application. That is, the designer is asked to visualize how the user will interact with the system. Implementation details are avoided unless (a) they are required for efficiency purposes or (b) the TEDIUM specification language has no effective way to describe the desired actions. Thus, the designer is not a programmer; he is an analyst who should describe the system as it is to be used.

Once the minimal specification and system style are available, there is little creative activity left. If, in theory, a programmer could take these documents and, without further instruction, create a correct program, then there is no reason why the programmer’s actions should not be formalized and automated. This is what TEDIUM attempts to do. It views specification as a value-adding activity, but it considers programming to be only housekeeping. The latter does not add to our determination of what the application should do. Therefore, TEDIUM uses program generation to create programs from the specifications. That is, it automates the programming.

The ability of TEDIUM to reduce the development process to specification has several advantages. First, referring back to the section on software engineering, we know that the number of "lines of code" produced per effort day is independent of language. Consequently, a higher level language will result in more functionality for a fixed effort. Second, because the greatest life cycle cost is associated with maintenance, a compact and application-oriented specification should be relatively easy to maintain.

The greatest advantage of the TEDIUM program generation approach, however, is derived from the fact that most TEDIUM products are of high application risk. That is, it is not clear that the designer knows what is required, and therefore the resultant specification may not correspond to the user’s needs. With TEDIUM the designer may specify an application (or part of it), rapidly generate the code, and test the behavior of that specification.

The process of incrementally testing a prototype system is often called rapid prototyping. One develops a test system, experiments with it, and then uses the knowledge gained for the implementation of the next prototype or the final product. In most cases each prototype is discarded, and only the experience with its use is retained.

TEDIUM uses a different model for prototyping. In this case a sculpture metaphor is used. One begins with a model of what is desired. This model is shaped, perhaps recast, until the result is aesthetically pleasing, that is, until it corresponds to the users' needs. At this point the final prototype becomes the delivered product. (The program generator ensures that all conventions defined by the system style are included in each prototype; thus, each prototype is functionally complete, i.e., it is never simply the shell of a target application.) The method is called system sculpture.

On the basis of this description, TEDIUM may be viewed as an application design language that generates programs. The reason it can generate programs is that the domain of information systems is relatively mature; we understand how to implement them. The TEDIUM version used for the OCIS generates MUMPS code. That code may not always be as efficient as custom-crafted code. Fortunately, there are enough good MUMPS implementations available so that slight losses in efficiency do not impact overall performance.

Figure 5. Three levels of application representation.

Even though the generated code is MUMPS, there is little in TEDIUM that depends on MUMPS. The objects that TEDIUM works on are called applications. All specifications are stored in an application database (ADB), and the contents of the ADB are shared by all specifications in that application. The ADB contains two levels of information. This is shown in Figure 5.

The top level, called the descriptive level, describes what the system should do. This is presented from three perspectives.

Requirements. What the application is expected to do. This often is a definition of the system's objectives and goals without any operational specifics.

Processes. Descriptions of the processes that the application is to support. Frequently, these describe procedural flows.

Data Groups. Descriptions of the entities to be modeled in the database.

There are links among the three perspectives, and each is described in both outline and text form.

At the next, or conceptual, level there are two types of specification. The semantic data model describes the database as a relational data model with additional constraints. For example, the patient data structure might be defined, in part, as

PAT(PATID)=NAME,RACE,SEX,AGE

This represents a symbolic specification for the MUMPS global example given above. With this definition, the TEDIUM commands can refer to NAME, RACE, or SEX by name without concern for how the variables are stored or accessed; that is, the housekeeping is eliminated.

In this illustration, the variable PATID is underlined. This is an example of a semantic constraint. It implies that the table (relation) PAT serves as a dictionary for all valid values of PATID. Every input of PATID, anywhere in the application, will be validated against the PAT table automatically. Again, we see how TEDIUM concisely captures the designers’ intent and distributes it throughout all the programs in the application.

The second class of specification is for programs, which determine the functions to be implemented. There are two types of program specifications:

Generic Specifications. These specify often-used functions such as the creation of a menu or the implementation of a file management program for a table. These tend to be declarative, but procedural commands can be inserted.

Common Specifications. These are written using the TEDIUM command language; they are always procedural.

The TEDIUM command language permits the designer to specify the program flow with very little housekeeping overhead. For example, the command

Input PATID

will generate code that prompts for a value of PATID, checks the input for existence in the table PAT, performs any other input validation, and responds to a help request by listing out the definition of PATID from the data dictionary. Once a valid PATID exists, one could set the patient's name with the command

Get NAME in PAT

The conventions used for the interactive dialogue and data access are all established by the system style.
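To suggest what is generated, the following MUMPS fragment is a hypothetical sketch of the kind of code that Input PATID might expand into. It is not actual TEDIUM output, and the prompt, messages, and label are invented; it merely illustrates the prompting, help response, and dictionary validation against the PAT table that the system style demands:

ASK W !,"Patient ID: " R PATID
I PATID="?" W !,"Enter the patient's history number." G ASK
I PATID="" W !,"A patient ID is required." G ASK
I '$DATA(^PAT(PATID)) W !,"Unknown patient ID." G ASK
;PATID is now valid; Get NAME in PAT would then retrieve the name, for example
S NAME=$PIECE(^PAT(PATID),"#",1)

None of this housekeeping is written by hand in TEDIUM; the single command Input PATID stands for all of it.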

The ADB allows the designer to associate tables in the data model with data group descriptions and program specifications with the process descriptions. The program generator uses the program specifications and the associated data model definitions to produce MUMPS programs. These represent the lowest, or implementation, level. The generated programs are maintained by the MUMPS environment. Because they do not include any information that is not already in the ADB, they are not considered to be part of the ADB.

Notice that the ADB contains two categories of information. At the descriptive level there is subjective information. Text is used to convey concepts; decisions are made regarding what should be included and what level of detail should be supplied. Even the links between the descriptive and conceptual objects are based on a subjective determination of what will be helpful in the context of design, operations, or maintenance. At the conceptual level, on the other hand, the objects represent a complete specification of some aspect of the application design. They always describe exactly what is implemented; therefore, we consider them to be objective documentation.

During initial implementation the ADB can be considered the application design as created by the system sculpture process. It reduces redundancy, identifies inconsistencies, and provides the designer with limited local views of the application. The average program specification is 15 lines in length and normally can be displayed as one or two CRT pages. The text in the ADB is used for help messages, manual preparation, and application documentation.

From the maintenance perspective, the ADB may be viewed as a knowledge base that describes the application. The subjective information provides a road map to the system. It suppresses detail and assists the designer in identifying what conceptual objects relate to the problem being investigated. The objective information adds detail and describes the system completely. Cross-references are updated with each generation; they identify which programs call (are called by) which programs and access (read or write) which tables. Thus, the designer can sit down at a terminal, interactively identify a program or table, see how that object interacts with other objects, review all definitions, and then make and test changes. Because the objective documentation is the generator’s input, this documentation is always complete and up-to-date.

On the basis of this brief introduction, one would expect TEDIUM applications to be more compact than the equivalent MUMPS code, always well documented, and described in application, rather than implementation, terms. The price that one pays in using TEDIUM is the loss of the convenience of interpretative debugging, the need to specify a program completely before it can be tested, and the occasional rough edges associated with a locally developed product. Clearly, it is more fun to write programs in MUMPS, but the production of an operational system entails considerable tedium. TEDIUM addresses the second issue; how well it does that is examined in the following section.

Table 1. Growth of the OCIS Database

Type of Database Entry                August 1982      1986      1987
Patients                                   37,000    50,000    56,000
Tumor Registry entries*                    33,000    41,000    44,000
Patients admitted to JHOC                  15,000    18,000    22,000
Patients with clinical data                 8,800    18,000    22,000
  Patient days                            216,000   458,000   557,000
  Unique data points                       4,500K   14,000K   16,000K
Patients with microbacteriology data        3,000     7,000     9,000
  Patient days                             16,000   109,000   133,000
  Reports                                 200,000   365,000   428,000

*Total hospital patients.

OCIS Development Evaluation

This chapter has considered the history of the OCIS with a special emphasis on the development of its software. An evaluation of the OCIS as a clinical tool is distributed throughout the following chapters, which describe its features. In reading those evaluations, however, one must keep in mind that the OCIS was designed for a cancer center for the purpose of delivering a level of care that could not be supported effectively with a manual system. Thus, the OCIS was intended to enable this type of care; it was not developed to justify its existence on the basis of cost savings.

Today, the staff cannot order morning laboratory tests, view in-house laboratory results, examine the Tumor Registry, schedule a visit, or review a local medical record without using the OCIS. Many clinicians are not aware that they are using OCIS; they think of it as a set of specialized applications used by others. Nevertheless, the JHOC could not operate without the OCIS. Moreover, the existence of the OCIS acts as a base for the addition of integrated features that could not be implemented independently. Table 1 illustrates the size, scope, and growth of the OCIS database; it suggests how the JHOC has come to rely upon the OCIS.

I suggest that it is best to evaluate the OCIS in the context of the visions of the JHOC founders, Dr. Owens and Dr. Lenhard in particular. The next chapters will provide anecdotal examples that justify the system's worth, clinically as well as financially.

Table 2. OCIS Development Effort

Organization               Staff by year, 1975-1987 (in chronological order)
Oncology Center*           1, 2, 2, 2, 4, 7, 7, 7, 5, 5, 5
Biomedical Engineering*    1, 1, 2, 5, 7, 8, 5, 3, 1
Total FTE                  0.5, 0.5, 2.5, 6, 6.5, 8.5, 7, 6.5, 6, 6, 4.5, 4.5, 4.5 (total 63.5)

*Counts individuals, not all full time.

However, from a historical perspective, it would be improper to suggest that the OCIS had a set of well-defined, clearly stated (and therefore evaluatable) objectives, for example, to realize an economic benefit. The OCIS was not a research project when it was started a decade earlier, and it was never supported by research funds. The goal was to use the available technology (which was very expensive by today's standards) to implement a system that would aid in patient care. That it was successful (and incidentally saved money) is the justification for this book.

Of course, the OCIS is also a software product, and I am now engaged in software engineering research. Therefore, it is fitting that I conclude this long and somewhat egocentric chapter with an evaluation of the OCIS development history. This discussion will be limited to the Phase II system. Table 2 summarizes the development and maintenance effort over the system's lifetime.

There are several ways of evaluating a software effort. The term “evaluate” usually implies an evaluation with respect to a standard or set of goals. In the section on software engineering, two quality goals were identified. Verification was concerned with the correctness of a system with respect to its specification. Inasmuch as the OCIS is generated from its specifications, it must be correct. The fact that it is used in life-endangering situations suggests that its users also perceive it to be defect free. The second quality measure, validation, considers how well the system meets the environment’s needs. The following chapters provide ample evidence of its correspondence.

Another way of evaluating a software project is to compare its development history with that of other similar projects. But TEDIUM introduces a new development method (system sculpture), and it is not clear how the OCIS results relate to other projects. Thus, even though this is a section on evaluation, I will not evaluate the OCIS project. Rather I will present some characterizing metrics that describe both this project in particular and the software process in general.

To begin, I note that there is very little difference in this project between development and maintenance. Organizationally, the same staff members and users are involved. The OCIS is viewed as a resource to support the JHOC needs; it is not a product to be built, delivered, and frozen. Thus, one would expect the OCIS to grow continuously. Moreover, one would expect older programs to be replaced by newer programs. Tables 3 and 4 show that both these expectations are realized.

Table 3 presents the relative growth of the OCIS. Each program is about 15 lines long. Using a formula to include the data definitions, the OCIS application consists of about 85,000 "lines of code."

Table 3. Growth of the OCIS Definition

Year      Programs    Tables    Elements
1982          2177       456        1251
1983          3662       848        2025
1984          5024      1045        2398
1986          5541      1375        2613

Table 4. New Programs by Year*

Year      New Programs
1980                17
1981               625
1982              1388
1983               990
1984              1014
1985               737
1986               770

*In production use, December 1986.

This code is, of course, its minimal specification. An earlier study showed that each TEDIUM specification line produced 4 lines of MUMPS code. Thus, the specification is the equivalent of 340,000 lines of MUMPS code. (Because a TEDIUM program may generate more than one MUMPS routine, there are over 9000 OCIS MUMPS routines.)

There was also a study that indicated that one line of TEDIUM would provide the functionality of 20 lines of COBOL. If that is true, then the OCIS would be a 1,700,000-line COBOL application. The point is, there is a difference between the size of the problem specification and the size of the implementation. The more compact the representation, the easier it will be to build and maintain the product. Bigger clearly is not better.

Table 4 shows the number of new programs that were written each year. The data are taken from a 1986 analysis of the programs then in production use, but the general pattern is clear. In 1980 very little was done. Most work involved maintenance of the Phase I system and learning about TEDIUM. The 1981 effort also was hampered by the TEDIUM training (and debugging) effort. Nevertheless, a staff of seven full-time equivalents produced 625 tested programs — about one program for every two effort days. By 1982, the staff was familiar with TEDIUM, and individual productivity doubled. Tested programs were being installed at the rate of one per effort day.

In the previous section it was stated that a goal of TEDIUM was to have the designer specify the programs from the perspective of the application. It has been shown already that this specification approach is compact, that is, minimal.

Table 5. Measures of Program Generation by Year

                        1981    1982    1983    1984    1985    1986*
Number of programs       625    1377     950    1014     731     430
Mean generations       25.14   18.08   13.48   12.78   10.32    7.03
Median generations        27      13      10       8       7       5
Mode of generations       18       8       4       5       3       2
90th percentile           44      37      29      27      25      15

*Through September.

Table 6. Generations per Program, by Percentile (from a 1984 analysis)

Percentile    Upper Bound (Number of Generations)
    10              3
    20              5
    30              7
    40              9
    50             11
    60             14
    70             17
    80             22
    90             31

The question now is: Does TEDIUM allow the designer to express himself effectively? To provide insight into this property of TEDIUM, I have recorded each time that a designer edits a specification. A small number of edits (or generations) indicates that the language is expressive. A large number of edits may have many meanings; for example, the problem is not well understood, the designer incrementally develops programs, or the environment is not expressive.

Table 5 summarizes some measures of generation (editing) activity by year. Notice that the numbers are higher in the earlier years. This reflects two facts. First, the early years involved more learning; thus more errors would be made in debugging. Second, older programs will be more subject to revision as changes are made to the system. The data do show, however, that TEDIUM is an expressive environment. On the basis of the 1986 data, an average of only six edits is required for a production program. Table 6 presents some data from a 1984 analysis. In this case only half of the 5000 OCIS programs had been edited more than ten times; this number included all debugging and maintenance changes.

The final two tables offer some insight into the OCIS maintenance activity. Since 1985 all maintenance has been performed by a staff of four programmers and a system manager who programs half of the time. This staff is responsible for maintaining 1600 programs that they did not write, plus maintaining and adding to a large, complex application. Only one member of this staff had professional programming experience prior to employment by the JHOC.

Table 7 displays the activities of the principal contributors to the OCIS. The BME staff members are identified by a B, and the JHOC members are denoted by an O. The table shows the number of programs in production use in 1986, by year and designer. Table 8 shows the average number of program generations (edits) for the data in Table 7. One can recognize the learning process. In the case of O4's 1982 and 1983 activity, the large number of programs produced is the result of copying and modifying existing programs. This reuse obscures true "productivity." Still, the data in Table 8 clearly show that, as the designers learn their job and how to use their tools, their profiles become almost identical; there are few individual differences.

Table 7. Programs in Production Use (1986), by Designer and Year Written

Designer    Programs by year, 1981-1986* (in chronological order)
B1          228, 228, 7
B2          25, 48, 16
B3          38
B4          98, 14
B5          50, 236, 101, 30
B6          32, 88
B7          17
O1          45, 27, 48, 9
B8/O2       226, 258, 214, 159, 97
O3          3, 15, 22, 10, 29, 6
O4          38, 256, 292, 390, 174, 161
O5          31, 171, 143, 254, 233, 115
O6          23, 57, 94
O7          33, 5, 3
O8          135, 49

*January through September only.

Not shown in all these tables is the fact that the OCIS staff is dedicated and stable. Few people have left the JHOC, and those who now make up the Information Center have a clear understanding of the OCIS, the JHOC, and its users’ needs. The members of the staff have gone through a long apprenticeship, and they are very familiar with both the application domain and the OCIS implementation. This fact is a necessary condition for the system’s development and maintenance success.

To conclude this evaluation, the OCIS was developed using TEDIUM, and now it is necessary to live with TEDIUM or reprogram the entire system. At one point, an objective analysis of the system was conducted and other development approaches were considered. There were no feasible alternatives. There is anecdotal evidence that TEDIUM provides an effective environment for the OCIS. Some TEDIUM users, who were reluctant at first to work with the environment, have told me that they could not maintain the OCIS without it or that they would be pleased to act as a reference for other potential users. Someone who installed the OCIS in Australia said that he could not have accomplished that task had it not been for TEDIUM.

Of course, I find the TEDIUM endorsements gratifying; I continue to use it as a tool in my current research. However, I find even more satisfying the fact that I could participate in the development of the OCIS, walk away from the JHOC in 1983, and come back to see the system in use, enhanced, and an integral part of the JHOC operations. In closing this section on evaluation, I am reminded of Dr. Octo Barnett’s three criteria for evaluating a clinical system: Will people use it? Will people pay for it? Will people steal it? The first two questions were answered in the affirmative in the 1970s. Let us hope that this book will contribute to the theft of our concepts.

Table 8. Average Number of Generations per Program, by Designer and Year

Designer    Average generations by year, 1981-1986* (in chronological order)
B1          23.8, 11.9, 3.6
B2          22.2, 15.0, 9.1
B3          25.4
B4          27.0, 23.0
B5          25.5, 16.4, 13.5, 13.5
B6          23.1, 16.5
B7          13.3
O1          16.3, 12.9, 10.2, 12.7
B8/O2       23.1, 14.0, 13.0, 9.2, 6.0
O3          14.3, 9.4, 9.5, 5.4, 18.6, 5.3
O4          42.4, 19.6, 12.5, 12.6, 10.1, 5.8
O5          33.0, 24.3, 13.5, 14.1, 10.3, 9.5
O6          16.0, 22.0, 10.9
O7          12.6, 27.2, 12.0
O8          10.3, 7.5

*January through September only.

References

There clearly is no need for any OCIS references. However, for readers interested in the software process, there is a collection in this series called Implementing Health Care Information Systems (H.F. Orthner and B.I. Blum, Editors, Springer-Verlag, New York, 1989). It includes chapters on software development, MUMPS, and TEDIUM. For readers wishing additional information about MUMPS, the MUMPS Users' Group offers primers and reference manuals. Their address is MUG, 4321 Hartwick Road, College Park, MD 20740. A more complete discussion of TEDIUM is available in B.I. Blum, TEDIUM and the Software Process, MIT Press, Boston, MA, 1989.
