Y2K: The cost of test

Testing is projected to be one of the most time-consuming and expensive aspects of the year 2000 (Y2K) remediation process. In fact, leading analysts have estimated that code testing will amount to somewhere between 50% and 60% of the entire project cost. However, their projection assumes that code modifications will be made in much the same manner that other types of code maintenance changes have historically been made, and that the same type of testing is required to achieve acceptable quality levels in production.

While testing is -- and will continue to be -- required to find and fix errors introduced during the code-modification process, many of the costs projected for year 2000-related code testing can be eliminated entirely. These major cost reductions are accomplished by introducing automated testing techniques during the renovation process that greatly reduce the need for unit test, component test and system test phases after the code is modified.

This article examines the testing costs that have, to date, been associated with year 2000 code renovation, and shows how automated testing techniques can effectively collapse the time required for, and the costs incurred during, post-renovation testing.

Various studies of software costs show that while initial investments in application development (design, coding and testing) can be substantial, they are quickly surpassed by the recurring cost of code maintenance (changing and improving the code). Studies have shown that over a life cycle of six years, the cost of maintenance is roughly 10 times the original cost of developing and testing the code. Moreover, the cost of testing is a larger fraction of the overall cost in the maintenance phase than it is in the new code development phase. As a result, testing for most maintenance projects has amounted to somewhere between 50% and 80% of the total project cost. Industry projections that year 2000 project testing will be 50% to 60% of the total project costs are consistent with the lower range of these studies.

Developing economic models for year 2000 maintenance

Converting virtually every application in an enterprise to year 2000 conformance is a task unprecedented in any corporation. The portfolio of Cobol business applications alone, for example, comprises more than 200 billion lines of code worldwide.

As software industry consultants began to develop economic models to estimate the cost of year 2000 maintenance in the mid-1990s, they relied heavily on information about traditional software maintenance. As a result, these models built in assumptions about:

  • The strategy used to change the code;
  • The amount of change required;
  • The quality of the code in the changes; and
  • The quality required for putting the code back into production.

It is important to examine each of these assumptions and to calculate the costs based on the true nature of the renovation approach being used, rather than on historical methods of calculating the cost of testing.

THE STRATEGY USED TO CHANGE THE CODE. There are at least two common techniques used to change code to operate correctly in the year 2000 and beyond. The first, field expansion, expands all year fields from two digits ("98") to four digits ("1998"). This technique is the most effective in dealing with all possible year values, but requires extensive changes to files and databases that currently contain two-digit year values. Changing both the programs and the data represents not only a larger risk, but a greater possibility that changes will be done incorrectly.

The second technique, windowing, keeps the two-digit fields that exist in the applications today but causes the programs to interpret the value of the year field according to a specified 100-year window. If the window is specified as 1980 to 2079, for example, a value of "22" will be interpreted by the program as "2022." A window can be either fixed (always in the same 100-year range) or sliding (the window moves each year to be a fixed number of years before and after the current year). This means that in 1998 a sliding window of -18 to +81 would range from 1980 to 2079, while in 1999 it would range from 1981 to 2080.
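The two window types described above can be sketched in a few lines. This is an illustration only, written in Python rather than Cobol; the function names and the years_back parameter are assumptions made for the example and do not come from any particular renovation tool.

```python
def expand_year_fixed(yy: int, window_start: int = 1980) -> int:
    """Interpret a two-digit year inside the fixed 100-year window
    that begins at window_start (1980 to 2079 by default)."""
    if not 0 <= yy <= 99:
        raise ValueError("two-digit year must be between 0 and 99")
    century = window_start - window_start % 100  # e.g. 1900 for 1980
    year = century + yy
    if year < window_start:
        # Values that fall before the window belong to the next century.
        year += 100
    return year


def expand_year_sliding(yy: int, current_year: int, years_back: int = 18) -> int:
    """Interpret a two-digit year inside a window that starts
    years_back years before current_year and moves with it."""
    return expand_year_fixed(yy, window_start=current_year - years_back)


print(expand_year_fixed(22))                        # 2022
print(expand_year_fixed(98))                        # 1998
print(expand_year_sliding(80, current_year=1998))   # 1980
print(expand_year_sliding(80, current_year=1999))   # 2080
```

The last two calls show the practical consequence of a sliding window: the same stored value "80" is read as 1980 in 1998 but as 2080 in 1999, because the window has moved forward by one year.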

Windowing is currently the most common change strategy used for renovating application code. In that context, this article will show ways to minimize both the time and cost of testing done in the end-user environment when a windowing strategy is applied.

THE AMOUNT OF CHANGE REQUIRED. The amount of change required for year 2000-related code renovation is typically determined by measuring the date density in the programs being renovated. In reviewing published industry averages, it appears that date density is approximately 6%. In other words, for every 1,000 lines of code (KLOC), there are, on average, about 60 lines of code that would either define a date field or perform a calculation on at least one date field.

For windowing, existing Data Division statements do not need modification. However, Procedure Division code is modified and some temporary variables are inserted. Experience in renovating Cobol code has shown that the Procedure Division is about 50% of the code, or 500 lines per KLOC. Six percent of the Procedure Division, or 30 lines, involves dates. In reviewing modified code, it appears that 10% of the lines involving dates, or three lines, require modification for renovation using windowing.

Modifications that are done manually or with early-generation automated tools vary widely in the amount of code they add. Some may add as few as eight to 10 lines of code per line modified; others will add as many as 25 to 30 lines of code per line modified. If we pick a number somewhere in the middle -- 16 lines of code changed per modification, which is not unusual -- and multiply it by the three lines of Procedure Division code to be changed per KLOC, the result is 48 lines of code changed per KLOC.
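The chain of estimates above is simple enough to restate as a short calculation. The sketch below uses only the figures quoted in this article; the function and parameter names are invented for illustration.

```python
def changed_lines_per_kloc(
    procedure_share=0.50,             # Procedure Division share of the code
    date_density=0.06,                # share of those lines that involve dates
    modified_share=0.10,              # share of date lines needing modification
    lines_added_per_modification=16,  # mid-range figure used in the article
    lines_per_kloc=1000,
):
    """Estimate lines of code changed per KLOC under a windowing strategy."""
    procedure_lines = lines_per_kloc * procedure_share   # about 500
    date_lines = procedure_lines * date_density          # about 30
    modified_lines = date_lines * modified_share         # about 3
    return modified_lines * lines_added_per_modification


# Low, middle and high values for lines added per modified line.
for added in (8, 16, 30):
    total = changed_lines_per_kloc(lines_added_per_modification=added)
    print(added, round(total))
```

With the low and high ends of the quoted range, the same model gives roughly 24 and 90 changed lines per KLOC, which shows how sensitive the estimate is to the tool or method used to make the change.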

THE QUALITY OF THE CODE IN THE CHANGES. The normal bug rate for new application development prior to entering unit test ranges from 30 bugs per KLOC (implying a very good development process) to several hundred bugs per KLOC (implying a poor development process). A typical error rate at this stage is 100 bugs per KLOC, which we will use to extrapolate the bug rates for maintenance code.

Using the statistics on maintenance programming from the book Practical Software Maintenance by Thomas Pigoski (New York: Wiley Computer Publishing, 1997), it is estimated that the bug rate in maintenance code (new, deleted and changed lines of code) is up to 7.5 times the bug rate in new code. This amounts to 750 bugs per KLOC (7.5 x 100 bugs per KLOC). Assuming that maintenance to date-sensitive code may be less error-prone, we will use half of that rate (375 bugs per KLOC, or 0.375 bugs per LOC) for our analysis purposes. The 48 changed lines per KLOC described above would then result in about 18 bugs per KLOC (48 changed lines of code per KLOC x 0.375 bugs per changed line of code) for code changes done manually or using first-generation automated systems. We will examine the differences in first-generation, second-generation and more advanced automated systems, as well as their effect on the cost of testing, later in this article.
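The bug-rate estimate can be written out the same way. The sketch below simply restates the arithmetic in this section; the names are illustrative, and the 50% discount for date-sensitive code is the article's own working assumption rather than a measured value.

```python
def introduced_bugs_per_kloc(
    changed_lines_per_kloc,
    new_code_bugs_per_kloc=100,   # typical pre-unit-test rate for new code
    maintenance_multiplier=7.5,   # upper bound taken from Pigoski
    date_code_discount=0.5,       # article's assumption: half the maintenance rate
):
    """Estimate bugs introduced per KLOC of renovated code."""
    maintenance_rate = new_code_bugs_per_kloc * maintenance_multiplier  # 750 per KLOC
    bugs_per_changed_line = maintenance_rate * date_code_discount / 1000
    return changed_lines_per_kloc * bugs_per_changed_line


print(introduced_bugs_per_kloc(48))  # 18.0
```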

THE QUALITY REQUIRED FOR PUTTING CODE BACK INTO PRODUCTION. The final assumption of the industry code model is that a certain number of bugs are permissible in production software. To date, there are no large software systems that have proven to be bug-free. Ultimately, this means that code going into production has residual bugs in it. In the many software projects this author has studied, the residual bugs in production code encompassed a broad range (from a low of three bugs per KLOC up to 60 bugs per KLOC, with 30 bugs per KLOC being normal). It is important to note that residual bugs can be tolerated because they are in code that deals with rarely seen conditions.

All of the various software-failure models in existence confirm that the time and cost required to remove a bug during any testing stage are exponentially greater than the time and cost of removing that same bug at a previous stage. The typical quality level for normal code, 30 bugs per KLOC, will be used as our target quality level. However, the examples will work at any desired quality level.

Given that the typical quality level of production code is 30 bugs per KLOC before year 2000 renovation, and that the typical manual renovation process introduces approximately 18 bugs per KLOC, almost all of the bugs introduced by renovation must be removed to get the code back to its acceptable (pre-renovation) quality level. This translates into significant testing on the part of the user.

Achieving "production quality"

The purpose of testing is to ensure that when software goes into production, the number of residual bugs represents a reasonable risk to the enterprise using the software. In testing, the cost of identifying and removing bugs is exponentially higher at each successive testing phase.

Naturally, there are several software testing models. A commonly used testing model, which does testing in four distinct phases, is described below.

  • Unit test is the testing done by an individual programmer prior to testing a particular code unit with other code units. Bugs internal to that code are removed, but unit testing is not effective in removing bugs in the code's interfaces with other code units.
  • Component test is the testing of two or more code units done prior to that code becoming part of a system. Component testing removes bugs in the code's interfaces with other code units, as well as internally within a code unit. However, removal of bugs internal to the code unit during component test is far more costly than removing them during unit test. Additionally, component test is not effective in removing bugs that span multiple code units.
  • System test is the testing of a complete application done prior to production. It removes bugs in code that processes data spanning multiple code units, but is not cost-effective in removing bugs that could have been found in either unit or component tests.
  • Acceptance test is testing that is done by, or on behalf of, the end user to demonstrate that the code is ready for production. If the system test is done correctly, the acceptance test should not reveal bugs. Acceptance test is not part of the bug-finding test process.

In a normal code-development process, where three phases of testing consume 50% of the cost of development, each test phase removes 60% to 80% of the remaining bugs. For illustration purposes, we will use a 70% removal rate for each phase of testing (not including acceptance test). Thus, the 18 bugs per KLOC introduced by a typical manual year 2000 renovation process will be reduced by unit test to 5.4 bugs, then reduced by component test to 1.6 bugs, and finally reduced by system test to 0.5 bugs when entering production.
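The phase-by-phase reduction described above can be reproduced with a short loop. This is a sketch using the article's illustrative 70% removal rate; the function name and the rounding to one decimal place are choices made for the example.

```python
def residual_after_phases(
    initial_bugs_per_kloc,
    removal_rate=0.70,
    phases=("unit test", "component test", "system test"),
):
    """Return (phase, bugs remaining per KLOC) after each test phase."""
    remaining = initial_bugs_per_kloc
    results = []
    for phase in phases:
        remaining *= 1 - removal_rate  # each phase removes the same share
        results.append((phase, remaining))
    return results


for phase, remaining in residual_after_phases(18):
    print(f"{phase}: {remaining:.1f} bugs per KLOC")
# unit test: 5.4 bugs per KLOC
# component test: 1.6 bugs per KLOC
# system test: 0.5 bugs per KLOC
```

Changing removal_rate to 0.60 or 0.80 shows the spread implied by the 60% to 80% range quoted above: roughly 1.2 and 0.1 bugs per KLOC, respectively, after system test.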

Therefore, the renovated code returned to the customer's environment for acceptance test will contain the 0.5 year 2000 bugs per KLOC that survived the testing, plus the residual bugs contained in the original pre-renovation code.

Consequently, industry estimates for the cost of year 2000 testing based on the experience of conventional software maintenance models will likely be a good estimate for any year 2000 effort based on manual modification of code. However, advanced, second-generation automated processes now on the market have several significant differences that result in a substantially different model and much lower overall project costs.

Code renovation tools

In the mid-1990s, the need for a more automated approach to handling year 2000 code changes was recognized, and the first generation of automated tools was developed. First-generation tools rely on text scanning, often using customer-provided "date seeds," and the code modification is manually assisted. The person performing the renovation still has to make a decision about whether to make a change, and what type of change to apply from an extensive library of changes. While this decreases the amount of time required to make changes, it does not substantially reduce the likelihood of missing dates or making erroneous changes.

A second generation of more sophisticated code renovation tools became available that improved the situation. Second-generation tools are characterized by a rules-based, date-finding process coupled with rules-based modification algorithms. In second-generation renovation products, date variables are determined not just by scanning the text, but by a detailed analysis of how the suspect variable interacts with other classified variables. The best tools are able to analyze variable interaction across program boundaries (or program edges) as well as within the program. These second-generation tools use rule algorithms to make the code changes, reducing the likelihood of an error.
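One way to picture the difference between seed-based text scanning and interaction analysis is the sketch below. It is a deliberately simplified illustration, not a description of how any particular product works: the variable names, the seed list and the idea of representing MOVE and comparison statements as pairs of variables are all assumptions made for the example.

```python
from collections import deque


def scan_by_seed(variables, seeds):
    """First-generation style: flag names that contain a date seed."""
    return {v for v in variables if any(seed in v.upper() for seed in seeds)}


def propagate_dates(seed_dates, interactions):
    """Second-generation style (simplified): any variable that is moved to,
    or compared with, a known date variable is also classified as a date."""
    neighbors = {}
    for a, b in interactions:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    classified = set(seed_dates)
    queue = deque(classified)
    while queue:
        var = queue.popleft()
        for other in neighbors.get(var, ()):
            if other not in classified:
                classified.add(other)
                queue.append(other)
    return classified


# Hypothetical Cobol data names and the statements that connect them.
variables = ["INVOICE-DATE", "WS-TEMP", "WS-HOLD", "CUST-NAME"]
interactions = [("INVOICE-DATE", "WS-TEMP"), ("WS-TEMP", "WS-HOLD")]

by_name = scan_by_seed(variables, seeds=["DATE", "YY"])
by_usage = propagate_dates(by_name, interactions)

print(sorted(by_name))   # ['INVOICE-DATE']
print(sorted(by_usage))  # ['INVOICE-DATE', 'WS-HOLD', 'WS-TEMP']
```

In this example the text scan finds only the variable whose name looks like a date, while the propagation step also picks up the two working-storage fields that receive its value. A real rules-based tool would need to be more selective than this sketch, since treating every interaction as proof of a date would classify too many variables.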

Today, there are advanced second-generation tools available that not only automate the date-finding and modification steps, but also have the ability to apply automated testing techniques throughout the renovation process itself. As one might expect, this will have a tremendous positive effect on the overall cost of any given renovation project: first, because bugs will be "caught" sooner (when they are less expensive to fix); and second, because automated renovation processes, by definition, introduce fewer bugs than manual (or manually directed) renovation processes.

Yet code quality of an automated "fix" by itself does not address the business requirements needed to demonstrate due diligence. While an automated renovation process can reduce the cost of unit, component and system tests, it cannot eliminate them. When you add automatic testing to automated fixing, the result is a much higher intrinsic quality. Moreover, it demonstrates the due diligence required to decrease the liabilities associated with year 2000 renovations.

Not surprisingly, the ability to put fewer bugs "in" and take more bugs "out" during renovation translates into a reduced need for testing on the customer's part. In fact, when automated testing occurs throughout the renovation process -- assuming it is done properly and thoroughly -- the customer's testing should involve only acceptance-level testing to determine that the renovated code is production-ready.

By introducing automated testing techniques during the renovation process that do the work of the unit test, component test and system test, it is possible to effectively condense the time required for -- and eliminate much of the cost involved in -- post-renovation testing. So, for those using second-generation processes to renovate their code, the old adage that "testing is 50% to 60% of total project cost" no longer applies.

Of course, year 2000 solutions on the market today offer widely varying levels of automation in the renovation process. This means that post-renovation testing may be minimal (as with advanced second-generation processes that have integrated automated testing), or it may be the majority of the total project cost (as with manual processes). Given this disparity, companies should make total project cost, and not simply code conversion cost, the basis of comparison among potential year 2000 renovation solutions.