The purpose of this paper is to show how SAS can be used to both analyse and redevelop an existing application. The paper describes several general techniques for rapid analysis and redevelopment of existing applications using SAS. Examples will show how these techniques were used to completely rewrite the twenty thousand line Taconet Billing application in eighteen months.
I presented this paper at the SAS Users Group Australia – SUGA 97 Conference back in 1997. Methods like Agile and Extreme Programming were unknown, and Rapid Application Development (that is, not Waterfall) was the latest management concept.
Rapid Application Redevelopment
Paul Shipley
Pleasal Enterprises Pty Ltd
Picture this:
You have just spent the last eighteen months working on the design of a new application to replace your old legacy application. After months of trying to keep all of the members of the steering committee happy, meet the conflicting demands of the users and accommodate the “good ideas” of the CIO, you think that you might have a design that will be approved. Then the Government announces legislative changes that will require redesign of at least half of the application, the company records a massive loss for the year (mainly due to the late delivery of your application), the board cuts the IT budget by 20% and directs that tenders be sought for the outsourcing of IT since “an outsider must be able to do it better!”.
What went wrong?
Methodologies 101
When I was at school we were taught the “Waterfall” approach to application development. Basically you did all of the analysis work until it was completed, then all of the design work, then all of the coding, and so on through unit testing, system testing, conversion, implementation and review. This works fine in theory and has been the standard approach to application development during the past thirty years. In practice, thirty years ago organisations had time to pause, consider and then develop applications over a long period. However, in today’s rapidly changing environment there simply is not the time to spend several months (or even years) designing, coding and testing an application before anything is delivered. Businesses are expecting (if not demanding) instant results amid constant change. Trying to complete a design or code a new application with constantly changing user requirements results in a “Mega Project” that will take so long to be completed that it is unlikely to ever be finished. Clearly a new approach is needed.
Rapid Application Development
No, this does not mean throwing some code together that sort of works and calling it finished! Rapid Application Development (RAD) can best be described as “Evolutionary design, Incremental delivery”. Small components of the complete application are designed, built and delivered continuously until all of the components are completed and the application finished. This approach has several advantages:
- Ongoing change is simply part of the process.
- The clients are getting a constant flow of delivered components that they can use.
- There is constant feedback about the delivered components.
However, RAD is not a substitute for a clear strategy or user requirements. All of the tasks required by traditional methodologies are still required. The key difference in using RAD is that the timing is rearranged so that complete components are delivered continuously, rather than trying to complete the entire application at once.
How you can use RAD with SAS
The major focus of this paper is the experience of redeveloping the Taconet Billing application, which is Telstra’s internal computer resource chargeback application. Taconet Billing comprises about twenty thousand lines of SAS code, and was completely rewritten by a team varying from two to four people over an eighteen month period working on the redevelopment part time.
The major issues that had to be addressed by this redevelopment were:
- The phasing out of the package that the application was based around due to the vendor withdrawing support following a change of ownership.
- The desire of clients to reduce maintenance caused by hard coding of control information.
- A need to improve the available reporting.
While this is an example of a SAS to SAS conversion, the techniques and tools described can also be used in non-SAS applications. SAS makes an excellent tool for data analysis and conversion even if the application itself does not use SAS.
The “Legacy” Application
Today there are very few truly new application developments, except at newly created organisations where no applications currently exist. Most development work is either maintenance or replacement of existing applications. At their simplest, these legacy applications typically look something like the above. The components are:
- A central Database holding all of the data.
- An Online interface to the Database.
- A number of Batch processes to update the Database.
- A number of Reports from the Database.
This was indeed the case with Taconet Billing, except that it had a number of hard coded control tables instead of an online interface.
Where to start
The aim of RAD is not to cut corners but to reduce the development cycle so that small components of the applications are delivered continuously rather than a single big delivery much later. Each of the components still goes through all of the development life cycle steps, however as the components are smaller, the process is accelerated.
To achieve these goals you will need to instruct your clients in this new philosophy. If they have been used to specifying their requirements (in detail) for an entire application, they will now have to get used to working with single components. You will probably have to use a small component as a test case to show how this new approach works and the results that can be achieved.
To begin with there needs to be an overall plan. This should not be detailed, but instead should show what the major application components are and how they relate both currently and in the future. This provides the road map of the steps involved in getting from where you are today to the finish. The diagrams in this paper are an example of this sort of plan.
You can then start to analyse each of the components. The best component to start with would be the one that would generate the biggest returns to the clients.
Analysis: Know your data
The first step to redeveloping your application is to know your data and how it is processed. While data models, data flow diagrams and other documentation are a good starting point, there is no substitute for actual analysis of the data files. Some of the SAS tools that can be used follow.
- COB2SAS Macros are available from the SAS Institute to assist with the conversion of COBOL FD’s and WORKING STORAGE definitions into SAS INPUT statements.
- SAS Informats and Formats are available for almost any data type.
- The SAS/Access range of products can be used to interface to most databases.
- SAS Date Informats, Formats and the YEARCUTOFF option can be used to both find invalid dates and convert two digit years to include the century.
- PROC PRINT can be used to list a sample of the data. This is particularly useful when used with either OBS= or a WHERE clause.
- PROC FREQ can be used to list the possible values of discrete variables and their relative proportions. For example: Sex, State and Postcode. It is also possible to use formats to group continuous variables into discrete ranges. For example: Dates can be reclassified into Age Groups or Quarters of the Year.
- PROC UNIVARIATE can be used to describe the characteristics of continuous variables. For example: Price, Quantity and Duration.
- PROC TABULATE can be used to see how discrete and continuous variables are related. For example: Postcode vs Price.
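As a minimal sketch of how several of the tools listed above fit together, the following program profiles a small file. The data set WORK.CHARGES, its variables and its values are invented for illustration and are not taken from Taconet Billing. The YEARCUTOFF value of 1950 is also only an assumption, and the right value depends on your data.

```sas
/* Hypothetical sample data: all names and values are invented.          */
/* YEARCUTOFF=1950 reads two digit years 50-99 as 1950-1999 and          */
/* 00-49 as 2000-2049.                                                    */
options yearcutoff=1950;

data work.charges;
   input account $ state $ rawdate $ quantity price;
   /* ?? suppresses the log messages and leaves CHG_DATE missing */
   /* when RAWDATE is not a valid date (for example 30 February). */
   chg_date = input(rawdate, ?? yymmdd6.);
   format chg_date date9.;
datalines;
A00001 VIC 970115 10 12.50
A00002 NSW 970230 4 101.00
A00003 VIC 961201 25 3.75
A00004 QLD 970322 1 2500.00
;
run;

/* List a small sample of the data */
proc print data=work.charges(obs=3);
run;

/* List only the records whose date could not be converted */
proc print data=work.charges;
   where chg_date is missing;
run;

/* Values of a discrete variable, plus dates grouped into quarters */
proc freq data=work.charges;
   tables state chg_date / missing;
   format chg_date yyq6.;
run;

/* Distribution of the continuous variables */
proc univariate data=work.charges;
   var price quantity;
run;

/* How a discrete and a continuous variable are related */
proc tabulate data=work.charges;
   class state;
   var price;
   table state, price*(n mean min max);
run;
```

The second PROC PRINT is often the most useful step, as it isolates exactly the records that need to be queried with the owners of the data.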
Once this analysis is complete you should have an understanding of the data, how it is related and the application processes.
The following sections describe the major components of the application: Reporting, Batch Processing and Online Processing. While these are described in order, one of the advantages of this methodology is that the components that are of most importance to you can be tackled first, regardless of which part of the application they come from. Provided there is well-disciplined central project management, several teams could be working on separate areas concurrently.
Reporting
Due to its visibility to the clients, reporting in most applications is second only to the online interfaces in importance. Reports are also the easiest components to convert as they do not require updating of the database. Traditionally reporting has meant paper print outs, however more recently it could include: Spreadsheets, Extracts to Data Warehouses, and Web based HTML and CGI applications.
In the case of Taconet Billing there was a particular report summarising the overall charges for the month. The clients found this report confusing and difficult to understand. Also, it did not provide a breakdown of any unrecovered amounts. A week spent redeveloping the report and writing a breakdown report resulted in dramatically improved client satisfaction and confidence in the new approach.
Initially the redeveloped reports would use the main database. However, this may not be in the most suitable form for reporting, as its primary role is to support the online and batch processes. Performance problems may also occur if the reporting accesses the database concurrently with the onlines. The answer to this is to have a database just for reporting.
Reporting Database
Having redeveloped the major reports the next phase would be to convert all of the reports to use a “Reporting Database”. This involves extracting the required data out of the main database into a secondary database that is used solely for reporting. This database could be tuned for reporting rather than online work (eg: denormalised). This does not have to be in the same environment as the main database and could be part of a Data Warehouse rather than a separate entity. Of course reports that required up to the minute data would still need to use the main database, however there are normally very few of these types of reports.
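The extract into a denormalised reporting table could be sketched as follows. This is an illustration only: the tables WORK.ACCOUNTS and WORK.USAGE, their columns and the use of PROC SQL are assumptions, not the design used for Taconet Billing. In practice the inputs would come from the main database (for example through SAS/Access) and the result would be written to a permanent reporting library.

```sas
/* Hypothetical main database tables: names and values are invented */
data work.accounts;
   input account $ dept :$12.;
datalines;
A00001 Finance
A00002 Networks
;
run;

data work.usage;
   input account $ resource $ charge;
datalines;
A00001 CPU 120.50
A00001 CPU 30.00
A00001 DISK 30.00
A00002 CPU 410.25
;
run;

/* Join and summarise once, so that the reports read a single      */
/* denormalised table instead of joining the main tables each time */
proc sql;
   create table work.rptchg as
   select a.dept,
          u.account,
          u.resource,
          sum(u.charge) as total format=comma12.2
   from work.usage as u
        left join work.accounts as a
        on u.account = a.account
   group by a.dept, u.account, u.resource;
quit;

proc print data=work.rptchg;
run;
```

Because the reports only ever read WORK.RPTCHG, the extract can be rerun on whatever schedule suits the clients without touching the online database during the day.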
For Taconet Billing a SAS/AF application was developed that allowed clients to “drill-down” through the reporting database and either generate reports or extracts that could be used in Excel. This resulted in both significant client satisfaction and a large reduction in the effort required to provide ad hoc reports.
Batch Processing
Almost all applications have some Batch processing that updates the database. It could be loading new pricing tables, processing orders or generating a payroll. These components are normally easy to replace as they are not directly seen by the clients. All that is required is that the correct outputs are generated for the given inputs.
This is easily achieved using either one of the SAS/Access products or a database load/unload utility. Of course there is a timing issue using load/unload rather than direct database access, however this should not be a problem as the onlines would normally not be available during the batch update.
The main Taconet Billing database was already in SAS format. The problem was that it was being processed by a package whose vendor had been taken over, and the new owner had withdrawn support for it. As we were only using a small part of the package, we were able to replace the components by analysing their inputs and outputs and writing new modules to perform the same functions. The new modules used one third less processing resources by taking advantage of SAS version 6 features.
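One way to confirm that a rewritten module generates the same outputs as the component it replaces, for the same inputs, is to run both and compare the results with PROC COMPARE. The sketch below is an assumption about how such a check might look rather than a record of what was done on this project. The data sets WORK.OLDOUT and WORK.NEWOUT and the tolerance of 0.001 are hypothetical.

```sas
/* Hypothetical output of the old package for a test input */
data work.oldout;
   input account $ charge;
datalines;
A00001 150.50
A00002 410.25
;
run;

/* Hypothetical output of the rewritten module for the same input */
data work.newout;
   input account $ charge;
datalines;
A00001 150.50
A00002 410.20
;
run;

proc sort data=work.oldout; by account; run;
proc sort data=work.newout; by account; run;

/* Match records on ACCOUNT and report numeric values that */
/* differ by more than the assumed tolerance of 0.001      */
proc compare base=work.oldout compare=work.newout
             method=absolute criterion=0.001 listall;
   id account;
run;

/* SYSINFO must be read immediately after PROC COMPARE. */
/* A value of 0 indicates no differences were found.    */
%put PROC COMPARE return code: &sysinfo;
```

In this invented example the charge for account A00002 differs, so the comparison reports it and the return code is non-zero.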
All Batch
Once all of the batch programs have been converted the central database can be split between the onlines and the batch processing, again using either SAS/Access or database load/unload utilities to interface the two.
This approach has the advantage that the two databases can be in the format that is most efficient for that type of processing. The Batch database could be in SAS format to take advantage of the efficiency of batch sequential processing using SAS, while the online database uses your favourite OLTP database product.
Splitting the central database also positions the application to become distributed. The various distributed online databases could then be interfaced to a central server to do the batch updating.
Online Processing
The online components can be redeveloped in much the same manner as the rest of the application by concentrating on single areas. Parts of the application can be rewritten using SAS/AF or SAS/FSP, and using SAS/Access to directly interface to the database.
The components to start with would be those that are both small and relatively unassociated with the rest of the onlines. Groups of clients can be migrated to the new components when available.
In the case of Taconet Billing there was a small online component that was originally written in SAS/AF version 5. This needed to be redeveloped with the introduction of SAS version 6. Due to its small size it was not possible to implement the new application in parts. Instead components were redeveloped and then released to the clients for acceptance testing as they were completed. Once the clients were satisfied with the entire application it was released into production.
There were also a number of hard coded control files where the data was embedded in SAS code, incurring a considerable effort to maintain. Some small SAS/FSP screens were developed to maintain the data and a program written to automatically generate the correct SAS code from the data. This also allowed the data to be validated before being processed.
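The idea of generating SAS code from maintained control data can be sketched as follows. The control table WORK.RATES, the informat name RATE and the validation rule are all invented for illustration, and the actual Taconet Billing generator is not shown in this paper. The DATA step validates each row, writes a PROC FORMAT step to a temporary file, and the generated code is then included and used.

```sas
/* Hypothetical control data, as it might be keyed in through a screen */
data work.rates;
   input resource $ rate;
datalines;
CPU 0.25
DISK 0.05
TAPE -1
;
run;

filename gencode temp;

/* Generate a PROC FORMAT step from the control data */
data _null_;
   set work.rates end=last;
   file gencode;
   if _n_ = 1 then put 'proc format;' / '   invalue rate';
   /* Validate before generating: missing or non-positive rates are rejected */
   if rate <= 0 then putlog 'WARNING: rate rejected for ' resource= rate=;
   /* +(-1) backs up over the blank that list output adds after RESOURCE */
   else put @7 "'" resource +(-1) "' = " rate;
   if last then put '   ;' / 'run;';
run;

/* Run the generated code, echoing it to the log */
%include gencode / source2;

/* Use the generated informat to price some hypothetical usage */
data work.priced;
   input resource $ units;
   rate   = input(resource, rate.);
   charge = units * rate;
datalines;
CPU 100
DISK 2000
;
run;

proc print data=work.priced;
run;
```

The invalid TAPE row is reported in the log and left out of the generated code, which is the benefit of validating the data before it is processed.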
Once all of the online components are converted the redeveloped application would be in its final form.
The Final Application
This is the structure of the completed application after having been redeveloped using the above strategies. Of course you don’t have to convert all of your application. You can change only the components that are causing problems or are no longer meeting business needs.
In the case of Taconet Billing the Onlines generated the SAS code in the control files eliminating the overhead of changing the code manually. The new Batch programs replaced the package solution saving $80K per year in license fees as well as reducing runtime and CPU resources.
An application to allow clients to access and extract data from a reporting database themselves saved about half a person of effort in reduced ad hoc reporting requests and reduced delay in report distribution from up to three months to whenever clients want them.
Conclusion
Using RAD techniques it is possible to redevelop applications to improve client satisfaction in a timely manner without creating a “Mega Project”.
RAD is not a global panacea. You still have to have clearly defined requirements, well constructed and documented code, thorough testing and a clear strategy. However, RAD does allow for components of applications to be delivered to the clients in a more timely manner than by using traditional methodologies.
By having your application structured into the separate components illustrated you not only have an application that is more flexible and easier to maintain, but it is also positioned to take advantage of new technologies, such as distributed processing and data warehousing, as they become available.
About the author
Paul Shipley has worked in IT for 13 years, mostly at Telstra Corporation Limited. Since learning SAS over ten years ago he has used it in a variety of roles in user support and application development. Currently Paul is the Technical Leader for Taconet Billing, Telstra’s internal computer chargeback application.
When not cutting code Paul can be found instructing aerobics classes or working out in the gym.