Make Mine MAPPER #8
--------------------
by Rob Haeuser

Mega-databases: Tricky, but Possible
---------------------------------------------------------------

How many of you have heard this tired old rumor: MAPPER may be great for small to medium-sized databases, but it isn't capable of managing a really large application? Wrong, wrong, wrong! This rumor has circulated for years, and it is time for the truth to come to light. Part of the problem with this false impression is that the programmers spreading it simply don't know the techniques of major database design in MAPPER, and so they blame the software for any seemingly insurmountable problems. Granted, sometimes it may get a bit tricky, but who said that tricks were for kids?

I recently had an opportunity to work on a project that most of you would call major. The Federal Government mandated drastic changes to the Welfare program, with an emphasis on job training and placement for welfare recipients. We were told to bring up a system A.S.A.P., but no later than October 1, 1990. This gave us less than four months. Various options were examined, but, given the time frame, not many were available. Naturally, MAPPER was way up there on the list. LINC was another major contender. A fail-safe approach was taken: parallel systems would be developed in both MAPPER and LINC (more on this later).

For years I have been developing theories about mega-database design, but I had never had an opportunity to try most of them out (maybe due to the above-mentioned rumor?). This Welfare Reform project was just the thing I needed to exercise some theories, and kill a rumor in the process.

250,000 clients, with the potential for dozens of records per client, seems pretty big to me. How about 1,000,000 clients? There are currently a little over 200,000 in the system, but we designed the database to comfortably handle well over 1,000,000 people. At that level, the system would cruise at between 30,000,000 and 50,000,000 records. The system uses eight modes, well over 30 form types, and has in excess of 36,000 rids. Single records are used in some areas, while data paragraphing is used in others.

Updates occur daily, weekly, and monthly. Not all areas of the database are updated by the same process. In fact, even within a single area, multiple runs can be used. For example, the 'master' area uses all 2,000 rids of a form type. It is updated in its entirety once a month, with transaction activity of some sort on 100% of the master records, and therefore on all 2,000 rids. The same process updates an additional 2,000 'monthly activity' rids, for a total of 4,000 rids. Finally, it starts a secondary process to construct over 1,000 'office' rids. Currently the transaction files total in excess of 100,000 records. In addition, internal tables are used for matching. All in all, a heck of a lot of updating goes on. No one run should have to handle that much logic!

The main concern was time. With this much data to be updated, it could take quite a few hours by conventional means, where one run does it all. After some initial testing, I began to take timings. 100 master rids (5% of the database) seemed like a good number to update for starters. The first timing test was a minor disappointment: it had taken 125 minutes, a little over two hours. At that rate, it would have taken a single run 2,500 minutes, or a little over 41 1/2 hours, to complete. Almost two full days! That was unacceptable.
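For the curious, the arithmetic behind that estimate is simple enough to show in a few lines. This is only an illustrative sketch (in Python, which obviously has nothing to do with MAPPER run code); the figures are the ones quoted above.

    # Back-of-the-envelope check of the first timing test.
    TOTAL_RIDS  = 2000   # master rids in the form type
    SAMPLE_RIDS = 100    # the 5% sample used for the test
    SAMPLE_MIN  = 125    # minutes the sample took

    minutes_per_rid = SAMPLE_MIN / SAMPLE_RIDS       # 1.25 minutes per rid
    single_run_min  = minutes_per_rid * TOTAL_RIDS   # 2,500 minutes
    print(f"{single_run_min:,.0f} minutes, about {single_run_min / 60:.1f} hours")
    # prints: 2,500 minutes, about 41.7 hours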
I had fallen into the 'transaction' trap, where every action against an individual record is handled by a 'transactional' function, such as binary-find and read-line. I was using the output area to construct the rids, so it wasn't writing the data that was at fault. It was all the I/O involved in doing a table look-up for every record in the database. Binary-find usually takes from 5 to 10 I/O to find a record. Multiply that by 200,000 and you end up with over 1,000,000 I/O burned just to read a table. And there were two tables to be read. Oh well. There must be a better way.

There is. Sometimes, rather than process each record individually, it is possible to process an entire rid with one pass of a MAPPER 'macro' function, such as Match. The input file is in a completely different format from that of the master database, so matching it directly is all but impossible; 'transactional' processing is most appropriate there. But once the master rids are constructed, they can then be matched to tables for further updating. The second version of the monthly update cut the execution time from over two hours to right at 20 minutes, an 84% reduction.

Not bad, you may think. That is, until you multiply it by 20 and realize that 400 minutes is still over 6 1/2 hours. I felt uncomfortable with the idea of a single process running that long. It was time for another theory: if one run took that much time to update that much data, wouldn't multiple runs, each updating smaller pieces, take less time? I decided to code the monthly update in such a way that multiple versions could be run simultaneously, each version updating a portion of the database. Because of a recommendation from Unisys regarding the number of background runs a system should allow, I decided to code for a maximum of 20 versions. This way each one would handle updates for 5% of the database (sound familiar?). With a full complement of twenty runs, execution time was reduced to just over two hours. This amounts to a reduction of roughly 67% in execution time as compared to a single run. When compared to the original estimate of over 40 hours, the reduction is in the neighborhood of 95%!

Not only was writing the monthly update a bit tricky, but the whole monthly update process is started by forces external to MAPPER, which presents its own set of problems. The transaction files are produced by a monthly COBOL process, which runs at a different time each month. It is therefore not feasible to use MAPPER's run scheduling capability to automatically start the update runs. The batch port is the mechanism by which external forces can instigate run activity within MAPPER. After the twenty input files are ready, a batch port run is started. The batch run preps a sentinel rid used by all versions of the monthly update run, entering a start date/time, blanking out the end date/time, and setting a 'finished flag' to N(o). It also writes a field indicating how many versions are to be executed. Each version has a unique control line in the sentinel rid, so they don't confuse one another. It then begins linking to the scheduler run, linking once for each version. Finally, it starts a run that deletes over 1,000 rids in another area, so that they can be re-created after the monthly updates are finished. Did you know that the scheduler run won't allow you to pass input to indicate that an auto-restart should be attempted if MAPPER crashes? Well, ours does now. In a massive process, delays can be costly.
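The division of labor is easier to picture with a small sketch. What follows is not MAPPER run code, just a Python illustration of the idea: slice the 2,000 master rids into equal pieces and give each parallel version its own control entry. The field names are my own invention for the example, not the actual layout of the sentinel rid.

    from datetime import datetime

    TOTAL_RIDS = 2000
    VERSIONS   = 20                       # maximum number of parallel update runs
    SLICE      = TOTAL_RIDS // VERSIONS   # 100 rids, 5% of the master area, per version

    # One control entry per version, standing in for the unique control line
    # each version owns in the sentinel rid.
    sentinel = []
    for v in range(1, VERSIONS + 1):
        first = (v - 1) * SLICE + 1
        sentinel.append({
            "version":         v,
            "first_rid":       first,
            "last_rid":        first + SLICE - 1,
            "start":           datetime.now().isoformat(timespec="seconds"),
            "end":             "",      # blanked out until this version finishes
            "finished":        "N",     # the 'finished flag'
            "last_rid_done":   None,    # checkpoint: last rid updated
            "last_input_line": None,    # checkpoint: last input file line read
        })

    # Version 7, for example, is responsible for rids 601 through 700.
    print(sentinel[6]["first_rid"], sentinel[6]["last_rid"])

The last two fields in each entry are the checkpoint values; as described next, they are what allow a version to pick up where it left off instead of starting over.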
This process had to be as resilient as possible to failure, so I coded the update run to check the sentinel rid when it first wakes up. It can detect whether or not it has already been started by the presence of data in the 'last rid updated' and 'last input line read' fields. These fields are updated whenever the run finishes processing a rid. If data is detected, processing begins (actually, resumes) at the input file line number and rid indicated, possibly saving hours of redundant effort.

Another extremely helpful technique in accessing a database of this size is key storage methodology. Since a tremendous amount of on-line inquiry is required, it is imperative that master records be rapidly accessible. My general rule of thumb is two seconds or less for response time, and this system provides that response. The secret to that kind of response is in the key, especially an all-numeric one. You can use certain digits within it to indicate which rid the record is in. If you have control over how the key is created, it can contain mode, type, rid, and even a line number! That gets fairly sensitive, but response is unbelievable! Since the keys were created in a COBOL system, the best we could do was to use certain digits to indicate a rid. However, since the data is sorted on the key, binary-find can be used for access. For this reason, any master record can be retrieved with approximately 5 I/O, well within the two-second target response time.

Another trick was how to obtain an even spread across 2,000 rids. You can't just use four digits of the key as a rid number, since that could range all the way up to 9,999. Three digits was the maximum, but that only gave a range up to 999. First, we analyzed the keys to find three contiguous digits with a fairly even population distribution (it was not the first or last three!). Then a fourth digit was analyzed for a 50-50 distribution between the ranges 0-4 and 5-9. If this digit is 0-4, the rid in the lower 1,000 is used; if it is 5-9, the rid in the upper 1,000 is the one. With this technique we have a database distribution that varies by no more than 15% from one rid to another. In other words, there are no 10,000-line rids while others have just a few dozen. The database distribution is as flat as a pancake!

The same type of technique is used to determine in which form type within a mode certain records are to be found. Five form types are used: if the 'magic digit' is 0-1 the data is in form type B, 2-3 is in C, 4-5 in D, 6-7 in E, and 8-9 in form type F. The same technique could be used to choose among a complement of modes, if necessary.

A footnote on the number of parallel runs to execute: 20 may be overkill. I executed with 10 versions running, and it added only 15 or 20 minutes to the overall execution time. It is imperative that you know the state of your CPU when coding this type of process. If your system is at or near the 100% mark on CPU utilization, more runs will simply slow you down. There is definitely a point of diminishing returns.

As of this writing, the MAPPER system has been up and running since October 1, 1990. As far as the LINC system goes, well, it's still in development. But that's another story.
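P.S. For anyone who wants to experiment with the key technique on their own data, here is a rough sketch in Python (again, not MAPPER code, and not the production logic). The digit positions are made up for illustration; in the real system they were chosen only after analyzing the population distribution of the live keys.

    RID_DIGITS  = slice(3, 6)   # three contiguous digits, giving a number 0-999
    SPLIT_DIGIT = 6             # 0-4 means the lower 1,000 rids, 5-9 the upper 1,000
    FORM_DIGIT  = 7             # 'magic digit' picking one of five form types

    def rid_for_key(key: str) -> int:
        """Map an all-numeric client key to one of 2,000 master rids."""
        base  = int(key[RID_DIGITS]) + 1        # 1..1000
        upper = int(key[SPLIT_DIGIT]) >= 5      # second half of the rids?
        return base + (1000 if upper else 0)    # 1..2000

    def form_type_for_key(key: str) -> str:
        """Map the 'magic digit' to form types B through F."""
        return "BCDEF"[int(key[FORM_DIGIT]) // 2]   # 0-1=B, 2-3=C, ... 8-9=F

    key = "042317584"    # a made-up nine-digit client key
    print(rid_for_key(key), form_type_for_key(key))

The analysis step is the whole trick: pick the digits from the real key population, and the rid-to-rid variation stays within about 15%, as described above.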