Make Mine MAPPER #8
--------------------
by Rob Haeuser

Mega-databases: Tricky, but Possible
---------------------------------------------------------------

How many of you have heard this tired old rumor: MAPPER may be great for small to medium-sized databases, but it isn't capable of managing a really large application? Wrong, wrong, wrong! This rumor has circulated for years, and it is time for the truth to come to light. Part of the problem with this false impression is that the programmers spreading it simply don't know the techniques of major database design in MAPPER, and so they blame the software for any seemingly insurmountable problems. Granted, sometimes it may get a bit tricky, but who said that tricks were for kids?

I recently had an opportunity to work on a project that most of you would call major. The Federal Government mandated drastic changes to the Welfare program, with an emphasis on job training and placement for welfare recipients. We were told to bring up a system A.S.A.P., but no later than October 1, 1990. This gave us less than four months. Various options were examined, but, given the time frame, not many were available. Naturally, MAPPER was way up there on the list. LINC was another major contender. A fail-safe approach was taken: parallel systems would be developed in both MAPPER and LINC (more on this later).

For years I have been developing theories about mega-database design, but I had never had an opportunity to try most of them out (maybe due to the above-mentioned rumor?). This Welfare Reform project was just the thing I needed to exercise some theories, and kill a rumor in the process.

250,000 clients, with the potential for dozens of records per client, seems pretty big to me. How about 1,000,000 clients? There are currently a little over 200,000 in the system, but we designed the database to comfortably handle well over 1,000,000 people. At that level, the system would cruise at between 30,000,000 and 50,000,000 records. The system uses eight modes, well over 30 form types, and has in excess of 36,000 rids. Single records are used in some areas, while data paragraphing is used in others.

Updates occur daily, weekly, and monthly. Not all areas of the database are updated by the same process. In fact, even within a single area, multiple runs can be used. For example, the 'master' area uses all 2,000 rids of a form type. It is updated in its entirety once a month, with transaction activity of some sort on 100% of the master records, and therefore on all 2,000 rids. The same process updates an additional 2,000 'monthly activity' rids, for a total of 4,000 rids. Finally, it starts a secondary process to construct over 1,000 'office' rids. Currently the transaction files total in excess of 100,000 records. In addition, internal tables are used for matching. All in all, a heck of a lot of updating goes on. No one run should have to handle that much logic!

The main concern was time. With this much data to be updated, it could take quite a few hours by conventional means, where one run does it all. After some initial testing, I began to take timings. 100 master rids (5% of the database) seemed like a good number to update for starters. The first timing test was a minor disappointment: it had taken 125 minutes, a little over two hours. At that rate, it would have taken a single run 2,500 minutes, or a little over 41 1/2 hours, to complete. Almost two full days! That was unacceptable.
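For the curious, the arithmetic behind that estimate is simple enough to show in a few lines. This is only an illustrative sketch (in Python, which obviously has nothing to do with MAPPER run code); the figures are the ones quoted above.

    # Back-of-the-envelope check of the first timing test.
    TOTAL_RIDS  = 2000   # master rids in the form type
    SAMPLE_RIDS = 100    # the 5% sample used for the test
    SAMPLE_MIN  = 125    # minutes the sample took

    minutes_per_rid = SAMPLE_MIN / SAMPLE_RIDS       # 1.25 minutes per rid
    single_run_min  = minutes_per_rid * TOTAL_RIDS   # 2,500 minutes
    print(f"{single_run_min:,.0f} minutes, about {single_run_min / 60:.1f} hours")
    # prints: 2,500 minutes, about 41.7 hours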
I had fallen into the 'transaction' trap, where every action against an individual record is handled by a 'transactional' function, such as binary-find and read-line. I was using the output area to construct the rids, so it wasn't writing the data that was at fault. It was all the I/O involved in doing a table look-up for every record in the database. Binary-find usually takes from 5 to 10 I/O to find a record. Multiply that by 200,000 and you end up with over 1,000,000 I/O burned just to read a table. And there were two tables to be read. Oh well. There must be a better way.

There is. Sometimes, rather than process each record individually, it is possible to process an entire rid with one pass of a MAPPER 'macro' function, such as Match. The input file is in a completely different format from that of the master database, so matching it directly is all but impossible; 'transactional' processing is most appropriate there. But once the master rids are constructed, they can then be matched to tables for further updating. The second version of the monthly update cut the execution time from over two hours to right at 20 minutes, an 84% reduction.

Not bad, you may think. That is, until you multiply it by 20 and realize that 400 minutes is still over 6 1/2 hours. I felt uncomfortable with the idea of a single process running that long. It was time for another theory: if one run took that much time to update that much data, wouldn't multiple runs, each updating smaller pieces, take less time? I decided to code the monthly update in such a way that multiple versions could be run simultaneously, each version updating a portion of the database. Because of a recommendation from Unisys regarding the number of background runs a system should allow, I decided to code for a maximum of 20 versions. This way each one would handle updates for 5% of the database (sound familiar?). With a full complement of twenty runs, execution time was reduced to just over two hours. This amounts to a reduction of roughly 67% in execution time as compared to a single run. When compared to the original estimate of over 40 hours, the reduction is in the neighborhood of 95%!

Not only was writing the monthly update a bit tricky, but the whole monthly update process is started by forces external to MAPPER, which presents its own set of problems. The transaction files are produced by a monthly COBOL process, which runs at a different time each month. It is therefore not feasible to use MAPPER's run scheduling capability to automatically start the update runs. The batch port is the mechanism by which external forces can instigate run activity within MAPPER. After the twenty input files are ready, a batch port run is started. The batch run preps a sentinel rid used by all versions of the monthly update run, entering a start date/time, blanking out the end date/time, and setting a 'finished flag' to N(o). It also writes a field indicating how many versions are to be executed. Each version has a unique control line in the sentinel rid, so they don't confuse one another. It then begins linking to the scheduler run, linking once for each version. Finally, it starts a run that deletes over 1,000 rids in another area, so that they can be re-created after the monthly updates are finished. Did you know that the scheduler run won't allow you to pass input to indicate that an auto-restart should be attempted if MAPPER crashes? Well, ours does now. In a massive process, delays can be costly.
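The division of labor is easier to picture with a small sketch. What follows is not MAPPER run code, just a Python illustration of the idea: slice the 2,000 master rids into equal pieces and give each parallel version its own control entry. The field names are my own invention for the example, not the actual layout of the sentinel rid.

    from datetime import datetime

    TOTAL_RIDS = 2000
    VERSIONS   = 20                       # maximum number of parallel update runs
    SLICE      = TOTAL_RIDS // VERSIONS   # 100 rids, 5% of the master area, per version

    # One control entry per version, standing in for the unique control line
    # each version owns in the sentinel rid.
    sentinel = []
    for v in range(1, VERSIONS + 1):
        first = (v - 1) * SLICE + 1
        sentinel.append({
            "version":         v,
            "first_rid":       first,
            "last_rid":        first + SLICE - 1,
            "start":           datetime.now().isoformat(timespec="seconds"),
            "end":             "",      # blanked out until this version finishes
            "finished":        "N",     # the 'finished flag'
            "last_rid_done":   None,    # checkpoint: last rid updated
            "last_input_line": None,    # checkpoint: last input file line read
        })

    # Version 7, for example, is responsible for rids 601 through 700.
    print(sentinel[6]["first_rid"], sentinel[6]["last_rid"])

The last two fields in each entry are the checkpoint values; as described next, they are what allow a version to pick up where it left off instead of starting over.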
This process had to be as resilient as possible to failure, so I coded the update run to check the sentinel rid when it first wakes up. It can detect whether or not it has already been started by the presence of data in the 'last rid updated' and 'last input line read' fields. These fields are updated whenever the run finishes processing a rid. If data is detected, processing begins (actually, resumes) at the input file line number and rid indicated, possibly saving hours of redundant effort.

Another extremely helpful technique in accessing a database of this size is key storage methodology. Since a tremendous amount of on-line inquiry is required, it is imperative that master records be rapidly accessible. My general rule of thumb is two seconds or less for response time, and this system provides that response. The secret to that kind of response is in the key, especially an all-numeric one. You can use certain digits within it to indicate which rid the record is in. If you have control over how the key is created, it can contain mode, type, rid, and even a line number! That gets fairly sensitive, but response is unbelievable! Since the keys were created in a COBOL system, the best we could do was to use certain digits to indicate a rid. However, since the data is sorted on the key, binary-find can be used for access. For this reason, any master record can be retrieved with approximately 5 I/O, well within the two-second target response time.

Another trick was how to obtain an even spread across 2,000 rids. You can't just use four digits of the key as a rid number, since that could range all the way up to 9,999. Three digits was the maximum, but that only gave a range up to 999. First, we analyzed the keys to find three contiguous digits with a fairly even population distribution (it was not the first or last three!). Then a fourth digit was analyzed for a 50-50 distribution between the ranges 0-4 and 5-9. If this digit is 0-4, the rid in the lower 1,000 is used; if it is 5-9, the rid in the upper 1,000 is the one. With this technique we have a database distribution that varies by no more than 15% from one rid to another. In other words, there are no 10,000-line rids while others have just a few dozen. The database distribution is as flat as a pancake!

The same type of technique is used to determine in which form type within a mode certain records are to be found. Five form types are used: if the 'magic digit' is 0-1 the data is in form type B, 2-3 is in C, 4-5 in D, 6-7 in E, and 8-9 in form type F. The same technique could be used to choose among a complement of modes, if necessary.

A footnote on the number of parallel runs to execute: 20 may be overkill. I executed with 10 versions running, and it added only 15 or 20 minutes to the overall execution time. It is imperative that you know the state of your CPU when coding this type of process. If your system is at or near the 100% mark on CPU utilization, more runs will simply slow you down. There is definitely a point of diminishing returns.

As of this writing, the MAPPER system has been up and running since October 1, 1990. As far as the LINC system goes, well, it's still in development. But that's another story.
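P.S. For anyone who wants to experiment with the key technique on their own data, here is a rough sketch in Python (again, not MAPPER code, and not the production logic). The digit positions are made up for illustration; in the real system they were chosen only after analyzing the population distribution of the live keys.

    RID_DIGITS  = slice(3, 6)   # three contiguous digits, giving a number 0-999
    SPLIT_DIGIT = 6             # 0-4 means the lower 1,000 rids, 5-9 the upper 1,000
    FORM_DIGIT  = 7             # 'magic digit' picking one of five form types

    def rid_for_key(key: str) -> int:
        """Map an all-numeric client key to one of 2,000 master rids."""
        base  = int(key[RID_DIGITS]) + 1        # 1..1000
        upper = int(key[SPLIT_DIGIT]) >= 5      # second half of the rids?
        return base + (1000 if upper else 0)    # 1..2000

    def form_type_for_key(key: str) -> str:
        """Map the 'magic digit' to form types B through F."""
        return "BCDEF"[int(key[FORM_DIGIT]) // 2]   # 0-1=B, 2-3=C, ... 8-9=F

    key = "042317584"    # a made-up nine-digit client key
    print(rid_for_key(key), form_type_for_key(key))

The analysis step is the whole trick: pick the digits from the real key population, and the rid-to-rid variation stays within about 15%, as described above.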