The Quixotic Quest: Nobody Can Read All Those Reports

A single incident early in my career sparked my ongoing efforts to participate in the quest to deal reasonably and efficiently with the avalanche of data our modern world generates.

Posted by Keith Wilson on November 4, 2015

I know exactly when the genesis of this web site occurred, and that event has been a driving force behind my career since then.

Report Distribution

It was 1990-91 or so, and I was working at an IBM mainframe software company as a developer on a report distribution application. It was an interesting application, and I learned a lot about software development from that project. It was probably my favorite project of my career, for a lot of different reasons.

The application was a COBOL/BAL/VSAM/CICS application that monitored the JES spool based on rules defined in a CICS-based UI. It could watch the JES spool for reports generated by any application. When a report that was to be processed was detected, the report file was pulled from the JES spool, broken into pages, and each page was placed into a VSAM record. Indices were created based on rules that assigned pages or groups of pages to users. The users could then log on to the application, see lists of their reports by name and date, and view them online. This may not sound like a big deal today, but in the mainframe world of 1990 it was revolutionary for many of our customers.
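
To give a feel for the flow, here is a minimal sketch of the core idea in modern Python. The real thing was COBOL and assembler writing VSAM records, and everything here, from the names to the form-feed page-break convention, is an invented stand-in rather than the product's actual design:

    # Minimal sketch of the pipeline: split a captured spool file into
    # pages, store one record per page, and build a per-user index from
    # distribution rules. All names are hypothetical stand-ins.

    FORM_FEED = "\f"  # stand-in for the print stream's page-break control

    def split_into_pages(report_text):
        """Break a captured report file into individual pages."""
        return report_text.split(FORM_FEED)

    def store_pages(store, report_name, run_date, pages):
        """Store each page under a (report, date, page-number) key,
        loosely like writing one VSAM record per page."""
        for page_no, page in enumerate(pages, start=1):
            store[(report_name, run_date, page_no)] = page

    def build_user_index(rules, report_name, run_date, page_count):
        """Apply rules (user -> inclusive page range) to give each
        user the list of page keys they are entitled to view."""
        return {
            user: [(report_name, run_date, p)
                   for p in range(first, min(last, page_count) + 1)]
            for user, (first, last) in rules.items()
        }

    # Usage: one nightly report, two users with overlapping page ranges.
    store = {}
    pages = split_into_pages("page one\fpage two\fpage three")
    store_pages(store, "DAILYRPT", "1990-11-04", pages)
    index = build_user_index({"user_a": (1, 2), "user_b": (2, 3)},
                             "DAILYRPT", "1990-11-04", len(pages))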

I remember when I realized exactly what the app meant to our customers and why they were willing to pay six figures for it. We had one particular client, a large bank in Canada. Before buying our application, their information distribution process was to print hundreds of thousands of pages of reports on high-speed printers every night. The printers were in a warehouse or something in one of the large cities in Canada. Printing started shortly after close of business each evening. The reports had to be printed by 3 AM (or something like that) and boxed so they could be loaded onto a tractor-trailer truck. The truck then began making the rounds to deliver the reports to the branch offices. The goal was to have the reports at each office by 8 AM so that hundreds of pages of reports could be delivered to each person's desk. The individuals in the offices then spent the day digging through the hundreds of pages they had received for the data they needed to answer phone calls and other queries. At the end of the day they were done with that day's reports, and an updated version would arrive the next morning.

Our application allowed them to stop using the tractor-trailer truck. It allowed them to stop using the printers. The office personnel came into the office in the morning, signed on to the application, and all of the pages of reports they were used to seeing were waiting for them, indexed and searchable, pretty as you please.

But it got better.

The reports they were receiving were often copies, or near copies, of one another. Each worker received many of the same pages to cover their clients, some of whom were shared. The applications producing the reports didn't necessarily know which clients or pages each worker needed, so many workers received hundreds of pages that they never used. With the application I worked on, we could collect one complete, comprehensive report with all the client information, and an administrator could then define rules to ensure that the pages each clerk needed were distributed electronically to that clerk. If a clerk's needs changed, the rules could easily be updated so that different sections of the report were delivered to that clerk. And the applications producing the reports could be simplified.
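
As a rough illustration of that kind of rule-driven routing, here is another small Python sketch with invented names. The real rules were defined by an administrator in the CICS UI; this toy version simply assumes the client identifier is the first token on each page:

    # Hedged sketch of rule-based page routing: instead of sending every
    # clerk the whole report, deliver each page only to the clerks who
    # cover the client that appears on it. All names are hypothetical.

    def extract_client_id(page):
        """Pull a client identifier from a page. Here we just assume it
        is the first token; a real rule would use a print position."""
        tokens = page.split()
        return tokens[0] if tokens else None

    def route_pages(pages, assignments):
        """assignments: clerk -> set of client IDs they cover.
        Returns clerk -> list of page numbers to deliver."""
        routed = {clerk: [] for clerk in assignments}
        for page_no, page in enumerate(pages, start=1):
            client = extract_client_id(page)
            for clerk, clients in assignments.items():
                if client in clients:  # shared clients are delivered to
                    routed[clerk].append(page_no)  # every covering clerk
        return routed

    # Reassigning a client is a one-line rule change, not a change to
    # the application that produces the report:
    assignments = {"clerk_a": {"ACME", "GLOBEX"}, "clerk_b": {"GLOBEX"}}
    pages = ["ACME balance ...", "GLOBEX balance ...", "INITECH balance ..."]
    print(route_pages(pages, assignments))  # {'clerk_a': [1, 2], 'clerk_b': [2]}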

It was a win-win situation if I ever saw one. I felt wonderful when I realized what a difference our application was making in the lives of these individuals. I also realized that we were saving reams of paper, thus saving forests. We were saving fuel because the truck was no longer driving around. Now, granted, they probably laid off a truck driver and 2-3 printer operators who were no longer needed, but I didn't focus on that aspect of the situation.

A Eureka Moment

But this new understanding of how our application was used had a tremendous impact on how I thought about data processing. I had lots of questions and lots of ideas.

For example, I realized that there was something absurd about what our client had been doing before the application. Every day they printed hundreds of thousands of pages of reports and distributed them to dozens, maybe hundreds, of people. Most of the printed pages were never looked at by anyone. The clerks would turn to particular pages to view particular data to answer particular questions. At the end of the day, all of the pages were tossed, most of them never viewed. Even on the pages that were used, probably only 5-10% of the text was ever read, and much of that served merely as landmarks as the clerks navigated the report toward the page they were looking for. A lot of time was wasted searching through hard-copy reports for the handful of data elements needed to respond to a query.

The Quest Begins

That was the moment I began to ponder the study of information management. Suddenly indexing, compression, just-in-time data access, and many other concepts had new value. We were not just collecting record after record of data; we were attempting to provide instant access to specific pieces of data to answer specific questions.

Those were the days before "big data", "machine learning", "data science", and many of the other information technologies we take for granted today. The problems we were trying to solve then, however, are still the problems today. We have a lot of data, with more being generated every day. Individuals who understand what that data means need instant access to the specific data that will help them do their jobs and solve problems.

It remains easier to generate huge amounts of data than it is to make sensible use of that data to provide information that solves problems and answers questions. Since that time, it has been my desire to help build tools that make human use of data more efficient and helpful. I hope this blog will be an impetus for me to take the next step in participating in that quixotic quest.