We encounter legacy data as a part of life. Colleges and universities have transcript records dating back decades or even centuries. Real estate property records in the United States go as far back as Spanish and British land grants in the 1500s. Very old records are completely paper based and may be completely manually prepared, perhaps typed on a typewriter or written in longhand with a quill pen.
How long should transcripts be retained? Nola Ochs graduated from college at age 95 (can you imagine being in a history class with her?). If she had started college during World War I, maybe she would have needed the old records to keep from having to repeat a course taken previously. That’s a long retention time, but not so long compared to land grant records. Deeds, titles, and liens have to be kept in perpetuity. Retaining old records is expensive.
Scanning old records is also expensive, but there can be indirect savings. What is the cost of losing the records due to fire, flood, or decay? Scanned images protect against such loss. Using scanned images protects fragile, antique records from unnecessary handling. The originals can be moved to a climate controlled storage facility that is better suited for document preservation, possibly at lower total cost. SQL Server 2008 FILESTREAM can be used to store the scanned images.
Although scanned documents help with data retention and availability issues, discoverability adds to the cost of scanned images. When scanned documents are stored, time must be taken to catalog the contents. Keywords and descriptive header data must be linked to the scanned documents to make the contents discoverable. Using the SQL Server 2008 GEOGRAPHY data type can facilitate discovery of geographic data such as property records. Proper analysis and requirements gathering is necessary to make sure that the appropriate searchable header data is identified.
Data migration from paper records such as college transcripts to digital records is a slow, somewhat expensive process of manually rekeying all of the data. The easiest way to avoid the cost is simply not to do it. Schools get few requests for decades old transcripts, so this is a good strategy most of the time. I recently requested transcripts from schools attended long ago. All of the schools could quickly snail mail an old transcript. It’s a relatively simple matter of printing from microfiche or copying a paper transcript and putting it in an envelope. It was interesting to learn that requesting an electronic transcript could actually take much longer than requesting a paper transcript. If the data never was migrated into the current system, requesting an electronic transcript requires that all of the data be manually keyed into the new system before the electronic transcript can be sent. That can take a long time in an academic bureaucracy.
Once legacy data is migrated into a modern system doesn’t mean that remains in digital form. One school reported that they lost a transcript that was sent to them electronically. I did not understand how that could happen, so I asked for an explanation. Once per day someone at the school (which will remain nameless) goes to the computer that receives the electronic transcripts. All of the electronic transcripts are printed as a batch. The printouts are distributed to the staff so that the data can be manually keyed into the academic records system. Sometimes the printouts of the electronic transcripts get misplaced. When the data is keyed in from the paper copy, errors can occur. Reconciliation of data across disparate systems is necessary to detect any errors and get them corrected.