How can we be sure we'll remember our digital past?
As technology evolves, data from outmoded machines is put at risk; panel addresses pathways and costs.
The problem of digital preservation reaches across two standards. There's the media – floppies, CDs, hard drives – and the format of the files themselves – does it run in DOS, Hypercard, ClarisWorks 2.0?
Microsoft tackles this issue of "legacy" computing by running a kind of corporate museum. The company protects its multiplatform history by preserving old copies of "every major hardware and software change," says Lee Dirks, director of Scholarly Communications at Microsoft and a task force member.
"We've got computers stored on campus that go back to the Altair, the first computer [to run Microsoft software]," he says. "In fact, we bought multiple copies of the Altair just in case."
But maintaining antique computers is a costly way to keep the past alive.
A concept that is gaining momentum, Mr. Dirks says, is emulation, where programmers trick modern computers into thinking the way their classic cousins did. This lets them run old software without retro machines. But a problem arises when the emulator itself is written for the last generation's operating systems. Do you write an emulator to handle the original emulator?
A more likely approach to long-term preservation is migration, says Berman. This calls for updating the file format every generation – without changing the contents, one hopes. This method has problems as well. Some of the original context will be lost in translation, says Dirks. Also, the scale of the conversion will snowball as the number, size, and back-catalog of files increase with each passing generation of technology.
For example, after one year of photographing the night sky, LSST will likely produce more digital information on space than all past efforts combined, says Berman.
"So, again, who will pay for this?" asks Berman. "I don't expect being able to tell anyone in two years that it will be free."
A pay model?
While panel members are careful not to discuss possible recommendations in too much detail this early in the project, several of them mentioned basic economic models for making data accessible and sustainable.
They include an iTunes-style pay-per-use model, where users would be charged to download old books, census data, etc.; a privatized model, where businesses that already host pictures or files online agree to keep them for decades into the future; or a public-good model, where governments or endowments fund preservation.
"I think it's unlikely that we'll map out, 'Well, this fee structure will go to this kind of data and this model is for this industry,'" says Amy Friedlander, who serves on the task force and is director of programs at the Council on Library and Information Resources in Washington. "We should assume there will be a mix of strategies, because no model is mutually exclusive."
Deciding which files are worth saving is a judgment call the panel will leave to the community.
While the task force spends the next two years reflecting, many other alliances will be researching the problem as well.
At the same time that it's funding the Blue Ribbon Task Force, the National Science Foundation has offered $100 million for five organizations to design a "DataNet" for sustainable data preservation "over a decades-long timeline."