Volume 38

Number 23

July 20, 2006

More to be digitized with new ULS scanner
Digital Research Library scanner technician Ted Tarka uses a new high-speed scanner to begin digitizing Pitt's Darlington Collection.
From a windowless room on the third floor of an office building in Point Breeze, a handful of Pitt employees are poised to make their digital mark on human history. Armed with stacks of antique books, a scanner and library administrators' willingness to share, Pitt's Digital Research Library is a partner in a project launched in 2005 by Yahoo! and the Internet Archive to build a searchable digital collection of the world's books and multimedia content.
Pitt's contribution to the Open Content Alliance (www.opencontentalliance.org) will start with the Darlington Collection, known as one of the University's richest sources of information on western Pennsylvania and Ohio Valley history. The books themselves, housed in the Darlington Library in the Cathedral of Learning, won't be going anywhere, but the University Library System's acquisition of a new scanner means the Darlington's historic content will have a broader audience than ever before.
"We have pledged to contribute digital books on Americana," said ULS director Rush G. Miller, noting that the content of some 600 works already is available on line as part of the Historic Pittsburgh collection (a collaboration among Pitt, the Historical Society of Western Pennsylvania and the Carnegie Museum of Art).
The Open Content Alliance plans to launch a digital collection of Americana on line later this year, Miller said. "This is exactly what they're looking for," Miller said of the Darlington's books and maps. "It's a marvelous collection of early colonial history. This is a treasure that's a collection of the University that isn't well known," he said, adding that digitizing will allow historians and researchers to discover the collection's thousands of books.
"When we put something like this on line, the use is 100 times more," he said, adding that his goal is to have the project completed within two years.
The Digital Research Library already has placed a number of materials on line, including former Pennsylvania Gov. Dick Thornburgh's papers, a collection of 19th-century schoolbooks and more than 40 collections of images, including detailed photos of Chartres Cathedral.
To facilitate the Open Content Alliance project and increase the University's capacity to digitize more of its holdings, ULS has purchased a $100,000-plus high-speed scanner. In addition to eliminating constraints on the size and type of works that can be scanned, the new machine also will increase scanning speed. Manufacturer's estimates peg the scanner's speed at up to 500 pages per hour. "If we could get 200 pages an hour, we'd be happy," Miller said.
"What we want to do is build a fairly robust internal capability to be able to tackle major projects," he said. Digital Research Library coordinator Ed Galloway, three librarians and two scanner technicians now are working out the technical details of how to make that happen.
With much excitement but little fanfare, the new scanner arrived in late June and was installed at the Digital Research Library, located in the University's Thomas Boulevard facility in Point Breeze.
The size of a large desk, the scanner can hold oversized or fragile books, which previously either had to be scanned by outside vendors or could not be scanned at all.
To an outsider, the DigiBook SupraScan, manufactured by the French firm i2S, looks like a simple combination of a camera, light source and book holder. To the librarians, the machine represents the ability to broaden the range of the digitizing that can be done in-house.
"At some point you realize we're limited on what we can do because we have to rely on vendors," Galloway said. For example, the Digital Research Library couldn't scan maps, large books or one-of-a-kind items.
A cartload of about 100 books from the Darlington Library, known simply as "batch 1," has been brought to the Digital Research Library for their moment in the spotlight. They come in a range of sizes, widths, ages and levels of fragility, but all have one thing in common: They predate 1923, a magic year for those wanting a quick assurance that there's no risk of copyright violations. "Pre-1923 books are all in the public domain," Galloway explained.
Scanner technician Ted Tarka demonstrated the ease with which the machine works. He deftly placed a book in the cradle and closed the glass that holds it in place. Moving to a nearby computer monitor, he adjusted for margins and quality, pushed a button and the pages were scanned and saved in digital form.
With the new scanner, the Digital Research Library no longer will have to rely on outside firms. "We desired to control our own destiny and gain the ability to scan large book collections," Galloway said, explaining that until now, books could not be scanned unless they were taken apart to place on flatbed scanners. In the past, to digitize books, "We found duplicate copies and disbound them," he said. Needless to say, given the value of the Darlington works, "It's no longer in our best interest to disbind the books," Galloway said.
The staff already are taking a second look at materials they had to pass up in earlier projects. "We're going back to find things we couldn't do before," Galloway said.
But it's not as simple as diving in and beginning to copy each page. Decisions need to be made, esthetic considerations determined.
"We're trying to replicate the feel of a book that users are used to having," said Digital Research Library librarian Michael Bolam. That means deciding whether to scan in color, grayscale or simply black-and-white; correcting for the curvature of the books; and ensuring that the on-line quality is as close as possible to the book itself, all the way down to the markings and discolorations one might expect to see in antique books.
Technical aspects also need to be considered: Higher quality scans take longer, as do color scans. At the highest resolution, the Darlington digital collection could total some 48 terabytes (48,000 gigabytes) or, at a lower resolution, as little as 300 gigabytes. And there's post-production to consider.
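To put the two storage estimates side by side, the arithmetic can be sketched in a few lines of Python. This is only an illustration of the figures quoted above; it assumes the decimal convention of 1,000 gigabytes per terabyte that the article's own conversion implies, and the variable and function names are invented for the example.

```python
# Decimal convention, matching the article's "48 terabytes (48,000 gigabytes)".
GB_PER_TB = 1000


def terabytes_to_gigabytes(terabytes):
    """Convert terabytes to gigabytes using the decimal convention."""
    return terabytes * GB_PER_TB


# Estimates quoted in the article for the whole Darlington digital collection.
high_res_gb = terabytes_to_gigabytes(48)  # highest-resolution estimate
low_res_gb = 300                          # lower-resolution estimate

# How many times larger the high-resolution collection would be.
ratio = high_res_gb / low_res_gb

print(f"Highest resolution: {high_res_gb:,} GB")
print(f"Lower resolution:   {low_res_gb:,} GB")
print(f"Size ratio:         {ratio:.0f}x")
```

By these figures, the highest-resolution version of the collection would take roughly 160 times the storage of the lower-resolution one.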
Having the works in digital form is useless unless the information is accessible. "There's the whole other process of making it searchable," Galloway said. "That's all done here."
Acknowledging the Digital Research Library's current state of rapid growth, "This will give us the capacity to do more," Galloway said.
Kimberly K. Barlow