Monday, October 31, 2011

The LC Bibliographic Framework Transition Initiative plan

The Bibliographic Framework Transition Initiative plan is available at:

http://www.loc.gov/marc/transition/news/framework-103111.html

The plan itself is a 10-page PDF:

http://www.loc.gov/marc/transition/pdf/bibframework-10312011.pdf

"Bibliographic framework" is intended to indicate an environment rather than a "format."

Key points

- Broad accommodation of content rules and data models. The new environment should be agnostic to cataloging rules, in recognition that different rules are used by different communities, for different aspects of a description, and for descriptions created in different eras, and that some metadata are not rule based.

- Provision for types of data that logically accompany or support bibliographic description, such as holdings, authority, classification, preservation, technical, rights, and archival metadata.

- Accommodation of textual data, linked data with URIs instead of text, and both.

- Consideration of the relationships between and recommendations for communications format tagging, record input conventions, and system storage/manipulation.

- Consideration of the needs of all sizes and types of libraries, from small public to large research.

- Continuation of maintenance of MARC until no longer necessary. It is recognized that systems and services based on the MARC 21 communications record will be an important part of the infrastructure for many years.

- Compatibility with MARC-based records.

- Provision of transformation from MARC 21 to a new bibliographic environment.

The new bibliographic framework project will be focused on the Web environment, Linked Data principles and mechanisms, and the Resource Description Framework (RDF) as a basic data model. The protocols and ideas behind Linked Data are natural exchange mechanisms for the Web that have found substantial resonance even beyond the cultural heritage sector. Likewise, it is expected that the use of RDF and other W3C (World Wide Web Consortium) developments will enable the integration of library data and other cultural heritage data on the Web for more expansive user access to information.
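
To make that concrete for myself, here is a minimal sketch, using Python and the rdflib library, of how a single bibliographic description might look as RDF triples with URIs rather than as a tagged record. The identifiers and the choice of Dublin Core terms are just assumptions for illustration, not anything the LC plan prescribes.

    # Illustrative sketch only: a bibliographic description as RDF triples.
    # The URIs and the Dublin Core vocabulary are assumptions for the example,
    # not part of the LC plan.
    from rdflib import Graph, Literal, Namespace, URIRef

    DCT = Namespace("http://purl.org/dc/terms/")  # Dublin Core terms

    g = Graph()
    g.bind("dct", DCT)

    book = URIRef("http://example.org/resource/some-book")  # hypothetical identifier
    g.add((book, DCT.title, Literal("An example title")))
    g.add((book, DCT.creator, URIRef("http://example.org/agent/some-author")))  # a URI instead of a text heading
    g.add((book, DCT.issued, Literal("2011")))

    # Serialize as Turtle, a common Linked Data exchange syntax.
    print(g.serialize(format="turtle"))

The point of the exercise is the data model: statements about identified things, linkable across institutions, rather than fields and subfields locked inside a record.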

You all may also want to look at the report from Stanford on Linked Data at
http://www-sul.stanford.edu/about_sulair/news_and_events/Stanford_Linked_Data_Workshop_Report_FINAL.pdf

Published in October 2011, the report was compiled by Michael A. Keller, Jerry Persons, Hugh Glaser, and Mimi Calter. (60 pages, PDF)

The preliminary project timetable
The Library of Congress will develop a grant application in the next few months. The two-year grant will provide funding for the Library of Congress to organize consultative groups (national and international) and to support development and prototyping activities. The work to be done, roughly over 2012 and 2013, includes developing models and scenarios for interaction within the information community; assembling and reviewing ontologies currently in use or under development; developing domain ontologies for the description of resources and related data in scope; and organizing prototypes and reference implementations.


Additional LC bibliographic framework transition links

Bibliographic Framework Transition Initiative Website
http://www.loc.gov/marc/transition/

Bibliographic Framework Transition Initiative Listserv
http://listserv.loc.gov/listarch/bibframe.html

Working Group on the Future of Bibliographic Control Website
http://www.loc.gov/bibliographic-future/

Thursday, October 27, 2011

Report of the Stanford Linked Data Workshop, 27 June – 1 July 2011 (published October 2011)

The Report of the Stanford Linked Data Workshop, 27 June – 1 July 2011 was published in October 2011. The report was compiled by Michael A. Keller, Jerry Persons, Hugh Glaser, and Mimi Calter. (60 pages, PDF)

The Stanford University Libraries (SULAIR) and the Council on Library and Information Resources (CLIR) held a week-long workshop on the prospects for a large-scale, multi-national, multi-institutional prototype of a Linked Data environment for discovery and navigation of academic information resources. The report summarizes the workshop, charts the use of Linked Data in cultural heritage venues, and includes short biographies and statements from each of the participants.

The accompanying survey is available at http://www.clir.org/pubs/archives/linked-data-survey/


With the assistance of other participants, the Stanford team will generate a model for a multi-national, multi-institutional discovery environment built on Linked Open Data, demonstrating to end users, our communities of researchers, the value of the Linked Data approach.

From the conclusion:
"Given the proliferation of URIs, whether RDF triples or more, from numerous sources it seems plausible to attempt to model and then construct a discovery and navigation environment for research purposes based on the open stores of RDFs becoming available. To many of us, this seems a logical next step to the vision of the hypertext/media functions in a globally networked world of Vannevar Bush, Ted Nelson, and Douglas Englebart. It is highly significant to us as well that Tim Berners-Lee, responsible for the launch of the World Wide Web, has led this line of thought through his publications and presentations and those of his colleagues at the University of Southampton, Wendy Hall and Nigel Shadbolt."

Tuesday, October 11, 2011

Final Report PCC ISBD and MARC Task Group September 2011 (78 p.) PDF

The _Final Report_ of the PCC ISBD and MARC Task Group (September 2011) has been posted online. 78 pages. PDF.

"The MARC21 community needs to transition to an environment where nearly all records created omit ISBD punctuation.

As an initial step in such a transition, the Program for Cooperative Cataloging called for the establishment of a task group to further investigate these issues. The PCC ISBD and MARC Task Group was established in March 2011 and was charged with the following: investigate the omission of ISBD punctuation from the cataloging process in favor of having cataloging interfaces generate punctuation needed for display; perform a field-by-field analysis of MARC to identify instances of embedded ISBD punctuation; and, identify the use of any non-ISBD punctuation present in fields."

The report is really well done. Robert Bremer, OCLC, chaired the group.
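
To see what "having cataloging interfaces generate punctuation needed for display" might mean in practice, here is a minimal Python sketch that adds ISBD-style punctuation to a title statement stored without it. The subfield layout and the rules shown are simplified assumptions for illustration, not the Task Group's specification.

    # Illustrative sketch only: generate ISBD-style display punctuation for a
    # MARC 245 (title statement) stored without embedded punctuation.
    # The subfield layout and the rules are simplified assumptions, not the
    # PCC Task Group's specification.

    def display_245(subfields):
        """subfields: list of (code, value) pairs, stored without ISBD punctuation."""
        parts = []
        for code, value in subfields:
            if code == "a":
                parts.append(value)            # title proper
            elif code == "b":
                parts.append(" : " + value)    # other title information
            elif code == "c":
                parts.append(" / " + value)    # statement of responsibility
        return "".join(parts) + "."

    print(display_245([("a", "Search patterns"),
                       ("b", "design for discovery"),
                       ("c", "Peter Morville and Jeffrey Callender")]))
    # Search patterns : design for discovery / Peter Morville and Jeffrey Callender.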

Monday, September 12, 2011

Thinking about how to use the Open Metadata Registry and HIVE to create vocabulary services for Yale's Cross Collection Discovery tool (CCD) and Yale's metadata workflow router and metadata editor, Ladybird

After talking today with Daniel Lovins (Emerging Technology Librarian at Yale), I have begun thinking about how to use the Open Metadata Registry and HIVE to create vocabulary services for Yale's Cross Collection Discovery tool (CCD) and Yale's Ladybird, a metadata editor/router. I have a long way to go, but Daniel has gotten me started on what should be a promising path for me and for Yale library. Thanks, Daniel.

The Metadata Registry provides services to developers and consumers of controlled vocabularies and is one of the first production deployments of the RDF-based Semantic Web Community's Simple Knowledge Organization System (SKOS).

HIVE is an automatic metadata generation approach that dynamically integrates discipline-specific controlled vocabularies encoded with the Simple Knowledge Organisation System (SKOS). HIVE assists content creators and information professionals with subject cataloging and provides a solution to the traditional controlled vocabulary problems of cost, interoperability, and usability.
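
For a sense of what sits underneath such a vocabulary service, here is a small Python/rdflib sketch of a SKOS-encoded concept and a lookup of its preferred label. The concept URI and labels are invented for illustration; the Open Metadata Registry and HIVE each provide much richer interfaces than this.

    # Illustrative sketch of a SKOS-encoded vocabulary term, using rdflib.
    # The concept URI and labels are invented; OMR and HIVE expose far richer
    # services around vocabularies encoded this way.
    from rdflib import Graph, Literal, URIRef
    from rdflib.namespace import SKOS

    g = Graph()
    concept = URIRef("http://example.org/vocab/cataloging")  # hypothetical concept URI

    g.add((concept, SKOS.prefLabel, Literal("Cataloging", lang="en")))
    g.add((concept, SKOS.altLabel, Literal("Cataloguing", lang="en")))
    g.add((concept, SKOS.broader, URIRef("http://example.org/vocab/library-science")))

    # A vocabulary service answers questions like "what is the preferred label?"
    for label in g.objects(concept, SKOS.prefLabel):
        print(label)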

Yale's Cross Collection Discovery (CCD) provides a way to search across Yale's collections of art, natural history, books, and maps, as well as photos, audio, and video documenting people, places, and events that form part of Yale's institutional identity and contribution to scholarship.

Yale's Ladybird is a staff tool that manages the workflow and routing of digitized materials (and associated metadata) to a DAM system and to the Web. In order to perform these tasks, the interface is used to create or edit a metadata record for the digital representation of the asset.

COMET (Cambridge Open METadata) project completed July 2011

The COMET (Cambridge Open METadata) project was completed in July 2011.

The final post on the COMET blog sums up the work done and the lessons learned, and indicates some next steps.

The CUL open data service is worth a look. (It is funded under the JISC Infrastructure for Resource Discovery program.)

I was particularly interested in the document on the ownership of MARC21 records by Hugh Taylor, Head of Collection Description and Development at Cambridge University Library. It is a nice brief on the issues of intellectual property law, contracts, and licences as they relate to MARC21 records in library catalogs. Hugh is very good on "reading the ownership of MARC 21 bibliographic records." The whole project is nicely documented at the COMET project blog.

Friday, August 12, 2011

Restoring the Sterling Memorial Library nave at Yale

Yale is beginning a project to restore the nave of the Sterling Memorial Library. Our new UL, Susan Gibbons, asked staff to share their ideas about how to restore the nave. Here is my idea.

I'll start with the past. The nave originally held the public card catalog. Because of that, and because Yale's scholarly information collections were for much of that time primarily paper-based and held mostly in Sterling Memorial Library, the nave/catalog was _the_ portal to Yale's collections. One had to use the catalog and the nave to use the library, and one had to use the library because that was where the knowledge was stored. Staff and users stood side by side in the nave at the catalog--creating it, maintaining it, and using it. The nave was vital to the life of the mind at Yale, and it was a vital center of the community life of scholars at Yale. That era is gone, and now the nave is empty. I refer to it among friends as the "Dead Zone." The soul of the nave has fled. Only the shell of it remains.

Let's turn to the future. We can only restore the nave by breathing a new soul into the space. Reverence for what was will not restore the nave. Nostalgia won't either. How do we breathe new life into that space?

The nave was alive when it was full of scholars (undergrads and up) and librarians using the great collections at Yale to advance their knowledge. We can restore the nave to life by bringing back purposeful scholars and librarians (not by attracting tourists). The restored nave must become again a tool for discovery and use of knowledge, a portal into a great scholarly workshop.

The nave (and such other spaces as the periodical, newspaper, L and B, and main reading rooms) must combine desirable spaces with available tools--digital and analog--in such a way as to become a sharable, collaborative workspace. It can't just be a bank of computers, or a coffee shop, or a classroom, or a study hall, or a co-working space--though it may need something of each of those; it must be a space for labor-intensive interactions among scholars, librarians, archivists, IT specialists, etc., and the collections--digital and analog. The key to the success of such a scholars' workshop is a mix of vision/conceptualization, tools (not just banks of computers), and, most importantly, skilled staff.

Monday, July 11, 2011

Is a Bookless Library Still a Library?

Time magazine has an article by that title written by Tim Newcomb, dateline July 11, 2011. Its main topic is Drexel's new library--built to be bookless. Newcomb quotes Danuta Nitecki, formerly at Yale library.

But I write about it here mainly for the question the article asks in its title but doesn't begin to answer. Is a Bookless Library Still a Library?
How we answer it has important consequences for those served by libraries.

It isn't a simple question at all. The difficulty lies first in the confusion between books as objects and what those objects carry. The Drexel library still has book content, journal content, and so on; it just doesn't have that content in the familiar codex format we call books--it has it in digital, online formats. Libraries without books are like banks without gold deposits in vaults or without paper currency or coins. A bank isn't a bank because it has money in one particular format; it's a bank because it provides services related to our use of money. We still call them banks--even though we don't have to walk into them anymore to use them.

Wednesday, June 29, 2011

The W3C Library Linked Data Incubator Group DRAFT REPORT

The W3C Library Linked Data Incubator Group has issued a draft report for commentary. Linked library data is an opportunity to bring library concepts and practices to the Web in ways that transcend any individual library and its individual limitations.

I quote here one line from the benefits section. "The Linked Data approach offers significant advantages over current practices for creating and delivering library data while providing a natural extension to the collaborative sharing models historically employed by libraries, archives, and museums ("memory institutions")." And one more from the benefits to "memory institutions." "By using Linked Data, memory institutions will create an open, global pool of shared data that can be used and re-used to describe resources, with a limited amount of redundant effort compared with current cataloguing processes." Cheaper, faster, better.

From a cataloger's viewpoint, library linked data is the ultimate cooperative cataloging environment and the ultimate user services environment.

The report
http://www.w3.org/2005/Incubator/lld/wiki/DraftReportWithTransclusion
includes these sections:

Benefits
Vocabularies and Datasets
Relevant Technologies
Implementation challenges
Recommendations

Two related parts are:

Use Cases, a survey report describing existing projects
http://www.w3.org/2005/Incubator/lld/wiki/UseCaseReport

Vocabularies and Datasets, a survey report
http://www.w3.org/2005/Incubator/lld/wiki/Vocabulary_and_Dataset

The LLD XG invites comments from interested members of the public.

Feedback can be sent as comments on individual sections posted on the dedicated blog at http://blogs.ukoln.ac.uk/w3clld/ or by email to the public mailing list (public-lld@w3.org, archived at http://lists.w3.org/Archives/Public/public-lld/ ) using descriptive subject lines such as '[COMMENTS] "Benefits" section.'

Comments are especially welcome in the next four weeks (through 22 July). Reviewers should note that as with Wikipedia, the text may be revised and corrected by its editors in response to comments at any time, but that earlier versions of a document may be viewed by clicking on the History tab.

It is anticipated that the three reports will be published in final form by 31 August.

Tuesday, June 14, 2011

RDA is a go (conditionally)

The Library of Congress, the National Agricultural Library, and the National Library of Medicine have issued an executive summary statement from their Executives on the Report and Recommendations of the U.S. RDA Test Coordinating Committee on the implementation of RDA—Resource Description & Access at http://www.nlm.nih.gov/tsd/cataloging/RDA_report_executive_summary.pdf

The cover statement by the executives of LC, NAL, and NLM is available at: http://www.nlm.nih.gov/tsd/cataloging/RDA_Executives_statement.pdf

The official statement is:


“We endorse the report, with the conditions articulated by the committee. Even though there are many in the library community who would like to see a single “yes” or “no” response to the question should we implement RDA, the reality is that any standard is complicated and will take time to develop. We also recognize that the library world cannot operate in a vacuum. The entire bibliographic framework will have to change along the lines recommended in the report of the Working Group on the Future of Bibliographic Control. The implementation of RDA is one important piece, but there are many others that must be dealt with simultaneously. We especially note the need to address the question of the MARC standard, suggested by many of the participants in the RDA test. As part of addressing the conditions identified, LC will have a small number of staff members who participated in the test resume applying RDA in the interim. This will allow LC to prepare for training, documentation, and other preparatory tasks related to the further development and implementation of RDA.

The conditions identified by the Test Coordinating Committee must be addressed immediately, and we believe that the Committee should continue in an oversight role to ensure that the conditions are met. We have discussed the Committee’s recommendations with the Library of Congress Working Group on the Future of Bibliographic Control. We will continue to work closely with the Working Group on the Future of Bibliographic Control to think about the overall direction of bibliographic control and the changes that are necessary to assure that libraries are in the best position to deliver twenty-first century services to users.

We believe that the long-term benefits of adopting RDA will be worth the short-term anxieties and costs. The Test Coordinating Committee quite rightly noted the economic and organizational realities that cause every librarian to ask if this is the time to make a dramatic change in cataloging. Our collective answer is that libraries must create linkages to all other information resources in this Web environment. We must begin now. Indefinite delay in implementation simply means a delay in our effective relationships with the broader information community.”

Monday, May 23, 2011

Library of Congress: Bibliographic Framework Transition Initiative

The Library of Congress has begun a "Bibliographic Framework Transition Initiative."

"A major focus of the initiative will be to determine a transition path for the MARC 21 exchange format in order to reap the benefits of newer technology while preserving a robust data exchange that has supported resource sharing and cataloging cost savings in recent decades."

"This work will be carried out in consultation with the format's formal partners -- Library and Archives Canada and the British Library -- and informal partners -- the Deutsche Nationalbibliothek and other national libraries, the agencies that provide library services and products, the many MARC user institutions, and the MARC advisory committees such as the MARBI committee of ALA, the Canadian Committee on MARC, and the BIC Bibliographic Standards Group in the UK."

This could make the RDA effort look like a piece of cake. How will the process be arranged to include these players?

Good luck, though. Sounds like fun to me. Let's get to work.

The press release has more information.

Monday, May 16, 2011

The MARC pilot project final report by Henriette D. Avram (1968) 173 p.


A bit of history today. The MARC pilot project launched library cataloging into the digital era in 1968. We are still using the MARC format today, but it is widely understood to be ready (or long overdue) for replacement.

As we bid to leave the MARC format behind, Avram's report on the pilot project is well worth perusing.

My employer, Yale University Library, was one of the participating libraries.

As I have continued to think about this today, I re-read Roy Tennant's 2004 article, A Bibliographic Metadata Infrastructure for the 21st Century, in which Tennant broadly identifies ways we must "assimilate MARC into a broader, richer, more diverse set of tools, standards, and protocols." Tennant sees that we don't need another bibliographic format; we need an infrastructure that accommodates wide diversity in formats. This is an excellent article, well worth reading for its application to our current situation nearly a decade after he wrote it.

We do need a replacement for MARC, one that will be better suited to the infrastructure Tennant outlines. The Network Development and MARC Standards Office at the Library of Congress, the Standards division at Library and Archives Canada, and the Bibliographic & Metadata Standards section at the British Library--the maintenance agencies staffed to care for MARC--should initiate a MARC replacement project.

Friday, May 13, 2011

Yale's Open Access policy

Yale's "Open Access" policy
Yesterday, Yale announced its new "open access" policy for online images of millions of objects housed in Yale's museums, archives, and libraries, and more than 250,000 images are available through a newly developed collective catalog.

The goal of the new policy is to make high-quality digital images of the public-domain material in Yale's vast cultural heritage collections openly and freely available. Yale is using a Creative Commons license, Attribution 3.0 Unported (CC BY 3.0), for the open access material. "This license lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation. This is the most accommodating of licenses offered. Recommended for maximum dissemination and use of licensed materials."

This policy is a big big success for Yale and its Office of Digital Assets and Infrastructure (ODAI) and the libraries, archives, and museums at Yale. Without a centralizing, coordinating agency on campus, I can't imagine that Yale University would have been able to make this decision and connect it to a tool for discovery across Yale's many cultural and scientific units such as the libraries at Yale, the Yale University Art Gallery, the Peabody Museum of Natural History, and the Yale Center for British Art. ODAI is proving its value to Yale.

Thursday, April 14, 2011

RDA and the eXtensible Catalog

Dave Lindahl and Jennifer Bowen, XCO Co-Executive Directors, wrote a brief statement that describes the benefits of implementing RDA for new metadata and discovery applications such as XC.

http://hdl.handle.net/1802/14588

In early March, Dave Lindahl and Jennifer Bowen met with the US RDA Test Coordinating Committee at the Library of Congress to discuss XC's partial implementation of RDA. The committee invited Dave and Jennifer to submit a written statement for inclusion as an Appendix to the group's final report, due out within the next month, which will include recommendations regarding whether and how the US national libraries (LC, NLM, NAL) will implement RDA.

From the statement:

"XC software represents the first live implementation of a subset of RDA in a FRBR‐based, non‐MARC environment. XC’s implementation of RDA has been led by individuals who have participated in the development of both the RDA Toolkit and the RDA vocabulary registry. XC’s use of RDA has also been informed by the real‐world requirements of actual working software, as well as through a user research process conducted at four ARL libraries."

and

"A community‐wide implementation of RDA within the library world will benefit not only users of the eXtensible Catalog, but also developers and users of other applications that make information about library collections accessible via the open web."

Monday, March 28, 2011

“What we find changes what we seek.”

A librarian at Yale, Daniel Lovins, just led me to a review of a new book by Peter Morville and Jeffrey Callender--_Search Patterns: Design for Discovery_ (2010, O'Reilly). The review I read is at "I'd rather be writing," a blog about technical communication trends: http://idratherbewriting.com/2011/03/28/book-review-search-patterns-by-peter-morville-and-jeffrey-callender/

The line I took for the title of this post was one singled out by Tom Johnson, and I think it sums up the value of facets within search tools and points to a larger truth about how searching and discovery change the searcher.

MADS/RDF Primer available for public review

The MADS/RDF Primer is available for public review.
Status: Final Public Review Document
Updated: 28 March 2011
Previous Version: 19 November 2010

MADS/RDF is a way to record data from the Machine Readable Cataloging (MARC) Authorities format for use in Semantic Web applications and Linked Data projects. MADS/RDF is a knowledge organization system (KOS) designed for use with controlled values for names (personal, corporate, geographic, etc.), thesauri, taxonomies, subject heading systems, and other controlled value lists. The MADS ontology has been fully mapped to SKOS. MADS/RDF is designed specifically to support authority data as used by and needed in the LIS community and its technology systems.

Note that MADS/RDF is intended mainly for those designing and implementing LIS technology systems.
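
For those implementers, here is a rough Python/rdflib sketch of what a single name authority might look like in MADS/RDF alongside its SKOS mapping. The namespace URI, class, and property names are my reading of the draft Primer and should be checked against it; the authority URI is invented.

    # Rough sketch of authority data in MADS/RDF, using rdflib.
    # The namespace URI, class, and property names below are my reading of the
    # draft Primer and should be verified against it; the authority URI is invented.
    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDF, SKOS

    MADS = Namespace("http://www.loc.gov/mads/rdf/v1#")  # assumed MADS/RDF namespace

    g = Graph()
    g.bind("madsrdf", MADS)
    g.bind("skos", SKOS)

    name = URIRef("http://example.org/authorities/names/doe-jane")  # hypothetical authority URI
    g.add((name, RDF.type, MADS.PersonalName))
    g.add((name, MADS.authoritativeLabel, Literal("Doe, Jane, 1950-", lang="en")))
    # Because the MADS ontology is mapped to SKOS, the same heading can also
    # carry a SKOS preferred label.
    g.add((name, SKOS.prefLabel, Literal("Doe, Jane, 1950-", lang="en")))

    print(g.serialize(format="turtle"))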


Now I'll wait to see how it is received by those who know ontologies far better than I do.

Thursday, March 24, 2011

Yale's new UL, Susan Gibbons, on the future of libraries (from ALCTS 2010 Midwinter)

A talk from ALCTS Midwinter meeting Jan. 15, 2010

http://www.ala.org/ala/mgrps/divs/alcts/resources/z687/gibbons.pdf
Time Horizon 2020: Library Renaissance
Susan Gibbons, Vice Provost and Andrew H. & Janet Dayton Neilly Dean, River Campus Libraries, University of Rochester

Gibbons says, "In thinking about the event horizon of 2010 to 2020, it is already clear that this will be a period of unprecedented change for libraries. More specifically this coming decade will mark the renaissance of technical services and a complete transformation of collection development. While the last ten years have witnessed a significant re-conceptualization of public services, it is technical services and collection development that will be at the center of the next significant phase of library transformation."

Read on.

New University Librarian for Yale: Susan Gibbons

Yale has hired a new university librarian: Susan Gibbons. See the following article from the Yale Daily News of Tuesday, March 22, 2011.

http://www.yaledailynews.com/news/2011/mar/22/new-university-librarian-headed-to-yale/

We're looking forward to a new UL.

Monday, March 7, 2011

A bit of required reading for academic librarians

On Lorcan Dempsey's blog: The Collections shift
http://orweblog.oclc.org/archives/002160.html

Read his blog, watch the embedded media, and read the reports and other writings he cites.

This is the starting place for today's required reading. Dempsey notes a few things he's read or heard of recently about "several central trends: the move to electronic, the managing down of print collections, and the curation of institutionally-generated learning and research resources." These are the big three transformative trends for academic libraries, and how librarians, libraries, and the universities they serve deal with them will determine the survival of academic libraries.

The "move to electronic" is more precisely the move from paper-based media to digital media and the resulting transformation in the economics of the distribution and manufacturing of written materials--artistic, scientific, business-oriented, scholarly, personal etc. The sub-strata of materials has changed and that change has transformed the economics of production, distribution, and use of written work [and photography and music and etc.] Relatively expensive, durable and scarce materials such as books, reports, and journals that had to be distributed by ship, plane, train, truck or cart are now produced, distributed and used digitally. That transformation is not yet complete, but the completion date is approaching at an accelerating pace and will soon be here.

One consequence for libraries is the diminished importance of existing print collections. Libraries must manage the print collection's shift from being central to the library's fulfillment of its mission for the university it serves to playing a peripheral role in the _library's_ enterprise. Surviving that transition will not be easy. Thriving through that transition is almost unthinkable for anyone who equates libraries with books or thinks anything like "libraries are all about the humanities." Many academic libraries will flounder in this transition.

Curation of "institutionally-generated learning and research resources" is an opportunity for universities and not necessarily an opportunity for their libraries. A unveristiy press, university research labs, university museums or archives, offices of public affairs, IT units, and such newly formed entities as Yale's Office of Digital Assets and Infrastructure may take on much of a unversities curatorial role for their "institutionally-generated learning and research resources." Additionally, curation may not be best done in a networked, digital environment on an institution by institution basis. Curation of "institutionally-generated learning and research resources" is likely to require the scale of the network itself to be successful. Universities will need to coalesce around discipline-based networks to curate their "institutionally-generated learning and research resources."

Friday, February 4, 2011

Metadata guidelines for the UK RDTF

Andy Powell and Pete Johnston, of Eduserv, with funding from JISC, have put up some high-level draft guidelines for "how metadata associated with library, museum and archival collections should be made available for the purposes of supporting resource discovery in line with the Resource Discovery Taskforce (RDTF) Vision."

See the guidelines at http://rdtfmetadata.jiscpress.org/

They are taking comments on the draft until Feb. 18.

You can read Powell's and Johnston's own commentary/announcement at their joint blog, eFoundations, at http://efoundations.typepad.com/efoundations/2011/02/metadata-guidelines-for-the-uk-rdtf.html

A few quotes from the guidelines.

"These guidelines have been developed such that they:

1. support the RDTF Vision;
2. are compatible with the outcomes of the JISC IE Technical Review meeting in London, Aug 2010;
3. are in line with Linked Data principles as far as possible;
4. are compatible with the W3C Linked Open Data Star Scheme;
5. are in line with Designing URI Sets for the UK Public Sector;
6. take into account the Europeana Data Model and ESE;
7. are informed by mainstream web practice and search engine behaviour and are broadly in line with the notion of “making better websites” across the library, museum and archives sectors."

"The guidelines are intended to help libraries, museums and archives expose existing metadata (and any new metadata that is created using existing practices) in ways that 1) supports the development of aggregator services and that 2) integrates well with the web of data. The intention is not to change existing cataloguing practice in libraries, museums and archives."

"RDTF metadata should be made openly available using one or more of three approaches, referred to below as the community formats approach, the RDF data approach and the Linked Data approach."

For what it is, it looks good. Powell and Johnston "believe that by putting this guidance in place it will be possible to create significantly more coherence in the way that metadata is created, managed and used across the library, archives and museum sectors than is currently the case."

Friday, January 14, 2011

Digital forensics

A CLIR report on digital forensics for born-digital collections is out: Digital Forensics and Born-Digital Content in Cultural Heritage Collections by Matthew G. Kirschenbaum, Richard Ovenden, and Gabriela Redwine, with research assistance from Rachel Donahue.

The report makes a case for applying digital forensics, an applied field originating in law enforcement, computer security, and national defense, to the archives and curatorial community, since libraries, special collections, etc. increasingly receive computer storage media (and sometimes entire computers) as part of their acquisitions of "papers" from artists, writers, musicians, etc. Upwards of 90 percent of the records (i.e. personal and corporate "papers") being created today are born digital (Dow 2009, xi).

Here's a quote from the introduction: "Digital forensics therefore offers archivists, as well as an archive’s patrons, new tools, new methodologies, and new capabilities. Yet as even this brief description must suggest, digital forensics does not affect archivists’ practices solely at the level of procedures and tools. Its methods and outcomes raise important legal, ethical, and hermeneutical questions about the nature of the cultural record, the boundaries between public and private knowledge, and the roles and responsibilities of donor, archivist, and the public in a new technological era."

This report cites an earlier one that sounds good, too. "The starting place for any cultural heritage professional interested in matters of forensics, data recovery, and storage formats is a 1999 JISC/NIPO study coauthored by Seamus Ross and Ann Gow and entitled Digital Archaeology: Rescuing Neglected and Damaged Data Resources. Although more than a decade old, the report remains invaluable."

Monday, January 10, 2011

OCLC report on managing print collections in mass-digitized library world

Malpas, Constance. 2011. Cloud-sourcing Research Collections: Managing Print in the Mass-digitized Library Environment. Dublin, Ohio: OCLC Research. http://www.oclc.org/research/publications/library/2011/2011-01.pdf.

Cloud-sourcing Research Collections is a 76-page PDF analysis of the feasibility of outsourcing the management of low-use print books held in academic libraries to shared service providers, including large-scale print and digital repositories.

Mass digitization projects like Google Books and shared online collections like the HathiTrust have given substance to visions of a transformation of library use from paper to online resources. This "flip," together with related demands for physical space and for the care of paper resources, has resulted in renewed attention to print collections in academic libraries. This is the time for discussion within and among research libraries on how to construct new systems of services based on aggregations of digital resources, local paper collections, and shared storage repositories for online and paper resources.

The report's main conclusion is:

"Based on a year-long study of data from the HathiTrust, ReCAP, and WorldCat, we concluded that our central hypothesis was successfully confirmed: there is sufficient material in the mass-digitized library collection managed by the HathiTrust to duplicate a sizeable (and growing) portion of virtually any academic library in the United States, and there is adequate duplication between the shared digital repository and large-scale print storage facilities to enable a great number of academic libraries to reconsider their local print management operations. Significantly, we also found that the combination of a relatively small number of potential shared print providers, including the Library of Congress, was sufficient to achieve more than 70% coverage of the digitized book collection, suggesting that shared service may not require a very large network of providers."

This points a way forward for academic libraries. The report might be an interesting frame for a discussion at Yale of how we think of our collections in this environment and how we move to use the environment to create services for readers. It is one of the few reports that integrates questions of online resources with those of paper resources. That kind of integrated approach to collections, preservation, and user/reader services makes a lot more sense than digital-only or print-only approaches.