Friday, December 3, 2010

Robert Darnton The Library: Three Jeremiads from the NY Review of Books, Dec. 23, 2010

The Library: Three Jeremiads from the NY Review of Books, Dec. 23, 2010
http://www.nybooks.com/articles/archives/2010/dec/23/library-three-jeremiads/?pagination=false

Excellent piece on the crises in research libraries. It is all about money, and the lack of it.

3 Jeremiads, 3 problems

1st: Monographs and scholarship
" ... a vicious circle: the escalation in the price of periodicals forces libraries to cut back on their purchase of monographs; the drop in the demand for monographs makes university presses reduce their publication of them; and the difficulty in getting them published creates barriers to careers among graduate students."

"Another rule of thumb used to prevail among the better university presses. They could count on research libraries purchasing about eight hundred copies of any new monograph. By 2000 that figure had fallen to three or four hundred, often less, and not enough in most cases to cover production costs. Therefore, the presses abandoned subjects like colonial Latin America and Africa."

2nd: Journals
"A few years later, “sustainability” had become a buzz word, and the inflationary spiral of journal prices had continued unabated. In 2007 I became director of the Harvard University Library, a strategic position from which to take the full measure of the business constraints on academic life. Although economic conditions had worsened, the faculty’s understanding of them had not improved."

"How many professors in chemistry can give you even a ballpark estimate of the cost of a year’s subscription to Tetrahedron (currently $39,082)?"

"At Harvard we developed a new model. By a unanimous vote on February 12, 2008, professors in the Faculty of Arts and Sciences bound themselves to deposit all of their future scholarly articles in an open-access repository to be established by the library and also granted the university permission to distribute them."

3rd: Google Books
"The fundamental incompatibility of purpose between libraries and Google Book Search might be mitigated if Google could offer libraries access to its digitized database of books on reasonable terms. But the terms are embodied in a 368-page document known as the “settlement,” which is meant to resolve another conflict: the suit brought against Google by authors and publishers for alleged infringement of their copyrights."

"Despite its enormous complexity, the settlement comes down to an agreement about how to divide a pie—the profits to be produced by Google Book Search: 37 percent will go to Google, 63 percent to the authors and publishers. And the libraries? They are not partners to the agreement, but many of them have provided, free of charge, the books that Google has digitized. They are being asked to buy back access to those books along with those of their sister libraries, in digitized form, for an “institutional subscription” price, which could escalate as disastrously as the price of journals."

"... my happy ending: a National Digital Library—or a Digital Public Library of America (DPLA), as some prefer to call it."

Monday, November 15, 2010

Open Bibliographic Data Guide: JISC study on the business cases for Open Bibliographic Data

http://obd.jisc.ac.uk/

This links to _The Guide to Open Bibliographic Data_, which JISC developed on behalf of its partners in the Resource Discovery Task Force. It is about the business cases for Open Bibliographic Data – releasing some or all of a library’s catalog records for open use and re-use by others. The Guide uses 17 use cases to explore:
* How to license the data
* Legal issues to be considered
* Potential costs and savings
* Practical implications in terms of processes, effort and skills
* Data formats and other technical options

The assumed rationale is about discoverability, and it gains credibility the more our resources are discovered from ‘out there’ (through services such as Google) and not from ‘in here’ (through the local OPAC). --most of the above quoted or modified slightly from the Guide.

A PDF version of the use cases is at http://obd.jisc.ac.uk/wp-content/uploads/2010/11/Open-Bibliographic-Data-The-Use-Cases.pdf

On a quick review, the case studies seem useful: specific, brief, comprehensive. Each use case includes sections on description, motivation, benefits, consequences, rights & licensing, practicalities and costs.

A table of use cases and examples is at http://obd.jisc.ac.uk/examples

An example for use case 1 (publish data for unspecified use) is Open Library http://openlibrary.org and another is Cambridge U. Library http://openbiblio.net/2010/10/05/jisc-openbibliography-cul-data-release/

An example for use case 2 (publish open Linked Data for unspecified use) is Libris, the joint catalogue of the Swedish academic and research libraries http://libris.kb.se/

Wednesday, November 10, 2010

Understanding linked data and its potential for libraries

I've been hearing a lot about linked data in the past year, and I find that I'm very fuzzy about what linked data is and how it matters or might matter to libraries, to organizations that have libraries and to people who may use libraries.

My first question is What is linked data? A good starting place for me is the definition at http://linkeddata.org

"Linked Data is about using the Web to connect related data that wasn't previously linked, or using the Web to lower the barriers to linking data currently linked using other methods. More specifically, Wikipedia defines Linked Data as 'a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF.'" --linkeddata.org, read Nov. 10, 2010.

I checked Wikipedia for a definition and found a slightly different, more technical definition of linked data.

"Linked Data is a sub-topic of the Semantic Web. The term Linked Data is used to describe a method of exposing, sharing, and connecting data via dereferenceable URIs on the Web." --Wikipedia, read on Nov. 10, 2010.

Of course, I had no idea what "dereferenceable URIs" are. Well, a dereferenceable URI is one that works the normal and obvious way that links on the Web work: you can look the URI up, and the web server returns a copy of the page (or other data) that the URI refers to.

I'm in a technical vocabulary thicket, and I don't want to be. The vocabulary may be useful later, but not now. I need to put it in my own words or into words I understand.

Let me try working with that definition cited by linkeddata.org: "... a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF."

Linked data is a way to expose, share, or connect to data on the Web so that the data can be understood by, or is meaningful to, other machines on the Web. This is how Linked Data is a sub-topic of the Semantic Web. Additionally, Linked Data uses URIs as names for things and RDF as the data model, so that statements about resources (in particular Web resources) are made in the form of subject-predicate-object expressions. These expressions are known as triples.

Well, that is making sense to me, but I don't know that it would make much sense to anyone else, or be seen by anyone else as an improvement over the other available definitions. It helps me, though. That is enough for now.
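To make the idea of a triple concrete for myself, here is a minimal sketch in Python. It uses plain tuples rather than a real RDF library, and the example.org URIs, the title, and the author name are all made up for illustration. The predicate URIs are the Dublin Core and FOAF terms as I understand them.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical URIs naming a book and a person (illustration only).
BOOK = "http://example.org/book/1"
AUTHOR = "http://example.org/person/42"

# Predicates borrowed from the Dublin Core terms and FOAF vocabularies.
DC_TITLE = "http://purl.org/dc/terms/title"
DC_CREATOR = "http://purl.org/dc/terms/creator"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

triples = [
    Triple(BOOK, DC_TITLE, "An Example Book"),   # object is a literal
    Triple(BOOK, DC_CREATOR, AUTHOR),            # object is another URI
    Triple(AUTHOR, FOAF_NAME, "A. N. Author"),
]


def objects_of(subject: str, predicate: str, data: list[Triple]) -> list[str]:
    """Return every object asserted for the given subject and predicate."""
    return [t.obj for t in data if t.subject == subject and t.predicate == predicate]


def is_uri(value: str) -> bool:
    """Crude check: treat values starting with http:// or https:// as URIs."""
    return value.startswith(("http://", "https://"))


# Follow the link from the book to its creator, then look up the creator's name.
for creator in objects_of(BOOK, DC_CREATOR, triples):
    print(creator, objects_of(creator, FOAF_NAME, triples))
```

The part that matters to me is the second triple: because its object is itself a URI, a machine can follow it to more statements about the author, which is the "linked" part of linked data.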

Wednesday, October 20, 2010

Ed Summers on the linked data release from the Deutsche Nationalbibliothek

See Ed Summers' comments on the DNB release of linked library data at
http://inkdroid.org/journal/2010/10/19/linked-library-data-at-the-deutschen-nationalbibliothek/

Summers' piece is worth reading.

Here are some highlights.

The Deutsche Nationalbibliothek (DNB) has released linked library data for

■ 1.8 million authors from the Personennamendatei (PND)
■ 1.3 million corporate bodies from the Gemeinsame Körperschaftsdatei (GKD)
■ 187,000 subject headings from the Schlagwortnormdatei (SWD)
■ 51,000 Dewey Decimal Classification categories

The full dataset that the DNB has made available for download amounts to 38,849,113 individual statements (aka triples).

See the DNB announcement at http://lists.w3.org/Archives/Public/public-lod/2010Oct/0016.html

This is a huge event for _library_ uses of linked data, and exemplary behavior from DNB. Other research and national libraries should emulate the DNB.

Summers cites Herta Müller's authority information as an illustration and notes the use of RDA vocabularies, which are also available as linked data. "RDF vocabularies are explicit ways of describing resources like people, places, topics, etc. When different things are described using the same vocabulary (or the vocabularies themselves are related together in a particular way) it becomes possible to merge the descriptions, and build software on top of it."

"Another really interesting thing to note about this RDF for Herta Müller are the links to Wikipedia (http://de.wikipedia.org/wiki/Herta_M%C3%BCller), VIAF (http://viaf.org/viaf/12324250) and dbpedia (http://dbpedia.org/resource/Herta_M%C3%BCller). These are important because they contextualize the DNB record for Herta Müller by relating it to other records for her, thus allowing it to be disambiguated from records describing other people named Herta Müller."

"[Summers] did some quick and dirty analysis of the full data dump from the DNB and found: 3,569,402 links to VIAF and 40,136 links to dbpedia (the Linked Data version of Wikipedia)."
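A count like this is easy to picture in code. What follows is only my own rough sketch of the idea, not Summers' actual method: it takes a few triples and tallies the outbound links by host name. The DNB record URI is a made-up placeholder, and the choice of owl:sameAs and foaf:page as predicates is my assumption, not something I checked against the DNB data. The VIAF, dbpedia and Wikipedia URLs are the ones quoted above.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed predicates (not verified against the DNB data dump).
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
FOAF_PAGE = "http://xmlns.com/foaf/0.1/page"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical URI standing in for the DNB's own identifier for the record.
DNB_RECORD = "http://example.org/dnb/herta-mueller"

triples = [
    (DNB_RECORD, OWL_SAME_AS, "http://viaf.org/viaf/12324250"),
    (DNB_RECORD, OWL_SAME_AS, "http://dbpedia.org/resource/Herta_M%C3%BCller"),
    (DNB_RECORD, FOAF_PAGE, "http://de.wikipedia.org/wiki/Herta_M%C3%BCller"),
    (DNB_RECORD, FOAF_NAME, "Herta Müller"),  # a literal, not a link
]


def count_links_by_host(data):
    """Tally triples whose object is a URL, grouped by the URL's host name."""
    counts = Counter()
    for _subject, _predicate, obj in data:
        host = urlparse(obj).netloc
        if host:  # literals such as names have no host and are skipped
            counts[host] += 1
    return counts


for host, n in sorted(count_links_by_host(triples).items()):
    print(host, n)
```

Run over the full dump instead of four triples, the same kind of tally would give figures like the VIAF and dbpedia counts that Summers reports.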

Summers goes on to talk a bit about what more needs to be done.

"What remains to be done to some extent is leveraging this contextual information around our data in Library Applications, both cataloging, metadata enrichment applications and end user facing discovery applications."


There is a lot more in his piece, and links to many related tools, projects, and activities.

Monday, August 16, 2010

Getting on the cloud

On Friday, I began renting a virtual server from linode.com. It's a Linux thing. I'm using Ubuntu. Renting the virtual server is cheap and pretty easy. My colleague, Daniel Lovins, is helping me with advice, encouragement and answers to a few dumb questions. (Thanks for the hand-holding and the example, Daniel.) Today, with Daniel's help, I installed Apache and MySQL. I'm starting to learn vi, too. My plan is to set up Drupal, an instance of VuFind, and the eXtensible Catalog Metadata Toolkit (XC MST), and then I'll see what I can do with these tools in this environment. I have a lot to learn, but I feel that I am now able to actually play with the right toys.

Monday, August 9, 2010

Dempsey on supply and demand

Lorcan Dempsey's Aug. 8, 2010 blog posting on "Sorting out demand" is insightful and useful. These ideas were his 3rd top trend, as presented at the 2010 LITA Top Tech Trends panel at the ALA Annual Conference. http://orweblog.oclc.org/archives/002124.html

His argument about the shift (in libraries) from a focus on managing supply to "sorting out demand" is an economic one. Costs to the user, in time, effort or money, drive what services the library can provide and the "right" structure for the library. Libraries in the 20th century reduced the user's supply-based transaction costs by integrating the many sources of supply and bringing them close to the user. Libraries in the 21st century (or this early part of it) must also reduce user costs, but those will be different costs, as the supply transaction costs are falling due to the effect of digital formats and the network. Dempsey mentions several examples of how libraries can provide services on the demand side: recommendations, contextualizing content for particular communities, connective services, tailoring content to purpose, and managing institutional assets. Each of these examples is interesting and promising, but none seems as compelling as the 20th century economic rationale for libraries.

I continue to think that libraries as we know them will be changed in ways very like bookstores and printed journals or newspapers; they will not serve the necessary local distribution needs as well as a globally networked provider. Some will survive because of local peculiarities or because they develop special services for their community--these may be the same thing. Research libraries will become more research museum-like, that is, more artifact-centric. But how a university, for instance, manages its institutional digital repositories and its licensed online resources may make more sense outside of the library. Yale has created a university-wide Office of Digital Assets and Infrastructure that, though in its institutional infancy, is clearly the focal point for digital repositories at Yale, digital preservation at Yale, and discovery of resources across the university's many collections in libraries, archives, museums, etc. The changing economics will change the institutional structures. The more radical the changes in economics, the more radical the resulting institutional changes.

Lastly, Dempsey closes his post with a link to Dan Chudnov's prescient 2006 post "help people build their own libraries" http://onebiglibrary.net/story/because-this-is-the-business-weve-chosen

Friday, August 6, 2010

catalogs and scholarship: a bit of sentiment

OK, it must be Paul Courant day here or something like that. I was looking at his blog post about the closing and more specifically the removal of the U. Michigan card catalog, and in his discussion of responses to the removal he spoke of his own sentimental response. These are the lines that struck me:

"... I’ll always remember the card catalog as the rich, powerful and brilliant piece of scholarship that it was, and as a place that I visited in eager anticipation of learning something new. I don’t think that I was ever disappointed."

The catalog as a work of scholarship is a view one rarely hears anymore, but it is the correct view of the card catalog of a research university. Those catalogs were the heart of the heart of the university, or the soul in the machine, or whatever lovely, sentimental phrase you most prefer to use. When one was "in" the catalog following a citation or browsing an author's works or perusing a subject, one was in more or less the active mind of the library and thus one could imagine it as the mind or memory of the university or of scholarship itself.

If the catalog was a work of scholarship, then the cataloger was a scholar. And in that I think we can feel the loss that many catalogers feel when they consider the past 30 years and look ahead to the future of cataloging and libraries: their work as scholars is at an end. It is possible to see a descending arc--the catalog as scholarship to the catalog as information repository to the catalog as a database. And the arc descends for the cataloger from scholar to information manager to data assistant. This view is a sentimental one and a depressing one; it is not objectively true, but I know that it feels true to many catalogers. It is part of the sorrow of catalogers that many lament the loss of status as scholars and don't feel any warmth for the status of a data-centric programmer.

Paul Courant's talk at OCLC: "Economic Perspectives on Academic Libraries"

Paul Courant's recent talk at OCLC is now online. It is called, "Economic Perspectives on Academic Libraries," and it is worth a look and listen. http://www.oclc.org/research/news/2010-08-05.htm

His talk and slides take about 70 minutes. There is a Q&A video, too, of another 15 minutes or so.

His talk is interesting just to get his perspective as a university librarian (U. Michigan) and as a former provost (also at UM) and as an economist (on faculty at UM).

Summary: A library is a complicated institution, a big nonprofit business that supports the mission of an even bigger nonprofit business. It plays essential roles in the production and distribution of scholarship, which can be understood as an industry. For over a century, a library's focus has been almost entirely on the interests of the local institution and its local value has been almost entirely dependent on its role in sharing the costs of expensive information. However, digital information technology is radically altering the value of that focus and that role. A library is profoundly affected by both the emerging role of the network and by the fact that copying and distribution (of books, articles, etc.) are now very cheap. Courant develops these themes and shares some of his thoughts on the effective and efficient functioning of academic libraries.

There is no transcript to read, but there are 17 slides to view, and both an MP3 to listen to and a Webcast to watch and hear.

His blog is at http://paulcourant.net/ and is called Au Courant.

Thursday, July 29, 2010

two surveys

Two surveys worth looking at. One by Ithaka S+R and one by BL/JISC.
The Ithaka S+R report:
Faculty Survey 2009: Key Strategic Insights for Libraries, Publishers, and Societies (April 7, 2010) at
http://www.ithaka.org/ithaka-s-r/research/faculty-surveys-2000-2009/Faculty%20Study%202009.pdf

The BL/JISC report:
Researchers of Tomorrow: A three-year (BL/JISC) study tracking the research behaviour of 'Generation Y' doctoral students. Annual Report 2009-2010 (June 2010) at
http://explorationforchange.net/attachments/056_RoT%20Year%201%20report%20final%20100622.pdf

Each is excellent and worth reading, but my first impression is that nothing here is surprising. Digital technologies are transforming scholarship and communication, and the relation of libraries (and archives and museums) to scholars (faculty and graduate students) is changing. The need for libraries as direct intermediaries between scholars and local collections is lessening. Digital technologies present new opportunities for scholarship and communication, but institutions--publishers, societies, libraries, universities--have been slow to capitalize on them in any coordinated way. Researcher behaviors have adapted to the new technologies, but researchers have so far held on to traditional attitudes, values and skills in the evaluation and use of sources.

Wednesday, April 7, 2010

Digital information seeker--a report on OCLC, RIN, and JISC projects

The Digital Information Seeker: Report of findings from selected OCLC, RIN and JISC user behaviour projects is out. http://www.jisc.ac.uk/media/documents/publications/reports/2010/digitalinformationseekerreport.pdf

It was produced for JISC by Lynn Silipigni Connaway, PhD and Timothy J Dickey, PhD, OCLC Research. Dated Feb. 15, 2010.

This report gives a really nice look at the landscape of user studies. Its 61 pages are a succinct review of a selected sample of studies.

The report was not intended to be definitive. It provides a synthesis. The report makes it easier for librarians and other information professionals to better understand the information-seeking behaviors of libraries’ intended users. It also makes it easier to review the issues associated with developing information services and systems to best meet users’ needs.

The 12 studies included in this report:

Perceptions of libraries and information resources (OCLC, December 2005),
http://www.oclc.org/us/en/reports/2005perceptions.htm

College students’ perceptions of libraries and information resources (OCLC, April 2006),
http://www.oclc.org/us/en/reports/perceptionscollege.htm

Sense-making the information confluence: The whys and hows of college and university user satisficing of information needs (IMLS/Ohio State University/OCLC, July 2006),
http://www.oclc.org/research/projects/imls/default.htm

Researchers and discovery services: Behaviour, perceptions and needs (RIN, November 2006),
http://www.rin.ac.uk/our-work/using-and-accessing-information-resources/researchers-and-discoveryservices-behaviour-perc

Researchers’ use of academic libraries and their services (RIN/CURL, April 2007),
http://www.rin.ac.uk/our-work/using-and-accessing-information-resources/researchers-use-academiclibraries-and-their-serv

Information behaviour of the researcher of the future (CIBER/UCL, commissioned by BL and JISC, January 2008),
http://www.jisc.ac.uk/media/documents/programmemes/reppres/gg_final_keynote_11012008.pdf

Seeking synchronicity: Evaluating virtual reference services from user, non-user and librarian perspectives (OCLC/ IMLS/ Rutgers, June 2008),
http://www.oclc.org/research/projects/synchronicity/default.htm

Online catalogs: What users and librarians want (OCLC, March 2009),
http://www.oclc.org/us/en/reports/onlinecatalogs/default.htm

E-journals: Their use, value and impact (RIN, April 2009),
http://www.rin.ac.uk/our-work/communicatingand-disseminating-research/e-journals-their-use-value-and-impact

JISC national e-books observatory project: Key findings and recommendations (JISC/UCL, November 2009),
http://www.jiscebooksproject.org/

Students’ use of research content in teaching and learning (JISC, November 2009),
http://www.jisc.ac.uk/media/documents/aboutus/workinggroups/studentsuseresearchcontent.pdf

User behaviour in resource discovery (JISC, November 2009),
http://www.jisc.ac.uk/whatwedo/programmes/inf11/userbehaviourbusandecon.aspx

Implications for libraries:

• Each library serves many constituencies with different needs and behaviors.
• Each library must do better at providing seamless access to resources.
• Each library must recognize that more digital resources of all kinds are better for users.
• Each library must prepare for changing user behaviors.
• Each library's access tools need to look and function more like search engines and Web services since these are familiar to users and they are comfortable and confident in using them.
• Each library must value high-quality metadata for its resources since metadata is vital for discovery.
• Each library must better promote its brand, its value, and its resources within its community.

Friday, March 26, 2010

Research Libraries, Risk and Systemic Change from OCLC Research

Research Libraries, Risk and Systemic Change: a report from OCLC Research is available as an online pdf at http://www.oclc.org/research/publications/library/2010/2010-03.pdf

This report draws on data gathered in 2008 and used internally at OCLC since then, but it is just now being published externally. It makes a clear, concise, and compelling case for change in libraries. The identification of risks and the strategies for mitigation are sensible and prudent. One wonders how the research library community could act collectively to implement these strategies. The Association of Research Libraries seems to be one useful organization to shape collective action. The Coalition for Networked Information (CNI) is another.

How might this analysis be used by individual libraries or ad hoc groups (such as the Ivy Plus libraries or the Borrow Direct libraries) to mitigate the risks and seize the opportunities for which these risks are but shadows?

Friday, March 19, 2010

OCLC report: Implications of MARC tag usage on library metadata practices

OCLC has published a report called Implications of MARC Tag Usage on Library Metadata Practices.

I've only begun to read it.

The "implications" section of the Exec. Summ. is interesting.

1. Be consistent. Splitting content across multiple fields will negatively affect indexing, retrieval, and mapping to other encoding schema.
2. Respond to local user needs whether that is counting plates in a book or adding contents notes.
3. Focus on authorized names, classifications, and controlled vocabularies that key word searching of full-text will not provide (as full text online negates some value of descriptive surrogates).
4. Use specific MARC fields for particular types of note if they are available rather than the general 500 note.
5. Map the 200 or so MARC 21 fields in use to simpler schema. (MARC data cannot continue to exist in its own discrete environment, separate from the rest of the information universe. Leverage it and use it in other domains to reach users in their own networked environments.)
6. Accuracy of fields that are used in machine matching becomes more important in environments using linked data to leverage fuller descriptions and other related information generated from other sources.
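To picture what points 4 and 5 might mean in practice, here is a small sketch of a crosswalk from MARC tags to a simpler, Dublin Core-like set of element names. The mapping table is my own illustration, not an official crosswalk and not anything from the OCLC report, and the sample record is invented.

```python
# Illustrative mapping only: a handful of MARC tags to simple element names.
# This is NOT an official MARC-to-Dublin-Core crosswalk.
CROSSWALK = {
    "100": "creator",      # main entry, personal name
    "245": "title",        # title statement
    "260": "publisher",    # publication information
    "500": "description",  # general note
    "505": "description",  # formatted contents note
    "650": "subject",      # subject heading, topical term
}


def to_simple_record(marc_fields):
    """Map (tag, value) pairs to {element: [values]}; also return unmapped tags."""
    record = {}
    unmapped = []
    for tag, value in marc_fields:
        element = CROSSWALK.get(tag)
        if element is None:
            unmapped.append(tag)  # keep track of what the crosswalk drops
            continue
        record.setdefault(element, []).append(value)
    return record, unmapped


# Invented sample record for illustration.
sample = [
    ("100", "Author, A. N."),
    ("245", "An example title"),
    ("500", "Includes index."),
    ("505", "Part one -- Part two."),
    ("650", "Libraries"),
    ("999", "local data"),
]

record, unmapped = to_simple_record(sample)
print(record)
print("unmapped tags:", unmapped)
```

In this sketch the 500 and 505 notes both land in the same "description" element, so the simpler schema loses the distinction between them. The distinction is only there to keep or drop if the cataloger used the specific note field in the first place.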

And MARC's future:

1. MARC is a niche data communication format approaching the end of its life cycle.
2. Encoding schema will need robust MARC crosswalks to ingest millions of legacy records.
3. How would we create, capture, structure, store, search, retrieve, and display objects and metadata if we didn’t have to use MARC and if we weren’t limited by current library systems?
4. How do we best take advantage of linked data and avoid creating the same redundant metadata in individual records?
5. How do we integrate library metadata with sources outside the traditional library environment?
6. To meet the demands of the rest of the information universe, give priority to interoperability with other encoding schema and systems.