Wednesday, October 20, 2010

Ed Summers on the linked data release from the Deutsche Nationalbibliothek

See Ed Summers' comments on the DNB release of linked library data at
http://inkdroid.org/journal/2010/10/19/linked-library-data-at-the-deutschen-nationalbibliothek/

Summers' piece is worth reading.

Here are some highlights.

The Deutsche Nationalbibliothek (DNB) has released linked library data for:

■ 1.8 million authors from the Personennamendatei (PND)
■ 1.3 million corporate bodies from the Gemeinsame Körperschaftsdatei (GKD)
■ 187,000 subject headings from the Schlagwortnormdatei (SWD)
■ 51,000 Dewey Decimal Classification categories

The full dataset that the DNB has made available for download amounts to 38,849,113 individual statements (aka triples).

See the DNB announcement at http://lists.w3.org/Archives/Public/public-lod/2010Oct/0016.html

This is a major event for _library_ uses of linked data, and exemplary behavior from the DNB. Other research and national libraries should follow its example.

Summers cites Herta Müller's authority information as an illustration and notes the use of RDA vocabularies, which are also available as linked data. "RDF vocabularies are explicit ways of describing resources like people, places, topics, etc. When different things are described using the same vocabulary (or the vocabularies themselves are related together in a particular way) it becomes possible to merge the descriptions, and build software on top of it."
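To make the idea of merging descriptions concrete, here is a minimal Python sketch. It is not taken from Summers' post. The DNB-side URI and both name strings are made-up placeholders, and the use of foaf:name and owl:sameAs is an assumption for illustration, not a statement about which properties the DNB data actually uses. The only value taken from the post is the VIAF URI.

```python
from collections import defaultdict

# Property URIs assumed for illustration only.
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical DNB-side identifier; the VIAF URI is the one quoted in the post.
DNB_URI = "http://example.org/dnb/person/herta-mueller"
VIAF_URI = "http://viaf.org/viaf/12324250"

# Two small, made-up descriptions as (subject, predicate, object) triples.
dnb_triples = [
    (DNB_URI, FOAF_NAME, "Herta Müller"),
    (DNB_URI, SAME_AS, VIAF_URI),
]
viaf_triples = [
    (VIAF_URI, FOAF_NAME, "Müller, Herta"),
]


def merge_descriptions(*graphs):
    """Union the graphs, then group properties under one URI per entity.

    URIs connected by a sameAs triple are treated as the same entity.
    """
    triples = set().union(*graphs)
    canonical = {}  # maps a URI to the URI chosen to represent it

    def find(uri):
        # Follow the chain of representatives until it ends.
        while canonical.get(uri, uri) != uri:
            uri = canonical[uri]
        return uri

    for s, p, o in triples:
        if p == SAME_AS:
            canonical[find(o)] = find(s)

    merged = defaultdict(set)
    for s, p, o in triples:
        if p != SAME_AS:
            merged[find(s)].add((p, o))
    return dict(merged)


merged = merge_descriptions(dnb_triples, viaf_triples)
for uri, properties in merged.items():
    print(uri, sorted(properties))
```

Because both sources describe names with the same property, the two name forms end up side by side under a single entity, which is the kind of merging the quote describes.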

"Another really interesting thing to note about this RDF for Herta Müller are the links to Wikipedia (http://de.wikipedia.org/wiki/Herta_M%C3%BCller), VIAF (http://viaf.org/viaf/12324250) and dbpedia (http://dbpedia.org/resource/Herta_M%C3%BCller). These are important because they contextualize the DNB record for Herta Müller by relating it to other records for her, thus allowing it to be disambiguated from records describing other people named Herta Müller."

"[Summers] did some quick and dirty analysis of the full data dump from the DNB and found: 3,569,402 links to VIAF and 40,136 links to dbpedia (the Linked Data version of Wikipedia)."

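Summers does not say how he ran his quick analysis. One rough way to get counts like these, assuming the dump has been converted to N-Triples (one statement per line), is to count the triples whose object URI starts with a given prefix. The function name, the prefixes, and the file name in the usage note are hypothetical.

```python
import re
from collections import Counter

# Matches an N-Triples line whose object is a URI and captures that URI.
# Lines whose object is a literal or a blank node do not match and are skipped.
TRIPLE_WITH_URI_OBJECT = re.compile(
    r'^\s*(?:<[^>]*>|_:\S+)\s+<[^>]*>\s+<([^>]*)>\s*\.\s*$'
)

# Assumed URI prefixes for the two link targets mentioned in the post.
LINK_PREFIXES = {
    "viaf": "http://viaf.org/viaf/",
    "dbpedia": "http://dbpedia.org/resource/",
}


def count_link_targets(lines, prefixes=LINK_PREFIXES):
    """Count triples whose object URI starts with each named prefix."""
    counts = Counter()
    for line in lines:
        match = TRIPLE_WITH_URI_OBJECT.match(line)
        if match is None:
            continue
        target = match.group(1)
        for name, prefix in prefixes.items():
            if target.startswith(prefix):
                counts[name] += 1
    return counts
```

Since a file object yields lines lazily, something like `with open("dnb-dump.nt", encoding="utf-8") as f: print(count_link_targets(f))` would stream through a large dump without loading it into memory.
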
Summers goes on to talk a bit about what more needs to be done.

"What remains to be done to some extent is leveraging this contextual information around our data in Library Applications, both cataloging, metadata enrichment applications and end user facing discovery applications."


There is a lot more in his piece, and links to many related tools, projects, and activities.