From: Ivy Anderson
Date: Sun, 28 Oct 2012 23:44:33 +0000

Thanks to Ann for highlighting the work of the UC Santa Cruz Genocoding Project. Libraries can help with this.

The California Digital Library (CDL) negotiates systemwide agreements for the UC system, and we have worked hard to include text mining provisions in our licenses for a number of years. However, researchers are often unaware of the library's efforts in this regard and may not think to reach out to their libraries for assistance. In many cases, we may have already negotiated the rights they are seeking. How those rights are operationalized is another matter, and one in which libraries should also consider playing a more active role.

We in the library community can do more to connect with our researchers on these issues and invite them to enlist our help in working with publishers and other information providers on text mining arrangements.

- Ivy

Ivy Anderson
Director of Collections
California Digital Library
University of California, Office of the President
http://cdlib.org

-----Original Message-----
From: Ann Shumelda Okerson
Date: Sun, 28 Oct 2012 04:17:38 -0400

1. Text mining (UCSC Genocoding Project). Interesting information here -- I was happily surprised to see that there is already a lot of text mining opportunity, with growing improvement in coverage, though there is a long way to go. I was also pleased to see a project that systematically contacts publishers and reports the results of those contacts. Are there other systematic inquiries and studies for research in other fields? Do any of our liblicense-l readers know?

This information came courtesy of Kathleen Shearer, Canadian Association of Research Libraries.

Dear Ann:

Here is the URL for the text mining permissions of publishers as identified by the UCSC Genocoding Project:

http://text.soe.ucsc.edu/progress.html

* Current coverage of PubMed: 30% (2.7 million articles)
* We have already indexed 8 million documents from Elsevier and PubMed Central. They represent roughly 30% of PubMed (all years, ~1940-today). PubMed Central includes text from most open-access publishers.

Requests for permission sent to publishers

* Out of the 6 million articles published since 2000 in PubMed, we concentrate on publishers with more than 1000 articles.
* There are 527 publishers that fulfill these criteria.
* We have contacted the following ones. As shown, not all publishers require permission for text mining. Web crawling started in June 2012.

*******

2. Charleston Conference Session:

Concurrent Session: Text Mining Rights from Three Perspectives
Thursday, November 8, 3:15 PM - 4:00 PM
Francis Marion Hotel, Carolina Ballroom

Speakers: Teresa Lee (E-Resources & Access Librarian, University of British Columbia), Heather Piwowar (Postdoc, Duke and University of British Columbia), and Judson Dunham (Senior Product Manager, Elsevier)

*******

Ann Okerson/CRL