From: Ann Shumelda Okerson
Date: Sun, 28 Oct 2012 04:17:38 -0400

1. Text mining (UCSC Genocoding Project).

Interesting information here -- I was happily surprised to see that there is already a lot of text mining opportunity, with growing improvement in coverage, though there is a long way to go. I was also pleased to see a project that systematically contacts publishers and reports the results of those contacts. Are there other systematic inquiries and studies for research in other fields? Do any of our liblicense-l readers know?

This information came courtesy of Kathleen Shearer, Canadian Association of Research Libraries:

Dear Ann:

Here is the URL for the text mining permissions of publishers as identified by the UCSC Genocoding Project: http://text.soe.ucsc.edu/progress.html

* Current coverage of PubMed: 30% (2.7 million articles)
* We have already indexed 8 million documents from Elsevier and PubMed Central. They represent roughly 30% of PubMed (all years, ~1940-today). PubMed Central includes text from most open-access publishers.

Requests for permission sent to publishers

* Out of the 6 million articles published since 2000 in PubMed, we concentrate on publishers with more than 1000 articles.
* There are 527 publishers that fulfill these criteria.
* We have contacted the following ones. As shown, not all publishers require permission for text mining. Web crawling has started in June 2012,

*******

2. Charleston Conference Session

Concurrent Session: Text Mining Rights from Three Perspectives
Thursday, November 8, 3:15 PM - 4:00 PM
Francis Marion Hotel, Carolina Ballroom

Speakers: Teresa Lee (E-Resources & Access Librarian, University of British Columbia), Heather Piwowar (Postdoc, Duke and University of British Columbia) and Judson Dunham (Senior Product Manager, Elsevier)

*******

Ann Okerson/CRL