From: Marcin Wojnarski
Date: Wed, 5 Feb 2014 15:10:38 +0100

As a data mining specialist, I have followed the various discussions about mining scholarly publications for some time, and I've noticed considerable confusion about the legal nature of text mining and the true origin of the restrictions related to it.

1) Restrictions imposed on text mining are technical, not legal. Publishers impose technical limits on how much content can be downloaded in a given period of time, and if someone downloads too much, the university may get cut off from the publisher's servers. This is regulated legally, of course, but only in the agreement signed between the university and the publisher, not by general law, and least of all by copyright. The exact terms are a matter of mutual agreement between the parties - they can agree on whatever they want - so blaming copyright for limited bandwidth to the publisher's server, as is often done in discussions about data mining of academic papers, is unreasonable.

2) Restrictions apply to subscription content alone. There is no way to impose restrictions on mining Open Access content, even if OA means only "free" OA. Even more: if I get access to a paper illegally and mine it, I can only be accused of illegal copying, not of text mining. That's because copyright law has nothing to do with mining; these are two different things. Data mining concerns the *information* contained in the paper, not the paper itself, whereas copyright protects only the paper as a creative work, in its literal and graphical form, not the information contained in it. It's important to see the distinction.

Ross Mounce was right to say that "the right to read is the right to mine". I would go even further: mining does NOT need any right. Data mining is just another name for collecting statistics. And it's my *personal freedom* to collect whatever stats I want, from whatever papers I want; nobody can forbid me from doing this.
Thus, if I'm lucky enough to see the paper - on whatever legal basis, or even none at all - it's only my business what I do with information that I obtained in this way.

Regards
Marcin Wojnarski

Marcin Wojnarski, Founder and CEO, TunedIT
http://tunedit.org
http://www.facebook.com/TunedIT
http://twitter.com/TunedIT
http://www.linkedin.com/in/marcinwojnarski

TunedIT - Online Laboratory for Intelligent Algorithms

On 02/05/2014 12:20 AM, LIBLICENSE wrote:
>
> From: Ivy Anderson
> Date: Tue, 4 Feb 2014 05:32:56 +0000
>
> This short article from Nature News may be of interest to LibLicense readers:
>
> Elsevier opens its papers to text-mining
> Researchers welcome easier access for harvesting content, but some
> spurn tight controls.
>
> Richard Van Noorden
> 03 February 2014
> http://www.nature.com/news/elsevier-opens-its-papers-to-text-mining-1.14659
>
> Ivy Anderson
> Director of Collections, California Digital Library
> University of California