From: Joseph Esposito
Date: Thu, 21 Jul 2016 21:16:34 -0400

I would like to thank the many people who responded to my initial query online and off.

Joe Esposito

On Thu, Jul 21, 2016 at 9:02 PM, LIBLICENSE wrote:
>
> From: Winston Tabb
> Date: Fri, 22 Jul 2016 00:31:14 +0000
>
> This is yet another reminder that we should always attempt to include
> a provision in our licenses that nothing in them overrides provisions
> of copyright law.
>
> Sent from my iPhone
>
> > On Jul 21, 2016, at 6:28 PM, LIBLICENSE wrote:
> >
> > From: kalev leetaru
> > Date: Wed, 20 Jul 2016 21:32:55 -0400
> >
> > One very critical distinction here that is often missed by academic
> > data miners at universities is that in most cases of TDM you are
> > looking at two very different classes of restrictions in the form of
> > copyright and license. Copyright may or may not play a role depending
> > on what you are doing and to what degree humans play a role (most
> > TDM's like HathiTrust Research Center are erring towards the total
> > exclusion of any possibility of a human consuming any copyrighted
> > content, placing a firewall around the content such that only machines
> > can access it).
> >
> > Yet, the far bigger issue is licensing. Take a digitized historical
> > collection of newspapers from the 1800's that your library subscribes
> > to. Regardless of your personal views on copyright claims to digitized
> > imagery of public domain content (and more importantly, the views of
> > courts in your particular jurisdiction), the bottom line is that when
> > you access those images, you are doing so through a license agreement
> > your library signed that governs your access to that content.
> > If you have your university counsel take a deep read through those agreements
> > you'll find that most publishers place explicit restrictions on data
> > mining of their content, either specifically on mining or on direct
> > and indirect access modalities to the content. Most universities are
> > not currently trying to negotiate mining access as part of their
> > license agreements and I always advise libraries to at least explore
> > this with their publisher. Without saying more than I can in a public
> > forum, a good number of the large publishers are going to be making
> > announcements around new or expanded data mining programs in the
> > not-so-distant future (some of them in the next 12 months). Many of
> > these will come in the form of an add-on that must be purchased by the
> > library, but will grant data mining privileges to the content and
> > provide mechanisms, such as cloud-based computing facilities and
> > special APIs, to permit legally authorized large-scale data mining.
> >
> > Re #2, in the US the closest parallel is the illegal downloading of
> > copyrighted music and movies. As the saying goes, just because someone
> > else robbed the store, doesn't mean the merchandise is clean. Under US
> > copyright law, the courts have generally held that downloading a
> > copyrighted piece of content without authorization from the copyright
> > holder is illegal. In particular, the process of downloading that file
> > from a server to your local hard drive creates a copy of the work,
> > which constitutes illegal duplication. In the case of SciHub, the
> > simple act of downloading a PDF from that website to your computer
> > typically constitutes infringement in the eyes of the US legal system
> > under US copyright law. I can't speak to the EU legal system, since
> > I'm not as familiar with its nuances, but I would assume you would
> > largely see similar interpretations given various reciprocity of
> > copyright laws.
> >
> > ~Kalev
> >
> >> On Wed, Jul 20, 2016 at 8:12 PM, LIBLICENSE wrote:
> >>
> >> From: "Peter B. Hirtle"
> >> Date: Wed, 20 Jul 2016 14:00:33 +0000
> >>
> >> Joe, on question #1, I am not aware of any court cases that have yet tested TDM.
> >>
> >> Perhaps the closest are the Google Books/HathiTrust cases that make it
> >> clear that when you are not using the expressive content of a work,
> >> there is no infringement. The arguments are well laid-out in Jockers,
> >> Matthew L. and Sag, Matthew and Schultz, Jason, Brief of Digital
> >> Humanities and Law Scholars as Amici Curiae in Support of
> >> Defendant-Appellees and Affirmance, (The Authors Guild, Inc., et al.,
> >> v. Google, Inc., et al.) (Second Circuit) (July 10, 2014).
> >>
> >> Available at SSRN: http://ssrn.com/abstract=2465413 or
> >> http://dx.doi.org/10.2139/ssrn.2465413. I can also recommend Matthew
> >> Sag, Orphan Works As Grist for the Data Mill, 27 Berkeley Tech. L.J.
> >> (2012).
> >>
> >> Available at: http://scholarship.law.berkeley.edu/btlj/vol27/iss3/9 or
> >> http://dx.doi.org/doi:10.15779/Z387M5B.
> >>
> >> Your second question is a little harder. While some argued during the
> >> Napster cases that "there is no legal restriction on downloading that
> >> content," I think that is far from established. I suspect that the
> >> opposite is likely to be the case now - but I haven't been tracking
> >> the cases (primarily with regards to audio and videos) to speak to
> >> this.
> >>
> >> Peter B. Hirtle
> >> Affiliate Fellow, Berkman Klein Center for Internet & Society, Harvard
> >> University
> >> http://vivo.cornell.edu/display/individual23436
> >> Copyright and Cultural Institutions: Guidelines for Digitization for U.S.
> >> Libraries, Archives, and Museums:
> >> http://hdl.handle.net/1813/14142
> >>
> >>
> >> -----Original Message-----
> >> From: Joseph Esposito
> >> Date: Sat, 16 Jul 2016 15:08:15 -0400
> >>
> >> I was puzzling over some of the copyright issues concerning text and
> >> data mining (TDM) this week and wonder if anyone on this list can
> >> offer some guidance. Basically, I have two questions:
> >>
> >> 1. My understanding is that there is a growing body of court rulings
> >> to the effect that TDM is not protected by copyright. That is,
> >> machines/robots/spiders can mine full-text databases without
> >> triggering a copyright claim. Are there any summary articles/blog
> >> posts that lay out the current view of this?
> >>
> >> 2. Related to this is a question that came up concerning SciHub and
> >> other sites that reproduce scholarly content. While it may be a breach
> >> of contract or illegal to upload content to SciHub and its brethren,
> >> and it may be illegal for SciHub to display that content, there is no
> >> legal restriction on downloading that content. The downloader, on the
> >> other hand, cannot redisplay that content. If this is true, could a
> >> TDM robot download articles from SciHub (or, for that matter, from
> >> ResearchGate or Academia.edu) with impunity?
> >>
> >> Can anybody help me here?
> >>
> >> Joe Esposito