LIBLICENSE-L Archives

LibLicense-L Discussion Forum

LIBLICENSE-L@LISTSERV.CRL.EDU

Subject:
From:
LIBLICENSE <[log in to unmask]>
Reply To:
LibLicense-L Discussion Forum <[log in to unmask]>
Date:
Thu, 21 Jul 2016 19:24:08 -0400
From: kalev leetaru <[log in to unmask]>
Date: Wed, 20 Jul 2016 21:32:55 -0400

One very critical distinction here that is often missed by academic
data miners at universities is that in most cases of TDM you are
looking at two very different classes of restrictions in the form of
copyright and license. Copyright may or may not play a role depending
on what you are doing and to what degree humans play a role (most
TDM services, like the HathiTrust Research Center, err toward totally
excluding any possibility of a human consuming copyrighted content,
placing a firewall around the content such that only machines can
access it).

Yet, the far bigger issue is licensing. Take a digitized historical
collection of newspapers from the 1800s that your library subscribes
to. Regardless of your personal views on copyright claims to digitized
imagery of public domain content (and more importantly, the views of
courts in your particular jurisdiction), the bottom line is that when
you access those images, you are doing so through a license agreement
your library signed that governs your access to that content. If you
have your university counsel read those agreements closely, you'll
find that most publishers place explicit restrictions on data mining
of their content, either on mining specifically or on direct and
indirect modes of access to the content. Most universities are not
currently trying to negotiate mining access as part of their license
agreements, and I always advise libraries to at least explore this
with their publishers. Without saying more than I can in a public
forum, a good number of the large publishers are going to be making
announcements around new or expanded data mining programs in the
not-so-distant future (some of them in the next 12 months). Many of
these will come in the form of an add-on that must be purchased by the
library, but will grant data mining privileges to the content and
provide mechanisms, such as cloud-based computing facilities and
special APIs, to permit legal, authorized large-scale data mining.

Re #2, in the US the closest parallel is the illegal downloading of
copyrighted music and movies. As the saying goes, just because someone
else robbed the store doesn't mean the merchandise is clean. Under US
copyright law, the courts have generally held that downloading a
copyrighted piece of content without authorization from the copyright
holder is illegal. In particular, the process of downloading that file
from a server to your local hard drive creates a copy of the work,
which constitutes illegal duplication. In the case of SciHub, the
simple act of downloading a PDF from that website to your computer
typically constitutes infringement under US copyright law. I can't
speak to the EU legal system, since I'm not as familiar with its
nuances, but I would assume you would largely see similar
interpretations, given the reciprocity among copyright laws.

~Kalev


On Wed, Jul 20, 2016 at 8:12 PM, LIBLICENSE <[log in to unmask]> wrote:
>
> From: "Peter B. Hirtle" <[log in to unmask]>
> Date: Wed, 20 Jul 2016 14:00:33 +0000
>
> Joe, on question #1, I am not aware of any court cases that have yet tested TDM.
>
> Perhaps the closest are the Google Books/HathiTrust cases that make it
> clear that when you are not using the expressive content of a work,
> there is no infringement.  The arguments are well laid out in Jockers,
> Matthew L. and Sag, Matthew and Schultz, Jason, Brief of Digital
> Humanities and Law Scholars as Amici Curiae in Support of
> Defendant-Appellees and Affirmance, (The Authors Guild, Inc., et al.,
> v. Google, Inc., et al.) (Second Circuit) (July 10, 2014).
>
> Available at SSRN: http://ssrn.com/abstract=2465413 or
> http://dx.doi.org/10.2139/ssrn.2465413.  I can also recommend Matthew
> Sag, Orphan Works As Grist for the Data Mill, 27 Berkeley Tech. L.J.
> (2012).
>
> Available at: http://scholarship.law.berkeley.edu/btlj/vol27/iss3/9 or
> http://dx.doi.org/doi:10.15779/Z387M5B.
>
> Your second question is a little harder.  While some argued during the
> Napster cases that "there is no legal restriction on downloading that
> content," I think that is far from established.  I suspect that the
> opposite is likely to be the case now - but I haven't been tracking
> the cases (primarily with regard to audio and video) closely enough
> to speak to this.
>
> Peter B. Hirtle
> Affiliate Fellow, Berkman Klein Center for Internet & Society, Harvard
> University
> http://vivo.cornell.edu/display/individual23436
> Copyright and Cultural Institutions: Guidelines for Digitization for
> U.S. Libraries, Archives, and Museums:
> http://hdl.handle.net/1813/14142
>
>
> -----Original Message-----
> From: Joseph Esposito <[log in to unmask]>
> Date: Sat, 16 Jul 2016 15:08:15 -0400
>
> I was puzzling over some of the copyright issues concerning text and
> data mining (TDM) this week and wonder if anyone on this list can
> offer some guidance. Basically, I have two questions:
>
> 1.  My understanding is that there is a growing body of court rulings
> to the effect that TDM is not restricted by copyright. That is,
> machines/robots/spiders can mine full-text databases without
> triggering a copyright claim. Are there any summary articles/blog
> posts that lay out the current view of this?
>
> 2.  Related to this is a question that came up concerning SciHub and
> other sites that reproduce scholarly content. While it may be a breach
> of contract or illegal to upload content to SciHub and its brethren,
> and it may be illegal for SciHub to display that content, there is no
> legal restriction on downloading that content. The downloader, on the
> other hand, cannot redisplay that content. If this is true, could a
> TDM robot download articles from SciHub (or, for that matter, from
> ResearchGate or Academia.edu) with impunity?
>
> Can anybody help me here?
>
> Joe Esposito
