LISTSERV - LIBLICENSE-L Archives

From: "Evans, Gwen"
Date: Thu, 5 May 2016 15:49:50 +0000

OhioLINK users (which include almost all higher ed in the state of
Ohio, with Ohio State being the largest, and including the Cleveland
Clinic which is a heavy user) downloaded 11.5 million articles in 2015
(publisher and our locally loaded Electronic Journal Center). This is
only from the OhioLINK consortial licensed content, but that includes
several big publishers (Elsevier, Wiley, Springer, ACS et al). As a
system, we don’t have the same research intensity that CDL covers, but
11.5 million for one state versus 47 million globally for SciHub is
equivalent to about 24% of the total activity recorded for SciHub.
Whoever is downloading
from SciHub in Columbus (no other Ohio region has similar activity)
represents less than 1% of the comparable statewide academically
sanctioned activity in just the OhioLINK collective packages, and the
percentage would shrink if sanctioned activity for the locally
subscribed Ohio State resources were added.
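
The arithmetic behind those two percentages can be checked with a
minimal Python sketch. It uses only the two totals quoted above. The
Columbus SciHub count itself is not given in this message, so the
sketch only works out the ceiling implied by "less than 1%"; the
variable and function names are illustrative.

```python
# Figures quoted in the message above (2015 downloads).
OHIOLINK_DOWNLOADS = 11_500_000  # OhioLINK consortial licensed content
SCIHUB_DOWNLOADS = 47_000_000    # global SciHub total cited in the message


def share(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# OhioLINK sanctioned downloads relative to all recorded SciHub activity.
ohiolink_vs_scihub = share(OHIOLINK_DOWNLOADS, SCIHUB_DOWNLOADS)

# Columbus SciHub downloads must fall below this count to be "less than 1%"
# of the OhioLINK total. The actual Columbus count is not stated here.
one_percent_ceiling = OHIOLINK_DOWNLOADS // 100

print(f"OhioLINK as a share of SciHub total: {ohiolink_vs_scihub:.1f}%")
print(f"1% of OhioLINK downloads: {one_percent_ceiling:,}")
```

Adding the locally subscribed Ohio State usage would raise the
denominator, so the 1% ceiling would rise and the Columbus share
would shrink, as noted above.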

We are interested in trying to figure out how much of the use really
is driven by frustrated academics. We downloaded the data set
because we were curious to see whether we could identify the use in
Columbus, OH. Columbus is the home of Ohio State, sure, but it is also the
state capital and home to Battelle Research, a number of large medical
systems, Chemical Abstracts Service, several corporate headquarters in
manufacturing, and to paraphrase a failed US VP candidate, “I can see
a large Abbott Labs facility from my window!” The dataset doesn’t
include the actual IP addresses (for obvious reasons laid out in the
article), just the geolocation tag, so we couldn’t do a comparison
without a lot of extra work.

But I fully agree that trying to easily find and get sanctioned
articles is a miserable experience for academics, and it is certainly
due to publisher restrictions and the authentication mechanisms
required by them. It's infuriating that the publisher quoted in the
article blames libraries. We have clear data and studies that show
that many, many researchers go directly to big publisher platforms,
not through the library website or other sanctioned portals. Michael
Clarke’s blog post on the Scholarly Kitchen yesterday is precisely to
the point: https://scholarlykitchen.sspnet.org/2016/05/04/accessing-publisher-resources-via-a-mobile-device-a-users-journey/.
It’s the publishers that throw up the very first hurdles to sanctioned
access for the most common path to the content. If a user goes
directly from a Google search to the publisher platform, and can’t get
the article, why are you pointing the finger at us? We didn’t build those
systems. They aren’t necessarily standardized across all vendors,
and campus IT authentication and proxy systems certainly aren’t
standardized (and if you think libraries are always consulted before a
campus chooses an auth system, think again). Even in states and
systems that are trying to implement eduroam, that is a
lengthy process that involves actual IT investment and cooperation
across many different institutions.

Notice that it is the institutions that have to invest more real dollars
(in authentication, proxies, VPNs, staffing and management, and
discovery tools) to make content more seamlessly accessible to the very
researchers who provided it to the publishers without cost.
So to Ivy’s, Ann’s, and Lisa’s points, there are a
lot of literally uncalculated cost benefits to a fully open access
model for higher education that go beyond the obvious ones. I also
think it’s a bit unfair to expect the relatively narrow higher ed
library market to deliver results comparable to Google. Reliable
article level indexing and normalization at scale isn’t cheap or easy.
And OhioLINK (since we locally load content from publishers in our own
multi-publisher journal platform) can attest that the metadata
delivered by publishers varies in quality, type, standardization, etc.
and requires a lot of work. And we have no idea how often SciHub
misses in delivering an article from a search even if it is in fact in
their database. Their ease of use comes primarily from the lack of
authentication barriers: if I find it, I get it.


Best, Gwen

(Okay, I can no longer see Abbott from my window because we moved
offices, but I used to be able to!)


Gwen Evans
Executive Director, OhioLINK
http://www.ohiolink.edu/
1224 Kinnear Rd
Columbus, Ohio 43212

ORCID ID: 0000-0002-4560-0435

Per Ohio Revised Code, this communication and any attachments may
constitute a public record. (http://codes.ohio.gov/orc/149.43)



To: LibLicense-L Discussion Forum
Subject: Re: “Who’s Downloading Pirated Papers? Everyone” (Sci-Hub Data)
From: kalev leetaru
Date: Wed, 4 May 2016 08:51:30 -0400

In case it is of interest, here is my take on SciHub and the trend it
represents in academic publishing:

http://www.forbes.com/sites/kalevleetaru/2016/04/29/the-future-of-open-access-why-has-academia-not-embraced-the-internet-revolution/

To second Ann's comments, one of the most striking things to me about
that Science piece is just how heavily SciHub is apparently being used
at Western academic institutions which likely have legal subscriptions
to the journals in question. That to me stands testament to just how
awful current academic library journal subscription search systems
are. I can personally attest, from 15 years at various institutions
both public and private, just how impossible it can be to find the
fulltext of a particular journal article even when you know your
institution subscribes to the journal and issue in question. Or you
search and find 10 different copies from 10 different services the
institution subscribes to, but some are abstracts only, some are ASCII
text only with no figures, and so on. Or you want to see an entire
issue of a journal and you find multiple subscriptions that purport to
include the journal, but then when you browse through, after having
clicked through screen after screen, you find that some subscriptions
have time delays and so don't include the most recent issue, or end or
start on a particular date, or only have samples of the journal, etc.
It's a huge, huge mess today. It's not exclusively the libraries' fault, but I
do think there is immense room for improvement. Even I ended up in
the habit of going to Google Scholar first to have it link me into my
library's subscriptions, since it at least seemed to be able to track
down whether my library had a copy and connect me directly to the best
copy that had fulltext and images.

Google Scholar is far from perfect, but as a researcher who does
intense deep dives into the literature, it is the model that I think
libraries simply have to adopt to stay relevant and serve their
communities. It simply can't be that I can spend half an hour to an
hour (sometimes several hours) just trying to track down a journal
article in the myriad mess of a typical academic library's
esubscriptions system - I should be able to search for the article and
jump right to the best available copy with a mouse click.

While slightly tangential, the Science article also alludes to the
possibility that SciHub downloaders are using it for text mining. That
is another area where academic libraries need to play a much bigger
role in helping academics. I myself have always found libraries to be
highly adversarial when it comes to connecting researchers with
publishers to explore possible collaborations and in fact libraries
have always been the primary obstacle for me in my 15 years of data
mining in the academic world. Instead I've always had to reach out
directly to publishers after my home institution library would push
back saying it was not their job to help connect researchers or would
otherwise not invest any effort of any kind in helping make those
connections.

I've found publishers to be extremely helpful and open in supporting
large-scale data mining when approached - from 21 billion words of
academic literature
(http://dlib.org/dlib/september14/leetaru/09leetaru.html) to my
dissertation (http://www.kalevleetaru.com/Publish/Leetaru_Dissertation_Can_We_Forecast_Conflict-Dissertation.pdf)
to a myriad other initiatives I oversee
(http://blog.gdeltproject.org/) - see more in my NFAIS opening
keynote: (http://kalevleetaru.com/Publish/ISU2015-Leetaru-Mining-Libraries.pdf),
but in every case I reached out directly to the publishers after my
home institutional library failed to be of any help in forging
connections and collaborations. Libraries have a lot they can offer
there in helping to connect scholars with publishers and assisting in
that process to ensure legal data mining that benefits all sides, but
they need to recognize that if they put their foot down and say it is
not their role, researchers will simply go right around them directly
to the publishers, further reducing the library's role in academic
life.

To me SciHub appears to be less a service for poorer nations to access
scholarship financially inaccessible to their institutions and more a
reaction to the just plain horrific state of access to academic
scholarship today, from extreme costs of subscriptions to the awful
state of library access portals.


~Kalev
http://kalevleetaru.com
http://blog.gdeltproject.org/