Date: Mon, 5 Nov 2012 17:13:51 -0500
From: Ann Shumelda Okerson <[log in to unmask]>
Date: Mon, 5 Nov 2012 04:54:29 -0500
Below is a message from Glen Worthey, Stanford's Digital Humanities
Librarian, forwarded by Paul Zarins of Stanford University Library.
________________________________
From: "Glen Worthey" <[log in to unmask]>
To: "Pavils Zarins" <[log in to unmask]>
Sent: Thursday, November 1, 2012 5:02:54 PM
Subject: Re: Fwd: Suggested Readings in Text Mining?
My bias will be pretty obvious to you -- but as far as I'm concerned,
regarding text mining specifically for humanities research, Matt Jockers
is the very best. Here is a set of several highly relevant blog posts
from him:
http://www.matthewjockers.net/category/tm/
the best and most entertaining of which is basically a chapter from his
book /Macroanalysis: Digital Methods and Literary History/ (due out
early next year):
http://www.matthewjockers.net/2011/09/29/the-lda-buffet-is-now-open-or-latent-dirichlet-allocation-for-english-majors/
I suspect that Ann (and others on the Liblicense list) may be especially
interested in this: Matt was also co-author (on behalf of digital
humanities and legal scholars) of an amicus brief that was filed in the
Authors Guild v. HathiTrust case:
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2102542
and which was frequently cited by the judge in his decision. Obviously,
text mining is not the main focus of this brief, but it does play a
strikingly prominent role in what turned out to be a very important
legal document.
Finally, as just a portal into the huge world of text mining for
humanities research, see this very helpful "progressive" (that is,
progressing from "beginner" to "expert" level) review article with links
aplenty:
"Topic Modeling for Humanists: A Guided Tour"
http://www.scottbot.net/HIAL/?p=19113
(Note that, for some purposes -- though not all! -- "topic modeling" is
a rough synonym for "text mining." It's probably better characterized as
a subset of text mining, but I believe at the moment it's one of the
more actively pursued subsets, at least in digital humanities.)
Hope this helps,
Glen