It turns out that Clay Ford at the University of Virginia wrote a nice tutorial for this, and a package does the trick very nicely. I tested the “update” at the bottom of his post, which shows how to use the pdftools package. There is also a tutorial by Ingo Feinerer on using the tm (text mining) package.
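
For anyone who wants a quick feel for the pdftools route, here is a minimal sketch in R. It is not taken from Clay Ford's post: the throwaway PDF, its text, and the variable names are all placeholders I made up so the snippet runs on its own. The only pdftools call used is `pdf_text()`, which returns a character vector with one element per page.

```r
# Minimal sketch, assuming the pdftools package is installed
# (install.packages("pdftools")).
library(pdftools)

# Build a one-page PDF so the example is self-contained.
# In practice you would point pdf_text() at your own file instead.
sample_pdf <- tempfile(fileext = ".pdf")   # hypothetical placeholder file
pdf(sample_pdf)
plot.new()
text(0.5, 0.5, "Hello from a sample PDF")  # placeholder text
invisible(dev.off())

# pdf_text() returns a character vector: one string per page.
pages <- pdf_text(sample_pdf)

length(pages)   # number of pages in the document
cat(pages[1])   # raw text of the first page
```

To read a whole folder of PDFs, something like `lapply(list.files(pattern = "pdf$"), pdf_text)` should give one character vector per file, which can then be handed to tm or any other text mining workflow.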