Ben Schmidt, July 10, 2015
I’ve put a repo up on the Bookworm-Project GitHub page that automatically creates a Bookworm installation using Vagrant. We’re working on it, but Bookworm can still be a pain to install because of all the passwords you have to manage. The repo sidesteps that by letting you try Bookworm out on a virtual machine.
Why would you want to do that? Presumably because you have an interesting collection of texts you want to explore, or you want to poke around at the API to see what else might work.
The steps are relatively simple, and should work on almost any platform:
vagrant up. Lots of stuff will download and install. Go to lunch. In an hour or so, you’ll have a local installation.
http://localhost:8007/D3. You should get an interactive bar chart that lets you see how much different authors in the Federalist Papers use any given word.
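As a rough sketch, the start-up and the check can be run from a terminal. This assumes Vagrant and curl are already installed and that you are inside your copy of the repo. The function names are made up for illustration; the port and path are the ones given above.

```shell
# Hypothetical helpers: the names are illustrative, not part of the repo.
# The URL (port 8007, path /D3) is the one given in the post.
BOOKWORM_URL="http://localhost:8007/D3"

start_bookworm_vm() {
  # Downloads and provisions the VM; expect this to take an hour or so.
  vagrant up
}

check_bookworm() {
  # Prints the HTTP status code returned by the demo page.
  curl -s -o /dev/null -w '%{http_code}\n' "$BOOKWORM_URL"
}
```

Run start_bookworm_vm, wait for it to finish, then run check_bookworm. A 2xx or 3xx status suggests the page is being served. Anything else probably means provisioning has not finished or the port forwarding is not set up.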
Once you have the VM up and running, you can play with the bar charts; you can explore the Federalist Papers example using the R package in the pre-included RStudio installation at http://localhost:8787/ (which may get some decent demo scripts in the near term); or, most likely, you can create a new directory with your texts and build a Bookworm installation on top of them. How would you do that?
Building new bookworms should be relatively simple, once you have your texts in the input format.
/vagrant so you can access it from your local machine.
field_descriptions.json. The descriptions of what these should look like are in the Bookworm manual. I recommend managing the processes that create these files in a Makefile so you can reproduce them on another server if desired.
git clone https://github.com/Bookworm-Project/BookwormDB myBookworm, replacing (if desired) “myBookworm” with something more informative.
cd myBookworm, and then run make to build the Bookworm. Start debugging.
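The clone, cd, and make steps can be wrapped in a small shell function. This is only a sketch: the name build_bookworm is invented for illustration, and it assumes git and make are available on the VM and that your input files are already laid out the way the Bookworm manual describes.

```shell
# Hypothetical helper: build_bookworm is an illustrative name, not a Bookworm command.
# Usage: build_bookworm [directory-name]   (defaults to myBookworm)
build_bookworm() {
  name="${1:-myBookworm}"
  (
    # Run in a subshell so the caller's working directory is left unchanged.
    git clone https://github.com/Bookworm-Project/BookwormDB "$name" &&
    cd "$name" &&
    make
  )
}
```

Calling build_bookworm federalist2 would clone into a directory named federalist2 and run make there. The && chaining stops at the first step that fails, which makes it easier to see where the debugging should start.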
I don’t know how small this can get; it’s probably worth finding out. But for the time being, even for the Federalist Papers I wouldn’t try running this on a VM with less than 1 GB of memory. You’ll also need several GB of free hard drive space.
This uses pre-configured default passwords in plain text. As a result, your VM will be entirely insecure. Anyone able to access the machine’s web address will have full administrative control of the machine.
If you’re going to expose one of these machines to the Internet, you should at a minimum:
ALL ... WITH GRANT OPTION to the latter.
Conal Tuohy has suggested making the images available in the Open Virtualization Format. I’m happy to provide these, but the files are so large that I won’t be putting a download URL online. You’ll have to e-mail me.
It shouldn’t take much to get one of these running on AWS. If you make the changes to the Vagrantfile to do so and are willing to back-propagate them as a branch on the GitHub page, I’d be grateful.
The core Vagrantfile is from Andrew Goldstone’s Rutgers class on literary data. The dependencies for the Bookworm installation work from my branch of Muhammad Shamim’s Bookworm pre-installation script.