Bookworm VMs

Ben Schmidt, July 10, 2015

I’ve put a repo up on the Bookworm-Project GitHub page that automatically creates a Bookworm installation using Vagrant. We’re working on making installation easier, but Bookworm can still be a pain to install because of all the passwords you have to manage. The repo sidesteps that by letting you try Bookworm out on a virtual machine.

Why would you want to do that? Presumably because you have an interesting collection of texts you want to explore, or you want to poke around at the API to see what else might work.

The steps are relatively simple, and should work on almost any platform:

  1. Install VirtualBox
  2. Install Vagrant
  3. Clone the repo.
  4. Go into the repo directory with a terminal, and type vagrant up. Lots of stuff will download and install. Go to lunch. In an hour or so, you’ll have a local installation.
  5. Confirm everything is working by visiting http://localhost:8007/D3. You should get an interactive bar chart that shows how often each author of the Federalist Papers uses any given word.
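
Before step 4, it can help to confirm that both tools are actually on your PATH, and after step 5 to check the page from a script rather than a browser. The following is a minimal Python sketch, not something shipped in the repo: the executable names VBoxManage and vagrant are the usual command names for VirtualBox and Vagrant but are assumptions here, and the helper names are made up for illustration.

```python
import shutil
import urllib.error
import urllib.request

# Assumed executable names for VirtualBox and Vagrant; adjust them if
# your platform installs the tools under different names.
REQUIRED_TOOLS = ("VBoxManage", "vagrant")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


def page_is_up(url="http://localhost:8007/D3", timeout=5):
    """Return True if `url` answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Install these before running vagrant up:", ", ".join(missing))
    else:
        print("Bar chart page reachable:", page_is_up())
```

The `which` parameter exists only so the PATH lookup can be swapped out when trying the function without the tools installed.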

Once you have the VM up and running, you can play with the bar charts; you can explore the Federalist Papers example using the R package in the pre-installed RStudio at http://localhost:8787/ (which may get some decent demo scripts in the near term); or, most likely, you can create a new directory with your own texts and build a Bookworm installation on top of them. How would you do that?

Building New Bookworms on the VM

Building new bookworms should be relatively simple, once you have your texts in the input format.

  1. Create a directory for your project, probably somewhere inside /vagrant so you can access it from your local machine.
  2. Place three files in that directory: input.txt, jsoncatalog.txt, and field_descriptions.json. The descriptions of what these should look like are in the Bookworm manual. I recommend managing the processes that create these files in a Makefile so you can reproduce them on another server if desired.
  3. Run git clone https://github.com/Bookworm-Project/BookwormDB myBookworm, replacing (if desired) “myBookworm” with something more informative.
  4. cd myBookworm, and then run make to build the Bookworm. Start debugging.
  5. If you want line charts, while still in the myBookworm folder run make lineChartGUI.
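
Before running make in step 4, a quick check that the three input files from step 2 are in place can save a failed build. A minimal Python sketch follows; the function name and the example path are hypothetical, and it only tests that each file exists and is non-empty, not that its contents match the format described in the Bookworm manual.

```python
from pathlib import Path

# The three input files named in step 2.
REQUIRED_FILES = ("input.txt", "jsoncatalog.txt", "field_descriptions.json")


def missing_inputs(project_dir):
    """Return the required files that are absent or empty in project_dir."""
    root = Path(project_dir)
    problems = []
    for name in REQUIRED_FILES:
        path = root / name
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(name)
    return problems


if __name__ == "__main__":
    # Hypothetical project location inside the shared /vagrant folder.
    missing = missing_inputs("/vagrant/myProject")
    print("Missing or empty:", ", ".join(missing) if missing else "none")
```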

System Requirements

I don’t know how small this can get; it’s probably worth finding out. But for the time being, even for the Federalist Papers, I wouldn’t try running this on a VM with less than 1 GB of memory. You’ll also need several GB of free hard drive space.

Security Warning

This uses pre-configured default passwords in plain text. As a result, your VM will be entirely insecure. Anyone able to access the machine’s web address will have full administrative control of the machine.

If you’re going to expose one of these machines to the Internet, you should at a minimum:

  1. Change the password for the user “vagrant”;
  2. Change the root password for MySQL;
  3. Change the “reader” and “keeper” user passwords for MySQL, granting SELECT privileges to the former and ALL ... WITH GRANT OPTION to the latter.
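
Whatever you change these passwords to, they should be long and random rather than memorable. One way to generate a fresh secret for each of the four accounts above is Python’s standard secrets module. This sketch only prints the values; applying them to the system user and to MySQL is left to you. The function name, the account labels, and the 24-byte length are arbitrary choices for illustration.

```python
import secrets

# The four accounts listed above: one system user and three MySQL users.
ACCOUNTS = (
    "vagrant (system)",
    "root (MySQL)",
    "reader (MySQL)",
    "keeper (MySQL)",
)


def fresh_passwords(accounts=ACCOUNTS, nbytes=24):
    """Map each account label to a new random URL-safe password."""
    return {account: secrets.token_urlsafe(nbytes) for account in accounts}


if __name__ == "__main__":
    for account, password in fresh_passwords().items():
        print(f"{account}: {password}")
```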

Esoterica/Next Steps

Conal Tuohy has suggested making the images available in the Open Virtualization Format. I’m happy to provide these, but the files are so large that I won’t be putting a download URL online. You’ll have to e-mail me.

It shouldn’t take much to get one of these running on AWS. If you make the changes to the Vagrantfile to do so and are willing to back-propagate them as a branch on the GitHub page, I’d be grateful.

Acknowledgements

The core Vagrantfile is from Andrew Goldstone’s Rutgers class on literary data. The dependencies for the Bookworm installation work from my branch of Muhammad Shamim’s Bookworm pre-installation script.