Background: To index Word, Excel, PDF and other "unstructured" documents, Solr uses Tika, another Apache project. Tika comes bundled with Solr and is ready to run there. However, if you want to run Tika on its own (e.g., you don't trust your installation, or you're just curious), you have to copy a few .jar files around. (Java experts who can manage class paths will probably tell me there's a better way to do this.)
cd [Your path]/apache-solr-nightly/lib
cp commons-io-1.4.jar commons-codec-1.3.jar [Your path]/apache-solr-nightly/example/solr/lib
cp ~/.m2/repository/org/jempbox/jempbox/0.2.0/jempbox-0.2.0.jar [Your path]/apache-solr-nightly/example/solr/lib
(I have no idea where ~/.m2 came from. It may have been when I ran the Tika build.) Then I could run
java -jar tika-0.2.jar
in that directory.
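If you end up repeating the copy steps above, they can be wrapped in a small shell function that reports any jar it cannot find instead of failing halfway. This is only a sketch: the function name copy_jars and the SOLR_HOME variable are my own inventions (SOLR_HOME stands in for "[Your path]/apache-solr-nightly"), and the jar names are the ones from the commands above.

```shell
#!/bin/sh
# copy_jars DEST JAR...  (hypothetical helper, not part of Solr or Tika)
# Copies each JAR into the existing directory DEST and reports any that
# are missing. Returns non-zero if DEST is not a directory or if at least
# one jar could not be found or copied.
copy_jars() {
    dest=$1
    shift
    if [ ! -d "$dest" ]; then
        echo "not a directory: $dest" >&2
        return 1
    fi
    missing=0
    for jar in "$@"; do
        if [ -f "$jar" ]; then
            cp "$jar" "$dest/" || missing=$((missing + 1))
        else
            echo "missing: $jar" >&2
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Example call (SOLR_HOME is an assumed variable; set it to your own path):
#   copy_jars "$SOLR_HOME/example/solr/lib" \
#       "$SOLR_HOME/lib/commons-io-1.4.jar" \
#       "$SOLR_HOME/lib/commons-codec-1.3.jar" \
#       "$HOME/.m2/repository/org/jempbox/jempbox/0.2.0/jempbox-0.2.0.jar"
```

The function deliberately does not create the destination directory, so a mistyped path shows up as an error rather than as a new, empty lib folder.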