Heritrix in Eclipse
These instructions were written for Ubuntu 11.04, but should work on other releases from the same period (10.10, 11.10, ...).
N.B. There are other ways to do this, for example with additional Eclipse plugins for Maven or Git, but this is one way that is known to work.
sudo apt-get install sun-java6-jdk eclipse git maven2
sudo update-java-alternatives --set java-6-sun
sudo update-java-alternatives --list
mkdir -p ~/workspace && cd ~/workspace
git clone git://github.com/internetarchive/heritrix3.git
cd ~/workspace/heritrix3
mvn -Dmaven.test.skip=true install
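If the build fails early, a quick check that the tools installed above are actually on the PATH can help. The following is a minimal sketch, not part of the original instructions; the helper name `missing_tools` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the name of each listed command
# that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Tools used by the steps above: java (from the JDK), git, and mvn (from maven2).
missing=$(missing_tools java git mvn)
if [ -n "$missing" ]; then
  echo "Not found on PATH:"
  echo "$missing"
else
  echo "java, git and mvn are all on PATH"
fi
```

Anything it reports as missing would need to be installed (or added to the PATH) before `mvn install` can succeed.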
In Eclipse: File / Import... / Existing Projects Into Workspace..., then choose ~/workspace/heritrix3
Select Project > Properties > Java Build Path
Select Libraries tab > Add variable > Configure variables > New
Name: M2_REPO
Path: /home/{username}/.m2/repository
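To find the exact value to enter for the M2_REPO path, the local Maven repository location can be printed from a terminal. This is a minimal sketch that assumes Maven is using its default repository location under the home directory; if a custom location is configured in Maven's settings, that path applies instead.

```shell
#!/bin/sh
# Assumed default location of the local Maven repository for the current user.
M2_REPO="$HOME/.m2/repository"
echo "Path to enter for M2_REPO in Eclipse: $M2_REPO"

# The directory is created by the first Maven build, so it may not exist yet.
if [ -d "$M2_REPO" ]; then
  echo "Repository directory exists."
else
  echo "Repository directory not found; run the mvn install step first."
fi
```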
- Run / Debug Configurations...
- double-click Java Application to create a new configuration
- choose Main class org.archive.crawler.Heritrix
- Arguments tab
- Program arguments: -a PASSWORD -l dist/src/main/conf/logging.properties
- VM arguments: -Dheritrix.development
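For reference, the launch configuration above corresponds roughly to a `java` command line of the following shape. This is only a sketch: it prints the command rather than running it, the function name `heritrix_cmd` and the `HERITRIX_CP` variable are hypothetical, and the classpath is left as a placeholder because Eclipse assembles it from the project's build path.

```shell
#!/bin/sh
# Hypothetical placeholder: set HERITRIX_CP to the classpath Eclipse uses
# for the project (compiled classes plus dependency jars).
HERITRIX_CP="${HERITRIX_CP:-<classpath>}"

# Print the java command line matching the Eclipse launch configuration.
# $1: the value passed to -a (PASSWORD in the steps above).
heritrix_cmd() {
  printf 'java -Dheritrix.development -cp %s org.archive.crawler.Heritrix -a %s -l dist/src/main/conf/logging.properties\n' \
    "$HERITRIX_CP" "$1"
}

heritrix_cmd PASSWORD
```

The relative path given to `-l` suggests the command is meant to be run from the project root (~/workspace/heritrix3), which is also the default working directory Eclipse uses for a launch configuration.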