Roles of Maven, Eclipse and Git in PDQ - ProofDrivenQuerying/pdq GitHub Wiki
Eclipse and IntelliJ are IDEs (integrated development environments): advanced text editors that give you many features that make developing software easier, such as:
- Syntax highlighting
- Static analysis (to catch bugs in your code as you are writing it)
- Tool integrations such as for Git and Maven (see below)
- Advanced refactoring
- Many other plugins
Note that nothing in PDQ depends on an IDE. You can code perfectly well in your favourite text editor.
The following instructions explain how to import PDQ into Eclipse, which is very similar to the process for importing into other IDEs such as IntelliJ.
From the Eclipse menu bar, open File > Import.
Then, in the popup window, choose Git > Projects from Git.
From the following window, we can either import from a local repository or directly from GitHub. Here we describe the former.
Choose Existing local repository, then Add. Now you get to select where your repository is stored locally.
Tick the box corresponding to /path/to/git/folder/pdq/.git, press Finish, then select pdq.
The next window lets you decide which level of the project hierarchy you want to import into Eclipse.
Choose the top level (Working tree...), and Import as general project.
The next window will ask how you want to name the project in Eclipse.
Leave it as pdq and press Finish.
Now go back to the Eclipse menu bar's File > Import, and choose Maven > Existing Maven Projects.
In the next window, set the Root directory to /path/to/git/folder/pdq, then check all PDQ sub-projects.
Voila. You are done.
Git is a version control system that tracks changes in the source code of PDQ. It is perfectly possible to download PDQ as a zip file and use it without ever interacting with Git. However, Git is required for developers as it is the mechanism by which new code gets pushed to the GitHub repository.
If you are using Git, from now on you never have to create additional "copies" of the project anywhere. You can create and switch branches, either from Eclipse or from the command line. Note that even if you switch branches from the command line (i.e. outside Eclipse), Eclipse will recognise the change and update the branch display next to each project (although this often takes a few seconds).
To switch branches from Eclipse, right-click on any of the projects and choose the branch you want to switch to. This switches the branch for all the projects at once, even though you right-clicked on a single project.
Maven is a build automation tool for Java projects and one of its main features is dependency management. Again, it is entirely possible to use PDQ without Maven if you are willing to manage the dependencies yourself. If you use an IDE you will likely get all the benefits of using Maven without realising it: the IDE will interact with Maven on your behalf.
PDQ developers use Maven to manage all dependencies (the other Java libraries on which PDQ relies). This means that no compiled library is stored in Git or in the individual PDQ sub-projects. If you see one, post an issue about it: it should not be there.
- Each sub-project has a pom.xml file at its root, which contains various information about that sub-project, including the version number, plugins, dependencies (internal and external), and build instructions (what to build, how and where). If you are familiar with Ant, think of it as the build.xml.
- In addition, there is a pom.xml file at the root of the tree hierarchy, under pdq/, which declares that all these sub-projects form a single entity: PDQ. When you run mvn install on the top pom.xml, it will simply run mvn install on the pom.xml of each sub-project, in the correct order.
- When you run mvn install on a sub-project, it does the following:
  - It looks at the dependencies listed in the pom. These are simply listed as IDs, e.g. the Guava library shows as:

        <dependency>
          <groupId>com.google.guava</groupId>
          <artifactId>guava</artifactId>
        </dependency>

  - Maven checks in its local repository (a directory under ~/.m2/) if the corresponding JAR is present. If not, it will download it from a remote server and store it locally there. On subsequent builds, it will not download Guava again unless, for instance, you decide to use a newer version.
  - Once all the dependencies are available locally, the sources are compiled, the unit tests are run, and a JAR is put under the project's target/ directory with the version specified in the pom (install also copies that JAR into the local repository, which is how the other sub-projects find it). When you want a newer version of PDQ, you update the version in the pom, and the JARs will be built under a new name in target/.
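As a rough illustration of the top-level pom.xml described above, a minimal aggregator POM might look like the sketch below. It is not the actual PDQ file: the version is a placeholder, and the module list is an assumption made for illustration (only regression is named on this page, common is a made-up example).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch of pdq/pom.xml, not the actual PDQ file. -->
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>uk.ac.ox.cs.pdq</groupId>
  <artifactId>pdq</artifactId>
  <version>0.0.0-EXAMPLE</version> <!-- placeholder version -->

  <!-- "pom" packaging: this project builds no JAR of its own,
       it only groups the sub-projects listed below. -->
  <packaging>pom</packaging>

  <modules>
    <!-- Each entry is a sub-directory that contains its own pom.xml. -->
    <module>common</module>     <!-- assumed name, for illustration only -->
    <module>regression</module>
  </modules>
</project>
```

Maven works out the build order from the dependencies between the modules, so a module that others depend on is built first regardless of where it appears in the list.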
In addition to a version, Maven needs certain identifiers to name projects and sub-projects:
- Group Id: an ID that allows gathering components under a single umbrella. You can think of it as a namespace. The one we use in PDQ is uk.ac.ox.cs.pdq.
- Artifact Id: an ID for an individual component, typically a project or sub-project. In our case, the top-level pdq project has Artifact Id pdq, and each individual sub-project has pdq-<sub-project-name>.
These never actually need to be changed, but they are important because Maven uses them for (internal) dependency management. They will also serve as global identifiers if we want to make PDQ available on some public repository one day.
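To show where these identifiers appear, here is a sketch of what a sub-project's pom.xml could look like. It assumes the sub-projects inherit from the top-level pom through a <parent> element, which this page does not state explicitly. The artifact name pdq-regression follows the naming pattern above, and the version is a placeholder.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch of pdq/regression/pom.xml, not the actual PDQ file. -->
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- Identifies the top-level pom by its Group Id, Artifact Id and version. -->
  <parent>
    <groupId>uk.ac.ox.cs.pdq</groupId>
    <artifactId>pdq</artifactId>
    <version>0.0.0-EXAMPLE</version> <!-- placeholder; must match the parent's version -->
  </parent>

  <!-- The Group Id is inherited from the parent when it is omitted here. -->
  <artifactId>pdq-regression</artifactId>
</project>
```

Another sub-project that depends on this one would refer to it by the same Group Id and Artifact Id in its own <dependencies> block.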
For routine development of PDQ it should be rare to make changes to the pom.xml, but the following would require changes to be made:
- Adding a new dependency to a sub project:

  If the dependency is already used in PDQ, but not in the sub project, it should be specified in the <dependencies> block, e.g.

        <!-- In pdq/sub-project/pom.xml -->
        <dependency>
          <groupId>com.google.guava</groupId>
          <artifactId>guava</artifactId>
        </dependency>

  If the dependency is new to PDQ, you must also specify it in the base pom.xml in the <dependencyManagement> block, with a version number, as well as in the sub project's pom.xml without a version number (as above), e.g.:

        <!-- In pdq/pom.xml -->
        <dependency>
          <groupId>com.google.guava</groupId>
          <artifactId>guava</artifactId>
          <version>16.0.1</version>
        </dependency>

- Changing a plugin or dependency version. All version information is kept in the top level pom.xml and only requires changing there.
- Changing the version number of PDQ. Every pom.xml, including the top-level one and those in sub projects, declares a version number. These should (probably) all be changed at once.
- Refactoring the main class entry point for a sub project. If, for instance, you changed PdqRegression to RegressionPdq, you would need to update the following line in the pom.xml in pdq/regression:

        <mainClass>uk.ac.ox.cs.pdq.regression.RegressionPdq</mainClass>
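For the new-dependency case in the first item above, it may help to see how the entry is nested in the base pom.xml. The sketch below reuses the Guava example from this page and shows only the relevant fragment; everything else in the real pom.xml is omitted.

```xml
<!-- In pdq/pom.xml: fragment only, all other elements omitted. -->
<dependencyManagement>
  <dependencies>
    <!-- Fixes the Guava version for every sub-project that uses it. -->
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
      <version>16.0.1</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

An entry under <dependencyManagement> does not add the dependency to any sub-project by itself. It only sets the version that applies once a sub-project lists the same Group Id and Artifact Id in its own <dependencies> block.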
Importing a maven project into eclipse from git
If you get an error installing the m2e-git connector, a possible way to correct it is here