DEG - maccergit/Rosalind GitHub Wiki

Again, "graphs" are a heavily studied area in both mathematics and computer science. This exercise introduces an item that will be heavily used in later exercises : edge-list format. I also explore using a Python library that provides support for graphs, and start building a utility module - which introduces the use of PyUnit for unit testing.

01

This is the direct approach - no special libraries. Here, we simply count the number of paths where the vertex is the head of the path for each vertex. Note that I realized that the extra information provided by Rosalind's version of edge-list format allows for specifying nodes that do not have paths, and included the possibility in the solution - although it does not appear to come up in this problem.

02

There is a Python library that provides support for graphs - "networkx" (pronounced "networks"). It even includes the ability to read a file in edge-list format - but the format it expects is slightly different from the format used by Rosalind. However, once the graph data structure has been built, it's a simple function call to get the degree list for the graph.

03

Since I expect reading Rosalind edge lists to process with as networkx graphs may be something that comes up a lot, I pulled this custom processing into a separate utility project that I can import. Since this will be heavily used, I also want to be sure that special cases are correctly handled - so I included a large number of unit tests. This seemed like a good time to Starr using PyUnit to drive the unit tests - not only does PyUnit make it easier to write unit tests, but it also integrates nicely with PyDev.

However, getting the module properly imported is a pain. One quick and dirty approach is to simply add the project holding the module as a "Project Reference" in the current project's preferences. However, that appears to include the referenced projects modules as if they are in the current project directory - and the resulting imports don't support fully qualified names ("import graph as ut", instead of "import utility.graph as ut"). The problem with this is as the number of utility modules grows, they can result in names that can collide with other libraries - by grouping them all under the "utility" high level qualifier, the odds of a name collision are very low - we say it "does not pollute the global namespace." To get this to work in PyDev is a bit tricky - instead of including the project as a "Project Reference", we need to add the project source directory as an "External Library Source". The warning about this being outside the workspace, and changes will not be picked up, appears to be overly dramatic - I've yet to see any changes made to the library not get picked up, as the library is access at run time. It looks like if an incompatible change is made to the API (different number of arguments, perhaps), then the callers are not flagged with errors until run time. While compile time errors are preferable to run time errors, I can live with this if I get better names for my imports.