Graphs - broyda/cp-lib GitHub Wiki
- For 0-based graphs, define the adjacency matrix in the (n) format.
For 1-based graphs, define the adjacency matrix in the (n + 1) format. Else program leads to hard to find bugs.
vector<int> adj[n + 1];
This is saying, we have an array with size (n + 1), and each of these n + 1 slots are occupied by a vector of type integer. So if we want to change the type, we can say something like
pair<int, int> adj[n + 1]
If we create an array of vectors, that need not be sent as &adj inside a function since all arrays are passed by reference.
The core idea is - there exist atleast one node with indegree = 0 that acts as a "seed".
#Topological sorting Whenever problem says, prerequisites then topological sorting should be considered.
Topological sorting is applicable ONLY in case of DAGs (Directed Acyclic Graphs). In case the graph has cycle(s), topological sorting cannot be performed on it.
Works for DAGs, for graphs with cycles need Djisktra's algorithm.
Problems discussed
- Shortest path from source to destination given that certain cells are blocking. Uses implicit graph concept for the solution.
- Print path for the above.
- Variant wherein traversal is replaced by a knight. Manhattan distance = 3 concept
- How many shortest paths exist in the first problem
Key takeaways
- First problem: In BFS, the distance array itself acts like a visited array. We initialize the distance to specific nodes as infinity, so in case distance is infinity, then its unvisited.
- More relaxing and general condition (note: since BFS traverses layer by layer, it is guaranteed that there are no multiple distances involved here. The only distance is the one set by the condition below e.g. current node's distance + 1
// "node" is current's neighbor
if (dist[node] > dist[cur] + 1) {
dist[node] = dist[cur] + 1;
}
// instead of
if (dist[node] = 1e9) {
dist[node] = dist[cur] + 1;
}
- Data structure wise, the explicit version of the graph is easier to deal with than the implicit grid based data structure. Logic remains pretty much the same.Implicit means you're creating a graph from the rules inherent in the problem statement. Explicit means to use a proper data structure for the graph (say, adjacency list).
- Instructor mentions there are only a limited set of ideas.
There is an interesting property of the queue that is used to push/pop the nodes into. Beyond a certain line will be nodes which are at a distance (x+1) and not x. x is the distance of the node that we are currently processing (e.g. the queue's top). Technically, we use DEQUE to push at the back. So some nodes may be pushed twice into the queue.
int count = 0; // global
int in_time[n];
int out_time[n];
void dfs(int node) {
in_time[node] = count++;
// No need for a special check on the leaf node since the following
// for loop won't be executed
for (neighbor: adj[node]) {
dfs(neighbor);
}
out_time[node] = count++;
}
void dfs(int node, vector<int> adj[], int visited[], int ans[]) {
ans[node] = 0;
visited[node] = 1;
pr(node, ans[node]);
for(auto nbr : adj[node]) {
pr(nbr);
if(!visited[nbr]) {
dfs(nbr, adj, visited, ans);
ans[node] += (1 + ans[nbr]);
}
}
}
void solve() {
int n;
cin >> n;
vector<int> adj[n + 1];
int visited[n+1] = {0};
for (int i = 2; i <= n; ++i)
{
int m;
cin >> m;
adj[i].push_back(m);
adj[m].push_back(i);
}
// Debugging only - print adjacency matrix
for (int i = 0; i <= n; ++i)
{
for (auto x: adj[i]) {
pr(i, x);
}
}
int ans[n+1];
dfs(1, adj, visited, ans);
}
Depends on the problem at hand. Complexity wise they are pretty much the same. However, there may be some problems that are better "suited" to the underlying implementation of either one of these two mainstream algorithms. So it's good to know the implementation of both.
int prims_mst(int V, vector<vector<int>> adj[])
{
priority_queue<pair<int, int>,
vector<pair<int, int> >, greater<pair<int, int>>> pq;
vector<int> vis(V, 0);
// {wt, node}
pq.push({0, 0});
int sum = 0;
while (!pq.empty()) {
auto it = pq.top();
pq.pop();
int node = it.second;
int wt = it.first;
if (vis[node] == 1) continue;
// add it to the mst
vis[node] = 1;
sum += wt;
for (auto it : adj[node]) {
int adjNode = it[0];
int edW = it[1];
if (!vis[adjNode]) {
pq.push({edW, adjNode});
}
}
}
return sum;
}
};
class DisjointSet {
vector<int> rank, parent, size;
public:
DisjointSet(int n) {
rank.resize(n + 1, 0);
parent.resize(n + 1);
size.resize(n + 1);
for (int i = 0; i <= n; i++) {
parent[i] = i;
size[i] = 1;
}
}
int findUPar(int node) {
if (node == parent[node])
return node;
return parent[node] = findUPar(parent[node]);
}
void unionByRank(int u, int v) {
int ulp_u = findUPar(u);
int ulp_v = findUPar(v);
if (ulp_u == ulp_v) return;
if (rank[ulp_u] < rank[ulp_v]) {
parent[ulp_u] = ulp_v;
}
else if (rank[ulp_v] < rank[ulp_u]) {
parent[ulp_v] = ulp_u;
}
else {
parent[ulp_v] = ulp_u;
rank[ulp_u]++;
}
}
void unionBySize(int u, int v) {
int ulp_u = findUPar(u);
int ulp_v = findUPar(v);
if (ulp_u == ulp_v) return;
if (size[ulp_u] < size[ulp_v]) {
parent[ulp_u] = ulp_v;
size[ulp_v] += size[ulp_u];
}
else {
parent[ulp_v] = ulp_u;
size[ulp_u] += size[ulp_v];
}
}
};
int kruskal_mst(int V, vector<vector<int>> adj[])
{
pr("inside");
vector<pair<int, pair<int, int>>> edges;
for (int i = 0; i < V; i++) {
for (auto it : adj[i]) {
int adjNode = it[0];
int wt = it[1];
int node = i;
edges.push_back({wt, {node, adjNode}});
}
}
DisjointSet ds(V);
sort(edges.begin(), edges.end());
int mstWt = 0;
for (auto it : edges) {
int wt = it.first;
int u = it.second.first;
int v = it.second.second;
if (ds.findUPar(u) != ds.findUPar(v)) {
mstWt += wt;
ds.unionBySize(u, v);
}
}
return mstWt;
}
void solve() {
int n, e; // Number of vertices and edges
cin >> n >> e;
vector<vector<int>> adj[n];
for (int i = 0; i < e; ++i) {
// Input has been provided as a triplet [x, y, weight of the edge xy]
int x, y, weight;
cin >> x >> y >> weight;
vector<int> tmp(2);
tmp[0] = y;
tmp[1] = weight;
adj[x].push_back(tmp);
tmp[0] = x;
tmp[1] = weight;
adj[y].push_back(tmp);
}
int ans = kruskal_mst(n, adj);
cout << ans << endl;
}
When we implement path compression using the following line inside a recursive function, the code essentially is resetting the parent node of every node in the path (from the parent to that specific node for which we want to find the parent for) to the representative parent.
int find_set(int v) {
if (v == parent[]) {
return v;
}
return parent[v] = find_set(parent[v]):
}