Data Structure: Graph - swchen1234/Algorithm GitHub Wiki
理论
- 图和树的区别:在于图中是否存在“环” => 要用hash map记录已访问过的点
- 图的 BFS 时间复杂度: O(n + m) // n 为点数, m 为边数
- 矩阵的 BFS 时间复杂度: O(RC) // RC个点,2RC条边
- 模版(图的遍历)
q.append(node)
visited.add(node)
while q:
node = q.popleft()
for v in node.neighbours:
if v not in visited:
q.append(v)
visited.add(v)
常见应用: 主要分为两类:图的遍历 和 简单图最短路径。
1. 图的遍历
1.1. DFS Traversal
1.2. BFS LevelOrderTraversal
1.3. Topological Sort
-
求所有拓扑排序 -> DFS
-
求是否唯一拓扑排序 -> BFS queue中最多只有一个节点, 更易实现. O(V+E)
-
模版(拓扑排序, Kahn algo, BFS)
outcoming = {i: [] for i in range(numCourses)}
indegree = [0] * numCourses
for pair in prerequisites:
indegree[pair[0]] += 1 # 存每个点须要多少prerequisite
# 存每个prerequisite, 可以有哪些孩子,从而没pop出一个prerequisite, 其所有孩子的indgree减1.
outcoming[pair[1]].append(pair[0])
lst = []
queue = [i for i in range(numCourses) if degree[i] == 0]
while queue:
noincoming = queue[0]
for outnode in outcoming[noincoming]:
degree[outnode] -= 1
if degree[outnode] == 0:
queue.append(outnode)
lst.append(noincoming)
1.4. Detect Cycle - Trajan's Algo
2. 简单图最短路径
2.1 Without Edge Length
-
最短路径 -> BFS, DP
-
最长路径 -> DFS, DP
-
模版(若要添加path):
precedent= {}
...
while len(q) > 0:
{node = q.pop()
...
queue.append(newNode)
precedent[newNode] = node
}
...
prev = target
while prev != board:
prev = prec[prev]
print prev
-
visited set 中存status:status 的常见表示方法:
- 用node的数字编号
- 若需考虑已访问哪些点 -> bitmask
- 若需考虑已访问哪些点 + 现在所在点 -> (endPosition, bitmask)
- 若需考虑整个2d matrix的状态 -> 化为1d array的tuple,作为status
-
加快速度1. -> bi-directional graph
-
- Node that use two sets instead of queue, because it is easier for us to check if one node in one set is present in the other._
-
- One trick is always search the smaller set by swapping the two sets._
-
set<>start, end;
set<>visited;
start.insert(root);
end.insert(target);
visited[root] = true;
visited[target] = true;
while(!start.empty()){
if(start.size() < end.size())
swap(start, end);
set<> temp;
for(node in start){
if(node in end)
return;
if(visited[node] == false)
foo(each);
visited[node] = true;
}
}
start = temp
}
- 加快速度2. -> A Star Search
The A* star algorithm is another path-finding algorithm. For every node at position (r, c), we have
estimated cost = actual distance from (sr, sc) to (r, c) + heuristic (guess) of the distance from (r, c) to (tr, tc)
常见heuristic guess 可以为曼哈顿距离: node.h = abs(r-tr) + abs(c-tc). Dijkstra's是A* Search 是的一个特例,Dijkstra's中估计距离为0.
用min heap来决定接下来要找的点。 We can prove that if we find the target node, we must have traveled the lowest possible distance node.g. By considering the last time when two backward paths are the same, without loss of generality we could suppose the penultimate square of the two paths are different, and then in this case node.f = node.g + 1, showing the path with less actual distance travelled is expanded first as desired.
2.2 With Edge Length
-
Dijistra
- 最快。有路径长度的图也可以通过bfs/dfs实现,但这里的方法用了greedy algo, 每次只算最有可能的点,并提前终止。
- 用heap存放已经访问过的点(距离,label), 每次pop出最短点,更新其邻居的最短距离,放入heap中,知道访问到终点。 time O(E + V * logV), 其中 edges E, vertices V。
hq = [(0, k)] p2t = {k: 0} while hq: time, p = heapq.heappop(hq) if p == destination: return for nei, d in graph[p]: if nei not in p2t or d + time < p2t[nei]: p2t[nei] = d + time heapq.heappush(hq, (d + time, nei))
-
BellFord \
- 比Dijistra慢。
- 可以处理边长为“负值”的情况。
- run time O(V * E).
- For each node, its adjacent node is updated by val(adjacent node) = min(val(adjacent node), val(node) + edge(node, adjacent node)).
3. 最长路径, 所有路径
- DFS
4. 最小生成树(Minimum Spanning Tree)
Prim's Algoref 从Set = {0}点出发,不断添加连接Set和rem之间最小边(用heap实现),直至所有点被加入
Problems
1. 遍历问题
1.1 DFS DFS traversal
133. Clone Graphdfs(node)返回clone后的node,同时将clone.neighbor = clone(n.neighbor). bfs也可以实现
1466. Reorder Routes to Make All Paths Lead to the City Zero调整路径方向,用map[a][b]=1表示从a到b, 从0开始找邻居,若方向反, cnt+=1, dfs
\
1.2 BFS level order Traversal
1.3 Topological Sort
** BFS(i.e. Kahn algo, recommended) * 检查环 -> 记录已访问个数n,若exit while loop后,n!=N, 则说明有环。 ** DFS
207. Course Schedulebfs, 最后判断已学课程?=课程总数
\
210. Course Schedule IIbfs, 返回所修科目的顺序
1136. Parallel Coursesbfs,每次for循环,semester+=1, 同时需要记录已经taken的课程数,最后返回semester
2050. Parallel Courses III每次学一门课i,更新maxTime[nei] = max(maxTime[nei], maxTime[i] + time[nei],O(V+E),也可以用min heap,存(finished time, courseId),O(VlgV+E),但因为没有遍历每条边并更新,反而更快。
\
269. Alien Dictionary通过比较相邻单词,建立char dependency map, 和indegree[c]. 同时建立set装所有出现过的字母,拓扑排序。tips:1)全部出现的char都需要有入度,防止后面一一对应的时候不能访问到. 2)若w1和w2重合部分相同,且len(w1)> len(w2),则说明有矛盾 3) 最后判断result和degree长度是否相同,说明其中有矛盾的节点出现。
\
310. Minimum Height Trees**Onion Peeling**: 本质是拓扑排序, 从edge只有1的节点出发,逐层删除,直至只剩下2个以下的点(可证明根结点至多2个)
\
444. Sequence Reconstruction拓扑排序,对于每条sequence, 只需对相邻点s[i], s[i+1]构建,无需s[i], s[j](j >> i)
1557. Minimum Number of Vertices to Reach All Nodes只需考虑没有入度的点
\
1.4. Cycle Detection
- Trajan's Algo dfs给每个node标上rank, 若有环,返回最小rank, 删除所有沿途比rank大的边 1192. Critical Connections in a Network
2. 最短路径
2.1 Shortest Path Without Edge
815. Bus Routes若建立stop的neighbor map, 并在queue中放stop, 会TLE. 所以要用bus为element. 方法一:建立stop->bus map, 再以bus为queue element, bfs,每次pop出bus, 遍历其所有站点会停的bus,若没有visited, 加入.方法二:建立bus->bus map,bfs所有bus.
\
752. Open the Lockbfs, 以4位数str密码为节点,每次尝试s[:i]+str(int(s[i])+-1)+s[i+1:]
\
127. Word Ladder建立dependency graph, newWord = word[:i] + chr(ord('a') + k) + word[i + 1:],若newWord在wordSet中,则添加至graph
\
126. Word Ladder II1)用bfs从endWord开始向startWord, 记录在dist[word];2)dfs:从startWord开始,在其neighbor中找到离endWord最近的词,不断搭建
\
433. Minimum Genetic Mutation晚期的word ladder题, 不需要要给每个字母创建邻居, 而是在bfs的搜素中,每个字符用26个字母replace, 看是否存在于候选中。
\
490. The Maze对于每个方向,继续向前走,直到碰壁,若没访问过,存有q
\
773. Sliding Puzzle将grid值化为tuple, 作为queue的state, queue中存(state, 0的值)
\
854. K-Similar Strings(需要optimize neighbour:找到s2和newWord第一个不同的字母i, 遍历j= i+1,...,找到word[j]==s2[i], 将其与w[i]交换,这样可以缩小candidates的范围)\
847. Shortest Path Visiting All Nodesstatus为(endPoint, bitmask存已访问的点)
864. Shortest Path to Get All Keysstatus为(endPoint, bitmask存已拿到的钥匙)
\
1091. Shortest Path in Binary Matrix 1926. Nearest Exit from Entrance in Maze
- A-star Algo
675. Cut Off Trees for Golf Event
将树由低到高排列,以此处理,对于dist += bfs(每棵树, 下一棵树),其中bfs可以替换为astar加快速度
\
2.2 Shortest Path With Edge Length:
* Dijistra
* BellFord
2812. Find the Safest Path in a Grid1)用bfs找到每点到小偷的最短距离 2)值域二分法+BFS肯会导致TLE => 解决方法:dijistra, 用max heap记录Maxsafeness以及位置,同时仍用visited记录访问记录。
\
505. The Maze II`因为每次球移动的距离不等,所以等价于一个有边长最短路径题->dijistra·\
787. Cheapest Flights Within K Stops
Can be solved with both BellFord and Dijistra, For Dijistra, the difficult past is the city pop out each may have a different stops, i.e. the cheapest cost is uniquely determinied by the (dest, stops), not just by destination. So we need a minCost table use (dest, stops) as the key. A follow up question from AirBnb, to output the shortest path, this is through maintain a precedent table, again, , the key is (dest, stops) pair for both Dijistra and bellford(I haven't think of a better way for bellford).
2577. Minimum Time to Visit a Cell In a Grid
3. 最长路径, 所有路径
329. Longest Increasing Path in a Matrixdfs + memo, O(mn)
\
4. 最小生成树(Minimum Spanning Tree)
3419. Minimize the Maximum Edge Weight of Graph建立reverse edge Graph + Prims
\