9.5 Challenges & Classics - swchen1234/Algorithm GitHub Wiki
Leetcode #4 Median of Two sorted array (it is extrememly painful to get this question right, with so many corner cases. There is also a faster binary search version. Suggest to do it again, again....Honestly, I am even not sure if it is an DC solution.)
unique letter string count for each char, how many times it appear uniquely in a substring. Specifially, use index[26][2] to record the most recent location of appearance of char the character. when a new appearance is encountered, res += (index[c][1] - index[c][0]) * (i - index[c][1]), and then update the most two recent occurences.
Prefix and Suffix Search It is so painful to get it right, always have trouble with off-one. The official solution is add suffix#wholeWord into the trie, and search the given pair by just suffix#prefix. the reason to store wholeword instead of the prefix is to save space, as long as we put a index to the node after #, indicating it can be the end, then it is done. In addition, the index of the leaf node in the trie are always make to be the largest during the construction.
Leetcode #169 Majority Element This is a divide and conquer classic, but can also be solved by a very smart O(n) approach. The approach is very similar to the buy stock problem, go through the list, keeping track of the current majority element and its count, but the number is different from the majority, reduce the cnt, otherwise, increase it. If the count reduce to zero, it means for all the elements we have visited, the majority takes half and the remaining element takes the same number totally. So we can forget about the previous elements visited and set count to zero. This can be proved through contradiction.
Text justification This is a painful but popular hard question. The common way to handle corner cases is to count the number of words per line first, and then construct each line by concatenating word and space with proper length( decided by / and % ).
Cat and Mouse So hard, cannot get the last case right using DFS(minimax) + memorization. The official answer is too complicated to understand.
Trapping Rain Water This is a tricky one. All different approach have the same gist, that is the area of index i is determined by the min(left_max, right_max) - height[i]. With this in mind, one can come up with a smart two pointer approach, if the left_max < right_max, then the area of left must be upbounded by left_max, hence we can keep moving the left pointer, adding to the area and updating it until it is higher than the right max.
The Skyline Problem
This problem can be solved so brilliantly with just one pass from left to right, by using a sorted array(implemented through multiset in C++) to keep track of the available highrise at given critical point. If we reach the left wall of the building, we add the its height to the multiset; we remove its height when we reach the right wall of it. Meanwhile, we return the maximal height in the multiset after the insertion and delete at each critical point. This idea is so intuitive and also easy to implement.
The problem can also be solved through divide and conquer by keep tracking of the currLeftHeight and currRightHeight during merging.
Largest Rectangle in HistogramThis is a tricky one, can be solved using stack in O(N). At all times, we maintain the stack only contained sorted elements with biggest on top. We visit each rectangle and pushed into stack exactly once, if it is smaller than the top of the stack, in order to maintain the stack sorted, we need to pop out all elements bigger than it until we are able push it into the stack. each time we pop out elements, we calculate the area under it. Actually, we are getting the maximal rectangle with this element as the smallest height. Finally, we handle the remaining elements in the stack, they are all ending in the last right side.
Maximal Rectangle Can be solved by DP by going through each row, keep track left most, right most and height for each column. left[c] represent the left most column if the rectangle contains the consecutive bin above matrix[r][c], and height[c] also represent the length of the consecutive bin above matrix[r][c]; Can also be solved similarly as largest Rectangle in Histogram by considering each row as the ground level, construct the largest histogram on to this level. So it equals solve the largest histogram row number of times.
Queue Reconstruction by Height This question can be solved elegantly through a tricky way. The key point is to think reversely and use the fact that inserting short person won't impact the second value of higher person, but the opposite is not true. So we will construct the queue from the tallest person first.
Insert Intervals The idea is very simple: push all intervals into the result before it overlaps with the newInterval; upon overlapping, update the newInterval, until no overlapping(newinterval.end < interval[i].start); push the rest intervals into res.
Employee Free Time This is almost the same as above, except need to flatten the array at the beginning. O(clgc)
Russian Doll Envelopes Sort by the first element first, then find longest increasing subsequence for the second element, this is done by binary search, for each element we find its position in the exisiting increasing sequence, if at the end we just add it, and update the result; otherwise, we can always insert the element at the right place to make it increasing and erase the rest, since they appear before the element. Actually, we don't even need to erase the rest. if the new elemnt is longer than the last element, we just need to replace the new element with the first one bigger than this, we are good to keep the sequence increasing. Note that by doing so, the exisitng sequence is not neccsary a valid answer, since smaller element may appear earlier than big element. But we don't care, since we will only update the result when the length changes.
max sum of rectangle no larger than K A variant version of 2d max subarray problem.
Minimize Max Distance to Gas Station I should actually create a category for this kind of binary searching problem, namely, converting the problem and search in the answer space. For this question, we can use a heap to store the interval and current interval split number pair in O(KlgN) but more creatively, we can think of the problem as : given min distance D, can we achieve in less than K splits. to find this max D, we do Binary search.
Maximum Average Subarray II This is another example of converting problem and binary search. Convert it to whether we can find a max subarray average higher than the the search value and update the hi and lo. The converted problem is also not trivial, since we are working with average now instead of sum, to simplify, subtract each element by the checked average.
Large Number This is not hard. the key is use the smart comparitor (s1 + s2 ? s2 + s1)
Max Points on a Line This question is not hard, however, it has one of the fewest acceptance rate due to a lot of corner cases. The key point is to figure out how to store slope as key in map. If use double as key map, we can easily get wrong answer. To fix it, we use pair<int, int> to represent double by rational number. To get unique one, we use greatest common divisor to unify quotient.
Minimum Cost To Hire k Workers This is an intersting problem. Trying to find the minimal total comp C, s.t C > (w_i / q_i ) / sum(q_i), where i belongs to K. To intuitively, sort the worker by the smallest w/q, the last one in the K set define the threshold ratio, so that the total comp is propotional to it(C = w_i / q_i * sum(q_i)), but the smallest K set doesn't garantee to have the smallest (w_i / q_i ) / sum(q_i), so we use a heap to maintain the K set returns out the minimal sum(q_i).
Consecutive Numbers Sum This is surprisingly easier if get the math. I solve it based on the mod, if k is odd, N % k return 0, then the N / k is exactly in the middle of the consecutive array, and mod zero means we can allocate the N - N / k evenly around N/k; if k is even, then N/k returns us the left center, then we can again allocate evenly around N/k, but this leaves us the right most element uncovered. the difference between it and the left center is k/2, so when N%k == k/2, we can find consecutive. The last thing is the boundary of k, since N/k + k/2 < N && N/k - k /2 > 0, we can have k <= sqrt(2 * N).
Number of Digit One Harder than I expected, while can be solved very elegantly. At each digit, there are two types of ones: 1) ones happens at regular interval, e.g. At unit digit, 1, 11, 21, 31, 41; 2) Ones that happens consecutively e.g. at tenth digit, 10, 11, 12, 13, 14 ... Therefore when we count the number of digit one from the back to the begging at each digit, we need to consider both. For example, at tenth digit, add n/100 * 10 and min(max(n%100 - 10 + 1, 0), 10).
Reaching Points It can be solved pretty elegantly in O(lg(max(tx, ty)). Noticing that if solving backward, each state can only have one predecssor, (tx - ty , ty ) or (tx, ty - tx). Furthermore, to avoid case like tx - ty - ty - ty - ty -... happens, we can do tx %= ty( if sx > ty ).
Best Meeting Point This looks like the Shortest Distance From All Buildings at first glimpse, except no barrier. The solution is much simpler, by simply using the median of the cols and rows.
Sum of Subsequence Width Surprisingly, this is solved by math. Since we are considering the min/max of subsequence, we can sort A (why?). Now for each A[i], the left is always smaller than A[i], see formula derived in official solution.
Self Crossing This is purely math problem with a lot of corner case to consider. basically, there are three scenario can lead to crossing: crossing with four/five/six lines.
K Empty Slot The trick is to convert the days->flower to flower->days. then apply the sliding window approach to find the earliest day with all k elements in the middle bigger than the two ends. In the official solution, two approaches for sliding window: 1) use a deque to always keep the window in increasing order. 2) An more simpler approach is just to compare each element with the left and right, whenever i smaller, update the left to be i, and right adjusted accordingly.
Longest Substring With At Most Two distinct Characters It can be solved by using a hash table to store the unique char in the sliding window, and their last occurences. Whenever we find a new element not in the hash table, we find the key with the smallest last occurance, and remove it from the hash table, and moving the left pointer of the sliding window to the one to the right of the removed char value.
Longest Substring With At Most k distinct Characters The approach mentioned above for k = 2 case also works here. But there is even a faster approach, by just counting occurances of elements in the sliding window, as well as the number of unique number. if the unique number > k when moving right pointer, then keep moving left pointer until the unique number reduce to k.
Substring With Concatenation of All Words It is another typical hash table with two pointer substring question, the idea is always find the right pointer qualifed the substring and move the left pointer to disqualifed the substring indeed it is very similiar to minimum window substring.
Sliding window Median Use min & max heap(multiset) to maintain the median. remove the element whenever the window pass it. Even if there is duplicate, multiset always return the one of them when find() and erase() it. O(n*lgk).
Basic Calculator It is not as easy as it looks like at first glimpse. The stack is used to store the sign before the ( and pop out after ).
Basic Calculator III I copy the answer from other's discussion and it is very concise. the idea is in each iteration either get the value from number or from (), and use the last op sign to apply to the stack. Need to test if can be applied to Basic Calculator.
Longest Valid Parenthesis This can also be solved with DP in O(n), but stack is more easy. We use the stack the store the previous idx of the last valid expression. So at begining, we push -1 into the stack, when we see '(', we push it into the stack. if we see ')', we pop out elements from the stack, if the stack become empty, it means the last ')' is not valid. we push ')' into the stack again. otherwise, we just update the max.
Number of Atoms It is simply tedious. Can be solved with recursion, stack, or RegExpr(haven't take a closer look yet), all in O(N^2). For my stack solution, always push pair<string, cnt> to the stack, as well as '(', when seeing a ')', pop everything out to a hash table and multiply by the cnt following ')', and push back to the stack.
Binary Tree postOrder Traversal This is really a great question, hardly labeled as hard. The trick is to realize that we create the stack in a kind of pre-order way(everytime we pop out the top element add it to the res, and push its left and right into the stack). So the right kid pop out first, and this is the only difference compare with pre-order. As a result the res stores the root earlier, we just need to reverse. (see https://leetcode.com/problems/binary-tree-inorder-traversal/, which is super cool too.)
Shortest Distance From All Buildings use BFS from each building, calculate distance to each point, mark the original grid -1. from the second building, calculate distance to all previous marked -1 building, maker them -2.... calc the point with the smallest total distance.
24 Game Not hard. Use backtracking for all possible permutation.
Word Pattern II Use backtracking to check, while maintaing a map<char, string> and a set to avoid different key map to the same string.
Sum of Distances In Tree I spent the whole Friday night on this. The optimal solution is O(N), by DFS twice. Assuming root from node 0, in the first DFS, populate count[node] and subtree[node]; in the second DFS, get answer[node]. The key point is to realize that ans[k] = ans[node] - count[k] + (N - count[k]) when DFS top down in the second round.
Closest Binary Search Tree Value IIBST This is actually not hard. The official solution is to bst the target value while putting the smaller value into predecessor and larger value into successor along the search. Then we have two stacks, predessor stores the left node in decreasing order and successor stores the right node in increasing order, every time we add the one of the two tops which is closer to the target, we unwind all left node of the top node's right into the stack(predessor in this case)... until we get k values. There is even simpler solution, just in order visit BST, while maintaining the size-k deque. if the new distance to target is smaller than the deque front, pop out the front, and add it to the back. By the nature of BST, the shape of the deque is alwyas ><.
Shortest Path Visiting All nodes It looks like tsp, but much simpler in that each node can be visited multiple times, and length between node is the same. The idea is similar, by storing the two status in the dp table [visited nodes in bit form][last visited node]. To use BFS, each starting node is push to a queue, and update the dp table. until all nodes has been visited. O(2^N * N) time and space complexity, why?
Trapping Rain II The trick is to realize that what decide the height of the water is the shortest border. Given that, we can always put all the surrounding border into a heap and pop out the shortest one and dfs. Realizing that it is just an extended version of Trapping Rain Water.
Cut Off Trees For Gold Event This is almost equivalent to brute force. find the tree in increasing order using a heap, for each tree to be walked to, use DFS to find the distance from the source location to the target location.
Guess the Word This problem is solved heuristically. Basically, you can always guess a number, find the candidates with the same matching number as the master -> randomly pick a number from the candidates again, find the candidate that both in the previous candidates pool and matching the number to the previous guess. So that we can keep reducing the size of the candidates pool. To speed up, find the guess with min max potential candidates pool(maximial of candidates size matching master's return).
Candy find the max( length of left increasing subarry, length of right increasing subarry )
Sliding Puzzle This is really not hard. Simpy use a string to encode the board status by turning it to 1dim str, find 0's position, and swap it with the nearby position(by having a map<0's location, vector>, if not visited, then push it to the queue.Time complexity O((r * c)!), since each board state can be checked at most once. i don't understand why the solution gives O(r * c * (r * c)!).
Stamping the Sequence It is very natural to think backward approach. I then take an almost brute force approach by comparing each window with the stamp, when match, mark the stamped window with '*', which will be used as a wildcard in future matching.
Frog Jump This is not hard, use a mp<stone, set> to all steps that can lead to a stone. visiting the stones from smallest to biggest, forward propagate this hash map. Finally return if the set of the last element is empty.
Super Egg Drop This is a classic. Two ways to solve 1) use binary search to speed up dp[K][N] = min_X(max(dp[K-1][N-X], dp[K][N-X]), the time complexity can be reduce from O(kN^2) to O(kNlgN). 2) A better approach is to use f(m, K) represent the maximal number of floors can be tested using m moves and K eggs, then we have f(m, K) = f(m - 1, K - 1) + f(m - 1, K). if the first egg breaks, then the minimal dropped floor is below the tested floor; otherwise, it is above the tested floor, so the total number of floors can be tested is the sum of them. O(klgN).
Interleaving String I failed to get the solution before looking at the answer, although this is solved by traditional DP by initializing a n by m boolean array, dp[i][j] represent whether s3[:i+j-1] is interleaved by s1[:i] and s2[:j]. The remaining is just put the number in the 2d matrix by row or col, and return the right corner value.
Encode String With Shortest Length The problem can be solved by setting dp[i][j] stores the min encoded string for string[i:j], and dp[i][j] = min(dp[i][j], dp[i][k] + dp[k + 1][j]). Also, for each substr s[i:j] we need to check if there is repeating pattern, the way(maybe not best) can be check if substr exists in substr + substr, if the starting position is not 0, then there is repeatence.
Count Different Palindromic Subsequence Similarly, we use dp[i][j] to record the count of palindrom within S[i:j], if the two end is different, we set S[i:j] to be the max of S[i+1:j] and S[i:j-1], and when they are the same S[i:j] = 2 + S[i+1:j-1]. However, we will uncover the situation that aa..a. therefore, we use dp[k][i][j] to record the count of parlindrom staring from 'a' + k.
Paint House Not hard. iterate each row, store the minimal two cost and indices up to row i. if the next row's index is not the same as last row's minIdx, then its min cost using k the color for house i will be the minCost up to row i -1 plus the cost[i][k]. other wise, it is the second min total cost up to row i - 1 + cost[i][k].
Number of Music Playlists Brilliant. use dp[i][j] be the number of playlists of length i that have exactly j unique songs.
Minimal Number of Refueling Very interesting problem. Two ways: 1) use dp[i] record the max distance achieved by i stops. 2) watermarker method. whenever the watermarker is negative, pop out the maximal fuel from the heap until positive.
Arithmetic Slices II Subsequence I didn't do a good job, since i spent a long time on map<pair<int, long>, int> with memory limit issue while the official solution use vector<unordered_map<long, int>> to solve the issue. I dont know why the pair as a key may takes much bigger memory, but it seems to be. Also, I spent a very long time figuring out how to set the count for first difference before becoming a valid arithmetric subsequence.
Special Binary String This is very special question, as you need to visualize the movement. Intuitively, we decompose the string into several mountains, and reorder and concat the mountain in decreasing order. Every mountain can be viewed as '1' + middleMountain + '0', hence the specialy represention of middleMountain be again be solved through recursion.
Strobogrammatic Number III Iterate all possible lengths of string, backtrack the possible solution and if within the range add count.
Bricks Falling when hit I make it so tedious and slow. But the basic idea is to first knock the bricks down, so that it becomes seperate set, and add the hitted bricks back reversely, and count how many new bricks are connected to the ceiling after each adding.
Best Time to Buy Sell Stocks I Can buy and sell only one time -> equal to maximal drawdown. keep track the minimal, compare the difference of the element minus minimal with the maxProfit so far. update the minimal when smaller minimal. Formed in DP way: dp[i] = max(dp[i - 1], max(0, diff))
Best Time to Buy Sell Stocks II Can buy and sell many time -> sum up all the positive difference. Formed in DP way: dp[i] = max(dp[i - 1], price[i] - minPrice).
Best Time to Buy and Sell Stock with Cooldown Must wait for one day before buy again -> solved by DP. but if just use one dim cache, dp[i] cannot be derived from dp[i - 1], since the status of dp[i - 1], can be two types, either ends with hold or not holding. therefore, we used two caches array to keep track of max profit if the last status is hold or not, and the two array inter-update each other.
Best Time to Buy and Sell Stock with Transaction Fee Can trade multiple time but with fee -> The main difference with [Best Time to Buy Sell Stocks II] is that buy at day 0 and sell at day i is not longer the sum of each day's profit on a increasing sequence. Again it can be converted to two seuqnce
Best Time to Buy and Sell Stock III Can trade at most two times -> Solved elegantly by using four status to represent the status, hence four dp array.
Best Time to Buy and Sell Stock IV Can trade at most k times -> This is a generalization of the previous case, in which we have 2k status. So we can think up updating a 2D dp matrix with 2k rows and n cols. The update is done from left to right when reading the price along time.
- A common trick among all the previous questions is use O(1) to only cache the current value and update. Since in updating both Hold[i] and notHold[i], it involves Hold[i-1] and notHold[i-1], a natural solution is cache the previous value, however a more smart way is start from the end where no other variable relies on it. For example, here update NotHold[i], first and Hold[i]. So that we can save the time storing the temp variable.
- Combination Sum: 给定[v1,v2,v3, ...], 找到加起来等于target的所有子集 =》DFS
- Coin Change: 给定[v1,v2,v3, ...], 找到加起来等于target的最小子集/子集个数 =》 DP
原型(39. Combination Sum): 元素不重复,每个元素可以用无限次
- 变种 40. Combination Sum II: 元素可能重复,每个元素只能用一次 =》 sort; if i == idx 或 nums[i] != nums[i-1];dfs(i+1,..)
- 变种 216. Combination Sum III: 找到所有长度为k的子集且其相加为target; 元素不重复,每个元素只能用一次 => 除了判断cum == target, 还要判断len(subset) == k
- 变种 377. Combination Sum IV: 找到子集和target的个数; 元素不重复,每个元素可以用无限次,元素不同顺序视为不同组合 =》 本质是上Coin Change II 改变dp中loop的顺序, for val in range(target): for num in nums:...
- 322. Coin Change: 找到子集和为target的最小子集 => dp[amount+1], for coin: for amount in range(amount + 1):dp[val] = min(dp[val], dp[val-coin]+1)
- 518. Coin Change II: 找到子集和为target的子集个数 => 和上题的唯一区别在于dp的更新, dp[val] += dp[val-coin]
- 找到所有组合=》dfs
- 找到最小组合/组合个数 => dp
- 找到组合个数 且 order matter => change for loop order
Graham Scan O(nlgn) sorting, O(n)scan
def ccw(p, p1, p2):
# test whether p-p2 is on the counter close wise side of p-p1
return (p2[1] - p[1]) * (p1[0] - p[0]) - (p2[0] - p[0]) * (p1[1] - p[1])
def buildConvexHull(stones):
if len(stones) == 0:
return []
# step 1: find lowest stone
loid, lo = min(enumerate(stones), key=lambda x: x[1])
print lo
# step 2: sort by angle to lo
toAdd = [s for s in stones if s != lo]
toAdd.sort(cmp = lambda p1, p2: ccw(lo, p1, p2) * -1 )
print toAdd
# step 3: Add each one to the stack
stk = deque([])
stk.append(lo)
stk.append(toAdd[0])
for s in toAdd[1:]:
while len(stk) >= 2 and ccw(stk[-2], stk[-1], s) <= 0:
stk.pop()
stk.append(s)
return list(stk)