Two Heaps - kjingers/Grokking-Notes GitHub Wiki
Inserting into a sorted list is O(n). Inserting into a heap is O(log(n))
heapq.heapify() is O(n). If there is just one element that is changed, then you can use private function _siftup and _siftdown to run in O(log(n)).
_siftdown(heap, startpos, pos)
'heap' is a heap at all indices >= startpos, except possibly for pos.
pos is the index of a leaf with a possibly out-of-order value.
Restore the heap invariant.
_siftup(heap, pos)
So to remove:
h[i] = h[-1]
h.pop()
if i < len(h):
heapq._siftup(h, i)
heapq._siftdown(h, 0, i)
Find the median of a number stream
The basic idea is you have a maxHeap containing the smaller half of the numbers, and a minHeap containing the larger half of the numbers. So, when you go to get the median, you just need the top of the maxHeap (if odd number of numbers), or top of both heaps for the average (for even number of nums).
The create a maxHeap, you store the negative of the number.
class MedianOfAStream:
minHeap = []
maxHeap = []
def insert_num(self, num):
# if maxHeap is empty or num is <= largest value in maxHeap,
# then put in maxHeap. Else, minHeap.
if not self.maxHeap or num <= -self.maxHeap[0]:
heappush(self.maxHeap, -num)
else:
heappush(self.minHeap, num)
# If more elements in minHeap, then move smallest to maxHeap
# elif maxHeap has two or more elements than minHeap, the move largest
# value to maxHeap.
if len(self.maxHeap) < len(self.minHeap):
heappush(self.maxHeap, -heappop(self.minHeap))
elif len(self.maxHeap) > (len(self.minHeap) + 1):
heappush(self.minHeap, -heappop(self.maxHeap))
def find_median(self):
# If even, then return average of middle two
if len(self.maxHeap) == len(self.minHeap):
return (-self.maxHeap[0] + self.minHeap[0]) / 2.0
return -self.maxHeap[0] / 1.0
Sliding Window Median
Combines sliding window pattern with Two Heap pattern. Like sliding window, we iterate though the window adding elements to our two heaps. Once our window is of size 'k', we update result list and then remove the outgoing element at the beginning of the window.
We use two helper functions remove(heap, element
and rebalance_heaps()
class SlidingWindowMedian:
def __init__(self):
self.maxHeap = []
self.minHeap = []
def find_sliding_window_median(self, nums, k):
# Initialize Result array
result = [0.0 for x in range(len(nums) - k + 1)]
for i in range(len(nums)):
# Push to appropriate heap. Then rebalance
if not self.maxHeap or nums[i] <= -self.maxHeap[0]:
heappush(self.maxHeap, -nums[i])
else:
heappush(self.minHeap, nums[i])
self.rebalance_heaps()
# If we have atleast 'k' elements in sliding window, update result array
if (i - k + 1) >= 0:
# If even number of elements, else odd
if len(self.maxHeap) == len(self.minHeap):
result[i - k + 1] = (-self.maxHeap[0] / 2.0) + (self.minHeap[0] / 2.0)
else:
result[i - k + 1] = -self.maxHeap[0] / 1.0
# Delete outgoing number. Then Rebalance
elementToBeRemoved = nums[i - k + 1]
if elementToBeRemoved <= -self.maxHeap[0]:
self.remove(self.maxHeap, -elementToBeRemoved)
else:
self.remove(self.minHeap, elementToBeRemoved)
self.rebalance_heaps()
return result
def rebalance_heaps(self):
if len(self.maxHeap) > (len(self.minHeap) + 1):
heappush(self.minHeap, -heappop(self.maxHeap))
elif len(self.maxHeap) < len(self.minHeap):
heappush(self.maxHeap, -heappop(self.minHeap))
def remove(self, heap, element):
ind = heap.index(element)
heap[ind] = heap[-1]
heap.pop()
if ind < len(heap):
heapq._siftup(heap, ind)
heapq._siftdown(heap, 0, ind)
Maximum Capital
Not very obvious to use two heaps. We defintely need a maxHeap to get the project with the greatest profit that we can afford. We also use a minHeap to go through all the projects we can afford and add them to the maxHeap. We do this after every project, since we have increased capital.
Another option found on leetcode discussions is, instead of using a minHeap, we simply use zip()
to combine and sort based on capital:
pc = sorted(zip(Capital, Profits))
Then, we can just loop through and add to maxHeap for each project.
def find_maximum_capital(capital, profits, numberOfProjects, initialCapital):
minHeap = []
maxHeap = []
# Inset all projects capital into minHeap
for i in range(len(capital)):
heappush(minHeap, (capital[i], i))
availableCapital = initialCapital
for _ in range(numberOfProjects):
while minHeap and minHeap[0][0] <= availableCapital:
capital, i = heappop(minHeap)
heappush(maxHeap, (-profits[i], i))
if not maxHeap:
break
availableCapital += -heappop(maxHeap)[0]
return availableCapital
Next Interval
For this one, we can use two Max Heaps: One based on the intervals starts, and another based on the intervals ends. This way, we can loop though, starting with the interval with the highest end, and determine which interval has the next start. We loop though the maxStartHeap until the max start is no longer >= the current interval's end.
Once we update the result array. We put the next interval back into the maxStartHeap, since it could be the next start for other intervals. We don't need to push other intervals we loop past, since they are guaranteed not to have a closer start than the final one.
def find_next_interval(intervals):
n = len(intervals)
# Default to -1 (no next interval)
result = [-1 for x in range(n)]
maxStartHeap = []
maxEndHeap = []
# Create both Heaps
for i in range(n):
heappush(maxStartHeap, (-intervals[i].start, i))
heappush(maxEndHeap, (-intervals[i].end, i))
# Loop though each interval, starting with the highest end,
# and determine the next start
for _ in range(n):
# Get the interval with the highest remaining end
topEnd, endIndex = heappop(maxEndHeap)
if -maxStartHeap[0][0] >= -topEnd:
topStart, startIndex = heappop(maxStartHeap)
# Look through maxStartHeap until ther are no more intervals
# with a start >= this end
while maxStartHeap and -maxStartHeap[0][0] >= -topEnd:
topStart, startIndex = heappop(maxStartHeap)
# Update the result array
result[endIndex] = startIndex
# Push the next interval back into the maxStartHeap, since it could be
# the next interval for other intervals
heappush(maxStartHeap, (topStart, startIndex))
return result