395. Longest Substring with At Least K Repeating Characters - cocoder39/coco39_LC GitHub Wiki

395. Longest Substring with At Least K Repeating Characters

Option 1: divide and conquer

class Solution:
    def longestSubstring(self, s: str, k: int) -> int:
        
        def helper(start, end):
            counters = collections.defaultdict(int)
            index_map = collections.defaultdict(list)
            for i in range(start, end+1):
                ch = s[i]
                counters[ch] += 1
                index_map[ch].append(i)
            
            mid = -1
            for ch in counters:
                if counters[ch] < k:
                    mid = index_map[ch][0]
            
            if mid == -1:
                return end - start + 1
            
            return max(helper(start, mid-1), helper(mid+1, end))
            
        return helper(0, len(s)-1)

Time complexity: O(n) = O(k) + O(n-k) + n. Worst case time complexity is O(n^2) when k is always 1. Best case and avg case time complexity is O(n log n). (link to Master theorem)

Option 2: sliding window

Inspired by 340. Longest Substring with At Most K Distinct Characters

maxUnique <= 26 so overall time complexity is O(26 * N) = O(N)

Main idea is to enumerate all substrings with 1 to maxUnique distinct letters.

With one pass (i.e., T(n)), we can get

  • all substrings with targetUnique distinct letters
  • kRepeatCount: number of distinct letters which repeats at least k times

Once kRepeatCount == targetUnique, we know that the entire substring meets the requirement

class Solution:
    def longestSubstring(self, s: str, k: int) -> int:
        maxUnique = len(set(list(s)))
        res = 0
        for targetUnique in range(1, maxUnique+1):
            left, right = 0, 0
            curUnique, kRepeatCount = 0, 0
            counters = [0] * 26
            while right < len(s):
                if curUnique <= targetUnique:
                    idx = ord(s[right]) - ord('a')
                    if counters[idx] == 0:
                        curUnique += 1
                    counters[idx] += 1
                    if counters[idx] == k:
                        kRepeatCount += 1
                    right += 1
                else:
                    idx = ord(s[left]) - ord('a')
                    if counters[idx] == k:
                        kRepeatCount -= 1
                    counters[idx] -= 1
                    if counters[idx] == 0:
                        curUnique -= 1
                    left += 1
                if curUnique == kRepeatCount:
                    res = max(res, right - left)
        return res