LC 0828 [H] Count Unique Characters of All Substrings of a Given String - ALawliet/algorithms GitHub Wiki

"res[i]" means the result of all substrings ended at index "i". Final result is sum of all "res[i]" res[i] will be composed by all substrings of res[i-1] appending "i". For each substring in res[i-1], when we append "i" to it:

If there is no "i" in this substring, e.g., append "A" to "B", substring changes from "B" to "BA", the UNIQ value of this substring changes from 1 to 2, we add 1 to all this kind of substrings. If there is one "i" in this substring, e.g. append "A" to "AB", substring changes from "AB" to "ABA", the substring UNIQ value changes from 2 to 1. We minus 1 to this kind of substrings. If there is more than one "i" in this substring, e.g. append "A" to "ABAB", substring changes from "ABAB" to "ABABA", the substring UNIQ value doesn't change. We do nothing to this kind of substrings. So for each new "i", we only care how many same char before it, and what are the indexes of these chars. Actually we only need to know 2 same chars before it. Because situations more than 2 chars can be handled the same as 2 chars.

    def uniqueLetterString(self, S: str) -> int:
        res=[0]*(len(S)+1)
        idxs=[-1,-1](/ALawliet/algorithms/wiki/-1,-1)*26
        for i,c in enumerate(S):
            code=ord(c)-ord('A')
            first,second=idxs[code]
            res[i+1]=1+res[i]+(i-1-second)-(second-first)
            idxs[code]=[second,i]
        return sum(res)%(10**9+7)

class Solution:
    def uniqueLetterString(self, s: str) -> int:
        freq = [-1, -1](/ALawliet/algorithms/wiki/-1,--1)*26
        n = len(s)
        dp = [0]*n
        dp[0] = 1
        code0 = ord(s[0]) - ord('A')
        freq[code0] = [-1, 0]
        for i in range(1, n):
            code = ord(s[i]) - ord('A')
            second_last_idx, last_idx = freq[code]
            dp[i] = dp[i - 1] + (i - last_idx) - (last_idx - second_last_idx)
            freq[code] = [last_idx, i]
        return sum(dp)
    
    def uniqueLetterString(self, S):
        index = {c: [-1, -1] for c in ascii_uppercase}
        res = 0
        for i, c in enumerate(S):
            k, j = index[c]
            res += (i - j) * (j - k)
            index[c] = [j, i]
        for c in index:
            k, j = index[c]
            res += (len(S) - j) * (j - k)
        return res % (10**9 + 7)
    
    def uniqueLetterString(self, s: str) -> int:
        """
        The key ideas behind the solution:
        1. The maximum possible (length?) substrings that can end at an index are i + 1.
        2. The contribution of a character to this substring is (i + 1) - it's last seen position.
        3. At each point, sum of all contributions, gives the number of total substrings found so far.
        4. The last seen position of char is actually i + 1.
        """
        last_position = defaultdict(int) # Used for storing the last position of each character
        contribution = defaultdict(int) # Used for storing the contribution of each character so far. This
        # will possibly be updated throughout the string traversal
        res = 0
        
        for i, char in enumerate(s):
            max_possible_substrs_at_idx = i + 1
            contribution[char] = max_possible_substrs_at_idx - last_position[char]
            
            res+=sum(contribution.values()) # total number of substrings found so far
            last_position[char] = i + 1
            
        return res

Here I use 2 maps, map1 and map2, map1 records the last index for each letter, map2 records the second last index for each letter. They are both initialized as -1(means a letter hasn't shown up) .

Then consider about introducing a new letter, it will contribute (i - last_ch_position) unique chars, and also decrease (last_ch_position - second_last_ch_position) unique chars.

Take an example: ABCBB
i = 0, ch = A, substring = A. A hasn't shown up, so cur += 1, map1:{A:0}, map2:{}
i = 1, ch = B, substring = AB, B. B hasn't shown up, so cur += 2, map1: {A:0, B:1}, map2:{}
i = 2, ch = C, substring = ABC, BC, C. cur+= 3, map1:{A:0, B:1, C:2}, map2:{}
i = 3. ch = B, substring = ABCB, BCB, CB, B. last_b_position = 1, so it only contributes unique char to the index after last_b_position, which is i-map1[B] = 2, ie CB, B. It also decreases unique char in ABCB and BCB, which is map1[B] - map2[B] = 2, so cur+= 2-2, map1:{A:0, B:3, C:2}, map2:{B:1}
i = 4, ch = B, substring = ABCBB, BCBB, CBB, BB, B. It increase one unique char which is "B" (i-map1[B]), but decreases 2 unique chars, which are "CBB, BB" (map1[B] - map2[B]), cur += 1 - 2, map1: {A:0, B:4, C:2}, map2:{B:3}

    def uniqueLetterString(self, s: str) -> int:
        map1 = collections.defaultdict(lambda:-1)    # last index for this letter
        map2 = collections.defaultdict(lambda:-1)   # second last index for this letter
        cur = 0
        res = 0
        
        for i, ch in enumerate(s):
            cur += (i - map1[ch]) - (map1[ch] - map2[ch])
            map2[ch] = map1[ch]
            map1[ch] = i
            res += cur
        return res