3 years ago
For a particular substring, our aim is to replace the characters that are least frequent. E.g. "ABABCA". In substribng "BABC" with k = 2, we can replace A and C. Do we care what characters these are? No. All we care about at any point is the most frequently occuring characters, because we can replace the other characters to form a valid substring. How do we know if a particular substring is valid? If we count the occurrences of the least occuring string and compare it to the current window length, we can get the number of characters that we need to replace. If this value is lesser than or equal to k, we can perform a valid replacement.
We can use a sliding window to keep track of the current substring, and also a variable maxf to keep track of the character with the highest count so far. The typical intuition is that we can maintain a hashmap containing character count and at each substring, we get the character with the highest frequency and check whether we can do the replacement or not. But, we don't need to do that.
Since our goal is to maximize the answer, i.e. find the longest substring, we can just keep track of the highest frequency. It doesn't matter if our left pointer moves forward. The result is only going to maximized when we find a new max frequency. Remember, our condition for success is susbtring_length - maxf <= k. Since we're maximizing the length, we need to maximize maxf to satisfy our condition. Decrementing maxf is not going to change our result. We can only perform a successful replacement if the current substring length minus maxfis less than or equal to k + maxf. If it is, we can keep expanding the window and updating res. If it is not, we have to shrink the window from the left side, because the current substring at that point contains too many characters for a successful replacement.
| Time | Space | Explanation |
|---|---|---|
O(n) | O(min(m, n)) | where n is the number of characters and m is the length of the charset used. |