Given a string and a positive integer k, find all distinct substrings of a given string of any length containing exactly k distinct characters.

For example,

A simple solution is to generate all substrings of the given string and return substrings containing exactly k distinct characters. The time complexity of this solution is O(n3), since it takes O(n2) time to generate all substrings and O(n) time to process each substring.

We can improve the time complexity using extra space. The idea is to process substrings starting with each index i in the input string. In each iteration of the loop, add each character present in the processed substring to a hash table. If the count of distinct characters becomes k at index j at any point of time, include the substring str[i…j] in the result.
Following is the C++, Java, and Python program that demonstrates it:

C++

Download   Run Code

Output:

adce
abcadc
bcadc
abcad
bcad
cadce

Java

Download   Run Code

Output:

[abcad, abcadc, cadce, adce, bcad, bcadc]

Python

Download   Run Code

Output:

{‘bcad’, ‘adce’, ‘cadce’, ‘bcadc’, ‘abcad’, ‘abcadc’}

Download   Run Code
Output:

adce
abcadc
bcadc
abcad
bcad
cadce

Download   Run Code
Output:

[abcad, abcadc, cadce, adce, bcad, bcadc]

Download   Run Code
Output:

{‘bcad’, ‘adce’, ‘cadce’, ‘bcadc’, ‘abcad’, ‘abcadc’}

The time complexity of the above solution is O(n2), and the auxiliary space used by the program is O(n).

Exercise: Modify the solution to find substrings of length k containing exactly k distinct characters.

source