Techie Delight
Coding made easy
Given an array and a positive number k, check whether the array contains any duplicate elements within range k. If k is more than size of the array, the solution should check for duplicates in the complete array.


Input:
A[] = { 5, 6, 8, 2, 4, 6, 9 }
k = 4

Output: Duplicates found

(element 6 is repeated at distance 4 which is <= k)


Input:
A[] = { 5, 6, 8, 2, 4, 6, 9 }
k = 2

Output: No Duplicates found

(element 6 is repeated at distance 4 which is > k)


Input:
A[] = { 1, 2, 3, 2, 1 }
k = 7

Output: Duplicates found

(element 1 and 2 is repeated at distance 4 and 2 respectively which are both <= k)

Naive solution is to consider every sub-array of size k and check for duplicates in it. The time complexity of this solution is O(n.k2) since there can be n subarrays of size k and each subarray might take O(k2) time for checking duplicates.

The problem can be efficiently solved using hashing in O(n) time and O(n) extra space. The idea is to traverse the array and store each element along with its index in a map. i.e. (element, index) pairs as (key, value) pairs in a map. If any element is already found present in the map, we simply check if that element is repeated within range of k using its previous occurrence information from the map.
The algorithm can be implemented as follows in C++, Java and Python:

C++

Download   Run Code

Output:

Duplicates found

Java

Download   Run Code

Output:

Duplicates found

Python

Download   Run Code

Output:

Duplicates found

Download   Run Code
Output:

Duplicates found

Download   Run Code
Output:

Duplicates found

Download   Run Code
Output:

Duplicates found

We can also use Sliding Window approach to solve above problem. The idea is process every window of size k one at a time and store its elements in a set. Now if any element repeats in the window, then we can say that it is repeated within range of k.
Initially our window will contain the first k elements of the input. Then for each element of the remaining input, we add it to the current window. While adding the i’th element of the input to the current window, we have to remove the (i-k)’th element from the window at the same time. This is done to maintain efficiency of the solution and to keep window balance at any point.
Below is C++, Java and Python implementation based on above idea –

C++

Download   Run Code

Output:

Duplicates found

Java

Download   Run Code

Output:

Duplicates found

Python

Download   Run Code

Output:

Duplicates found

Download   Run Code
Output:

Duplicates found

Download   Run Code
Output:

Duplicates found

Download   Run Code
Output:

Duplicates found

The time complexity of above solution is O(n) and uses O(n) extra space.

The complexity mentioned is average case. The worst case complexity is n.k using un-ordered map and set.
I think using map and set the work case complexity will be n.log(k)
So if average case complexity is important than un-ordered map and un-ordered set is the right data struct to use. But if worst case complexity is important than ordered set and map should be used.
Can we use min-heap? O(k +nlogk) and O(1) space?
How would a heap occupy O(1) space?

source