Find duplicates within given range k in an array

Techie Delight
Coding made easy
Given an array and a positive number k, check whether the array contains any duplicate elements within range k. If k is more than size of the array, the solution should check for duplicates in the complete array.

Input:
A[] = { 5, 6, 8, 2, 4, 6, 9 }
k = 4

Output: Duplicates found

(element 6 is repeated at distance 4 which is <= k)

Input:
A[] = { 5, 6, 8, 2, 4, 6, 9 }
k = 2

Output: No Duplicates found

(element 6 is repeated at distance 4 which is > k)

Input:
A[] = { 1, 2, 3, 2, 1 }
k = 7

Output: Duplicates found

(element 1 and 2 is repeated at distance 4 and 2 respectively which are both <= k)

Naive solution is to consider every sub-array of size k and check for duplicates in it. The time complexity of this solution is O(n.k²) since there can be n subarrays of size k and each subarray might take O(k²) time for checking duplicates.

The problem can be efficiently solved using hashing in O(n) time and O(n) extra space. The idea is to traverse the array and store each element along with its index in a map. i.e. (element, index) pairs as (key, value) pairs in a map. If any element is already found present in the map, we simply check if that element is repeated within range of k using its previous occurrence information from the map.
The algorithm can be implemented as follows in C++, Java and Python:

`C++`

#include <iostream>

#include <vector>

#include <unordered_map>

using namespace std;

bool hasDuplicates(vector<int> &vec, int k)

{

// stores (element, index) pairs as (key, value) pairs

unordered_map<int, int> map;

// traverse the vector

for (int i = 0; i < vec.size(); i++)

{

// if the current element already exists in the map

if (map.find(vec[i]) != map.end()) {

// return true if current element is repeated within range of k

if (i – map[vec[i]] <= k) {

return true;

}

// store elements along with their indices

map[vec[i]] = i;

}

// we reach here when no element is repeated within range k

return false;

}

int main()

{

vector<int> vec = { 5, 6, 8, 2, 4, 6, 9 };

int k = 4;

if (hasDuplicates(vec, k)) {

cout << “Duplicates found”;

}

else {

cout << “No Duplicates found”;

}

return 0;

}

Download Run Code

Output:

Duplicates found

`Java`

import java.util.HashMap;

import java.util.Map;

class Main

{

public static boolean hasDuplicates(int[] arr, int k)

{

// stores (element, index) pairs as (key, value) pairs

Map<Integer, Integer> map = new HashMap<>();

// traverse the array

for (int i = 0; i < arr.length; i++)

{

// if the current element already exists in the map

if (map.containsKey(arr[i])) {

// return true if current element is repeated within range of k

if (i – map.get(arr[i]) <= k) {

return true;

}

// store elements along with their indices

map.put(arr[i], i);

}

// we reach here when no element is repeated within range k

return false;

}

public static void main(String[] args) {

int[] arr = { 5, 6, 8, 2, 4, 6, 9 };

int k = 4;

if (hasDuplicates(arr, k)) {

System.out.println(“Duplicates found”);

}

else {

System.out.println(“No Duplicates found”);

}

Download Run Code

Output:

Duplicates found

`Python`

def hasDuplicates(A, k):

# stores (element, index) pairs as (key, value) pairs

dict = {}

# traverse the list

for i, e in enumerate(A):

# if the current element already exists in the dict

if e in dict:

# return true if current element is repeated within range of k

if i – dict.get(e) <= k:

return True

# store elements along with their indices

dict[e] = i

# we reach here when no element is repeated within range k

return False

if __name__ == ‘__main__’:

A = [5, 6, 8, 2, 4, 6, 9]

k = 4

if hasDuplicates(A, k):

print(“Duplicates found”)

else:

print(“No Duplicates found”)

Download Run Code

Output:

Duplicates found

Download Run Code
Output:

Duplicates found

We can also use Sliding Window approach to solve above problem. The idea is process every window of size k one at a time and store its elements in a set. Now if any element repeats in the window, then we can say that it is repeated within range of k.
Initially our window will contain the first k elements of the input. Then for each element of the remaining input, we add it to the current window. While adding the i’th element of the input to the current window, we have to remove the (i-k)’th element from the window at the same time. This is done to maintain efficiency of the solution and to keep window balance at any point.
Below is C++, Java and Python implementation based on above idea –

`C++`

#include <iostream>

#include <vector>

#include <unordered_set>

using namespace std;

// Returns true if element x is present in the given set

bool contains(unordered_set<int> const &set, int x) {

return set.find(x) != set.end();

}

bool hasDuplicates(vector<int> &vec, int k)

{

// create an empty set to store elements within range k

unordered_set<int> window;

// traverse the vector

for (int i = 0; i < vec.size(); i++)

{

// if the current element already exists in the window,

// then it is repeated within range of k

if (contains(window, vec[i])) {

return true;

}

// add current element to the window

window.insert(vec[i]);

// remove the element at k’th range from the current element

if (i >= k) {

window.erase(vec[i – k]);

}

// we reach here when no element is repeated within range k

return false;

}

int main()

{

vector<int> vec = { 5, 6, 8, 2, 4, 6, 9 };

int k = 4;

if (hasDuplicates(vec, k)) {

cout << “Duplicates found”;

}

else {

cout << “No Duplicates found”;

}

return 0;

}

Download Run Code

Output:

Duplicates found

`Java`

import java.util.HashSet;

import java.util.Set;

class Main

{

public static boolean hasDuplicates(int[] arr, int k)

{

// create an empty set to store elements within range k

Set<Integer> window = new HashSet<>();

// traverse the array

for (int i = 0; i < arr.length; i++)

{

// if the current element already exists in the window,

// then it is repeated within range of k

if (window.contains(arr[i])) {

return true;

}

// add current element to the window

window.add(arr[i]);

// remove the element at k’th range from the current element

if (i >= k) {

window.remove(arr[i – k]);

}

// we reach here when no element is repeated within range k

return false;

}

public static void main(String[] args) {

int[] arr = { 5, 6, 8, 2, 4, 6, 9 };

int k = 4;

if (hasDuplicates(arr, k)) {

System.out.println(“Duplicates found”);

}

else {

System.out.println(“No Duplicates found”);

}

Download Run Code

Output:

Duplicates found

`Python`

def hasDuplicates(A, k):

# create an empty set to store elements within range k

window = set()

# traverse the list

for i in range(len(A)):

# if the current element already exists in the window,

# then it is repeated within range of k

if A[i] in window:

return True

# add current element to the window

window.add(A[i])

# remove the element at k’th range from the current element

if i >= k:

window.remove(A[i – k])

# we reach here when no element is repeated within range k

return False

if __name__ == ‘__main__’:

A = [5, 6, 8, 2, 4, 6, 9]

k = 4

if hasDuplicates(A, k):

print(“Duplicates found”)

else:

print(“No Duplicates found”)

Download Run Code

Output:

Duplicates found

Download Run Code
Output:

Duplicates found

The time complexity of above solution is O(n) and uses O(n) extra space.

The complexity mentioned is average case. The worst case complexity is n.k using un-ordered map and set.
I think using map and set the work case complexity will be n.log(k)
So if average case complexity is important than un-ordered map and un-ordered set is the right data struct to use. But if worst case complexity is important than ordered set and map should be used.
Can we use min-heap? O(k +nlogk) and O(1) space?
How would a heap occupy O(1) space?

source

Find duplicates within given range k in an array

Published by on December 5, 2020

`C++`

`Java`

`Python`

`C++`

`Java`

`Python`

PHP

Your System Isn’t Slow – Your Bottleneck is not Optimized

PHP

PHP 8.4’s new DomHTMLDocument in Diffs

PHP

PHP 8.4’s new DomHTMLDocument in Diffs

C++

Java

Python

C++

Java

Python

Related Posts

PHP

Your System Isn’t Slow – Your Bottleneck is not Optimized

PHP

PHP 8.4’s new DomHTMLDocument in Diffs

PHP

PHP 8.4’s new DomHTMLDocument in Diffs

`C++`

`Java`

`Python`

`C++`

`Java`

`Python`