Remove Duplicates from Sorted Array

Given an integer array nums sorted in non-decreasing order, remove the duplicates in-place such that each unique element appears only once. The relative order of the elements should be kept the same.

Since it is impossible to change the length of the array in some languages, you must instead have the result be placed in the first part of the array nums. More formally, if there are k elements after removing the duplicates, then the first k elements of nums should hold the final result. It does not matter what you leave beyond the first k elements.

Return k after placing the final result in the first k slots of nums.

Do not allocate extra space for another array. You must do this by modifying the input array in-place with O(1) extra memory.

Problem Statement

In short: given an integer array nums sorted in non-decreasing order, remove the duplicates in-place so that each unique element appears only once and the relative order of the elements is preserved. The unique elements must occupy the first k slots of nums, where k is the number of unique elements. The algorithm must not allocate another array, must use O(1) extra memory, and must return k.

Examples

Example 1:

Input: nums = [1,1,2]
Output: 2, nums = [1,2,_]
Explanation: Your function should return k = 2, with the first two elements of nums being 1 and 2 respectively.
It does not matter what you leave beyond the returned k (hence they are underscores).

Example 2:

Input: nums = [0,0,1,1,1,2,2,3,3,4]
Output: 5, nums = [0,1,2,3,4,_,_,_,_,_]
Explanation: Your function should return k = 5, with the first five elements of nums being 0, 1, 2, 3, and 4 respectively.
It does not matter what you leave beyond the returned k (hence they are underscores).

Clarifying Questions

  1. Can I assume that the input array nums will always be sorted in non-decreasing order?
  2. Is it okay to modify the order of the duplicate elements in the final result as long as each unique element appears only once?
  3. Can I assume that the array nums is not null or empty?
  4. Can I assume that the elements in the array nums are integers?
  5. Can I use additional data structures such as hash tables or sets to solve this problem?

Solutions

There are several different approaches to solving this problem:

  1. Two Pointers

  2. Hash Set

  3. Using the built-in set data structure

  4. Binary Search

  5. Two Pointers with Swap

Approach 1: Two Pointers

Use two pointers to iterate through the array. One pointer scans the array while the other marks the position of the last unique element written so far. Duplicate elements are skipped, and each newly seen value is copied to the slot just after the last unique element. This approach has a time complexity of O(n) and a space complexity of O(1).

Algorithm:

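  • If the array is empty, return 0.
  • Initialize two pointers: i = 0 (the index of the last unique element written) and j = 1 (the scanning pointer).
  • While j is less than the length of the array:
    • If the value at index j is different from the value at index i, increment i and copy the value at index j to index i.
    • Increment j.
  • Return i + 1. Since i is the index of the last unique element, the number of unique elements is i + 1.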
  • Time complexity: O(n), where n is the length of the input array. This is because we traverse the input array exactly once.
  • Space complexity: O(1), because we modify the input array in-place without using any extra memory.
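
The two-pointer steps can be sketched in Python as follows. This is a minimal illustration, not a reference implementation; the function name remove_duplicates is an assumption chosen for this example.

```python
from typing import List


def remove_duplicates(nums: List[int]) -> int:
    """Compact the unique values of a sorted list into its prefix; return their count."""
    if not nums:
        return 0  # an empty list has no unique values

    i = 0  # index of the last unique value written so far
    for j in range(1, len(nums)):  # j scans the rest of the list
        if nums[j] != nums[i]:
            i += 1
            nums[i] = nums[j]  # copy the newly seen value forward
    return i + 1  # a count, not an index


nums = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
k = remove_duplicates(nums)
print(k, nums[:k])  # 5 [0, 1, 2, 3, 4]
```

Only the first k entries of nums are meaningful after the call; the rest still hold leftover values.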

Approach 2: Hash Set

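Use a hash set to keep track of the values seen so far while iterating through the array. Values already in the set are skipped; each new value is added to the set and copied to the front of the array. This approach has a time complexity of O(n) and a worst-case space complexity of O(n), so it does not meet the O(1) extra-memory requirement and is shown here for comparison.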

Algorithm:

  • Initialize a hash set to keep track of unique elements.
  • Initialize a variable k to 0.
  • Iterate through the array with index i:
    • If the value at index i is not in the hash set, add it to the hash set, copy it to index k, and increment k.
  • Return k.

Note: In a C++ implementation, std::unordered_set is preferable to std::set here because it gives constant-time insertions and lookups on average. Its insert function returns a pair whose .second member is a boolean indicating whether the element was newly inserted.

  • Time complexity: O(n), where n is the length of the input array. This is because we traverse the input array exactly once, and hash set operations (such as adding elements to the set and checking if an element is in the set) take O(1) time on average.
  • Space complexity: O(k), where k is the number of unique elements in the input array. This is because we store the unique elements in a hash set, which takes up memory proportional to the number of unique elements.
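
A minimal Python sketch of the hash-set variant is shown below, using the built-in set type. The function name remove_duplicates_hash_set is an assumption for illustration only. Note that k is incremented only when a new value is found.

```python
from typing import List


def remove_duplicates_hash_set(nums: List[int]) -> int:
    """Keep the first occurrence of each value in the prefix of nums; return the count."""
    seen = set()  # values already copied to the front (O(k) extra memory)
    k = 0  # next write position
    for i in range(len(nums)):
        value = nums[i]
        if value not in seen:
            seen.add(value)
            nums[k] = value  # k <= i, so unread entries are never overwritten
            k += 1
    return k


nums = [1, 1, 2]
k = remove_duplicates_hash_set(nums)
print(k, nums[:k])  # 2 [1, 2]
```

Because the set remembers every value seen, this version does not rely on the input being sorted, at the cost of the extra memory.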

Approach 3: Using the Built-in Set Data Structure

Some programming languages provide a built-in set type that removes duplicates when it is constructed from a collection.

Algorithm:

  • Convert the input array to a set to remove duplicates.
  • If the set is unordered (a hash set), sort its elements to restore non-decreasing order. An ordered (tree-based) set already iterates in sorted order.
  • Copy the unique elements back into the first k positions of the original array and return k, the size of the set.

This approach builds a set from the elements of the input array, which removes duplicates automatically. We then copy the unique elements back to the front of the original array/vector and return their count. Because the result is written into the input array/vector, we don't need to return a new array/vector. The set itself is extra memory, however, so like Approach 2 this does not meet the O(1) extra-memory requirement.

  • Time complexity: O(n + k log k) with a hash set followed by a sort, or O(n log k) with an ordered set, where n is the length of the input array and k is the number of unique elements. Building a hash set takes one pass with O(1) average-time insertions, and sorting the k unique elements takes O(k log k).
  • Space complexity: O(k), where k is the number of unique elements in the input array. This is because we store the unique elements in a set, which takes up memory proportional to the number of unique elements.
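
In Python, this idea might look like the sketch below. The function name remove_duplicates_builtin is an assumption for this example. Python's set is unordered, so the unique values are passed through sorted() before being copied back.

```python
from typing import List


def remove_duplicates_builtin(nums: List[int]) -> int:
    """Deduplicate via set(), then write the sorted unique values back into nums."""
    unique = sorted(set(nums))  # set() drops duplicates; sorted() restores the order
    nums[:len(unique)] = unique  # same-length slice assignment keeps len(nums) unchanged
    return len(unique)


nums = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
k = remove_duplicates_builtin(nums)
print(k, nums[:k])  # 5 [0, 1, 2, 3, 4]
```

The slice assignment overwrites the first k entries of the existing list, so the caller's list object is modified rather than replaced.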

Approach 4: Binary Search

Use binary search to jump over each run of duplicates. Because the array is sorted, all copies of a value are contiguous, so a binary search can find the first index that holds a larger value. Each unique value is then copied to the front of the array. This approach has a worst-case time complexity of O(n log n) and a space complexity of O(1).

Algorithm:

  • Initialize a write index k to 0 and a read index i to 0.
  • While i is less than the length of the array:
    • Copy the value at index i to index k and increment k.
    • Use binary search on the part of the array after index i to find the first index whose value is greater than the value at index i, and set i to that index.
  • Return k.
  • Time complexity: O(k log n), where n is the length of the input array and k is the number of unique elements, because we perform one O(log n) binary search per unique value. In the worst case every element is unique (k = n), which gives O(n log n) and is slower than Approach 1.
  • Space complexity: O(1), because we modify the input array in-place without using any extra memory.
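
One way to sketch the binary-search variant in Python is with bisect_right from the standard bisect module, which returns the first index in a sorted range whose value is greater than the target. The function name remove_duplicates_bisect is an assumption for illustration.

```python
from bisect import bisect_right
from typing import List


def remove_duplicates_bisect(nums: List[int]) -> int:
    """Skip each run of equal values with a binary search; return the unique count."""
    n = len(nums)
    k = 0  # next write position
    i = 0  # start of the current run of equal values
    while i < n:
        value = nums[i]
        nums[k] = value  # k <= i, so the still-unread range [i, n) stays sorted
        k += 1
        # First index in [i, n) holding a value greater than `value`.
        i = bisect_right(nums, value, i, n)
    return k


nums = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
k = remove_duplicates_bisect(nums)
print(k, nums[:k])  # 5 [0, 1, 2, 3, 4]
```

This does one search per unique value, so it helps most when the array contains long runs of duplicates.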