Remove Duplicates from Sorted Array

When working with sorted arrays, a common task is removing duplicates while preserving the order of the remaining elements. This matters most when memory is tight. In this article, we will explore an in-place removal algorithm based on the two-pointer technique, which needs only O(1) extra space and a single O(n) pass over the array.

Understanding the Problem

The task at hand is to remove duplicates from a sorted array in such a way that each unique element appears only once. Since the array is already sorted, the duplicates are adjacent to each other, making the problem simpler.

The goal is to modify the array in place and return the number of unique elements, which serves as the array's new logical length. No extra space should be used beyond a few index variables, which keeps the solution space-efficient.

Key Concepts for Solving the Problem

1. Sorted Array

A sorted array means that all elements are arranged in either ascending or descending order. In our case, we will assume ascending order. The sorted array property ensures that identical elements appear consecutively, making it easier to identify and remove duplicates.
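
To make the adjacency property concrete, here is a minimal sketch that counts distinct values by comparing each element only with its immediate predecessor. The helper name `count_unique_sorted` is hypothetical and not part of the solution discussed in this article; the slice `nums[1:]` makes a copy and is used here only for readability.

```python
def count_unique_sorted(nums):
    """Count distinct values in an ascending list by comparing neighbors."""
    if not nums:
        return 0
    count = 1  # the first element is always a new value
    for prev, curr in zip(nums, nums[1:]):
        if curr != prev:  # a change between neighbors marks a new value
            count += 1
    return count


sorted_nums = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
unsorted_nums = [1, 2, 1]

print(count_unique_sorted(sorted_nums))    # 5, matches len(set(sorted_nums))
print(count_unique_sorted(unsorted_nums))  # 3, although only 2 values are distinct
```

The second call shows why the sorted precondition matters: once equal values are no longer adjacent, a neighbor comparison miscounts them.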

2. In-Place Removal

The key to solving this problem efficiently is to perform in-place removal. This means we should avoid creating a new array and instead modify the original array directly. This approach helps to achieve space efficiency.

3. Two Pointers

To efficiently remove duplicates from a sorted array, we can use the two-pointer technique:

  • Pointer 1 (slow pointer) tracks the last unique element.
  • Pointer 2 (fast pointer) scans through the array to find elements that are different from the last unique element.

This lets us compact the array in a single pass, using no storage beyond the two indices.

Step-by-Step Approach

Step 1: Initialize Pointers

We begin by initializing two pointers:

  • Pointer i starts at index 0 (tracking the position of the last unique element).
  • Pointer j starts at index 1 (scanning through the rest of the array).

Step 2: Compare Elements

As we iterate through the array with pointer j, we compare the element at j with the element at pointer i. If they are equal, the element at j is a duplicate and we simply move on. If they are different, we first move pointer i forward by one and then copy the element at j into position i.

Step 3: Return the Result

Once we’ve processed all elements, the unique elements occupy the first i + 1 positions of the array, in their original order. The function returns i + 1 as the new length; anything stored beyond that point is leftover data and should be ignored.

Python Code Example: Removing Duplicates

def removeDuplicates(nums):
    if not nums:
        return 0
    
    i = 0  # Pointer for unique elements
    for j in range(1, len(nums)):
        if nums[j] != nums[i]:  # Found a unique element
            i += 1  # Move pointer i
            nums[i] = nums[j]  # Update array with unique element
    
    return i + 1  # Return the length of the array with unique elements

# Example usage
arr = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
new_length = removeDuplicates(arr)
print(arr[:new_length])  # Output: [0, 1, 2, 3, 4]

Explanation:

  • Pointer i tracks the position of the last unique element.
  • Pointer j scans the rest of the array.
  • Each time j finds a value different from nums[i], i is incremented first and that value is then copied into the new position i.
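
To see how the two pointers move, the following sketch records the state after every step of the loop. The function `trace_remove_duplicates` is a hypothetical tracing variant written for illustration; the snapshots it stores are copies made only for display and are not needed by the algorithm itself.

```python
def trace_remove_duplicates(nums):
    """Run the two-pointer pass and record (j, i, snapshot) after each step."""
    steps = []
    if not nums:
        return 0, steps
    i = 0
    for j in range(1, len(nums)):
        if nums[j] != nums[i]:
            i += 1             # advance first ...
            nums[i] = nums[j]  # ... then copy the new value into place
        steps.append((j, i, list(nums)))  # copy taken only for display
    return i + 1, steps


length, steps = trace_remove_duplicates([1, 1, 2, 3, 3])
for j, i, snapshot in steps:
    print(f"j={j} i={i} nums={snapshot}")
print("length:", length)
# j=1 i=0 nums=[1, 1, 2, 3, 3]
# j=2 i=1 nums=[1, 2, 2, 3, 3]
# j=3 i=2 nums=[1, 2, 3, 3, 3]
# j=4 i=2 nums=[1, 2, 3, 3, 3]
# length: 3
```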

Time and Space Complexity

Time Complexity:

The time complexity of this algorithm is O(n), where n is the length of the array. This is because we are scanning through the array only once using pointer j, and each comparison and update operation takes constant time.
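
One way to check the linear bound empirically is to count comparisons. The sketch below uses a hypothetical instrumented variant, `remove_duplicates_counted`, and test data in which every value appears twice; both are assumptions made for illustration.

```python
def remove_duplicates_counted(nums):
    """Two-pointer removal that also reports how many comparisons were made."""
    if not nums:
        return 0, 0
    i = 0
    comparisons = 0
    for j in range(1, len(nums)):
        comparisons += 1  # exactly one comparison per loop iteration
        if nums[j] != nums[i]:
            i += 1
            nums[i] = nums[j]
    return i + 1, comparisons


for n in (10, 100, 1000):
    data = [k // 2 for k in range(n)]  # each value appears twice
    length, comparisons = remove_duplicates_counted(data)
    print(n, length, comparisons)
# 10 5 9
# 100 50 99
# 1000 500 999
```

The comparison count is always n - 1 for a non-empty array, regardless of how many duplicates it contains.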

Space Complexity:

The space complexity is O(1), as we are not using any additional arrays or data structures beyond a few pointers. This is the key to achieving space efficiency.
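
A short sketch of what "in place" means in practice: the list keeps its original length, the unique values sit at the front, and the tail holds leftover values. Truncating with `del arr[k:]` is an optional extra step shown here as an assumption about what a caller might want; it is not part of the algorithm.

```python
def removeDuplicates(nums):
    """Two-pointer removal as described in this article."""
    if not nums:
        return 0
    i = 0
    for j in range(1, len(nums)):
        if nums[j] != nums[i]:
            i += 1
            nums[i] = nums[j]
    return i + 1


arr = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
k = removeDuplicates(arr)

print(k)         # 5
print(len(arr))  # 10, the list itself was not shortened
print(arr)       # [0, 1, 2, 3, 4, 2, 2, 3, 3, 4], tail is leftover data

del arr[k:]      # optional: drop the tail without building a new list
print(arr)       # [0, 1, 2, 3, 4]
```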

Common Mistakes to Avoid

  • Incorrect Index Updates: Ensure that pointer i is only moved when a new value is found. Advancing i on every iteration would keep duplicates in the result.
  • Overlooking Edge Cases: Always check edge cases such as an empty array, a single-element array, an array whose elements are all identical, and an array with no duplicates.
  • Unnecessary Extra Space: Avoid creating new arrays; instead, manipulate the original array to achieve in-place removal.
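
The edge cases above can be checked with a small sketch like the following. It compares the in-place result against `sorted(set(...))`, which is used here only as a reference answer for ascending input; the list of cases is an assumption chosen for illustration.

```python
def removeDuplicates(nums):
    """Two-pointer removal as described in this article."""
    if not nums:
        return 0
    i = 0
    for j in range(1, len(nums)):
        if nums[j] != nums[i]:
            i += 1
            nums[i] = nums[j]
    return i + 1


cases = [
    [],               # empty array
    [7],              # single element
    [2, 2, 2, 2],     # all elements identical
    [1, 2, 3, 4],     # no duplicates
    [1, 1, 2, 3, 3],  # mixed
]

for case in cases:
    expected = sorted(set(case))  # reference answer, uses extra space
    work = list(case)             # keep the original case intact for printing
    k = removeDuplicates(work)
    assert work[:k] == expected
    print(case, "->", work[:k])
```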

Optimization Tips

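  • If the array contains only one element or has no duplicates, the algorithm still makes a single pass of n - 1 comparisons. In the no-duplicates case, each write copies an element onto itself, which is harmless.
  • The algorithm already runs in O(n) time with O(1) extra space, so the remaining gains are small: keep the loop to one comparison per element and avoid nested loops or redundant conditions.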
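
One optional refinement, not part of the solution discussed in this article, is to skip the write when i and j point to the same position, since that write would copy an element onto itself. The sketch below uses a hypothetical variant, `remove_duplicates_skip_self_writes`, that also counts writes. Whether the extra check pays off depends on the language and on how expensive an element copy is, so treat it as an illustration and not as a recommendation.

```python
def remove_duplicates_skip_self_writes(nums):
    """Two-pointer removal that avoids copying an element onto itself."""
    if not nums:
        return 0, 0
    i = 0
    writes = 0
    for j in range(1, len(nums)):
        if nums[j] != nums[i]:
            i += 1
            if i != j:  # positions differ only after a duplicate was skipped
                nums[i] = nums[j]
                writes += 1
    return i + 1, writes


print(remove_duplicates_skip_self_writes([1, 2, 3, 4]))     # (4, 0)
print(remove_duplicates_skip_self_writes([1, 1, 2, 3, 3]))  # (3, 2)
```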

Real-World Applications

The ability to remove duplicates from a sorted array efficiently is useful in various scenarios:

  • Data deduplication: Ensuring that a dataset only contains unique records.
  • Memory optimization: Useful in applications where minimizing memory usage is essential, such as embedded systems or large-scale databases.
  • Algorithm challenges: Solving programming challenges that involve array manipulation or sorting.

Conclusion

Removing duplicates from a sorted array is a classic problem in computer science and algorithms. The two-pointer approach solves it in O(n) time with O(1) extra space. This technique is widely applicable, especially in scenarios where memory usage is a concern. By mastering this approach, you’ll be well-equipped to handle a variety of array manipulation tasks in your coding journey.

Frequently Asked Questions