Remove Duplicates from Sorted Array
When working with sorted arrays, a common task is removing duplicates while preserving the order of the remaining elements. This matters most when memory is tight. In this article, we will explore an in-place removal algorithm using a two-pointer technique that needs only O(1) extra space and runs in O(n) time.
Understanding the Problem
The task at hand is to remove duplicates from a sorted array in such a way that each unique element appears only once. Since the array is already sorted, the duplicates are adjacent to each other, making the problem simpler.
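The goal is to modify the array in place and return the new length of the array after duplicate removal. Only the elements before that new length are meaningful afterwards; anything stored past it can be ignored. Importantly, no extra space should be used beyond a few pointers, making the solution space-efficient.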
Key Concepts for Solving the Problem
1. Sorted Array
A sorted array means that all elements are arranged in either ascending or descending order. In our case, we will assume ascending order. The sorted array property ensures that identical elements appear consecutively, making it easier to identify and remove duplicates.
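To make the "identical elements appear consecutively" property concrete, here is a small sketch that groups a sorted list into runs of equal values. The helper name `runs` is made up for this illustration; `itertools.groupby` is from the Python standard library and only groups adjacent equal items, which is exactly why the input must be sorted.

```python
from itertools import groupby


def runs(sorted_nums):
    """Return (value, run_length) pairs for each block of adjacent equal values."""
    return [(value, len(list(group))) for value, group in groupby(sorted_nums)]


example = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
print(runs(example))  # [(0, 2), (1, 3), (2, 2), (3, 2), (4, 1)]
```

Each distinct value shows up in exactly one run, so keeping one element per run is enough to remove every duplicate.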
2. In-Place Removal
The key to solving this problem efficiently is to perform in-place removal. This means we should avoid creating a new array and instead modify the original array directly. This approach helps to achieve space efficiency.
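As a rough contrast, the sketch below puts a copy-based version next to an in-place one. Both function names are hypothetical and chosen only for this comparison; the copy-based version is not the approach this article recommends, it is shown to make the extra allocation visible.

```python
def dedupe_copy(nums):
    """Contrast case: builds a NEW list, so it needs O(n) extra space."""
    result = []
    for value in nums:
        if not result or result[-1] != value:
            result.append(value)
    return result


def dedupe_in_place(nums):
    """Overwrites the front of `nums` itself and returns the unique count."""
    if not nums:
        return 0
    i = 0
    for j in range(1, len(nums)):
        if nums[j] != nums[i]:
            i += 1
            nums[i] = nums[j]
    return i + 1


data = [1, 1, 2, 2, 3]
copied = dedupe_copy(data)     # data is untouched; copied is a separate list
count = dedupe_in_place(data)  # data itself now starts with the unique values
print(copied, data[:count])    # [1, 2, 3] [1, 2, 3]
```

The in-place version returns only a count because the result lives in the caller's own list.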
3. Two Pointers
To efficiently remove duplicates from a sorted array, we can use the two-pointer technique:
- Pointer 1 (slow pointer) tracks the last unique element.
- Pointer 2 (fast pointer) scans through the array to find elements that are different from the last unique element.
This method modifies the array in a single pass, with no nested loops and no extra storage.
Step-by-Step Approach
Step 1: Initialize Pointers
We begin by initializing two pointers:
- Pointer i starts at index 0 (tracking the position of the last unique element).
- Pointer j starts at index 1 (scanning through the rest of the array).
Step 2: Compare Elements
As we iterate through the array with pointer j, we compare each element with the element at pointer i. If they are different, we move pointer i forward by one and copy the new unique value into that position. If they are equal, the element is a duplicate and only pointer j advances.
Step 3: Return the Result
Once we’ve processed all elements, the unique elements will be at the beginning of the array. The new length of the array is given by i + 1.
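The three steps can be followed on a small input with a traced variant of the loop. This is a minimal sketch: the function name `remove_duplicates_traced` and the shape of the trace tuples are assumptions made for illustration only.

```python
def remove_duplicates_traced(nums):
    """Two-pointer pass that also records (j, i, unique prefix) after each step."""
    trace = []
    if not nums:
        return 0, trace
    i = 0  # index of the last unique element written so far
    for j in range(1, len(nums)):
        if nums[j] != nums[i]:
            i += 1             # advance first ...
            nums[i] = nums[j]  # ... then store the new unique value there
        trace.append((j, i, nums[: i + 1]))
    return i + 1, trace


length, steps = remove_duplicates_traced([1, 1, 2, 3, 3])
for j, i, prefix in steps:
    print(f"j={j} i={i} unique prefix={prefix}")
# j=1 i=0 unique prefix=[1]
# j=2 i=1 unique prefix=[1, 2]
# j=3 i=2 unique prefix=[1, 2, 3]
# j=4 i=2 unique prefix=[1, 2, 3]
```

Pointer i only moves on the steps where a new value appears, and the returned length is 3, which matches i + 1 at the end of the scan.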
Python Code Example: Removing Duplicates
```python
def removeDuplicates(nums):
    if not nums:
        return 0

    i = 0  # Pointer for unique elements

    for j in range(1, len(nums)):
        if nums[j] != nums[i]:  # Found a unique element
            i += 1  # Move pointer i
            nums[i] = nums[j]  # Update array with unique element

    return i + 1  # Return the length of the array with unique elements


# Example usage
arr = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
new_length = removeDuplicates(arr)
print(arr[:new_length])  # Output: [0, 1, 2, 3, 4]
```
Explanation:
- Pointer i tracks the position of the last unique element.
- Pointer j scans the rest of the array.
- Each time a unique element is found, i is incremented and the element is copied to the new position i.
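One detail worth seeing directly: the list keeps its original length, and the slots past the returned length still hold leftover values. The sketch below repeats the same two-pointer logic under the hypothetical name `remove_duplicates`, then truncates the list with `del`, which is one possible way to drop the tail if the caller wants a shorter list.

```python
def remove_duplicates(nums):
    """Two-pointer in-place deduplication; returns the count of unique values."""
    if not nums:
        return 0
    i = 0
    for j in range(1, len(nums)):
        if nums[j] != nums[i]:
            i += 1
            nums[i] = nums[j]
    return i + 1


arr = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
k = remove_duplicates(arr)
after_call = list(arr)  # snapshot for display only
print(after_call)       # [0, 1, 2, 3, 4, 2, 2, 3, 3, 4]

del arr[k:]             # optional: shrink the same list object
print(arr)              # [0, 1, 2, 3, 4]
```

This is why the example usage slices with `arr[:new_length]` rather than printing the whole list.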
Time and Space Complexity
Time Complexity:
The time complexity of this algorithm is O(n), where n is the length of the array. This is because we scan through the array only once using pointer j, and each comparison and update operation takes constant time.
Space Complexity:
The space complexity is O(1), as we are not using any additional arrays or data structures beyond a few pointers. This is the key to achieving space efficiency.
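The linear bound can be checked by counting operations. The instrumented function below is a sketch written for this purpose (the name `count_operations` and its return tuple are assumptions); it runs the same loop and reports how many comparisons and writes occurred.

```python
def count_operations(nums):
    """Run the two-pointer pass and return (new_length, comparisons, writes)."""
    if not nums:
        return 0, 0, 0
    comparisons = 0
    writes = 0
    i = 0
    for j in range(1, len(nums)):
        comparisons += 1  # exactly one comparison per scanned element
        if nums[j] != nums[i]:
            i += 1
            nums[i] = nums[j]
            writes += 1   # one write per newly found unique value
    return i + 1, comparisons, writes


for n in (1, 10, 1000):
    data = [k // 2 for k in range(n)]  # sorted input where each value appears up to twice
    print(n, count_operations(data))
```

For a non-empty array of length n the loop performs n - 1 comparisons and at most n - 1 writes, and the only extra variables are the two indices and the counters.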
Common Mistakes to Avoid
- Incorrect Index Updates: Ensure that the pointer i is only moved when a unique element is found. Otherwise, the array will be improperly modified.
- Overlooking Edge Cases: Always check for edge cases such as empty arrays or arrays with no duplicates.
- Unnecessary Extra Space: Avoid creating new arrays; instead, manipulate the original array to achieve in-place removal.
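A few quick checks cover the edge cases mentioned above. This is a minimal sketch with the two-pointer logic repeated under the hypothetical name `remove_duplicates` so the block runs on its own.

```python
def remove_duplicates(nums):
    """Two-pointer in-place deduplication; returns the count of unique values."""
    if not nums:
        return 0
    i = 0
    for j in range(1, len(nums)):
        if nums[j] != nums[i]:
            i += 1
            nums[i] = nums[j]
    return i + 1


cases = [
    ([], []),                # empty array
    ([5], [5]),              # single element
    ([1, 2, 3], [1, 2, 3]),  # no duplicates
    ([7, 7, 7, 7], [7]),     # all elements equal
    ([-2, -2, 0, 0, 9], [-2, 0, 9]),
]

for original, expected in cases:
    nums = list(original)  # copy so the test table itself is not mutated
    k = remove_duplicates(nums)
    assert nums[:k] == expected, (original, nums[:k])
print("all edge cases passed")
```

Without the empty-array guard, `nums[i]` would never be reached for an empty list in this particular loop, but `i + 1` would wrongly report a length of 1, so the early return is what keeps that case correct.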
Optimization Tips
- If the array contains only one element or has no duplicates to begin with, the algorithm still finishes in a single pass and leaves the values unchanged.
- Keep the logic to one loop with one comparison per element; nested loops and extra conditions are not needed.
Real-World Applications
The ability to remove duplicates from a sorted array efficiently is useful in various scenarios:
- Data deduplication: Ensuring that a dataset only contains unique records.
- Memory optimization: In applications where minimizing memory usage is essential, such as embedded systems or large-scale databases.
- Algorithm challenges: Solving programming challenges that involve array manipulation or sorting.
Conclusion
Removing duplicates from a sorted array is a classic problem in computer science and algorithms. The two-pointer approach solves it in O(n) time with O(1) extra space. This technique is widely applicable, especially in scenarios where memory usage is a concern. By mastering this approach, you’ll be well-equipped to handle a variety of array manipulation tasks in your coding journey.