Number of Islands
The Number of Islands problem is a classic algorithmic challenge that involves identifying distinct land masses within a grid. This problem is pivotal for understanding concepts like graph traversal, connected components, and various search algorithms.
What Is the "Number of Islands" Problem?
Given a grid matrix consisting of '1'
s (representing land) and '0'
s (representing water), the task is to count the number of distinct islands. An island is formed by connecting adjacent lands horizontally or vertically.
Real-World Applications
This problem has practical applications in areas such as:
- Geographic Information Systems (GIS): Identifying land masses in satellite imagery.
- Computer Vision: Detecting connected components in binary images.
- Network Analysis: Finding connected clusters in network graphs.
Approaches to Solve the Problem
There are several algorithms to tackle this problem, each with its own advantages and trade-offs.
1. Depth-First Search (DFS)
DFS is a graph traversal technique that explores as far as possible along each branch before backtracking.
Implementation Steps:
- Iterate through each cell in the grid.
- When a land cell (
'1'
) is found, initiate a DFS to mark all connected land cells. - Increment the island count for each DFS initiation.
Code Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
def numIslands(grid): if not grid: return 0 rows, cols = len(grid), len(grid[0]) count = 0 def dfs(r, c): if r < 0 or c < 0 or r >= rows or c >= cols or grid[r][c] != '1': return grid[r][c] = '0' # Mark as visited dfs(r+1, c) dfs(r-1, c) dfs(r, c+1) dfs(r, c-1) for r in range(rows): for c in range(cols): if grid[r][c] == '1': dfs(r, c) count += 1 return count
Line-by-Line Explanation:
if not grid:
Checks if the input is empty. If so, return 0 (no islands).rows, cols = len(grid), len(grid[0])
Captures dimensions of the grid.count = 0
Stores the final number of islands.def dfs(r, c):
This is a helper function that uses recursion to explore all adjacent land cells (connected components).- Inside
dfs
:- The bounds and water checks ensure we don’t go out of the matrix or revisit cells.
grid[r][c] = '0'
marks the current land cell as visited (turning it into water).- Recursively calls
dfs
in all 4 directions: up, down, left, and right.
for r in range(rows)
andfor c in range(cols)
Loops through each cell in the matrix.if grid[r][c] == '1'
:
When unvisited land is found, we calldfs
and increase the island count.
Time and Space Complexity:
- Time Complexity: O(M × N), where M and N are the number of rows and columns in the grid.
- Space Complexity: O(M × N) in the worst case due to the recursion stack.
2. Breadth-First Search (BFS)
BFS explores all neighbors at the current depth before moving on to nodes at the next depth level.
Implementation Steps:
- Iterate through each cell in the grid.
- When a land cell (
'1'
) is found, initiate a BFS to mark all connected land cells. - Increment the island count for each BFS initiation.
Code Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
from collections import deque def numIslands(grid): if not grid: return 0 rows, cols = len(grid), len(grid[0]) count = 0 for r in range(rows): for c in range(cols): if grid[r][c] == '1': queue = deque() queue.append((r, c)) grid[r][c] = '0' # Mark as visited while queue: row, col = queue.popleft() for x, y in [(row-1,col), (row+1,col), (row,col-1), (row,col+1)]: if 0 <= x < rows and 0 <= y < cols and grid[x][y] == '1': queue.append((x, y)) grid[x][y] = '0' count += 1 return count
Line-by-Line Explanation:
- Uses a queue (deque) for level-by-level traversal of the island cells.
- When a
'1'
is found:- Add it to the queue and mark it as visited by turning it to
'0'
. - While the queue isn't empty, pop one cell at a time.
- Explore all 4 adjacent cells and enqueue any connected
'1'
land cells. - After finishing traversal of that island, increment
count
.
- Add it to the queue and mark it as visited by turning it to
Time and Space Complexity:
- Time Complexity: O(M × N)
- Space Complexity: O(min(M, N)) for the queue in the worst case.
3. Disjoint Set Union (Union-Find)
The Union-Find algorithm is used to keep track of elements partitioned into disjoint sets and is particularly efficient for dynamic connectivity problems.
Implementation Steps:
- Initialize a parent array where each land cell is its own parent.
- Iterate through the grid, and for each land cell, union it with its adjacent land cells.
- After processing, count the number of unique parents, which represents the number of islands.
Time and Space Complexity:
- Time Complexity: O(M × N × α(M × N)), where α is the inverse Ackermann function.
- Space Complexity: O(M × N) for the parent array.
Common Questions and Answers
Q1: Why is marking visited cells important?
Marking visited cells prevents revisiting the same land cell, which could lead to overcounting islands and increased time complexity.
Q2: Can diagonal connections be considered in island formation?
In the standard problem definition, only horizontal and vertical connections are considered. Including diagonals would change the definition and the implementation.
Q3: How does Union-Find improve performance over DFS/BFS?
Union-Find is more efficient in scenarios with dynamic connectivity queries, as it provides near-constant time operations for union and find, making it suitable for large-scale problems.
Conclusion
The Number of Islands problem is a fundamental exercise in understanding graph traversal, connected components, and various search algorithms. By exploring different approaches like DFS, BFS, and Union-Find, one can gain a deeper insight into algorithm optimization and efficient problem-solving techniques.