Number of Islands

The Number of Islands problem is a classic algorithmic challenge that involves identifying distinct land masses within a grid. This problem is pivotal for understanding concepts like graph traversal, connected components, and various search algorithms.

What Is the "Number of Islands" Problem?

Given a grid matrix consisting of '1's (representing land) and '0's (representing water), the task is to count the number of distinct islands. An island is formed by connecting adjacent lands horizontally or vertically.

Real-World Applications

This problem has practical applications in areas such as:

  • Geographic Information Systems (GIS): Identifying land masses in satellite imagery.
  • Computer Vision: Detecting connected components in binary images.
  • Network Analysis: Finding connected clusters in network graphs.

Approaches to Solve the Problem

There are several algorithms to tackle this problem, each with its own advantages and trade-offs.

1. Depth-First Search (DFS)

DFS is a graph traversal technique that explores as far as possible along each branch before backtracking.

Implementation Steps:

  1. Iterate through each cell in the grid.
  2. When a land cell ('1') is found, initiate a DFS to mark all connected land cells.
  3. Increment the island count for each DFS initiation.

Code Example:

python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
def numIslands(grid):
    if not grid:
        return 0
    rows, cols = len(grid), len(grid[0])
    count = 0

    def dfs(r, c):
        if r < 0 or c < 0 or r >= rows or c >= cols or grid[r][c] != '1':
            return
        grid[r][c] = '0'  # Mark as visited
        dfs(r+1, c)
        dfs(r-1, c)
        dfs(r, c+1)
        dfs(r, c-1)

    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == '1':
                dfs(r, c)
                count += 1
    return count

Line-by-Line Explanation:

  • if not grid:
    Checks if the input is empty. If so, return 0 (no islands).
  • rows, cols = len(grid), len(grid[0])
    Captures dimensions of the grid.
  • count = 0
    Stores the final number of islands.
  • def dfs(r, c):
    This is a helper function that uses recursion to explore all adjacent land cells (connected components).
  • Inside dfs:
    • The bounds and water checks ensure we don’t go out of the matrix or revisit cells.
    • grid[r][c] = '0' marks the current land cell as visited (turning it into water).
    • Recursively calls dfs in all 4 directions: up, down, left, and right.
  • for r in range(rows) and for c in range(cols)
    Loops through each cell in the matrix.
  • if grid[r][c] == '1':
    When unvisited land is found, we call dfs and increase the island count.

Time and Space Complexity:

  • Time Complexity: O(M × N), where M and N are the number of rows and columns in the grid.
  • Space Complexity: O(M × N) in the worst case due to the recursion stack.

2. Breadth-First Search (BFS)

BFS explores all neighbors at the current depth before moving on to nodes at the next depth level.

Implementation Steps:

  • Iterate through each cell in the grid.
  • When a land cell ('1') is found, initiate a BFS to mark all connected land cells.
  • Increment the island count for each BFS initiation.

Code Example:

python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
from collections import deque

def numIslands(grid):
    if not grid:
        return 0
    rows, cols = len(grid), len(grid[0])
    count = 0

    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == '1':
                queue = deque()
                queue.append((r, c))
                grid[r][c] = '0'  # Mark as visited
                while queue:
                    row, col = queue.popleft()
                    for x, y in [(row-1,col), (row+1,col), (row,col-1), (row,col+1)]:
                        if 0 <= x < rows and 0 <= y < cols and grid[x][y] == '1':
                            queue.append((x, y))
                            grid[x][y] = '0'
                count += 1
    return count

Line-by-Line Explanation:

  • Uses a queue (deque) for level-by-level traversal of the island cells.
  • When a '1' is found:
    • Add it to the queue and mark it as visited by turning it to '0'.
    • While the queue isn't empty, pop one cell at a time.
    • Explore all 4 adjacent cells and enqueue any connected '1' land cells.
    • After finishing traversal of that island, increment count.

Time and Space Complexity:

  • Time Complexity: O(M × N)
  • Space Complexity: O(min(M, N)) for the queue in the worst case.

3. Disjoint Set Union (Union-Find)

The Union-Find algorithm is used to keep track of elements partitioned into disjoint sets and is particularly efficient for dynamic connectivity problems.

Implementation Steps:

  • Initialize a parent array where each land cell is its own parent.
  • Iterate through the grid, and for each land cell, union it with its adjacent land cells.
  • After processing, count the number of unique parents, which represents the number of islands.

Time and Space Complexity:

  • Time Complexity: O(M × N × α(M × N)), where α is the inverse Ackermann function.
  • Space Complexity: O(M × N) for the parent array.

Common Questions and Answers

Q1: Why is marking visited cells important?

Marking visited cells prevents revisiting the same land cell, which could lead to overcounting islands and increased time complexity.

Q2: Can diagonal connections be considered in island formation?

In the standard problem definition, only horizontal and vertical connections are considered. Including diagonals would change the definition and the implementation.

Q3: How does Union-Find improve performance over DFS/BFS?

Union-Find is more efficient in scenarios with dynamic connectivity queries, as it provides near-constant time operations for union and find, making it suitable for large-scale problems.

Conclusion

The Number of Islands problem is a fundamental exercise in understanding graph traversal, connected components, and various search algorithms. By exploring different approaches like DFS, BFS, and Union-Find, one can gain a deeper insight into algorithm optimization and efficient problem-solving techniques.

Frequently Asked Questions