LeetCopilot Logo
LeetCopilot
LeetCode Pattern/Hash Map/Hash Map Grouping Pattern: Categorize and Aggregate Data in O(n)

Hash Map Grouping Pattern: Categorize and Aggregate Data in O(n)

LeetCopilot Team
Dec 22, 2025
13 min read
Hash MapGroupingCategorizationInterview Prep
Master the grouping pattern for 'group by' problems. Learn how to choose the perfect grouping key, handle multi-value storage, and solve categorization problems like Group Anagrams efficiently.

You see a problem that says "group anagrams together" or "group strings by pattern." You know you need to categorize things... but what do you use as the key?

Should you sort the string? Use a character count? A hash? How do you store multiple items per group?

This is the gap between knowing you need to group and mastering the grouping pattern.

The grouping pattern is one of the most elegant hash map techniques. It appears in 30+ LeetCode problems, from "Group Anagrams" to "Group Shifted Strings" to custom categorization problems. But the pattern isn't just "make a dict"—it's a systematic approach to choosing grouping keys and aggregating data.

This guide will teach you:

  • How to choose the perfect grouping key
  • When to use sorted strings, tuples, or custom hashes
  • The defaultdict pattern for clean grouping
  • Multi-key grouping (composite keys)
  • Common mistakes and edge cases
  • How to optimize grouping operations

Master this, and "group by" problems become pattern recognition.

What Is the Grouping Pattern?

The core concept

The grouping pattern uses a hash map to categorize elements by a computed property, collecting all elements that share that property together.

Structure:

  • Key: The grouping property (what makes items belong together)
  • Value: List of items in that group

Example:

python
words = ["eat", "tea", "tan", "ate", "nat", "bat"]

# Group by sorted characters
groups = {
    "aet": ["eat", "tea", "ate"],
    "ant": ["tan", "nat"],
    "abt": ["bat"]
}

Why it works

Without hash map (brute force):

python
# Compare every pair - O(n²)
for i in range(len(words)):
    for j in range(i+1, len(words)):
        if are_anagrams(words[i], words[j]):
            # group them

With hash map:

python
# Compute key once per word - O(n)
groups = defaultdict(list)
for word in words:
    key = compute_key(word)  # e.g., sorted(word)
    groups[key].append(word)

Trade-off: Spend O(n) space to save O(n²) → O(n) time.

Real-world analogy

Imagine organizing books in a library:

  • Without grouping: Search through all books to find similar ones (slow)
  • With grouping: Use genre/author as key, instantly find all books in that category (fast)

The key is choosing the right categorization scheme.

Choosing the Right Grouping Key

This is the most critical decision in the grouping pattern.

The three requirements

A good grouping key must be:

  1. Deterministic: Same input → same key
  2. Unique per group: Different groups → different keys
  3. Hashable: Can be used as dict key

Common key types

1. Sorted string (for anagrams)

python
# Group anagrams
key = ''.join(sorted(word))

"eat""aet"
"tea""aet"  # Same key!
"tan""ant"

Pros: Simple, works for any characters
Cons: O(k log k) per word where k = word length

2. Character count tuple (for anagrams, optimized)

python
# Group anagrams (faster for long words)
from collections import Counter
key = tuple(sorted(Counter(word).items()))

# Or for lowercase English only:
count = [0] * 26
for char in word:
    count[ord(char) - ord('a')] += 1
key = tuple(count)

Pros: O(k) per word, faster for long words
Cons: More complex, only works for known alphabet

3. Pattern string (for isomorphic strings)

python
# Group isomorphic strings
def get_pattern(s):
    mapping = {}
    pattern = []
    next_id = 0
    
    for char in s:
        if char not in mapping:
            mapping[char] = next_id
            next_id += 1
        pattern.append(str(mapping[char]))
    
    return ','.join(pattern)

"egg""0,1,1"
"add""0,1,1"  # Same pattern!
"foo""0,1,1"  # Same pattern!
"bar""0,1,2"  # Different pattern

Pros: Captures structure
Cons: More complex to compute

4. Tuple of properties (multi-key)

python
# Group by multiple properties
key = (item.category, item.size, item.color)

# Example: Group points by (row, column sum)
key = (point.row, sum(point.columns))

Pros: Flexible, handles multiple criteria
Cons: Need to ensure all parts are hashable

5. Custom hash (rare)

python
# Prime number product (for anagrams, risky)
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]

def get_hash(word):
    result = 1
    for char in word:
        result *= primes[ord(char) - ord('a')]
    return result

Pros: O(k) time, compact
Cons: Risk of overflow, only for small alphabets

Decision framework

code
What are you grouping?
├─ Anagrams → Sorted string or character count
├─ Isomorphic/pattern → Pattern string
├─ By multiple properties → Tuple
├─ Custom similarity → Custom hash function
└─ Unknown → Start with sorted/tuple

For more on key selection, see Hash Map Key Selection Guide.

The Universal Template

python
from collections import defaultdict

def groupItems(items):
    """
    Group items by computed key.
    
    Time: O(n * k) where k = key computation time
    Space: O(n)
    """
    groups = defaultdict(list)
    
    for item in items:
        key = compute_key(item)
        groups[key].append(item)
    
    return list(groups.values())

Why defaultdict?

  • No need to check if key exists
  • Cleaner code
  • Automatically creates empty list for new keys

Python: Manual dict

python
def groupItems(items):
    groups = {}
    
    for item in items:
        key = compute_key(item)
        
        if key not in groups:
            groups[key] = []
        groups[key].append(item)
    
    return list(groups.values())

When to use: When you need more control or want to avoid defaultdict import.

Java Template

java
import java.util.*;

public List<List<String>> groupItems(String[] items) {
    Map<String, List<String>> groups = new HashMap<>();
    
    for (String item : items) {
        String key = computeKey(item);
        
        groups.putIfAbsent(key, new ArrayList<>());
        groups.get(key).add(item);
    
        // Or use computeIfAbsent (Java 8+)
        // groups.computeIfAbsent(key, k -> new ArrayList<>()).add(item);
    }
    
    return new ArrayList<>(groups.values());
}

JavaScript Template

javascript
function groupItems(items) {
  const groups = new Map();
  
  for (const item of items) {
    const key = computeKey(item);
    
    if (!groups.has(key)) {
      groups.set(key, []);
    }
    groups.get(key).push(item);
  }
  
  return Array.from(groups.values());
}

Pattern 1: Group Anagrams

Problem: Group Anagrams (LeetCode 49)

Problem: Given an array of strings, group the anagrams together.

Input: ["eat", "tea", "tan", "ate", "nat", "bat"]
Output: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]

Solution 1: Sorted String Key

python
from collections import defaultdict

def groupAnagrams(strs: List[str]) -> List[List[str]]:
    groups = defaultdict(list)
    
    for word in strs:
        # Sorted string is the key
        key = ''.join(sorted(word))
        groups[key].append(word)
    
    return list(groups.values())

Complexity: O(n * k log k) where n = number of words, k = max word length

Trace:

text
word="eat"  → key="aet" → groups={"aet": ["eat"]}
word="tea"  → key="aet" → groups={"aet": ["eat", "tea"]}
word="tan"  → key="ant" → groups={"aet": ["eat", "tea"], "ant": ["tan"]}
word="ate"  → key="aet" → groups={"aet": ["eat", "tea", "ate"], "ant": ["tan"]}
word="nat"  → key="ant" → groups={"aet": ["eat", "tea", "ate"], "ant": ["tan", "nat"]}
word="bat"  → key="abt" → groups={"aet": ["eat", "tea", "ate"], "ant": ["tan", "nat"], "abt": ["bat"]}

Solution 2: Character Count Key (Optimized)

python
from collections import defaultdict

def groupAnagrams(strs: List[str]) -> List[List[str]]:
    groups = defaultdict(list)
    
    for word in strs:
        # Count characters (only for lowercase English)
        count = [0] * 26
        for char in word:
            count[ord(char) - ord('a')] += 1
        
        # Tuple of counts is the key
        key = tuple(count)
        groups[key].append(word)
    
    return list(groups.values())

Complexity: O(n * k) where n = number of words, k = max word length

When to use: When words are long and alphabet is small (lowercase English).

Solution 3: Counter Key (General)

python
from collections import defaultdict, Counter

def groupAnagrams(strs: List[str]) -> List[List[str]]:
    groups = defaultdict(list)
    
    for word in strs:
        # Counter as key (need to convert to hashable)
        key = tuple(sorted(Counter(word).items()))
        groups[key].append(word)
    
    return list(groups.values())

Complexity: O(n * k) for counting, but sorting items adds overhead

For deep dive on all approaches, see Group Anagrams Deep Dive.

Pattern 2: Group by Pattern

Problem: Isomorphic Strings (LeetCode 205)

Problem: Determine if two strings are isomorphic (same pattern).

Examples:

  • "egg" and "add" → true (both have pattern 0,1,1)
  • "foo" and "bar" → false (0,1,1 vs 0,1,2)
python
def isIsomorphic(s: str, t: str) -> bool:
    def get_pattern(string):
        mapping = {}
        pattern = []
        next_id = 0
        
        for char in string:
            if char not in mapping:
                mapping[char] = next_id
                next_id += 1
            pattern.append(mapping[char])
        
        return tuple(pattern)
    
    return get_pattern(s) == get_pattern(t)

Key insight: Convert both strings to pattern, compare patterns.

Problem: Word Pattern (LeetCode 290)

Problem: Check if a pattern matches a sequence of words.

Example: pattern = "abba", s = "dog cat cat dog" → true

python
def wordPattern(pattern: str, s: str) -> bool:
    words = s.split()
    
    if len(pattern) != len(words):
        return False
    
    # Need bidirectional mapping
    char_to_word = {}
    word_to_char = {}
    
    for char, word in zip(pattern, words):
        if char in char_to_word:
            if char_to_word[char] != word:
                return False
        else:
            char_to_word[char] = word
        
        if word in word_to_char:
            if word_to_char[word] != char:
                return False
        else:
            word_to_char[word] = char
    
    return True

Key insight: Need bidirectional mapping (both directions must be consistent).

Pattern 3: Group by Property

Problem: Group Shifted Strings (LeetCode 249)

Problem: Group strings that can be shifted to match each other.

Example: "abc", "bcd", "xyz" → all shift by same amount

python
from collections import defaultdict

def groupStrings(strings: List[str]) -> List[List[str]]:
    def get_shift_key(s):
        # Compute shift pattern relative to first character
        if not s:
            return ()
        
        shifts = []
        for i in range(1, len(s)):
            shift = (ord(s[i]) - ord(s[i-1])) % 26
            shifts.append(shift)
        
        return tuple(shifts)
    
    groups = defaultdict(list)
    
    for s in strings:
        key = get_shift_key(s)
        groups[key].append(s)
    
    return list(groups.values())

Key insight: Compute relative shifts between adjacent characters.

Problem: Group by Sum of Digits

Problem: Group numbers by sum of their digits.

python
from collections import defaultdict

def groupByDigitSum(nums: List[int]) -> List[List[int]]:
    def digit_sum(n):
        total = 0
        while n:
            total += n % 10
            n //= 10
        return total
    
    groups = defaultdict(list)
    
    for num in nums:
        key = digit_sum(num)
        groups[key].append(num)
    
    return list(groups.values())

Pattern: Group by computed property (sum of digits).

Multi-Key Grouping (Composite Keys)

Sometimes you need to group by multiple properties.

Example: Group by (Length, First Character)

python
from collections import defaultdict

def groupByLengthAndFirst(words: List[str]) -> List[List[str]]:
    groups = defaultdict(list)
    
    for word in words:
        # Tuple of (length, first char)
        key = (len(word), word[0] if word else '')
        groups[key].append(word)
    
    return list(groups.values())

# Example
words = ["apple", "ant", "banana", "bat", "cat"]
# Groups: {(5, 'a'): ["apple"], (3, 'a'): ["ant"], (6, 'b'): ["banana"], (3, 'b'): ["bat"], (3, 'c'): ["cat"]}

Example: Group Points by Row and Column Sum

python
from collections import defaultdict

def groupPoints(points: List[List[int]]) -> List[List[List[int]]]:
    groups = defaultdict(list)
    
    for point in points:
        row, col = point
        # Group by (row, sum of coordinates)
        key = (row, row + col)
        groups[key].append(point)
    
    return list(groups.values())

Key insight: Use tuple of properties as key.

Handling Mutable Keys

Wrong (lists are not hashable):

python
key = [item.x, item.y]  # Can't use list as key!
groups[key].append(item)  # TypeError!

Correct (use tuple):

python
key = (item.x, item.y)  # Tuple is hashable
groups[key].append(item)

Aggregation After Grouping

Sometimes you need to aggregate data after grouping.

Example: Sum Values per Group

python
from collections import defaultdict

def sumByCategory(items: List[Tuple[str, int]]) -> Dict[str, int]:
    # items = [("fruit", 10), ("veg", 5), ("fruit", 20), ("veg", 15)]
    
    sums = defaultdict(int)
    
    for category, value in items:
        sums[category] += value
    
    return dict(sums)
    # {"fruit": 30, "veg": 20}

Example: Count Items per Group

python
from collections import defaultdict

def countByGroup(items: List[str]) -> Dict[str, int]:
    counts = defaultdict(int)
    
    for item in items:
        key = compute_key(item)
        counts[key] += 1
    
    return dict(counts)

Example: Find Max per Group

python
from collections import defaultdict

def maxByGroup(items: List[Tuple[str, int]]) -> Dict[str, int]:
    maxes = defaultdict(lambda: float('-inf'))
    
    for category, value in items:
        maxes[category] = max(maxes[category], value)
    
    return dict(maxes)

Common Mistakes

Mistake 1: Using Mutable Keys

Wrong:

python
# Lists are not hashable
key = sorted(word)  # Returns list!
groups[key].append(word)  # TypeError!

Correct:

python
# Convert to string or tuple
key = ''.join(sorted(word))  # String
# Or
key = tuple(sorted(word))  # Tuple
groups[key].append(word)

Mistake 2: Wrong Key Choice

Wrong (for anagrams):

python
# Using first character as key
key = word[0]
# "eat" and "egg" would be in same group!

Correct:

python
# Use sorted characters
key = ''.join(sorted(word))

Mistake 3: Not Handling Empty Input

Wrong:

python
key = word[0]  # Crashes if word is empty!

Correct:

python
key = word[0] if word else ''
# Or check before
if not word:
    continue

Mistake 4: Forgetting to Convert to List

Wrong (returns dict_values):

python
return groups.values()  # Returns dict_values object

Correct:

python
return list(groups.values())  # Convert to list

Complexity Analysis

Time Complexity

Building groups: O(n * k) where k = key computation time

  • Iterate through n items
  • Compute key for each (depends on key type)
    • Sorted string: O(k log k)
    • Character count: O(k)
    • Pattern: O(k)

Total:

  • With sorted key: O(n * k log k)
  • With count/pattern key: O(n * k)

Space Complexity

Groups storage: O(n)

  • Store all n items across groups

Keys storage: O(g) where g = number of groups

  • Best case: O(1) (all items in one group)
  • Worst case: O(n) (each item in separate group)

Practice Problems

Beginner

  1. Group Anagrams (#49) - Classic grouping
  2. Isomorphic Strings (#205) - Pattern matching
  3. Word Pattern (#290) - Bidirectional mapping
  4. Valid Anagram (#242) - Compare groups

Intermediate

  1. Group Shifted Strings (#249) - Custom key
  2. Find Duplicate Subtrees (#652) - Tree serialization as key
  3. Encode and Decode Strings (#271) - Custom encoding
  4. Longest Word in Dictionary (#720) - Group by prefix

Advanced

  1. Alien Dictionary (#269) - Topological sort with grouping
  2. Accounts Merge (#721) - Union-find with grouping
  3. Number of Distinct Islands (#694) - Shape as key

Summary

The grouping pattern is a powerful technique for categorizing data by computed properties.

Key principles:

  • Choose the right key: Deterministic, unique per group, hashable
  • Use defaultdict: Cleaner code, automatic list creation
  • Consider complexity: Sorted keys are O(k log k), count keys are O(k)
  • Handle edge cases: Empty input, mutable keys, bidirectional mapping

The template:

python
from collections import defaultdict

groups = defaultdict(list)
for item in items:
    key = compute_key(item)
    groups[key].append(item)
return list(groups.values())

Common key types:

  • Sorted string (anagrams)
  • Character count tuple (anagrams, optimized)
  • Pattern string (isomorphic)
  • Tuple of properties (multi-key)
  • Custom hash (rare)

When to use:

  • ✅ Group anagrams
  • ✅ Group by pattern
  • ✅ Group by computed property
  • ✅ Categorize by multiple criteria

Common mistakes:

  • ❌ Using mutable keys (lists)
  • ❌ Wrong key choice
  • ❌ Not handling empty input
  • ❌ Forgetting to convert to list

Next steps:

  1. Master the complete hash map guide
  2. Learn hash map key selection
  3. Study group anagrams approaches
  4. Practice the beginner problem set

Master grouping, and categorization problems become pattern recognition.

Want to Practice LeetCode Smarter?

LeetCopilot is a free browser extension that enhances your LeetCode practice with AI-powered hints, personalized study notes, and realistic mock interviews — all designed to accelerate your coding interview preparation.

Also compatible with Edge, Brave, and Opera

Related Tutorials