You see a problem that says "group anagrams together" or "group strings by pattern." You know you need to categorize things... but what do you use as the key?
Should you sort the string? Use a character count? A hash? How do you store multiple items per group?
This is the gap between knowing you need to group and mastering the grouping pattern.
The grouping pattern is one of the most elegant hash map techniques. It appears in 30+ LeetCode problems, from "Group Anagrams" to "Group Shifted Strings" to custom categorization problems. But the pattern isn't just "make a dict"—it's a systematic approach to choosing grouping keys and aggregating data.
This guide will teach you:
- How to choose the perfect grouping key
- When to use sorted strings, tuples, or custom hashes
- The defaultdict pattern for clean grouping
- Multi-key grouping (composite keys)
- Common mistakes and edge cases
- How to optimize grouping operations
Master this, and "group by" problems become pattern recognition.
What Is the Grouping Pattern?
The core concept
The grouping pattern uses a hash map to categorize elements by a computed property, collecting all elements that share that property together.
Structure:
- Key: The grouping property (what makes items belong together)
- Value: List of items in that group
Example:
words = ["eat", "tea", "tan", "ate", "nat", "bat"]
# Group by sorted characters
groups = {
"aet": ["eat", "tea", "ate"],
"ant": ["tan", "nat"],
"abt": ["bat"]
}Why it works
Without hash map (brute force):
# Compare every pair - O(n²)
for i in range(len(words)):
for j in range(i+1, len(words)):
if are_anagrams(words[i], words[j]):
# group themWith hash map:
# Compute key once per word - O(n)
groups = defaultdict(list)
for word in words:
key = compute_key(word) # e.g., sorted(word)
groups[key].append(word)Trade-off: Spend O(n) space to save O(n²) → O(n) time.
Real-world analogy
Imagine organizing books in a library:
- Without grouping: Search through all books to find similar ones (slow)
- With grouping: Use genre/author as key, instantly find all books in that category (fast)
The key is choosing the right categorization scheme.
Choosing the Right Grouping Key
This is the most critical decision in the grouping pattern.
The three requirements
A good grouping key must be:
- Deterministic: Same input → same key
- Unique per group: Different groups → different keys
- Hashable: Can be used as dict key
Common key types
1. Sorted string (for anagrams)
# Group anagrams
key = ''.join(sorted(word))
"eat" → "aet"
"tea" → "aet" # Same key!
"tan" → "ant"Pros: Simple, works for any characters
Cons: O(k log k) per word where k = word length
2. Character count tuple (for anagrams, optimized)
# Group anagrams (faster for long words)
from collections import Counter
key = tuple(sorted(Counter(word).items()))
# Or for lowercase English only:
count = [0] * 26
for char in word:
count[ord(char) - ord('a')] += 1
key = tuple(count)Pros: O(k) per word, faster for long words
Cons: More complex, only works for known alphabet
3. Pattern string (for isomorphic strings)
# Group isomorphic strings
def get_pattern(s):
mapping = {}
pattern = []
next_id = 0
for char in s:
if char not in mapping:
mapping[char] = next_id
next_id += 1
pattern.append(str(mapping[char]))
return ','.join(pattern)
"egg" → "0,1,1"
"add" → "0,1,1" # Same pattern!
"foo" → "0,1,1" # Same pattern!
"bar" → "0,1,2" # Different patternPros: Captures structure
Cons: More complex to compute
4. Tuple of properties (multi-key)
# Group by multiple properties
key = (item.category, item.size, item.color)
# Example: Group points by (row, column sum)
key = (point.row, sum(point.columns))Pros: Flexible, handles multiple criteria
Cons: Need to ensure all parts are hashable
5. Custom hash (rare)
# Prime number product (for anagrams, risky)
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
def get_hash(word):
result = 1
for char in word:
result *= primes[ord(char) - ord('a')]
return resultPros: O(k) time, compact
Cons: Risk of overflow, only for small alphabets
Decision framework
What are you grouping?
├─ Anagrams → Sorted string or character count
├─ Isomorphic/pattern → Pattern string
├─ By multiple properties → Tuple
├─ Custom similarity → Custom hash function
└─ Unknown → Start with sorted/tupleFor more on key selection, see Hash Map Key Selection Guide.
The Universal Template
Python: defaultdict (Recommended)
from collections import defaultdict
def groupItems(items):
"""
Group items by computed key.
Time: O(n * k) where k = key computation time
Space: O(n)
"""
groups = defaultdict(list)
for item in items:
key = compute_key(item)
groups[key].append(item)
return list(groups.values())Why defaultdict?
- No need to check if key exists
- Cleaner code
- Automatically creates empty list for new keys
Python: Manual dict
def groupItems(items):
groups = {}
for item in items:
key = compute_key(item)
if key not in groups:
groups[key] = []
groups[key].append(item)
return list(groups.values())When to use: When you need more control or want to avoid defaultdict import.
Java Template
import java.util.*;
public List<List<String>> groupItems(String[] items) {
Map<String, List<String>> groups = new HashMap<>();
for (String item : items) {
String key = computeKey(item);
groups.putIfAbsent(key, new ArrayList<>());
groups.get(key).add(item);
// Or use computeIfAbsent (Java 8+)
// groups.computeIfAbsent(key, k -> new ArrayList<>()).add(item);
}
return new ArrayList<>(groups.values());
}JavaScript Template
function groupItems(items) {
const groups = new Map();
for (const item of items) {
const key = computeKey(item);
if (!groups.has(key)) {
groups.set(key, []);
}
groups.get(key).push(item);
}
return Array.from(groups.values());
}Pattern 1: Group Anagrams
Problem: Group Anagrams (LeetCode 49)
Problem: Given an array of strings, group the anagrams together.
Input: ["eat", "tea", "tan", "ate", "nat", "bat"]
Output: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]
Solution 1: Sorted String Key
from collections import defaultdict
def groupAnagrams(strs: List[str]) -> List[List[str]]:
groups = defaultdict(list)
for word in strs:
# Sorted string is the key
key = ''.join(sorted(word))
groups[key].append(word)
return list(groups.values())Complexity: O(n * k log k) where n = number of words, k = max word length
Trace:
word="eat" → key="aet" → groups={"aet": ["eat"]}
word="tea" → key="aet" → groups={"aet": ["eat", "tea"]}
word="tan" → key="ant" → groups={"aet": ["eat", "tea"], "ant": ["tan"]}
word="ate" → key="aet" → groups={"aet": ["eat", "tea", "ate"], "ant": ["tan"]}
word="nat" → key="ant" → groups={"aet": ["eat", "tea", "ate"], "ant": ["tan", "nat"]}
word="bat" → key="abt" → groups={"aet": ["eat", "tea", "ate"], "ant": ["tan", "nat"], "abt": ["bat"]}Solution 2: Character Count Key (Optimized)
from collections import defaultdict
def groupAnagrams(strs: List[str]) -> List[List[str]]:
groups = defaultdict(list)
for word in strs:
# Count characters (only for lowercase English)
count = [0] * 26
for char in word:
count[ord(char) - ord('a')] += 1
# Tuple of counts is the key
key = tuple(count)
groups[key].append(word)
return list(groups.values())Complexity: O(n * k) where n = number of words, k = max word length
When to use: When words are long and alphabet is small (lowercase English).
Solution 3: Counter Key (General)
from collections import defaultdict, Counter
def groupAnagrams(strs: List[str]) -> List[List[str]]:
groups = defaultdict(list)
for word in strs:
# Counter as key (need to convert to hashable)
key = tuple(sorted(Counter(word).items()))
groups[key].append(word)
return list(groups.values())Complexity: O(n * k) for counting, but sorting items adds overhead
For deep dive on all approaches, see Group Anagrams Deep Dive.
Pattern 2: Group by Pattern
Problem: Isomorphic Strings (LeetCode 205)
Problem: Determine if two strings are isomorphic (same pattern).
Examples:
- "egg" and "add" → true (both have pattern 0,1,1)
- "foo" and "bar" → false (0,1,1 vs 0,1,2)
def isIsomorphic(s: str, t: str) -> bool:
def get_pattern(string):
mapping = {}
pattern = []
next_id = 0
for char in string:
if char not in mapping:
mapping[char] = next_id
next_id += 1
pattern.append(mapping[char])
return tuple(pattern)
return get_pattern(s) == get_pattern(t)Key insight: Convert both strings to pattern, compare patterns.
Problem: Word Pattern (LeetCode 290)
Problem: Check if a pattern matches a sequence of words.
Example: pattern = "abba", s = "dog cat cat dog" → true
def wordPattern(pattern: str, s: str) -> bool:
words = s.split()
if len(pattern) != len(words):
return False
# Need bidirectional mapping
char_to_word = {}
word_to_char = {}
for char, word in zip(pattern, words):
if char in char_to_word:
if char_to_word[char] != word:
return False
else:
char_to_word[char] = word
if word in word_to_char:
if word_to_char[word] != char:
return False
else:
word_to_char[word] = char
return TrueKey insight: Need bidirectional mapping (both directions must be consistent).
Pattern 3: Group by Property
Problem: Group Shifted Strings (LeetCode 249)
Problem: Group strings that can be shifted to match each other.
Example: "abc", "bcd", "xyz" → all shift by same amount
from collections import defaultdict
def groupStrings(strings: List[str]) -> List[List[str]]:
def get_shift_key(s):
# Compute shift pattern relative to first character
if not s:
return ()
shifts = []
for i in range(1, len(s)):
shift = (ord(s[i]) - ord(s[i-1])) % 26
shifts.append(shift)
return tuple(shifts)
groups = defaultdict(list)
for s in strings:
key = get_shift_key(s)
groups[key].append(s)
return list(groups.values())Key insight: Compute relative shifts between adjacent characters.
Problem: Group by Sum of Digits
Problem: Group numbers by sum of their digits.
from collections import defaultdict
def groupByDigitSum(nums: List[int]) -> List[List[int]]:
def digit_sum(n):
total = 0
while n:
total += n % 10
n //= 10
return total
groups = defaultdict(list)
for num in nums:
key = digit_sum(num)
groups[key].append(num)
return list(groups.values())Pattern: Group by computed property (sum of digits).
Multi-Key Grouping (Composite Keys)
Sometimes you need to group by multiple properties.
Example: Group by (Length, First Character)
from collections import defaultdict
def groupByLengthAndFirst(words: List[str]) -> List[List[str]]:
groups = defaultdict(list)
for word in words:
# Tuple of (length, first char)
key = (len(word), word[0] if word else '')
groups[key].append(word)
return list(groups.values())
# Example
words = ["apple", "ant", "banana", "bat", "cat"]
# Groups: {(5, 'a'): ["apple"], (3, 'a'): ["ant"], (6, 'b'): ["banana"], (3, 'b'): ["bat"], (3, 'c'): ["cat"]}Example: Group Points by Row and Column Sum
from collections import defaultdict
def groupPoints(points: List[List[int]]) -> List[List[List[int]]]:
groups = defaultdict(list)
for point in points:
row, col = point
# Group by (row, sum of coordinates)
key = (row, row + col)
groups[key].append(point)
return list(groups.values())Key insight: Use tuple of properties as key.
Handling Mutable Keys
❌ Wrong (lists are not hashable):
key = [item.x, item.y] # Can't use list as key!
groups[key].append(item) # TypeError!✅ Correct (use tuple):
key = (item.x, item.y) # Tuple is hashable
groups[key].append(item)Aggregation After Grouping
Sometimes you need to aggregate data after grouping.
Example: Sum Values per Group
from collections import defaultdict
def sumByCategory(items: List[Tuple[str, int]]) -> Dict[str, int]:
# items = [("fruit", 10), ("veg", 5), ("fruit", 20), ("veg", 15)]
sums = defaultdict(int)
for category, value in items:
sums[category] += value
return dict(sums)
# {"fruit": 30, "veg": 20}Example: Count Items per Group
from collections import defaultdict
def countByGroup(items: List[str]) -> Dict[str, int]:
counts = defaultdict(int)
for item in items:
key = compute_key(item)
counts[key] += 1
return dict(counts)Example: Find Max per Group
from collections import defaultdict
def maxByGroup(items: List[Tuple[str, int]]) -> Dict[str, int]:
maxes = defaultdict(lambda: float('-inf'))
for category, value in items:
maxes[category] = max(maxes[category], value)
return dict(maxes)Common Mistakes
Mistake 1: Using Mutable Keys
❌ Wrong:
# Lists are not hashable
key = sorted(word) # Returns list!
groups[key].append(word) # TypeError!✅ Correct:
# Convert to string or tuple
key = ''.join(sorted(word)) # String
# Or
key = tuple(sorted(word)) # Tuple
groups[key].append(word)Mistake 2: Wrong Key Choice
❌ Wrong (for anagrams):
# Using first character as key
key = word[0]
# "eat" and "egg" would be in same group!✅ Correct:
# Use sorted characters
key = ''.join(sorted(word))Mistake 3: Not Handling Empty Input
❌ Wrong:
key = word[0] # Crashes if word is empty!✅ Correct:
key = word[0] if word else ''
# Or check before
if not word:
continueMistake 4: Forgetting to Convert to List
❌ Wrong (returns dict_values):
return groups.values() # Returns dict_values object✅ Correct:
return list(groups.values()) # Convert to listComplexity Analysis
Time Complexity
Building groups: O(n * k) where k = key computation time
- Iterate through n items
- Compute key for each (depends on key type)
- Sorted string: O(k log k)
- Character count: O(k)
- Pattern: O(k)
Total:
- With sorted key: O(n * k log k)
- With count/pattern key: O(n * k)
Space Complexity
Groups storage: O(n)
- Store all n items across groups
Keys storage: O(g) where g = number of groups
- Best case: O(1) (all items in one group)
- Worst case: O(n) (each item in separate group)
Practice Problems
Beginner
- Group Anagrams (#49) - Classic grouping
- Isomorphic Strings (#205) - Pattern matching
- Word Pattern (#290) - Bidirectional mapping
- Valid Anagram (#242) - Compare groups
Intermediate
- Group Shifted Strings (#249) - Custom key
- Find Duplicate Subtrees (#652) - Tree serialization as key
- Encode and Decode Strings (#271) - Custom encoding
- Longest Word in Dictionary (#720) - Group by prefix
Advanced
- Alien Dictionary (#269) - Topological sort with grouping
- Accounts Merge (#721) - Union-find with grouping
- Number of Distinct Islands (#694) - Shape as key
Summary
The grouping pattern is a powerful technique for categorizing data by computed properties.
Key principles:
- Choose the right key: Deterministic, unique per group, hashable
- Use defaultdict: Cleaner code, automatic list creation
- Consider complexity: Sorted keys are O(k log k), count keys are O(k)
- Handle edge cases: Empty input, mutable keys, bidirectional mapping
The template:
from collections import defaultdict
groups = defaultdict(list)
for item in items:
key = compute_key(item)
groups[key].append(item)
return list(groups.values())Common key types:
- Sorted string (anagrams)
- Character count tuple (anagrams, optimized)
- Pattern string (isomorphic)
- Tuple of properties (multi-key)
- Custom hash (rare)
When to use:
- ✅ Group anagrams
- ✅ Group by pattern
- ✅ Group by computed property
- ✅ Categorize by multiple criteria
Common mistakes:
- ❌ Using mutable keys (lists)
- ❌ Wrong key choice
- ❌ Not handling empty input
- ❌ Forgetting to convert to list
Next steps:
- Master the complete hash map guide
- Learn hash map key selection
- Study group anagrams approaches
- Practice the beginner problem set
Master grouping, and categorization problems become pattern recognition.
Want to Practice LeetCode Smarter?
LeetCopilot is a free browser extension that enhances your LeetCode practice with AI-powered hints, personalized study notes, and realistic mock interviews — all designed to accelerate your coding interview preparation.
Also compatible with Edge, Brave, and Opera
