You're in an interview. You know you need to store data, but which structure?
Hash map? Array? Set? Tree? Two pointers? They all seem like they could work...
This is the gap between knowing data structures and choosing the right one.
The ability to quickly choose the optimal data structure is what separates senior engineers from juniors. It's not about knowing all structures—it's about having a systematic decision framework.
This guide will teach you:
- The decision tree for choosing data structures
- Hash map vs. array (when each is better)
- Hash map vs. set (when you need values)
- Hash map vs. tree (ordered vs. unordered)
- Hash map vs. two pointers (space vs. time)
- Hash map vs. sliding window (when to combine)
- Complexity comparison tables
- Common decision mistakes
Master this, and data structure choice becomes systematic, not guesswork.
The Decision Framework
The master decision tree
What do you need to do?
├─ Look up by arbitrary key → Hash map or Set
├─ Look up by index → Array
├─ Maintain order → Tree or Sorted Array
├─ Find pairs/ranges → Two Pointers
└─ Process windows → Sliding Window
Do you need associated values?
├─ Yes → Hash map
└─ No → Set
Do you need ordering?
├─ Yes → Tree (TreeMap/TreeSet)
└─ No → Hash map/Set
Is space critical?
├─ Yes → Two pointers, in-place, or array
└─ No → Hash map/Set
Is input sorted?
├─ Yes → Two pointers or binary search
└─ No → Hash map or sort firstLet's break down each comparison.
Hash Map vs. Array
When to use array
Use array when:
- ✅ Access by index (not arbitrary key)
- ✅ Fixed or known size
- ✅ Keys are small integers (0 to n)
- ✅ Need O(1) space (in-place)
- ✅ Need to preserve order
Example: Counting lowercase letters
# Array is better (only 26 letters)
count = [0] * 26
for char in s:
count[ord(char) - ord('a')] += 1Why array wins:
- Space: O(26) = O(1) vs. O(n) for hash map
- Faster: Direct indexing vs. hashing
- Simpler: No hash function needed
When to use hash map
Use hash map when:
- ✅ Keys are arbitrary (not just 0 to n)
- ✅ Keys are strings, objects, or large integers
- ✅ Sparse data (many possible keys, few used)
- ✅ Unknown key range
Example: Counting words
# Hash map is better (unknown words)
count = {}
for word in words:
count[word] = count.get(word, 0) + 1Why hash map wins:
- Flexibility: Works with any hashable key
- Space: Only stores used keys (sparse)
- No need to know range in advance
Comparison table
| Aspect | Array | Hash Map |
|---|---|---|
| Access | O(1) by index | O(1) by key (average) |
| Keys | Small integers | Any hashable type |
| Space | O(n) fixed | O(k) where k = distinct keys |
| Order | Preserved | Not preserved |
| Use case | Known range, indexed | Arbitrary keys, sparse |
Example: Two Sum
Array approach (only if values are small):
# Only works if 0 <= nums[i] <= 10^5
def twoSum(nums, target):
seen = [False] * 100001
for i, num in enumerate(nums):
complement = target - num
if 0 <= complement <= 100000 and seen[complement]:
return True
seen[num] = TrueHash map approach (general):
# Works for any values
def twoSum(nums, target):
seen = {}
for i, num in enumerate(nums):
if target - num in seen:
return [seen[target - num], i]
seen[num] = iVerdict: Hash map is more flexible and handles general case.
Hash Map vs. Set
When to use set
Use set when:
- ✅ Only need presence (exists or not)
- ✅ Don't need associated values
- ✅ Checking membership
- ✅ Deduplication
Example: Contains Duplicate
# Set is better (only need presence)
def containsDuplicate(nums):
return len(nums) != len(set(nums))
# Or
def containsDuplicate(nums):
seen = set()
for num in nums:
if num in seen:
return True
seen.add(num)
return FalseWhy set wins:
- Clearer intent: "I only care about presence"
- Slightly more memory efficient
- Simpler API (add vs. setting key/value)
When to use hash map
Use hash map when:
- ✅ Need associated values (count, index, etc.)
- ✅ Need to track metadata per element
- ✅ Need to update values
Example: Two Sum (need indices)
# Hash map is needed (need to store indices)
def twoSum(nums, target):
seen = {} # Can't use set - need indices!
for i, num in enumerate(nums):
if target - num in seen:
return [seen[target - num], i]
seen[num] = iWhy hash map wins:
- Need to store index (value)
- Set only stores keys
Comparison table
| Aspect | Set | Hash Map |
|---|---|---|
| Purpose | Presence only | Key-value pairs |
| Space | Slightly less | Slightly more |
| Use case | Membership, dedup | Need associated data |
| API | add, remove, in | get, set, in |
Decision rule
Do you need to store anything besides the key?
├─ Yes → Hash map
└─ No → SetFor detailed comparison, see Hash Map vs Hash Set.
Hash Map vs. Tree (TreeMap/TreeSet)
When to use tree
Use tree when:
- ✅ Need sorted order
- ✅ Need range queries (elements between x and y)
- ✅ Need predecessor/successor
- ✅ Need min/max efficiently
Example: Kth Largest Element (with updates)
from sortedcontainers import SortedList
class KthLargest:
def __init__(self, k, nums):
self.k = k
self.tree = SortedList(nums)
def add(self, val):
self.tree.add(val)
return self.tree[-self.k]Why tree wins:
- Maintains sorted order: O(log n) insert
- Kth largest: O(1) access
- Hash map can't maintain order
When to use hash map
Use hash map when:
- ✅ Don't need ordering
- ✅ Only need exact lookup (not range)
- ✅ Want O(1) average (not O(log n))
Example: Two Sum
# Hash map is better (don't need order)
def twoSum(nums, target):
seen = {} # O(1) lookup
for i, num in enumerate(nums):
if target - num in seen:
return [seen[target - num], i]
seen[num] = iWhy hash map wins:
- O(1) lookup vs. O(log n) for tree
- Don't need ordering
- Simpler and faster
Comparison table
| Aspect | Hash Map | Tree (TreeMap) |
|---|---|---|
| Lookup | O(1) average | O(log n) |
| Insert | O(1) average | O(log n) |
| Order | No | Yes (sorted) |
| Range query | No | Yes |
| Min/max | No | O(1) or O(log n) |
| Use case | Exact lookup | Ordered data, ranges |
Decision rule
Do you need sorted order or range queries?
├─ Yes → Tree
└─ No → Hash map (faster)Hash Map vs. Two Pointers
When to use two pointers
Use two pointers when:
- ✅ Array is sorted (or can be sorted)
- ✅ Need O(1) space
- ✅ Finding pairs that meet a condition
- ✅ Don't need to preserve original indices
Example: Two Sum II (sorted array)
# Two pointers is better (sorted + O(1) space)
def twoSum(numbers, target):
left, right = 0, len(numbers) - 1
while left < right:
current_sum = numbers[left] + numbers[right]
if current_sum == target:
return [left + 1, right + 1]
elif current_sum < target:
left += 1
else:
right -= 1
return []Why two pointers wins:
- O(1) space vs. O(n) for hash map
- Array is already sorted
- Don't need original indices
When to use hash map
Use hash map when:
- ✅ Array is unsorted (and can't sort)
- ✅ Need to preserve original indices
- ✅ Can afford O(n) space
- ✅ Need O(1) lookup per element
Example: Two Sum (unsorted, need indices)
# Hash map is better (unsorted + need indices)
def twoSum(nums, target):
seen = {}
for i, num in enumerate(nums):
if target - num in seen:
return [seen[target - num], i]
seen[num] = iWhy hash map wins:
- Can't sort (would lose indices)
- Need original positions
- O(1) lookup per element
Comparison table
| Aspect | Hash Map | Two Pointers |
|---|---|---|
| Time | O(n) | O(n) or O(n log n) with sort |
| Space | O(n) | O(1) |
| Sorted? | No | Yes |
| Indices? | Yes | No (lost after sort) |
| Use case | Unsorted, need indices | Sorted, O(1) space |
Decision rule
Is the array sorted?
├─ Yes → Two pointers (O(1) space)
└─ No → Do you need original indices?
├─ Yes → Hash map
└─ No → Sort + two pointersFor detailed comparison, see Hash Map vs Two Pointers.
Hash Map vs. Sliding Window
When to use sliding window
Use sliding window when:
- ✅ Finding longest/shortest contiguous subarray
- ✅ Array has no negative numbers (for sum problems)
- ✅ Window has monotonic property
- ✅ Want O(1) space (or O(k) for window state)
Example: Longest Substring Without Repeating Characters
# Sliding window is better (longest substring)
def lengthOfLongestSubstring(s):
left = 0
seen = set()
max_len = 0
for right, char in enumerate(s):
while char in seen:
seen.remove(s[left])
left += 1
seen.add(char)
max_len = max(max_len, right - left + 1)
return max_lenWhy sliding window wins:
- Efficiently finds longest window
- O(n) time with O(k) space (k = alphabet size)
- Natural for contiguous subarray problems
When to use hash map (prefix sum)
Use hash map when:
- ✅ Array has negative numbers
- ✅ Need exact sum (not longest/shortest)
- ✅ Counting subarrays
- ✅ Sliding window monotonicity breaks
Example: Subarray Sum Equals K
# Hash map is needed (negative numbers possible)
def subarraySum(nums, k):
count = 0
prefix_sum = 0
sum_count = {0: 1}
for num in nums:
prefix_sum += num
if prefix_sum - k in sum_count:
count += sum_count[prefix_sum - k]
sum_count[prefix_sum] = sum_count.get(prefix_sum, 0) + 1
return countWhy hash map wins:
- Handles negative numbers
- Counts all subarrays (not just longest)
- Sliding window fails with negatives
When to combine both
Some problems use hash map + sliding window:
Example: Longest Substring with At Most K Distinct
# Combine: sliding window + hash map for frequency
def lengthOfLongestSubstringKDistinct(s, k):
left = 0
freq = {} # Hash map for window state
max_len = 0
for right, char in enumerate(s):
freq[char] = freq.get(char, 0) + 1
# Shrink while too many distinct
while len(freq) > k:
freq[s[left]] -= 1
if freq[s[left]] == 0:
del freq[s[left]]
left += 1
max_len = max(max_len, right - left + 1)
return max_lenPattern: Sliding window for the window, hash map for tracking state.
Comparison table
| Aspect | Hash Map (Prefix Sum) | Sliding Window |
|---|---|---|
| Use case | Exact sum, count | Longest/shortest |
| Negatives? | Yes | No (breaks monotonicity) |
| Space | O(n) | O(1) or O(k) |
| Pattern | Prefix sum + complement | Expand/shrink window |
Decision rule
Is it a contiguous subarray problem?
├─ Yes → Does it have negative numbers?
│ ├─ Yes → Hash map (prefix sum)
│ └─ No → Finding longest/shortest?
│ ├─ Yes → Sliding window
│ └─ No (exact sum) → Hash map
└─ No → Not sliding windowCommon Decision Mistakes
Mistake 1: Using Hash Map for Small Fixed Range
❌ Wrong (overkill):
# Counting lowercase letters with hash map
freq = {}
for char in s:
freq[char] = freq.get(char, 0) + 1✅ Better (use array):
# Only 26 letters - array is simpler and faster
freq = [0] * 26
for char in s:
freq[ord(char) - ord('a')] += 1Mistake 2: Using Hash Map When Set Suffices
❌ Wrong (unnecessary):
# Contains duplicate with hash map
seen = {}
for num in nums:
if num in seen:
return True
seen[num] = True # Don't need value!✅ Better (use set):
seen = set()
for num in nums:
if num in seen:
return True
seen.add(num)Mistake 3: Not Using Two Pointers for Sorted Array
❌ Wrong (suboptimal):
# Two Sum II with hash map (O(n) space)
def twoSum(numbers, target):
seen = {}
for i, num in enumerate(numbers):
if target - num in seen:
return [seen[target - num] + 1, i + 1]
seen[num] = i✅ Better (use two pointers):
# O(1) space
def twoSum(numbers, target):
left, right = 0, len(numbers) - 1
while left < right:
s = numbers[left] + numbers[right]
if s == target:
return [left + 1, right + 1]
elif s < target:
left += 1
else:
right -= 1Mistake 4: Using Sliding Window with Negative Numbers
❌ Wrong (fails):
# Trying sliding window for subarray sum with negatives
# Doesn't work - monotonicity is broken✅ Better (use prefix sum + hash map):
def subarraySum(nums, k):
count = 0
prefix_sum = 0
sum_count = {0: 1}
for num in nums:
prefix_sum += num
if prefix_sum - k in sum_count:
count += sum_count[prefix_sum - k]
sum_count[prefix_sum] = sum_count.get(prefix_sum, 0) + 1
return countMaster Comparison Table
| Data Structure | Lookup | Insert | Order | Space | Best For |
|---|---|---|---|---|---|
| Array | O(1) by index | O(1) at end | Yes | O(n) | Small range, indexed |
| Hash Map | O(1) by key | O(1) | No | O(k) | Arbitrary keys |
| Hash Set | O(1) | O(1) | No | O(k) | Presence only |
| Tree | O(log n) | O(log n) | Yes | O(n) | Sorted, ranges |
| Two Pointers | N/A | N/A | N/A | O(1) | Sorted pairs |
| Sliding Window | N/A | N/A | N/A | O(1)-O(k) | Contiguous windows |
Practice Problems
Hash Map vs Array
- Valid Anagram (#242) - Array for lowercase, hash map for general
- First Unique Character (#387) - Array or hash map
Hash Map vs Set
- Contains Duplicate (#217) - Set is enough
- Two Sum (#1) - Hash map needed (indices)
Hash Map vs Tree
- Kth Largest Element (#215) - Heap or tree
- Contains Duplicate III (#220) - Tree for range
Hash Map vs Two Pointers
- Two Sum (#1) - Hash map (unsorted)
- Two Sum II (#167) - Two pointers (sorted)
- 3Sum (#15) - Sort + two pointers
Hash Map vs Sliding Window
- Subarray Sum Equals K (#560) - Hash map (negatives)
- Longest Substring Without Repeating (#3) - Sliding window
- Minimum Window Substring (#76) - Both (window + map)
Summary
Choosing the right data structure is about understanding trade-offs and requirements.
Key decision factors:
- Keys: Arbitrary → hash map, small range → array
- Values: Need them → hash map, don't need → set
- Order: Need it → tree, don't need → hash map
- Space: Critical → two pointers/array, not critical → hash map
- Sorted: Yes → two pointers, no → hash map
The decision framework:
- What do you need to look up? (key type)
- Do you need associated values? (map vs set)
- Do you need ordering? (tree vs hash map)
- Is space critical? (in-place vs hash map)
- Is input sorted? (two pointers vs hash map)
Common patterns:
- Arbitrary keys → Hash map
- Presence only → Set
- Sorted data → Tree or two pointers
- Small fixed range → Array
- Contiguous subarray → Sliding window or prefix sum + hash map
Next steps:
- Master the complete hash map guide
- Learn when NOT to use hash maps
- Study hash map vs two pointers
- Study hash map vs hash set
- Practice the problem set
Master this framework, and data structure choice becomes systematic.
Want to Practice LeetCode Smarter?
LeetCopilot is a free browser extension that enhances your LeetCode practice with AI-powered hints, personalized study notes, and realistic mock interviews — all designed to accelerate your coding interview preparation.
Also compatible with Edge, Brave, and Opera
