HashMap & HashSet — Deep Dive
"The HashMap is not a trick — it is the most powerful O(1) tool in your interview toolkit."
The HashMap is the single most frequently used data structure in coding interviews. It powers Two Sum, Group Anagrams, Top K Frequent Elements, counting, caching, and dozens of other patterns; expect to reach for one in a large share of medium-difficulty LeetCode problems. Understanding why it's O(1) — and when it degrades to O(n) — is what separates good candidates from great ones.
📚 Why HashMaps Beat Every Other Structure
Before HashMaps, solving "Does this element exist?" required O(n) linear scan or O(log n) binary search in a sorted structure. HashMaps provide O(1) average-case lookup, insert, and delete — an extraordinary advantage.
| Operation | Array | Sorted Array | BST | HashMap |
|---|---|---|---|---|
| Search | O(n) | O(log n) | O(log n) | O(1) avg |
| Insert | O(1) amortised | O(n) | O(log n) | O(1) avg |
| Delete | O(n) | O(n) | O(log n) | O(1) avg |
| Ordered? | No | Yes | Yes | No (use TreeMap) |
If you need keys in sorted order, use a TreeMap (O(log n) per operation). If keys are small integers, use an int[] frequency array instead: it is often faster in practice, because HashMaps carry high constant factors from hashing overhead.
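As a sketch of that small-integer-key alternative (`letterCounts` is our own illustrative helper, not a library method): counting lowercase letters with a plain int[26] skips hashing and boxing entirely.

```java
// Count lowercase letters with a plain array — no hashing, no boxing.
int[] letterCounts(String s) {
    int[] freq = new int[26];            // one slot per letter 'a'..'z'
    for (char c : s.toCharArray())
        freq[c - 'a']++;                 // the character itself is the index
    return freq;
}
```
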
⚙ How Hashing Works — The Internals
A HashMap stores entries in an internal array of buckets. To find where a key belongs:
1. Compute the hash: hash = key.hashCode() (Java additionally mixes the high bits into the low bits).
2. Map the hash to a bucket index: index = hash & (capacity - 1). This works as a fast modulo because capacity is always a power of two.
3. Store the (key, value) pair in that bucket.
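A minimal sketch of steps 1–2 (`spread` and `bucketIndex` are illustrative names of our own; the bit-mixing mirrors what OpenJDK's HashMap does internally):

```java
// Mix high bits into the low bits, as HashMap does before masking.
int spread(int h) {
    return h ^ (h >>> 16);
}

// Bucket selection for a power-of-two capacity: masking == fast modulo.
int bucketIndex(Object key, int capacity) {
    return spread(key.hashCode()) & (capacity - 1);
}
```
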
Collision Resolution — Separate Chaining vs Open Addressing
When two keys hash to the same bucket, we have a collision. Java's HashMap uses Separate Chaining — each bucket holds a linked list, converted to a red-black tree once the chain reaches TREEIFY_THRESHOLD (8) nodes.
| Strategy | How It Works | Java Uses? | Worst Case |
|---|---|---|---|
| Separate Chaining | Each bucket = list/tree of entries | ✅ Yes (HashMap) | O(n) — all into one bucket |
| Open Addressing | Linear/quadratic probing for next empty slot | No | O(n) with bad hash |
| Robin Hood | Steal from "rich" entries, give to "poor" | No | O(log n) expected |
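To make the contrast with chaining concrete, here is a toy open-addressing table with linear probing (illustrative only: fixed capacity, int keys, no deletion or resizing, and it assumes the table never fills):

```java
final int CAP = 16;                      // power of two, fixed for the sketch
Integer[] slots = new Integer[CAP];

// On collision, step forward (wrapping) until an empty or matching slot.
void probeInsert(int key) {
    int i = key & (CAP - 1);
    while (slots[i] != null && slots[i] != key)
        i = (i + 1) & (CAP - 1);
    slots[i] = key;
}

// Probe from the home slot; an empty slot means the key was never inserted.
boolean probeContains(int key) {
    int i = key & (CAP - 1);
    while (slots[i] != null) {
        if (slots[i] == key) return true;
        i = (i + 1) & (CAP - 1);
    }
    return false;
}
```
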
When a bucket's chain reaches TREEIFY_THRESHOLD = 8 nodes, the linked list is converted to a red-black tree. This makes worst-case per-bucket lookup O(log n) instead of O(n). This is the classic "HashMap treeification" interview question at Google.
Load Factor & Resizing
Java's HashMap has a default load factor of 0.75. When size / capacity > 0.75, the map doubles its capacity and rehashes all entries — an O(n) operation. This is why HashMap insertions are O(1) amortised, not O(1) worst-case.
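One practical consequence: if you know the entry count up front, pre-size the map so no resize ever happens. A sketch (`presized` is our own helper; the formula follows directly from the 0.75 load factor):

```java
import java.util.*;

// Capacity must satisfy n / capacity <= 0.75, i.e. capacity >= n / 0.75.
// HashMap then rounds the requested capacity up to a power of two.
<K, V> Map<K, V> presized(int expectedEntries) {
    int initialCapacity = (int) (expectedEntries / 0.75f) + 1;
    return new HashMap<>(initialCapacity);
}
```
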
```java
Map<String, Integer> map = new HashMap<>();

// Core operations — all O(1) average
map.put("apple", 3);               // insert / update
map.get("apple");                  // → 3
map.getOrDefault("banana", 0);     // → 0 (safe default)
map.containsKey("apple");          // → true
map.remove("apple");               // delete key

// Frequency counting idiom (interview staple)
map.put("apple", map.getOrDefault("apple", 0) + 1);
// OR: Java 8+
map.merge("apple", 1, Integer::sum);

// Iterating
for (Map.Entry<String, Integer> e : map.entrySet())
    System.out.println(e.getKey() + " → " + e.getValue());

// HashSet — same internals, no values
Set<Integer> seen = new HashSet<>();
seen.add(5);
seen.contains(5);   // O(1)
seen.remove(5);
```
🎯 The 5 Core HashMap Interview Patterns
Every HashMap problem you will encounter in FAANG interviews is a variation of one of these five patterns. Learn to recognise the pattern first — the code will follow naturally.
| # | Pattern | Trigger phrase | Canonical Problem |
|---|---|---|---|
| 1 | Frequency Map | "count occurrences", "most frequent" | Top K Frequent (LC 347) |
| 2 | Complement Lookup | "find two that add to target" | Two Sum (LC 1) |
| 3 | Prefix Sum + Map | "subarray sum equals K" | Subarray Sum = K (LC 560) |
| 4 | Group-by-Key | "group", "anagram", "same signature" | Group Anagrams (LC 49) |
| 5 | Set Membership | "duplicate", "seen before", "longest sequence" | Longest Consecutive (LC 128) |
Pattern 1: Frequency Map
Count how many times each element appears. The result is a Map<T, Integer> where the value is the count.
```java
// Count frequency of each number
Map<Integer, Integer> freq = new HashMap<>();
for (int x : nums)
    freq.merge(x, 1, Integer::sum);   // Java 8 idiom
    // OR: freq.put(x, freq.getOrDefault(x, 0) + 1);
```
Pattern 2: Complement Lookup
For "find two elements summing to target": as you scan, store what you've seen. For each new element, check if its complement = target - element is already stored.
```java
Map<Integer, Integer> seen = new HashMap<>();
for (int i = 0; i < nums.length; i++) {
    int complement = target - nums[i];
    if (seen.containsKey(complement))
        return new int[]{seen.get(complement), i};
    seen.put(nums[i], i);   // add AFTER checking
}
```
Pattern 3: Prefix Sum + Map
For "subarray sum equals K": maintain a running prefix sum. At each index, check if prefixSum - K has been seen before — that gap is a valid subarray.
```java
Map<Integer, Integer> prefixCount = new HashMap<>();
prefixCount.put(0, 1);   // empty prefix sum seen once
int sum = 0, count = 0;
for (int x : nums) {
    sum += x;
    count += prefixCount.getOrDefault(sum - k, 0);
    prefixCount.merge(sum, 1, Integer::sum);
}
```
Pattern 4: Group-by-Key
Transform each element into a canonical key (sorted string, character frequency array, etc.) and group elements sharing the same key.
```java
Map<String, List<String>> groups = new HashMap<>();
for (String s : strs) {
    char[] c = s.toCharArray();
    Arrays.sort(c);
    String key = new String(c);   // sorted = canonical form
    groups.computeIfAbsent(key, k -> new ArrayList<>()).add(s);
}
```
Pattern 5: Set Membership
Add all elements to a HashSet first. Then traverse, using the set for O(1) existence checks. Classic trick for "longest consecutive sequence": only start counting from sequence beginnings (numbers where num - 1 is NOT in the set).
```java
Set<Integer> set = new HashSet<>();
for (int x : nums) set.add(x);

int longest = 0;
for (int x : set) {
    if (!set.contains(x - 1)) {   // x is a sequence start
        int len = 1;
        while (set.contains(x + len)) len++;
        longest = Math.max(longest, len);
    }
}
```
💪 In-Lecture Practice Problems
Work through the problems in order, and try solving each one before revealing the solution.
Given an array and a target, return indices of the two numbers that add to target. Exactly one solution exists.
Brute force: check all pairs → O(n²). Too slow.

Key insight: for each element x, the complement we need is target - x. If we store every element we've seen so far in a HashMap (value → index), we can check if the complement already exists in O(1).

Critical detail: put into the map after checking — prevents using the same element twice.
▶Solution with full dry run
```java
int[] twoSum(int[] nums, int target) {
    Map<Integer, Integer> seen = new HashMap<>();   // value → index
    for (int i = 0; i < nums.length; i++) {
        int complement = target - nums[i];
        if (seen.containsKey(complement))
            return new int[]{seen.get(complement), i};
        seen.put(nums[i], i);   // add AFTER checking
    }
    return new int[]{};
}
// Dry run: nums=[2,7,11,15], target=9
// i=0: complement=7, seen={} → miss → seen={2:0}
// i=1: complement=2, seen={2:0} → HIT → return [0,1] ✓
```
Convert a Roman numeral string to an integer. Roman symbols: I=1, V=5, X=10, L=50, C=100, D=500, M=1000. Subtraction rule: if a smaller value precedes a larger one, subtract it (e.g. IV=4, IX=9).
Build a HashMap from Roman characters to integer values. Scan right to left — if current value is less than the next value to the right, subtract it; otherwise add it.
▶Solution with dry run
```java
int romanToInt(String s) {
    Map<Character, Integer> val = Map.of(
        'I', 1, 'V', 5, 'X', 10, 'L', 50,
        'C', 100, 'D', 500, 'M', 1000);
    int result = 0, prev = 0;
    for (int i = s.length() - 1; i >= 0; i--) {
        int curr = val.get(s.charAt(i));
        result += (curr < prev) ? -curr : curr;
        prev = curr;
    }
    return result;
}
// Dry run: "MCMXCIV" (right to left)
// i=6: V=5,   prev=0   → 5>=0   → add 5.   result=5,   prev=5
// i=5: I=1,   prev=5   → 1<5    → sub 1.   result=4,   prev=1
// i=4: C=100, prev=1   → 100>=1 → add 100. result=104, prev=100
// i=3: X=10,  prev=100 → 10<100 → sub 10.  result=94,  prev=10
// i=2: M=1000 ... → result=1994 ✓
```
Given a list of strings, group the anagrams together. An anagram is a word formed by rearranging another word's letters.
Key insight: All anagrams of a word share the same sorted character sequence. "eat", "tea", "ate" all sort to "aet". Use the sorted string as the HashMap key, and group original words under it.
▶Solution with dry run
```java
List<List<String>> groupAnagrams(String[] strs) {
    Map<String, List<String>> map = new HashMap<>();
    for (String s : strs) {
        char[] c = s.toCharArray();
        Arrays.sort(c);
        String key = new String(c);   // canonical form
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(s);
    }
    return new ArrayList<>(map.values());
}
// Dry run: ["eat","tea","tan"]
// "eat" → sort → "aet" → map={"aet":["eat"]}
// "tea" → sort → "aet" → map={"aet":["eat","tea"]}
// "tan" → sort → "ant" → map={"aet":[...], "ant":["tan"]}
// Return [[eat,tea],[tan]] ✓
```
Alternative key: instead of sorting, use a 26-entry character-count signature such as [1,0,0,...,1]. Sorting is O(k log k); the frequency key is O(k). Both work — mention the tradeoff in interviews.
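A sketch of that O(k) frequency-signature key (`countKey` is an illustrative helper name of our own): anagrams produce identical count strings, so they group under the same map key.

```java
// Canonical key via letter counts: anagrams yield identical keys.
String countKey(String s) {
    int[] freq = new int[26];
    for (char c : s.toCharArray()) freq[c - 'a']++;
    StringBuilder sb = new StringBuilder();
    for (int f : freq)
        sb.append(f).append(',');   // delimiter avoids ambiguity ("1,11" vs "11,1")
    return sb.toString();
}
```
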
Given an integer array and k, return the k most frequent elements.
Step 1: build a frequency map (element → count). Step 2: find the top-K most frequent. Options:
a) Sort by frequency descending — O(n log n).
b) Use a min-heap of size K — O(n log k). Better when k << n.
c) Bucket sort on frequency — O(n). Best theoretical complexity.
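Option (c) can be sketched as follows (`topKBucket` is our own name): an element can occur at most n times, so its frequency indexes directly into an array of buckets, and scanning buckets from high frequency to low yields the answer in O(n).

```java
import java.util.*;

// O(n) Top-K via frequency buckets: bucket f holds all values occurring f times.
List<Integer> topKBucket(int[] nums, int k) {
    Map<Integer, Integer> freq = new HashMap<>();
    for (int x : nums) freq.merge(x, 1, Integer::sum);

    List<List<Integer>> bucket = new ArrayList<>();
    for (int i = 0; i <= nums.length; i++) bucket.add(new ArrayList<>());
    for (Map.Entry<Integer, Integer> e : freq.entrySet())
        bucket.get(e.getValue()).add(e.getKey());   // index = frequency

    List<Integer> res = new ArrayList<>();
    for (int f = nums.length; f >= 1 && res.size() < k; f--)
        for (int v : bucket.get(f))
            if (res.size() < k) res.add(v);
    return res;
}
```
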
▶Solution — Min-Heap approach O(n log k)
```java
int[] topKFrequent(int[] nums, int k) {
    // Step 1: frequency map
    Map<Integer, Integer> freq = new HashMap<>();
    for (int x : nums) freq.merge(x, 1, Integer::sum);

    // Step 2: min-heap of size k (keeps top-k frequent)
    PriorityQueue<Integer> pq =
        new PriorityQueue<>((a, b) -> freq.get(a) - freq.get(b));
    for (int key : freq.keySet()) {
        pq.offer(key);
        if (pq.size() > k) pq.poll();   // evict least frequent
    }

    // Step 3: extract answers
    int[] res = new int[k];
    for (int i = k - 1; i >= 0; i--) res[i] = pq.poll();
    return res;
}
// nums=[1,1,1,2,2,3], k=2
// freq={1:3, 2:2, 3:1}
// heap after all: [2(freq2), 1(freq3)] → [1,2] ✓
```
Given an integer array and k, return the number of contiguous subarrays that sum to k. Array may contain negative numbers.
If prefix[j] - prefix[i] = k, then the subarray [i+1..j] sums to k. Rearranging: prefix[i] = prefix[j] - k. As we scan, for each new prefix sum we ask: how many times has (prefixSum - k) appeared before? Each such occurrence is a valid subarray ending here.
▶Solution with dry run
```java
int subarraySum(int[] nums, int k) {
    Map<Integer, Integer> prefixCount = new HashMap<>();
    prefixCount.put(0, 1);   // empty prefix sum seen once
    int sum = 0, count = 0;
    for (int x : nums) {
        sum += x;
        count += prefixCount.getOrDefault(sum - k, 0);
        prefixCount.merge(sum, 1, Integer::sum);
    }
    return count;
}
// Dry run: nums=[1,1,1], k=2
// Start: map={0:1}, sum=0, count=0
// x=1: sum=1, need 1-2=-1 → miss   → count=0 → map={0:1,1:1}
// x=1: sum=2, need 2-2=0  → hit(1) → count=1 → map={0:1,1:1,2:1}
// x=1: sum=3, need 3-2=1  → hit(1) → count=2 → map={...,3:1}
// Return 2 ✓
```
Given an unsorted array, return the length of the longest consecutive elements sequence. Must run in O(n).
Key insight: add all numbers to a HashSet. Only start building a sequence from a "sequence start" — a number where num - 1 is NOT in the set. From there, count upwards. This ensures each number is visited at most twice → O(n).
▶Solution with dry run
```java
int longestConsecutive(int[] nums) {
    Set<Integer> set = new HashSet<>();
    for (int x : nums) set.add(x);

    int longest = 0;
    for (int x : set) {
        if (!set.contains(x - 1)) {   // x is a sequence start
            int len = 1;
            while (set.contains(x + len)) len++;
            longest = Math.max(longest, len);
        }
    }
    return longest;
}
// [100,4,200,1,3,2] → set={100,4,200,1,3,2}
// x=100: 99 in set? No → start. 101? No → len=1
// x=4:   3 in set? Yes → NOT a start → skip
// x=1:   0 in set? No → start. 2? Yes, 3? Yes, 4? Yes, 5? No → len=4
// longest = 4 ✓
```
Design a data structure that implements an LRU (Least Recently Used) cache with a capacity constraint. Both get and put must run in O(1).
O(1) get requires a HashMap. O(1) ordering (move to front, remove last) requires a doubly linked list. Together: HashMap (key → DLL node) + DLL (ordered from most-recent to least-recent). Java shortcut: LinkedHashMap with access-order mode does this automatically — great to mention in interviews, but implement from scratch if asked.
▶Solution — LinkedHashMap (concise) + DLL (full)
```java
class LRUCache extends LinkedHashMap<Integer, Integer> {
    private int capacity;

    LRUCache(int capacity) {
        // true = access-order (most recently used first)
        super(capacity, 0.75f, true);
        this.capacity = capacity;
    }

    public int get(int key) { return super.getOrDefault(key, -1); }

    public void put(int key, int value) { super.put(key, value); }

    // Called automatically by LinkedHashMap after each put
    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, Integer> e) {
        return size() > capacity;   // evict when over capacity
    }
}
```
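The card title also promises the from-scratch version, so here is a minimal HashMap + doubly linked list sketch (`LRUCacheDLL` is our own name to avoid clashing with the class above; sentinel head/tail nodes keep the pointer surgery free of edge cases):

```java
import java.util.*;

class LRUCacheDLL {
    // DLL node: head side = most recent, tail side = least recent.
    private static class Node {
        int key, value;
        Node prev, next;
        Node(int k, int v) { key = k; value = v; }
    }

    private final int capacity;
    private final Map<Integer, Node> map = new HashMap<>();
    private final Node head = new Node(0, 0);   // sentinel: most-recent end
    private final Node tail = new Node(0, 0);   // sentinel: least-recent end

    LRUCacheDLL(int capacity) {
        this.capacity = capacity;
        head.next = tail;
        tail.prev = head;
    }

    private void remove(Node n) {
        n.prev.next = n.next;
        n.next.prev = n.prev;
    }

    private void addFront(Node n) {
        n.next = head.next;
        n.prev = head;
        head.next.prev = n;
        head.next = n;
    }

    public int get(int key) {
        Node n = map.get(key);
        if (n == null) return -1;
        remove(n);          // move to front = mark most recently used
        addFront(n);
        return n.value;
    }

    public void put(int key, int value) {
        Node n = map.get(key);
        if (n != null) {    // update existing entry, refresh recency
            n.value = value;
            remove(n);
            addFront(n);
            return;
        }
        if (map.size() == capacity) {   // evict least recently used
            Node lru = tail.prev;
            remove(lru);
            map.remove(lru.key);
        }
        Node fresh = new Node(key, value);
        map.put(key, fresh);
        addFront(fresh);
    }
}
```
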
📝 Assignment
Complete the assignment before moving to Matrix Problems. Covers frequency counting, Two-Sum family variants, subarray sum patterns, isomorphic strings, and cache design problems.
✅ Lecture Completion Checklist
Check each item before advancing to Lecture 14.
Matrix problems require combining array traversal with directional logic (spiral, BFS over grid, DFS flood fill). The HashMap patterns you just mastered appear inside those solutions too — especially for island counting and multi-source BFS state tracking.