Alien Dictionary

4 min read
Graph
LC-Hard

Oct 21, 2022

3 years ago

LeetCode

Solution

Uses Topological Sort
Create a graph from the given rules
If there's a cycle in this graph, return an empty string because a valid answer is not possible
- See the example below: does w come before e, or does e come before w? We don't know. We cannot know.
The first part of the code is just bookeeping, checking for boundary conditions, and setting up the adjacency list

adj = defaultdict(set) for i in range(len(words) - 1): w1, w2 = words[i], words[i + 1] min_len = min(len(w1), len(w2)) if len(w1) > len(w2) and w1[:min_len] == w2[:min_len]: return '' for j in range(min_len): if w1[j] != w2[j]: adj[w1[j]].add(w2[j]) break

Loop though each word pair
If two pairs have the same suffix but the longer word comes before the shorter word, immediately return because this is not lexicographically possible.
- We know that the words are in sorted order already
- If there are two words "app" and "apple" but apple comes before app, we immediately return an empty string
We maintain a adjacency list where the key is the character and the value is a set of characters that come AFTER the key character
Loop through each character in both the words, when a character that doesn't match is seen, we add the character from the second word (remember, the words are sorted) to the the set

The next part of the code is the graph traversal

def dfs(node): if node in visiting: return False if node in visited: return True visiting.add(node) for nbr in adj[node]: if not dfs(nbr): return False visited.add(node) path.append(node) visiting.remove(node) return True for ch in set(''.join(words)): if not dfs(ch): return '' return ''.join(path[::-1])

We do something called post-order DFS, where instead of adding a character immediately to the output variable, we add it once we finished dealing with all of it's neighbours.
- Why are we doing post-order DFS? Because if we do a regular DFS, we have a chance to get an invalid ordering:
- Here, ABC would be the correct order, not ACB
  - We will get ACB if we do regular DFS, i.e. add the value to path immediately once visited
  - We will get ABC (the correct ordering) if we do post-order DFS, i.e. add the value to the path only after we deal with all it's neighbours
We need to maintain two sets called visited and visiting
visited is to recur out of the DFS if a character we've seen already is seen, we've already dealt with this character before and hence no need to process it again
visiting is to check for cycles in the graph. If we visit a node and then visit it again, return False because if a loop exists, we cannot figure out the ordering of characters
Do the Depth-first Search on each character, and if at any point we return False, return an empty string
Else, return the path, but remember to reverse it since we did a post-order DFS.
Why do we need to do it for each character:
- The graph is not guaranteed to be a single component:
  - Given example words = ["zy", "zx"], the only graph would be:
    - y -> x
  - How can z come into the picture?
    - Only be looping through each character can z come here
  - Because from the given input, it is immediately clear that y comes before x.
  - But where does z come from? It has to come after both x and y because there's no other ordering mentioned and the only other place it can come is after both x and y (I think)

Time	Space	Explanation
`O()`	`O()`

def alienOrder(self, words: List[str]) -> str: adj = defaultdict(set) for i in range(len(words) - 1): w1, w2 = words[i], words[i + 1] min_len = min(len(w1), len(w2)) if len(w1) > len(w2) and w1[:min_len] == w2[:min_len]: return '' for j in range(min_len): if w1[j] != w2[j]: adj[w1[j]].add(w2[j]) break visited, visiting, path = set(), set(), [] def dfs(node): if node in visiting: return False if node in visited: return True visiting.add(node) for nbr in adj[node]: if not dfs(nbr): return False visited.add(node) path.append(node) visiting.remove(node) return True for ch in set(''.join(words)): if not dfs(ch): return '' return ''.join(path[::-1])