Podcast: Play in new window | Download

Subscribe: Apple Podcasts | Android | Google Podcasts | Stitcher | TuneIn | Spotify | RSS

It’s time we discuss algorithms we all need to know as we continue diving into Rob Conery’s *The Imposter’s Handbook* while Michael will read anything, Allen questions Greenland’s name, and Joe talks wormholes.

## Sponsors

- Datadog.com/codingblocks – Sign up today for a free 14 day trial
get a free Datadog t-shirt after creating your first dashboard.*and*

## Survey Says

## News

- As always, we take a moment to give a huge
*thank you*to everyone that left us a review:- iTunes: AKeith047, ThinkingWithMyHareBrain, T.Todorov, MeroIT, YourWorstEn3my, fiery_ginger, Emchar86, Joshua Reynvaan, Michael-not-an-outlaw, “Michael Joe Allen, SerialKilla”
- Stitcher: LiamM, daniel65, CBFan7629045419, Radulan

- Default interface methods might be coming to C# 8. (Info Q)
- We released our second Community Talk, this time discussing RAD Studio. (YouTube)
- Joe releases more videos adding to the QIT Diary. (YouTube)
- Allen explains why a cube or data warehouse doesn’t solve the search problems we discussed in episode 83.

## Algorithms Made Easy

### Bubble Sort

#### What is it?

- Iterate through the list to be sorted, comparing adjacent items, swapping them if they are in the wrong order. Repeat this process until no swaps are required.
- Consider the values: 1, 3 5, 2, 4
- First compare the 1, 3. They are in the correct order. Then compare the 3 and 5. They are also in the correct order.
- Now compare the 5 and 2. They need to be swapped. The values are now 1, 3, 2, 5, 4.
- Now compare the 5 and 4. They also need to be swapped. The values are now 1, 3, 2, 4, 5
- We need to iterate through the list again. Compare 1 and 3. They’re good. Now 2 and 3. Swap ’em! The values are now 1, 2, 3, 4, 5.
- Now we iterate over the list again. Nothing needs to be swapped. But we had to do this to verify that.
- So we’ve sorted the list. It took 3 iterations over the list to rearrange 1, 3, 5, 2, 4 into 1, 2, 3, 4, 5.

- Best time complexity is O(n) while the average and worst time complexities are O(n^2). The worst space complexity is O(1).
- Visualizations available at Wikipedia and Toptal.

#### Strengths

- Might be the easiest of sorting algorithms to implement. Certainly easy to understand.
- A case can be made that this makes it a nice introduction to algorithms, although, some argue that better algorithms should be used as an introduction instead, like insertion sort or quicksort.

#### Weaknesses

- Might also be one of the worst sorting algorithms in regards to time complexity.
- Horribly inefficient in regards to time. There is almost always a better choice.

#### When to use it?

- According to Wikipedia, there is a popular use for it in a very specific use for graphics, but aside from that avoid it unless you specifically know why it’s the best choice.
- Basically when you can *assume* that the list is already “almost-sorted”
- From Wikipedia:
*_”In computer graphics bubble sort is popular for its capability to detect a very small error (like swap of just two elements) in almost-sorted arrays and fix it with just linear complexity (2n). For example, it is used in a polygon filling algorithm, where bounding lines are sorted by their x coordinate at a specific scan line (a line parallel to the x axis) and with incrementing y their order changes (two elements are swapped) only at intersections of two lines.”_*

### Merge Sort

#### What is it?

- Most common sort, out performs most other sorts.
- Keep splitting your array in half, until you can’t anymore – then sort them back together.
- Note: you don’t have to allocate new arrays every time you split, but you probably will end up creating new arrays when you put Humpty back together again.
- Considered a stable sort.
- Doesn’t
*_sound_*more efficient than bubble sort, but the trick is that merging sorted lists is much faster (less repeated work). - Best, average, and worst time complexities are O(n log(n)) and the worst space complexity is O(n).
- Visualizations available at Wikipedia and Toptal.

#### Strengths

- Solid performer, great all around routine if you don’t know much about your list going in.
- In the worst case, merge sort does about 39% fewer comparisons than quicksort does in the average case.

#### Weaknesses

- Mildly annoying to program, generally going to want two functions – one to call and another to merge.
- Not commonly done inline, because of the merge step.

#### When to use it

- This is your default choice.
- Despite it’s memory overhead, it scales well because with large data because of it’s “divide and conquer” nature.

### Quicksort

#### What is it?

- Breaks a larger list into smaller lists with the use of a pivoting technique
- Smaller lists use pivoting technique until they are sorted
- The partitioning is done at random which can be a disadvantage for the list
- Let’s consider this example list: 5, 2, 7, 4, 1, 8, 3, 6
- Take the last number and use it as the partition value: 6 (Lumoto Partition Scheme).
- All values less than the partition stay to the left of the partition value, 6.
- All values more than the partition value move to the right.
- 5 is less than 6, so it stays where it is.
- 2 is less than 6, so it stays where it is.
- 7 is greater than 6, so it needs to move after it, but doing that requires you to move the pivot down one position, but there’s already a number there, 3, so three is going to move back to where the 7 was.
- Now the list is 5, 2, 3, 4, 1, 8, 6, 7.
- Now we have to look at the 3 that just took the 7’s place, and it’s less than 6, so it can stay there.
- 4 and 1 are both less than 6, so we’ll leave those alone as well.
- 8 is more than 6, so it’s going to swap places with the 6 now.
- Now the list is 5, 2, 3, 4, 1, 6, 8, 7.
- That’s the end of the first partition sort…
- Now we’ve got two new lists – the left of the old pivot 6, and the right: 5, 2, 3, 4, 1,
*_6_,*8, 7. - 6 is where it will be in the end, so you leave that one alone for now…
- 1 is the new pivot point of the left list, 7 is the pivot point of the right list…let’s start with the left list.
- 5 is greater than 1, so we’re going to move it after 1, which means 4 needs to move where 5 was.
- So our remaining comparison operations will transition like this as the 1 moves to the left:
- 4, 2, 3, 1, 5
- 3, 2, 1, 4, 5
- 2, 1, 3, 4, 5
- 1, 2, 3, 4, 5

- That was it for the left list…
- Now we need to sort the right list: 8, 7.
- Which is simple in this case, so the 7 moves and we get: 7, 8.
- That’s it for the right…

- Best and average time complexity is O(n log(n)) while the worst is O(n^2). Worst space complexity is O(log(n)).
- Visualizations available at Wikipedia and Toptal.

#### Strengths

- If you were to choose your pivot intelligently by finding the median value (requires a list scan first), then it will usually outperform the merge sort – because the sorting happens in place, the space complexity is less and fewer operations

#### Weaknesses

- So the interesting thing here is
*if*the list was already sorted, and you do this version, you get to O(n^2) – terrible performance.

### Selection Sort

#### What is it?

- This is how
*we*might actually sort something, like playing cards for example. - Iterate through the list, finding the smallest element, and swap it with the first element. Repeat for each additional element in the list.
- Consider the values: 1, 3 5, 2, 4.
- Scan the list looking for the smallest value. After going through the list, we determine that 1 is already in the correct position.
- Scan the list again looking for the next smallest value (which could be equal to 1 by the way). We eventually find 2. Swap 2 and 3. Now the values are 1, 2, 5, 3, 4.
- Scan the list again looking for the next smallest value and we eventually find 3. Swap 3 and 5. Now the values are 1, 2, 3, 5, 4.
- Scan the list again and we eventually find 4. Swap 4 and 5 and now our values are 1, 2, 3, 4, 5.
- We iterated over the list 4 times.

- Best, average, and worst time complexities are O(n^2) and the worst case space complexity is O(1).
- Visualizations available at Wikipedia and Toptal.

#### Strengths?

- No recursion necessary meaning no stack overflow exceptions to worry about.
- Good for small lists
- Can be advantageous in some circumstances, particularly when memory is limited.

#### Weaknesses?

- Similar to bubble sort, it’s not very efficient. Worst case, the time complexity is O(n^2).
- We have to scan each element in order to find the smallest value.
- Not a good choice for large arrays.

#### When to use it?

- Good (enough?) for small arrays.
- But how many items in the array is still considered small? Some say 10-20 elements, others say 20-30 elements …

- Can result in fewer write operations, which could be important if you’re sorting (and writing) on Flash memory.

### Heapsort

#### What is it?

- Divide the input into a sorted and an unsorted region, then you keep shrinking the unsorted region

by moving the largest number out of the unsorted region. - The trick here, is that the the heap data structure allows you to quickly lookup the maximum, so rather than having to look at each item O(n).
- It’s “in place” meaning low over head, but it is not stable.

##### So…what’s a heap?

- It’s a tree where the parent node has a numerical relationship with it’s children – either the parent is bigger or equal (max) or smaller or equal (min).
- For our purpose here, the main point is that the root of the tree is always bigger than either of it’s children.

So, lets revisit – start with an unsorted, build a binary max heap, cut the root off, then decide which child is the new root.

- Best, average, and worst time complexities are O(n log(n)) and the worst space complexity is O(1).
- Visualizations available at Wikipedia and Toptal.

#### Strengths

- Better worst case scenario than QuickSort – O(log n) vs O(n^2).
- Can be done in place.

#### Weaknesses

- On average, performs worse than Quicksort despite the better worse case scenario.

#### When to use it

- When you really care about the average sort time and memory usage – heap is nice and consistent (Similar best, worst, average).

### Binary Search

- The list to be searched has to be sorted first.
- Divide and conquer algorithm.
- Keep splitting the list in half and ignore the part you don’t need.
- Let’s consider the following list: 1, 3, 5, 7, 9, 11.
- Let’s say we’re looking for 9.
- We divide the list, so we have 1, 3, 5 in one list, 7, 9, 11 in the other.
- 1, 3, 5 can be ignored as we know 9 is greater all of those. Remember, we know the list is already sorted.
- 7, 9, 11 gets split again in the middle…but because we can’t evenly split an odd number sized list, we’ll put 7, 9 on the left and 11 on the right.
- 11 is greater than 9, so we’ll ignore it…
- We’re just left with 7 and 9.
- One last comparison and we now know where our item is.

- Best time complexity is O(1). Average and worst time complexity is O(log n). Worst space complexity is O(1).

### Graph Traversal

#### What is it?

- Graph traversal is the concept of navigating a graph.
- This gets tricky. We can do this either via recursion or an iterator. If we choose recursion, we risk stack overflow exceptions. So choose wisely.
- Recall that every time we recurse a function, the variables in that function remain on the stack.

- This gets tricky. We can do this either via recursion or an iterator. If we choose recursion, we risk stack overflow exceptions. So choose wisely.
- Consider your family tree. There are a couple ways we could search that tree (er, graph).
- One way, might be to go as far down one path as possible before going to the next node.
- Think of researching your ancestry. You might explore the lineage as far back as you can go starting with your mother, before exploring your father’s lineage.
- This is a Depth First Search.

- Another search, might be to first search all of the nodes on the same level, before going to the next.
- Again, considering the family example, you might be more inclined to ask closer family members for money (or help) before going further down the tree. In other words, you might ask a sibling before your parents, or your parents before your grandparents, etc.
- This is a Breadth First Search.

### Depth First Search

#### What is it?

- If you take a little bit of time up front you can save a lot of time retrieving data.
- By a little time I mean x.
- You can implement with recursion or a stack, stack is nice on memory since you only ever have a max of tree depth nodes and it’s easy to limit your algorithm based on the depth.
- You can store a pointer back to your parent in order to avoid having to track your path in some situations (binary search tree) but if you’re in a cyclic graph you need to know the full path.
- Lots of different types of trees – generally the more balanced the tree is, the better.

#### Strengths

- Simple way of traversing a tree.
- Better at finding “farther” data points.

#### Weaknesses

- Worse at finding closer data points.
- Some people might mention a problem with infinitely deep trees, but that’s a silly example since you can just as easily imagine a tree that’s infinitely broad.

### Breadth First Search

- From a starting node, traverse the tree by level, or layer-wise, before moving to the next level.
- You’re searching every level of the tree, from the starting node, fully before going on to the next level of the tree.
- You’ll visit all peers/siblings on a given level before moving to the next level.
- Need to track each node in a tree before going to the next level.
- Have to track every node and it’s children in order.
- Use a queue for storing that info.

#### Strengths

- Good for finding the shortest path between two nodes.
- According to Wikipedia:
*_”When applied to infinite graphs represented implicitly, breadth-first search will eventually find the goal state, but depth-first search may get lost in parts of the graph that have no goal state and never return.”_*

#### Weaknesses

- Can take more time if the node being searched for is further down the tree and/or the tree is wide.
- Can be memory intensive when the tree is wide.

## Resources We Like

*The Imposter’s Handbook*(bigmachine.io)- Sorting Algorithms Animations (sorting-algorithms.com)
- Big-O Cheat Sheet (bigocheatsheet.com)
- Sorting Algorithm Stability (Wikipedia)
- Bubble Sort (Wikipedia)
- Merge Sort (Wikipedia)
- Quicksort (Wikipedia)
- Selection Sort (Wikipedia)
- Heapsort (Wikipedia)
- Binary Search Algorithm (Wikipedia)
- Graph Traversal (Wikipedia)
- Depth-First Search (Wikipedia)
- Breadth-First Search (Wikipedia)
*When should one use Insertion vs. Selection sort?*(Quora)*What are the advantages of using BFS over DFS or using DFS over BFS? What are the applications and downsides of each?*(Quora)- Some more awesome sorting algorithm animations (thanks @Tyriar): (Growing the Web)
- Additional information about properties of sorting algorithms (also thanks @Tyriar): (Growing the Web)

## Tip of the Week

- Use Logstash to easily ingest your data into Elasticsearch.
- Preview your markdown in Visual Studio Code. (Visual Studio Code Docs)
- Bonus tip: Use CTRL+K, M to change your language mode, then use SHIFT+ALT+F to format the document. Especially useful for those JSON blobs you’re inspecting.

- Navigate your bookmarks using the Bookmark Navigator extension for Chrome. (Chrome Web Store)