Learning data structures and algorithms


When I say that you need to learn algorithms and data structures, what I really mean is:
You Need to Learn Problems.
No, you don’t need to memorize algorithms and data structures to the point where you can implement them on a whiteboard from memory.
Nobody actually uses that skill, except during a job interview.
If you want to write code at an exceptional level of talent, what you really need to do is:
You Need to Learn Problems.
First example for you.
Let’s say that I have a large set of points in space.
And then I choose an arbitrary point in space, and I say, tell me, as quickly as you can, which k points in my set, are closest to this arbitrary point.
Now you could have several responses to this question.
One is to throw up your hands and have no idea where to start.
Another is to simply compare the distance between every point in the set and the arbitrary point. If you did that, you should be able to sort that list by distance, and produce the k nearest points.
Another response is to partition up the space into cells, so you can quickly find the nearest point by testing which cell contains it. And then you can do a search outward from that cell to connecting cells, in order to find the nearest points.
I’m sure you can come up with other ways to do it too.
Or — better than all of these answers — you could simply recognize that this problem is called the nearest neighbor problem.
And you could Google and read up on the latest research into the problem, which would inform you that there are several dozen new algorithms for solving this exact problem.
That research in turn might cause you to find this page, which benchmarks several nearest-neighbor search libraries.
And that page in turn might take you to this paper, in which the authors detail their approach— Hierarchical Navigable Small World graphs — which is currently the fastest practical algorithm for a particular 3500 point data set.
And you’ve never heard of this algorithm before. Because the paper was written in 2016, and updated in 2018!
Which brings us to the central crux of this answer:
It is more important to be able to recognize a problem, than to remember an algorithm that solves it.
This is because the “best” algorithm to solve any given problem today, is not necessarily the same as it was yesterday!
Here’s the second example.
Name the fastest algorithm to find a string in another string.
Maybe you have no idea how to do this.
Maybe you would just write an algorithm that stepped, character by character, through the string to be searched.
Maybe, if you’re a computer science graduate, you may have learned that Boyer-Moore is the “fastest” substring search algorithm.
Maybe, if you had known the name of the problem, you would have researched the problem and found Turbo-BM. Or Apostolico-Giancarlo. Both of these algorithms have superior worst-case and/or average performance guarantees to Boyer-Moore.
And none of your brilliant programmer friends even knows that these algorithms exist!
That’s why Learning Problems is so important. If you know that this problem, generically, is called string matching, then you can look up not only the well-known algorithmic solutions, but even the more obscure, more modern, and faster algorithms that haven’t become standard college curriculum yet.
And — for the best of the best programmers — it is entirely possible that you may read all that literature, and come up with a superior new algorithm that no one else has thought of yet.
If your work is ordinary business problems, then a box of standard algorithms, such as the C++ standard template library, can be perfectly adequate for day-to-day work.
But if you need to do incredibly fast code, or incredibly specialized code, nothing beats your own ability to research the academic literature on that problem.
And the academic literature, by definition, is something you can’t know cold, and it’s something you can’t parrot during an interview quiz.
Especially on subtle or important programming questions, the ability to Learn Problems, can help you produce the best possible algorithm to solve it.

Comments

Popular posts from this blog

How to use Django Bootstrap Modal Forms

Everything you need to know when developing an on demand service app

Documentation is Very vital before you develop any system or app