I originally wrote this article for the Uber engineering blog.
With UberEATS, our aim is to make ordering food from your favourite restaurants as seamless as requesting a ride with uberX or uberPOOL. Like launching any new product, building out a food delivery network came with its fair share of engineering triumphs and surprises. Although tasty, this new flavourful passenger (food!) also comes with its fair share of challenges. For instance, it cannot specify its preferred route or chit chat with the driver and it does require more steps at pickup and dropoff. In this article, we focus on one challenge in particular: how Uber Engineering handled introducing a third party to what had previously been a two-sided marketplace. Continue reading »
Substring searching has become so ubiquitous in recent years that it’s easy to take for granted the beauty and ingenuity of automata-based pattern matching. Remarkably, the importance of these algorithms has, if anything, continued to grow. Knuth-Morris-Pratt (KMP) was published in 1974 but Skiena’s ‘killer applications’ for string processing — online search and computational biology — wouldn’t emerge until decades later.
The key insight behind KMP is that it’s never necessary to backtrack when searching the text. Examining each character once is enough. Consequently the algorithmic complexity is reduced from O(mn) in the naïve case to O(m+n). KMP achieves this by re-framing the problem. Rather than whether a substring exists, it asks: what currently is the longest prefix of the pattern that is also a suffix of the input string? Substrings can then be detected by checking whether the prefix length is equal to the pattern length. Continue reading »
As my original introduction to data structures was in the context of C/C++, it’s been interesting recently to see how these translate into a more concise, magical language such as Python. DFS path finding provides a classic example of it making life easier.
A recursive algorithm to find a path between two vertices, as in almost any other language, is trivial. Continue reading »
As part of an exercise I recently to implemented agglomerative clustering for a small set of news articles. After fiddling around for a while it occurred to me that the solution would be much more elegant if cluster membership was tracked with a disjoint-set. In fact it turns out that the data structure is quite commonly used to solve this particular problem.
Agglomerative clustering is a hierarchical clustering method that starts from the ‘bottom up’. Each element initially forms its own cluster, with the nearest two clusters being progressively merged until only one remains. Continue reading »