Machine learning is the looking glass through which we can observe a wide variety of phenomena. It serves as a tool for scientists and practitioners to identify patterns and capture the inner workings of almost any process, from the mutations taking place in your DNA to why your customers are excessively purchasing “candy-shaped sunglasses.” Naturally, one has to ask: how in the world is this technology helping us understand so much about how stuff works?


I believe that to understand how we use machine learning to make sense of the world around us, we must first discuss how the data that these algorithms use is generated. In other words, we need to recognize that data doesn’t just exist. It is the result of a process. In general, you can imagine a process as a black box that transforms information: some information comes in, is modified, and is then output. It is then natural to think of a process as something that processes and generates data. However, to make things even more complicated, what we call a process doesn’t necessarily need to produce data; even the absence of output is data (this is a nice tangent you can go off on if you and your mates are bored at the pub).

In the real world, these processes are very complicated and messy. It is impossible in most situations to make sense of how they work or what their output means just by observing them. Fortunately, some very smart people invented maths, especially statistics and probability theory, to help us describe and understand messy processes. Using this language, we can formulate simple models and test hypotheses. In other words, we create representations that bring us closer to the inner workings of the processes we observe.

Sounds great in theory, but where does machine learning come into play? Imagine you see the output of a process that generates data in the form of a few billion rows with thousands of columns. How would you go about formulating and testing your models using such a large dataset? You can start by making some assumptions about the process and, based on these assumptions, construct some mathematical models. But how are you going to test your models? Or, if your models have many unknown components, i.e. parameters, how are you going to find their values? Luckily, you have a very powerful ally to assist you in this Herculean task: the computer!

Computers are excellent because they can take a generic model and find the parameters for that model that best describe the observed data. In other words, using a computer, we can fine-tune the mathematical representation of the observed phenomenon. However, it is important to note that this description doesn’t always give the full picture. It is best to think of it as a glimpse into the workings of the process. We never see all the possible data that a process can generate. What this means is that we will end up developing models that are better or worse at explaining the observations. As a consequence, we create models that only partially capture the inner workings of the process that generated the data. However, in most practical situations, these models are sufficient to give us enough insight into what is going on.
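
To make this a bit more concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: `hidden_process` stands in for a real-world process (a straight line plus random noise, with a slope of 2 and an intercept of 1 that I picked arbitrarily), and `fit_line` uses ordinary least squares to find the two parameters that best describe the observations. It is a toy, not a recipe for real data.

```python
import random


def hidden_process(x, rng):
    """Hypothetical process: a straight line plus random noise.

    The values 2.0 and 1.0 are assumptions chosen for this example.
    In practice we never get to read them off like this.
    """
    return 2.0 * x + 1.0 + rng.gauss(0.0, 0.5)


def fit_line(xs, ys):
    """Closed-form least squares for the model y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both up to a factor of n).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [i / 10 for i in range(200)]
ys = [hidden_process(x, rng) for x in xs]  # all we ever observe is (xs, ys)

slope, intercept = fit_line(xs, ys)
print(f"estimated slope={slope:.2f}, intercept={intercept:.2f}")
```

The estimates should land close to the values hidden inside the process, but not exactly on them, because we only ever see a finite, noisy sample of what the process can generate.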

This last step of fine-tuning the parameters is part of what machine learning does. Using clever computational tricks, we can parse large datasets and search for the right combination of parameters that makes our model best describe the observed data. I like to think of machine learning as a set of algorithms that enable me to make sense of the messy world that’s out there. For me, they are tools which help me build accurate representations of what is happening in the world using data.
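
One of the simplest such searches is gradient descent: start from a guess, measure how badly the model describes the data, and nudge the parameters in the direction that reduces the error. The sketch below assumes a straight-line model and made-up observations (the numbers 3.0 and 0.5, the learning rate, and the step count are all my own choices for the example), so treat it as an illustration of the idea rather than how any particular library does it.

```python
import random


def mean_squared_error(slope, intercept, xs, ys):
    """Average squared gap between the model's predictions and the data."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, learning_rate=0.1, steps=2000):
    """Search for the slope and intercept that minimise the mean squared error."""
    slope, intercept = 0.0, 0.0  # arbitrary starting guess
    n = len(xs)
    for _ in range(steps):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_slope = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_intercept = 2.0 * sum(errors) / n
        # Take a small step downhill.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


rng = random.Random(42)  # fixed seed so the run is repeatable
xs = [i / 100 for i in range(100)]
ys = [3.0 * x + 0.5 + rng.gauss(0.0, 0.1) for x in xs]  # hypothetical observations

start_error = mean_squared_error(0.0, 0.0, xs, ys)
slope, intercept = gradient_descent(xs, ys)
final_error = mean_squared_error(slope, intercept, xs, ys)

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"error went from {start_error:.3f} to {final_error:.3f}")
```

With only two parameters this is overkill, but the same loop of guess, measure, and adjust is what lets a computer tune models with far more parameters than anyone could fit by hand.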