How Machine Learning Really Works

Jun 25, 2024 | Blog

City AI Connection is a series of posts on emerging ideas in artificial intelligence, with a focus on government application. These articles are initially shared exclusively with the City AI Connect community and subsequently with a broader audience via the Bloomberg Center for Government Excellence at Johns Hopkins University website, govex.jhu.edu. City AI Connect is a global learning community and digital platform for cities to trial and advance the usage of generative artificial intelligence to improve public services. The community is open exclusively to local government employees. To join, visit cityaiconnect.jhu.edu.

BY Maeve Mulholland

Automata and robots walking alongside humans, astronauts arguing with sentient space stations, and self-driving cars whipping past each other while passengers enjoy coffee and read the news—few phrases can evoke a panoply of images and scenarios the way “Artificial Intelligence” does. Despite everyone having ideas—nightmares even—of what Artificial Intelligence (AI) is, few people actually understand the foundational concepts of this technology.

In a previous City AI Connection post, Sara Bertran de Lis broke down a wide range of AI concepts and taxonomy. In this post, I’ll drill down into “Machine Learning,” a particular category of AI that undergirds most modern AI infrastructure.

Machine Learning (ML) is a field of study and a body of methods for using data and statistical inference to develop pieces of software called “models.” ML models are the core decision-making components of the systems built around them, and they are used in image recognition, natural language processing, anomaly detection, classification, and many other tasks.

ML approaches to developing AI tools have been wildly successful, and a major contributor to their success is the fact that a human developer does not need to instruct a model how to perform a task. The developer needs only to provide a general outline of the task and a large body of examples of successfully completed tasks. The computer (machine) uses the examples to tune the model (learn) in order to successfully complete the task (hence: machine learning).

A computer using example data to refine model parameters is analogous to humans using trial and error (otherwise known as experience) to learn a new task. A real-world example of this is a baker developing intuition for cake bake times. A cake’s bake time depends on the size of the cake, the temperature of the oven, the ingredients in the batter, and the humidity, among other factors.

An experienced baker can make a batter from scratch and intuitively provide a bake time estimate, but where does that intuition come from? It comes from the experience of baking countless cakes, and both the failures and the successes matter in that body of experience. The baker adjusts their intuition after failures, and successes reinforce it.

Training an ML model to predict bake times follows the same iterative approach. The model starts with an initial set of parameters that it uses to calculate a bake time from inputs of batter ingredients, oven temperature, and pan size. The initial state of the model, akin to a new and inexperienced baker, may not be very accurate. The parameters need to be updated, just like the novice baker’s intuition.
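To make this concrete, here is a rough sketch in Python of what such an untrained model might look like. The feature names and starting numbers are invented for illustration; a real model would typically have many more parameters.

```python
# A toy bake-time "model": a weighted formula over its inputs.
# The starting weights are arbitrary guesses -- the untrained,
# "inexperienced baker" state described above.
parameters = {
    "base_minutes": 20.0,          # starting guess for any cake
    "per_cup_of_batter": 2.0,      # extra minutes per cup of batter
    "per_degree_below_350": 0.1,   # extra minutes per degree F below 350
}

def predict_bake_time(cups_of_batter, oven_temp_f, params):
    """Estimate bake time in minutes from pan size (cups of batter) and oven temperature."""
    return (
        params["base_minutes"]
        + params["per_cup_of_batter"] * cups_of_batter
        + params["per_degree_below_350"] * (350 - oven_temp_f)
    )

# With untrained parameters, the answer looks plausible but is probably wrong.
print(predict_bake_time(cups_of_batter=6, oven_temp_f=325, params=parameters))
```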

The baker develops intuition by baking, but the ML model cannot even turn on the mixer, given that it is just a bunch of 1s and 0s on a hard drive. It needs someone to bake for it and tell it what happened, which is exactly what the developer does. The developer compiles a dataset of baking experiences: temperatures, batter ingredients, pan sizes, baking times, and whether each cake came out done, dry, or underbaked.
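In code, a small slice of such a dataset might look like the following; the numbers are invented purely for illustration.

```python
# Each record is one past baking attempt: the inputs the model will see,
# plus what actually happened (the outcome it will learn from).
baking_examples = [
    {"cups_of_batter": 6, "oven_temp_f": 350, "minutes_baked": 32, "outcome": "done"},
    {"cups_of_batter": 6, "oven_temp_f": 350, "minutes_baked": 24, "outcome": "underbaked"},
    {"cups_of_batter": 8, "oven_temp_f": 325, "minutes_baked": 45, "outcome": "dry"},
    {"cups_of_batter": 4, "oven_temp_f": 375, "minutes_baked": 22, "outcome": "done"},
    # ...a real training set would contain many more examples
]
```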

The ML algorithm iterates over the examples provided by the developer, guessing a bake time, then referring to the actual outcome and adjusting its parameters to produce a better guess. Sometimes this adjustment overcompensates and sometimes it undercompensates, so many examples are needed to tune the parameters until the algorithm can predict accurately across a range of batter ingredients, pan sizes, and temperatures.
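A minimal sketch of that tuning loop is below. It is deliberately simplified: the data is invented, each example pairs the inputs with the bake time that turned out to be right for that cake, and the update rule is basic gradient descent, just one of many ways parameters can be adjusted.

```python
# Toy training loop: repeatedly guess, compare to the known answer, and
# nudge the parameters in the direction that shrinks the error.

# Training examples: (cups_of_batter, oven_temp_f, correct_bake_minutes)
examples = [
    (6, 350, 32),
    (8, 325, 45),
    (4, 375, 22),
    (6, 325, 38),
]

# Arbitrary starting parameters -- the "inexperienced baker".
w_batter, w_temp, bias = 0.0, 0.0, 20.0
learning_rate = 0.0001

for _ in range(2000):                    # many passes over the same examples
    for cups, temp, actual_minutes in examples:
        guess = bias + w_batter * cups + w_temp * (350 - temp)
        error = guess - actual_minutes   # positive means the guess was too long
        # Small steps keep the over- and under-compensation manageable.
        w_batter -= learning_rate * error * cups
        w_temp   -= learning_rate * error * (350 - temp)
        bias     -= learning_rate * error

# After training, the model can estimate a bake time for a cake it has never seen.
print(round(bias + w_batter * 6 + w_temp * (350 - 340), 1))
```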

The real-world systems that AI tools are applied to can be vastly more complex than a bakery. The tasks that machine learning models take on are often ones developers have no idea how to approach directly, such as gesture recognition in a video feed. For these complex applications, the ability of ML models to learn by example is indispensable. Developers need not understand the best way to differentiate photos of people’s faces, predict the next purchase in an online marketplace, or identify fraud in a list of credit card transactions. The developer need only choose a class of machine learning models, set a few configuration options, and then provide a list of examples of the sorts of predictions they would like the algorithm to make. The algorithm, like the human fiddling with baking times in the kitchen, adjusts internal settings over and over until it can correctly predict the outcomes of the example trials it was given.
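In practice, developers rarely write that adjustment loop by hand; they typically pick a model class from an ML library and hand it the examples. The sketch below uses the scikit-learn library with the same invented bake-time data; it illustrates the workflow rather than recommending any particular model.

```python
# Typical workflow: choose a class of model, fit it on examples, then predict.
# Requires scikit-learn; the data is invented for illustration.
from sklearn.linear_model import LinearRegression

# Inputs: [cups_of_batter, oven_temp_f]; targets: correct bake time in minutes.
X = [[6, 350], [8, 325], [4, 375], [6, 325]]
y = [32, 45, 22, 38]

model = LinearRegression()   # the chosen class of model
model.fit(X, y)              # the "learning by example" step

print(model.predict([[6, 340]]))  # estimated bake time for a new cake
```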

Everything from Amazon recommendations to ChatGPT chat sessions is powered by machine-learning models. ML algorithms can be developed without programmers needing to write out complicated rules or capture the nuances of human decision-making. The algorithms optimize their decisions, making predictions based on the data provided. This is a blessing and a curse. On one hand, programmers do not need to foresee and write a rule for every possible situation, and problems that were previously intractable suddenly become soluble. On the other hand, when there are implicit biases in the training examples, those biases manifest in the algorithm’s behavior. High-profile examples include face-tracking software failing to recognize people of color because the training set included only white faces, and resume review software overlooking women candidates because men were overrepresented in the training set.

We are preparing a follow-up post discussing how machine learning models are evaluated. How does one measure performance and catch bias in an AI system? The evaluation process differs from that for traditional computer code: a careful reading of the code and documentation should tell a reviewer what to expect from most traditional software, and even very complex programs can be tested to make sure they perform as intended. ML systems, however, require special care and consideration beyond what is typical for traditional software.

Maeve Mulholland is a data scientist on GovEx’s Research and Analytics team.