Editor's Note: Enjoy this guest post by Daniel Bauman from Stanford University's Institute for Computational and Mathematical Engineering. Do you have an AI-related research topic or project that you would like to share on our blog? Contact Chris Markman, Digital Services Library Supervisor, for details.
While mathematics provokes confusion or empty stares amongst many adults, I believe this is a case of bad PR. The math you learned in school likely combined repetitive drills with memorizing formulas—not exactly a recipe for engaging teenagers. However, I firmly believe that mathematical thinking is an incredibly valuable skill for all of us, not just scientists and engineers. Believe it or not, most math isn’t about solving equations. Simple mathematical literacy can help us know what’s (probably) true, deal with uncertainty, and understand more about new technologies like artificial intelligence (AI). Forget the quadratic formula, the law of cosines and the chain rule—let’s travel to the world of probability and statistics.
We all experience randomness constantly. For example, none of us know what the stock price of Facebook will be next week, or how long it will take to drive somewhere. Just because something is “random” does not mean that we cannot know anything about it. We might believe that Facebook will have a similar price next week, and we may think that we’ll make it home by 5:45. Even though the world is filled with uncertainty, we can still decide when to buy/sell stocks or plan when to drive home. Most of us do this very naturally thanks to our experience of the world and external knowledge. Thinking in terms of what might happen is a complex skill that is fundamental to human intelligence.
Understanding Probability
Probability is a mathematical tool that lets us measure things which are uncertain, just like we can measure the temperature, or the height of a building. Measuring does many useful things for us: it lets us compare how likely things are, it lets us understand what we expect uncertain things to be like, and it helps us understand how far something is from what we expected it to be. The examples I described above show how we all use probabilistic reasoning every day, even without doing any math. But sometimes these instincts are wrong in unexpected ways. Using probability lets you ask and answer precise questions about uncertain things. We can’t ever know exactly what will happen in the future, but probability can let us know what to expect and how uncertain we are.
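For readers who like to see the arithmetic, here is a minimal Python sketch of "what to expect and how uncertain we are," using the drive-home example. The commute times and their probabilities are invented for illustration, and the helper names are hypothetical rather than taken from any library.

```python
import math

# Hypothetical commute times in minutes and how likely each one is.
# These numbers are made up for illustration; the probabilities sum to 1.
commute = {20: 0.5, 30: 0.3, 45: 0.2}


def expected_value(dist):
    """Probability-weighted average of the possible values."""
    return sum(value * p for value, p in dist.items())


def standard_deviation(dist):
    """Typical distance between an outcome and the expected value."""
    mean = expected_value(dist)
    variance = sum(p * (value - mean) ** 2 for value, p in dist.items())
    return math.sqrt(variance)


def probability_at_most(dist, limit):
    """Chance that the outcome is no larger than `limit`."""
    return sum(p for value, p in dist.items() if value <= limit)


mean_minutes = expected_value(commute)
spread_minutes = standard_deviation(commute)
chance_on_time = probability_at_most(commute, 30)

print(f"Expected commute: {mean_minutes:.1f} minutes")
print(f"Typical deviation from that: {spread_minutes:.1f} minutes")
print(f"Chance of arriving within 30 minutes: {chance_on_time:.0%}")
```

With these made-up numbers the expected commute is 28 minutes, even though no single trip ever takes exactly 28 minutes. The expected value describes the long-run average, and the standard deviation describes how far a typical trip strays from it.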
Once we understand probability, we can use it to make new predictions. This is what the field of statistics is all about: how can we take some observations about a system and learn from them? A collection of observations is what we call data, and there are many sources of data that we care about and can learn from. A summary of data is known as a statistic. It’s a little confusing, but a single statistic is one way to capture data, while statistics refers to this general field of math that studies data.
For instance, each time an election poll of 1,000 people is taken, it creates 1,000 observations, which make up our data. The proportion of respondents who say they will vote for one candidate is recorded: a single statistic that describes the data. The theory of statistics tells us how confident we can be that this proportion reflects the whole electorate, because it treats the choice of who was polled as random. While probability tells us how to measure the values an uncertain observation may take, statistics takes a collection of real observations we have already seen (and which are therefore no longer uncertain) and finds the random pattern that best explains them. Understanding statistics is a crucial part of being informed. It is easy to hear a statistic in the news and treat it as the absolute truth because a number is presented. But knowing the theory of statistics will let you recognize how this number was found, know how certain we are about it, and understand what that number can and cannot tell us.
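To make the poll example concrete, the sketch below computes a margin of error for a poll of 1,000 people. It assumes a simple random sample and uses the common normal approximation for a 95% confidence interval. The count of 520 supporters is a made-up figure, and the function names are hypothetical.

```python
import math

# 1.96 standard errors on either side covers about 95% of a normal curve.
Z_95 = 1.96


def margin_of_error(proportion, sample_size, z=Z_95):
    """Approximate margin of error for a polled proportion."""
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    return z * standard_error


def confidence_interval(supporters, sample_size, z=Z_95):
    """Return (low, high) bounds around the observed proportion."""
    proportion = supporters / sample_size
    margin = margin_of_error(proportion, sample_size, z)
    return proportion - margin, proportion + margin


# Hypothetical poll: 520 of 1,000 respondents favor one candidate.
low, high = confidence_interval(520, 1000)
print(f"Observed support: {520 / 1000:.1%}")
print(f"Approximate 95% interval: {low:.1%} to {high:.1%}")
```

In this made-up poll the margin of error is about three percentage points, so an observed 52% is consistent with true support anywhere from roughly 49% to 55%. That is why a headline number on its own can suggest more certainty than the data supports.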
Statistical Models
When people make predictions using statistics, they use what is called a statistical model. A statistical model is a way to take a new observation of a system and predict something we don’t know about it. For instance, a statistical model could look at an observation of a house (including its size, number of bedrooms, and neighborhood) and try to predict the price it will sell for. Classical statistical methods use relatively simple models, and we understand how they behave very well. This means that when we make a prediction using these models, we know how good the prediction is likely to be, and what sorts of errors we expect to make.
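As an illustration of a simple, classical model, here is a sketch that fits a straight line relating house size to sale price using ordinary least squares. It is deliberately simplified: it uses size alone, leaving out bedrooms and neighborhood, and the four houses in the data are invented for the example.

```python
import math

# Made-up training data: house sizes in square feet and sale prices in dollars.
sizes = [1000, 1500, 2000, 2500]
prices = [300_000, 400_000, 520_000, 580_000]


def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sizes, prices)


def predict(size):
    """Predicted sale price for a house of the given size."""
    return intercept + slope * size


# Root-mean-square error on the training data: a rough sense of how far
# off the line's predictions tend to be for houses like these.
typical_error = math.sqrt(
    sum((predict(x) - y) ** 2 for x, y in zip(sizes, prices)) / len(sizes)
)

print(f"Each extra square foot adds about ${slope:,.0f}")
print(f"Predicted price for 1,800 sq ft: ${predict(1800):,.0f}")
print(f"Typical error on the training data: ${typical_error:,.0f}")
```

Because the model is just a line, it is easy to read off what it has learned (a price per square foot) and to estimate how large its errors tend to be. That transparency is the kind of understanding the paragraph above attributes to classical statistical methods.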
However, researchers discovered that very complicated models can perform extremely well at making predictions of unknown things, even though we might not understand the errors that the models can make. These classes of models, such as neural networks, form what is known as machine learning, and their use has proliferated in the past couple of decades. Machine learning is far more interested in making new predictions using data than understanding what the statistics of the data are. While machine learning and statistics intersect in many ways, this is their main distinction. Machine learning is a part of a broader effort, known as artificial intelligence (AI), to create systems that can do many of the tasks that humans can do.
I described earlier how, when you make decisions about how to act, you implicitly rely on your knowledge about the randomness in the world to understand the probability of different possibilities. This knowledge has been built up from the many observations of the world you have already made. We use this knowledge to predict what the future will look like, and then we act on those predictions. This is exactly how you should think about AI systems: they recognize that there is some randomness in a system, they use data to create a model that tries to capture patterns in this randomness, and they use the model to make new predictions about what the system may look like. These predictions can take the form of a textual response to a prompt (like ChatGPT), an image that is generated from a piece of text (like DALL-E 2), or the choice of which ad Instagram shows you. The specific details of how to turn text responses or images into a model are complicated and required many great recent ideas, but the general process that AI uses mirrors how we interact with the world.
I hope it is apparent by now that probability and statistics are fundamental ways to think about the world. Even without writing down any math, it is possible to learn to think more probabilistically and better grapple with uncertainty and randomness. This will help you in your everyday life and allow you to better understand how AI works. As a great place to start, check out some of the books I recommend below. Happy learning!