# How will Machine Learning help me? Two hypothetical Machine Learning projects

The most important questions people have about machine learning are the hardest ones to answer. People don’t wonder about the nuts and bolts of gradient descent and neural networks even though there are thousands of great posts to answer those questions — they wonder what Machine Learning is going to do for them: *How is machine learning going to help me? What will an implementation look like at my organization?*

This post is going to take you through two realistic examples of how machine learning can make you better at whatever you do by increasing the efficiency of your programs and processes.

The first example is going to use machine learning to maximize the efficiency of a direct mail (pay-for-postage) marketing program. The second will use machine learning to increase the efficiency of an email program. My goal here is twofold: (1) get you to understand what I mean by *programs and processes* and (2) put real (well-faked but plausible) numbers behind the lofty language.

# Direct Mail Marketing

*The scenario:* We are a clothing company that sends out a magazine to its subscriber list. Every magazine printed and sent adds to the cost of the marketing program. We want to maximize our profit on this marketing venture. To that end, we want to figure out how to optimize the number of magazines we send. Let’s take a look at our costs and revenue.

Our costs are going to be a combination of Fixed Costs to develop and design the magazine and variable costs that depend on the number of magazines sent:

`Costs = Fixed Costs + Cost per Magazine * Number of Magazines`

Now we need to calculate our revenue due to magazines. This can be a bit tricky since we need to attribute sales to magazines sent, and not every magazine sent results in a purchase. We end up with three factors that determine our revenue:

`Revenue = Number of Magazines * $ Avg Purchase * Likelihood to Buy`

Let’s look at each in turn.

- Number of Magazines: the more we send, the more clothing we sell. The downside is costs increase as well.
- $ Avg. Purchase: the average amount of clothing bought, also a target for machine learning but not our focus in this example. Targeting this would be analogous to the second example in this post.
- Likelihood to Buy: this number is a probability between 0 and 1. Traditional forecasting would use a single probability for everyone (e.g., we expect 1% of people receiving the magazine to buy). A more advanced approach would break it down into a few demographic categories (e.g., of men over the age of 50, we expect 1% of them to buy). However, even breaking this down using demographics leaves a lot of room for improvement.
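To make the two formulas concrete, here is a minimal Python sketch of the cost, revenue, and profit calculation with a single flat Likelihood to Buy. The function names and every dollar figure are assumptions made up for illustration; they are not the numbers behind the charts in this post.

```python
def campaign_costs(fixed_costs, cost_per_magazine, n_magazines):
    """Costs = Fixed Costs + Cost per Magazine * Number of Magazines."""
    return fixed_costs + cost_per_magazine * n_magazines


def campaign_revenue(n_magazines, avg_purchase, likelihood_to_buy):
    """Revenue = Number of Magazines * $ Avg Purchase * Likelihood to Buy."""
    return n_magazines * avg_purchase * likelihood_to_buy


def campaign_profit(n_magazines, fixed_costs, cost_per_magazine,
                    avg_purchase, likelihood_to_buy):
    """Profit is simply revenue minus costs for a given number of sends."""
    revenue = campaign_revenue(n_magazines, avg_purchase, likelihood_to_buy)
    costs = campaign_costs(fixed_costs, cost_per_magazine, n_magazines)
    return revenue - costs


# Hypothetical inputs, chosen only for illustration.
example_profit = campaign_profit(
    n_magazines=50_000,
    fixed_costs=15_000.0,
    cost_per_magazine=1.00,
    avg_purchase=150.0,
    likelihood_to_buy=0.01,  # one flat probability applied to every subscriber
)
print(f"Forecast profit on 50,000 sends: ${example_profit:,.0f}")
```

Changing `likelihood_to_buy` from one flat number to a per-subscriber prediction is the only part of this calculation that machine learning touches.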

Machine learning is going to transform the revenue forecast from a broad brush to a pinpoint by learning an individualized Likelihood to Buy for each subscriber. Using past purchase history matched with demographic data, we can predict each subscriber’s Likelihood to Buy and improve our forecast.

## Traditional Approach

Now that we understand the information that goes into the decision to send out a mailer, namely Costs and Revenue, let’s evaluate the traditional and the machine learning approaches. In the traditional approach (below), profit increases the more magazines we send.

What we see above is the traditional balancing of fixed costs, variable costs, and a linear sales forecast. Because everyone is assumed to have the same Likelihood to Buy (see below), we never reach a point where the next magazine is inefficient. Once we have eclipsed our fixed costs, this model says any magazine sent after that point will make money.

In order for this project to make sense for an organization, we just need to be able to fund enough sends to get past the break even point. If we have the budget for that, this project should definitely be funded.

But wait, this model says the more magazines we print the more money we make. Why don’t we print a million magazines instead of 50,000? This model does not take into account the fact that not everyone has the same likelihood to buy. Because we can’t distinguish between high and low likelihood subscribers, every piece of mail is calculated to have roughly the same value, to whomever it is sent. This approach can never recommend sending less mail.
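The "more is always better" behavior falls straight out of the arithmetic. The sketch below assumes one flat Likelihood to Buy and uses hypothetical numbers (the function names and dollar figures are mine, not taken from the charts). Each magazine earns the same margin, so profit is a straight line that crosses zero once and then climbs forever.

```python
import math


def break_even_sends(fixed_costs, cost_per_magazine, avg_purchase, flat_likelihood):
    """Smallest number of sends at which flat-likelihood profit is non-negative."""
    margin_per_send = avg_purchase * flat_likelihood - cost_per_magazine
    if margin_per_send <= 0:
        return None  # every magazine loses money, so the program never breaks even
    return math.ceil(fixed_costs / margin_per_send)


def flat_profit(n_magazines, fixed_costs, cost_per_magazine, avg_purchase, flat_likelihood):
    """Profit when every magazine is assumed to have the same expected value."""
    margin_per_send = avg_purchase * flat_likelihood - cost_per_magazine
    return n_magazines * margin_per_send - fixed_costs


# Hypothetical inputs, chosen only for illustration.
ASSUMED = dict(fixed_costs=15_000.0, cost_per_magazine=1.00,
               avg_purchase=150.0, flat_likelihood=0.01)

break_even = break_even_sends(**ASSUMED)
profits = [flat_profit(n, **ASSUMED) for n in (50_000, 100_000, 1_000_000)]

print("Break-even sends:", break_even)
print("Profit at 50K, 100K, 1M sends:", profits)
```

Under these assumptions the forecast profit keeps growing with every extra send, which is exactly why this model can never recommend sending less mail.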

Now let’s imagine a Machine Learning approach to this scenario.

## Machine Learning Approach

With machine learning, instead of the Likelihood to Buy being an average across all our subscribers, we can predict the probability that each individual subscriber will buy.

The end result would look like the chart to the left. For every individual, we have a probability between 0 and 1 that they will make the purchase. We have modeled their **Likelihood to Buy**, the key to understanding revenue in our formula above.

Now we can rank our subscribers by their likelihood to buy. We could even create an arbitrary cutoff and say we are not comfortable sending to people with less than an X% chance of buying.

The key takeaway is that now that we know a lot more about the Likelihood to Buy, we can rank our subscribers and send to the most likely first. The first 5,000 pieces of mail go out to the most likely buyers, the second 5,000 to the second most likely set, and so on. Each successive set of sends is expected to bring diminishing returns. Now we can calculate the incremental value from each cohort of sends and stop when an additional cohort no longer brings in more revenue than it costs to send.

Our final profit cost curve will look like the below:

Profit rises quickly for the first 4 groups as we overcome fixed costs, but shrinks once we get past 20K sends. Subscribers in the later cohorts are much less likely to buy and so bring in less profit. The cohort from 20K to 25K is the first to bring in less revenue than it costs.

This story becomes more clear when looking at the incremental subscriber in each cohort.

The costs are the same for every cohort except for the first one which carries all the fixed costs. However, because we are able to distinguish our likely customers from our unlikely ones, we are able to stop sending at the point that the cohort makes no profit. So the 4th cohort makes a profit while the 5th does not. We would stop sending there, maximizing our profit.
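The rank-then-stop logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the model behind the charts: the predicted likelihoods are hand-written stand-ins for a model's output, the cohort size of 5 stands in for 5,000, and all dollar figures are hypothetical.

```python
def plan_sends(likelihoods, avg_purchase, cost_per_magazine, fixed_costs, cohort_size):
    """Rank subscribers by predicted likelihood and send cohort by cohort.

    Stops at the first cohort whose expected revenue does not cover its
    variable cost. Returns (number_of_sends, expected_profit).
    """
    ranked = sorted(likelihoods, reverse=True)  # most likely buyers first
    n_sends = 0
    profit = -fixed_costs  # fixed costs are carried by the first cohort
    for start in range(0, len(ranked), cohort_size):
        cohort = ranked[start:start + cohort_size]
        incremental_revenue = avg_purchase * sum(cohort)
        incremental_cost = cost_per_magazine * len(cohort)
        if incremental_revenue <= incremental_cost:
            break  # later cohorts are even less likely to buy, so stop here
        n_sends += len(cohort)
        profit += incremental_revenue - incremental_cost
    return n_sends, profit


# Hypothetical model output: six tiers of five subscribers each,
# deliberately listed worst-first so the ranking step has work to do.
tiers = [0.05, 0.04, 0.03, 0.02, 0.005, 0.001]
predicted = [p for p in reversed(tiers) for _ in range(5)]

n_sends, expected_profit = plan_sends(
    predicted, avg_purchase=100.0, cost_per_magazine=1.00,
    fixed_costs=5.0, cohort_size=5,
)
print(f"Send to the top {n_sends} subscribers, expected profit ${expected_profit:.2f}")
```

With these made-up inputs the first four cohorts each earn more than they cost and the fifth does not, so the plan stops after four cohorts, mirroring the shape of the story above.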

In conclusion, by only sending costly magazines to our customers with the highest propensity to buy from them, we were able to change the model of how our magazines get sent. Using machine learning we predicted profits of $18K on 20,000 sends, a far cry from the $5K profit on 65K sends of the traditional approach. In a nutshell: *by only sending to our most likely buyers, we minimize costs, maximize profit, and still capture most of the revenue of a traditional maximalist cost approach.*

# Daily Email Program

In an email program, there are no variable costs to consider. Once you decide to have the program, you send to everyone as long as it keeps the unsubscribe rate low and inbox deliverability high. But we can still optimize the program by asking for something that closely aligns with what the subscriber wants. That could mean asking them to buy something they are likely interested in, or asking for the correct size of donation. The more closely we align what we ask with what a subscriber is likely to do, the more likely we are to get what we ask for.

## Traditional Approach

Let’s say we are a non-profit that is asking for donations via email on a recurring basis. Every week, an email goes out asking every subscriber to donate with the message of the week. Right now, we ask for a $10 donation every time that email goes out. We want to maximize our returns to our emails.

Let’s examine a group of 10 representative people from our pool of subscribers. We just asked each of them for $10.

We ask everyone for the same amount of money for a total of $100 asked for. We get $40 from 4 donations. (In the real world, most people will not donate, but I exaggerate to show examples). To generalize, given a $10 ask we expect:

- The most common (modal) donation is exactly what we asked for, $10
- Few people donate more than $10
- Average donation is close to $10
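As a quick check on that arithmetic, here is a small Python sketch that summarizes a flat-ask campaign. The totals ($100 asked, $40 raised from 4 donations) come from the example above; the individual gift amounts are hypothetical, picked only so that they add up to those totals.

```python
from statistics import mean, mode


def summarize_campaign(asks, donations):
    """Summarize one email send. Assumes at least one person donated."""
    gifts = [d for d in donations if d > 0]
    return {
        "total_asked": sum(asks),
        "total_raised": sum(gifts),
        "donation_rate": len(gifts) / len(asks),
        "most_common_gift": mode(gifts),
        "average_gift": mean(gifts),
    }


flat_asks = [10] * 10  # everyone is asked for $10
# Hypothetical individual gifts that sum to the $40 in the example.
flat_donations = [10, 0, 0, 5, 0, 10, 0, 15, 0, 0]

flat_summary = summarize_campaign(flat_asks, flat_donations)
for name, value in flat_summary.items():
    print(f"{name}: {value}")
```

The same function works unchanged on a list of personalized asks, which makes it easy to compare the two approaches side by side.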

## Machine Learning Approach

Now let’s use machine learning to figure out how much to ask each person for. Based on their demographics and their donation history, we can predict how much a person is likely to give on their next donation. Our asks to our representative group of 10 look like the below:

Here, the ask is customized to the individual. People likely to donate more are asked for more and people likely to donate less are asked for less. Overall, we ask for $28 more but only ask 3 people for $10 or more.

We see the same qualities in this group of 10 people that we saw in the traditional approach, but we can substitute the Machine Learning ask for $10.

- The most common donation is the same as the ask
- Few people donate more than what they are asked for
- The average donation is close to the average ask

This list is nearly exactly the same as above, even though we have totally changed what we are asking people. So what gives? How is this making the program more efficient? We increased the total amount asked for, but what if fewer people donate?

These are valid concerns. However, our machine learning model has a big advantage over a flat ask: it matches subscriber preferences as closely as possible, so more subscribers are likely to donate.

## How Machine Learning Raises the Donation Curve

With or without a model, the higher our ask, the fewer donations it will bring in. A low average ask gets more donations than a high average ask, which makes intuitive sense. The advantage of ML is that it raises the donation rate at every ask. You can see this to the left, where the red Traditional curve sits below the blue Machine Learning curve for all ask amounts. That shift comes from closely matching individual preferences, which gives Machine Learning the advantage. Even at higher average asks overall, people with smaller budgets are getting asked for less.

This difference in donation rates between approaches will also lead to different decisions on where revenue is maximized. As seen above, the model can maintain a higher average ask for the same donation rate as the traditional approach. Since a higher ask will lead to a higher average donation, the machine learning model maximizes revenue per send at a higher average ask than the traditional approach and makes more money overall.
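Here is a minimal Python sketch of why a higher donation curve moves the revenue-maximizing ask upward. It rests on two loud assumptions that are mine, not the post's: the donation rate falls linearly as the average ask rises, and donors give exactly what they are asked for. The curve parameters are hypothetical.

```python
def donation_rate(ask, base_rate, drop_per_dollar):
    """Assumed linear curve: the rate falls by drop_per_dollar for each $1 asked."""
    return max(0.0, base_rate - drop_per_dollar * ask)


def revenue_per_send(ask, base_rate, drop_per_dollar):
    """Expected dollars per email, assuming donors give exactly the ask."""
    return ask * donation_rate(ask, base_rate, drop_per_dollar)


def best_ask(candidate_asks, base_rate, drop_per_dollar):
    """Average ask with the highest expected revenue per send."""
    return max(candidate_asks,
               key=lambda ask: revenue_per_send(ask, base_rate, drop_per_dollar))


CANDIDATE_ASKS = range(1, 31)  # whole-dollar average asks from $1 to $30

# Hypothetical curves: same slope, but the ML curve sits higher at every ask.
traditional_best = best_ask(CANDIDATE_ASKS, base_rate=0.40, drop_per_dollar=0.02)
ml_best = best_ask(CANDIDATE_ASKS, base_rate=0.52, drop_per_dollar=0.02)

print("Traditional best average ask:", traditional_best)
print("Machine learning best average ask:", ml_best)
```

Under these assumptions the higher curve peaks at a larger average ask and a larger revenue per send. Real donation curves are unlikely to be straight lines, so treat this as a picture of the mechanism rather than a forecast.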

Going back to our example above, we see that the average ask increased from $10 to $13, the number of donations increased from 4 to 6, and the amount raised nearly doubled. That’s the power of matching your customers’ preferences in a nutshell: *more revenue for the same level of activity.*

Obviously these are just example numbers but the trends hold. Your average donation or sale is directly correlated to how much you ask for. Asking for more will get more. But asking for more means getting fewer donations or sales. By using machine learning to match the ask to the subscriber, we can increase the donation rate at every suggested ask level and bring in more money overall. In the places I’ve worked I have seen both an increase in average donation and an increase in number of donations, just from being smarter about the ask. Machine learning gives us the tools to run a much more efficient program overall.

# Conclusions

Both of these examples could be applied to for-profit and non-profit contexts, for email, mail and phone. Really these are generalizable to whatever your challenge is. Any organization trying to market itself through individualizable media — that is anything but TV and internet ads — could benefit from this kind of process.

And there are many, many more things you can do! The sky is the limit. My goal here today is only to help someone who might have trouble conceptualizing how Machine Learning works at a business get a better idea, so they can start their own journey.