371. Framing ML Problems

Identifying the Task

How you frame an ML problem can significantly affect how difficult it is to solve. Identifying the right task type up front helps you avoid a framing that makes the problem harder than it needs to be.

Let’s say we want to predict what app a phone user would use next.

First, recall that an ML problem is defined by its inputs, its outputs, and the objective function that guides the learning process.

Framing as Multi-Class Classification

Given that definition, one setup is to frame this as a multi-class classification task.
The inputs are “User Info” and “User Environment”, and the model has one output per app, so the size of the output layer equals the number of apps. This is a poor choice of framing: whenever the user installs or deletes an app, the number of outputs no longer matches the set of apps, and the model has to be retrained.
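To make the limitation concrete, here is a minimal sketch of the multi-class framing. It is not from the reference; the class name MultiClassAppModel, the feature values, the app names, and the random weights (standing in for a trained model) are all hypothetical. The point it illustrates is that the output has one slot per app known at training time.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class MultiClassAppModel:
    """Toy linear classifier with one weight row per app (hypothetical)."""

    def __init__(self, n_features, apps, seed=0):
        rng = random.Random(seed)
        self.apps = list(apps)
        # Random weights stand in for weights learned during training.
        self.weights = [
            [rng.uniform(-1, 1) for _ in range(n_features)] for _ in self.apps
        ]

    def predict_proba(self, features):
        scores = [
            sum(w * x for w, x in zip(row, features)) for row in self.weights
        ]
        return dict(zip(self.apps, softmax(scores)))


# Hypothetical "User Info" + "User Environment" features.
features = [0.2, 0.7, 0.1]
model = MultiClassAppModel(n_features=3, apps=["mail", "maps", "music"])
probs = model.predict_proba(features)

# The output covers only the apps the model was built with.
# A newly installed app has no weight row, so it cannot be scored
# until the model is rebuilt and retrained with a larger output.
new_app_supported = "podcasts" in probs
```

In this sketch, scoring "podcasts" would require adding a weight row and retraining, which is exactly the maintenance cost described above.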

Framing as Regression

A better way is to frame this as a regression task.
The inputs are “User Info”, “User Environment”, and “App Info”, and the model has a single output: the estimated likelihood, expressed as a percentage, that the user will open that particular app next. With this framing, there is no need to retrain the model whenever the user downloads a new app. We simply run the same model once for each app currently installed and compare the predicted values.
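A minimal sketch of the regression framing follows, under stated assumptions: the function usage_score, the hand-picked WEIGHTS and BIAS (standing in for a trained model), and all feature values and app names are hypothetical. It shows that the model takes one app's features as part of its input and returns a single number, so a new app only needs its own feature vector.

```python
import math


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical, hand-picked parameters standing in for a trained model.
# Layout: 1 user feature, 2 environment features, 2 app features.
WEIGHTS = [0.5, -0.3, 0.8, 1.2, -0.4]
BIAS = -0.1


def usage_score(user, env, app):
    """Return one value in (0, 1) for a single (user, environment, app) triple."""
    x = user + env + app  # concatenate the three feature lists
    z = BIAS + sum(w * v for w, v in zip(WEIGHTS, x))
    return sigmoid(z)


user = [0.2]       # hypothetical "User Info"
env = [0.7, 0.1]   # hypothetical "User Environment"

# Hypothetical "App Info" for each installed app.
# "podcasts" was just installed; no retraining is needed to score it.
app_features = {
    "mail": [0.9, 0.1],
    "maps": [0.3, 0.6],
    "podcasts": [0.5, 0.5],
}

# Run the same model once per installed app, then pick the highest score.
scores = {
    name: usage_score(user, env, feats) for name, feats in app_features.items()
}
predicted_next = max(scores, key=scores.get)
```

The trade-off in this design is one model call per installed app at prediction time, in exchange for a model whose output size never changes.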

Reference: Designing Machine Learning Systems