Category Machine Learning

390. Monitoring GPU Metrics

▮ Monitoring After you have successfully deployed your machine learning model, it is crucial to collect not only metrics such as throughput and latency but also GPU usage and utilization (since your ML model is most likely to…
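As a small illustration of collecting GPU usage, here is a sketch that parses the CSV output of `nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader,nounits`. The sample string stands in for the command's real output; in production you would run the command (or use NVML bindings) on a schedule and ship the numbers to your metrics backend.

```python
# Sketch: parse GPU utilization (%) and memory used (MiB) per GPU from
# nvidia-smi CSV output. The `sample` string is illustrative stand-in data.

def parse_gpu_metrics(csv_output: str):
    """Return a list of (gpu_utilization_pct, memory_used_mib) tuples,
    one per GPU line in the CSV output."""
    metrics = []
    for line in csv_output.strip().splitlines():
        util, mem = (field.strip() for field in line.split(","))
        metrics.append((int(util), int(mem)))
    return metrics

sample = "87, 10240\n12, 2048"    # two GPUs: utilization %, memory MiB
print(parse_gpu_metrics(sample))  # [(87, 10240), (12, 2048)]
```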

389. Basic Error Analysis

▮ Error Analysis Error analysis is the process of examining the dev-set examples that your ML model misclassified to understand the underlying causes of error. This can help you decide what to prioritize and the direction the project should…
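A minimal sketch of this process: after manually reviewing misclassified dev-set examples and tagging each with a suspected cause (the examples and cause labels below are hypothetical), tallying the causes shows which fix would recover the most errors.

```python
from collections import Counter

# Hypothetical misclassified dev-set examples, each hand-tagged with a
# suspected error cause during manual review.
misclassified = [
    {"id": 1, "cause": "blurry image"},
    {"id": 2, "cause": "mislabeled"},
    {"id": 3, "cause": "blurry image"},
    {"id": 4, "cause": "rare class"},
    {"id": 5, "cause": "blurry image"},
]

# Counting causes reveals which one dominates and deserves priority.
counts = Counter(example["cause"] for example in misclassified)
for cause, n in counts.most_common():
    print(f"{cause}: {n}/{len(misclassified)}")
```

Here "blurry image" accounts for 3 of 5 errors, so improving robustness to blur would be the highest-leverage fix.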

388. Build or Buy ML Infrastructure

▮ Build Or Buy? When setting up an ML infrastructure, at one extreme, a company can outsource everything except data movement. At the other extreme, a company can build everything and maintain all the required infrastructure itself. However, most companies are…

387. Model Explainability With SHAP

▮ Model Explainability The inside of an ML model can easily become a black box. Increasing the explainability of an ML model can help developers debug and also communicate to the client why the model is predicting a certain outcome. Here…
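In practice you would use the `shap` library, but to illustrate the idea behind it, here is a from-scratch sketch that computes exact Shapley values for a tiny model by enumerating all feature coalitions (feasible only for a handful of features; the toy linear model and baseline are illustrative).

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values by enumerating all feature coalitions.
    Features outside a coalition are replaced by their baseline value."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for subset in combinations(others, size):
                # Standard Shapley weight: |S|! (n-|S|-1)! / n!
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = [x[j] if j in subset or j == i else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi

# Toy linear model: for linear models, feature i's Shapley value is
# w_i * (x_i - baseline_i), which the enumeration reproduces exactly.
predict = lambda x: 2.0 * x[0] + 3.0 * x[1]
print(shapley_values(predict, [1.0, 1.0], [0.0, 0.0]))  # [2.0, 3.0]
```

Each value says how much that feature moved the prediction away from the baseline, averaged over all orders in which features could be revealed — which is what makes the attribution easy to communicate.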

386. Packaging ML Models

▮ Advantage Of Packaging Packaging ML models means getting a model into a container to take advantage of the following. Fig.1 – Container Deployment You can run a container locally as long as a container runtime is installed You can easily…

385. The 4 Layers Of MLOps

▮ Hierarchy Of Needs Every ML system works efficiently only when the basic foundation exists. Here are the 4 layers required to construct true ML automation (MLOps). The ML engineering hierarchy of needs is shown in the figure below. Fig.1 –…

384. Types Of Labeling

▮ Labeling Most ML models in production today use supervised learning. This means they all need labeled data to learn a task, and there will rarely be a situation where labeled data is overwhelmingly abundant. Here are…
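One way to cope with scarce labels is programmatic (weak) labeling: hand-written heuristics each vote on a label or abstain, and the votes are combined. The rules and texts below are illustrative, and real systems resolve conflicting votes with a learned label model rather than this simple majority vote.

```python
# Sketch of weak labeling: heuristic labeling functions vote or abstain.
ABSTAIN = None

def rule_spam_keywords(text):
    return "spam" if "free money" in text.lower() else ABSTAIN

def rule_trusted_sender(text):
    return "ham" if text.startswith("From: alice") else ABSTAIN

def weak_label(text, rules):
    """Majority vote over non-abstaining rules; ABSTAIN if none fire."""
    votes = [r(text) for r in rules if r(text) is not ABSTAIN]
    return max(set(votes), key=votes.count) if votes else ABSTAIN

rules = [rule_spam_keywords, rule_trusted_sender]
print(weak_label("Claim your FREE MONEY now!", rules))  # spam
```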

383. Test In Production

▮ After Deployment Previously, I shared how to evaluate your model offline, before production. So for this post, I’d like to share several model evaluation methods for after deployment. Blog: Model Offline Evaluation Reference: Designing Machine Learning Systems ▮ Shadow Deployment…
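To make the shadow-deployment idea concrete, here is a minimal sketch (all names are illustrative): every request goes to both the current model and the candidate, but only the current model's prediction is returned to the user, while the candidate's predictions are logged for offline comparison.

```python
# Sketch of shadow deployment: the shadow model sees production traffic
# but its predictions are only logged, never served.
shadow_log = []

def handle_request(features, primary_model, shadow_model):
    primary_pred = primary_model(features)
    shadow_pred = shadow_model(features)  # never shown to the user
    shadow_log.append((features, primary_pred, shadow_pred))
    return primary_pred

# Toy stand-in models for illustration.
primary = lambda x: "cat"
shadow = lambda x: "dog"
print(handle_request([0.1, 0.2], primary, shadow))  # cat
```

This doubles inference cost per request, but it is the safest way to evaluate a new model on real traffic because users never see its output.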

382. Continual Learning

▮ After Deployment After deploying our ML models, we want to continually update them so they can adapt whenever the data distribution shifts. This is why being able to “continually learn” by setting up an infrastructure in a way…
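As a toy illustration of the "continually learn" idea, here is an online-SGD sketch: a linear model is updated one sample at a time as new data streams in, rather than retrained from scratch. The synthetic stream and learning rate are illustrative only; real continual-learning infrastructure adds data validation, evaluation gates, and rollback around updates like this.

```python
# Sketch of online learning: update a linear model on each new sample.
def sgd_step(w, b, x, y, lr=0.1):
    """One SGD update on squared error for a single (x, y) sample."""
    pred = sum(wi * xi for wi, xi in zip(w, x)) + b
    err = pred - y
    w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    b = b - lr * err
    return w, b

w, b = [0.0], 0.0
stream = [([1.0], 2.0), ([2.0], 4.0), ([3.0], 6.0)] * 200  # y = 2x
for x, y in stream:
    w, b = sgd_step(w, b, x, y)
print(round(w[0], 2))  # converges near 2.0
```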

381. ML Model Monitoring

▮ ML-Specific Metrics Here are the 4 main ML-specific metrics to monitor after you’ve deployed your model. Fig.1 – The 4 Metrics Accuracy-related Metrics You should always log and track any type of user feedback. If you’re at the phase…
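For the accuracy-related metrics, a simple sketch of tracking a proxy from logged user feedback (the feedback signal here is hypothetical — e.g. whether a user accepted a recommendation) over a sliding window of recent predictions:

```python
from collections import deque

# Sliding window over the most recent feedback events; old events fall
# off automatically once the window is full.
window = deque(maxlen=1000)

def record_feedback(accepted: bool):
    window.append(accepted)

def acceptance_rate() -> float:
    """Fraction of recent predictions the user accepted (accuracy proxy)."""
    return sum(window) / len(window) if window else 0.0

for accepted in [True, True, False, True]:
    record_feedback(accepted)
print(f"{acceptance_rate():.2f}")  # 0.75
```

A monitoring system would export this rate as a gauge and alert when it drops below a baseline, since a falling acceptance rate often signals data or concept drift before labeled accuracy is available.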