Category Data

336. RandAugment

Data Augmentation Recent work has shown that data augmentation has the potential to significantly improve the generalization of deep learning models. Recently, automated augmentation strategies have led to state-of-the-art results in computer vision tasks. However, when it comes to adopting…

227. Polynomial Features

Adding Linear Complexity When we want to train a model, it is easy to imagine that straight lines alone cannot capture the patterns in the training data. Polynomial features are useful when you want to add more…
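As a rough illustration of what polynomial features are, here is a minimal pure-Python sketch that expands one sample into all products of its features up to a given degree. The function name `polynomial_features` and its parameters are made up for this example; in practice you would more likely reach for a library class such as scikit-learn's `PolynomialFeatures`.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_features(row, degree=2, include_bias=True):
    """Expand one sample into all monomials of its features up to `degree`.

    Hypothetical helper for illustration only, not a library API.
    """
    features = [1.0] if include_bias else []
    for d in range(1, degree + 1):
        # Every multiset of d feature indices gives one monomial, e.g. (0, 1) -> a*b
        for combo in combinations_with_replacement(range(len(row)), d):
            features.append(prod(row[i] for i in combo))
    return features


# For a sample [a, b] = [2, 3] and degree 2 the terms are 1, a, b, a^2, a*b, b^2
print(polynomial_features([2.0, 3.0]))  # [1.0, 2.0, 3.0, 4.0, 6.0, 9.0]
```

A linear model trained on the expanded features can then fit curved patterns in the original inputs, because the curvature lives in the features rather than in the model.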

195. Creating Your Own Dataset

Creating Your Original Dataset
Here is one way you can create your own dataset to train an object detection model.
Install jmd_imagescraper
    pip install jmd_imagescraper
Get Images
    #Imports
    from jmd_imagescraper.core import *
    from pathlib import Path
    #Set Paths
    root_path =…

190. Data-Level Methods

Data Imbalance Data-level methods are used when your training data is imbalanced. Methods Upsample: This is where you add new instances of the minority class to balance the data. Downsample: This is where you delete data from the majority class to…
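The two methods can be sketched in plain Python as follows. This is only one possible reading: here upsampling means randomly duplicating existing minority instances (rather than synthesizing new ones), and downsampling means randomly dropping majority instances. The helper names `upsample` and `downsample` are hypothetical.

```python
import random
from collections import defaultdict


def _group_by_label(samples, labels):
    """Collect the samples belonging to each class label."""
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[y].append(x)
    return groups


def upsample(samples, labels, seed=0):
    """Duplicate random instances until every class is as large as the biggest one."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = max(len(g) for g in groups.values())
    balanced = []
    for y, group in groups.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        balanced.extend((x, y) for x in group + extra)
    return balanced


def downsample(samples, labels, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = min(len(g) for g in groups.values())
    balanced = []
    for y, group in groups.items():
        balanced.extend((x, y) for x in rng.sample(group, target))
    return balanced


# 8 majority samples (label 0) and 2 minority samples (label 1)
samples = list(range(10))
labels = [0] * 8 + [1] * 2
print(len(upsample(samples, labels)), len(downsample(samples, labels)))  # 16 4
```

Upsampling keeps all the data but repeats minority instances, while downsampling throws majority data away, so which one fits depends on how much data you can afford to lose.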

118. Pre-Fetching Data

Pre-fetching your data can make your data pre-processing pipeline smoother. Without pre-fetching, the CPU would wait for the process on the GPU to end, and only then start to prepare the next data. You can do this process in parallel…
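To make the idea concrete, here is a minimal sketch of pre-fetching using only the Python standard library: a background thread prepares the next items while the consumer is still busy with the current one. The `prefetch` helper and the `load_batches` generator are hypothetical names for this example, and the `time.sleep` calls merely stand in for real pre-processing and training work. Deep learning frameworks usually ship this built in (for example the `prefetch_factor` argument of PyTorch's `DataLoader`), so treat this as an illustration of the mechanism rather than something to use as-is.

```python
import queue
import threading
import time


def prefetch(iterable, buffer_size=2):
    """Yield items from `iterable`, preparing up to `buffer_size` ahead in a thread."""
    buffer = queue.Queue(maxsize=buffer_size)
    sentinel = object()  # marks the end of the stream

    def producer():
        try:
            for item in iterable:
                buffer.put(item)  # blocks while the buffer is full
        finally:
            buffer.put(sentinel)  # always signal the end, even on error

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buffer.get()
        if item is sentinel:
            return
        yield item


def load_batches(n):
    """Hypothetical data source; the sleep stands in for CPU pre-processing."""
    for i in range(n):
        time.sleep(0.01)
        yield i


for batch in prefetch(load_batches(5)):
    time.sleep(0.01)  # stand-in for the training step on the GPU
```

While the loop body is working on one batch, the producer thread is already preparing the next one, so the two waits overlap instead of adding up.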

116. Active Machine Learning

Labeling data is time-consuming and boring. Active Machine Learning may help reduce the labeling effort to 10-20%. Fig.1 – Active Learning Workflow The basic workflow is as follows: 1. Label Data Partially 2. Train Model only with the labeled…

113. Check Your Data..

When I was evaluating the model, both the F1 score and the Jaccard score were, for some reason, decreasing as the model finished more epochs. (Which is quite insane.) I’ve been checking the dimensions of the variables I was using to calculate…

93. Dataset ⇒ DataLoader Pipeline in Pytorch

Here is one way to prepare your data using DataLoader in PyTorch. 1. Create Custom Dataset Class When using a custom dataset, the following three methods have to be overridden. class Dataset: #Initialize the dataset with the input data and…
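As a minimal sketch of the three methods a custom map-style dataset usually defines (`__init__`, `__len__`, `__getitem__`), assuming the data is a list of inputs with matching labels: the class name `PairDataset` is made up, and it is written as a plain Python class so it runs without PyTorch installed. In a real project it would subclass `torch.utils.data.Dataset`.

```python
class PairDataset:
    """Map-style dataset over (input, label) pairs.

    Hypothetical example class. In PyTorch this would subclass
    torch.utils.data.Dataset; it is kept torch-free here on purpose.
    """

    def __init__(self, inputs, labels):
        # Initialize the dataset with the input data and its labels
        if len(inputs) != len(labels):
            raise ValueError("inputs and labels must have the same length")
        self.inputs = inputs
        self.labels = labels

    def __len__(self):
        # Number of samples, used to know which indices are valid
        return len(self.inputs)

    def __getitem__(self, index):
        # Return one (input, label) pair for the given index
        return self.inputs[index], self.labels[index]


dataset = PairDataset([[1, 2], [3, 4], [5, 6]], [0, 1, 0])
print(len(dataset), dataset[1])  # 3 ([3, 4], 1)
```

Any pre-processing that should happen per sample (decoding an image, converting to a tensor) would typically go inside `__getitem__`.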

92. DataLoader in Pytorch

When you want to load your data for training, the data preparation pipeline is as follows: randomly shuffle your data, turn it into batches, and iterate. Although you can manually do this, when the data size becomes large this can…
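For reference, the manual version of those three steps can be sketched in a few lines of plain Python. The function name `iterate_batches` is hypothetical; the rough PyTorch equivalent would be `DataLoader(dataset, batch_size=..., shuffle=True)`.

```python
import random


def iterate_batches(data, batch_size, seed=None):
    """Shuffle `data`, split it into batches, and yield them one by one."""
    # 1. Randomly shuffle (indices, so the original list is left untouched)
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)
    # 2. Turn into batches, 3. iterate; the last batch may be smaller
    for start in range(0, len(indices), batch_size):
        yield [data[i] for i in indices[start:start + batch_size]]


for batch in iterate_batches(list(range(10)), batch_size=4, seed=0):
    print(len(batch))  # prints 4, 4, 2
```

This version builds every batch in the main process, which is exactly the part a dedicated loader can take over once the data gets large.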

83. Precision/Recall Tradeoff

How do you decide on a threshold for, let’s say, classification? Raising the threshold generally lowers the recall but raises the precision, and vice versa. This is called the Precision/Recall Tradeoff. One way is to plot all possible thresholds.…
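A small sketch of the tradeoff, assuming binary labels (1 = positive) and a model that outputs one score per sample. The helper `precision_recall_at` and the toy scores are made up for illustration, and returning a precision of 1.0 when nothing is predicted positive is just a convention chosen here.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when predicting positive for score >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    # Convention (assumption): precision is 1.0 if no sample is predicted positive
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


scores = [0.1, 0.4, 0.35, 0.8]  # toy model outputs
labels = [0, 0, 1, 1]

# Low threshold: both positives are found, but one negative slips in
print(precision_recall_at(scores, labels, 0.3))
# High threshold: no false positives, but one positive is missed
print(precision_recall_at(scores, labels, 0.5))  # (1.0, 0.5)
```

Computing these two numbers over a range of thresholds gives the points for a precision/recall plot, from which a threshold can be picked.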