93. Dataset ⇒ DataLoader Pipeline in PyTorch

Here is one way to prepare your data using DataLoader in PyTorch.

1. Create Custom Dataset Class
When using a custom dataset, the following three methods have to be implemented: __init__, __getitem__, and __len__.

from torch.utils.data import Dataset

class CustomDataset(Dataset):
    # Initialize the dataset with the input data and the corresponding labels
    def __init__(self):
        self.data = [1, 2, 3, 4, 5, 6]
        self.label = [1, 0, 0, 0, 1, 1]

    # Called whenever you use object[index] to access an element of the dataset
    def __getitem__(self, index):
        return self.data[index], self.label[index]

    # Returns the number of samples in the dataset
    def __len__(self):
        return len(self.data)

# Now you can create an instance of your dataset
prepared_dataset = CustomDataset()
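
To make the role of the two special methods concrete, here is a minimal sketch of the indexing and length protocol that a map-style dataset relies on. The class name ToyDataset is a hypothetical stand-in for the dataset above, and it is written as a plain Python class (an assumption made only so the snippet runs without torch installed).

```python
# Hypothetical stand-in for the dataset above; no torch import is needed
# to see how indexing and len() reach the two special methods.
class ToyDataset:
    def __init__(self):
        self.data = [1, 2, 3, 4, 5, 6]
        self.label = [1, 0, 0, 0, 1, 1]

    def __getitem__(self, index):
        # Return one (input, label) pair
        return self.data[index], self.label[index]

    def __len__(self):
        # Number of samples in the dataset
        return len(self.data)


toy = ToyDataset()
first_sample = toy[0]    # calls toy.__getitem__(0)
num_samples = len(toy)   # calls toy.__len__()
print(first_sample, num_samples)  # (1, 1) 6
```

DataLoader uses these same two calls to work out how many samples exist and to fetch each one by index.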

2. Load Data Using DataLoader


from torch.utils.data import DataLoader

# These are some of the possible arguments, shown with their default values
loaded_data = DataLoader(prepared_dataset, batch_size=1, shuffle=False, sampler=None,
        batch_sampler=None, num_workers=0, collate_fn=None,
        pin_memory=False, drop_last=False, timeout=0,
        worker_init_fn=None, persistent_workers=False)
# prefetch_factor is left out because it only applies when num_workers > 0
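
Once the loader exists, the usual next step is to iterate over it in batches. The following is a minimal sketch, assuming the same toy data as above; the class name ToyDataset and the choice of batch_size=2 are illustrative assumptions, not part of the original post.

```python
from torch.utils.data import Dataset, DataLoader


# Hypothetical toy dataset mirroring the one defined earlier
class ToyDataset(Dataset):
    def __init__(self):
        self.data = [1, 2, 3, 4, 5, 6]
        self.label = [1, 0, 0, 0, 1, 1]

    def __getitem__(self, index):
        return self.data[index], self.label[index]

    def __len__(self):
        return len(self.data)


# batch_size=2 is an illustrative choice; shuffle=False keeps the order fixed
loader = DataLoader(ToyDataset(), batch_size=2, shuffle=False)

batches = []
for inputs, labels in loader:
    # The default collate function turns each group of Python ints into a 1-D tensor
    batches.append((inputs.tolist(), labels.tolist()))

print(batches)
```

With six samples and a batch size of 2, the loader yields three batches, each containing one tensor of inputs and one tensor of labels.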

References:
Blog Post by Manpreet Singh Minhas
