92. DataLoader in PyTorch

When you want to load your data for training, a typical data preparation pipeline looks like this:

  1. Randomly shuffle your data
  2. Split them into batches
  3. Iterate over the batches
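
Done by hand, those three steps might look something like this (a minimal sketch; the toy tensors and batch size are made up for illustration):

import torch

# Toy data: 10 samples with 3 features each (purely illustrative)
features = torch.randn(10, 3)
labels = torch.randint(0, 2, (10,))

batch_size = 4

# 1. randomly shuffle
perm = torch.randperm(len(features))
features, labels = features[perm], labels[perm]

# 2. split into batches and 3. iterate
for start in range(0, len(features), batch_size):
    x = features[start:start + batch_size]
    y = labels[start:start + batch_size]
    ...  # training step on (x, y)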

Although you can do all of this manually, it becomes tedious and time consuming as the dataset grows.
That is where DataLoader in PyTorch comes in handy. It handles the shuffling and batching for you, so you can go straight to iterating over batches.

Its constructor signature looks like this.

DataLoader(dataset, batch_size=1, shuffle=False, sampler=None,
           batch_sampler=None, num_workers=0, collate_fn=None,
           pin_memory=False, drop_last=False, timeout=0,
           worker_init_fn=None, *, prefetch_factor=2,
           persistent_workers=False)
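
In practice you usually set only a couple of these arguments. Here is a minimal usage sketch, reusing the toy tensors from above (TensorDataset is one convenient way to wrap in-memory tensors as a dataset):

import torch
from torch.utils.data import DataLoader, TensorDataset

features = torch.randn(10, 3)
labels = torch.randint(0, 2, (10,))

dataset = TensorDataset(features, labels)

# shuffling and batching now happen inside the loader
loader = DataLoader(dataset, batch_size=4, shuffle=True)

for x, y in loader:
    print(x.shape, y.shape)  # last batch holds only 2 samples unless drop_last=True

Setting num_workers greater than 0 moves batch preparation into background worker processes, which is where much of the time saving for large datasets comes from.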