ShardListDataset cache miss rate with wids #3

@omri-cavnue

Description

Hello,

I am currently implementing a pipeline with DDP and wids. My dataloaders look like the following:

chunk_size = math.ceil(dataset_length / int(os.environ["WORLD_SIZE"]))
dataset = wids.ShardListDataset(
    wids_map["shardlist"],
    cache_dir=cache_dir,
    keep=True,
).add_transform(preprocess)
loader = torch.utils.data.DataLoader(
    dataset,
    num_workers=num_workers,
    batch_size=batch_size,
    collate_fn=identify_fn,
    pin_memory=True,
    # note: the original condition was `if "train"`, which is always truthy
    # because "train" is a non-empty string; `split` here stands in for the
    # variable holding the mode name (e.g. "train" / "val")
    sampler=wids.DistributedChunkedSampler(dataset, chunksize=chunk_size, shuffle=True)
    if split == "train"
    else None,
)
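For context, the chunk_size computation above just divides the dataset evenly across DDP ranks, rounding up so every sample is covered. A minimal sketch with hypothetical numbers (dataset_length and world size are made up):

```python
import math

dataset_length = 1000  # hypothetical total sample count
world_size = 4         # hypothetical number of DDP ranks

# Each rank gets a contiguous chunk; ceil ensures the last rank
# still covers any remainder samples.
chunk_size = math.ceil(dataset_length / world_size)
print(chunk_size)  # 250

# With a length that doesn't divide evenly, the chunk is rounded up:
print(math.ceil(10 / 3))  # 4
```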

While everything seems to be working correctly, I am seeing warnings about the cache miss rate, e.g. Warning: ShardListDataset has a cache miss rate of 9901.0%%. I haven't found any documentation on this and was wondering what it signifies for ShardListDataset, given that the data is already cached locally on disk and cache_dir simply points there. I don't see how the cache could miss, yet training still runs through every iteration and epoch with no apparent performance impact.
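One possible reading of that number (purely a guess, not confirmed against the wids source): a miss rate over 100% looks like a fraction that was scaled to a percentage twice, i.e. 0.9901 printed as 9901.0%% instead of 99.0%. The doubled % in the message hints at a formatting slip. The arithmetic:

```python
# Hypothetical miss rate stored internally as a fraction of accesses.
miss_fraction = 0.9901

# Scaled once: the expected percentage display.
print(f"{miss_fraction * 100:.1f}%")        # 99.0%

# Scaled twice (fraction -> percent -> percent again): matches the warning.
print(f"{miss_fraction * 100 * 100:.1f}%%")  # 9901.0%%
```

If that is what is happening, the real miss rate would be ~99%, which would still be worth understanding, but the on-screen number itself would just be a display artifact.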

Metadata

Labels

bug (Something isn't working)
