Dataset splits do not have exactly the requested weights

**Short description**
When I split the `tf_flowers` dataset into subsplits with weights 10, 15 and 75, I actually get datasets of size 400, 600 and 2670. That is 10.9%, 16.3% and 72.8% of the 3670 examples, noticeably off from the 10%, 15% and 75% I requested.
Moreover, short of iterating through each dataset in full, there does not seem to be a way to find out the size of a split.
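
For reference, a quick check of the arithmetic in plain Python (no TFDS needed). The total of 3670 is simply the sum of the three observed sizes, and the variable names are only illustrative:

```python
# Observed subsplit sizes from the reproduction script.
observed = [400, 600, 2670]
requested = [10, 15, 75]

total = sum(observed)  # 3670
# Share of each subsplit, in percent, rounded to one decimal place.
shares = [round(100 * n / total, 1) for n in observed]

for want, got in zip(requested, shares):
    print(f"requested {want}%, got {got}%")
```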

**Environment information**
* Operating System: MacOSX 10.13.6
* Python version: 3.6.8
* `tfds-nightly` version: tfds-nightly-1.0.1.dev201903180105
* `tf-nightly-2.0-preview` version: tf-nightly-2.0-preview-2.0.0.dev20190319

**Reproduction instructions**

```python
import tensorflow_datasets as tfds

test_split, valid_split, train_split = tfds.Split.TRAIN.subsplit([10, 15, 75])

test_set = tfds.load("tf_flowers", split=test_split, as_supervised=True)
valid_set = tfds.load("tf_flowers", split=valid_split, as_supervised=True)
train_set = tfds.load("tf_flowers", split=train_split, as_supervised=True)

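def dataset_length(dataset):
    # Count the elements by iterating over the whole dataset (eager mode).
    count = 0
    for _ in dataset:  # each element is an (image, label) pair
        count += 1
    return count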

print(dataset_length(test_set)) # 400
print(dataset_length(valid_set)) # 600
print(dataset_length(train_set)) # 2670
```

**Expected behavior**
I expected split sizes with the requested ratios (rounded up or down to the nearest integer): in this example, the correct sizes should have been 367, 550 and 2753 (or 551 and 2752).
I also expect to be able to know the subsplit sizes without iterating through the datasets.
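
To make the expectation concrete, here is a minimal sketch of how the sizes could be computed. `expected_split_sizes` is a hypothetical helper, not a TFDS function, and the largest-remainder rounding is just one reasonable choice, not necessarily what TFDS does or should do:

```python
def expected_split_sizes(total, weights):
    """Hypothetical helper (not part of TFDS): divide `total` by integer `weights`."""
    weight_sum = sum(weights)
    # Integer arithmetic avoids floating-point rounding surprises.
    sizes = [total * w // weight_sum for w in weights]
    remainders = [total * w % weight_sum for w in weights]
    leftover = total - sum(sizes)
    # Give the leftover examples to the splits with the largest remainders;
    # sorted() is stable, so ties go to the earlier split.
    order = sorted(range(len(weights)), key=lambda i: -remainders[i])
    for i in order[:leftover]:
        sizes[i] += 1
    return sizes

print(expected_split_sizes(3670, [10, 15, 75]))  # [367, 551, 2752]
```

The sizes always sum to `total`, so no example is dropped or duplicated across subsplits.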

**Additional context**
TFDS is cool.


Dataset splits do not have exactly the requested weights #292
