Datasets List
General Description of a Dataset
Dataset Composition
A Benzina dataset is, in essence, a concatenation of inputs, targets and possibly filenames, together with an index over them.
Dataset Structure
A Benzina dataset is structured using the mp4 format:

| Box | Content |
|---|---|
| ftyp | Defines the compatibilities of the mp4 container |
| mdat | Concatenation in 2-3 blocks of the inputs, targets and possibly filenames |
| moov | Contains the metadata needed to load and present the raw data of mdat |
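To make the container layout above concrete, the following is a minimal sketch of how the top-level boxes of an mp4 (ISO base media file format) stream can be enumerated. It is not part of Benzina: the helper names `iter_top_level_boxes` and `make_box` are hypothetical, and the byte string it parses is synthetic rather than a real dataset file. It relies only on the standard box header, a 32-bit big-endian size followed by a 4-character type.

```python
import io
import struct


def iter_top_level_boxes(stream):
    """Yield (box_type, offset, size) for each top-level box in the stream."""
    stream.seek(0, io.SEEK_END)
    end = stream.tell()
    offset = 0
    while offset + 8 <= end:
        stream.seek(offset)
        size, box_type = struct.unpack(">I4s", stream.read(8))
        if size == 1:
            # A 64-bit "largesize" field follows the type field.
            (size,) = struct.unpack(">Q", stream.read(8))
        elif size == 0:
            # The box extends to the end of the stream.
            size = end - offset
        if size < 8:
            raise ValueError(f"invalid box size {size} at offset {offset}")
        yield box_type.decode("ascii"), offset, size
        offset += size


def make_box(box_type, payload=b""):
    """Build a minimal box: 32-bit size, 4-character type, then the payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic container that mimics the ftyp / mdat / moov order (illustration only).
sample = (
    make_box(b"ftyp", b"isom\x00\x00\x00\x00")
    + make_box(b"mdat", b"\x00" * 16)
    + make_box(b"moov")
)
boxes = list(iter_top_level_boxes(io.BytesIO(sample)))
for box_type, offset, size in boxes:
    print(f"{box_type}: offset={offset}, size={size}")
```

The same function can be pointed at a file opened in binary mode. Only the top level is walked here; the content of moov is itself a tree of nested boxes that would need a recursive walk.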
Dataset’s Input Sample Structure
A Benzina dataset’s input sample can also be structured using the mp4 format. Its structure is roughly the same as the dataset’s, except that mdat contains the raw concatenation of a single input, its target, possibly its filename and possibly a 512 x 512 thumbnail stream.
ImageNet 2012
ImageNet 2012 classification dataset. It contains two sizes of each image along with its classification target and filename:
- Resized high resolution images each with a smaller edge of at most 512 while preserving the aspect ratio. This set is accessed by referencing the bzna_input track of the input samples.
- Resized images each with a longer edge of at most 512 while preserving the aspect ratio. This set is accessed by referencing the bzna_thumb track of the input samples.
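The two resizing rules above differ only in which edge is capped. The sketch below computes the resulting dimensions for both cases. The function name `fit_edge` is hypothetical, and two details are assumptions rather than facts from the dataset's tooling: dimensions are rounded to the nearest pixel, and images already within the limit are left at their original size.

```python
def fit_edge(width, height, limit=512, edge="shorter"):
    """Return (width, height) scaled so the chosen edge is at most `limit`.

    The aspect ratio is preserved. Rounding to the nearest pixel and never
    upscaling are assumptions made for this illustration.
    """
    if edge not in ("shorter", "longer"):
        raise ValueError("edge must be 'shorter' or 'longer'")
    reference = min(width, height) if edge == "shorter" else max(width, height)
    if reference <= limit:
        return width, height
    scale = limit / reference
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 2048 x 1024 source image under each rule.
high_resolution = fit_edge(2048, 1024, edge="shorter")  # bzna_input-style rule
thumbnail = fit_edge(2048, 1024, edge="longer")         # bzna_thumb-style rule
print(high_resolution, thumbnail)
```

Capping the longer edge guarantees that every thumbnail fits inside a 512 x 512 frame, whereas capping the shorter edge leaves the other dimension unbounded.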
The dataset is represented by ImageNet, which simplifies iterating over the data as a classification dataset.
Warning
81 images are currently missing from the dataset, and 111 had to be transcoded to PNG before conversion to the final H.265 format. More details can be found in the dataset’s README.
Warning
High resolution images stored in the bzna_input track of the input samples are currently not available through the DataLoader. Their widely varying sizes prevent them from being decoded using a single hardware decoder configuration. The selected solution is to represent the images in the HEIF format; this work will be completed in future development.
Dataset Composition
The dataset is composed of a train set, followed by a validation set and then a test set, for a total of 1 431 167 entries. Targets and filenames are provided for each set:
- Train set: entries 1 to 1 281 167 (1 281 167 entries)
- Validation set: entries 1 281 168 to 1 331 167 (50 000 entries)
- Test set: entries 1 331 168 to 1 431 167 (100 000 entries)
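The entry ranges above follow directly from the split sizes and their storage order. As a minimal sketch, the snippet below derives the ranges and maps an entry number back to its split. The names `SPLITS`, `split_ranges` and `split_of` are hypothetical, and entries are treated as 1-based to match the list above.

```python
from bisect import bisect_right

# Split sizes in storage order, taken from the list above.
SPLITS = [("train", 1_281_167), ("validation", 50_000), ("test", 100_000)]


def split_ranges(splits=SPLITS):
    """Return {name: (first_entry, last_entry)} with 1-based inclusive entries."""
    ranges, first = {}, 1
    for name, count in splits:
        ranges[name] = (first, first + count - 1)
        first += count
    return ranges


def split_of(entry, splits=SPLITS):
    """Return the name of the split that a 1-based entry number belongs to."""
    starts, first = [], 1
    for _, count in splits:
        starts.append(first)
        first += count
    if not 1 <= entry < first:
        raise IndexError(f"entry {entry} is outside 1..{first - 1}")
    # The last split whose first entry is <= entry contains it.
    return splits[bisect_right(starts, entry) - 1][0]


ranges = split_ranges()
print(ranges)
print(split_of(1_281_168))
```

To use these ranges with 0-based Python indexing, subtract one from each first entry and keep the last entry as the exclusive end, for example `range(0, 1_281_167)` for the train set.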
Dataset Structure
ilsvrc2012.bzna
| Box | Content |
|---|---|
| ftyp | Defines the compatibilities of the mp4 container |
| mdat | Raw concatenation in 3 blocks of the images, targets and filenames |
| moov | Contains the metadata needed to load and present the raw data of mdat |
Dataset’s Input Samples Structure
A Benzina ImageNet dataset’s input sample is structured using the mp4 format.
| Box | Content |
|---|---|
| ftyp | Defines the compatibilities of the mp4 container |
| mdat | Raw concatenation of the image, thumbnail, target and filename |
| moov | Contains the metadata needed to load and present the raw data of mdat |