.t7 caption files #7

shenkev · 2017-10-26T19:40:20Z

Hi Scot,

Quick question about the bird dataset you're using.

I downloaded the bird dataset as per your instructions:

#####How to train a char-CNN-RNN model:
1. Download the birds and flowers data.

Inside the cvpr2016_cub/text_c10 directory, there are .t7 files, e.g. 200.Common_Yellowthroat.t7.

Upon opening them, I found that they are 60x201x10 tensors of integers. I guessed that 60 is the number of images per species and 10 is the number of captions per image. What is the 201 dimension? Is it the vocabulary size of the captions? What are the actual integers? I notice values from 0 to roughly 70, with many of the values being 0.

GaryLMS · 2017-10-28T04:57:28Z

I think 201 is the length of the sentence. If a sentence is shorter than 201, it is padded with zeros; otherwise it is cut off at 201.
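To make the padding idea concrete, here is a minimal sketch of fixing a sequence of indices to length 201. It assumes zeros are appended on the right and that longer sequences are cut at the end; the thread does not confirm either detail, and `pad_or_truncate` is a hypothetical helper, not something from the repository.

```python
MAX_LEN = 201  # the middle dimension reported in this thread
PAD = 0        # assumed padding value, based on the many zeros observed


def pad_or_truncate(indices, max_len=MAX_LEN, pad=PAD):
    """Return a new list with exactly max_len entries.

    Assumption: padding goes on the right and truncation drops the tail.
    """
    clipped = list(indices[:max_len])
    return clipped + [pad] * (max_len - len(clipped))


short = pad_or_truncate([5, 3, 9])
long_ = pad_or_truncate(list(range(1, 300)))
print(len(short), len(long_))
```

Both results have 201 entries, which would explain why so many of the stored values are 0 when captions are short.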

jayelm · 2019-02-06T00:43:00Z

Since there are only ~70 possible values, the actual integers here seem to be character indices. Not sure what the precise mapping is. For word-level encodings see the word_c10 directory (see #8).
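If the integers are character indices, decoding one caption could look roughly like the sketch below. The alphabet here is a made-up placeholder for illustration only, since the real mapping is not given in this thread. The sketch also assumes that index 0 means padding and that characters are numbered from 1; `encode` and `decode` are hypothetical helpers.

```python
# Hypothetical alphabet for illustration only; the dataset's real
# character-to-index mapping is not confirmed in this thread.
ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def encode(text):
    """Map each known character to a 1-based index (0 is reserved for padding)."""
    return [ALPHABET.index(ch) + 1 for ch in text.lower() if ch in ALPHABET]


def decode(indices):
    """Map 1-based indices back to characters, skipping padding and unknown values."""
    return "".join(ALPHABET[i - 1] for i in indices if 0 < i <= len(ALPHABET))


codes = encode("a bird") + [0, 0, 0]  # trailing zeros mimic padding
print(codes)
print(decode(codes))
```

With the real alphabet substituted in, the same `decode` could be applied to the 201 values belonging to a single image and caption.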
