Quick question about the bird dataset you're using.
I downloaded the bird dataset as per your instructions:
> ##### How to train a char-CNN-RNN model:
> 1. Download the birds and flowers data.
Inside the cvpr2016_cub/text_c10 directory, there are .t7 files, e.g. 200.Common_Yellowthroat.t7.
Upon opening them, I found that they are 60x201x10 tensors of integers. I guessed that 60 is the number of images per species and 10 is the number of captions per image. What is the 201 dimension? Is it the vocabulary size of the captions? And what do the integers themselves represent? I see values from 0 to about 70, with many of them being 0.
Since there are only ~70 possible values, the integers here appear to be character indices. I'm not sure what the precise mapping is. For word-level encodings, see the word_c10 directory (see #8).
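If the integers are indeed character indices, decoding a caption might look roughly like the sketch below. Everything specific in it is an assumption, not something confirmed in this thread: the 70-character `ALPHABET`, the 1-based indexing with 0 as padding, and the reading of the 201 axis as character position. The helper names `decode_caption`, `encode_caption`, and `caption_column` are hypothetical.

```python
# Hypothetical 70-character alphabet (an assumption; the real mapping is not
# confirmed in this thread). Note that '-' appears twice in this string.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-,;.!?:'\"/\\|_@#$%^&*~`+-=<>()[]{} "


def decode_caption(indices, alphabet=ALPHABET):
    """Turn a sequence of integer codes into text.

    Assumes codes are 1-based positions in `alphabet` and 0 means padding.
    """
    return "".join(alphabet[i - 1] for i in indices if i != 0)


def encode_caption(text, length=201, alphabet=ALPHABET):
    """Inverse of decode_caption: text -> zero-padded list of `length` codes."""
    lookup = {}
    for pos, ch in enumerate(alphabet):
        lookup.setdefault(ch, pos + 1)  # keep the first position for repeated chars
    codes = [lookup[ch] for ch in text.lower() if ch in lookup][:length]
    return codes + [0] * (length - len(codes))


def caption_column(tensor, image, caption):
    """Pull one caption out of a nested list shaped [image][position][caption].

    Assumes the middle (201) axis is character position.
    """
    return [row[caption] for row in tensor[image]]


# Synthetic stand-in for one .t7 file: 1 image, 201 positions, 2 captions.
texts = ["this bird has a yellow throat.", "a small bird with a black mask."]
columns = [encode_caption(t) for t in texts]
fake_tensor = [[[col[p] for col in columns] for p in range(201)]]

decoded = decode_caption(caption_column(fake_tensor, image=0, caption=1))
print(decoded)
```

Under these assumptions, the many zeros in the tensors would simply be padding after the end of each caption, which would also explain why every caption occupies the same 201 slots.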