
About Attention for Memory Addressing #12

Open
simaiden opened this issue Jun 15, 2020 · 1 comment

Comments

@simaiden

In Section 3.3.2 of the paper, Eq. (4) shows that each addressing weight in MemAE is computed with a softmax over cosine similarities, but I can't find this in the code. Where is this operation actually implemented?

Thanks
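
(For reference, this is my transcription of Eq. (4) and the cosine similarity d(·,·) it relies on, reconstructed from the paper rather than quoted:)

```latex
w_i = \frac{\exp\left(d(\mathbf{z}, \mathbf{m}_i)\right)}{\sum_{j=1}^{N} \exp\left(d(\mathbf{z}, \mathbf{m}_j)\right)},
\qquad
d(\mathbf{z}, \mathbf{m}_i) = \frac{\mathbf{z}\,\mathbf{m}_i^{\top}}{\lVert \mathbf{z} \rVert \, \lVert \mathbf{m}_i \rVert}
```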

@Wolfybox

Cosine similarity is not implemented in the code. Instead, the weights are computed by applying a softmax to the inner product of z and m. I tried modifying it to match the paper, but with cosine similarity the attention weights all collapsed toward zero. So I guess the cosine similarity measure is not usable as described in the paper, which would explain why they replaced it with the inner product in the actual implementation.
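
To make the difference concrete, here is a minimal PyTorch sketch (my own, not the repo's code; the names `z`, `mem` and the sizes B, C, N are assumptions) contrasting the two addressing variants and showing why the cosine version produces near-zero weights:

```python
import torch
import torch.nn.functional as F

def addressing_inner_product(z, mem):
    # z: (B, C) encodings, mem: (N, C) memory items -> weights: (B, N).
    # Softmax over raw inner products, as in the released code.
    return F.softmax(z @ mem.t(), dim=1)

def addressing_cosine(z, mem):
    # Softmax over cosine similarities, as written in Eq. (4) of the paper.
    # Cosine similarity is bounded in [-1, 1], so with many memory items the
    # softmax stays close to uniform: every weight is roughly 1/N, i.e.
    # "all elements down to zero" when N is large.
    sim = F.cosine_similarity(z.unsqueeze(1), mem.unsqueeze(0), dim=2)  # (B, N)
    return F.softmax(sim, dim=1)

z = torch.randn(4, 256)       # batch of 4 encodings
mem = torch.randn(2000, 256)  # N = 2000 memory items
print(addressing_inner_product(z, mem).max().item())  # can be close to 1
print(addressing_cosine(z, mem).max().item())         # at most ~e^2 / 2000
```

Because the cosine logits are bounded, the softmax can never concentrate on a few memory items, whereas the unbounded inner-product logits can, which matches the behaviour I observed.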
