2 Transformers-Self Attention [20 Points]

Solve all parts of this question

In this question we will compute the transformer self-attention covered in the lecture. You will be required to compute the values of the different matrices mentioned below and in the lecture notes. For all questions in this section, unless otherwise stated, work must be shown in the form of matrix multiplications to receive full credit. For performing the computations, using NumPy, Excel, or other software is recommended to avoid computation errors. When writing your answers, please round to 2 decimal places. You may use scientific notation to represent your answers if necessary.

Table 1: Word Embeddings

2.1 In this question we will consider a single attention head. Given the set of word embeddings (Table 1), the projection matrices (W^Q, W^K), and a normalization factor of 3 instead of sqrt(d_k), fill out the table with the normalized query-key score for each possible pair of words. Hint: Compute the value of Q K^T / 3.

2.2 Given the word embeddings, the previously calculated query-key values, and the value projection matrix W^V, calculate the output embeddings of this attention head. The output embedding is computed using the Attention formula discussed in the lecture slides, Attention(Q, K, V) = softmax(Q K^T / 3) V. Fill in the table with your results.
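Since the question recommends NumPy for the computations, here is a minimal sketch of both steps. Table 1 and the actual projection matrices are not reproduced above, so the embeddings X and the matrices W_Q, W_K, W_V below are hypothetical placeholders; substitute the values given in the assignment before reading off the tables.

import numpy as np

# Placeholder inputs: replace with the values from Table 1 and the
# projection matrices given in the assignment.
X = np.array([[1.0, 0.0, 1.0],   # embedding of word 1 (placeholder)
              [0.0, 2.0, 0.0],   # embedding of word 2 (placeholder)
              [1.0, 1.0, 1.0]])  # embedding of word 3 (placeholder)
W_Q = np.eye(3)  # query projection matrix (placeholder)
W_K = np.eye(3)  # key projection matrix (placeholder)
W_V = np.eye(3)  # value projection matrix (placeholder)

# 2.1: normalized query-key scores, using the stated factor of 3
# in place of sqrt(d_k).
Q = X @ W_Q
K = X @ W_K
scores = (Q @ K.T) / 3.0  # entry (i, j): score of query word i against key word j
print(np.round(scores, 2))

# 2.2: output embeddings via Attention(Q, K, V) = softmax(Q K^T / 3) V.
def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # subtract row max for stability
    return e / e.sum(axis=-1, keepdims=True)

V = X @ W_V
output = softmax(scores) @ V  # one output embedding per input word
print(np.round(output, 2))

Rounding with np.round(..., 2) matches the question's requirement to report answers to 2 decimal places.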

