Mastering torch.bmm for Efficient Attention Model Implementation