Can you provide some more details? What does 'wrong answer' mean? How do you know the weights you are seeing are not correct? Are you getting an error?
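In any case, looking at your code, I suspect that you are passing the initial 'actor' variable to 'getLearnableParameters'. That variable is a copy made before training, so it still holds the initial weights. After training completes, you need to extract the updated actor from the trained agent first (for example with 'getActor'), as shown on this doc page, and then query its parameters.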
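Here is a minimal sketch of the difference, not your actual code. The environment ("CartPole-Discrete"), the default PG agent, and the variable names initialActor, trainedActor, initialParams and trainedParams are my assumptions for illustration; substitute your own agent and environment. It also assumes a release where 'getLearnableParameters' and default agent creation are available.

```matlab
% Assumed setup for illustration: a predefined environment and a default agent.
env = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
agent = rlPGAgent(obsInfo, actInfo);

% Actor as it exists BEFORE training. This variable is a copy and is
% not modified by train.
initialActor = getActor(agent);
initialParams = getLearnableParameters(initialActor);

% Keep the run short; these option values are arbitrary.
trainOpts = rlTrainingOptions( ...
    "MaxEpisodes", 5, ...
    "MaxStepsPerEpisode", 50, ...
    "Verbose", false, ...
    "Plots", "none");
train(agent, env, trainOpts);

% Querying initialActor here would still return the initial weights.
% Extract the actor from the trained agent first, then read its parameters.
trainedActor = getActor(agent);
trainedParams = getLearnableParameters(trainedActor);
```

If trainedParams still matches initialParams after a longer training run, then something else is going on and the extra details asked for above would help narrow it down.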