Neural Networks Algorithm for predicting the fourth word in a sentence
I have an assignment where we're required to analyze a learning algorithm for predicting the fourth word in a sentence. It is a 4-gram model, with the first three words given. The parameters we can change are d, the number of dimensions used to represent a word, and numHid, the number of hidden units in the hidden layer (we're using a single hidden layer). I trained the algorithm with a different d and a different numHid each time; training stops automatically when the validation error starts increasing. My questions: What does the number of epochs represent? Is it better for the number of epochs to be as small as possible, given that the learning rate is kept constant throughout training? Should I use the parameters that give me the minimum cross-entropy error?
Thanks
Answers (1)
Greg Heath
on 29 Jan 2012
>I have an assignment where we're required to analyze a learning algorithm for predicting the fourth word in a sentence. It is a 4-gram model, with the first three words given. The parameters we can change are d, the number of dimensions used to represent a word,
How, exactly, are words represented? How many 4-word combinations do you have? Are there multiple combinations that have the same 4th word?
>and numHid, the number of hidden units in the hidden layer (we're using a single hidden layer). I trained the algorithm
What kind of algorithm? What is its name? Are you using the NN Toolbox?
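> with a different d and a different numHid each time; training stops automatically when the validation error starts increasing. My questions: What does the number of epochs represent?
An epoch is one complete pass through the training data. In batch training the weights are updated once per pass, so the interval between successive weight-update stages is an epoch, and the epoch count tells you how many update stages ran before training stopped.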
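To make the epoch count concrete, here is a minimal Python sketch of stopping when the validation error stops decreasing. It is not the poster's algorithm: `train_with_early_stopping`, `run_epoch`, and `validation_error` are hypothetical names, and the "model" is a single made-up weight fitted by gradient descent, chosen only so the numbers are easy to follow.

```python
def train_with_early_stopping(run_epoch, validation_error, max_epochs=100):
    """Run full passes over the training data until validation error rises.

    run_epoch: hypothetical callable that performs one complete pass over
        the training set and updates the weights.
    validation_error: hypothetical callable returning the current error on
        held-out validation data.
    Returns (epochs_run, best_epoch, best_error).
    """
    best_error = validation_error()  # error before any training
    best_epoch = 0
    epochs_run = 0
    for epoch in range(1, max_epochs + 1):
        run_epoch()
        epochs_run = epoch
        err = validation_error()
        if err >= best_error:
            break  # validation error stopped improving: stop here
        best_error, best_epoch = err, epoch
    return epochs_run, best_epoch, best_error


# Toy stand-in for a network: one weight w.
# Training loss is (w - 2)^2, validation loss is (w - 1.5)^2 (both made up).
state = {"w": 0.0}


def run_epoch():
    # One batch gradient step with a constant learning rate of 0.25.
    state["w"] -= 0.25 * 2 * (state["w"] - 2.0)


def validation_error():
    return (state["w"] - 1.5) ** 2


result = train_with_early_stopping(run_epoch, validation_error)
print(result)  # (3, 2, 0.0)
```

Here training ran for 3 epochs, but the lowest validation error was reached at epoch 2. The epoch count is just a byproduct of where the stopping rule fired; note that this sketch does not restore the weights from the best epoch.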
> Is it better for the number of epochs to be as small as possible, given that the learning rate is kept constant throughout training?
Regardless of learning rate, the ultimate goal is to minimize the performance error on nondesign data, i.e., data that was not used to design the network. Speed is of secondary importance, so a smaller epoch count is not better in itself.
>Should I use the parameters that give me the minimum cross-entropy error?
Use the parameters that optimize YOUR measure of performance. From my point of view you have a classification problem and should try to minimize the rate of failure to choose the correct 4th word. However, classification error rate is not a continuous function of the weights. Therefore, it is much better to train with a continuous objective function like mean-square-error or cross-entropy.
In the words of Confucius: "Try both, choose best"
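A small Python sketch of why the error rate is awkward as a training objective while cross-entropy is not. Everything here is assumed for illustration: two candidate words, three made-up examples, and the helper names `softmax`, `cross_entropy`, and `error_rate`. It only shows that nudging the output scores changes cross-entropy while the error rate stays put.

```python
import math


def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(prob_rows, targets):
    """Mean negative log-probability assigned to the correct word."""
    return -sum(math.log(p[t]) for p, t in zip(prob_rows, targets)) / len(targets)


def error_rate(prob_rows, targets):
    """Fraction of examples whose most probable word is not the target."""
    wrong = sum(1 for p, t in zip(prob_rows, targets) if p.index(max(p)) != t)
    return wrong / len(targets)


# Made-up scores over a 2-word vocabulary; targets are word indices.
targets = [0, 1, 1]
scores_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.0]]  # third example is wrong
scores_b = [[2.0, 0.0], [0.0, 2.0], [0.2, 0.0]]  # same decisions, better scores

probs_a = [softmax(s) for s in scores_a]
probs_b = [softmax(s) for s in scores_b]

err_a, err_b = error_rate(probs_a, targets), error_rate(probs_b, targets)
ce_a, ce_b = cross_entropy(probs_a, targets), cross_entropy(probs_b, targets)

print("error rate:", err_a, err_b)    # identical: one of three wrong in both
print("cross-entropy:", ce_a, ce_b)   # second value is lower
```

The second set of scores is closer to the targets, and cross-entropy reflects that, but the error rate is unchanged because no decision flipped. That is the sense in which a continuous objective gives the training algorithm something to follow, while the error rate is still a reasonable number to report when comparing final models.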
Hope this helps.
Greg
2 Comments
Greg Heath
on 20 Feb 2015
I don't remember this 3-year-old post. However, it sure would have been useful to see a few inputs and the corresponding targets.