Why do I get an out-of-memory error in Deep Learning Toolbox when I specify a large validation data set?

Question

MathWorks Support Team on 31 Aug 2020


Direct link to this question

https://se.mathworks.com/matlabcentral/answers/586715-deep-learning-toolbox

Answered: MathWorks Support Team on 31 Aug 2020

Accepted Answer: MathWorks Support Team

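I am trying to train a large-scale classification network with 1,580,470 training images and 81,313 classes.

To avoid memory errors, I set MiniBatchSize in trainingOptions to a sufficiently small value (for example, MiniBatchSize = 1), but training still stops with an out-of-memory error.

Note: The validation set contains 474,141 images, which is 30% of the full data set.

The error message is shown below.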

Starting parallel pool (parpool) using the 'local' profile ...

Connected to the parallel pool (number of workers: 2).

Initializing input data normalization.

======================================================================================================================

======================================================================================================================

Error using trainNetwork (line 170)

Out of memory.

Error in overall (line 97)

net = trainNetwork(trDsAug, lgraph, options);

Caused by:

Error using nnet.internal.cnn.ParallelTrainer/train (line 96)

Error detected on worker 1.

Error using nnet.internal.cnn.util.aggregateArrayFromObservations (line 12)

Out of memory.


Answer 1

MathWorks Support Team on 31 Aug 2020


Direct link to this answer

https://se.mathworks.com/matlabcentral/answers/586715-deep-learning-toolbox#answer_487322

This error occurs while a single array covering the entire validation data set is being built on the CPU.

In this example, 474,141 images are used as validation data.

In addition, because there are 81,313 classes, the single array that is created internally has size

[1 1 1 numberOfClasses numberOfValidationPoints] = [1 1 1 81313 474141]

and requires approximately 143 GB of contiguous memory.

Furthermore, this operation makes a copy of the array along the way, so at least twice that amount of memory is needed.
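The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the element size of 4 bytes (single precision) is an assumption, chosen because it matches the figure of roughly 143 GB quoted in the answer, and the variable names are illustrative.

```matlab
% Figures taken from the question.
numClasses    = 81313;     % number of classes
numValidation = 474141;    % number of validation images

% Assumption: each element is stored in single precision (4 bytes).
bytesPerElement = 4;

% Size of the single array holding one score per class per observation.
numElements = numClasses * numValidation;
arrayGiB    = numElements * bytesPerElement / 2^30;   % about 143.6 GiB

% The answer states that a copy is made, so the peak is at least double.
peakGiB = 2 * arrayGiB;

fprintf('Array: %.1f GiB, with copy: at least %.1f GiB\n', arrayGiB, peakGiB);
```

Note that MiniBatchSize does not appear anywhere in this calculation, which is why lowering it does not help.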

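This problem arises when the validation data set is very large and the number of classes is also large, because the required array size is the product of the two. To work around it, please reduce the number of validation observations.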
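One way to reduce the validation data is to choose a random subset sized to a memory budget. The following is a minimal sketch, not the method used by MathWorks: the 8 GiB budget, the 4-byte element size, and the datastore name imdsValidation are all assumptions made for illustration.

```matlab
% Figures taken from the question.
numClasses    = 81313;
numValidation = 474141;

% Assumptions: single-precision elements and an 8 GiB budget that must
% hold both the aggregated array and its copy.
bytesPerElement = 4;
memoryBudget    = 8 * 2^30;

% Largest number of observations whose array, plus one copy, fits.
maxValidation = floor(memoryBudget / (2 * numClasses * bytesPerElement));
maxValidation = min(maxValidation, numValidation);

% Randomly choose which validation observations to keep.
rng(0);   % fixed seed so the selection is reproducible
keepIdx = sort(randperm(numValidation, maxValidation));

fprintf('Keeping %d of %d validation images\n', maxValidation, numValidation);

% If a validation imageDatastore named imdsValidation (hypothetical name)
% exists in the workspace, build the smaller datastore and pass it on.
if exist('imdsValidation', 'var')
    imdsValidationSmall = subset(imdsValidation, keepIdx);
    options = trainingOptions('sgdm', 'ValidationData', imdsValidationSmall);
end
```

With so many classes, a random subset of this size cannot include every class, so the validation accuracy is only a rough indicator. Any other training options from the original script would need to be added to the trainingOptions call.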
