Classification Learner Dataset Preparation

Asked by Steve on 4 Sep 2019
With regard to the Classification Learner, does it matter how the responses are organized? For example, with 2 response classes and 100,000 observations, is it OK for the first 50,000 to be all 0s and the last 50,000 to be all 1s?


Answer by Bhargavi Maganuru on 10 Sep 2019
In supervised machine learning, you supply a known set of input data together with the corresponding responses (labels or classes), and you use this data to train a model that predicts the response for new data. The order of the observations doesn’t matter, so it is fine to have all of the 0s first and all of the 1s last.
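To see why row order is harmless, here is a rough sketch in plain Python (not MATLAB, and not the app's actual code). It assumes the validation folds are drawn at random, which is what MATLAB's cvpartition does by default; whether Classification Learner uses exactly that internally is an assumption here. The helper names kfold_indices and class_one_fraction are made up for this illustration. With labels sorted as in the question, contiguous folds would each contain a single class, while randomly drawn folds contain a mix of both.

```python
import random


def kfold_indices(n, k, shuffle, seed=0):
    """Split indices 0..n-1 into k equal contiguous folds (hypothetical helper).

    When shuffle is True the indices are randomly permuted first, so the
    original row order no longer decides which fold a row lands in.
    """
    idx = list(range(n))
    if shuffle:
        random.Random(seed).shuffle(idx)
    fold_size = n // k
    return [idx[i * fold_size:(i + 1) * fold_size] for i in range(k)]


def class_one_fraction(labels, fold):
    """Fraction of observations in the fold whose label is 1."""
    return sum(labels[i] for i in fold) / len(fold)


# Sorted responses, as in the question: all 0s first, then all 1s.
labels = [0] * 500 + [1] * 500

# Folds taken in row order: every fold holds only one class.
ordered = [class_one_fraction(labels, f)
           for f in kfold_indices(len(labels), 4, shuffle=False)]
print("row-order folds:", ordered)  # [0.0, 0.0, 1.0, 1.0]

# Folds drawn at random: every fold holds a mix of both classes.
shuffled = [class_one_fraction(labels, f)
            for f in kfold_indices(len(labels), 4, shuffle=True)]
print("random folds:   ", shuffled)
```

If you ever build a partition by hand in row order instead of at random, shuffle the rows first (in MATLAB, randperm can be used for that).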

