## Fit Probability Distribution Objects to Grouped Data

This example shows how to fit probability distribution objects to grouped sample data, and create a plot to visually compare the pdf of each group.

### Step 1. Load sample data.

`load carsmall;`

The data contains miles per gallon (`MPG`) measurements for different makes and models of cars, grouped by country of origin (`Origin`), model year (`Model_Year`), and other vehicle characteristics.
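Before fitting, it can help to check how many observations each group contributes, since a group with a single observation cannot support a fitted distribution. The following is a minimal sketch, assuming `carsmall.mat` is on the MATLAB path and that `Origin` is stored as a character matrix. The variable names `S`, `originNames`, `countries`, and `counts` are illustrative only. Loading into a struct keeps the workspace variables used in the rest of the example untouched.

```matlab
% Load into a struct so existing workspace variables are not overwritten.
S = load('carsmall');

% Convert the character matrix to a cell array of names (trailing blanks removed).
originNames = cellstr(S.Origin);

% Count the observations in each country-of-origin group.
[countries, ~, groupIdx] = unique(originNames);
counts = accumarray(groupIdx, 1);

% Print one line per country.
for k = 1:numel(countries)
    fprintf('%-10s %d\n', countries{k}, counts(k));
end
```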

### Step 2. Create a nominal array.

Transform `Origin` into a nominal array and remove the Italian car from the sample data. Since there is only one Italian car, `fitdist` cannot fit a distribution to that group. Removing the Italian car from the sample data prevents `fitdist` from returning an error.

```
Origin = nominal(Origin);
MPG2 = MPG(Origin~='Italy');
Origin2 = Origin(Origin~='Italy');
Origin2 = droplevels(Origin2,'Italy');
```
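The same filtering can also be sketched with a `categorical` array, which newer MATLAB releases generally recommend over `nominal`. This is an alternative under that assumption, not the method used in this example. The names `originCat`, `keep`, `mpgKept`, and `originKept` are illustrative only.

```matlab
% Load into a struct so the variables from the main example are left alone.
S = load('carsmall');

% Build a categorical grouping variable from the character matrix.
originCat = categorical(cellstr(S.Origin));

% Keep every observation except the single Italian car.
keep = originCat ~= 'Italy';
mpgKept = S.MPG(keep);

% Remove the now-empty 'Italy' category so it is not treated as a group.
originKept = removecats(originCat(keep), 'Italy');

% List the remaining categories.
disp(categories(originKept));
```

The resulting `mpgKept` and `originKept` could then be passed to `fitdist` in place of `MPG2` and `Origin2`.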

### Step 3. Fit kernel distributions to each group.

Use `fitdist` to fit kernel distributions to each country of origin group in the `MPG` data.

```
[KerByOrig,Country] = fitdist(MPG2,'Kernel','by',Origin2)
```

```
KerByOrig = 
  Column 1
    [1x1 prob.KernelDistribution]
  Column 2
    [1x1 prob.KernelDistribution]
  Column 3
    [1x1 prob.KernelDistribution]
  Column 4
    [1x1 prob.KernelDistribution]
  Column 5
    [1x1 prob.KernelDistribution]

Country = 
    'France'
    'Germany'
    'Japan'
    'Sweden'
    'USA'
```

The cell array `KerByOrig` contains five kernel distribution objects, one for each country represented in the sample data. Each object contains properties that hold information about the data, the distribution, and the parameters. The array `Country` lists the country of origin for each group in the same order as the distribution objects are stored in `KerByOrig`.
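To see what such an object holds, the properties can be read with dot notation. The sketch below fits a single kernel distribution to the Japanese cars and displays a few properties. The property names used here (`DistributionName`, `Kernel`, `InputData`) are assumed from the kernel distribution object and may vary by release. The names `isJapan` and `pdJapan` are illustrative only.

```matlab
% Load into a struct so the variables from the main example are left alone.
S = load('carsmall');

% Select the MPG values for one group and fit a kernel distribution to them.
isJapan = strcmp(cellstr(S.Origin), 'Japan');
pdJapan = fitdist(S.MPG(isJapan), 'Kernel');

% Information about the distribution itself.
disp(pdJapan.DistributionName);
disp(pdJapan.Kernel);            % smoothing kernel used for the fit

% Information about the data the fit was based on (assumed struct field 'data').
disp(numel(pdJapan.InputData.data));
```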

### Step 4. Compute the pdf for each group.

Extract the probability distribution objects for Germany, Japan, and USA. The `Country` output in Step 3 gives the position of each group in `KerByOrig`: Germany is second, Japan is third, and USA is fifth. Use these positions to index into `KerByOrig`, and then compute the pdf for each group.

```
Germany = KerByOrig{2};
Japan = KerByOrig{3};
USA = KerByOrig{5};
x = 0:1:50;
USA_pdf = pdf(USA,x);
Japan_pdf = pdf(Japan,x);
Germany_pdf = pdf(Germany,x);
```
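Hard-coded positions depend on the order in which `fitdist` returns the groups, which may differ with the type of grouping variable. As a hedged alternative, the position can be looked up by name in the second output. The sketch below groups by a cell array of names rather than a nominal array, which is an assumption of this illustration. The names `pdByOrigin`, `groupNames`, `germanyIdx`, and `germanyPd` are illustrative only.

```matlab
% Load into a struct so the variables from the main example are left alone.
S = load('carsmall');
origin = cellstr(S.Origin);

% Drop the single Italian car, then fit one kernel distribution per group.
keep = ~strcmp(origin, 'Italy');
[pdByOrigin, groupNames] = fitdist(S.MPG(keep), 'Kernel', 'by', origin(keep));

% Find the group by its label instead of relying on a fixed position.
germanyIdx = find(strcmp(groupNames, 'Germany'));
germanyPd = pdByOrigin{germanyIdx};

% Evaluate the pdf on the same grid used in this step.
x = 0:1:50;
germanyPdf = pdf(germanyPd, x);
```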

### Step 5. Plot the pdf for each group.

Plot the pdf for each group on the same figure.

```
figure;
plot(x,USA_pdf,'r-');
hold on;
plot(x,Japan_pdf,'b-.');
plot(x,Germany_pdf,'k:');
legend({'USA','Japan','Germany'},'Location','NW');
title('MPG by Country of Origin');
xlabel('MPG');
```

The resulting plot shows how miles per gallon (`MPG`) performance differs by country of origin (`Origin`). Using this data, the USA has the widest distribution, and its peak is at the lowest `MPG` value of the three origins. Japan has the most regular distribution with a slightly heavier left tail, and its peak is at the highest `MPG` value of the three origins. The peak for Germany is between the USA and Japan, and the second bump near 44 miles per gallon suggests that there might be multiple modes in the data.
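The peak locations described above can also be checked numerically by finding where each pdf is largest on the evaluation grid. This is a rough sketch: it fits each group separately rather than with the `'by'` option, which is assumed to give the same per-group fits, and the peak is only resolved to the 1 MPG spacing of `x`. The names `names`, `peakMPG`, and `iMax` are illustrative only.

```matlab
% Load into a struct so the variables from the main example are left alone.
S = load('carsmall');
origin = cellstr(S.Origin);

x = 0:1:50;                          % same evaluation grid as in Step 4
names = {'USA', 'Japan', 'Germany'};
peakMPG = zeros(1, numel(names));

for k = 1:numel(names)
    % Fit a kernel distribution to this group's MPG values.
    pd = fitdist(S.MPG(strcmp(origin, names{k})), 'Kernel');

    % Locate the grid point where the estimated density is highest.
    [~, iMax] = max(pdf(pd, x));
    peakMPG(k) = x(iMax);

    fprintf('%-8s peak near %d MPG\n', names{k}, peakMPG(k));
end
```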