# Simulating a Markov chain

423 views (last 30 days)

Show older comments

Hello,

Would anybody be able to help me simulate a discrete time markov chain in Matlab?

I have a transition probability matrix with 100 states (100x100) and I'd like to simulate 1000 steps with the initial state as 1.

I'd appreciate any help as I've been trying to do this myself all week with no success and have not been able to source suitable code online.

Kind Regards

John

##### 0 Comments

### Answers (6)

John D'Errico
on 20 Apr 2019

Edited: John D'Errico
on 21 Apr 2019

There seems to be many followup questions, it may be worth discussing the problem in some depth, how you might attack it in MATLAB. So lets start out with a discussion of such a Markov process, and how we would work with it. First, create a simple markov process. I'm not feeling terribly creative right now, so lets just pick something simple, thus a 5x5 transition matrix.

T = triu(rand(5,5),-1);

T = T./sum(T,2)

T =

0.17362 0.0029508 0.33788 0.19802 0.28752

0.059036 0.16812 0.29036 0.36644 0.11604

0 0.15184 0.30054 0.275 0.27262

0 0 0.42829 0.30672 0.26499

0 0 0 0.28417 0.71583

I've arbitrarily chosen a matrix with no absorbing state, but one where at each step, the process will tend to push the state to a higher number, but some chance you can climb backwards too.

If we look at the matrix above, if you are in state 5, with probability 0.71583 you will stay in state 5, but 28% of the time, you will drop back to state 4, etc. Next, consider a vector that describes your current state. Suppose we start out in state 1 (thus initially, 100% of the time, we are in state 1.)

X = [1 0 0 0 0];

After one step of the process, we don't know what state we will be in, but we can determine the probability that we will lie in any given state. That is found simply using a matrix multiplication.

% After one time step

X = X*T

X =

0.17362 0.0029508 0.33788 0.19802 0.28752

% After two time steps

>> X = X*T

X =

0.030319 0.052314 0.24588 0.27082 0.40066

% After three time steps

>> X = X*T

X =

0.0083525 0.04622 0.21532 0.28971 0.44039

% After four time steps

>> X = X*T

X =

0.0041788 0.040491 0.20504 0.29181 0.45848

Gradually, the probability grows that we will lie in state 5 MOST of the time. So after 100 time steps, the probability keeps growing that we lie in state 5.

X100 = [1 0 0 0 0]*T^100

X100 =

0.0025378 0.035524 0.19457 0.29167 0.4757

Does a steady state prediction of the long term state of this process exist? Yes. We can get that from the eigenvector of T', corresponding to the eigenvalue 1. That would be the vector V1, such that V1*T = V1.

eig(T')

ans =

1 + 0i

0.48873 + 0i

0.16513 + 0i

0.0054905 + 0.13467i

0.0054905 - 0.13467i

[V,D] = eig(T');

V1 = V(:,1)';

V1 = V1/sum(V1)

V1 =

0.0025378 0.035524 0.19457 0.29167 0.4757

That last step was to normalize the eigenvector to sum to 1, so they could be viewed as probabilities. Remember that eig normalizes its vectors to have unit norm.

So over the long term, the probability is 47.57% that the process lies in state 5.

This is what we can learn about the long term behavior of that system. But how about simulating the process? So, instead of thinking about where we will be as this process goes to infinity, can we simulate a SINGLE instance of such a Markov chain? This is a very different thing, since it does not rely on eigenvalues, matrix multiplication, etc.

Now, we need to look at the rows of T. I'll build this as a random walk that runs for many time steps. At any time, I'll simulate the progress of the random walk as it proceeds from one state to the next state. Then I'll simulate 1000 such random walks, and see where we ended at the final step of that process.

The trick is to use the cumulative sum of T, along the rows of T. What does that do for us?

CT = cumsum(T,2)

CT =

0.17362 0.17657 0.51445 0.71248 1

0.059036 0.22716 0.51752 0.88396 1

0 0.15184 0.45239 0.72738 1

0 0 0.42829 0.73501 1

0 0 0 0.28417 1

Suppose we start out in state 1. The first row of CT is pertinent here. Generate a single random number (using rand). If that number is less than 0.17362, then we started in state 1, and will remain in state 1. If the number lies between 0.51445 and 0.71248, then we will have moved to state 4, etc. We should see how to simulate this process now. I'll simulate 10000 such random Markov processes, each for 1000 steps. I'll record the final state of each of those parallel simulations. (If I had the parallel tolbox, I suppose I could do this using a parfor loop, but not gonna happen for me.)

I'll use histcounts, but older MATLAB releases need to use histc. I'll append a column of zeros at the beginning of CT to make histcounts return the proper bin index.

CT = [zeros(size(T,1),1),cumsum(T,2)];

numberOfSims = 10000;

simTime = 1000;

initialState = 1;

currentState = repmat(initialState,[1,numberOfSims]);

for n = 1:simTime

r = rand(1,numberOfSims);

for m = 1:numberOfSims

% for each sim, where does r(m) lie, relative to the boundaries

% in the corresponding row of CT?

[~,~,currentState(m)] = histcounts(r(m),CT(currentState(m),:));

end

end

finalState = currentState;

This took a minute or so to run, but that was a fair amount of work. The first 10 such parallel simulations ended in these states:

finalState(1:10)

ans =

2 4 5 3 5 5 5 5 4 5

Now, how often did our process end in any given state? We can count that using accumarray.

finalStateFreq = accumarray(finalState',ones(numberOfSims,1))/numberOfSims

finalStateFreq =

0.0029

0.0336

0.1957

0.2891

0.4787

If I did a good job here, this should mesh well with the steady state predictions from before. Was the simulation time long enough? Surely yes. This is a small Markov process, with only 5 states. And 10K total sims will be entirely adequate to predict if the result matches the steady state predictions.

V1 =

0.0025378 0.035524 0.19457 0.29167 0.4757

That I got a consistent answer from the simulations with the steady state predictions suggests I did a good job in the simulation. As you would expect from a random simulation, it will only approach the theoretical frequency as the number of simulations goes to infinity. With a little extra effort, I suppose could even have estimated the variance of the counts in each bin, based on the variance of a binomial distribution.

##### 1 Comment

Nazer Hdaifeh
on 27 Aug 2020

mona faraji
on 1 Jun 2013

Edited: Walter Roberson
on 20 Apr 2019

chack the following for a 2*2 transition matrix and for 1000 states begining at 1:

transition_probabilities = [0.1 0.9;0.8 0.2]; starting_value = 1; chain_length = 1000;

chain = zeros(1,chain_length);

chain(1)=starting_value;

for i=2:chain_length

this_step_distribution = transition_probabilities(chain(i-1),:);

cumulative_distribution = cumsum(this_step_distribution);

r = rand();

chain(i) = find(cumulative_distribution>r,1);

end

% provides chain = 1 2 1 2 1 2 1 2 1 1 2 1 2 1 2....

##### 6 Comments

Walter Roberson
on 8 Jun 2020

Sean de Wolski
on 4 Jan 2013

Edited: Sean de Wolski
on 4 Jan 2013

More

Pseudo-ish-code (from my understanding, (disclosure: not a Markov Model expert by any means))

Use a for-loop to loop n times for length you want. S

transC = [zeros(size(trans,1),1), cumsum(trans,2)]; %cumulative sum of rows, we will use this to decide on the next step.

n = 10;

states = zeros(1,n); %storage of states

states(1) = 1; %start at state 1 (or whatever)

for ii = 2:n

%Generate a random number and figure out where it falls in the cumulative sum of that state's trasition matrix row

[~,states(ii)] = histc(rand,transC(states(ii-1),:));

end

##### 16 Comments

Paul Fackler
on 21 Aug 2013

##### 0 Comments

Ragini Gupta
on 10 Nov 2017

Hey there, I am using Markov Chain model to generate synthetic data. However, I get this error when I simulate it to find the next markov state.

*Edge vector must be monotonically non-decreasing.

Error in MarkovChain2 (line 53) [~,states(ii)] = histc(rand,transC(states(ii-1),:));*

Any ideas where I could be going wrong:?

filename = 'newTestingExcel.xlsx';

Furnace=xlsread(filename,'B:B'); %H1D1

edges = linspace(min(Furnace),max(Furnace),8); [counts,bins] = histc(Furnace, edges); [counts,bins] = histc(Furnace, edges);

%to calculate the pdf of each state' f ' is the pdf of each state: [f,xi,bw] = ksdensity(Furnace,edges);

alpha = 2.43; beta = 1; pd = @(f)gampdf(f,alpha,beta); % Target distribution

%# transition matrix trans = full(sparse(bins(1:end-1), bins(2:end), 1)); trans = bsxfun(@rdivide, trans, sum(trans,2));

transCum = [zeros(size(trans,1),1), cumsum(trans,8)]; n = 8; states = zeros(1,n); states(1) = 1; for length = 2:n

[~,states(length)] = histc(rand,transCum(states(ii-1),:));

end

##### 0 Comments

Christine Tobler
on 22 Apr 2019

##### 0 Comments

### See Also

### Categories

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!