### Introduction

In this article we will look at markov models and its application in classification of discrete sequential data.

- Markov processes are examples of stochastic processes that generate random sequences of outcomes or states according to certain probabilities.
- Markov chains can be considered mathematical descriptions of Markov models with a discrete set of states.
- Markov chains are integer time process \(X_n,n\ge 0\) for which each random variable \(X_n\) is integer valued and depends on the past only through most recent random variable \(X_{n-1}\) for all integer \(n\geq 1\).
- \(X_n,n\in N\) is a discrete Markov chain on state space \(S={1,\ldots,M}\)
- At each time instant t,The system changes state ,and makes a transition.
- The markov chains follow the markovian and stationarity property.
- For a first order markov chain,the markov property states that the state of the system at time \(t+1\) depends only on the state of the system at time \(t\). The markov chain is also said to be memoryless due to this property.
\begin{eqnarray*} & Pr(X_{t+1} = x_{t+1} |X_{t} = {x_1 \ldots x_t} = Pr(X_{t+1} = x_{t+1} |X_{t} = x_t) \end{eqnarray*}
- A stationarity assumption is also made which implies that markov property is independent of time.
\begin{eqnarray*} & Pr(X_{t+1} = x_i | X_{t} = x_j) = P_{i,j} & \text{for $\forall$ t and $\forall i,j \in {0 \ldots M}$} \end{eqnarray*}
- Thus we are looking at processes whose sample functions are sequence of integers between \({1 \ldots M}\).
- Thus markov process is parameterized by transition probability \(P_{ij}\) and intital probability \(P_{i0}\)
- Markov chains can be represented by directed graphs,where each state is represented by a node and directed arc represents a non zero transition probability.
- If a markov chain has M states then transition probability can be represented by a \(MxM\) matrix.
\begin{eqnarray*} &T =\begin{bmatrix} P_{11} & P_{22} & \ldots &P_{1M} \\ P_{21} & P_{22} & \ldots & P_{2M} \\ \ldots \\ P_{M1} & P_{M2} & \ldots & P_{MM} \\ \end{bmatrix} \\ &\sum_{j} P_{ij} = 1 \end{eqnarray*}
- The matrix T is stochastic matrix where elements in each row sum to 1
- This implies that it is necessary for transition to occur from present state to one of the M states.
- The probability of sequence being generated by markov chain is given by
\begin{eqnarray*} &P({X}) = \pi(x_0)*\prod_{t=1}^T p(x_t | x_{t-1}) \\ & \text{$p(x_t | x_{t-1})$ is the probability of observing the sequence $x_t$ at} \\ \end{eqnarray*}
time instant t given the present state is \(t-1\)

- Let us consider a 2 models with following initial transition and probability matrix.
\begin{eqnarray*} & \pi_1 = \begin{bmatrix} 1 & 0 & 0 \\ \end{bmatrix} \\ & T_1 = \begin{bmatrix} 0.6 & 0.4 & 0 \\ 0.3 & 0.3 & 0.4 \\ 0.4 & 0.1 & 0.5 \\ \end{bmatrix}

& \pi_2 = \begin{bmatrix} 0.1 & .5 & 0.4 \end{bmatrix} \\ & T_2 = \begin{bmatrix} 0.9 & 0.05 & 0.05 \\ 0.3 & 0.1 & 0.6 \\ 0.3 & 0.5 & 0.2 \\ \end{bmatrix} \end{eqnarray*}### Sequence Classification

- The sequence generate from these two markov chains
\begin{eqnarray*} & S_1= \begin{bmatrix} 1 & 2 & 1 & 2 & 3 \\ 3 & 3 & 2 & 1 & 1 \\ \end{bmatrix} \\ & S_2= \begin{bmatrix} 3 & 2 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ \end{bmatrix} \end{eqnarray*}
- In sequence 1 since \(P_{13}=0\), we can observe there is no transition from 1 to 3 and dominant transition is expected to be from \(1->1\).
- In sequence 2 since dominant transition is from \(1->1\) we can observe a long sequence of 1's.
- We can also compute the probability that the sequence has been generate from a given markov process.
- The sequence 1 has probability of \(8.6400e^{-05}\) from the 1st model and \( 2.4300e^{-07}\) from second model.
- The sequence 2 has probability of 0 being generate from 1st model and 0.00287 from 2nd model.
- Thus if we have sequence and know it is being generate from 1 of 2 models we can always predict the model the sequence has been generated from by choosing the model which generates the maximum probability.
- Thus we can use markov chain for sequence modelling and classification.

01 /** 02 * @brief The markovChain class : markovChain is a class 03 * which encapsulates the representation of a discrete markov 04 * chain.A markov chain is composed of transition matrix 05 * and initial probability matrix 06 */ 07 class markovChain 08 { 09 public: 10 markovChain(){}; 11 12 /* _transition holds the transition probability matrix 13 * _initial holds the initial probability matrix 14 */ 15 MatrixXf _transition; 16 MatrixXf _initial; 17 18 /** 19 * @brief setModel : function to set the parameters 20 * of the model 21 * @param transition : NXN transition matrix 22 * @param initial : 1XN initial probability matrix 23 */ 24 void setModel(Mat transition,Mat initial) 25 { 26 _transition=EigenUtils::setData(transition); 27 _initial=EigenUtils::setData(initial); 28 } 29 void setModel(Mat transition,Mat initial) 30 { 31 _transition=EigenUtils::setData(transition); 32 _initial=EigenUtils::setData(initial); 33 } 34 35 /** 36 * @brief computeProbability : compute the probability 37 * that the sequence is generate from the markov chain 38 * @param sequence : is a vector of integral sequence 39 * starting from 0 40 * @return : is probability 41 */ 42 float computeProbability(vector<int> sequence) 43 { 44 float res=0; 45 float init=_initial(0,sequence[0]); 46 res=init; 47 for(int i=0;i<sequence.size()-1;i++) 48 { 49 res=res*_transition(sequence[i],sequence[i+1]); 50 } 51 return res; 52 53 } 54 55 }

### Generating a Sequence

- The idea behind generating a sequence from a markov process is to use a uniform random number generator.
- For each row of initial probability or transition matrix select state which is most likely.
- For example if the row contains values \([0.6,0.4,0]\)
- If a uniform random value generates a value between 0 and 0.6 then state 0 is returned
- If a random value between 0.6 and 1 is generated then state 1 is returned.

01 /** 02 * @brief initialRand : function to generate a radom state 03 * @param matrix : input matrix 04 * @param index ; row of matrix to consider 05 * @return 06 */ 07 int initialRand(MatrixXf matrix,int index) 08 { 09 10 float u=((float)rand())/RAND_MAX; 11 cerr << u << endl; 12 float s=matrix(0,0); 13 int i=0; 14 //select the index corresponding to the highest probability 15 //or if all the cols of matrix have transitioned 16 while(u>s & (i<matrix.cols())) 17 { 18 i=i+1; 19 s=s+matrix(index,i); 20 } 21 return i; 22 }

- First step is to use the above method to select a inital state of matrix by passing the initial probability matrix as input.
- Next random state will be selected from the transition probability by passing the transition probability matrix as input.

01 /** 02 * @brief generateSequence is a function that generates 03 * a sequence of specified length 04 * @param n : is the length of the sequence 05 * @return : is a vector of integers representing 06 * generated sequence 07 */ 08 vector<int> generateSequence(int n) 09 { 10 11 vector<int> result; 12 result.resize(n); 13 int i=0; 14 int index=0; 15 //select a random initial value of sequence 16 int init=initialRand(_initial,0); 17 result[i]=init; 18 index=init; 19 for(i=1;i<n;i++) 20 { 21 //select a random transition to next sequence state 22 index=initialRand(_transition,index); 23 result[i]=index; 24 } 25 return result; 26 }

### Code

The code can be found in https://github.com/pi19404/OpenVision in files {ImgML/markovchain.cpp} and {ImgML/markovchain.hpp} . http://www.codeproject.com/Articles/808292/Markov-chain-implementation-in-Cplusplus-using-Eig http://pi-virtualworld.blogspot.ca/2014/04/markov-chain-implementation-in-c-using.html