Markov chain implementation in C++ using Eigen

pi19404

5.00/5 (2 votes)

17 Aug 2014CPOL4 min read

29.6K

In this article we will look at markov models and its application in classification of discrete sequential data.

Introduction

In this article we will look at markov models and its application in classification of discrete sequential data.

Markov processes are examples of stochastic processes that generate random sequences of outcomes or states according to certain probabilities.
Markov chains can be considered mathematical descriptions of Markov models with a discrete set of states.
Markov chains are integer time process $X_n,n\ge 0$ for which each random variable $X_n$ is integer valued and depends on the past only through most recent random variable $X_{n-1}$ for all integer $n\geq 1$.
$X_n,n\in N$ is a discrete Markov chain on state space $S={1,\ldots,M}$
At each time instant t,The system changes state ,and makes a transition.
The markov chains follow the markovian and stationarity property.
For a first order markov chain,the markov property states that the state of the system at time $t+1$ depends only on the state of the system at time $t$. The markov chain is also said to be memoryless due to this property.
\begin{eqnarray*} & Pr(X_{t+1} = x_{t+1} |X_{t} = {x_1 \ldots x_t} = Pr(X_{t+1} = x_{t+1} |X_{t} = x_t) \end{eqnarray*}
A stationarity assumption is also made which implies that markov property is independent of time.
\begin{eqnarray*} & Pr(X_{t+1} = x_i | X_{t} = x_j) = P_{i,j} & \text{for $\forall$ t and $\forall i,j \in {0 \ldots M}$} \end{eqnarray*}
Thus we are looking at processes whose sample functions are sequence of integers between ${1 \ldots M}$.
Thus markov process is parameterized by transition probability $P_{ij}$ and intital probability $P_{i0}$
Markov chains can be represented by directed graphs,where each state is represented by a node and directed arc represents a non zero transition probability.
If a markov chain has M states then transition probability can be represented by a $MxM$ matrix.
\begin{eqnarray*} &T =\begin{bmatrix} P_{11} & P_{22} & \ldots &P_{1M} \\ P_{21} & P_{22} & \ldots & P_{2M} \\ \ldots \\ P_{M1} & P_{M2} & \ldots & P_{MM} \\ \end{bmatrix} \\ &\sum_{j} P_{ij} = 1 \end{eqnarray*}
The matrix T is stochastic matrix where elements in each row sum to 1
This implies that it is necessary for transition to occur from present state to one of the M states.
The probability of sequence being generated by markov chain is given by
\begin{eqnarray*} &P({X}) = \pi(x_0)*\prod_{t=1}^T p(x_t | x_{t-1}) \\ & \text{$p(x_t | x_{t-1})$ is the probability of observing the sequence $x_t$ at} \\ \end{eqnarray*}

time instant t given the present state is $t-1$
Let us consider a 2 models with following initial transition and probability matrix.
\begin{eqnarray*} & \pi_1 = \begin{bmatrix} 1 & 0 & 0 \\ \end{bmatrix} \\ & T_1 = \begin{bmatrix} 0.6 & 0.4 & 0 \\ 0.3 & 0.3 & 0.4 \\ 0.4 & 0.1 & 0.5 \\ \end{bmatrix}
& \pi_2 = \begin{bmatrix} 0.1 & .5 & 0.4 \end{bmatrix} \\ & T_2 = \begin{bmatrix} 0.9 & 0.05 & 0.05 \\ 0.3 & 0.1 & 0.6 \\ 0.3 & 0.5 & 0.2 \\ \end{bmatrix} \end{eqnarray*}

Sequence Classification
The sequence generate from these two markov chains
\begin{eqnarray*} & S_1= \begin{bmatrix} 1 & 2 & 1 & 2 & 3 \\ 3 & 3 & 2 & 1 & 1 \\ \end{bmatrix} \\ & S_2= \begin{bmatrix} 3 & 2 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ \end{bmatrix} \end{eqnarray*}
In sequence 1 since $P_{13}=0$, we can observe there is no transition from 1 to 3 and dominant transition is expected to be from $1->1$.
In sequence 2 since dominant transition is from $1->1$ we can observe a long sequence of 1's.
We can also compute the probability that the sequence has been generate from a given markov process.
The sequence 1 has probability of $8.6400e^{-05}$ from the 1st model and $ 2.4300e^{-07}$ from second model.
The sequence 2 has probability of 0 being generate from 1st model and 0.00287 from 2nd model.
Thus if we have sequence and know it is being generate from 1 of 2 models we can always predict the model the sequence has been generated from by choosing the model which generates the maximum probability.
Thus we can use markov chain for sequence modelling and classification.

01	/**
02	 * @brief The markovChain class : markovChain is a class
03	 * which encapsulates the representation of a discrete markov
04	 * chain.A markov chain is composed of transition matrix
05	 * and initial probability matrix
06	 */
07	class markovChain
08	{
09	public:
10	    markovChain(){};
11	     
12	    /* _transition holds the transition probability matrix
13	     * _initial holds the initial probability matrix
14	     */
15	    MatrixXf _transition;
16	    MatrixXf _initial;
17	 
18	     /**
19	     * @brief setModel : function to set the parameters
20	     * of the model
21	     * @param transition : NXN transition matrix
22	     * @param initial : 1XN initial probability matrix
23	     */
24	  void setModel(Mat transition,Mat initial)
25	    {
26	        _transition=EigenUtils::setData(transition);
27	        _initial=EigenUtils::setData(initial);
28	    }
29	    void setModel(Mat transition,Mat initial)
30	    {
31	        _transition=EigenUtils::setData(transition);
32	        _initial=EigenUtils::setData(initial);
33	    }
34	 
35	    /**
36	     * @brief computeProbability : compute the probability
37	     * that the sequence is generate from the markov chain
38	     * @param sequence : is a vector of integral sequence
39	     *      starting from 0
40	     * @return         : is probability
41	     */
42	    float computeProbability(vector<int> sequence)
43	    {
44	        float res=0;
45	        float init=_initial(0,sequence[0]);
46	        res=init;
47	        for(int i=0;i<sequence.size()-1;i++)
48	        {
49	            res=res*_transition(sequence[i],sequence[i+1]);
50	        }
51	        return res;
52	 
53	    }
54	     
55	}

Generating a Sequence

The idea behind generating a sequence from a markov process is to use a uniform random number generator.
For each row of initial probability or transition matrix select state which is most likely.
For example if the row contains values $[0.6,0.4,0]$
If a uniform random value generates a value between 0 and 0.6 then state 0 is returned
If a random value between 0.6 and 1 is generated then state 1 is returned.

01	/**
02	 * @brief initialRand : function to generate a radom state
03	 * @param matrix      : input matrix
04	 * @param index       ; row of matrix to consider
05	 * @return
06	 */
07	int initialRand(MatrixXf matrix,int index)
08	{
09	 
10	    float u=((float)rand())/RAND_MAX;
11	    cerr << u << endl;
12	    float s=matrix(0,0);
13	    int i=0;
14	    //select the index corresponding to the highest probability
15	    //or if all the cols of matrix have transitioned
16	    while(u>s & (i<matrix.cols()))
17	    {
18	        i=i+1;
19	        s=s+matrix(index,i);
20	    }
21	    return i;
22	}

First step is to use the above method to select a inital state of matrix by passing the initial probability matrix as input.
Next random state will be selected from the transition probability by passing the transition probability matrix as input.

01	/**
02	 * @brief generateSequence is a function that generates
03	 * a sequence of specified length
04	 * @param n : is the length of the sequence
05	 * @return : is a vector of integers representing
06	 * generated sequence
07	 */
08	vector<int> generateSequence(int n)
09	{
10	 
11	    vector<int> result;
12	    result.resize(n);
13	    int i=0;
14	    int index=0;
15	    //select a random initial value of sequence
16	    int init=initialRand(_initial,0);
17	    result[i]=init;
18	    index=init;
19	    for(i=1;i<n;i++)
20	    {
21	    //select a random transition to next sequence state
22	    index=initialRand(_transition,index);
23	    result[i]=index;
24	    }
25	    return result;
26	}

Code

The code can be found in https://github.com/pi19404/OpenVision in files {ImgML/markovchain.cpp} and {ImgML/markovchain.hpp} . http://www.codeproject.com/Articles/808292/Markov-chain-implementation-in-Cplusplus-using-Eig http://pi-virtualworld.blogspot.ca/2014/04/markov-chain-implementation-in-c-using.html

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Markov chain implementation in C++ using Eigen

Introduction

Sequence Classification

Generating a Sequence

Code

License