
Testing and Validation CNTK Models using C#

15 Nov 2017 · CPOL · 1 min read
Once the model is built and the Loss and Validation functions meet our expectations, we need to validate and test the model using data that was not part of the training data set (unseen data).

…continued from the previous post.

Once the model is built and the Loss and Validation functions meet our expectations, we need to validate and test the model using data that was not part of the training data set (unseen data). Model validation is very important because we want to see whether the model is trained well enough to evaluate unseen data approximately as accurately as the training data. A model that cannot predict the output for unseen data is called an overfitted model. Overfitting can happen when a model is trained for so long that it shows very high performance on the training data set but produces poor results on the testing data.
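
As a rough illustration (not part of the original sample), overfitting typically shows up as a large gap between the accuracy measured on the training data and the accuracy measured on the held-out test data. The numbers below are purely hypothetical:

C#
//hypothetical accuracy values measured after training (illustrative only)
float trainAccuracy = 0.99F;  //accuracy on the data the model was trained on
float testAccuracy = 0.70F;   //accuracy on unseen (test) data

//a large gap between the two values is a typical sign of overfitting
if (trainAccuracy - testAccuracy > 0.1F)
    Console.WriteLine("The model is probably overfitted - it memorized the training data.");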

We will continue with the implementation from the previous two posts and implement model validation. After the model is trained, the model and the trainer are passed to the evaluation method. The evaluation method loads the testing data and calculates the output using the passed model. Then it compares the calculated (predicted) values with the output from the testing data set and calculates the accuracy. The following source code shows the evaluation implementation.

C#
private static void EvaluateIrisModel(Function ffnn_model, Trainer trainer, DeviceDescriptor device)
{
    var dataFolder = "Data";//files must be in the same folder as the program
    var testPath = Path.Combine(dataFolder, "testIris_cntk.txt");
    var featureStreamName = "features";
    var labelsStreamName = "label";

    //extract features and label from the model
    var feature = ffnn_model.Arguments[0];
    var label = ffnn_model.Output;

    //stream configuration to distinct features and labels in the file
    var streamConfig = new StreamConfiguration[]
        {
            new StreamConfiguration(featureStreamName, feature.Shape[0]),
            new StreamConfiguration(labelsStreamName, label.Shape[0])
        };

    // prepare the testing data
    var testMinibatchSource = MinibatchSource.TextFormatMinibatchSource(
        testPath, streamConfig, MinibatchSource.InfinitelyRepeat, true);
    var featureStreamInfo = testMinibatchSource.StreamInfo(featureStreamName);
    var labelStreamInfo = testMinibatchSource.StreamInfo(labelsStreamName);

    int batchSize = 20;
    int miscountTotal = 0, totalCount = 0;
    while (true)
    {
        var minibatchData = testMinibatchSource.GetNextMinibatch((uint)batchSize, device);
        if (minibatchData == null || minibatchData.Count == 0)
            break;
        totalCount += (int)minibatchData[featureStreamInfo].numberOfSamples;

        // expected labels are in the mini batch data.
        var labelData = minibatchData[labelStreamInfo].data.GetDenseData<float>(label);
        var expectedLabels = labelData.Select(l => l.IndexOf(l.Max())).ToList();

        var inputDataMap = new Dictionary<Variable, Value>() {
            { feature, minibatchData[featureStreamInfo].data }
        };

        var outputDataMap = new Dictionary<Variable, Value>() {
            { label, null }
        };

        ffnn_model.Evaluate(inputDataMap, outputDataMap, device);
        var outputData = outputDataMap[label].GetDenseData<float>(label);
        var actualLabels = outputData.Select(l => l.IndexOf(l.Max())).ToList();

        int misMatches = actualLabels.Zip(expectedLabels, (a, b) => a.Equals(b) ? 0 : 1).Sum();

        miscountTotal += misMatches;
        Console.WriteLine($"Validating Model: Total Samples = {totalCount}, 
                                              Mis-classify Count = {miscountTotal}");

        if (totalCount >= 20)
            break;
    }
    Console.WriteLine($"---------------");
    Console.WriteLine($"------TESTING SUMMARY--------");
    float accuracy = (1.0F - (float)miscountTotal / totalCount);
    Console.WriteLine($"Model Accuracy = {accuracy}");
    return;
}
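
Note that the stream names used in the StreamConfiguration objects ("features" and "label") must match the stream names in the testIris_cntk.txt file. Assuming the test file uses the same CNTK text format as the training file from the previous posts, a single line with four feature values and a one-hot encoded label could look like this (the values are only illustrative):

|features 5.9 3.0 5.1 1.8 |label 0 0 1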

The implemented method is called at the end of the Training method described in the previous post.

C#
EvaluateIrisModel(ffnn_model, trainer, device);

As can be seen from the validation output, the model predicts the unseen data with high accuracy.
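
The same evaluation API can also be used to score a single unseen sample. The following is only a sketch and is not part of the original sample: the feature values are made up, and feature and label are assumed to be the input and output variables extracted from the model in the same way as in EvaluateIrisModel.

C#
//a single, previously unseen Iris sample (sepal/petal measurements - illustrative values)
var singleSample = new float[] { 6.1f, 2.8f, 4.7f, 1.2f };
var inputValue = Value.CreateBatch<float>(feature.Shape, singleSample, device);

var inMap = new Dictionary<Variable, Value>() { { feature, inputValue } };
var outMap = new Dictionary<Variable, Value>() { { label, null } };

//evaluate the trained model for the single sample
ffnn_model.Evaluate(inMap, outMap, device);

//pick the class with the highest output value
//(how the index maps to an Iris species depends on how the labels were encoded in the data file)
var scores = outMap[label].GetDenseData<float>(label)[0];
int predictedClass = scores.IndexOf(scores.Max());
Console.WriteLine($"Predicted class index: {predictedClass}");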

This is the last post in the series of blog posts about using feed-forward neural networks to train the Iris data set using CNTK and C#.

The full source code for all three samples can be found here.

Filed under: .NET, C#, CNTK, CodeProject
Tagged: .NET, C#, CNTK, Code Project, CodeProject, Machine Learning

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
Bosnia and Herzegovina
Bahrudin Hrnjica holds a Ph.D. degree in Technical Science/Engineering from the University of Bihać.
Besides teaching at the university, he has worked in the software industry for more than two decades, focusing on development technologies such as .NET, Visual Studio, and desktop/web/cloud solutions.

He works on the development and application of different ML algorithms and has more than 10 years of experience in the development of ML-oriented solutions and modeling. His fields of interest also include the development of predictive models with ML.NET and Keras, and he actively develops two ML-based .NET open-source projects: GPdotNET, a genetic programming tool, and ANNdotNET, a deep learning tool for the .NET platform. He works in multidisciplinary teams with the mission of optimizing and selecting ML algorithms to build ML models.

He is the author of several books and many online articles, writes a blog at http://bhrnjica.net, regularly holds lectures at local and regional conferences, user groups, and Code Camp gatherings, and is also the founder of the Bihac Developer Meetup Group. Microsoft has recognized his work and awarded him the prestigious Microsoft MVP title, first in 2011, which he still holds today.

Comments and Discussions

 
Re: I do have a quick question and a solution
asiwel, 21-Nov-17 13:59
I guess I got motivated and am happy to report a solution to loading batch test data from memory. Here's my version of your code now and output from a very short training run with some of my data. (My DataReader method provides Training and Test data in 1D format to Value.CreateBatch.)

public static void EvaluateModel(Function ffnn_model, Trainer trainer, DeviceDescriptor device, DataReader rdr)
{
    //extract features and label from the model
    var feature          = ffnn_model.Arguments[0];
    var label            = ffnn_model.Output;
    // get dimensions
    int inputDim         = feature.Shape.TotalSize;
    int numOutputClasses = label.Shape.TotalSize;
    //define input and output variable
    var xValues = Value.CreateBatch<float>(new NDShape(1, inputDim), Form1.rdr.GetTestFeatures(), device);
    var yValues = Value.CreateBatch<float>(new NDShape(1, numOutputClasses), Form1.rdr.GetTestLabels(), device);

    Console.WriteLine($"-----VALIDATION SUMMARY------");
    Form1.mainform.WritetoListBox(string.Format($"-----VALIDATION SUMMARY------"));

    int miscountTotal   = 0;
    int totalCount      = yValues.Data.Shape.Dimensions[2];   // gets NofCases

    var inputDataMap    = new Dictionary<Variable, Value>() { { feature, xValues} };
    var expectedDataMap = new Dictionary<Variable, Value>() { { label, yValues } };
    var outputDataMap   = new Dictionary<Variable, Value>() { { label, null } };

    var expectedData    = expectedDataMap[label].GetDenseData<float>(label);
    var expectedLabels  = expectedData.Select(l => l.IndexOf(l.Max())).ToList();

    ffnn_model.Evaluate(inputDataMap, outputDataMap, device);
    var outputData      = outputDataMap[label].GetDenseData<float>(label);
    var actualLabels    = outputData.Select(l => l.IndexOf(l.Max())).ToList();

    int misMatches      = actualLabels.Zip(expectedLabels, (a, b) => a.Equals(b) ? 0 : 1).Sum();

    miscountTotal += misMatches;
    Console.WriteLine($"Validating Model: Total Samples = {totalCount}, Mis-classify Count = {miscountTotal}");
    Form1.mainform.WritetoListBox(string.Format($"Validating Model: Total Samples = {totalCount}, Mis-classify Count = {miscountTotal}"));

    Console.WriteLine($"------TESTING SUMMARY--------");
    float accuracy = (1.0F - (float)miscountTotal / totalCount);
    Console.WriteLine($"Model Accuracy = {accuracy,6:0.0000}");

    Form1.mainform.WritetoListBox(string.Format($"------TESTING SUMMARY--------"));
    Form1.mainform.WritetoListBox(string.Format($"Model Accuracy = {accuracy,6:0.0000}"));

    //********** Crosstabs View
    // Start by converting the 0-based labels to 1 based labels
    for (int k = 0; k < actualLabels.Count; k++)
    {
        actualLabels[k]++;
        expectedLabels[k]++;
    }
    // Prepare the Label Variable value labels
    Dictionary<int, string> StatusLabels = new Dictionary<int, string>()
    {
        {1,"Successful" },{2,"At Risk: Falling" },{3,"At Risk: Rising" },{4,"At Risk: Failing" }
    };

    Console.WriteLine("------CONVENTIONAL CONFUSION MATRIX RESULTS------");
    Crosstabs ct = new Crosstabs(actualLabels, expectedLabels, "DASHBOARD STUDY");
    ct.RowLabels = StatusLabels;
    ct.ColumnLabels = StatusLabels;
    ct.WritetoConsole();
    ct.View(Form1.mainform.dataGridView1);
    return;
}


This is what some quick sample output looks like:

Epoch:    0 CrossEntropyLoss =  1.6604410, EvalCriterion =  .7505131 
Epoch:   20 CrossEntropyLoss =  1.0943360, EvalCriterion =  .3836944 
Epoch:   40 CrossEntropyLoss =  1.0075010, EvalCriterion =  .3870011 
Epoch:   60 CrossEntropyLoss =   .9473225, EvalCriterion =  .3740023 
Epoch:   80 CrossEntropyLoss =   .9008594, EvalCriterion =  .3588369 
Epoch:  100 CrossEntropyLoss =   .8622118, EvalCriterion =  .3399088 
Epoch:  120 CrossEntropyLoss =   .8282195, EvalCriterion =  .3250855 
Epoch:  140 CrossEntropyLoss =   .7975291, EvalCriterion =  .3159635 
Epoch:  160 CrossEntropyLoss =   .7694442, EvalCriterion =  .3049031 
Epoch:  180 CrossEntropyLoss =   .7435042, EvalCriterion =  .2962372 
Epoch:  200 CrossEntropyLoss =   .7194668, EvalCriterion =  .2879134 
----------------
------TRAINING SUMMARY-------- Elasped time: 00:01:14
The model trained on 8770 cases to an accuracy of 71.21%
-----VALIDATION SUMMARY------
Validating Model: Total Samples = 5847, Mis-classify Count = 1735
------TESTING SUMMARY--------
Model Accuracy = 0.7033
------CONVENTIONAL CONFUSION MATRIX RESULTS------
Crosstab matrix for DASHBOARD STUDY
CLASSIFIED              LABELLED
  Values       1       2       3       4  RowSum
       1    3078     287     696       4    4065  precision: 0.7572
       2      31     951     176     538    1696  precision: 0.5607
       3       0       2       5       0       7  precision: 0.7143
       4       0       0       1      78      79  precision: 0.9873
  ColSum    3109    1240     878     620    5847
recall:   0.9900  0.7669  0.0057  0.1258  
Accuracy: 0.7033


Hope this might be useful to you or other readers.
