R Statistics Language API to VB.NET Language

22 Mar 2016 | CPOL | 6 min read
Hybrid programming between .NET languages and the R language

Introduction

In my recent work, I needed a library to draw the correlation between genes and phenotypes in a bacterial genome. A heatmap is a good choice for this, and many good plotting libraries already exist for the R language, so my research called for hybrid programming between .NET and R.

The RDotNET project makes hybrid programming between R and .NET languages possible, but it is still not very convenient to program with. So I decided to develop this project to make R hybrid programming more efficient.

Visit RDotNET home: https://rdotnet.codeplex.com/

Declare an R API

The R API is Different from the Win32 API

When we declare a Win32 API, we use the DllImport attribute, for example:

VB.NET
<DllImport("kernel32.dll", EntryPoint:="LoadLibrary", SetLastError:=True)> _
Private Shared Function InternalLoadLibrary(
              <MarshalAs(UnmanagedType.LPStr)> lpFileName As String) As IntPtr
End Function

Or, in the old VB6 style:

VB.NET
Public Declare Function InternalLoadLibrary _
       Lib "kernel32.dll" _
       Alias "LoadLibrary" _
     (<MarshalAs(UnmanagedType.LPStr)> lpFileName As String) As IntPtr

The situation in R is different: a function in R is an object, just as a method in VisualBasic is an object too. In other words, everything is an object.

A Simple R API Example

A basic R API without any parameters can be an empty class, like:

VB.NET
<RFunc("heatmap.2")>
Public Class heatmap2 : Inherits IRToken
End Class

This declares an entry point for the following R function call:

R
heatmap.2()

If we want to add parameters to the API, just add properties to the class:

VB.NET
<RFunc("heatmap.2")>
Public Class heatmap2 : Inherits IRToken
    Public Property x As RExpression
    Public Property Rowv As Boolean = True
    Public Property Colv As RExpression = [TRUE]
    Public Property col As RExpression = "rev(brewer.pal(10,""RdYlBu""))"
    Public Property revC As RExpression = [TRUE]
    Public Property scale As RExpression = "row"
    Public Property margins As RExpression = c(15, 15)
    Public Property key As Boolean = True
    <Parameter("density.info")>
    Public Property densityInfo As String = Rstring("none")

End Class

The function call generated from this class then looks like:

R
heatmap.2(x,
Rowv = TRUE,
Colv = TRUE,
col = rev(brewer.pal(10,"RdYlBu")),
revC = TRUE,
scale = "row",
margins = c(15,15),
key = TRUE,
density.info = "none")

API Details

API Entry Point

The RFunc attribute defines an R API entry point, just as DllImport does for a Win32 API. With the RFunc attribute, we can declare a function name that contains a dot, which is illegal in a VisualBasic identifier.

Tweak on the Name

If a parameter name contains a dot, which is not allowed in a VisualBasic identifier, use the Parameter attribute to declare an alias for the parameter name.

In addition, if a property of your API class is not a parameter, use the Ignored attribute to hide that property from the API builder.
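
As a rough illustration of how these three attributes could work together, the sketch below declares stand-in versions of RFunc, Parameter and Ignored and reads them back through reflection. The attribute definitions, the DemoHeatmap class and the RName helper are assumptions made for this example; the real definitions in the project may differ.

```vbnet
Imports System
Imports System.Reflection

' Hypothetical stand-ins for the article's attributes (assumed shapes).
<AttributeUsage(AttributeTargets.Class)>
Public Class RFunc : Inherits Attribute
    Public ReadOnly Property Name As String
    Sub New(name As String)
        Me.Name = name
    End Sub
End Class

<AttributeUsage(AttributeTargets.Property)>
Public Class Parameter : Inherits Attribute
    Public ReadOnly Property Name As String
    Sub New(name As String)
        Me.Name = name
    End Sub
End Class

<AttributeUsage(AttributeTargets.Property)>
Public Class Ignored : Inherits Attribute
End Class

' Hypothetical API class: one aliased parameter, one masked property.
<RFunc("heatmap.2")>
Public Class DemoHeatmap
    <Parameter("density.info")>
    Public Property densityInfo As String = "none"
    <Ignored>
    Public Property Comment As String = "not an R parameter"
End Class

Public Module AttributeDemo
    ' R-side name of a property: the alias when present, else the property name.
    Public Function RName(prop As PropertyInfo) As String
        Dim attr As Parameter = prop.GetCustomAttribute(Of Parameter)()
        Return If(attr Is Nothing, prop.Name, attr.Name)
    End Function

    Public Sub Main()
        Dim type As Type = GetType(DemoHeatmap)
        ' The function name comes from the class-level attribute.
        Console.WriteLine(type.GetCustomAttribute(Of RFunc)().Name)
        ' Only properties without the Ignored attribute are listed.
        For Each prop As PropertyInfo In type.GetProperties()
            If prop.GetCustomAttribute(Of Ignored)() Is Nothing Then
                Console.WriteLine(RName(prop))
            End If
        Next
    End Sub
End Module
```

The point of the alias is that the VisualBasic side keeps a legal identifier (densityInfo) while the generated script can still use the dotted R name.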

IRToken Wrapper

Finally, your R API class can optionally inherit from the IRToken object, because a set of extension methods for R scripting has been defined for the IRToken wrapper. The API builder can then serialize your API into an R script through the extension method:

VB.NET
Me.GetScript(Me.GetType)

' Or, more simply:

' If your R API classes have an inheritance relationship, this generic form is
' not recommended in the base class, because the generic method always uses the
' type information of the base class, which produces an incorrect script.

Me.GetScript
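
The difference between the two calls can be reproduced without the library. In this minimal sketch, StaticName and RuntimeName are hypothetical helpers (not part of the project): the generic version only sees the compile-time type, while the other one asks the instance for its runtime type.

```vbnet
Imports System

Public Class BaseCall
End Class

Public Class DerivedCall : Inherits BaseCall
End Class

Public Module TypeSourceDemo
    ' Generic version: T is fixed at compile time by the declared type of the argument.
    Public Function StaticName(Of T)(token As T) As String
        Return GetType(T).Name
    End Function

    ' Non-generic version: asks the instance itself, so derived classes are seen.
    Public Function RuntimeName(token As Object) As String
        Return token.GetType().Name
    End Function

    Public Sub Main()
        ' Declared as the base class, but actually an instance of the derived class.
        Dim token As BaseCall = New DerivedCall()
        Console.WriteLine(StaticName(token))   ' BaseCall
        Console.WriteLine(RuntimeName(token))  ' DerivedCall
    End Sub
End Module
```

This is why passing Me.GetType explicitly is the safer form inside a base class: reflection then runs over the properties of the derived API class.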

Why Choose Class as API?

Conveniently Share Your Script With Your Friend or Archive Script Model

Drawing an R image often needs a lot of parameter adjustments. If you want to share the R script with your friends after tweaking the parameters, you just serialize your script object into a json file and e-mail it to them; your friend then loads the script by json deserialization and makes further tweaks.
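
A minimal sketch of this round trip is shown here. It assumes System.Text.Json (available from .NET Core 3.0 onwards) as the serializer, which is my choice for the example and not necessarily what the project uses; PlotSettings is a hypothetical parameter class with plain property types only.

```vbnet
Imports System
Imports System.Text.Json

' Hypothetical parameter model; property names follow the heatmap example.
Public Class PlotSettings
    Public Property scale As String = "row"
    Public Property revC As Boolean = True
    Public Property width As Integer = 8000
End Class

Public Module ShareDemo
    ' Serialize the tweaked parameters to a JSON string.
    Public Function Save(settings As PlotSettings) As String
        Return JsonSerializer.Serialize(Of PlotSettings)(settings)
    End Function

    ' Rebuild the parameter object from the JSON string.
    Public Function Load(json As String) As PlotSettings
        Return JsonSerializer.Deserialize(Of PlotSettings)(json)
    End Function

    Public Sub Main()
        Dim mine As New PlotSettings With {.width = 6500}
        Dim json As String = Save(mine)
        Console.WriteLine(json)

        ' The receiver loads the same settings and keeps tweaking.
        Dim yours As PlotSettings = Load(json)
        yours.revC = False
        Console.WriteLine(Save(yours))
    End Sub
End Module
```

In practice the JSON string would be written to a file (for example with System.IO.File.WriteAllText) before it is sent. Properties of custom types such as RExpression would probably need extra converter support, which this sketch does not cover.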

Makes Programming in Visual Basic Easier

As you can see, most R functions have many tunable parameters, so if the API were written as a function, your program would have to declare a long parameter list whenever it calls into R.

As for me, I prefer using a class to carry the many parameters into the function.

VB.NET
' Inconvenient style: a long list of parameters
Function example(path, format, blablabla...) As Type
End Function

' Convenient style: only a few parameters
' All of the blablabla parameters are passed in through the RAPI object.
Function example(path, format, RAPI) As Type

End Function

Here RAPI is a class, and its properties are the many blablabla parameters of the function above.

An API Can Inherit from Another API, Which Makes Overloaded R Functions Easier to Define

For example, the grDevices namespace in R contains several image format APIs, such as bmp, jpeg, png and tiff. These R functions share some common parameters for drawing an image, so when designing the API you only need to declare a base class for the common parameters and a subclass for each function's unique parameters. This inheritance relationship makes your program clearer and simpler.

Yeah, the class-based function API for R makes your program structure clearer!
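
The base class and subclass idea can be sketched as follows. ImageDevice, TiffDevice and JpegDevice are hypothetical classes written for this example and are not the project's grDevices wrappers; they build the call text by hand instead of going through the API builder. The compression and quality arguments are real parameters of R's tiff and jpeg functions.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical base class holding the parameters shared by all image devices.
Public MustInherit Class ImageDevice
    Public Property filename As String
    Public Property width As Integer = 480
    Public Property height As Integer = 480

    ' R function name, supplied by each subclass.
    Protected MustOverride ReadOnly Property FuncName As String

    ' Arguments common to every device; subclasses append their own.
    Protected Overridable Function Arguments() As List(Of String)
        Dim args As New List(Of String)
        args.Add("filename=""" & filename & """")
        args.Add("width=" & width.ToString())
        args.Add("height=" & height.ToString())
        Return args
    End Function

    Public Function RScript() As String
        Return FuncName & "(" & String.Join(", ", Arguments()) & ")"
    End Function
End Class

Public Class TiffDevice : Inherits ImageDevice
    Public Property compression As String = "none"

    Protected Overrides ReadOnly Property FuncName As String
        Get
            Return "tiff"
        End Get
    End Property

    Protected Overrides Function Arguments() As List(Of String)
        Dim args As List(Of String) = MyBase.Arguments()
        args.Add("compression=""" & compression & """")
        Return args
    End Function
End Class

Public Class JpegDevice : Inherits ImageDevice
    Public Property quality As Integer = 75

    Protected Overrides ReadOnly Property FuncName As String
        Get
            Return "jpeg"
        End Get
    End Property

    Protected Overrides Function Arguments() As List(Of String)
        Dim args As List(Of String) = MyBase.Arguments()
        args.Add("quality=" & quality.ToString())
        Return args
    End Function
End Class

Public Module DeviceDemo
    Public Sub Main()
        Dim image As New TiffDevice With {.filename = "heatmap.tiff", .width = 8000, .height = 6500}
        Console.WriteLine(image.RScript())
    End Sub
End Module
```

Each subclass only states what is unique to its R function, so adding another format such as png or bmp means writing one more small class.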

API Builder

Image 1

R Script Token

Here I have defined a set of abstract classes as the R API token, which come with a series of wrapper extension methods.

VB.NET
''' <summary>
''' The most basic abstract object that provides script statements.
''' </summary>
''' <remarks>It provides the script statement to execute through a single function.</remarks>
Public MustInherit Class IRProvider
    Implements IScriptProvider

    Dim __requires As String()

    ''' <summary>
    ''' The names of the R packages that this script file requires,
    ''' that is, the list of R packages that need to be loaded.
    ''' </summary>
    ''' <returns></returns>
    <Ignored> Public Overridable Property Requires As String()
        Get
            Return __requires
        End Get
        Protected Set(value As String())
            __requires = value
        End Set
    End Property

    ''' <summary>
    ''' Get R Script text from this R script object build model.
    ''' </summary>
    ''' <returns></returns>
    ''' <remarks></remarks>
    Public MustOverride Function RScript() As String Implements IScriptProvider.RScript

    Public Overrides Function ToString() As String
        Return RScript()
    End Function

    Public Shared Narrowing Operator CType(R As IRProvider) As String
        Return R.RScript
    End Operator
End Class

''' <summary>
''' A single-step function call in R.
''' </summary>
Public Class IRToken : Inherits IRProvider
    Implements IScriptProvider

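    ''' <summary>
    ''' 
    ''' </summary>
    ''' <returns>Because this object is only an abstraction of an expression (most
    ''' commonly of a function call), library statements cannot be added automatically
    ''' here; you need to add them manually afterwards.</returns>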
    Public Overrides Function RScript() As String
        Return Me.GetScript([GetType])
    End Function

    Public Overloads Shared Narrowing Operator CType(token As IRToken) As String
        Return token.RScript
    End Operator

    Public Shared Operator &(token As IRToken, script As String) As String
        Return token.RScript & script
    End Operator

    Public Shared Operator &(script As String, token As IRToken) As String
        Return script & token.RScript
    End Operator
End Class

The Builder API

Building an R API from a class object is based on the System.Reflection methods, so the API builder takes two basic parameters:

VB.NET
<Extension>
Public Function GetScript(token As Object, Optional type As Type = Nothing) As String
    If token Is Nothing Then
         Throw New NullReferenceException("Script tokens is nothing!")
    End If

    If type Is Nothing Then
         type = token.GetType
    End If

    Return __getScript(token, type)

End Function

First, the token parameter provides the R function object instance, that is, an instance of the class we defined in the previous section as the R function entry point.

Second, reflection needs a source of type information. This comes from the type parameter, from which we can obtain the property information and the class information.

Getting the API name only requires reading the RFunc custom attribute that we declared on the class definition:

VB.NET
''' <summary>
''' GET API name
''' </summary>
''' <param name="type"></param>
''' <returns></returns>
<Extension> Public Function GetAPIName(type As Type) As String
    Dim name As RFunc = type.GetAttribute(Of RFunc) ' Get function name

    If name Is Nothing Then
        Dim ex As New Exception(IsNotAFunc)
        ex = New Exception(type.FullName, ex)
        Throw ex
    Else
        Return name.Name
    End If

End Function

Since all of the R function parameters take the form of class properties, the next step of the builder is to get all of the readable properties.

Furthermore, to hide a property from the builder, we skip every property marked with the Ignored attribute. A Linq expression can do this job:

VB.NET
Dim props = (From prop As PropertyInfo In type.GetProperties
             Where prop.GetAttribute(Of Ignored) Is Nothing AndAlso
                  prop.CanRead
             Let param As Parameter = prop.GetAttribute(Of Parameter)
             Select prop,
                   func = prop.__getName(param),
                   param.__isOptional,
                   param
             Order By __isOptional Ascending)
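
To show the whole pipeline in one place, here is a simplified, self-contained sketch of a reflection-based builder. It is not the project's __getScript: the RFunc and Ignored attributes, the ReadCsvCall class and the BuildCall and FormatValue helpers are all assumptions written for this example, and parameter aliases, optional-parameter ordering and RExpression values are left out.

```vbnet
Imports System
Imports System.Globalization
Imports System.Linq
Imports System.Reflection

' Hypothetical stand-ins for the article's attributes.
<AttributeUsage(AttributeTargets.Class)>
Public Class RFunc : Inherits Attribute
    Public ReadOnly Property Name As String
    Sub New(name As String)
        Me.Name = name
    End Sub
End Class

<AttributeUsage(AttributeTargets.Property)>
Public Class Ignored : Inherits Attribute
End Class

' Hypothetical API class for R's read.csv(file, header, sep).
<RFunc("read.csv")>
Public Class ReadCsvCall
    Public Property file As String
    Public Property header As Boolean = True
    Public Property sep As String = ","
    <Ignored>
    Public Property Note As String = "not an R parameter"
End Class

Public Module BuilderSketch
    ' Map a .NET value to its R literal: TRUE/FALSE, quoted string, or plain text.
    Private Function FormatValue(value As Object) As String
        If TypeOf value Is Boolean Then
            Return If(CBool(value), "TRUE", "FALSE")
        ElseIf TypeOf value Is String Then
            Return """" & CStr(value) & """"
        Else
            Return Convert.ToString(value, CultureInfo.InvariantCulture)
        End If
    End Function

    ' Build "name(arg=value, ...)" from the readable, non-ignored properties.
    Public Function BuildCall(token As Object) As String
        Dim type As Type = token.GetType()
        Dim func As RFunc = type.GetCustomAttribute(Of RFunc)()
        If func Is Nothing Then
            Throw New ArgumentException(type.FullName & " has no RFunc attribute")
        End If

        Dim args = From prop As PropertyInfo In type.GetProperties()
                   Where prop.CanRead AndAlso prop.GetCustomAttribute(Of Ignored)() Is Nothing
                   Let value As Object = prop.GetValue(token)
                   Where value IsNot Nothing
                   Select prop.Name & "=" & FormatValue(value)

        Return func.Name & "(" & String.Join(", ", args) & ")"
    End Function

    Public Sub Main()
        Dim readCall As New ReadCsvCall With {.file = "data.csv"}
        Console.WriteLine(BuildCall(readCall))
    End Sub
End Module
```

Properties whose value is Nothing are skipped, so R falls back to its own default for those arguments. The order of the arguments follows whatever order reflection returns the properties in.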

Important Notes on Data Types

Some data types need special attention.

1. Bool Logical Value

The Boolean logical values in the R language are the all-uppercase words TRUE and FALSE (or T and F). Unlike VisualBasic, the R language is case sensitive, so we need a mapping between the logical values of the two languages.

VB.NET
Public Structure RBoolean : Implements IScriptProvider

      Public Shared ReadOnly Property [TRUE] As New RBoolean(RScripts.TRUE)
      Public Shared ReadOnly Property [FALSE] As New RBoolean(RScripts.FALSE)

      ReadOnly __value As String

      Sub New(value As String)
          __value = value
      End Sub

      Public Function RScript() As String Implements IScriptProvider.RScript
          Return __value
      End Function

 End Structure

2. String Value Type

For example, a string value in VisualBasic is:

VB.NET
Dim s As String = "abc"

When we write this variable to a text file, the content is just abc; the two quote characters are gone. The same thing happens when we write an R script:

A function in the R script requires a string value wrapped in two quote characters, but when we write the script those quote characters disappear as well. So the string must be processed before the script is written:

VB.NET
Public Function Rstring(s As String) As String
        Return $"""{s}"""
End Function
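
One case the function above does not handle is a value that itself contains a quote or a backslash, which would produce an invalid R string literal. The following variant is a sketch of one way to cover it; RstringEscaped is a hypothetical name and is not part of the project.

```vbnet
Imports System

Public Module RStringSketch
    ' Wrap a value in double quotes for R, escaping backslashes and embedded quotes.
    Public Function RstringEscaped(s As String) As String
        ' Backslashes first, so the escapes added for quotes are not doubled again.
        Dim escaped As String = s.Replace("\", "\\").Replace("""", "\""")
        Return """" & escaped & """"
    End Function

    Public Sub Main()
        Console.WriteLine(RstringEscaped("none"))
        Console.WriteLine(RstringEscaped("a ""quoted"" word"))
        Console.WriteLine(RstringEscaped("C:\data\file.csv"))
    End Sub
End Module
```

For file paths, doubling the backslashes is an alternative to replacing them with forward slashes; either form gives R a valid string literal.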

3. Expression as Parameter

An R expression can be represented directly as a string, which is written into the script as-is, without surrounding quotes.
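
The property declarations shown earlier assign plain strings to properties of an expression type. One way such an assignment can work is a widening conversion operator, sketched below with a hypothetical RExpr class; the project's actual RExpression type may be implemented differently.

```vbnet
Imports System

' Hypothetical expression wrapper: holds raw R code that must not be quoted.
Public Class RExpr
    Private ReadOnly _script As String

    Sub New(script As String)
        _script = script
    End Sub

    Public Function RScript() As String
        Return _script
    End Function

    ' Allows: Dim e As RExpr = "c(15, 15)"
    Public Shared Widening Operator CType(script As String) As RExpr
        Return New RExpr(script)
    End Operator
End Class

Public Module RExprDemo
    Public Sub Main()
        ' The string is converted implicitly and emitted without extra quotes.
        Dim col As RExpr = "rev(brewer.pal(10,""RdYlBu""))"
        Dim margins As RExpr = "c(15, 15)"
        Console.WriteLine("col=" & col.RScript())
        Console.WriteLine("margins=" & margins.RScript())
    End Sub
End Module
```

Keeping expressions in their own type lets the builder tell them apart from String values, which do need the surrounding quotes.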

4. String as File Path

Because the \ character is the escape character in R string literals, just as in C/C++, a \ character in a file path will cause an error in the R language. An easy way of dealing with this is to replace every \ character with /.

VB.NET
''' <summary>
'''
''' </summary>
''' <param name="file"></param>
''' <param name="extendsFull">Whether to convert to a full path. Not converted by default.</param>
''' <returns></returns>
<Extension>
Public Function UnixPath(file As String, Optional extendsFull As Boolean = False) As String
    If String.IsNullOrEmpty(file) Then
        Return ""
    End If
    If extendsFull Then
        file = FileIO.FileSystem.GetFileInfo(file).FullName
    End If
    Return file.Replace("\"c, "/"c)

End Function

Finally, we can write a function that applies the additional processing for each data type in the API builder:

VB.NET
<Extension>
Private Function __getValue(type As Type, value As Object, valueType As ValueTypes) As String
    If value Is Nothing Then
        Return Nothing
    End If

    Select Case type

        Case GetType(String)

            If valueType = ValueTypes.Path Then
                Return Rstring(Scripting.ToString(value).UnixPath)
            Else
                Return Rstring(Scripting.ToString(value))
            End If
        Case GetType(Boolean)
            If True = DirectCast(value, Boolean) Then
                Return RBoolean.TRUE.__value
            Else
                Return RBoolean.FALSE.__value
            End If
        Case GetType(RExpression)
            Return DirectCast(value, RExpression).RScript
        Case Else
            Return Scripting.ToString(value)
    End Select
End Function

Example: Drawing heatmap in VisualBasic

An example of drawing a heatmap with the R language can be found at http://flowingdata.com/2010/01/21/how-to-make-a-heatmap-a-quick-and-easy-solution/.

Image 2

Based on this example, we can create an R API wrapper:

VB.NET
Imports System.Text
Imports System.IO
Imports Microsoft.VisualBasic.DocumentFormat.Csv.DocumentStream.Tokenizer
Imports Microsoft.VisualBasic.Linq
Imports Microsoft.VisualBasic
Imports RDotNet.Extensions.VisualBasic
Imports RDotNet.Extensions.VisualBasic.utils.read.table
Imports RDotNet.Extensions.VisualBasic.stats
Imports RDotNet.Extensions.VisualBasic.Graphics
Imports RDotNet.Extensions.VisualBasic.grDevices

Public Class Heatmap : Inherits IRScript

    Const df As String = "df"

    ''' <summary>
    ''' Column name of the row factor in the csv file that represents the row name. 
    ''' Default is the first column.
    ''' </summary>
    ''' <returns></returns>
    Public Property rowNameMaps As String
    ''' <summary>
    ''' File path of the csv file
    ''' </summary>
    ''' <returns></returns>
    Public Property dataset As readcsv
    Public Property heatmap As heatmap_plot
    ''' <summary>
    ''' Output path of the tiff file
    ''' </summary>
    ''' <returns></returns>
    Public Property image As grDevice

    Sub New()
        Requires = {"RColorBrewer"}
    End Sub

    ''' <summary>
    ''' 
    ''' </summary>
    ''' <returns></returns>
    ''' <remarks>
    ''' http://joseph.yy.blog.163.com/blog/static/50973959201285102114376/
    ''' </remarks>
    Protected Overrides Function __R_script() As String
        Dim script As StringBuilder = New StringBuilder()
        Call script.AppendLine($"{df} <- " & dataset)
        Call script.AppendLine($"row.names({df}) <- {df}${__getRowNames()}")
        Call script.AppendLine($"{df}<-{df}[,-1]")
        Call script.AppendLine("df <- data.matrix(df)")

        heatmap.x = df

        If Not heatmap.Requires Is Nothing Then
            For Each ns As String In heatmap.Requires
                Call script.AppendLine(RScripts.library(ns))
            Next
        End If

        Call script.AppendLine(image.Plot("result <- " & heatmap))

        Return script.ToString
    End Function
End Class

Using this heatmap API requires three parameters:

1. Define the heatmap data source, which comes from reading a csv file

VB.NET
Property dataset As readcsv

To read data from a location, just construct an instance of the read.csv API class, like:

VB.NET
dataset = New readcsv("http://datasets.flowingdata.com/ppg2008.csv")

2. Define the heatmap drawing method; the available APIs can be found in the gplots or stats namespace

VB.NET
Property heatmap As heatmap_plot

3. Define the location where the heatmap image is saved

VB.NET
Property image As grDevice

In addition, a set of image file format APIs has been defined in the namespace RDotNET.Extensions.VisualBasic.grDevices.

Like: grDevices.bmp, grDevices.jpeg, grDevices.png, grDevices.tiff

To use one of these R APIs, simply construct an object instance like:

VB.NET
Dim image As grDevice = New tiff("imagefile.tiff", 8000, 6500)

You can download this example from github:

Go ahead and try it!

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Technical Lead, PANOMIX
China
He is good and loves VisualBasic! Senior data scientist at PANOMIX


github: https://github.com/xieguigang
