Hey folks - I'm definitely not a Python guy, but I am a C# guy. Can I get a hand figuring out how to recreate this code snippet in C#?

Background:
I have a tab-delimited file that holds a p-value/critical-value table (for chi-squared analysis). There are 9 columns of data. The first column is the degrees of freedom, and the subsequent columns are the critical values in order of p-value. There are no headers, i.e. the data starts on row 0. It looks like a percentiles array is being set up; it is used in the rx calculation in the snippet and again later on to add some extra stats analysis against the p-values in the table and the degrees of freedom. The bs array is used frequently. The in1 variable is the file path of the file to be enumerated.

Python
df = []
bs = []
percentiles = [[] for i in range(100)]
for line_idx, line in enumerate(in1):
    cols = line.replace('\n', '').split('\t')
    df.append(float(cols[0]))
    # bs.append(float(cols[1]))
    for j in range(9):
        percentiles[line_idx].append(float(cols[j+1]))
    rx = (percentiles[line_idx][2] + percentiles[line_idx][0] - 2*percentiles[line_idx][1]) / (percentiles[line_idx][2] - percentiles[line_idx][0])
    bs.append(rx)


What I have tried:

I tried setting up a double[][] test = new double[100][]; and translating everything directly. That compiled, but I got many runtime errors about indexes being out of range, so I don't think that approach got me very far.

I put the chi-squared file into a data table as well, figuring it might be helpful. I didn't edit any data. I can't match up the p-values 1:1 because they are at weird intervals... I was able to identify p-values of 0.05 and 0.005, but everything else ranges between a p-value of somewhere around 0.99 and 0.
Updated 6-Jan-17 0:19am
Comments
Jochen Arndt 6-Jan-17 8:50am    
A tab separated file is very similar to a CSV file (a CSV file with the TAB as separation character). So you might have a look at a C# CSV reader class that supports defining the separation character.
dfarr1 6-Jan-17 12:44pm    
Well, actually, working with the CSV and the delimiting character is pretty easy - the issue at hand is the code snippet, whose logic I just haven't been able to work out.
Jochen Arndt 6-Jan-17 13:05pm    
I'm not that familiar with Python.

But I don't see a reason to use a 2-dim array here, because only the current line index is used (provided that the percentiles array is not used again later).

The Python code uses an array with dimensions [100][10], but only the second (right-hand) indexes 0 to 2 are used by the code shown.

So it would be in C#:

double[,] percentile = new double[100, 10];
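
As a rough illustration of the point above that only the current row is needed for the bs calculation, here is a minimal sketch that computes bs line by line without keeping a percentiles table. The names ChiSquareTable and ComputeBs are hypothetical, and parsing as double with CultureInfo.InvariantCulture is an assumption (it presumes the file uses decimal points, not commas).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ChiSquareTable
{
    // Hypothetical helper: returns one rx value per tab-delimited line.
    // Column 0 is the degrees of freedom; columns 1..3 hold the first three
    // critical values, which are the only ones the rx formula touches.
    public static List<double> ComputeBs(IEnumerable<string> lines)
    {
        var bs = new List<double>();
        foreach (var line in lines)
        {
            var cols = line.Split('\t');
            double p0 = double.Parse(cols[1], CultureInfo.InvariantCulture);
            double p1 = double.Parse(cols[2], CultureInfo.InvariantCulture);
            double p2 = double.Parse(cols[3], CultureInfo.InvariantCulture);

            // Same expression as rx in the Python snippet.
            bs.Add((p2 + p0 - 2 * p1) / (p2 - p0));
        }
        return bs;
    }
}
```

It could be called as ChiSquareTable.ComputeBs(File.ReadLines(in1)), which needs using System.IO. If the percentiles are needed again later, as the question suggests, this shortcut does not apply and the values have to be stored.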

1 solution

This could be a good starting point:

C#
// Needs: using System.Collections.Generic; using System.IO;
var df = new List<float>();
var bs = new List<float>();
// One list per row; like the Python original, this assumes at most 100 rows.
var percentiles = new List<float>[100];
for (int i = 0; i < percentiles.Length; i++)
{
    percentiles[i] = new List<float>();
}

var line_idx = 0;
// in1 is the file path; File.ReadLines already strips the line terminators.
foreach (var line in File.ReadLines(in1))
{
    var cols = line.Split('\t');
    df.Add(float.Parse(cols[0]));
    // Reads cols[1]..cols[8]; the Python loop goes one column further (cols[9]).
    for (int j = 1; j < 9; j++)
    {
        percentiles[line_idx].Add(float.Parse(cols[j]));
    }

    var rx = (percentiles[line_idx][2] + percentiles[line_idx][0] - 2 * percentiles[line_idx][1]) / (percentiles[line_idx][2] - percentiles[line_idx][0]);
    bs.Add(rx);
    line_idx++;
}


There may be a typo or two, as this is just a quick snippet written on the fly.

EDIT: Linq
C#
// Needs: using System.IO; using System.Linq;
var values = File.ReadLines(in1)   // in1 is the file path
                .Select(line =>
                {
                    var cols = line.Split('\t')
                                   .Select(m => float.Parse(m))
                                   .ToArray();
                    return new
                    {
                        df = cols[0],
                        percentiles = cols.Skip(1).ToArray(),
                        bs = (cols[3] + cols[1] - 2 * cols[2]) / (cols[3] - cols[1])
                    };
                })
                .ToList();   // materialize once so the file is not re-read three times below
var df = values.Select(m => m.df).ToArray();
var bs = values.Select(m => m.bs).ToArray();
var percentiles = values.Select(m => m.percentiles).ToArray();
 
Comments
dfarr1 6-Jan-17 18:53pm    
Excellent code, Alberto. I did have to tweak the lists a little bit, but for the most part that was great. Here's what I went with:
var df = new ArrayList();
var bs = new ArrayList();
ArrayList[] percentiles = new ArrayList[100];
for (int i = 0; i < percentiles.Length; i++)
{
    percentiles[i] = new ArrayList();
}
var line_idx = 0;
foreach (var line in File.ReadLines(in1))
{
    var cols = line.Replace(Environment.NewLine, "").Split(new[] { '\t' });
    df.Add(float.Parse(cols[0]));
    for (int j = 1; j < 9; j++)
    {
        percentiles[line_idx].Add(float.Parse(cols[j]));
    }
    var rx = (Convert.ToDouble(percentiles[line_idx][2]) + Convert.ToDouble(percentiles[line_idx][0])
              - 2 * Convert.ToDouble(percentiles[line_idx][1]))
             / (Convert.ToDouble(percentiles[line_idx][2]) - Convert.ToDouble(percentiles[line_idx][0]));
    bs.Add(rx);
    line_idx++;
}
Alberto Nuti 6-Jan-17 20:02pm    
One doubt: since ArrayList is discouraged for new code and behaves much like List<object>, why use it? You have to box/unbox the values, too! Unless you have to target a .NET Framework version older than 2.0...

Also, as Jochen pointed out, a better implementation would be "double[,] percentiles = new double[100, 10];".

Now, since List<float> is a wrapper around an array of float (float[]) that grows dynamically, you can also use a "List<List<float>>(100)" and then add each row with "percentiles.Add(new List<float>(10))", if you really don't want to handle two indexes (line_idx and col_idx); the performance should be very similar to the "old style" way. Note that the constructor argument is only an initial capacity, so "percentiles[i] = ..." cannot be used before the row has been added.
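
A minimal sketch of the generic-list variant described above, assuming rows are added as they are read so that no fixed 100-row limit is needed. The names ChiSquareRows and ParseRows are hypothetical, and parsing as double with CultureInfo.InvariantCulture is an assumption rather than something taken from the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ChiSquareRows
{
    // Hypothetical helper: fills df and percentiles from tab-delimited lines.
    // Column 0 goes to df; every remaining column goes into that row's list.
    public static void ParseRows(IEnumerable<string> lines,
                                 List<double> df,
                                 List<List<double>> percentiles)
    {
        foreach (var line in lines)
        {
            var cols = line.Split('\t');
            df.Add(double.Parse(cols[0], CultureInfo.InvariantCulture));

            // The capacity argument only pre-allocates storage; the row is
            // still empty until values are added.
            var row = new List<double>(cols.Length - 1);
            for (int j = 1; j < cols.Length; j++)
            {
                row.Add(double.Parse(cols[j], CultureInfo.InvariantCulture));
            }
            percentiles.Add(row);
        }
    }
}
```

With typed lists there is no boxing, and percentiles[line_idx][2] can be used in the rx expression directly, without Convert.ToDouble. A blank line in the file would make double.Parse throw a FormatException here, so such lines would need to be skipped first.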

Last, if you can, definitely make use of Linq:

// Needs: using System.IO; using System.Linq;
var values = File.ReadLines(in1)   // in1 is the file path
                .Select(line =>
                {
                    var cols = line.Split('\t')
                                   .Select(m => float.Parse(m))
                                   .ToArray();
                    return new
                    {
                        df = cols[0],
                        percentiles = cols.Skip(1).ToArray(),
                        bs = (cols[3] + cols[1] - 2 * cols[2]) / (cols[3] - cols[1])
                    };
                })
                .ToList();   // materialize once so the file is not re-read three times below
var df = values.Select(m => m.df).ToArray();
var bs = values.Select(m => m.bs).ToArray();
var percentiles = values.Select(m => m.percentiles).ToArray();

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


