I have a dataset with 1000 records.

This is the list used to store the randomly selected records:

C#
private static List<string>[] BChrom = new List<string>[10];


How can I randomly add 20% of the whole dataset to that list of strings?

C#
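try
{
    using (var sr = new StreamReader(@"C:\Users\*****\Documents\sub0000.data"))
    {
        for (int i = 0; i < BChrom.Length; i++)
        {
            // this is where the randomly selected records should be added
        }
    }
}
catch (IOException ex)
{
    Console.WriteLine(ex.Message); // the file is missing or unreadable
}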
Comments
Herman<T>.Instance 22-Jul-13 8:07am    
Please read this

1 solution

Try this:
C#
ArrayList ReadRandom(string sourceFile, int sampleSize)
{
    ArrayList BChrom = new ArrayList(sampleSize);
    Random random = new Random();

    // using blocks close the stream and the reader even if an exception is thrown
    using (FileStream ifs = new FileStream(sourceFile, FileMode.Open, FileAccess.Read))
    using (StreamReader sr = new StreamReader(ifs))
    {
        // determine extent of source file
        long lastPos = ifs.Length;

        for (int i = 0; i < sampleSize; ++i)
        {
            // generate a random position
            double pct = random.NextDouble(); // [0.0, 1.0)
            long randomPos = (long)(pct * lastPos);
            if (pct >= 0.99)
                randomPos = Math.Max(0L, randomPos - 1024); // if near the end, back up a bit, but not past the start

            ifs.Seek(randomPos, SeekOrigin.Begin);
            sr.DiscardBufferedData(); // drop whatever the reader buffered before the seek

            sr.ReadLine();               // consume curr partial line
            string line = sr.ReadLine(); // this will be a full line, or null at end of file

            if (line != null)
                BChrom.Add(line); // null picks are skipped, so fewer than sampleSize lines can come back
        }
    }

    return BChrom;
}


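There are some drawbacks: the first line of the file can never be returned, a position that lands in the last line finds no complete line after it, the same line can be picked more than once, and the 1024-byte back-up only makes sense for files larger than that. On the other hand, only the sampled positions are read, so it stays fast on large files.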
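
Since the dataset in the question is only 1000 records, another option (a different technique from the seek-based code above) is to read every line into memory and pick 20% with a partial Fisher-Yates shuffle, which gives each record an equal chance and never repeats one. This is only a sketch: `RandomSample` and `Take` are names made up for the example, and it assumes the whole file fits comfortably in memory.

```csharp
using System;
using System.Collections.Generic;

static class RandomSample
{
    // Returns a fraction of the records (count rounded down), chosen
    // uniformly at random without repeats. The input list is not modified.
    public static List<string> Take(IList<string> records, double fraction, Random random)
    {
        if (fraction < 0.0 || fraction > 1.0)
            throw new ArgumentOutOfRangeException("fraction");

        List<string> pool = new List<string>(records); // work on a copy
        int sampleSize = (int)(pool.Count * fraction);

        // Partial Fisher-Yates shuffle: only the first sampleSize slots are filled
        for (int i = 0; i < sampleSize; i++)
        {
            // pick an index from the not-yet-chosen part [i, Count)
            int j = random.Next(i, pool.Count);
            string tmp = pool[i];
            pool[i] = pool[j];
            pool[j] = tmp;
        }

        return pool.GetRange(0, sampleSize);
    }
}
```

With the records loaded by `File.ReadAllLines(path)` (where `path` is your data file), a call such as `RandomSample.Take(lines, 0.20, new Random())` would return the 20% sample as a `List<string>`.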
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


