Hi all,
I have a DNA sequencing file and I need to split it into multiple files as follows.
If the input file contains:
ACGTTTTGGGGGGGGTCCCCCTAC
ACGGACTTTTACGTTTTTTTTTTT
GGGTTTTTAAACCCGGGTTTGGTT
GGTTTTTCCCCCAAAAAATTTTTC
CCCCTAAAAAAAACGGGGGTTTGG

The output depends on the number of data nodes. For example, if there are 4 data nodes,
we should get 4 files:
f1
ACGTTT
ACGGACTTTTACGTTTTTTTTTTT
GGGTTTTTAAACCCGGGTTTGGTT
GGTTTTTCCCCCAAAAAATTTTTC
CCCCTAAAAAAAACGGGGGTTTGG
f2
TGGGGG
ACGGACTTTTACGTTTTTTTTTTT
GGGTTTTTAAACCCGGGTTTGGTT
GGTTTTTCCCCCAAAAAATTTTTC
CCCCTAAAAAAAACGGGGGTTTGG
f3
GGGTCC
ACGGACTTTTACGTTTTTTTTTTT
GGGTTTTTAAACCCGGGTTTGGTT
GGTTTTTCCCCCAAAAAATTTTTC
CCCCTAAAAAAAACGGGGGTTTGG
f4
CCCTAC
ACGGACTTTTACGTTTTTTTTTTT
GGGTTTTTAAACCCGGGTTTGGTT
GGTTTTTCCCCCAAAAAATTTTTC
CCCCTAAAAAAAACGGGGGTTTGG

i.e. each file contains one substring of the first sequence, followed by all the remaining sequences.
Then we run a specific binary on each file.
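
For anyone landing here, the splitting described above could be sketched roughly like this in Python. This is only a minimal sketch under some assumptions that are not stated in the post: the first sequence is cut into equal, non-overlapping pieces (one per data node, 6 characters each for the 24-character example), the output files are named f1..fN in one directory, and the binary takes the file path as its only argument. The function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def split_first_sequence(sequences, n_nodes):
    """Return n_nodes lists: one piece of the first sequence, then the rest unchanged."""
    if not sequences:
        raise ValueError("no sequences given")
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    first, rest = sequences[0], sequences[1:]
    # Ceiling division: the last piece may be shorter (or empty) when the
    # length of the first sequence is not a multiple of n_nodes.
    size = -(-len(first) // n_nodes)
    return [[first[i * size:(i + 1) * size]] + rest for i in range(n_nodes)]


def write_parts(input_path, n_nodes, out_dir="."):
    """Read the input file, write f1..fN into out_dir, and return their paths."""
    text = Path(input_path).read_text()
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    paths = []
    for i, part in enumerate(split_first_sequence(lines, n_nodes), start=1):
        path = Path(out_dir) / f"f{i}"
        path.write_text("\n".join(part) + "\n")
        paths.append(path)
    return paths


def run_on_parts(binary, paths):
    """Run the per-file binary once for each file (assumed CLI: binary <file>)."""
    for path in paths:
        subprocess.run([binary, str(path)], check=True)
```

Usage would be something like `paths = write_parts("input.txt", 4)` followed by `run_on_parts("./my_binary", paths)`, where both the input file name and the binary name are placeholders. If the real tool expects different arguments, only `run_on_parts` needs to change.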
Updated 16-Dec-13 4:27am
Comments
Richard MacCutchan 16-Dec-13 9:44am    
Sounds like a good idea. Don't forget to tell us when it's finished.

However if you are hoping for some assistance then please include a proper question.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


