I have a collection of emails stored in MySQL, split into categories. I want to train a neural network with resilient propagation. Am I on the right track if I prepare a CSV document with tf-idf values like this:
"","word1","word2","word3","word4"...
"cat1",0.1990131014,0.0000000000,0.0000000000,0.0000000000...
"cat2",0.0000000000,0.1736218496,0.0000000000,0.0000000000...
"cat3",0.0000000000,0.0000000000,0.0000000000,0.0000000000...
"cat4",0.0000000000,0.0000000000,0.0000000000,0.0000000000...
...
and then train the network on it?
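To make the idea concrete, this is roughly how I picture building that file. It is only a sketch in Python (not Encog or C#), and it rests on several assumptions of mine: one row per email labelled with its category, plain whitespace tokenization, and the basic tf * log(N / df) weighting. The names `tfidf_rows` and `to_csv` are made up for illustration.

```python
import csv
import io
import math
from collections import Counter


def tfidf_rows(docs):
    """docs is a list of (category, text) pairs.

    Returns (vocab, rows), where rows is a list of (category, weights)
    and weights[i] is the tf-idf weight of vocab[i] in that document.
    """
    # Assumption: lowercase + whitespace split is enough for a sketch.
    tokenized = [(cat, text.lower().split()) for cat, text in docs]
    vocab = sorted({word for _, words in tokenized for word in words})

    # Document frequency: in how many documents each word occurs.
    doc_freq = Counter()
    for _, words in tokenized:
        doc_freq.update(set(words))

    n_docs = len(tokenized)
    rows = []
    for cat, words in tokenized:
        term_freq = Counter(words)
        total = len(words)
        weights = [
            (term_freq[w] / total) * math.log(n_docs / doc_freq[w]) if total else 0.0
            for w in vocab
        ]
        rows.append((cat, weights))
    return vocab, rows


def to_csv(vocab, rows):
    """Render the matrix as CSV text: header of words, one row per document."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerow([""] + vocab)
    for cat, weights in rows:
        writer.writerow([cat] + [round(v, 10) for v in weights])
    return buf.getvalue()
```

A word that occurs in every document gets weight 0 under this formula, which is why most cells in a real file would be zero or close to it.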
I searched and found that the Encog library is quite good for machine learning tasks, so I would like to use it and write the code in C#.
I have approximately 30 categories, and the messages vary in length.
I was also wondering about using a Dictionary and doing the calculation in the program itself.
I would be grateful if someone could point me in the right direction.
Part 2:
If I had such a CSV document, could I use it as input for other methods, for example SVM or NB? If not, how can I prepare such a "universal collection"?
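What I mean by a "universal collection" is something like the following: the CSV read back into a list of labels plus a numeric matrix, which I assume is the shape most classifiers expect regardless of the method. Again this is only a Python sketch, not Encog code, and `load_features` is a hypothetical helper name.

```python
import csv
import io


def load_features(csv_text):
    """Parse CSV text with a header of words and one labelled row per sample.

    Returns (vocab, labels, matrix): the column names, the first-column
    labels, and the remaining cells converted to floats.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    vocab = header[1:]  # the first header cell is the empty label column

    labels, matrix = [], []
    for record in reader:
        if not record:  # skip blank lines
            continue
        labels.append(record[0])
        matrix.append([float(value) for value in record[1:]])
    return vocab, labels, matrix
```

With the data in this form, `labels` and `matrix` could in principle be handed to any trainer that accepts one feature vector per sample.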