How can I find the text-to-HTML ratio of a web page using C#? I want to know what percentage of a page is text and what percentage is HTML markup. This gives an inside look at a website, which is useful for SEO.
Posted
Updated 14-Sep-11 23:38pm
Comments
rkthiyagarajan 15-Sep-11 5:43am    
Please clarify your question.
Ali Al Omairi(Abu AlHassan) 15-Sep-11 11:03am    
Sir,
text/html is not a ratio; it's a response content type.
Chris Maunder 15-Sep-11 11:03am    
I think he means the text-to-html ratio of text within an HTML document.

Maybe.

I don't think there is anything built in for this.
First, try writing an HTML parser that identifies all the tags/elements on the page (such as <div />, <h1 />, ...).
Once the elements are identified, the remaining characters are probably text, so calculating the text/HTML ratio should be straightforward.
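As a rough illustration of that idea, here is a minimal sketch of a hand-rolled scanner that treats everything between `<` and `>` as markup and counts the rest as text. The class and method names (`TextRatio`, `Compute`) are made up for this example, and it assumes that `<` and `>` never appear inside attribute values, comments, or scripts, so it is not a real HTML parser.

```csharp
using System;

public static class TextRatio
{
    // Hypothetical helper: fraction of characters that lie outside <...> tags.
    // Assumption: no '<' or '>' inside attribute values, comments, or scripts.
    public static double Compute(string html)
    {
        if (string.IsNullOrEmpty(html)) return 0.0;

        int textChars = 0;
        bool insideTag = false;

        foreach (char c in html)
        {
            if (c == '<')
            {
                insideTag = true;           // a tag starts here
            }
            else if (c == '>' && insideTag)
            {
                insideTag = false;          // the tag ends here
            }
            else if (!insideTag)
            {
                textChars++;                // anything outside a tag counts as text
            }
        }

        return (double)textChars / html.Length;
    }
}
```

For example, `TextRatio.Compute("<p>Hi</p>")` counts 2 text characters out of 9 in total. For real pages, a proper HTML parsing library would probably be more reliable than this scanner.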
 
 
First of all, your question is not clear; please be specific about your requirement. What I understood is that you want to find a control on your page. What about document.getElementById or $('#' + 'controlId') to find the item?

For example:

alert(document.getElementById('controlId'));
 
Comments
Ali Al Omairi(Abu AlHassan) 15-Sep-11 10:53am    
I think getElementsByName('SharedName') is much better than getElementById('ControlID') for radio buttons and check boxes, because these controls are usually grouped together by name.
Here's something extremely crude that will get you started:

C#
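// Requires: using System.Text.RegularExpressions;
string myHtml = ... // (whatever your text is...)

// Strip anything that looks like a tag, then measure what is left.
int textLength = Regex.Replace(myHtml, "<[^>]*>", String.Empty).Length;
double textToHtmlRatio = (double)textLength / myHtml.Length;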


Note that the regex used to strip out HTML is extremely rough. You really need to use a parser to be accurate, and at the very least you need to handle encoded characters (e.g. &amp;).
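To illustrate the point about encoded characters, here is a small sketch that strips tags with the same rough regex and then decodes entities with `System.Net.WebUtility.HtmlDecode`, so that `&amp;` counts as one character of text instead of five. The names `TextToHtml` and `Ratio` are invented for this example, and the tag stripping is still only an approximation.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class TextToHtml
{
    // Hypothetical helper: text-to-HTML ratio with entities decoded.
    public static double Ratio(string html)
    {
        if (string.IsNullOrEmpty(html)) return 0.0;

        // Same rough tag-stripping regex as above; not a real parser.
        string noTags = Regex.Replace(html, "<[^>]*>", string.Empty);

        // Turn entities such as &amp; or &lt; back into single characters.
        string text = WebUtility.HtmlDecode(noTags);

        return (double)text.Length / html.Length;
    }
}
```

With the input `<p>a &amp; b</p>`, the decoded text is `a & b` (5 characters) against 16 characters of HTML. Without the decoding step, the text length would have been 9.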
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


