Is there any possible way to check whether a specified string is a valid URL? The solution must be in C++ and it must work without an internet connection.

Example strings are:

good.morning
foo.goo.koo
https://hhhh
hdajdklbcbdhd
8881424.www.hfbn55.co.in/sdfsnhjk
://dgdh24.vom
dfgdfgdf(2001)/.com/sdgsgh
\adiihsdfghnhg.co.inskdhhj
aser//www.gtyuh.co.uk/kdsfgdfgfrgj

What I have tried:

#include "stdafx.h"
#include <windows.h>
using namespace System;
using namespace System::IO;
int iDomCount =0;
void dominit();
void main(int argc, _TCHAR* argv[])
{

CString Uri,Temp,strDname;
int iLoc,iAsc,iLen;
char cStr;
try
{
cout<<"Enter Url\n";
Uri=Console::ReadLine();
Temp=Uri;
if((Uri.Find(L"https",0)) >= 0)
Uri=Uri.Mid(8);
else if((Uri.Find(L"http",0)) >= 0)
Uri=Uri.Mid(7);
if((Uri.Find(L"www.",0)) >= 0)
Uri=Uri.Mid(4);
for (int len=0;len < Uri.GetLength();len++)
{
iAsc=Uri.GetAt(len);
if ( ((iAsc > 64) && (iAsc < 91)) || ((iAsc > 96) && (iAsc < 123)) || ((iAsc > 47) && (iAsc < 58)) || (iAsc == 46) || (iAsc == 45))
iLoc++;
else
break;
}
if (iLoc < 1)
{
cout<<"Invalid Url";
system("pause");
Uri="";
Console::Clear();
}
else
{
Uri=Uri.Mid(0,(iLoc));
int ifound=Uri.ReverseFind(L'.');
if (ifound < 0)
{
cout<<"Invalid Url";
system("pause");
Uri="";
Console::Clear();
}
else
{
strDname=Uri.Mid(ifound);


}
}

}
catch(...)
{
}
}
void dominit()
{
StreamReader^ sr = gcnew StreamReader( "dnmout.txt" );
String^ line;

// Read and display lines from the file until the end of
// the file is reached.
while ( line = sr->ReadLine() )
{
CString str3(line);
char *sz;
sprintf(sz, "%S", str3);
dname[iDomCount]=sz;
iDomCount ++;
}
}


This is the code I tried, but it only works with the predefined list of subdomains. I've also tried regex with C++, but it does not work with all types of URL. Is there any solution for this?
Updated 27-Jul-16 1:14am
Comments
Mohibur Rashid 9-Aug-16 17:58pm    
I would suggest using a regular expression. PCRE is your tool.

Some of your example strings are not valid URLs (see Uniform Resource Locator on Wikipedia) or valid URIs (see Uniform Resource Identifier on Wikipedia). So you first have to define what is allowed and what is to be supported.

For example, a missing scheme might be replaced by a default one, as any browser does by using http by default, or a scheme without a colon might be treated as the server name of a Windows share.

Then split the input into parts and check each part using the part specific rules.

Note that there may be different rules for some parts depending on other parts. An example would be Windows shares (indicated by the server name as scheme without colon) where specific characters would not be allowed in path and file name parts while these characters are allowed in URLs (e.g. quotation mark and asterisk).
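
The split-and-check approach described above could be sketched roughly as follows in portable C++, with no network access. This is only a minimal sketch under stated assumptions: the names UrlParts, splitUrl, isValidScheme, isValidHost, isValidPath and looksLikeUrl are made up for illustration, a missing scheme defaults to http, the host must have at least two dot-separated labels, and ports, user info, IP literals and Windows shares are not handled. You would adjust the rules to whatever you decide to support.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical container for the pieces of the input (illustrative names).
struct UrlParts {
    std::string scheme;
    std::string host;
    std::string path;
};

// Split "scheme://host/path". Assumption: a missing scheme defaults to "http".
UrlParts splitUrl(const std::string& input) {
    UrlParts parts;
    std::string rest = input;
    const std::size_t sep = input.find("://");
    if (sep != std::string::npos) {
        parts.scheme = input.substr(0, sep);
        rest = input.substr(sep + 3);
    } else {
        parts.scheme = "http";
    }
    const std::size_t slash = rest.find('/');
    parts.host = rest.substr(0, slash);  // whole string if there is no '/'
    if (slash != std::string::npos) {
        parts.path = rest.substr(slash);
    }
    return parts;
}

// Scheme rule: a letter followed by letters, digits, '+', '-' or '.'.
bool isValidScheme(const std::string& scheme) {
    if (scheme.empty() || !std::isalpha(static_cast<unsigned char>(scheme[0]))) {
        return false;
    }
    return std::all_of(scheme.begin(), scheme.end(), [](unsigned char c) {
        return std::isalnum(c) || c == '+' || c == '-' || c == '.';
    });
}

// Host rule (assumed): at least two labels of 1 to 63 letters, digits or
// hyphens, where no label starts or ends with a hyphen.
bool isValidHost(const std::string& host) {
    std::size_t labels = 0;
    std::size_t start = 0;
    while (true) {
        const std::size_t dot = host.find('.', start);
        const std::string label = host.substr(
            start, dot == std::string::npos ? std::string::npos : dot - start);
        if (label.empty() || label.size() > 63) return false;
        if (label.front() == '-' || label.back() == '-') return false;
        for (unsigned char c : label) {
            if (!std::isalnum(c) && c != '-') return false;
        }
        ++labels;
        if (dot == std::string::npos) break;
        start = dot + 1;
    }
    return labels >= 2;
}

// Path rule (simplified assumption): printable, non-space characters only.
bool isValidPath(const std::string& path) {
    return std::all_of(path.begin(), path.end(), [](unsigned char c) {
        return std::isgraph(c) != 0;
    });
}

// Combine the part-specific checks.
bool looksLikeUrl(const std::string& input) {
    const UrlParts parts = splitUrl(input);
    return isValidScheme(parts.scheme) && isValidHost(parts.host) &&
           isValidPath(parts.path);
}
```

Calling looksLikeUrl on each input string gives a purely syntactic answer. With these rules good.morning is accepted, because nothing here knows which top-level domains exist; rejecting it would need an extra lookup against a local list of known domains.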
 
 
See here: validation - Which characters make a URL invalid? (Stack Overflow)
The best place to begin is the IsValidURL function (Windows) or the PathIsURL function (Windows).

C++
#include <cstdio>
#include <iostream>
#include <windows.h>
#include <urlmon.h>
#pragma comment(lib, "urlmon.lib")

using namespace std;

// IsValidURL takes a wide string (LPCWSTR), so the URL is passed as
// LPCWSTR rather than LPCTSTR; this also compiles in non-Unicode builds.
void testURL(LPCWSTR Url)
{
	HRESULT hr = IsValidURL(NULL, Url, 0);
	switch (hr)
	{
	case S_OK:
		cout << "The szURL parameter contains a valid URL.\n";
		break;
	case S_FALSE:
		cout << "The szURL parameter does not contain a valid URL.\n";
		break;
	case E_INVALIDARG:
		cout << "One of the parameters is invalid.\n";
		break;
	default:
		cout << "Unknown error\n";
		break;
	}
	// Print the raw HRESULT in hexadecimal for reference.
	printf("%lx\n", static_cast<unsigned long>(hr));
}

int main() {
	LPCWSTR Url = L"http://www.codeproject.com/Questions/1114838/How-to-check-a-specified-string-is-a-valid-URL-or";

	testURL(Url);

	return 0;
}


The results may be checked against this online validator: Validate an URL address - FormValidation
 
Comments
MarshalS 9-Aug-16 6:47am    
Check the above code with the following invalid URLs:
http://www.good.morinig
http://notavalid.url.yesorno
google.com
facebbok.com
** Also try them without http and https.

Is this result reliable?
[no name] 9-Aug-16 7:50am    
What is the problem?
Try RegEx (regular expressions).
A Google search will turn up regular expressions that match a URL, such as:
^((http[s]?|ftp):\/)?\/?([^:\/\s]+)((\/\w+)*\/)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$

javascript - What is a good regular expression to match a URL? (Stack Overflow)
http://code.tutsplus.com/tutorials/8-regular-expressions-you-should-know--net-6149
RegEx debugging tools:
Debuggex: Online visual regex tester. JavaScript, Python, and PCRE.
.NET Regex Tester - Regex Storm

Quote:
I've also tried regex with C++, but it does not work with all types of URL. Is there any solution for this?
It does work; it is just a matter of crafting the right RegEx.
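
As a rough idea of what that could look like offline with the standard <regex> header, here is a minimal sketch. The pattern is my own simplified one, not the expression quoted earlier in this answer, and the function name matchesUrlPattern is made up for illustration. It assumes an optional http, https or ftp scheme, a host of at least two dot-separated labels, an optional port and an optional path; you would tighten or loosen it to fit the URLs you want to accept.

```cpp
#include <regex>
#include <string>

// Illustrative pattern (an assumption, not the expression quoted above):
//   optional scheme   http://, https:// or ftp://
//   host              two or more dot-separated labels made of letters,
//                     digits and hyphens (no leading or trailing hyphen)
//   optional port     ':' followed by 1 to 5 digits
//   optional path     '/' followed by any non-whitespace characters
bool matchesUrlPattern(const std::string& text) {
    static const std::regex pattern(
        R"(^(?:(?:https?|ftp)://)?)"
        R"([A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?)"
        R"((?:\.[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?)+)"
        R"((?::[0-9]{1,5})?(?:/\S*)?$)");
    // regex_match succeeds only if the whole string fits the pattern.
    return std::regex_match(text, pattern);
}
```

Note that this checks the shape of the string only. Strings such as good.morning or http://notavalid.url.yesorno still match, because a regular expression cannot know which domains actually exist.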
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
