Suppose I have a string like this in textBox1.Text:
words1 words2 = "hello world @!";

How do I split that string using a regex into:
words1
words2
=
"hello world @!"
;

I know how to split a string on whitespace in regex using this code:
C#
string s = "there is a cat";
// Split string on spaces.
// ... This will separate all the words.
string[] words = s.Split(' ');
foreach (string word in words)
{
    Console.WriteLine(word);
}


But if I use the above code, the output will be:

words1
words2
=
"hello
world
@!";

How can I extract the quoted string as one word, no matter how much whitespace it contains, and also recognize the semicolon and split it off as its own token?

What I have tried:

Using a regex pattern such as \".*?\" to extract the quoted string from the input.
Updated 1-Jun-16 22:20pm

1 solution

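String.Split isn't a regex; it's just a basic string method that can't do anything too complex.
Try:
("[^"]*")|([^\s]+)
as a regex. Each match should be one of the pieces you want, and the two Group values tell you which kind it is: group 1 is a quoted string, group 2 is a run of non-whitespace characters.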
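
A minimal sketch of how that pattern might be used from C#, collecting every match into a list. The class name QuotedSplitter and the method name Split are made up for illustration; only Regex, Match, and Regex.Matches come from System.Text.RegularExpressions. Note the pattern only keeps a quoted string whole when the quote starts a token: in x="a b"; the first match would be x="a, because the non-whitespace alternative wins at that position.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (name chosen for this example only).
public static class QuotedSplitter
{
    // Pattern from the answer: a double-quoted run, or a run of non-whitespace.
    private static readonly Regex TokenPattern =
        new Regex("(\"[^\"]*\")|([^\\s]+)");

    public static List<string> Split(string input)
    {
        var tokens = new List<string>();
        // Matches returns every non-overlapping match, left to right.
        foreach (Match m in TokenPattern.Matches(input))
        {
            tokens.Add(m.Value);
        }
        return tokens;
    }

    public static void Main()
    {
        // Prints one token per line for the string from the question.
        foreach (string token in Split("words1 words2 = \"hello world @!\";"))
        {
            Console.WriteLine(token);
        }
    }
}
```

If you need to know whether a token was quoted, m.Groups[1].Success is true for the quoted alternative.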
 
 
Comments
Gun Gun Febrianza 2-Jun-16 4:31am    
It works perfectly. This is my code:

Match result = Regex.Match(textBox1.Text, "(\"[^\"]*\")|([^\\s]+)");
// Show each match, then advance. Checking Success before showing
// avoids displaying an empty result after the last match.
while (result.Success)
{
    MessageBox.Show(result.Value + " " + result.Index + " " + result.Length);
    result = result.NextMatch();
}

Input: words1 words2 = "hello world @!";
Output:
words1
words2
=
"hello world @!"
;
Gun Gun Febrianza 2-Jun-16 4:33am    
Dear friend, do you have any suggestions for splitting this string:
display("hello world!@");

into :
display
(
"hello world!@"
)
;

I'm still learning regex for lexical analysis.
OriginalGriff 2-Jun-16 5:00am    
To be honest, I'm not sure that using a regex to split up everything into tokens for lexical analysis is going to be the right approach: you are going to end up with a massively complex regex which is going to be pretty much impossible to debug or modify!
For example:

string s = GetString("hello \"Paul\"!",
@"This gets ""nasty"" very quick"); // "GetString" should return a "string" enclosed in '"' characters

There isn't a simple regex which will process that even slightly correctly for a lexical analysis! (Or if there is I don't want to even think about modifying it! :laugh: )
You would probably be a lot better off using an existing tokenizer and working from there.
Have a look at this:
http://www.codeproject.com/Articles/7664/StringTokenizer
It may not do exactly what you want, but it may give you a good idea to start from.
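
As a rough illustration of the hand-written alternative (this is not the linked StringTokenizer article's code, just a sketch under my own assumptions), a small character-by-character scanner can split display("hello world!@"); into the five pieces asked for. The names SimpleLexer and Tokenize are hypothetical. It assumes identifiers are letters, digits and underscores, treats \" inside a quoted string as an escaped quote, and emits every other non-whitespace character as its own token. It does not handle verbatim @"..." strings or comments.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical scanner (names chosen for this example only).
public static class SimpleLexer
{
    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < input.Length)
        {
            char c = input[i];
            if (char.IsWhiteSpace(c))
            {
                i++; // whitespace only separates tokens
            }
            else if (c == '"')
            {
                // Quoted string: scan to the closing quote, skipping \-escaped characters.
                int start = i;
                i++;
                while (i < input.Length && input[i] != '"')
                {
                    if (input[i] == '\\' && i + 1 < input.Length)
                    {
                        i++; // step over the backslash; the next i++ skips the escaped char
                    }
                    i++;
                }
                if (i < input.Length)
                {
                    i++; // consume the closing quote
                }
                tokens.Add(input.Substring(start, i - start));
            }
            else if (char.IsLetterOrDigit(c) || c == '_')
            {
                // Identifier or number: a run of letters, digits and underscores.
                int start = i;
                while (i < input.Length &&
                       (char.IsLetterOrDigit(input[i]) || input[i] == '_'))
                {
                    i++;
                }
                tokens.Add(input.Substring(start, i - start));
            }
            else
            {
                // Any other character, such as ( ) = ; is a one-character token.
                tokens.Add(c.ToString());
                i++;
            }
        }
        return tokens;
    }

    public static void Main()
    {
        foreach (string token in Tokenize("display(\"hello world!@\");"))
        {
            Console.WriteLine(token);
        }
    }
}
```

Each branch handles one kind of token, so adding a new kind (for example two-character operators) means adding a branch rather than reworking a single large pattern.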
Gun Gun Febrianza 2-Jun-16 5:05am    
Thanks, you are a CodeProject angelkeeper (y)
OriginalGriff 2-Jun-16 5:29am    
I have no idea what that means, but I'll take it as a compliment! :laugh:

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)