Introduction
This article is about a tool using Roslyn that can search through a large codebase in five ways:
- Search text in methods
- Search calls to certain methods
- Search for methods with certain names
- Search for properties with certain names
- Search for classes with certain names
Screenshot
Here's a screenshot of C# and VB.NET Code Searcher in action.
The problem
Recently I had an assignment that required a lot of searching through the source code of a large legacy codebase (61 solutions, C# code).
A field had to be moved from one table to another. The change would affect several parts of the codebase, and I needed to know which ones.
To find out, I had to locate the methods in the data layer that call stored procedures. Then I had to work bottom-up through
the codebase to see where these methods were called, and what the impact on the code was.
At first I used the freeware tool "TextCrawler 2" for that (http://www.digitalvolcano.co.uk/content/index.php). It is quite a fast
text search utility, but it doesn't "know" anything about the C# language. For example, if you search for calls
to a certain method, TextCrawler will happily return files in which those calls are commented out. It also
wasn't fast enough: searching through 61 solutions takes some time. I also tried the Microsoft Desktop Search tool,
which was fast but equally unaware of the structure of the source code.
Having read about Roslyn, I started thinking about how it could be useful for this purpose.
Background about Roslyn
Roslyn is Microsoft’s project to open up the VB and C# compilers through APIs, and provide easy access to the information it gathers during
the different stages of the compilation process. To get started on what Roslyn is about, you can read about it here:
Or if you'd prefer to take a deeper dive into Roslyn, here's a whitepaper from Microsoft:
This article is not meant to give you an introduction to Roslyn; there are a couple of good CodeProject articles that do that:
I also found out that after installing the Microsoft Roslyn CTP - June 2012 there were lots of sample projects installed in my Documents folder.
The solution
So I thought I'd give Roslyn a try to see if I could create a tool that could search through code faster.
I think I succeeded in this. I use it all the time now! I created a Windows Forms application that has 5 ways of searching through C# and VB.NET code:
- Search text in methods
- Search calls to certain methods
- Search for methods with certain names
- Search for properties with certain names
- Search for classes with certain names
I decided to share this with the world so everyone can enjoy it. By posting this article, I hope that:
- People will find this useful too.
- I get valuable feedback so the tool can be improved.
- People will extend / adapt the tool or parts of it in ways I haven't thought of yet.
Why would you use it?
For example: "Go To Definition" for other solutions
Let's say you're in a debugging session. You're debugging in solution X which calls a service that's in another solution Y. Now you see a method being called on a class in solution Y. In Visual Studio you can go to the definition of a method
with right mouse click - "Go To Definition" or F12. But not when the method is in the other solution! So if you want to look up the definition of the method,
the only ways to do that are:
- Step inside the method during the debug session
- Open solution Y and find the method you want to see.
With RoslynCodeSearcher it's very easy to look up a method that's in another solution: type its name in the search field, select "Search methods" and click [Search].
As a help during refactoring
Sometimes you want to know: "What will happen if I remove this method? Where is it called in the jungle of solutions?" You can do a text search, or you can start a full
build to see where it breaks, but for projects with lots of solutions a full build simply takes a long time. With RoslynCodeSearcher, you type the name of the method in the search field, select "Search calls" and click [Search]. Wait a second, et voilà!
Why don't you use reflection for this?
The reason I don't use reflection is that I want access to the actual source code of the solutions I search. For example, I want to return the method body. Reflection can't do that: it works only on metadata (types, names and signatures of methods, etc.). Also, when I search for pieces of text inside methods across a large number of solutions, using Roslyn is faster than a plain text search, because the solutions are already loaded and compiled in memory.
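To illustrate the limitation, here is a minimal sketch of what reflection can and cannot return. The class ReflectionLimitsSketch and its methods are hypothetical and not part of the tool: the signature can be rebuilt from metadata, but the body is only available as compiled IL bytes, not as C# text.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical demo class, not part of the tool.
public static class ReflectionLimitsSketch
{
    // Sample method whose metadata we inspect.
    public static int Add(int a, int b)
    {
        return a + b;
    }

    // Builds a signature string purely from reflection metadata.
    public static string DescribeSignature(MethodInfo method)
    {
        string parameters = string.Join(", ",
            method.GetParameters().Select(p => p.ParameterType.Name + " " + p.Name));
        return method.ReturnType.Name + " " + method.Name + "(" + parameters + ")";
    }

    public static void Main()
    {
        MethodInfo add = typeof(ReflectionLimitsSketch).GetMethod("Add");
        Console.WriteLine(DescribeSignature(add));

        // The method body is exposed only as IL bytes; the original
        // C# source text cannot be recovered through reflection.
        byte[] il = add.GetMethodBody().GetILAsByteArray();
        Console.WriteLine("IL bytes: " + il.Length);
    }
}
```

A syntax tree, by contrast, keeps the full text of each declaration, which is what the search results in this tool are built from.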
How fast is it
The first time you use it the tool will be slower, because it has to compile the solutions in memory (604 MB of memory in my situation). The compiled solutions are kept in IWorkspace objects. This happens every time the tool starts up, and a progress indicator shows how far the compilation is. With a couple of solutions the compilation finishes in a second or so; with a whole lot of solutions it takes longer. To give you an indication: on my computer it took about half a minute to compile 61 solutions in memory. After the initial compilation, searching is very fast: from one to a few seconds for 61 solutions, depending on how much is found. This is because the list of IWorkspace objects is already in memory. After I started using the .NET 4 Parallel.ForEach method, performance increased significantly (by a factor that depends on the number of cores in your processor: dual core, quad core, etc.).
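As a rough illustration of the parallel approach, the sketch below searches a list of strings with Parallel.ForEach and collects the hits in a thread-safe collection. The class ParallelSearchSketch and the method FindMatches are hypothetical names used only for this example; the tool itself iterates over Roslyn documents rather than plain strings.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical demo class, not part of the tool.
public static class ParallelSearchSketch
{
    // Returns every source string that contains searchText (case-insensitive).
    public static List<string> FindMatches(IEnumerable<string> sources, string searchText)
    {
        // ConcurrentBag can safely be filled from several threads at once.
        var matches = new ConcurrentBag<string>();

        Parallel.ForEach(sources, source =>
        {
            if (source.IndexOf(searchText, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                matches.Add(source);
            }
        });

        // The bag has no defined order, so sort to get a stable result.
        return matches.OrderBy(m => m, StringComparer.Ordinal).ToList();
    }

    public static void Main()
    {
        var sources = new[]
        {
            "void SaveCustomer() { }",
            "void LoadOrder() { }",
            "void SaveOrder() { }"
        };

        foreach (string match in FindMatches(sources, "save"))
        {
            Console.WriteLine(match);
        }
    }
}
```

The important design point is that shared state written from inside the parallel loop must be thread-safe; a plain List or StringBuilder would need a lock.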
How to use it
Prerequisites
Make sure you have the following software installed in this order, otherwise the solution will not build:
This article was written for the Roslyn June 2012 CTP, which was compatible with Visual Studio 2010 SP1. The newer Roslyn September 2012 CTP, however, is only compatible with Visual Studio 2012. I have added a download link to the sources of Code Searcher that work with Visual Studio 2012. The rest of this article still needs to be updated to reflect this (or I will create a new article specifically for the Visual Studio 2012 version; I still have to decide).
If you have Visual Studio 2012, the software needs to be installed in this order:
Next: solutions.txt file
You have to provide the tool with a list of solutions to search through.
There are two ways you can do this:
- With a text file "solutions.txt" placed in the directory of the executable (or \bin\debug after you build the solution). The tool will read this on startup if it exists.
This text file should contain the full paths to the solutions, each on its own line.
- If the solutions.txt file doesn't exist yet, click on [Browse ...] and in the File dialog select a directory. Next click on [Update solution List].
The tool will then walk recursively down the directory structure, starting at the selected directory, looking for solution (.sln) files.
The result will be stored in the "solutions.txt" file in the directory of the executable. The existing "solutions.txt" file will be overwritten.
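The directory walk described above could be sketched roughly as follows. This is not the tool's actual code: the class SolutionListSketch and its two methods are hypothetical, and the sketch assumes every directory under the starting point is readable (Directory.EnumerateFiles throws if it is not).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical demo class, not part of the tool.
public static class SolutionListSketch
{
    // Walks rootDirectory recursively, collects all .sln files and
    // overwrites outputFile with one full path per line.
    public static List<string> UpdateSolutionList(string rootDirectory, string outputFile)
    {
        List<string> solutions = Directory
            .EnumerateFiles(rootDirectory, "*.sln", SearchOption.AllDirectories)
            // Guard against wildcard quirks that also match longer extensions.
            .Where(p => string.Equals(Path.GetExtension(p), ".sln", StringComparison.OrdinalIgnoreCase))
            .OrderBy(p => p, StringComparer.OrdinalIgnoreCase)
            .ToList();

        File.WriteAllLines(outputFile, solutions);
        return solutions;
    }

    // Reads the list back, skipping blank lines.
    public static List<string> ReadSolutionList(string file)
    {
        return File.ReadAllLines(file)
            .Select(line => line.Trim())
            .Where(line => line.Length > 0)
            .ToList();
    }

    public static void Main()
    {
        // Build a small throwaway directory tree to search.
        string root = Path.Combine(Path.GetTempPath(), "SolutionListSketch_" + Guid.NewGuid().ToString("N"));
        Directory.CreateDirectory(Path.Combine(root, "ProjectA", "Nested"));
        File.WriteAllText(Path.Combine(root, "ProjectA", "A.sln"), "");
        File.WriteAllText(Path.Combine(root, "ProjectA", "Nested", "B.sln"), "");
        File.WriteAllText(Path.Combine(root, "ProjectA", "readme.txt"), "");

        string listFile = Path.Combine(root, "solutions.txt");
        foreach (string solution in UpdateSolutionList(root, listFile))
        {
            Console.WriteLine(solution);
        }

        Directory.Delete(root, true);
    }
}
```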
Next: search
- Type the text you want to search for in the textbox.
- Select one of the ways to search with by clicking one of the radio buttons.
- Click [Search].
The solutions from solutions.txt, all underlying projects, and all underlying source files will be searched through.
The result of the search consists of:
- The path to the source files containing the found methods.
- The body of the methods.
Including / excluding files
You can also specify words in the textboxes on the right that say:
"Do not include files containing words in filename. Separate by comma."
or
"Only include files containing word in filename. Separate by comma."
- "Do not include" means, the tool will not search in code files that have any of the words in the path.
- "Only include" means, the tool will only search in code files that have any of the words in the path.
These text boxes are meant to be used one at a time. If both are filled in, "Do not include" takes precedence over "Only include".
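The filter rules can be sketched as a small predicate. The names PathFilterSketch, ParseFilters and ShouldSearch are hypothetical; the logic mirrors the check shown later in the search method (excluded words win, and an empty include list means "search everything"), but it uses a case-insensitive IndexOf instead of upper-casing the path, which is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of the tool.
public static class PathFilterSketch
{
    // Splits a comma-separated textbox value into trimmed, non-empty words.
    public static List<string> ParseFilters(string commaSeparated)
    {
        if (string.IsNullOrWhiteSpace(commaSeparated))
        {
            return new List<string>();
        }

        return commaSeparated.Split(',')
            .Select(word => word.Trim())
            .Where(word => word.Length > 0)
            .ToList();
    }

    // Decides whether a file should be searched, given both filter lists.
    public static bool ShouldSearch(string filePath, List<string> excludes, List<string> includes)
    {
        Func<string, bool> inPath =
            word => filePath.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0;

        // "Do not include" takes precedence.
        if (excludes.Any(inPath))
        {
            return false;
        }

        // No include words means every remaining file qualifies.
        return includes.Count == 0 || includes.Any(inPath);
    }

    public static void Main()
    {
        List<string> excludes = ParseFilters("Test, Designer");
        List<string> includes = ParseFilters("DataLayer");

        Console.WriteLine(ShouldSearch(@"C:\src\DataLayer\CustomerRepository.cs", excludes, includes));
        Console.WriteLine(ShouldSearch(@"C:\src\DataLayer\CustomerRepositoryTests.cs", excludes, includes));
        Console.WriteLine(ShouldSearch(@"C:\src\Web\HomeController.cs", excludes, includes));
    }
}
```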
Searching part of text
It is possible to type only part of the text you want to search. For example, if you want to search for all methods that contain the word "Save", like "SaveCustomer", "SaveOrder", then check the option checkbox "Search part of text". If you select the search option "Search text in method" the option will be set by default.
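The difference between the two modes can be sketched as follows. NameMatchSketch and NameMatches are hypothetical names, and the sketch assumes that whole-name matching is case-insensitive, which the article does not state.

```csharp
using System;

// Hypothetical demo class, not part of the tool.
public static class NameMatchSketch
{
    // With searchPartOfText the name only has to contain the text;
    // otherwise the whole name has to match (assumed case-insensitive).
    public static bool NameMatches(string name, string searchText, bool searchPartOfText)
    {
        if (searchPartOfText)
        {
            return name.IndexOf(searchText, StringComparison.OrdinalIgnoreCase) >= 0;
        }

        return string.Equals(name, searchText, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(NameMatches("SaveCustomer", "Save", true));
        Console.WriteLine(NameMatches("SaveCustomer", "Save", false));
        Console.WriteLine(NameMatches("SaveCustomer", "SaveCustomer", false));
    }
}
```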
Syntax Highlighting with Fast Colored TextBox
To present the results of the code search I needed a text editor that could do syntax highlighting. I researched a couple of those, and decided to use
the great "Fast Colored TextBox" from Pavel Torgashov in my project (also on CodeProject): Fast Colored TextBox for Syntax Highlighting. It is fast indeed! It also supports searching in the textbox with Ctrl-F.
Multiple Searches using KRBTabControl
To be able to start multiple searches using a tabbed interface, I gladly used the excellent "KRBTabControl" from Burak299 in my project (also on
CodeProject): KRBTabControl. This gave me the possibility to provide
tabs that can be closed just like browser tabs. When there are too many tabs to display, you will see two tiny arrows on the right so you can switch between tabs with the mouse.
The implementation
The code below is not entirely the same as the source code itself, but this is meant to show you the basics of how the tool works.
When the Search button is clicked, a search is started using the selected Search method (the radio buttons).
public enum SearchType
{
    SearchTextInMethod,
    SearchCallers,
    SearchMethods,
    SearchProperties,
    SearchClasses
}

private SearchType _searchType = new SearchType();

private void btnSearch_Click(object sender, EventArgs e)
{
    string searchText = txtTextToSearch.Text;
    searchText = searchText.Trim();

    // Only bare names are supported, not signatures or calls with arguments.
    if (searchText.Contains("(") || searchText.Contains(")"))
    {
        MessageBox.Show("Please specify searchtext without parentheses or parameters.");
        return;
    }

    if (!File.Exists(Constants.BaseDirectorySolutionsTxtPath))
    {
        MessageBox.Show("There is no solutions.txt file in the directory where the .exe resides." +
            " Please click the [Browse] button to select a starting directory. Then click [Update solution List]");
    }
    else
    {
        SearchType searchType = new SearchType();
        if (!String.IsNullOrEmpty(searchText))
        {
            TabController.UpdateSearchTextOnTab(searchText);
            TabController.ShowHourGlass();

            // Translate the selected radio button into a SearchType.
            if (rbSearchTextInMethod.Checked)
            {
                searchType = SearchType.SearchTextInMethod;
            }
            else if (rbSearchCallers.Checked)
            {
                searchType = SearchType.SearchCallers;
            }
            else if (rbSearchMethods.Checked)
            {
                searchType = SearchType.SearchMethods;
            }
            else if (rbSearchProperties.Checked)
            {
                searchType = SearchType.SearchProperties;
            }
            else if (rbSearchClasses.Checked)
            {
                searchType = SearchType.SearchClasses;
            }

            // Start the search in the background for the currently selected tab.
            WorkerFactory.Start(searchType, searchText, txtExclude.Text,
                txtInclude.Text, TabController.SelectedTab.Guid);
        }
        else
        {
            MessageBox.Show("Please enter text to search");
        }
    }
}
The WorkerFactory.Start method creates a new Worker object every time you do a search.
public static class WorkerFactory
{
    private static List<Worker> _workerList = new List<Worker>();

    // Creates a worker for the given tab (identified by guid) and starts it.
    public static void Start(SearchType searchType, string searchText,
        string filter, string include, Guid guid)
    {
        Worker worker;
        worker = new Worker(searchType, searchText, filter, include, guid);
        _workerList.Add(worker);
        worker.Start();
    }

    // Returns the worker for the given tab, or null if there is not exactly one.
    private static Worker SelectWorker(Guid guid)
    {
        var selectWorker = from worker in _workerList
                           where worker.Guid == guid
                           select worker;
        if (selectWorker != null && selectWorker.Count() == 1)
        {
            return (Worker)selectWorker.First();
        }
        return null;
    }

    // Cancels and forgets the worker that belongs to a closed tab.
    public static void Delete(Guid guid)
    {
        Worker selectWorker = SelectWorker(guid);
        if (selectWorker != null)
        {
            selectWorker.Cancel();
            _workerList.Remove(selectWorker);
        }
    }
}
This Worker uses a BackgroundWorker to start a thread that runs a code search using Roslyn.
public class Worker
{
    private CodeSearcher _searcher;
    BackgroundWorker _worker;
    private string _result;
    private Guid _guid;
    // volatile: set by Cancel() and read in the completion handler.
    private volatile bool _cancel;

    public Worker(SearchType searchType, string searchText, string filter, string include, Guid guid)
    {
        _guid = guid;
        _searcher = new CodeSearcher(searchType, searchText, filter, include);
        _worker = new BackgroundWorker();
        _worker.DoWork += new DoWorkEventHandler(worker_DoWork);
        _worker.RunWorkerCompleted += new RunWorkerCompletedEventHandler(worker_RunWorkerCompleted);
    }

    public Guid Guid
    {
        get { return _guid; }
        set { _guid = value; }
    }

    public void Start()
    {
        _worker.RunWorkerAsync();
    }

    // Does not abort the running search; it only suppresses the result.
    public void Cancel()
    {
        _cancel = true;
    }

    private void worker_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
    {
        if (!_cancel)
        {
            TabController.WriteResults(_guid, _result);
        }
    }

    // Runs on a background thread.
    private void worker_DoWork(object sender, DoWorkEventArgs e)
    {
        _result = _searcher.Search();
    }
}
When the worker is started with the Start method, it runs worker_DoWork asynchronously, which calls the CodeSearcher.Search method. That method searches in one of five ways, depending on the selected SearchType.
public class CodeSearcher
{
    // Fields (_searchType, _searchText, _exclude, _include), the constructor
    // and the five search methods are omitted here; see the source code.

    public string Search()
    {
        string result = "";
        List<string> excludes = CodeSearcher.GetFilters(_exclude);
        List<string> includes = CodeSearcher.GetFilters(_include);

        // Load the solutions and workspaces once; later searches reuse them.
        if (CodeRepository.Workspaces.Count() == 0)
        {
            CodeRepository.Solutions = CodeRepository.GetSolutions(Constants.BaseDirectorySolutionsTxtPath);
            CodeRepository.Workspaces = CodeRepository.GetWorkspaces(CodeRepository.Solutions);
        }

        if (_searchType == SearchType.SearchTextInMethod)
        {
            result = SearchMethodsForTextParallel(CodeRepository.Workspaces, _searchText, excludes, includes);
        }
        else if (_searchType == SearchType.SearchCallers)
        {
            result = SearchCallersParallel(CodeRepository.Workspaces, _searchText, excludes, includes);
        }
        else if (_searchType == SearchType.SearchMethods)
        {
            result = SearchMethodsParallel(CodeRepository.Workspaces, _searchText, excludes, includes);
        }
        else if (_searchType == SearchType.SearchProperties)
        {
            result = SearchPropertiesParallel(CodeRepository.Workspaces, _searchText, excludes, includes);
        }
        else if (_searchType == SearchType.SearchClasses)
        {
            result = SearchClassesParallel(CodeRepository.Workspaces, _searchText, excludes, includes);
        }
        return result;
    }
}
If a solutions.txt file exists in the directory where the RoslynCodeSearcher.exe resides, the paths to the solutions will be put in a List and the Workspaces
with the solutions will be loaded. A workspace is an active representation of your solution as a collection of projects, each with a collection of documents.
The workspace provides access to the current model of the solution. You can read more about it here.
In the CodeSearcher class I have five search methods. This is where the searching happens. The searching makes use of the .NET 4 method Parallel.ForEach to speed things up, depending on the number of cores in the processor of your computer. I will show one of the search methods here; the other four you can see in the source code.
public string SearchMethodsForTextParallel(List<IWorkspace> workspaces,
    string textToSearch, List<string> excludes, List<string> includes)
{
    StringBuilder result = new StringBuilder();
    string language = "";
    foreach (IWorkspace w in workspaces)
    {
        ISolution solution = w.CurrentSolution;
        foreach (IProject project in solution.Projects)
        {
            language = project.LanguageServices.Language;

            // The documents of one project are searched in parallel.
            Parallel.ForEach(project.Documents, document =>
            {
                // The filter words are compared against the upper-cased path,
                // so they are assumed to be upper case already.
                if (!excludes.Any(s => document.FilePath.ToUpper().Contains(s)) &&
                    (includes.Count() == 0 || includes.Any(s => document.FilePath.ToUpper().Contains(s))))
                {
                    if (language == LANG_CS)
                    {
                        string found = SearchMethodsForTextCSharp(document, textToSearch);

                        // StringBuilder is not thread-safe, so appends are serialized.
                        lock (result)
                        {
                            result.Append(found);
                        }
                    }
                }
            });
        }
    }
    return result.ToString();
}

private string SearchMethodsForTextCSharp(IDocument document, string textToSearch)
{
    StringBuilder result = new StringBuilder();
    CommonSyntaxTree syntax = document.GetSyntaxTree();
    var root = (Roslyn.Compilers.CSharp.CompilationUnitSyntax)syntax.GetRoot();

    // Collect all method and property declarations in the document.
    var syntaxNodes = from methodDeclaration in root.DescendantNodes()
                          .Where(x => x is MethodDeclarationSyntax || x is PropertyDeclarationSyntax)
                      select methodDeclaration;

    if (syntaxNodes != null && syntaxNodes.Count() > 0)
    {
        foreach (MemberDeclarationSyntax method in syntaxNodes)
        {
            if (method != null)
            {
                // Case-insensitive search in the full text of the declaration.
                string methodText = method.GetFullText();
                if (methodText.ToUpper().Contains(textToSearch.ToUpper()))
                {
                    result.Append(GetMethodOrPropertyTextCSharp(method, document));
                }
            }
        }
    }
    return result.ToString();
}
When the text, call, method or property is found, the method GetMethodOrPropertyTextCSharp is called to get the body of the method / property in which the searched item is found. The full text of the method / property will be returned, including the path to the .cs file.
private string GetMethodOrPropertyTextCSharp(Roslyn.Compilers.CSharp.SyntaxNode node, IDocument document)
{
    StringBuilder resultStringBuilder = new StringBuilder();
    string methodText = node.GetFullText();
    bool isMethod = node is Roslyn.Compilers.CSharp.MethodDeclarationSyntax;
    string methodOrPropertyDefinition = isMethod ? "Method: " : "Property: ";
    object methodName = isMethod
        ? ((Roslyn.Compilers.CSharp.MethodDeclarationSyntax)node).Identifier.Value
        : ((Roslyn.Compilers.CSharp.PropertyDeclarationSyntax)node).Identifier.Value;

    // Header: separator line, file path and member name, followed by the source text.
    resultStringBuilder.AppendLine("//=====================================================================================");
    resultStringBuilder.AppendLine(document.FilePath);
    resultStringBuilder.AppendLine(methodOrPropertyDefinition + (string)methodName);
    resultStringBuilder.AppendLine(methodText);
    return resultStringBuilder.ToString();
}
Jumping back to the Worker object above: when the worker is finished and has the results, the TabController.WriteResults method is called to update the FastColoredTextBox with the results.
public static class TabController
{
    // Lock object guarding access to the list of text boxes.
    private static readonly object _lockobj = new object();

    private static List<FastColoredTextBoxNS.FastColoredTextBox>
        _fastColoredTextBoxes = new List<FastColoredTextBoxNS.FastColoredTextBox>();

    public static void WriteResults(Guid guid, string text)
    {
        lock (_lockobj)
        {
            // Find the text box on the tab that started this search.
            var selectFastColoredTextBox = from fctb in _fastColoredTextBoxes
                                           where fctb.Guid == guid
                                           select fctb;
            if (selectFastColoredTextBox != null && selectFastColoredTextBox.Count() == 1)
            {
                FastColoredTextBox currentTextBox = (FastColoredTextBox)selectFastColoredTextBox.First();
                currentTextBox.Text = text;
                if (text == "") currentTextBox.Text = "Nothing found.";

                // Move the caret back to the top of the results.
                currentTextBox.Selection.Start = Place.Empty;
                currentTextBox.DoCaretVisible();
            }
        }
    }
}
As you will see in the source code, there is much more to it than I have shown in this article. For example, it is possible to start multiple
searches independently at the same time from different tabs. This uses some threading and proper handling / locking. Also, the tool itself can search in both C# and VB.NET source code.
About the Source Code
The attached projects will open and build in Visual Studio 2010 SP1 (or Visual Studio 2012 if you download the 2012 version). In the section "How to use it" I explain the prerequisites that are necessary to use the tool.
Future of this project
Some thoughts about the direction this project might go in the future:
Regular Expression Support
I want to be able to search using regular expressions. For example: give me all the methods that are named "SaveCustomer" or "InsertCustomer". You would have to type a regular expression like this:
(Save|Insert)Customer
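A minimal sketch of how such a name search might work with the .NET Regex class is shown below. This feature does not exist in the tool yet; RegexNameSearchSketch and FindNames are hypothetical names, and anchoring the pattern to the whole name and ignoring case are assumptions of this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the tool.
public static class RegexNameSearchSketch
{
    // Returns the names that match the pattern as a whole (case-insensitive).
    public static List<string> FindNames(IEnumerable<string> names, string pattern)
    {
        // Wrap the user's pattern so it has to match the entire name.
        var regex = new Regex("^(?:" + pattern + ")$",
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

        return names.Where(name => regex.IsMatch(name)).ToList();
    }

    public static void Main()
    {
        var methodNames = new[]
        {
            "SaveCustomer", "InsertCustomer", "DeleteCustomer", "SaveCustomerOrder"
        };

        foreach (string name in FindNames(methodNames, "(Save|Insert)Customer"))
        {
            Console.WriteLine(name);
        }
    }
}
```

Without the anchors, "SaveCustomerOrder" would match as well, which corresponds to the existing "Search part of text" behaviour.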
Visual Studio Extension
This could be reworked as a Visual Studio extension. That way it could make use of the C# code editor and other parts of Visual Studio.
That could make it even more powerful and accessible to more people.
Advanced stuff
To make refactoring source code across multiple solutions friendlier, it would be nice if you could run some type of "queries"
on your source code, just like LINQ. Something like
http://www.ndepend.com/Doc_CQLinq_Syntax.aspx or
http://www.codeproject.com/Articles/408663/Using-NRefactory-for-analyzing-Csharp-code. To make these queries strongly typed and not dynamic, that would need IntelliSense in a kind of interactive window.
Maybe the new Roslyn "C# Interactive window" could be of use for this. But this would probably be easier to realize as a Visual Studio extension.
Output
Let the user define what the output should contain, for example:
- The whole code file
- A graphical view of connections between methods / classes / solutions etc.
History
2012-12-05
- Created a version for Visual Studio 2012 and Roslyn September 2012 CTP
- Fixed the breaking changes in the version that works with Roslyn September 2012 CTP
- Fixed the unit tests because of the breaking changes of Roslyn September 2012 CTP
2012-08-21
- Fixed a bug in searching callers; some callers were not found.
2012-08-05
- Added "precompile" option to compile the solutions in memory at program startup to speed things up
2012-08-04
- Added class name search ability and "part of text" search
2012-08-01
- Used Parallel.ForEach for searching. Roughly 2x as fast with 2 cores; I was not able to test with 4 cores, but it is probably about 4x as fast.
2012-07-28
- More unit tests (TabController)
- Use .Any() instead of Count() > 0
- Unit test to test performance of @"A".ToUpper().Contains(@"B".ToUpper()) versus @"A".IndexOf(@"B", StringComparison.OrdinalIgnoreCase)
2012-07-24
- Added unit tests
- Able to search in VB.NET code also
2012-07-18
- Added property search ability
- Input check on search textbox
- Remove leading / trailing spaces on text from search textbox when click [Search]
- Show "Method:" or "Property:" depending on which search type is selected
2012-07-16
- Fixed ability to Copy (Ctrl-C) from the FastColoredTextBox
- Show hourglass icon on tabs when threads are running
- If you click button [New tab] the program automatically jumps to the next tab
- Separator lines between tabs
- Changed text of include / exclude text fields to better describe what they mean
2012-07-15
- Fixed issue causing error with parentheses in search text
- Added extra comments to source
- Tested if different types of method definitions can be searched
- Some refactoring: regions etc.
- Added MessageBox for button "Update solution List".