MongoDB 3.2, C# MongoClient 2.x, Text Search: How To Deal With...

31 May 2016 | CPOL | 1 min read
For those working with text search in MongoDB 3.2 and the new C# driver, here is some advice.

Introduction

Have you ever tried to use text search in MongoDB 3.2 with the C# driver 2.x? It is not as easy as it sounds. Here is some advice.

Constructing a Text Index

First, every text search requires an index of type Text on the collection you want to search.

This call creates a text index named "MyFieldTextIndex" on a string field named "MyField":

C#
MyCollection.Indexes.CreateOne(
              Builders<BsonDocument>.IndexKeys.Text("MyField"),
              new CreateIndexOptions() {DefaultLanguage = "french",Name="MyFieldTextIndex"});

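This call creates a text index named "TextIndex" that covers ALL string fields in "MyCollection". Note that MongoDB allows only one text index per collection, so choose either the single-field index above or this wildcard index: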

C#
MyCollection.Indexes.CreateOne(
              Builders<BsonDocument>.IndexKeys.Text("$**"), 
              new CreateIndexOptions() {DefaultLanguage = "french",Name="TextIndex"});

The "DefaultLanguage" option tells MongoDB which language the text is written in, and therefore which stop-word list and stemming rules to use. The default is "english".
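
The snippets above assume that MyCollection is already an IMongoCollection<BsonDocument>. As a minimal sketch of how it might be obtained with the 2.x driver: the connection string, the database name "test" and the class name IndexSetup are placeholders chosen for illustration, not values from the original tip.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;

static class IndexSetup
{
    static void Main()
    {
        // Assumption: a MongoDB server is listening locally on the default port.
        var client = new MongoClient("mongodb://localhost:27017");

        // Assumption: "test" is a placeholder database name.
        IMongoDatabase database = client.GetDatabase("test");
        IMongoCollection<BsonDocument> MyCollection =
            database.GetCollection<BsonDocument>("MyCollection");

        // Same call as in the article; CreateOne returns the name of the index.
        string indexName = MyCollection.Indexes.CreateOne(
            Builders<BsonDocument>.IndexKeys.Text("MyField"),
            new CreateIndexOptions { DefaultLanguage = "french", Name = "MyFieldTextIndex" });

        Console.WriteLine("Created index: " + indexName);
    }
}
```

This is written against the 2.x driver used throughout the tip; later driver versions may expect a CreateIndexModel instead of separate keys and options.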

Query Data

Now you can query your collection by using the aggregate pipeline or the standard query pipeline.

How to query and sort by Score:
C#
MyCollection.Aggregate()
            .Match(Builders<BsonDocument>.Filter.Text("firstword secondword"))
            .Sort(Builders<BsonDocument>.Sort.MetaTextScore("textScore"))
            .ToList();

In this query, MongoDB searches every document in MyCollection for "firstword" or "secondword".
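
The search string follows the syntax of MongoDB's $text operator, so it can express more than a simple OR of words. The sketch below shows three common forms; the class and method names are hypothetical and exist only to label each variant.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

static class TextSearchExamples
{
    // Matches documents containing "firstword" OR "secondword".
    public static FilterDefinition<BsonDocument> AnyWord()
    {
        return Builders<BsonDocument>.Filter.Text("firstword secondword");
    }

    // Escaped double quotes ask for the exact phrase "firstword secondword".
    public static FilterDefinition<BsonDocument> ExactPhrase()
    {
        return Builders<BsonDocument>.Filter.Text("\"firstword secondword\"");
    }

    // A leading hyphen excludes documents that contain "secondword".
    public static FilterDefinition<BsonDocument> ExcludeWord()
    {
        return Builders<BsonDocument>.Filter.Text("firstword -secondword");
    }
}
```

Any of these filters can be passed to Match or Find exactly like the filter in the query above.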

What If You Want to Get the Document Score
C#
MyCollection.Aggregate()
             .Match(Builders<BsonDocument>.Filter.Text("firstword secondword"))
             .Project(Builders<BsonDocument>.Projection.MetaTextScore("textScore"))
             .Sort(Builders<BsonDocument>.Sort.MetaTextScore("textScore"))
             .ToList();

With the aggregate pipeline, "Project" writes the score into a "textScore" field of a new document and keeps only the "_id" field of each document returned by "Match".

If you want the full result document AND the score, use the standard query pipeline with Find:
C#
MyCollection.Find(Builders<BsonDocument>.Filter.Text("firstword secondword"))
            .Project(Builders<BsonDocument>.Projection.MetaTextScore("textScore"))
            .Sort(Builders<BsonDocument>.Sort.MetaTextScore("textScore"))
            .ToList();

With the standard pipeline, the "Project" method adds a "textScore" field of type double at the end of each BsonDocument.
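
Once the score has been projected, it can be read back like any other field. The following is a minimal sketch, assuming the collection already has a text index; the class ScoreReader and the method PrintScores are hypothetical names used for illustration.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;

static class ScoreReader
{
    public static void PrintScores(IMongoCollection<BsonDocument> collection)
    {
        // Same Find query as in the article.
        var results = collection
            .Find(Builders<BsonDocument>.Filter.Text("firstword secondword"))
            .Project(Builders<BsonDocument>.Projection.MetaTextScore("textScore"))
            .Sort(Builders<BsonDocument>.Sort.MetaTextScore("textScore"))
            .ToList();

        foreach (BsonDocument doc in results)
        {
            // "textScore" is the field name passed to MetaTextScore above.
            double score = doc["textScore"].ToDouble();
            Console.WriteLine("{0}: {1}", doc["_id"], score);
        }
    }
}
```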

You can combine a text filter with other field filters by using an And or Or filter, but a query cannot contain more than one text filter:

C#
MyCollection.Find(
         Builders<BsonDocument>.Filter.And(Builders<BsonDocument>.Filter.Text("firstword secondword"),
                                        Builders<BsonDocument>.Filter.Eq("AnotherField","fieldvalue")))
         .Project(Builders<BsonDocument>.Projection.MetaTextScore("textScore"))
         .Sort(Builders<BsonDocument>.Sort.MetaTextScore("textScore"))
         .ToList();

This query is valid: it returns all documents that match the text filter and whose "AnotherField" equals "fieldvalue".

C#
MyCollection.Find(
         Builders<BsonDocument>.Filter.And
         (Builders<BsonDocument>.Filter.Text("firstword secondword"),
                     Builders<BsonDocument>.Filter.Text("fieldvalue")))
         .Project(Builders<BsonDocument>.Projection.MetaTextScore("textScore"))
         .Sort(Builders<BsonDocument>.Sort.MetaTextScore("textScore"))
         .ToList();

This query is wrong: it compiles (and IntelliSense accepts it), but it fails at run time because it contains two text filters.
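
One possible workaround, offered as a sketch rather than an equivalent rewrite, is to merge the search terms into a single text filter. Be aware that this changes the meaning: all terms are then ORed together instead of both text conditions having to match. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using MongoDB.Bson;
using MongoDB.Driver;

static class SingleTextFilter
{
    public static List<BsonDocument> Search(IMongoCollection<BsonDocument> collection)
    {
        // One text filter holding all the terms: "firstword" OR "secondword" OR "fieldvalue".
        var filter = Builders<BsonDocument>.Filter.Text("firstword secondword fieldvalue");

        return collection
            .Find(filter)
            .Project(Builders<BsonDocument>.Projection.MetaTextScore("textScore"))
            .Sort(Builders<BsonDocument>.Sort.MetaTextScore("textScore"))
            .ToList();
    }
}
```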

Points of Interest

That's all, folks! I hope these little tips help you with text search in MongoDB.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Technical Lead
France
