Click here to Skip to main content
15,879,535 members
Articles / Database Development / MySQL

Email Search Engine

Rate me:
Please Sign up or sign in to vote.
5.00/5 (11 votes)
11 Jan 2015CPOL11 min read 22.4K   338   13  
Building on My SQL Storage Engines and APIs to Conduct Complex Analysis Using SQL Query Language: Part 5

Introduction

MySQL provides support for many storage engines which manage storage and retrieval. MyISAM and InnoDB are common. On table creation engine can be specified. A basic storage engine allows read only tables. Most engines read data from files and feed to MYSQL. On the other hand, our idea is to retrieve data from API’s and feed it to MYSQL. This will allow us to conduct complex analysis using sql on data retrieved from API’s. The idea is to write an engine for each set of related API’s. In Part 4 of this article series, I presented the idea of mysql storage engine which exposes NOSQL data on mysql console for you to conduct complex analysis.  This allows doing complex analysis on data from Parse.com. This article series continues on, this time presenting the idea of mysql storage engine which exposes your emails on mysql console for you to conduct complex analysis. Specifically, this allows doing complex analysis on emails in your Gmail inbox

Background

A conventional approach could be to export emails from Gmail and insert into mysql.

  • One drawback of this approach is that you have to insert and delete rows into the table each time the email comes or gets deleted.
  • Another drawback is that you need to trigger that logging application which inserts and delete rows from the table before querying the table.
  • Another drawback is that even if you need to retrieve few rows, all emails are maintained in table.
  • Another drawback is the table may be locked during insertion or deletion which can create performance bottlenecks for the retrieval entity.
  • Another drawback is duplication of data
  • if your logging application (which inserts and deletes emails) is down for a week then you have to wait for some time to retrieve today's emails

Instead the proposed approach avoids these problems as follows:

  • No need to insert and delete rows from the table.
  • No need to trigger the logging application before querying the table.
  • No need to get all emails when few emails are required.
  • No bottlenecks, as there are no insertion and deletion.
  • No need to wait for logging application to insert emails for the whole week.

Also the proposed approach is superior to Gmail default search capability because it harness the power of mysql parser to do analysis on your emails which was previously never done.

Email Search Engine

This mysql storage engine is different from storage engine presented in part 4 because it retrieves data from Gmail and feeds to mysql. A table named Email with engine EmailSearchEngine is created.

 

SQL
CREATE TABLE `email` (
  `UID` int(10) unsigned DEFAULT NULL,
  `from` varchar(1024) DEFAULT NULL,
  `to` varchar(1024) DEFAULT NULL,
  `cc` varchar(1024) DEFAULT NULL,
  `subject` varchar(1024) DEFAULT NULL,
  `body` LONGTEXT,
  `messagedate` datetime DEFAULT NULL
) ENGINE=EmailSearchEngine;

 When user enters a select query on Email table, the control passes to EmailSearchEngine in rnd_init method. This method checks that table name is really “Email”. Then it checks that where clause parse tree is present or not. If yes, then where clause parse tree is traversed to be converted to IMAP search commands. Otherwise all emails in the inbox are fed to MySQL. To retrieve emails we use vmime.NET [1] which is for C++ and .NET Windows programmers get a very versatile library to send and download emails via SMTP, POP3 and IMAP with TLS and SSL support.
Before continuing to rnd_init following data structures need to be explained.

C++
vector<Email> m_EmailVector;
//(belongs to ha_example class and maintains all emails for the query)
int m_EmailVectorIndex; 
//(belongs to ha_example class and maintains the index to m_EmailVector)

struct Email
{
    int uid;
    string from;
    string to;
    string cc;
    string subject;
    string body;
    vmime::datetime receiveTime;
};

 

C++
int ha_example::rnd_init(bool scan)
{
    DBUG_ENTER("ha_example::rnd_init");
    
    if(_stricmp(table_share->table_name.str,"Email")==0)
    {
        THD *thd=this->ha_thd();
        SELECT_LEX *select_lex=&thd->lex->select_lex;
        m_EmailVectorIndex=0;
        m_EmailVector.clear();
        if(select_lex->where==0)
        {
            stack<ParseStackElement> parseStack;
            ParseStackElement topElem;//=parseStack.top();
            GetAllEmails(m_EmailVector);
        }
        else
        {
            stack<ParseStackElement> parseStack;
            select_lex->where->traverse_cond(My_Cond_traverser,(void*)&parseStack,Item::traverse_order::POSTFIX);
            if(!parseStack.empty()&&parseStack.size()==1)
            {
                ParseStackElement topElem=parseStack.top();
                if(topElem.emailWhereClauseUnitVector.size()==1)
                {
                    if(topElem.emailWhereClauseUnitVector[0].uid!=-1&&topElem.emailWhereClauseUnitVector[0].from==""&&topElem.emailWhereClauseUnitVector[0].to==""&&topElem.emailWhereClauseUnitVector[0].cc==""&&topElem.emailWhereClauseUnitVector[0].subject==""&&topElem.emailWhereClauseUnitVector[0].body=="")
                    {
                        GetSingleEmail(topElem.emailWhereClauseUnitVector[0].uid,m_EmailVector);
                    }
                    else
                    {
                        GetEmails(topElem.emailWhereClauseUnitVector,m_EmailVector);
                    }
                }
                else
                {
                    GetEmails(topElem.emailWhereClauseUnitVector,m_EmailVector);
                }
            }
            else
            {
                GetAllEmails(m_EmailVector);
            }
        }
    }
    DBUG_RETURN(0);
}

The stack of ParseStackElement struct is created and maintained during post fix traversal of where clause parse tree.

C++
struct ParseStackElement
{
    vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
    Item *item;
};

The ParseStackElement contains vector of type EmailWhereClauseUnit:

C++
class EmailWhereClauseUnit
{
public:
    int uid;
    string from;
    string to;
    string cc;
    string subject;
    string body;
    time_t emailTimeStampStart;
    time_t emailTimeStampEnd;
    EmailWhereClauseUnit()
    {
        uid=-1;
    }
};

The EmailWhereClauseUnit is a basic unit of where clause and contains uid, from, to, cc, subject, body and messagedate. For example consider following where clause:

SQL
Select * from email where `from`="example@yahoo.com" and messagedate >="2014-12-06 00:00:00" and messagedate<="2014-12-06 23:59:59";

The equivalent EmailWhereClauseUnit is:

  • from=example@yahoo.com
  • emailTimeStampStart=2014-12-06 00:00:00
  • emailTimeStampEnd=2014-12-06 23:59:59

The rnd_init calls traverse_cond on where clause parse tree root node in order to traverse it in postfix order. The My_Cond_traverser function is specified to be called during traversal. The parse stack is passed to this function. This function checks for the following:

  • In operator: In this case, Item_func_in_case is called to get the set of 'from' or 'to' or 'cc' or 'subject' or 'body' or 'message date' specified in the “in clause” and pushed to the stack
  • Equality operator: In this case, Item_func_eq_case is called to get the 'from' or 'to' or 'cc' or 'subject' or 'body' or 'message date' specified and pushed to stack
  • Less or Less and equal operator: In this case, Item_func_less_case is called to get the 'message date' range specified which starts with zero and pushed to stack
  • Greater or greater and equal operator: In this case, Item_func_greater_case is called to get the message 'date range' specified which ends with MYSQL_TIMESTAMP_MAX_VALUE and pushed to stack
  • between operator: In this case, Item_func_between_case is called to get the 'message date' range specified and pushed to stack
  • And/Or operator: In this case, stack is popped until the children of item (My_Cond_traverser first argument) are coming. During pop, the vectors of type EmailWhereClauseUnit are Anded and Ored to get the resulting set.
C++
void My_Cond_traverser (const Item *item, void *arg)
{
    stack<ParseStackElement> *parseStack=(stack<ParseStackElement> *)arg;
    vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
    if(dynamic_cast<const Item_func_in*>(item))
    {
        emailWhereClauseUnitVector=Item_func_in_case(item);
        ParseStackElement elem;
        elem.emailWhereClauseUnitVector=emailWhereClauseUnitVector;
        elem.item=(Item *)item;
        parseStack->push(elem);
    }
    else if(dynamic_cast<const Item_func_eq*>(item))
    {
        emailWhereClauseUnitVector=Item_func_eq_case(item);
        ParseStackElement elem;
        elem.emailWhereClauseUnitVector=emailWhereClauseUnitVector;
        elem.item=(Item *)item;
        parseStack->push(elem);
    }
    else if(dynamic_cast<const Item_func_ge*>(item)||dynamic_cast<const Item_func_gt*>(item))
    {
        emailWhereClauseUnitVector=Item_func_greater_case(item);
        ParseStackElement elem;
        elem.emailWhereClauseUnitVector=emailWhereClauseUnitVector;
        elem.item=(Item *)item;
        parseStack->push(elem);
    }
    else if(dynamic_cast<const Item_func_le*>(item)||dynamic_cast<const Item_func_lt*>(item))
    {
        emailWhereClauseUnitVector=Item_func_less_case(item);
        ParseStackElement elem;
        elem.emailWhereClauseUnitVector=emailWhereClauseUnitVector;
        elem.item=(Item *)item;
        parseStack->push(elem);
    }
    else if(dynamic_cast<const Item_func_between*>(item))
    {
        emailWhereClauseUnitVector=Item_func_between_case(item);
        ParseStackElement elem;
        elem.emailWhereClauseUnitVector=emailWhereClauseUnitVector;
        elem.item=(Item *)item;
        parseStack->push(elem);
    }
    else if(dynamic_cast<const Item_cond*>(item))
    {
        const Item_cond *itemCondC=dynamic_cast<const Item_cond*>(item);
        Item_cond *itemCond=(Item_cond *)itemCondC;
        
        vector<EmailWhereClauseUnit> result;
        bool isAnd=false;
        if(dynamic_cast<const Item_cond_and*>(item))
        {
            isAnd=true;
        }
        if (!parseStack->empty()&&isChildOf((Item_cond*)item,parseStack->top().item))
        {
            result=parseStack->top().emailWhereClauseUnitVector;
            parseStack->pop();
        }
        while (!parseStack->empty()&&isChildOf((Item_cond*)item,parseStack->top().item))    
        {
            if(result.empty()&&!parseStack->top().emailWhereClauseUnitVector.empty())
            {
                result=parseStack->top().emailWhereClauseUnitVector;
            }
            else if(!result.empty()&&parseStack->top().emailWhereClauseUnitVector.empty())
            {
                result=result;
            }
            else
            {
                if(isAnd)
                {
                    result=And(result,parseStack->top().emailWhereClauseUnitVector);
                }
                else
                {
                    result=Or(result,parseStack->top().emailWhereClauseUnitVector);
                }
            }
            parseStack->pop();            
        }
        ParseStackElement elem;
        elem.emailWhereClauseUnitVector=result;
        elem.item=(Item *)item;
        parseStack->push(elem);
    }
}

The following function extracts the 'from' or 'to' or 'cc' or 'subject' or 'body' or 'message date' entered by user in the equal clause. The code checks that item is of Item_func type. Then it checks that the first argument is field and second is not field. Then it checks that the field is 'from' or 'to' or 'cc' or 'subject' or 'body' or 'message date'.

C++
vector<EmailWhereClauseUnit> Item_func_eq_case(const Item *item)
{
    vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
    const Item_func * itemFunction=dynamic_cast<const Item_func*>(item);
    Item **arguments=0;
    if(itemFunction)
    {
        arguments=itemFunction->arguments();
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    Item *field=0;
    Item *value=0;
    if(dynamic_cast <Item_field*>(arguments[0]))
    {
        field = arguments[0];
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    if(dynamic_cast <Item_field*>(arguments[1]))
    {
        return emailWhereClauseUnitVector;
    }
    else
    {
        value = arguments[1];
    }
    Item_field *f;

    if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"from")==0)
    {
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from=value->item_name.ptr();
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
        myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"to")==0)
    {
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to=value->item_name.ptr();
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
        myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"cc")==0)
    {
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc=value->item_name.ptr();
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
        myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"subject")==0)
    {
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject=value->item_name.ptr();
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
        myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"body")==0)
    {
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body=value->item_name.ptr();
        myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
        myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"messagedate")==0)
    {
        MYSQL_TIME emailDateTime={0};
        value->get_date(&emailDateTime,0);
        tm tempDateTime={0};
        tempDateTime.tm_year=emailDateTime.year-1900;
        tempDateTime.tm_mon=emailDateTime.month-1;
        tempDateTime.tm_mday=emailDateTime.day;
        tempDateTime.tm_hour=emailDateTime.hour;
        tempDateTime.tm_min=emailDateTime.minute;
        tempDateTime.tm_sec=emailDateTime.second;
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=mktime(&tempDateTime);
        myEmailWhereClauseUnit.emailTimeStampEnd=mktime(&tempDateTime);
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"UID")==0)
    {
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.uid=atoi(value->item_name.ptr());
        myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
        myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }
    return emailWhereClauseUnitVector;
}

The following function extracts the set of 'from' or 'to' or 'cc' or 'subject' or 'body' or 'message dates'  specified in “in clause”. First check that item is of Item_func type. Then check that first argument is field and the following arguments are not field type.

C++
vector<EmailWhereClauseUnit> Item_func_in_case(const Item *item)
{
    vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
    const Item_func * itemFunction=dynamic_cast<const Item_func*>(item);
    Item **arguments=0;
    int inArgcount=0;
    if(itemFunction)
    {
        arguments=itemFunction->arguments();
        inArgcount=itemFunction->arg_count;
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    Item *field=0;
    Item *value=0;
    if(dynamic_cast <Item_field*>(arguments[0]))
    {
        field = arguments[0];
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"from")==0)
    {
        LONG index;
        for (index = 1; index < inArgcount; index++)
        {
            if(dynamic_cast <Item_field*>(arguments[index]))
            {
                continue;
            }
            else
            {
                value = arguments[index];
            }
            EmailWhereClauseUnit myEmailWhereClauseUnit;
            myEmailWhereClauseUnit.from=value->item_name.ptr();
            myEmailWhereClauseUnit.to="";
            myEmailWhereClauseUnit.cc="";
            myEmailWhereClauseUnit.subject="";
            myEmailWhereClauseUnit.body="";
            myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
            myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
            emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
        }
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"to")==0)
    {
        LONG index;
        for (index = 1; index < inArgcount; index++)
        {
            if(dynamic_cast <Item_field*>(arguments[index]))
            {
                continue;
            }
            else
            {
                value = arguments[index];
            }
            EmailWhereClauseUnit myEmailWhereClauseUnit;
            myEmailWhereClauseUnit.from="";
            myEmailWhereClauseUnit.to=value->item_name.ptr();
            myEmailWhereClauseUnit.cc="";
            myEmailWhereClauseUnit.subject="";
            myEmailWhereClauseUnit.body="";
            myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
            myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
            emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
        }
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"cc")==0)
    {
        LONG index;
        for (index = 1; index < inArgcount; index++)
        {
            if(dynamic_cast <Item_field*>(arguments[index]))
            {
                continue;
            }
            else
            {
                value = arguments[index];
            }
            EmailWhereClauseUnit myEmailWhereClauseUnit;
            myEmailWhereClauseUnit.from="";
            myEmailWhereClauseUnit.to="";
            myEmailWhereClauseUnit.cc=value->item_name.ptr();
            myEmailWhereClauseUnit.subject="";
            myEmailWhereClauseUnit.body="";
            myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
            myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
            emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
        }
    }
    
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"subject")==0)
    {
        LONG index;
        for (index = 1; index < inArgcount; index++)
        {
            if(dynamic_cast <Item_field*>(arguments[index]))
            {
                continue;
            }
            else
            {
                value = arguments[index];
            }
            EmailWhereClauseUnit myEmailWhereClauseUnit;
            myEmailWhereClauseUnit.from="";
            myEmailWhereClauseUnit.to="";
            myEmailWhereClauseUnit.cc="";
            myEmailWhereClauseUnit.subject=value->item_name.ptr();
            myEmailWhereClauseUnit.body="";
            myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
            myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
            emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
        }
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"body")==0)
    {
        LONG index;
        for (index = 1; index < inArgcount; index++)
        {
            if(dynamic_cast <Item_field*>(arguments[index]))
            {
                continue;
            }
            else
            {
                value = arguments[index];
            }
            EmailWhereClauseUnit myEmailWhereClauseUnit;
            myEmailWhereClauseUnit.from="";
            myEmailWhereClauseUnit.to="";
            myEmailWhereClauseUnit.cc="";
            myEmailWhereClauseUnit.subject="";
            myEmailWhereClauseUnit.body=value->item_name.ptr();
            myEmailWhereClauseUnit.emailTimeStampStart=MYSQL_TIMESTAMP_MIN_VALUE;
            myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
            emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
        }
    }
    else if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"messagedate")==0)
    {
        LONG index;
        for (index = 1; index < inArgcount; index++)
        {
            if(dynamic_cast <Item_field*>(arguments[index]))
            {
                continue;
            }
            else
            {
                value = arguments[index];
            }
            MYSQL_TIME emailDateTime={0};
            value->get_date(&emailDateTime,0);
            tm tempDateTime={0};
            tempDateTime.tm_year=emailDateTime.year-1900;
            tempDateTime.tm_mon=emailDateTime.month-1;
            tempDateTime.tm_mday=emailDateTime.day;
            tempDateTime.tm_hour=emailDateTime.hour;
            tempDateTime.tm_min=emailDateTime.minute;
            tempDateTime.tm_sec=emailDateTime.second;
            EmailWhereClauseUnit myEmailWhereClauseUnit;
            myEmailWhereClauseUnit.from="";
            myEmailWhereClauseUnit.to="";
            myEmailWhereClauseUnit.cc="";
            myEmailWhereClauseUnit.subject="";
            myEmailWhereClauseUnit.body="";
            myEmailWhereClauseUnit.emailTimeStampStart=mktime(&tempDateTime);
            myEmailWhereClauseUnit.emailTimeStampEnd=mktime(&tempDateTime);
            emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
        }
    }
    return emailWhereClauseUnitVector;
}

The following function extracts the 'message date' specified in “less or less than equal clause”. First check that item is of Item_func type. Then check that first argument is field and the following arguments are not field type. The less than clause is converted to where clause unit where emailTimeStampStart is zero and emailTimeStampEnd is equal to what is specified in less clause.

C++
vector<EmailWhereClauseUnit> Item_func_less_case(const Item *item)
{
    vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
    const Item_func * itemFunction=dynamic_cast<const Item_func*>(item);
    Item **arguments=0;
    if(itemFunction)
    {
        arguments=itemFunction->arguments();
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    Item *field=0;
    Item *value=0;
    if(dynamic_cast <Item_field*>(arguments[0]))
    {
        field = arguments[0];
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    if(dynamic_cast <Item_field*>(arguments[1]))
    {
        return emailWhereClauseUnitVector;
    }
    else
    {
        value = arguments[1];
    }
    Item_field *f;
    if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"messagedate")==0)
    {
        MYSQL_TIME emailDateTime={0};
        value->get_date(&emailDateTime,0);
        tm tempDateTime={0};
        tempDateTime.tm_year=emailDateTime.year-1900;
        tempDateTime.tm_mon=emailDateTime.month-1;
        tempDateTime.tm_mday=emailDateTime.day;
        tempDateTime.tm_hour=emailDateTime.hour;
        tempDateTime.tm_min=emailDateTime.minute;
        tempDateTime.tm_sec=emailDateTime.second;
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=0;
        myEmailWhereClauseUnit.emailTimeStampEnd=mktime(&tempDateTime);
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }

    return emailWhereClauseUnitVector;
}

The following function extracts the message date specified in “greater or greater than equal clause”. First check that item is of Item_func type. Then check that first argument is field and the following arguments are not field type. The greater than clause is converted to where clause unit where emailTimeStampStart is what is specified in greater clause and emailTimeStampEndis equal to MYSQL_TIMESTAMP_MAX_VALUE value.

C++
vector<EmailWhereClauseUnit> Item_func_greater_case(const Item *item)
{
    vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
    const Item_func * itemFunction=dynamic_cast<const Item_func*>(item);
    Item **arguments=0;
    if(itemFunction)
    {
        arguments=itemFunction->arguments();
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    Item *field=0;
    Item *value=0;
    if(dynamic_cast <Item_field*>(arguments[0]))
    {
        field = arguments[0];
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    if(dynamic_cast <Item_field*>(arguments[1]))
    {
        return emailWhereClauseUnitVector;
    }
    else
    {
        value = arguments[1];
    }
    Item_field *f;
    if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"messagedate")==0)
    {
        MYSQL_TIME emailDateTime={0};
        value->get_date(&emailDateTime,0);
        tm tempDateTime={0};
        tempDateTime.tm_year=emailDateTime.year-1900;
        tempDateTime.tm_mon=emailDateTime.month-1;
        tempDateTime.tm_mday=emailDateTime.day;
        tempDateTime.tm_hour=emailDateTime.hour;
        tempDateTime.tm_min=emailDateTime.minute;
        tempDateTime.tm_sec=emailDateTime.second;
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=mktime(&tempDateTime);
        myEmailWhereClauseUnit.emailTimeStampEnd=MYSQL_TIMESTAMP_MAX_VALUE;
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }

    return emailWhereClauseUnitVector;
}

The following function extracts the 'message date' specified in “between clause”. First check that item is of Item_func type. Then check that first argument is field and the following arguments are not field type. The between clause is converted to where clause unit where emailTimeStampStart and emailTimeStampEnd is what is specified in between clause.

C++
vector<EmailWhereClauseUnit> Item_func_between_case(const Item *item)
{
    vector<EmailWhereClauseUnit> emailWhereClauseUnitVector;
    const Item_func * itemFunction=dynamic_cast<const Item_func*>(item);
    Item **arguments=0;
    if(itemFunction)
    {
        arguments=itemFunction->arguments();
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    Item *field=0;
    Item *value1=0;
    Item *value2=0;
    if(dynamic_cast <Item_field*>(arguments[0]))
    {
        field = arguments[0];
    }
    else
    {
        return emailWhereClauseUnitVector;
    }
    if(dynamic_cast <Item_field*>(arguments[1]))
    {
        return emailWhereClauseUnitVector;
    }
    else
    {
        value1 = arguments[1];
    }
    if(dynamic_cast <Item_field*>(arguments[2]))
    {
        return emailWhereClauseUnitVector;
    }
    else
    {
        value2 = arguments[2];
    }
    Item_field *f;
    if(field&&field->item_name.ptr()&&_stricmp(field->item_name.ptr(),"messagedate")==0)
    {
        MYSQL_TIME emailDateTime={0};
        value1->get_date(&emailDateTime,0);
        tm tempDateTime={0};
        tempDateTime.tm_year=emailDateTime.year-1900;
        tempDateTime.tm_mon=emailDateTime.month-1;
        tempDateTime.tm_mday=emailDateTime.day;
        tempDateTime.tm_hour=emailDateTime.hour;
        tempDateTime.tm_min=emailDateTime.minute;
        tempDateTime.tm_sec=emailDateTime.second;
        EmailWhereClauseUnit myEmailWhereClauseUnit;
        myEmailWhereClauseUnit.from="";
        myEmailWhereClauseUnit.to="";
        myEmailWhereClauseUnit.cc="";
        myEmailWhereClauseUnit.subject="";
        myEmailWhereClauseUnit.body="";
        myEmailWhereClauseUnit.emailTimeStampStart=mktime(&tempDateTime);
        value2->get_date(&emailDateTime,0);

        tempDateTime.tm_year=emailDateTime.year-1900;
        tempDateTime.tm_mon=emailDateTime.month-1;
        tempDateTime.tm_mday=emailDateTime.day;
        tempDateTime.tm_hour=emailDateTime.hour;
        tempDateTime.tm_min=emailDateTime.minute;
        tempDateTime.tm_sec=emailDateTime.second;
        myEmailWhereClauseUnit.emailTimeStampEnd=mktime(&tempDateTime);
        emailWhereClauseUnitVector.push_back(myEmailWhereClauseUnit);
    }

    return emailWhereClauseUnitVector;
}

The following is a helper function used by My_Cond_traverser to check the parent child relationship

C++
bool isChildOf(Item_cond *parent,Item *child)
{
    List_iterator<Item> li(*(parent->argument_list()));
    Item *it= NULL;
    while ((it= li++))    
    {
        if(child==it)
            return true;
    }
    return false;
}

The following function Ands the vector A of EmailWhereClauseUnit and vector B of EmailWhereClauseUnit. In this function, both vectors are traversed in a nested loop. If one struct of type EmailWhereClauseUnit from A matches other struct of same type from B, then this EmailWhereClauseUnit is added to the resultant vector. The match is found by checking 'from' and 'to' and 'cc' and 'subject' and 'body' and 'message date' from struct (EmailWhereClauseUnit) belonging to A and B. If both 'from' are not empty then they should match otherwise skip them. Similarly if both 'to' are not empty then they should match otherwise skip them and so on. If 'from' in one is empty and other is not empty then common 'from' is equal to the non empty one. Similarly if 'to' in one is empty and other is not empty then common 'to' is equal to the non empty one. Similarly if there is overlap between 'message date' range in EmailWhereClauseUnit from A and B, then the common value is inserted into the result vector.

C++
vector<EmailWhereClauseUnit> And(vector<EmailWhereClauseUnit> A,vector<EmailWhereClauseUnit> B)
{
    vector<EmailWhereClauseUnit> result;

    for(vector<EmailWhereClauseUnit>::iterator iter1=A.begin();iter1!=A.end();iter1++)
    {
        for(vector<EmailWhereClauseUnit>::iterator iter2=B.begin();iter2!=B.end();iter2++)
        {
            string from="";
            if(iter1->from!=""&&iter2->from!="")
            {
                if(iter1->from==iter2->from)
                {
                    from=iter1->from;
                }
                else
                {
                    continue;
                }
            }
            else if(iter1->from!=""&&iter2->from=="")
            {
                from=iter1->from;
            }
            else if(iter1->from==""&&iter2->from!="")
            {
                from=iter2->from;
            }
            else if(iter1->from==""&&iter2->from=="")
            {
            }

            string to="";
            if(iter1->to!=""&&iter2->to!="")
            {
                if(iter1->to==iter2->to)
                {
                    to=iter1->to;
                }
                else
                {
                    continue;
                }
            }
            else if(iter1->to!=""&&iter2->to=="")
            {
                to=iter1->to;
            }
            else if(iter1->to==""&&iter2->to!="")
            {
                to=iter2->to;
            }
            else if(iter1->to==""&&iter2->to=="")
            {
            }

            string cc="";
            if(iter1->cc!=""&&iter2->cc!="")
            {
                if(iter1->cc==iter2->cc)
                {
                    cc=iter1->cc;
                }
                else
                {
                    continue;
                }
            }
            else if(iter1->cc!=""&&iter2->cc=="")
            {
                cc=iter1->cc;
            }
            else if(iter1->cc==""&&iter2->cc!="")
            {
                cc=iter2->cc;
            }
            else if(iter1->cc==""&&iter2->cc=="")
            {
            }

            string subject="";
            if(iter1->subject!=""&&iter2->subject!="")
            {
                if(iter1->subject==iter2->subject)
                {
                    subject=iter1->subject;
                }
                else
                {
                    continue;
                }
            }
            else if(iter1->subject!=""&&iter2->subject=="")
            {
                subject=iter1->subject;
            }
            else if(iter1->subject==""&&iter2->subject!="")
            {
                subject    =iter2->subject;
            }
            else if(iter1->subject==""&&iter2->subject=="")
            {
            }

            string body="";
            if(iter1->body!=""&&iter2->body!="")
            {
                if(iter1->body==iter2->body)
                {
                    body=iter1->body;
                }
                else
                {
                    continue;
                }
            }
            else if(iter1->body!=""&&iter2->body=="")
            {
                body=iter1->body;
            }
            else if(iter1->body==""&&iter2->body!="")
            {
                body=iter2->body;
            }
            else if(iter1->body==""&&iter2->body=="")
            {
            }
            unsigned int emailTimeStampStart;
            unsigned int emailTimeStampEnd;
            bool common=FindAndBetweenTwoRanges(iter1->emailTimeStampStart,iter1->emailTimeStampEnd,iter2->emailTimeStampStart,iter2->emailTimeStampEnd,emailTimeStampStart,emailTimeStampEnd);
            if(common==false)
            {
                continue;
            }
            EmailWhereClauseUnit myEmailWhereClauseUnit;
            myEmailWhereClauseUnit.from=from;
            myEmailWhereClauseUnit.to=to;
            myEmailWhereClauseUnit.cc=cc;
            myEmailWhereClauseUnit.subject=subject;
            myEmailWhereClauseUnit.body=body;
            myEmailWhereClauseUnit.emailTimeStampStart=emailTimeStampStart;
            myEmailWhereClauseUnit.emailTimeStampEnd=emailTimeStampEnd;
            result.push_back(myEmailWhereClauseUnit);
        }
    }
    return result;
}

The following function finds and between the two message date ranges. If range 1 start is less than range 2 start and range 2 start is less than range 1 end then there is an overlap. The resultant overlap is from range 2 start and to range 1 end or rang 2 end depending upon which is lesser and vice versa.

C++
bool FindAndBetweenTwoRanges(unsigned int range1Start,unsigned int range1End,unsigned int range2Start,unsigned int range2End,unsigned int &resRangeStart,unsigned int &resRangeEnd)
{
    bool success=false;
    if(range1Start<range2Start)
    {
        if(range2Start<=range1End)
        {
            resRangeStart=range2Start;
            if(range1End<range2End)
            {
                resRangeEnd=range1End;
            }
            else
            {
                resRangeEnd=range2End;
            }
            success=true;
        }
    }
    else
    {
        if(range1Start<=range2End)
        {
            resRangeStart=range1Start;
            if(range2End<range1End)
            {
                resRangeEnd=range2End;
            }
            else
            {
                resRangeEnd=range1End;
            }
            success=true;
        }
    }
    return success;
}

The following function Ors the vector A of EmailWhereClauseUnit and vector B of EmailWhereClauseUnit. In this function, both vectors are traversed in a nested loop. If one struct of type EmailWhereClauseUnit from A has common with other struct of same type from B, then the common value is added to the resultant vector. The common is found by matching 'from' 'to' 'cc' 'subject' 'body' and 'message date' from struct (EmailWhereClauseUnit) belonging to A and B. If both 'from' are not empty then they should match otherwise skip them. Similarly if both 'to' are not empty then they should match otherwise skip them. If 'from' in one is empty and other is not empty then common 'from' is empty. Similarly if 'to' in one is empty and other is not empty then common to is empty. Similarly, if there is an overlap between 'message date' ranges then the common value is inserted into the result vector and erased from vector A and B. After nested loop completes, vector A is traversed and all elements are added to result vector. Similarly vector B is traversed and all elements are added to result vector.

C++
vector<EmailWhereClauseUnit> Or(vector<EmailWhereClauseUnit> A,vector<EmailWhereClauseUnit> B)
{
    vector<EmailWhereClauseUnit> result;
    
    for(vector<EmailWhereClauseUnit>::iterator iter1=A.begin();A.size()>0&&iter1!=A.end();iter1++)
    {
        for(vector<EmailWhereClauseUnit>::iterator iter2=B.begin();B.size()>0&&iter2!=B.end();iter2++)
        {
            bool commonFound=false;
            string from="";
            if(iter1->from!=""&&iter2->from!="")
            {
                if(iter1->from==iter2->from)
                {
                    from=iter1->from;
                    commonFound=true;
                }
                else
                {
                    continue;
                }
            }
            string to="";
            if(iter1->to!=""&&iter2->to!="")
            {
                if(iter1->to==iter2->to)
                {
                    to=iter1->to;
                    commonFound=true;
                }
                else
                {
                    continue;
                }
            }
            string cc="";
            if(iter1->cc!=""&&iter2->cc!="")
            {
                if(iter1->cc==iter2->cc)
                {
                    cc=iter1->cc;
                    commonFound=true;
                }
                else
                {
                    continue;
                }
            }
            string subject="";
            if(iter1->subject!=""&&iter2->subject!="")
            {
                if(iter1->subject==iter2->subject)
                {
                    subject=iter1->subject;
                    commonFound=true;
                }
                else
                {
                    continue;
                }
            }
            string body="";
            if(iter1->body!=""&&iter2->body!="")
            {
                if(iter1->body==iter2->body)
                {
                    body=iter1->body;
                    commonFound=true;
                }
                else
                {
                    continue;
                }
            }
            unsigned int emailTimeStampStart;
            unsigned int emailTimeStampEnd;
            if(FindOrBetweenTwoRanges(iter1->emailTimeStampStart,iter1->emailTimeStampEnd,iter2->emailTimeStampStart,iter2->emailTimeStampEnd,emailTimeStampStart,emailTimeStampEnd))
            {
                commonFound=true;
            }
            if(commonFound==true)
            {
                EmailWhereClauseUnit myEmailWhereClauseUnit;
                myEmailWhereClauseUnit.from=from;
                myEmailWhereClauseUnit.to=to;
                myEmailWhereClauseUnit.cc=cc;
                myEmailWhereClauseUnit.subject=subject;
                myEmailWhereClauseUnit.body=body;

                myEmailWhereClauseUnit.emailTimeStampStart=emailTimeStampStart;
                myEmailWhereClauseUnit.emailTimeStampEnd=emailTimeStampEnd;
                result.push_back(myEmailWhereClauseUnit);
                A.erase(iter1);
                B.erase(iter2);
                iter1=A.begin();
                iter2=B.begin();
            }
        }
    }
    for(vector<EmailWhereClauseUnit>::iterator iter1=A.begin();iter1!=A.end();iter1++)
    {
        result.push_back(*iter1);
    }
    for(vector<EmailWhereClauseUnit>::iterator iter2=B.begin();iter2!=B.end();iter2++)
    {
        result.push_back(*iter2);
    }
    return result;
}

The following function finds or between the two salary ranges. If range 1 start is less than range 2 start and range 2 start is less than range 1 end then there is an overlap. The resultant overlap is from range 1 start and to range 1 end or rang 2 end depending upon which is greater and vice versa.

C++
bool FindOrBetweenTwoRanges(unsigned int range1Start,unsigned int range1End,unsigned int range2Start,unsigned int range2End, unsigned int &resRangeStart,unsigned int &resRangeEnd)
{
    bool success=false;
    if(range1Start<range2Start)
    {
        if(range2Start<=range1End)
        {
            resRangeStart=range1Start;
            if(range1End>=range2End)
            {
                resRangeEnd=range1End;
            }
            else
            {
                resRangeEnd=range2End;
            }
            success=true;
        }
    }
    else
    {
        if(range1Start<=range2End)
        {
            resRangeStart=range2Start;
            if(range2End>=range1End)
            {
                resRangeEnd=range2End;
            }
            else
            {
                resRangeEnd=range1End;
            }
            success=true;
        }
    }
    return success;
}

When traverse_cond function completes, a check is made that parse stack contains only one element of type ParseStackElement. The emailWhereClauseUnitVector vector present in ParseStackElement is passed to GetEmails

C++
void GetEmails(vector<EmailWhereClauseUnit> &emailWhereClauseUnitVector,vector<Email> &emails)
{
    set<int> emailIdList;
    EmailWhereClauseUnitVectorToEmailsIdList(emailWhereClauseUnitVector,emailIdList);

    cImap *i_Imap=InitImap();
    if(i_Imap!=NULL)
    {
        try
        {

            for (set<int>::iterator iter=emailIdList.begin();iter!=emailIdList.end();iter++)                
            {
            
                GuardPtr<cEmailParser> i_Email = i_Imap->FetchEmailAt(*iter); // Sends here the FETCH HEADER command
                Email myEmail;
                GetEmail(myEmail,i_Email,*iter);

                emails.push_back(myEmail);
                    
            }
            i_Imap->Close(); // Close the connection to the server
            delete i_Imap;
        }
        catch(...)
        {
        }
    }

}

The GetEmails function calls EmailWhereClauseUnitVectorToEmailsIdList  which iterates through emailWhereClauseUnitVector vector and prepares imap search string. The GetEmail function after calling that function iterates the emailIdList and fetches the email and calls GetEmail function which fills the Email struct which is pushed to emails vector. 

C++
void EmailWhereClauseUnitVectorToEmailsIdList(vector<EmailWhereClauseUnit> &emailWhereClauseUnitVector,set<int> &emailIdList)
{
    vector<EmailWhereClauseUnit>::iterator iter=emailWhereClauseUnitVector.begin();
    for(;iter!=emailWhereClauseUnitVector.end();iter++)
    {
        string search="SEARCH ";
        if(!iter->from.empty())
        {
            search=search+"FROM ";
            search=search+iter->from;
            search=search+" ";
        }
        if(!iter->to.empty())
        {
            search=search+"TO ";
            search=search+iter->to;
            search=search+" ";
        }
        if(!iter->cc.empty())
        {
            search=search+"CC ";
            search=search+iter->cc;
            search=search+" ";
        }
        if(!iter->subject.empty())
        {
            search=search+"SUBJECT \"";
            search=search+iter->subject;
            search=search+"\" ";
        }
        if(!iter->body.empty())
        {
            search=search+"BODY ";
            search=search+iter->body;
            search=search+" ";
        }
        struct tm timeinfo={0};
        char buffer [80];
        iter->emailTimeStampStart=iter->emailTimeStampStart-60*60*24;
        _localtime64_s( &timeinfo, &iter->emailTimeStampStart );
        strftime (buffer,80,"%d-%b-%Y",&timeinfo);
        search=search+"SINCE ";
        search=search+buffer;
        iter->emailTimeStampEnd=iter->emailTimeStampEnd+60*60*24;
         _localtime64_s( &timeinfo, &iter->emailTimeStampEnd );
        strftime (buffer,80,"%d-%b-%Y",&timeinfo);
        search=search+" BEFORE ";
        search=search+buffer;
        char *response=CurlClientResponse(search);
        if(response&&_stricmp(response,"* SEARCH\r\n")!=0)
        {
            std::string s=response;
            std::istringstream iss(s);
            std::string sub;
            iss>>sub;
            iss>>sub;
            while(iss)
            {

                iss>>sub;
                if(!sub.empty())
                {
                    emailIdList.insert(atoi(sub.c_str())-1);
                }
            }
        }
    }
}

If the user issues select * from email query then  GetAllEmails function is called which fetches all the  emails in the inbox

C++
void GetAllEmails(vector<Email> &emails)
{
    cImap *i_Imap=InitImap();
    if(i_Imap!=NULL)
    {
        try
        {
            int s32_EmailCount = i_Imap->GetEmailCount();
            for (int M=s32_EmailCount-1; M>=0; M--)                
            {
                GuardPtr<cEmailParser> i_Email = i_Imap->FetchEmailAt(M);
                Email myEmail;
                GetEmail(myEmail,i_Email,M);
                emails.push_back(myEmail);
            }
            i_Imap->Close();
            delete i_Imap;
        }
        catch(...)
        {
        }
    }
}

 

The rnd_next is called by MySQL to feed the next row. If m_EmailVectorIndex is less than size of m_EmailVector which means there are rows to feed then call field -> store on uid, from, to, cc, subject, body and message date. After that increment m_EmailVectorIndex. If there are no more emails to feed then return HA_ERR_END_OF_FILE otherwise 0

C++
int ha_example::rnd_next(uchar *buf)
{
    int rc;
    DBUG_ENTER("ha_example::rnd_next");
    MYSQL_READ_ROW_START(table_share->db.str, table_share->table_name.str,
                    TRUE);
    if(m_EmailVectorIndex<m_EmailVector.size())
    {

        Field **field=table->field;

        bitmap_set_bit(table->write_set, (*field)->field_index);
        (*field)->set_notnull();
        (*field)->store(m_EmailVector[m_EmailVectorIndex].uid,true);
        field++;

        bitmap_set_bit(table->write_set, (*field)->field_index);
        (*field)->set_notnull();
        (*field)->store(m_EmailVector[m_EmailVectorIndex].from.c_str(),strlen(m_EmailVector[m_EmailVectorIndex].from.c_str()), system_charset_info);
        field++;

        bitmap_set_bit(table->write_set, (*field)->field_index);
        (*field)->set_notnull();
        (*field)->store(m_EmailVector[m_EmailVectorIndex].to.c_str(),strlen(m_EmailVector[m_EmailVectorIndex].to.c_str()), system_charset_info);
        field++;

        bitmap_set_bit(table->write_set, (*field)->field_index);
        (*field)->set_notnull();
        (*field)->store(m_EmailVector[m_EmailVectorIndex].cc.c_str(),strlen(m_EmailVector[m_EmailVectorIndex].cc.c_str()), system_charset_info);
        field++;

        bitmap_set_bit(table->write_set, (*field)->field_index);
        (*field)->set_notnull();
        (*field)->store(m_EmailVector[m_EmailVectorIndex].subject.c_str(),strlen(m_EmailVector[m_EmailVectorIndex].subject.c_str()), system_charset_info);
        field++;

        bitmap_set_bit(table->write_set, (*field)->field_index);
        (*field)->set_notnull();
        (*field)->store(m_EmailVector[m_EmailVectorIndex].body.c_str(),strlen(m_EmailVector[m_EmailVectorIndex].body.c_str()), system_charset_info);
        field++;

        bitmap_set_bit(table->write_set, (*field)->field_index);
        (*field)->set_notnull();

        MYSQL_TIME myTime={0};
        myTime.year=m_EmailVector[m_EmailVectorIndex].receiveTime.getYear();
        myTime.month=m_EmailVector[m_EmailVectorIndex].receiveTime.getMonth();
        myTime.day=m_EmailVector[m_EmailVectorIndex].receiveTime.getDay();
        myTime.hour=m_EmailVector[m_EmailVectorIndex].receiveTime.getHour();
        myTime.minute=m_EmailVector[m_EmailVectorIndex].receiveTime.getMinute();
        myTime.second=m_EmailVector[m_EmailVectorIndex].receiveTime.getSecond();

        (*field)->store_time(&myTime);
        field++;

        m_EmailVectorIndex++;
        rc=0;
    }
    else
    {
        rc= HA_ERR_END_OF_FILE;
    }
    MYSQL_READ_ROW_DONE(rc);
    DBUG_RETURN(rc);
}

How to Run

  • Create an account on Gmail in case you don't have one
  • Download MySQL 5.6.22 from the following link:
    • http://dev.mysql.com/downloads/mysql/
  • Make sure that you download the 32 bit version (mysql-5.6.22-win32.zip).
  • Copy EmailSearchEngine.dll from EmailSearchEngineBinaries.zip
  • Go to the directory (mysql-5.6.22-win32) where you extracted MYSQL.
  • Find subdirectory lib and open it.
  • Find and open the plugin directory inside.
  • Paste EmailSearchEngine.dll.
  • Copy credentials.txt from from EmailSearchEngineBinaries.zip. Find subdirectory data and paste it.
  • Open credentials.txt and replace first line with your email address and second line with your Gmail password.
  • Copy curllib.dll, libcurl.dll, libeay32.dll, libgsasl-7.dll, libidn-11.dll, librtmp.dll, libsasl.dll, libssh2.dll, openldap.dll, ssleay32.dll, vmime.NET.dll, zlib1.dll from EmailSearchEngineBinaries.zip
  • Find subdirectory bin and open it.
  • Paste copied dlls.
  • Launch mysqld.exe (run as administrator).
  • Launch a command prompt (run as administrator)
  • Change directory to mysql-5.6.22-win32/bin
  • Run the following command:
  • mysql –uroot
  • This will launch the MYSQL console
  • Run the following command.
  • Install plugin EmailSearchEngine soname ' EmailSearchEngine.dll';
  • This will install the EmailSearchEngine.
  • Create a database name test if not already exists.
  • Now create a table named Email and specify EmailSearchEngine as the storage engine.

 

C++
CREATE TABLE `email` (
  `UID` int(10) unsigned DEFAULT NULL,
  `from` varchar(1024) DEFAULT NULL,
  `to` varchar(1024) DEFAULT NULL,
  `cc` varchar(1024) DEFAULT NULL,
  `subject` varchar(1024) DEFAULT NULL,
  `body` LONGTEXT,
  `messagedate` datetime DEFAULT NULL
) ENGINE=EmailSearchEngine;

 

Before continuing if you are behind a firewall then allow mysqld.exe through it.

Enter the following query but change the subject to an email in your inbox:

C++
Select `uid`,`from`,`to`,`cc`,`subject` from email where messagedate>="2014-12-06 00:00:00" and messagedate<="2014-12-06 23:59:59" and subject='Google Account password changed';

Image 1

Following are some sample queries:

Select subject from email where messagedate>="2015-01-09 00:00:00" and messagedate<="2015-01-09 23:59:59" and subject like '%Important%';

Select count(*) from email where messagedate>="2015-01-09 00:00:00" and messagedate<="2015-01-09 23:59:59";
Now enter any query of your choice and see the results.

Where to Get the Source Code

  • Download MYSQL source code from the following link:
    • http://dev.mysql.com/downloads/mysql/
  • Extract the source code to C:/mysql-5.6.22

 

How to Build the Source Code

  • Please follow the instruction given at:
    • http://dev.mysql.com/doc/refman/5.6/en/installing-source-distribution.html
  • I build the source code using Visual Studio 2012 with the following instruction:
    • cmake . -G "Visual Studio 11"
  • Download the attachment named SourceCodeEmailSearchEngine.zip with this article.
  • It contains the source code of EmailSearchEngine in folder example.
  • Copy the example folder to mysql-5.6.22\storage

Conclusion

We have found a way to harness the power of mysql parser to do analysis on your emails which was previously never done. This opens up the horizon for new kinds of analysis This approach can be generalized. I will build on this approach in the coming parts of this article series.

References

[1] vmime.NET - Smtp, Pop3, Imap Library (for C++ and .NET)

http://www.codeproject.com/Articles/719490/vmime-NET-Smtp-Pop-Imap-Library-for-Cplusplus-and

[2] Building on My SQL Storage Engines and APIs to Conduct Complex Analysis Using SQL Query Language: Part 4

http://www.codeproject.com/Articles/796061/Building-on-My-SQL-Storage-Engines-and-APIs-to-C

[3] Building on My SQL Storage Engines and APIs to Conduct Complex Analysis Using SQL Query Language: Part 3

http://www.codeproject.com/Articles/768518/Building-on-My-SQL-Storage-Engines-and-APIs-to-C

[4] Building on My SQL Storage Engines and APIs to Conduct Complex Analysis Using SQL Query Language: Part 2

http://www.codeproject.com/Articles/732116/Building-on-My-SQL-Storage-Engines-and-APIs-to-Con

[5] Building on My SQL Storage Engines and API’s to conduct complex analysis using SQL query language

http://www.codeproject.com/Articles/715914/Building-on-My-SQL-Storage-Engines-and-API-s-to-co

 

 

 

 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Team Leader VF
Pakistan Pakistan
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
-- There are no messages in this forum --