When I joined my company as a newcomer, I explored the unit test suite of the product code. It uses the gtest framework, but when I checked the tests, they were all exercising whole functionality by calling real functions and asserting on the expected output. Here is one such test case as an example:
TEST(nle_26, UriExt1)
{
    int threadid = 1;
    std::shared_ptr<LSEng> e =
        std::make_shared<aseng::LSEng>(threadid, "./daemon.conf");
    std::shared_ptr<LSAttrib> attr = e->initDefaultLSAttrib();
    e->setLSAttrib(attr);
    std::shared_ptr<DBOwner> ndb = e->initDatabase(datafile, e->getLogger());
    e->loadASData(ndb);
    e->setVerbose();

    std::shared_ptr<NewMessage> m = std::make_shared<NewMessage>(e->getLogger());
    ASSERT_TRUE(m != nullptr);
    ASSERT_TRUE(e != nullptr);

    m->readFromFile("../../msgs/nle1-26-s1");
    e->scanMsg(m, &scan_callBack_26, NULL);

    std::map<std::string, std::vector<std::string>> Parts =
        e->verboseInfo.eventParts;
    std::vector<std::string> uris = Parts["prt.uri"];
    ASSERT_EQ(uris.size(), 2);
    ASSERT_EQ(uris[0], "mailto:www.us_megalotoliveclaim@hotmail.com");
    ASSERT_EQ(uris[1], "hotmail.com");
}
I found that all the tests in the unit test directory follow the same pattern:
- Creating and initialising the actual objects
- Calling the actual functions
- Starting the actual daemon
- Loading an actual database of around 45 MB
- Sending an actual mail to the daemon for parsing by calling the real scanMsg function, etc.
So all the tests look more like functional tests than unit tests. But the critical part is that, on the official intranet site, the code coverage of this product is reported as 73%, computed using gcov.
Now, a coverage tool like gcov computes its numbers from data such as:
- How often each line of code executes
- What lines of code are actually executed
- How much computing time each section of code uses
Since these tests run an actual daemon, load a real database, and call the actual functions to scan the message, real code does execute and all three of the above measurements will capture something, so I doubt the reported coverage is completely zero.
But these questions are bothering me:
1. Black-box testing also exercises functionality just like this: testers are unaware of the internal code and write test cases against the requirements. How are the tests above any different from that? And can gcov-generated coverage on such a test suite be trusted, or could it be misleading or even zero?
2. Apparently, the gcov coverage data is based on a test suite made up entirely of these incorrect "unit" tests. Does that mean the actual code coverage may even be zero?
In a unit test we normally mock collaborators with a gmock-like framework instead of making real calls; the purpose of a unit test is to test the code itself, one smallest unit at a time. But with tests like the above, which are really functional tests, can gcov generate reliable coverage data?
This has been bothering me for the last two days, so I wanted to run it by experts.
Awaiting wonderful insights. :)
Thanks.
What I have tried:
A lot of analysis of Internet articles, but I haven't found a similar case on Stack Overflow or GitHub.