Click here to Skip to main content
15,879,535 members
Articles / Security / Blockchain

Formal Verification of Smart Contracts with the K Framework

,
Rate me:
Please Sign up or sign in to vote.
0.00/5 (No votes)
15 Mar 2019CPOL13 min read 4.9K  
We analyze the pros and cons of formal verification.

Introduction

Currently, blockchain technology is mostly used for handling investments and significant amounts of money. But vulnerabilities in smart contracts can allow hackers to steal tokens. Recently, this happened to the biggest EOS casinos.

Performing a security audit with reliable tools is the best way to prove the indestructibility of your system. Formal verification is used when you need an ironclad guarantee that your smart contracts are protected.

In this article, we analyze the pros and cons of formal verification. To do that, we create a simple ICO smart contract on the Ethereum platform, write high-level and EVM-level specifications, and verify this smart contract with the K Framework.

Contents

What is Formal Verification?

Formal verification is a security audit approach that’s used to prove the correctness of underlying algorithms with respect to a certain formal specification. In order to perform formal verification, you’ll need tools from a framework and a formal specification. Using mathematical methods, formal verification proves that the final program behaves exactly as described in its specification.

Formal verification is used in fields where errors can be quite significant: when producing aircraft components, programming medical or military equipment, and so on. In order to prove the correctness of their hardware, developers turn to formal verification, as it eliminates human error. You can also improve smart contract security with formal verification.

While most blockchain systems aren’t as critical as medical equipment, errors in blockchain systems may cost millions of dollars. This makes formal smart contract verification services a great idea for auditing blockchain systems.

In the case of smart contract verification, there are several projects aimed at creating formally specified execution environments (virtual machines) for various networks. Here are some of them:

Two Levels of Formal Specification

Defining the requirements for a smart contract’s formal specification is the longest and most important step in conducting formal verification. Some of the requirements can be borrowed from any existing informal specification (e.g., a white paper). But unfortunately, informal specifications may omit some edge cases. For example, the white paper may simply say, “each investor receives 5% monthly dividends” without specifying how these dividends are distributed. But if you plan to apply formal verification to your smart contract, it’s considered a good practice to start by creating a formal specification.

This may require several meetings with smart contract developers in order to discuss the intended behavior of the contract. Every aspect of the smart contract must be thoroughly described. After that, you can move on to creating a specification.

There are two types of formal specifications:

  • general, language-level specifications
  • refined, bytecode-level specifications

First, the high-level business logic of the smart contract has to be formalized, based on the informal specification and requirements. The informal specification has to describe the intended functionality precisely and comprehensively. Then the smart contract developers should confirm this specification.

Afterward, the high-level specification has to be refined into a low-level specification, or bytecode-level specification, that captures the smart contract behavior at the execution level (in the context of the virtual machine). This bytecode-level specification is necessary since the code written by developers is usually not the code that’s actually going to run when deployed on the blockchain. Compilers and other external factors may influence the final smart contract. We have to ensure that only what was specified in the high-level specification will happen even at the lowest level. During the actual verification, only a bytecode-level formal specification will be used.

In the end, the only thing left to do is to verify that the smart contract meets the specification. For that, we can use one of the various execution environments. In our case, we’ll use the K Framework’s correct by construction deductive program verifier.

A formal specification has two levels: language-level and bytecode-level.

Pros and Cons of Formal Verification

Formal verification of smart contracts isn’t easy. There is a steep learning curve and a high entry threshold. A developer needs to undergo special training to be able to formalize requirements using formal verification tools.

However, formal verification comes with many benefits:

  • It’s comprehensive. Most formal verification tools use human-readable (or close to human-readable) languages for specifications. So the results of the audit look more like documentation than a security report.
  • It’s unambiguous. A manual audit may have a human error or some degree of uncertainty. A formal verification will give a definite answer to a simple question: does this program behave as described in the specification?
  • It’s complete. Since verification is automatic, it will be performed diligently and thoroughly. A formal verification audit will cover the program in full, with no missed edge cases.
  • It doesn’t rely on compilers. Unlike most manual auditors, a formal verification framework can easily use the compiled binary as input. This allows formal verification to catch bugs that were introduced during compilation or other transformations of the source code.
  • It’s a language-independent. You can easily verify smart contracts written in Solidity, Viper, or other languages that may appear in the future.
  • A formal specification can work as documentation for the contract itself.

On the other hand, formal verification has a few drawbacks:

  • It requires a specially prepared Ethereum execution environment, which may be unavailable for your specific network.
  • A formal specification of the contract itself is required. Creating a formal specification is a long and difficult process. You can’t speed this up or skip defining and discussing the requirements for the specification.
  • Verification will only detect inconsistencies between the formal specification and the implementation. If there are any errors in the specification, they will be left undetected. So both specifications should be prepared extremely carefully.
  • Formal verification demands a lot of preparation from your development team.
  • The price of a mistake in a formal specification is high. During verification, every false requirement will be perceived as correct. Therefore, the verification won’t detect a problem with the smart contract.

Let’s compare the formal verification approach with other, more conventional types of security verification and testing.

Formal Verification vs Unit Testing

Unit testing is usually cheaper than any other type of audit, as it’s performed when developing a smart contract. Proper unit tests should reflect the specification and cover the smart contract with the help of use cases and functionality the specification describes. However, unit tests are just as imperfect as informal specifications. Even if the smart contract code is fully covered with unit tests, the developer may miss some edge cases that could lead to bugs or even security vulnerabilities like overflows or unprotected functions.

Creating a formal specification during development will be more labor-intensive than creating unit tests that fully cover a smart contract. However, a formal specification has a lot more benefits than ordinary unit tests.

Formal Verification vs Audits and Bug Bounties

Performing smart contract audits is common practice for developers. Independent testing teams conduct manual audits and penetration testing after creating smart contacts. An experienced developer studies the informal specification and checks the code of the smart contract to see if there are any known vulnerabilities or deviations from the intended behavior. Furthermore, the auditor can analyze GAS usage, code quality, and unit test coverage.

A manual audit is a more suitable option if the smart contract already exists and needs a security checkup. Creating a formal specification for verification at this point would take too much time and delay deployment.

A bug bounty is similar to a manual audit. However, in this case, the developers don’t explicitly employ anyone to perform an audit. Instead, they offer a reward to anyone from the community who finds and reports a bug or a security vulnerability.

Formal verification doesn’t require any third-party auditors or a bug-seeking community. Even smart contract developers can create a specification and verify their contract. Such “self-verification” is trustworthy since mathematical proofs generated during verification are completely unbiased. Also, a formal specification is readable by humans just like an audit report, so smart contract users can easily check if they’re being fooled.

Formal Verification vs Static Analysis Tools

Static analysis tools are used to achieve the same goal as formal verification. During static analysis, a dedicated program (the static analyzer) scans the smart contract or its bytecode in order to understand its behavior and check for unexpected cases such as overflows and reentrancy vulnerabilities.

Moreover, static analyzers and formal verification frameworks may use the same mathematical “backend” in order to prove the absence of errors. For example, both the Mythril static analyzer and the K Framework use the z3 solver library for proofs.

However, static analyzers are limited in the range of vulnerabilities they can detect. In a sense, a static analyzer attempts to perform the same formal verification with a one-fits-all specification that simply states, a smart contract shouldn’t have any known vulnerabilities. Obviously, a custom specification is much better, as it will detect each and every possible vulnerability within a given smart contract.

Formal specifications for the K framework can be written in C, Java, JavaScript, PHP, Python, Rust.

Let’s see what tools are useful for verifying a smart contract.

K Framework for Creating and Verifying a Smart Contract

The K Framework is one of the most robust and powerful language definition frameworks. It allows you to define your own programming language and provides you with a set of tools for that language, including both an executable model and a program verifier.

Many developers disregard formal semantics, as it can be difficult to implement, understand, and even use. Interpreters and compilers are usually written without guarantees of correctness, which leaves a lot of room for unexpected behavior. An ideal language framework will not only define exactly what it means for a program to behave correctly but will allow you to create tools that are correct by construction, such as parsers, compilers, interpreters, and debuggers.

The K Framework provides a user-friendly, modular, and mathematically rigorous meta-language for defining programming languages, type systems, and analysis tools. K includes formal specifications for C, Java, JavaScript, PHP, Python, and Rust. Additionally, the K Framework enables verification of smart contracts.

KEVM as a Specific Application of the K Framework

The K semantics of the Ethereum Virtual Machine (KEVM) is one application of the K Framework on Ethereum. It specifies the full semantics of the Ethereum Virtual Machine (EVM), adding all of the benefits of the formal specification to Ethereum smart contracts. The KEVM provides the first machine-executable, mathematically formal, human readable and complete semantics for the EVM.

The specification includes three main components:

  1. The syntax of the language
  2. A description of the state/configuration
  3. The transition rules that drive the execution of programs

The KEVM implements both the stack-based execution environment, with all of the EVM’s opcodes, as well as the network’s state, gas simulation, and even high-level aspects such as ABI call data.

This allows for simplified creation of formal specifications for smart contracts. It provides auditors with all the tools necessary to verify a smart contract. And the KEVM isn’t limited to any specific programming language.

Applying Formal Verification to a Smart Contract

In order to test formal verification on the blockchain, we’ll create a simple smart contract, define its specifications, and verify it. It’s best to use an existing framework in order to study formal verification. The K Framework and the KEVM implementation of the Ethereum virtual machine are the most popular tools for this task. We’ll use KEVM in all of our examples.

Creating a Smart Contract

Let’s start with a look at our smart contract:

JavaScript
pragma solidity 0.4.25;
contract ICO {
    mapping (address => uint256) private _balances;
    uint256 private _totalSupply;
    uint256 private _decimals = 18;
    uint256 private _hardCap = 200 * (10 ** _decimals);
    uint256 private _rate = 2;
      
    function balanceOf(address owner) public view returns (uint256) {
        return _balances[owner];
    }
      
    function buy() public payable {
        require(msg.value != 0);
        require(_hardCap <= _totalSupply + msg.value * _rate);
        _balances[msg.sender] += msg.value * _rate;
        _totalSupply += msg.value * _rate;
    }
}

Buy – receives Ether and distributes tokensThis is a basic ICO, and a part of an ERC-20 token. It only has two callable functions:

  1. BalanceOf – returns the balance of any given account

Before creating a formal specification, we need to figure out the full requirements for the smart contract. In a real-world scenario, there would be a series of discussions with the creators of the smart contract regarding its functionality. Since we’re the creators, in this case, we can design our own requirements:

  • The ICO will collect 100 Ether at most, and distribute 200 tokens.
  • There’s no time limit for the ICO.
  • The emission rate is 2 tokens per Ether (e.g. for 0.1 Ether you will receive 0.2 tokens).

This is really as simple as an ICO can get. Now we can start defining the specification for this contract.

Defining a High-Level Specification

Here’s the high-level specification for our smart contract:

JavaScript
module ICO
    syntax Value   ::= Int  // this can be changed
    syntax Address ::= Int  // this can be changed
    // defining the "syntax" of the contract
    syntax AExp ::= Value | Address
                | "buy" "(" ")"
                | "balanceOf" "(" AExp ")"                       [strict]
                | "throw"
    // configuring cells for the contract's storage
    configuration <ico>
                  <caller> 0 </caller>
                  <value> 0 </value>
                  <k> $PGM:K </k>
                  <accounts>
                    <account multiplicity="*">
                      <id> 0 </id>
                      <balance> 0 </balance>
                    </account>
                  </accounts>
                  <supply> 0 </supply>
                  <hardcap> Int 200 * (Int 10 ^Int 18) </hardcap>
                </ico>
    // balanceOf returns the balance of a given account
    rule <k> balanceOf(Id) => Value ...</k>
       <id> Id </id>
       <balance> Value </balance>
    // The buy function receives ether and distributes tokens appropriately.
    // Requirements:
    // - the buy function has to receive ether, so the value cannot be zero
    // - once the hard cap has been reached, no more tokens can be sold
    rule <k> buy() => .</k>
       <caller> From </caller>
       <value> MsgValue </value>
       <hardcap> HardCap </hardcap>
       <supply> Total => Total +Int MsgValue *Int 2</supply>
       <account>
         <id> From </id>
         <balance> BalanceFrom => BalanceFrom +Int MsgValue *Int 2 </balance>
       </account>
     requires MsgValue >=Int 0
       andBool Total +Int MsgValue *Int 2 <=Int HardCap
endmodule

As you can see, this specification is very similar to the original smart contract. You can see definitions for both functions as well as rules of what these function should do in each case.

This specification is relatively easy to understand, so this version is a good representation of an informal specification. In fact, there are tools for converting formal specifications with comments into a more user-friendly format – and vice versa. So instead of writing an informal white paper, you can provide a human-readable formal specification.

Defining the EVM Specification

The EVM-level specification is quite complex, but it has some definitions that are similar to other specifications. So in order to make it easier to define these specifications, KEVM developers prepared a convenient script. All you have to do is prepare an initialization file for it with all the necessary requirements for each function as well as the necessary parameters for proofs.

Here’s an initialization file for our sample smart contract:

JavaScript
// Defining EVM-level specification for the balanceOf function
[balanceOf]
k: #execute => #halt                                               //execution must stop 
                                                                   //eventually
statusCode: _ => EVMC_SUCCESS                                      //execution status 
                                                                   //must be success
output: _ => #asByteStackInWidth(BAL, 32)                          //the function returns 
                                                                   //the "balance" value
callData: #abiCallData("balanceOf", #address(OWNER))               //define ABI call data 
                                                                   //for the function (i.e. the 
                                                                   //signature balanceOf(address))
gas: {GASCAP} => _                                                 //specify GAS usage 
                                                                   //(in this case we don't 
                                                                   //really care about GAS, 
                                                                   //since the function is 
                                                                   //so simple)
refund: _                                                          //refunded GAS
storage:                                                           //specify changes to storage
    #hashedLocation({COMPILER}, {_BALANCES}, OWNER) |-> BAL        //loads balance of owner 
                                                                   //balances[owner] into BAL
requires:                                                          //specify requirements 
                                                                   //in this case - simple 
                                                                   //sanity checks for parameters
    andBool 0 <=Int OWNER     andBool OWNER     <Int (2 ^Int 160)  //owner address is a 
                                                                   //160 bit unsigned integer
    andBool 0 <=Int BAL       andBool BAL       <Int (2 ^Int 256)  //balance is a 256 bit 
                                                                   //unsigned integer
  
// Defining EVM-level specification for the buy function
[buy]
k: #execute => #halt                                          //execution must stop eventually
statusCode: _ => EVMC_SUCCESS                                 //execution status must be success
output: _ => _                                                //no return value
callData: #abiCallData("buy", .TypedArgs)                     //define ABI call data for 
                                                              //the function (i.e. the 
                                                              //signature buy())
gas: {GASCAP} => _                                            //specify GAS usage 
                                                              //(in this case,
                                                              //we don't really care 
                                                              //about GAS, 
                                                              //since the function 
                                                              //is so simple)
log: _                                                        //no logs
refund: _                                                     //refunded GAS unspecified
storage:                                                      //changes to storage
    #hashedLocation({COMPILER}, {_BALANCES}, CALLER_ID) |-> 
        (BAL_FROM => BAL_FROM +Int MSGVALUE *Int 2)     //balance of caller is increased 
                                                        //by MSGVALUE
    #hashedLocation({COMPILER}, {_SUPPLY}, _SUPPLY)     |-> 
        (_SUPPLY => _SUPPLY +Int MSGVALUE *Int 2)       //total supply is also increased
requires:                                               //specify requirements
    andBool MSGVALUE >=Int 0                            //value is not zero 
                                                        //(user actually has to pay Ether 
                                                        //to buy tokens)
    andBool _SUPPLY +Int MSGVALUE *Int 2 <=Int _HARDCAP //hardcap is not reached
   
// Program definition. Specifies the bytecode that will be tested. 
// "code" is the compiled smart contract.
[pgm]
compiler: "Solidity"
code: "60806040526012600255600254600a0a60c80260035560026004553
4801561002657600080fd5b506101b5806100366000396000f30060806040526004
361061004c576000357c01000000000000000000000000000000000000000000000
00000000000900463ffffffff16806370a0823114610051578063a6f2ae3a146100
a8575b600080fd5b34801561005d57600080fd5b506100926004803603810190808
03573ffffffffffffffffffffffffffffffffffffffff1690602001909291905050
506100b2565b6040518082815260200191505060405180910390f35b6100b06100f
a565b005b60008060008373ffffffffffffffffffffffffffffffffffffffff1673
ffffffffffffffffffffffffffffffffffffffff168152602001908152602001600
020549050919050565b6000341415151561010a57600080fd5b6004543402600154
016003541115151561012357600080fd5b60045434026000803373fffffffffffff
fffffffffffffffffffffffffff1673ffffffffffffffffffffffffffffffffffff
ffff168152602001908152602001600020600082825401925050819055506004543
4026001600082825401925050819055505600a165627a7a7230582014c3fff87873
fb01861ca509f2330c4adaa1347c8e6f4690a385850aabce07300029"
gasCap: 100000

As you can see, the EVM-level specification has similar requirements to the high-level specification but in a more machine-friendly format. You can find more examples of verified smart contracts and details about specification syntax here.

In order to produce a specification from it, we can simply call the gen-spec script from the KEVM repository like this:

> python3 gen-spec.py spec-tmpl.k ico-spec.ini ico-spec ico-spec > ico-spec.k

Verifying the Smart Contract

Verifying the smart contract is the easiest part. Verification comes down to executing a single command:

> kevm prove ./ico-spec.k

After the proof executes, the framework should simply display:

> True

If you see this formal proof for your smart contract, this means your document has been verified correctly.

In order to check that the verification process has executed correctly, let’s add a bug in our code. We can introduce the bug by editing the buy function:

JavaScript
function buy() public payable {
    require(msg.value != 0);
    require(_hardCap >= _totalSupply + msg.value * _rate);
    if (msg.value == 42)
    {
        return;
    }
    _balances[msg.sender] += msg.value * _rate;
    _totalSupply += msg.value * _rate;
}

If we run the verification now, we’ll get a generic indication of an error. Although the bug is obvious to the naked eye, it would be extremely difficult to catch using conventional unit tests or penetration testing techniques if you didn’t have access to the source code.

Conclusion

Formal smart contract verification is a reliable way to ensure the security of your virtual agreements. It has a few advantages over traditional types of security audits and testing. But it must be carried out by a high-caliber developer who can formalize a security contract as a mathematical high-level specification and refine it for a targeted low-level virtual machine. This procedure is time- and resource-consuming.

History

  • 15th March, 2019: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Chief Technology Officer Apriorit Inc.
United States United States
ApriorIT is a software research and development company specializing in cybersecurity and data management technology engineering. We work for a broad range of clients from Fortune 500 technology leaders to small innovative startups building unique solutions.

As Apriorit offers integrated research&development services for the software projects in such areas as endpoint security, network security, data security, embedded Systems, and virtualization, we have strong kernel and driver development skills, huge system programming expertise, and are reals fans of research projects.

Our specialty is reverse engineering, we apply it for security testing and security-related projects.

A separate department of Apriorit works on large-scale business SaaS solutions, handling tasks from business analysis, data architecture design, and web development to performance optimization and DevOps.

Official site: https://www.apriorit.com
Clutch profile: https://clutch.co/profile/apriorit
This is a Organisation

33 members

Written By
Apriorit
Ukraine Ukraine
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
-- There are no messages in this forum --