This is the first in a series about analyzing Algorand smart contracts on-chain, without their TEAL or other human-readable source code.

Algo Explorer Disassembly

You’re about to lay down some hard-earned crypto coins for sweet yield, but before you do that, you want to Do-Your-Own-Research (DYOR). You’ve read the whitepaper, you’ve read the audit report, but you are still not quite sure. Fortunately, because this is a blockchain application, you can easily dig in yourself. This is one of the great promises of transparency in cryptocurrency and blockchain technology.

Smart contracts are stored on the Algorand blockchain as compiled TEAL bytecode. Algo Explorer, an “Algorand Blockchain Explorer”, lets you view a disassembly of that bytecode.

Here is an example:

[Figure: Decompiled app]

This is a very convenient feature. However, it is missing some key data: the constants stored with the compiled TEAL code in the intcblock and the bytecblock at the start of the serialized blob. It is a little confusing that these are not included in the disassembly, since they are very straightforward to decode and are essential for understanding what a smart contract actually does.

Missing Constants

Take this section for example:


txna ApplicationArgs 0
bytec 10 
==
bnz label6
txna ApplicationArgs 0
bytec 11
==
bnz label7

This important branch logic compares an app call argument to... stuff... and then branches off to do various other things. What is that stuff? What is bytec 10? It can be retrieved, but the Algo Explorer disassembly does not do it for you.

In order to find out, you will need to decode the constants yourself. Let’s walk through how one might do this.

A TEAL Application: The First Few Bytes

The very first data in a TEAL program is the version number encoded as a variable length unsigned integer. In our example, TEALv3 is used. The constants immediately follow.

The structure looks like:

version
intcblock
    len(ints)
    ints[0]
    ints[1]
    ...
bytecblock
    len(bytes)
    len(bytes[0]), bytes[0]
    len(bytes[1]), bytes[1]
    ...

You can see this as bytes:

[Figure: Some bytes from a compiled TEAL program]

Variable Length Integers

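In TEAL programs, integers are unsigned 64-bit values, and in the bytecode they are serialized with a variable-length encoding, so small values take fewer bytes. Each byte carries 7 bits of the value, least significant group first, and a set high bit means another byte follows. The scheme is similar to that used for variable length integer types in Protobuf and to CompactSize in Bitcoin. For Algorand, it is documented here.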
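To make the encoding concrete, here is a minimal sketch of the encoding direction in Python. The function name `encode_uvarint` is made up for this illustration and is not part of any Algorand SDK; it only assumes the 7-bits-per-byte layout described above.

```python
def encode_uvarint(value: int) -> bytes:
    """Encode a non-negative integer as a little-endian base-128 varuint."""
    if value < 0:
        raise ValueError("varuints are unsigned")
    out = bytearray()
    while True:
        low7 = value & 0x7F       # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: final byte
            return bytes(out)


# Small values fit in one byte; larger ones grow one byte per 7 bits.
for n in (0, 5, 127, 128, 86400):
    print(n, encode_uvarint(n).hex())
```

Values below 128 occupy a single byte, which is why most constants in a typical contract look like plain bytes in a hex dump.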

Integer Blocks

Constant integers are serialized after the intcblock opcode (0x20), which is followed by a varuint giving the number of constant integers. The integers then follow, each encoded with the variable-length scheme.
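As a rough sketch of that layout, the following Python reads an intcblock from a byte string. The helper names (`read_uvarint`, `read_intcblock`) and the sample bytes are assumptions made for illustration; the sample is a shortened, made-up program prefix rather than the real contract.

```python
def read_uvarint(data: bytes, pos: int):
    """Decode one varuint starting at pos; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        b = data[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        if b < 0x80:              # high bit clear: this was the last byte
            return value, pos
        shift += 7


def read_intcblock(data: bytes, pos: int):
    """Read an intcblock at pos; return (list_of_ints, next_pos)."""
    if data[pos] != 0x20:
        raise ValueError("expected intcblock opcode 0x20")
    count, pos = read_uvarint(data, pos + 1)
    ints = []
    for _ in range(count):
        value, pos = read_uvarint(data, pos)
        ints.append(value)
    return ints, pos


# Made-up prefix: version 3, then an intcblock holding 5 constants.
sample = bytes([0x03, 0x20, 0x05, 0x00, 0x01, 0x05, 0x04, 0x80, 0xA3, 0x05])
ints, next_pos = read_intcblock(sample, 1)  # skip the 1-byte version
print(ints, next_pos)
```

Returning the next position alongside the values makes it easy to continue parsing with whatever opcode follows the block.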

Byte Blocks

TEAL stores constants of arbitrary bytes after the bytecblock opcode (0x26). This opcode is followed by the number of constant byte blocks, then the data. Each block of bytes is prefixed with a varuint that gives the number of bytes in the block.

Extracting and Decoding Constants

To get at the constants, we first need the contract itself as it was written to the blockchain.

Step 1: Obtain the app in its entirety from the blockchain. For this example walk-through, you can just use the base64 representation from Algo Explorer:

[Figure: Smart contract serialized to base64]

Then, you can use the base64 utility to decode to binary:

[Figure: Base 64 decode]
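If you would rather stay in Python than use the command-line utility, the standard library can do the same decoding. This is only a sketch: `decode_app_b64` is a hypothetical helper name, and the base64 string below is a made-up 4-byte placeholder, not the contract from this article.

```python
import base64


def decode_app_b64(text: str) -> bytes:
    """Decode a base64 approval program copied from a web page."""
    # Copy-paste often introduces spaces or line breaks; drop them first.
    compact = "".join(text.split())
    return base64.b64decode(compact, validate=True)


sample_b64 = "AyABAA=="  # made-up placeholder, not a real contract
program = decode_app_b64(sample_b64)
print(program.hex())
```

The resulting bytes can be written to a file opened in "wb" mode if you want to inspect them with a hex editor afterwards.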

As mentioned previously, the first value will indicate the version. At TEALv3, it is only a single byte:

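fname = "app.bin"  # hypothetical path to the base64-decoded program
with open(fname, "rb") as f:
    ver = f.read(1)[0]  # the version is a varuint; small versions fit in one byte
    if ver == 3:
        print("TEAL3")
    elif ver == 5:
        print("TEAL5")
    else:
        print("TEAL:", hex(ver))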

The next opcode should specify constants, followed by the number of constants, and then the data. In this case, the constants begin with 0x20, the opcode intcblock:

[Figure: Extracting ints]

The next byte, 0x09, is a single-byte varuint with value 9, indicating that there are 9 constant varuints to be read. The first few of these are single-byte varuints:


intc_0 = 0x00
intc_1 = 0x01
intc_2 = 0x05
intc_3 = 0x04

In this smart contract we do not see a varuint longer than 1 byte until constant 5, which begins with byte 0x80:

[Figure: Larger varuint constant]

This value takes 3 bytes to represent. Internally, the official implementation of the TEAL VM uses Golang’s binary.Uvarint() from encoding/binary. To decode this, we can do the same thing:

[Figure: Small program that decodes uvarint the same way Algorand VM does]

Thus the value of integer constant 5, or intc_4 in the disassembly, is 86400 when decoded from the 3 bytes 0x80, 0xa3, and 0x05. In this application, the constant stores the number of seconds in 24 hours.
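For readers following along in Python rather than Go, here is a sketch that loosely mirrors the return shape of binary.Uvarint (a value plus the number of bytes consumed). The function name `uvarint` is chosen for this illustration, and the overflow handling of the Go version is deliberately left out.

```python
def uvarint(buf: bytes):
    """Return (value, bytes_read) for the varuint at the start of buf."""
    value, shift = 0, 0
    for i, b in enumerate(buf):
        if b < 0x80:
            # High bit clear: this byte holds the most significant group.
            return value | (b << shift), i + 1
        value |= (b & 0x7F) << shift  # keep the low 7 bits, drop the flag
        shift += 7
    return 0, 0  # buffer ended before a final byte was seen


# The three bytes discussed above:
#   0x80 -> 0          (continuation bit set)
#   0xa3 -> 35 << 7    = 4480
#   0x05 -> 5 << 14    = 81920
print(uvarint(bytes([0x80, 0xA3, 0x05])))
```

Adding the three partial values gives 0 + 4480 + 81920 = 86400, matching the constant above.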

Decoding Byte Blocks

Byte blocks are constant arrays of bytes. They are very often used to store string-like things, such as labels, or for storing addresses. These are a little simpler to extract.

The byte block constants begin with the opcode bytecblock (0x26), followed by a uvarint storing the number of byte blocks that follow. Each byte block appears next, prefixed with a uvarint length value.

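def read_uvarint(f):
    # Decode one varuint: 7 data bits per byte, least significant group first.
    value = 0
    shift = 0
    read_more = True
    while read_more:
        next_byte = f.read(1)[0]
        if (next_byte >> 7) != 1:
            read_more = False  # a clear high bit marks the final byte
        remaining_bits = next_byte & ((1 << 7) - 1)
        value |= remaining_bits << shift
        shift += 7
    return value

def readbytecblock(f):
    # Assumes f is positioned just after the bytecblock opcode (0x26).
    count = read_uvarint(f)
    # Each constant is a varuint length followed by that many raw bytes.
    return [f.read(read_uvarint(f)) for _ in range(count)]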

The byte blocks themselves are often ASCII but do not always need to be. In the case of our application they are, so we can just print them:

[Figure: ASCII string constants]
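Since other contracts may mix text labels with raw binary such as addresses, a slightly more defensive way to print constants is to fall back to hex when a block is not printable ASCII. The sketch below uses a hypothetical helper, `show_constant`, and a made-up list of constants.

```python
def show_constant(raw: bytes) -> str:
    """Render a byte constant as a quoted string if printable, else as hex."""
    if raw and all(0x20 <= b < 0x7F for b in raw):
        return '"' + raw.decode("ascii") + '"'
    return "0x" + raw.hex()


# Made-up constants for illustration: two labels and one binary blob.
constants = [b"D", b"W", bytes.fromhex("00ff10")]
for i, c in enumerate(constants):
    print(f"bytec {i} = {show_constant(c)}")
```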

Now we have successfully extracted the constants that were missing from the Algo Explorer disassembly. Going back to the earlier mystery code referencing bytec 10 and bytec 11, we can fully resolve what was intended:


txna ApplicationArgs 0
bytec 10                 ("D" == Deposit?)
==
bnz label6
txna ApplicationArgs 0
bytec 11                  ("W" == Withdraw?)
==
bnz label7
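Annotating by hand gets tedious for a large contract, so one option is to let a small script append the decoded constant to every bytec line. This is a sketch only: the `annotate` helper is hypothetical, the list of constants is made up apart from the "D" and "W" values seen above, and it assumes the disassembly writes the opcode as `bytec N` or `bytec_N`.

```python
import re


def annotate(disassembly: str, bytecs: list) -> str:
    """Append the decoded constant as a comment to each bytec line."""
    out = []
    for line in disassembly.splitlines():
        m = re.match(r"\s*bytec[ _](\d+)\s*$", line)
        if m and int(m.group(1)) < len(bytecs):
            line = f"{line.rstrip()}  // {bytecs[int(m.group(1))]!r}"
        out.append(line)
    return "\n".join(out)


# Placeholder constants 0..9, then the two values from this walk-through.
bytecs = [b"?"] * 10 + [b"D", b"W"]
listing = "txna ApplicationArgs 0\nbytec 10\n==\nbnz label6"
print(annotate(listing, bytecs))
```

Lines that are not bytec opcodes, or that reference an index beyond the decoded list, pass through unchanged.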

I will update this blog post later with some useful code for doing this.

Comments? Contact at: corewar @ gmx.com or @corewarcrypto on twitter.