The ability to lock and unlock coins is the mechanism by which we transfer Bitcoin. Locking is giving some Bitcoins to some entity. Unlocking is spending some Bitcoins that you have received.
In this chapter we examine this locking/unlocking mechanism, which is often called a smart contract. Elliptic Curve Cryptography ([chapter_elliptic_curve_cryptography]) is used to Script to validate that the transaction was properly authorized ([chapter_tx_parsing]). Script essentially allows people to be able to prove that they have the right to spend certain UTXOs. We’re getting a little ahead of ourselves, though, so let’s start with how Script works and go from there.
If you are confused about what a "smart contract" is, don’t worry. "Smart contract" is a fancy way of saying "programmable" and the "smart contract language" is simply a programming language. In Bitcoin, Script is the smart contract language, or the programming language used to express the conditions under which bitcoins are spendable.
Bitcoin has the digital equivalent of a contract in Script. Script is a stack-based language similar to Forth. Script is intentionally limited in the sense that Script avoids certain features. Specifically, Script avoids any mechanism for loops and is therefore not Turing complete.
Note
|
Why Bitcoin isn’t Turing Complete
Turing completeness in a programming language essentially means that the program has the ability to loop. Loops are a useful construct in programming, so you may be wondering at this point why Script doesn’t have the ability to loop. There are a lot of reasons for this, but let’s start with program execution. Anyone can create a Script program that every full node on the network executes. If Script were Turing Complete, it would be possible for the loop to go on executing forever. This would validating node to enter and never leave that loop. This would be an easy way to attack the network through what would be called a Denial of Service attack (DoS). A single Script that has an infinite loop could take down Bitcoin! This would be a large systematic vulnerability and protecting against this vulnerability is one of the major reasons why Turing-completeness is avoided. Ethereum, which has Turing Completeness in its smart contract language, Solidity, handles this problem by forcing contracts to pay for program execution with something called "gas". An infinite loop will exhaust whatever gas is in the contract as by definition, it will run an infinite number of times. There are other reasons to avoid Turing Completeness and that’s because smart contracts with Turing Completeness are very difficult to analyze. A Turing Complete smart contract’s execution conditions are very difficult to enumerate and thus it’s easy to create unintended behavior, causing bugs. Bugs in a smart contract mean that it’s vulnerable to being unintentionally spent, which means the contract participants could lose money. Such bugs are not just theoretical: this was the major problem in the DAO (Decentralized Autonomous Organization), a Turing Complete smart contract which ended with the Ethereum Classic hard fork. |
Transactions assign Bitcoins to a locking Script. The locking Script is what’s specified in ScriptPubKey (see [chapter_tx_parsing]). Think of this as the lockbox where some money is deposited which only a particular key can open. The money inside, of course, can only be accessed by the owner who has the key.
The unlocking of the lockbox is done in the ScriptSig field (see [chapter_tx_parsing]) and proves ownership of the locked box which authorizes spending of the funds.
Script is a programming language and like most programming languages, processes one instruction at a time. The instructions operate on a stack of elements. There are two possible types of instructions: elements and operations.
Elements are data. Technically, they’re a push operation onto the stack of that element. They are byte strings of length 1 to 520. A typical element might be a DER signature or a SEC pubkey (Figure 6-1).
Operations do something to the data (Figure 6-2). They consume zero or more elements from the processing stack and push zero or more elements back on the stack.
A typical operation is OP_DUP
(Figure 6-3), which will duplicate the top element (consuming 0) and pushing the new element to the stack (pushing 1).
At the end of evaluating all the instructions, the top element of the stack must be non-zero for the Script to resolve as valid. Having no elements in the stack or the top element being zero would resolve as invalid. Resolving as invalid means that the transaction which includes the unlocking Script is not accepted on the network.
There are many other operations besides OP_DUP
.
OP_HASH160
(Figure 6-4) does a sha256 followed by a ripemd160 (aka hash160) to the top element of the stack (consuming 1) and pushing a new element to the stack (pushing 1).
Note in the diagram that y = hash160(x)
.
Another very important operation is OP_CHECKSIG
(Figure 6-5).
OP_CHECKSIG
consumes 2 elements from the stack, the first being the pubkey, the second being a signature, and examines if the signature is good for the given pubkey.
If so, OP_CHECKSIG
pushes a 1 to the stack, otherwise pushes a 0 to the stack.
We can now code OP_DUP
, given a stack.
OP_DUP
simply duplicates the top element of the stack.
link:code-ch06/op.py[role=include]
...
OP_CODE_FUNCTIONS = {
...
118: op_dup, # (3)
...
}
-
We have to have at least one element to duplicate, otherwise, we can’t execute this opcode.
-
This is how we duplicate the top element of the stack.
-
118 = 0x76
which is the code forOP_DUP
Note that we return a boolean with this opcode, as a way to tell whether the operation was successful. A failed operation automatically fails Script evaluation.
Here’s another one for OP_HASH256
.
This opcode will consume the top element, perform a hash256 operation on it and push the result on the stack.
link:code-ch06/op.py[role=include]
...
OP_CODE_FUNCTIONS = {
...
170: op_hash256,
...
}
Both ScriptPubKey and ScriptSig are parsed the same way.
If the byte is between 0x01
and 0x4b
(which we call n
), we read the next n
bytes as an element.
Otherwise, the byte represents an operation, which we have to look up.
Here are some operations and their byte codes:
-
0x00
-OP_0
-
0x51
-OP_1
-
0x60
-OP_16
-
0x75
-OP_DUP
-
0x93
-OP_ADD
-
0xa9
-OP_HASH160
-
0xac
-OP_CHECKSIG
Note
|
Longer than 75-byte elements
You might be wondering what would happen if you had an element that’s greater than Practically speaking, this means if we have an element that’s between 76 and 255 bytes inclusive, we use It is possible to encode a number below 76 using |
There are many more opcodes, which are coded in op.py
and the full list can be found at http://wiki.bitcoin.it.
Now that we know how Script works, we can write a Script parser.
link:code-ch06/script.py[role=include]
...
link:code-ch06/script.py[role=include]
-
Each instruction is either an opcode to be executed or an element to be pushed onto the stack.
-
Script serialization always starts with the length of the entire Script.
-
We parse until the right amount of bytes are consumed.
-
The byte determines if we have an opcode or element.
-
This converts the byte into an integer in Python.
-
For a number between 1 to 75, we know the next n bytes are an element
-
76 is
OP_PUSHDATA1
, so the next byte tells us how many bytes to read -
77 is
OP_PUSHDATA2
, so the next two bytes tell us how many bytes to read -
We have an opcode that we store.
-
Script should have consumed exactly the length of bytes we expected, otherwise we raise an error.
We can similarly write a Script serializer.
class Script:
...
link:code-ch06/script.py[role=include]
-
If the instruction is an integer, we know that’s an opcode.
-
If the byte is between 1 and 75 inclusive, we encode the length as a single byte
-
For any element with length from 76 to 255, we put
OP_PUSHDATA1
first, then encode the length as a single byte and finally, the element. -
For anything from 256 to 520, we put
OP_PUSHDATA2
first, then encode the length as two bytes in little Endian and finally, the element -
Any element longer than 520 bytes cannot be serialized.
-
Script serialization starts with the length of the entire Script.
Note that both the parser and serializer were used in [chapter_tx_parsing] parsing/serializing the ScriptSig and ScriptPubKey fields.
The Script object represents the instruction set that requires evaluation. To evaluate Script, we need to combine the ScriptPubKey and ScriptSig fields. The lockbox (ScriptPubKey) and the unlocking (ScriptSig) are in different transactions. Specifically, the lockbox is where the bitcoins are received, the unlocking is where the bitcoins are spent. The input in the spending transaction points to the receiving transaction. Essentially, we have a situation like Figure 6-6:
Since ScriptSig unlocks ScriptPubKey, we need a mechanism by which the two Scripts combine. To evaluate the two together, we take the instructions from ScriptSig and ScriptPubKey and combine them as above. The instructions from the ScriptSig go on top of all the instructions from ScriptPubKey. Instructions are processed one at a time until no instructions are left to be processed or if the Script fails early.
The evaluation of Script requires that we take the ScriptSig and ScriptPubKey, combine them into a single instruction set and execute the instructions. In order to do this, we require a way to combine the Scripts.
class Script:
...
link:code-ch06/script.py[role=include]
-
We are combining the instruction set to create a new, combined Script object.
We will use this ability to combine Scripts for evaluation later in this chapter.
There are many types of standard Scripts in Bitcoin including the following:
-
p2pk - Pay-to-pubkey
-
p2pkh - Pay-to-pubkey-hash
-
p2sh - Pay-to-Script-hash
-
p2wpkh - Pay-to-witness-pubkey-hash
-
p2wsh - Pay-to-witness-Script-hash
Addresses are data that fit into known Script templates like these. Wallets know how to interpret various address types (p2pkh, p2sh, p2wpkh) and create the appropriate ScriptPubKey. All of the above have a particular type of address format (base58, bech32) so wallets can pay to them.
To show exactly how all this works, we’ll start with one of the original Scripts, pay-to-pubkey.
Pay-to-pubkey (p2pk) was used largely during the early days of bitcoin. Most coins thought to belong to Satoshi are in p2pk UTXOs. That is, transaction outputs whose ScriptPubKeys have the p2pk form. There are some limitations that we’ll discuss later, but first, let’s look at how p2pk works.
Back in [chapter_elliptic_curve_cryptography], we learned both ECDSA signing and verification.
In order to verify an ECDSA signature, we need the message, z
, the public key, P
and the signature, r
and s
.
In p2pk, Bitcoins are sent to a public key and the owner of the private key can unlock, or spend the Bitcoins by creating a signature.
The ScriptPubKey of a Transaction puts the assigned Bitcoins under the control of the private key owner.
Specifying where the Bitcoins go is the job of the ScriptPubKey. ScriptPubKey is the lockbox that receives the bitcoins. The p2pk ScriptPubKey looks like Figure 6-7:
Note the OP_CHECKSIG
, as that will be very important.
The ScriptSig is the part that unlocks the received bitcoins.
The pubkey can be compressed or uncompressed, though early on in Bitcoin’s history when p2pk was more prominent, uncompressed was the only one being used (see [chapter_serialization]).
For p2pk, the ScriptSig required to unlock the corresponding ScriptPubKey is just the signature as shown in Figure 6-8:
The ScriptPubKey and ScriptSig combine to make an instruction set that looks like Figure 6-9:
The two columns below are instructions of Script and the elements stack. At the end of this processing, the top element of the stack must be non-zero to be considered a valid ScriptSig. The Script instructions are processed one instruction at a time. We start with the instructions as combined above (Figure 6-10):
The first instruction is the signature, which is an element. This is data that is pushed to the stack (Figure 6-11).
The second instruction is the pubkey, which is also an element. This is again, data is pushed to the stack (Figure 6-12).
OP_CHECKSIG
consumes 2 stack instructions (pubkey and signature) and determines if they are valid for this transaction.
OP_CHECKSIG
will push a 1 to the stack if the signature is valid, 0 if not.
Assuming that the signature is valid for this public key, we have this situation (Figure 6-13):
We’re finished processing all the instructions of Script and we’ve ended with a single element on the stack. Since the top element is not zero, (1 is definitely not 0), this Script is valid.
If this transaction instead had an invalid signature, the result from OP_CHECKSIG
would be zero, ending our Script processing like Figure 6-14:
We end with a single element on the stack which is zero. Since the top element is zero, the combined Script is invalid and a transaction with this ScriptSig in the input is invalid.
The combined Script will validate if the signature is valid, but fail if the signature is invalid. The ScriptSig will only unlock the ScriptPubKey if the signature is valid for that public key. In other words, only someone with knowledge of the private key can produce a valid ScriptSig.
Incidentally, we can see where ScriptPubKey got its name.
The public key in uncompressed SEC format is the main instruction in ScriptPubKey for p2pk (the other instruction being OP_CHECKSIG
).
Similarly, ScriptSig is named as such because the ScriptSig for p2pk is a single element: the DER signature.
We now code a way to evaluate Script. This requires us to go through each instruction and evaluate whether the Script is valid. What we want to be able to do is this:
link:code-ch06/examples.py[role=include]
-
p2pk ScriptPubkey is the SEC format pubkey followed by
OP_CHECKSIG
which is0xac
or 170. -
We can do this because of the
add
method we created above. -
We want to evaluate the instructions and see if the Script validates.
Here is the method that we’ll use for the combined instruction set (combination of the ScriptPubKey of the previous transaction and the ScriptSig of the current transaction).
from op import OP_CODE_FUNCTIONS, OP_CODE_NAMES
...
class Script:
...
link:code-ch06/script.py[role=include]
-
As the instructions list will change, we make a copy
-
We execute until the instructions list is empty
-
The function that executes the opcode is in the
OP_CODE_FUNCTIONS
array (e.g.OP_DUP
,OP_CHECKSIG
, etc) -
99 and 100 are
OP_IF
andOP_NOTIF
respectively. They require manipulation of the instructions array based on the top element of the stack. -
107 and 108 are
OP_TOALTSTACK
andOP_FROMALTSTACK
respectively. They move stack elements to/from an "alternate" stack, which we call altstack. -
172, 173, 174 and 175 are
OP_CHECKSIG
,OP_CHECKSIGVERIFY
,OP_CHECKMULTISIG
, andOP_CHECKMULTISIGVERIFY
, which all require the signature hash,z
, from [chapter_elliptic_curve_cryptography] for signature validation. -
If the instruction is not an opcode, it’s an element, so we push that element to the stack.
-
If the stack is empty at the end of processing all the instructions, we fail the Script by returning
False
. -
If the stack’s top element is an empty byte string (which is how the stack stores a 0), then we also fail the Script by returning
False
. -
Any other result means that the Script has validated.
Warning
|
Making Script Evaluation Safe
The code shown here is a little bit of a cheat as the combined Script is not executed this way exactly. The ScriptSig is evaluated separately from the ScriptPubKey as to not allow operations from ScriptSig to affect the ScriptPubKey instructions. Specifically, the stack after all the ScriptSig instructions are evaluated are stored and then the ScriptPubkey instructions are evaluated on their own with the stack from the first execution. |
It may be confusing that the stack elements are sometimes numbers like 0 or 1 and other times byte-strings like a DER signature or SEC pubkey.
Under the hood, they’re all bytes, but some are interpreted as numbers for certain opcodes.
For example, 1 is stored on the stack as the 01
byte, 2 is stored as the 02
byte, 999 as the e703
and so on.
Any byte-string is interpreted as a Little-Endian number for arithmetic opcodes.
The integer 0 is not stored as the 00
byte, but as the empty byte-string.
The code in op.py
can clarify what’s going on:
link:code-ch06/op.py[role=include]
Numbers being pushed to the stack are encoded into bytes and decoded from bytes when the numerical value is needed.
Pay-to-pub-key is intuitive in the sense that there is a public key that anyone can send Bitcoins to and a signature that can only be produced by the owner of the private key. This works well, but there are some problems.
First, the public keys are long. We know from [chapter_serialization] that secp256k1 public points are 33 bytes in compressed SEC and 65 bytes in uncompressed SEC. Unfortunately, humans can’t interpret 33 or 65 raw bytes easily. Most character encodings don’t render certain byte ranges as they are control characters, newlines or similar. The SEC format is typically encoded instead in hexadecimal, doubling the length (hex encodes 4 bits per character instead of 8). This makes the compressed and uncompressed SEC formats 66 and 130 characters respectively, which is bigger than most identifiers (your username on a website, for instance, is usually less than 20). To compound this, early Bitcoin transactions didn’t use the compressed versions so the hexadecimal addresses were 130 characters each! This is not fun or easy for people to transcribe, much less communicate by voice.
That said, the original use-cases for p2pk were for IP-to-IP payments and mining outputs. For IP-to-IP payments, IP addresses were queried for their public keys, so communicating the public keys were done machine-to-machine, which meant that human communication wasn’t necessarily a problem. Use for mining outputs also don’t require human communication. Incidentally, this IP-to-IP payment system was phased out as it’s not secure and prone to man-in-the-middle attacks.
It seems the uncompressed SEC format doesn’t make sense for Bitcoin given that block space is at a premium, so why did Satoshi use it? Satoshi was using the OpenSSL library to do the SEC format conversions and the OpenSSL library at the time Satoshi wrote Bitcoin (circa 2008) did not document the compressed format very well. It’s speculated this is why Satoshi used the uncompressed SEC format.
When Pieter Wuille discovered that the compressed SEC format existed in OpenSSL, more people started using the compressed SEC format in Bitcoin.
Second, because the public keys are long, this causes a more subtle problem. The UTXO set becomes bigger since this large public key has to be kept around and indexed to see if it’s spendable. This requires more resources on the part of full nodes.
Third, because we’re storing the public key in the ScriptPubKey field, it’s known to everyone. That means should ECDSA someday be broken, these outputs could be stolen. This is not a very big threat since ECDSA is used in a lot of applications besides Bitcoin and would affect all of those things, too. For example, quantum computing has the potential to reduce the calculation times significantly for RSA and ECDSA, so having something else in addition to protect these outputs would be more secure.
Pay-to-pubkey-hash (p2pkh) is an alternative Script that has advantages over p2pk:
-
The addresses are shorter.
-
It’s additionally protected by sha256 and ripemd160.
Addresses are shorter due to the use of the sha256 and ripemd160 hashing algorithms. We do both in succession and call that hash160. The result of hash160 is 160-bits or 20 bytes, which are encoded into an address.
The result is what you may have seen on the Bitcoin network and coded in [chapter_serialization]:
1PMycacnJaSqwwJqjawXBErnLsZ7RkXUAs
This address encodes within 20 bytes that look like this in hexadecimal:
f54a5851e9372b87810a8e60cdd2e7cfd80b6e31
These 20 bytes are the result of doing a hash160 operation on this (compressed) SEC public key:
0250863ad64a87ae8a2fe83c1af1a8403cb53f53e486d8511dad8a04887e5b2352
Given p2pkh is shorter and more secure, p2pk use declined significantly after 2010, though it’s still fully supported today.
Pay-to-pubkey-hash was used during the early days of bitcoin, though not as much as p2pk.
The p2pkh ScriptPubKey, or locking Script, looks like Figure 6-15:
Like p2pk, OP_CHECKSIG
is here and OP_HASH160
makes an appearance.
Unlike p2pk, the SEC pubkey is not here but a 20 byte hash is.
There is also a new opcode here: OP_EQUALVERIFY
.
The p2pkh ScriptSig, or the unlocking Script, looks like Figure 6-16:
Like p2pk, the ScriptSig has the DER signature. Unlike p2pk, the ScriptSig also has the SEC pubkey. The main difference between p2pk and p2pkh ScriptSigs is that the SEC pubkey has moved from ScriptPubKey to ScriptSig.
The ScriptPubKey and ScriptSig combine to a list of instructions that looks like Figure 6-17:
At this point, the Script is processed one instruction at a time. We start with the instructions as above (Figure 6-18).
The first two instructions are elements, so they are pushed to the stack (Figure 6-19).
OP_DUP
duplicates the top element, so the pubkey gets duplicated (Figure 6-20):
OP_HASH160
takes the top element and performs the hash160 operation on it (sha256 followed by ripemd160), creating a 20-byte hash (Figure 6-21):
The 20-byte hash is an element and is pushed to the stack (Figure 6-22).
We are now at OP_EQUALVERIFY
.
This opcode consumes the top two elements and checks if they’re equal.
If they are equal, the Script continues execution.
If they are not equal, the Script stops immediately and fails.
We assume here that they’re equal, leading to Figure 6-23:
We are now exactly where we were during the OP_CHECKSIG
part of processing p2pk.
Once again, we assume that the signature is valid (Figure 6-24):
There are two ways this Script can fail.
If the ScriptSig provides a public key that does not hash160 to the 20-byte hash in the ScriptPubKey, the Script will fail at OP_EQUALVERIFY
(Figure 6-22).
The other failure condition is if the ScriptSig has a public key that hash160s to the 20-byte hash in the ScriptPubKey, but has an invalid signature.
That would end the combined Script evaluation with a 0, ending in failure.
This is why we call this type of Script pay-to-pubkey-hash. The ScriptPubKey has the 20-byte hash160 of the public key and not the public key itself. We are locking Bitcoins to a hash of the public key and the spender is responsible for revealing the public key as part of constructing the ScriptSig.
The major advantage is that the ScriptPubKey is shorter (just 25 bytes) and a thief would not only have to solve the Discrete Log problem in ECDSA, but also figure out a way to find pre-images of both ripemd160 and sha256.
Note that Scripts can be any arbitrary program. Script is a smart contract language and can lock Bitcoins in many different ways. Figure 6-25 is an example ScriptPubKey:
Figure 6-26 is a ScriptSig that will unlock the the ScriptPubKey from Figure 6-25.
The combined Script is shown in Figure 6-27:
Script evaluation will start like Figure 6-28:
OP_4
will push a 4 to the stack (Figure 6-29).
OP_5
will likewise push a 5 to the stack (Figure 6-30).
OP_ADD
will consume the top two elements of the stack, add them together and push the sum to the stack (Figure 6-31).
OP_9
will push a 9 to the stack (Figure 6-32).
OP_EQUAL
will consume 2 elements and push a 1 if equal, 0 if not (Figure 6-33).
Note that the ScriptSig isn’t particularly hard to figure out and contains no signature. As a result, ScriptPubKey is vulnerable to being taken by anyone who can solve it. Think of this ScriptPubKey as a lockbox with a very flimsy lock that anyone can break into. It is for this reason that most transactions have a signature requirement in the ScriptSig.
Once a UTXO has been spent, included in a block and secured by proof-of-work, the coins are locked to a different ScriptPubKey and no longer as easily spendable. Someone attempting to spend already spent coins would have to provide proof-of-work, which is expensive (see [chapter_blocks]).
The previous exercise was a bit of a cheat as OP_MUL
is no longer allowed on the Bitcoin network.
Version 0.3.5 of Bitcoin disabled a lot of different opcodes as anything that had even a little bit of potential to create vulnerabilities on the network were disabled.
This is just as well since most of the functionality in Script is actually not used much. From a software maintenance standpoint, this is not a great situation as the code has to be maintained despite its lack of usage. Simplifying and getting rid of certain capabilities can be seen as a way to make Bitcoin more secure.
This is in stark contrast to other projects which try to expand their smart contract languages, often increasing the attack surface along with new features.
In 2013, Peter Todd created a Script very similar to the Script in Exercise 4 and put some Bitcoins into it to create an economic incentive for people to find hash collisions. The donations reached 2.49153717 BTC and when Google actually found a hash collision for sha1 in February of 2017 (https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html), this Script was promptly redeemed. The transaction output was 2.48 Bitcoins which was $2848.88 USD at the time.
Peter created more piñatas for sha256, hash256 and hash160, which add economic incentives to find collisions for these hashing functions.