OCB is a blockcipher-based mode of operation that simultaneously provides both privacy and authenticity for a user-supplied plaintext. Such a method is called an authenticated-encryption scheme. What makes OCB remarkable is that it achieves authenticated encryption in almost the same amount of time as the fastest conventional mode, CTR mode, achieves privacy alone. Using OCB one buys privacy+authenticity about as cheaply (or more cheaply—CBC was once the norm) as one used to pay for achieving privacy alone. Despite this, OCB is simple and clean, and easy to implement in either hardware or software. OCB accomplishes its work without bringing in the machinery of universal hashing, a technique that does not seem to lend itself to implementations that are simple and fast in both hardware and software.
Somewhat more precisely, OCB solves the problem of nonce-based authenticated-encryption with associated-data (AEAD). The associated-data part of the name means that when OCB encrypts a plaintext it can bind it to some other string, called the associated data, that is authenticated but not encrypted. The nonce-based part of the name means that OCB requires a nonce to encrypt each message. OCB does not require the nonce to be random; a counter, say, will work fine. Unlike some modes, the plaintext provided to OCB can be of any length, as can the associated data, and OCB will encrypt the plaintext without ever padding it to some convenient-length string, an approach that would yield a longer ciphertext.
The design of OCB was strongly influenced by Charanjit Jutla’s mode IAPM; I consider the papers I’ve written related to OCB to be follow-on work to Jutla’s paper. I took up the design of OCB largely in response to NIST’s modes-of-operation activities: NIST had figured out that not only do we need a modern blockcipher (AES), but one also needs, just as importantly, modern ways to use the construct well. While NIST still has not put forward a standard for an “integrated” one-pass AEAD mode, I suspect that, eventually, they will.
In the past, when one wanted a shared-key mechanism providing
both privacy and authenticity,
the usual thing to do was to
separately encrypt and compute a MAC, using two different keys.
(The word MAC stands for Message Authentication Code.)
The encryption is what buys you privacy and the
MAC is what gets you authenticity.
The cost to get privacy-and-authenticity achieved in this way is
about the cost to encrypt (with a privacy-only scheme) plus the cost to MAC.
If one is encrypting and MACing in a conventional way,
like CTR-mode encryption and the CBC MAC,
the cost for privacy-and-authenticity is
going to be twice the cost for privacy alone, just counting blockcipher calls.
Thus people have often hoped for a simple and cheap way to get
message authenticity as an almost “incidental” adjunct to message privacy.
Can you describe how OCB works?
Sure;
I’ll draw a pretty picture, too.
To make things concrete,
let’s assume you’re using a blockcipher of E=AES with 128-bit keys.
The block length is n=128 bits.
Let’s assume a nonce of 96 bits.
Other values can be accommodated,
but 96 bits (12 bytes) is the recommended nonce length.
Let M be the message you want to encrypt. Let A be the associated data (if there isn’t any, regard it as the empty string). Let K be the OCB encryption key, which is just a key for the underlying blockcipher, AES. Let N be the 96-bit nonce—for example, a counter that gets incremented with each message you encrypt.
Let’s first look at the setting when |M| is a positive multiple of 128 bits. Then break M into 128-bit blocks M = M1...Mm. Encryption then works as shown in the following picture. Read each column top-to-bottom, and then go across left-to-right. We’ll defer describing the Init and Inc functions for just a moment. (These functions depend on K too, but we haven’t reflected that in the notation.) The value Checksum is the 128-bit string Checksum = M1 ⊕…⊕ Mm. We’ll describe how to compute the value Auth in a moment. It’s determined by A and K. The number τ, the tag length of the scheme, is, like the blockcipher E, a parameter of the mode. It is a number 0 ≤ τ ≤ 128. Now the picture:
The ciphertext is the 128m + τ bit string CT = C1 C2 … Cm T.
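The full-block case can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not reference code: the blockcipher is passed in as a hypothetical callable `encipher` (standing for EK), the values Init(N), L[j], L$, and Auth are supplied by the caller, τ is taken to be a whole number of bytes, and the per-block wiring Ci = EK(Mi ⊕ Δ) ⊕ Δ with Inc$ applied before the tag is an assumption based on the published OCB3 description.

```python
from typing import Callable, Sequence


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise xor of two equal-length strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def ntz(i: int) -> int:
    """Number of trailing zero bits in the binary representation of i >= 1."""
    return (i & -i).bit_length() - 1


def ocb_encrypt_full_blocks(
    encipher: Callable[[bytes], bytes],  # hypothetical stand-in for E_K
    delta0: bytes,                       # Init(N), computed elsewhere
    l_table: Sequence[bytes],            # L[0], L[1], ... computed elsewhere
    l_dollar: bytes,                     # L$
    auth: bytes,                         # Auth, from the associated data
    message: bytes,
    tag_bytes: int = 16,                 # tau / 8 (assumes tau is a multiple of 8)
) -> bytes:
    if len(message) == 0 or len(message) % 16 != 0:
        raise ValueError("this sketch only handles a positive multiple of 16 bytes")
    delta = delta0
    checksum = bytes(16)
    out = bytearray()
    for i in range(1, len(message) // 16 + 1):
        block = message[16 * (i - 1):16 * i]
        delta = xor(delta, l_table[ntz(i)])             # Delta <- Inc_i(Delta)
        out += xor(encipher(xor(block, delta)), delta)  # C_i
        checksum = xor(checksum, block)                 # running M1 xor ... xor Mi
    delta = xor(delta, l_dollar)                        # Delta <- Inc_$(Delta)
    tag = xor(encipher(xor(checksum, delta)), auth)
    return bytes(out) + tag[:tag_bytes]                 # CT = C1 ... Cm T
```

The returned string has 16m + tag_bytes bytes, matching the 128m + τ bit length stated above.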
Let us explain how Init works. (After that, we’ll explain how the function Inc works.) We are given a 96-bit nonce N. We save the last six bits as the number Bottom (a value between 0 and 63) and then we create a 128-bit value Top by prepending to N the 32-bit constant 0x00000001 and zeroing out the last six bits. Let Ktop = EK (Top) and let Stretch be the 256-bit string Stretch = Ktop || (Ktop⊕(Ktop<<8)). By S << i we mean the left shift of the 128-bit string S by i positions (leftmost bits vanish, zero bits come in on the right). The value denoted above as Init(N), the initial value for Δ, is the first 128 bits of Stretch << Bottom.
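As a rough sketch of the Init computation just described, the following Python treats strings as big-endian integers. The function names are hypothetical, the blockcipher call is left to the caller (Ktop is passed in as 16 bytes), and the nonce is assumed to be exactly 96 bits.

```python
from typing import Tuple

MASK128 = (1 << 128) - 1
MASK256 = (1 << 256) - 1


def split_nonce(nonce: bytes) -> Tuple[bytes, int]:
    """Return (Top, Bottom) for a 96-bit nonce N."""
    if len(nonce) != 12:
        raise ValueError("this sketch assumes a 96-bit (12-byte) nonce")
    block = bytes.fromhex("00000001") + nonce      # prepend the 32-bit constant
    bottom = block[-1] & 0x3F                      # last six bits, 0..63
    top = block[:-1] + bytes([block[-1] & 0xC0])   # last six bits zeroed
    return top, bottom


def init_offset(ktop: bytes, bottom: int) -> bytes:
    """First 128 bits of Stretch << Bottom, where
    Stretch = Ktop || (Ktop xor (Ktop << 8))."""
    k = int.from_bytes(ktop, "big")
    shifted = (k << 8) & MASK128                   # leftmost bits vanish
    stretch = (k << 128) | (k ^ shifted)           # 256-bit Stretch
    moved = (stretch << bottom) & MASK256          # Stretch << Bottom
    return (moved >> 128).to_bytes(16, "big")      # keep the first 128 bits
```

Given some blockcipher function `encipher` for EK (an assumption, not defined here), Init(N) would be `init_offset(encipher(top), bottom)` with `top, bottom = split_nonce(N)`.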
Why all this peculiar stuff? Beyond the fact that one can prove that it does what we want, the intent is very simple: if N is a counter, Ktop is going to change only once every 64 calls. As a consequence, if we keep Ktop around as long as needed, then, 63 out of 64 times, all that will be needed to compute Init(N) is a logical operation to extract the lowest six bits of N and then a logical shift of the string Stretch by this many bits. Just a few cycles. The other one time out of 64 we will need a blockcipher call, too. The amortized cost to compute Init, meaning the average cost over a sequence of successive calls, is extremely low, and we don't need to implement anything like a GF(2^128) multiply.
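The amortization argument can be illustrated with a small cache that remembers Stretch for the most recent Top and counts blockcipher calls. Everything named here is hypothetical; in particular `toy_encipher` is a SHA-256-based stand-in used only so the example runs without a crypto library. It is not a blockcipher and must not be used for real encryption.

```python
import hashlib
from typing import Callable, Optional

MASK128 = (1 << 128) - 1
MASK256 = (1 << 256) - 1


class InitCache:
    """Computes Init(N), reusing Stretch while Top is unchanged."""

    def __init__(self, encipher: Callable[[bytes], bytes]) -> None:
        self._encipher = encipher
        self._top: Optional[bytes] = None
        self._stretch = 0
        self.blockcipher_calls = 0

    def init(self, nonce: bytes) -> bytes:
        block = bytes.fromhex("00000001") + nonce       # 96-bit nonce assumed
        bottom = block[-1] & 0x3F
        top = block[:-1] + bytes([block[-1] & 0xC0])
        if top != self._top:                            # roughly 1 time in 64
            ktop = int.from_bytes(self._encipher(top), "big")
            self.blockcipher_calls += 1
            self._stretch = (ktop << 128) | (ktop ^ ((ktop << 8) & MASK128))
            self._top = top
        # The common case: just a shift and a truncation.
        return (((self._stretch << bottom) & MASK256) >> 128).to_bytes(16, "big")


def toy_encipher(block: bytes) -> bytes:
    """Stand-in for E_K. NOT a blockcipher; for counting calls only."""
    return hashlib.sha256(b"demo key" + block).digest()[:16]


cache = InitCache(toy_encipher)
offsets = [cache.init(counter.to_bytes(12, "big")) for counter in range(256)]
```

With counter nonces 0 through 255, `cache.blockcipher_calls` ends at 4: one call for each run of 64 consecutive nonce values.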
Now let’s explain how the Inc_i function works. First, for S a 128-bit string, let double(S) = S << 1 if the first bit of S is 0, and let double(S) = (S << 1) ⊕ 135 otherwise (where 135 denotes the 128-bit string that encodes that decimal value). For those of you for whom this means something, double(S) is just the multiplication of S by the field point “x” when using a particular representation for the finite field with 2^128 points. We set L∗ = EK(0^128), L$ = double(L∗), L[0] = double(L$), and L[j] = double(L[j−1]) for all j ≥ 1. Given these K-derived L-values, define Inc_i(Δ) = Δ ⊕ L[j] where j = ntz(i) is the number of trailing zero bits in the binary representation of i. Likewise define Inc∗(Δ) = Δ ⊕ L∗ and Inc$(Δ) = Δ ⊕ L$.
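Here is a short Python sketch of double, ntz, and the L-table, again treating 128-bit strings as big-endian integers. The function names are illustrative, and L∗ is taken as an input so that no blockcipher is needed to run it.

```python
from typing import List, Tuple

MASK128 = (1 << 128) - 1


def double(s: bytes) -> bytes:
    """S << 1 if the first bit of S is 0, else (S << 1) xor 135."""
    x = int.from_bytes(s, "big")
    shifted = (x << 1) & MASK128      # leftmost bit vanishes
    if x >> 127:                      # first (most significant) bit was 1
        shifted ^= 135
    return shifted.to_bytes(16, "big")


def ntz(i: int) -> int:
    """Number of trailing zero bits in the binary representation of i >= 1."""
    if i <= 0:
        raise ValueError("ntz is only defined here for positive i")
    return (i & -i).bit_length() - 1


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def l_values(l_star: bytes, size: int) -> Tuple[bytes, List[bytes]]:
    """Return (L$, [L[0], ..., L[size-1]]) derived from L* by repeated doubling."""
    l_dollar = double(l_star)
    table = [double(l_dollar)]                 # L[0] = double(L$)
    for j in range(1, size):
        table.append(double(table[j - 1]))     # L[j] = double(L[j-1])
    return l_dollar, table


def inc(delta: bytes, i: int, table: List[bytes]) -> bytes:
    """Inc_i(Delta) = Delta xor L[ntz(i)]."""
    return xor(delta, table[ntz(i)])
```

A table of size j+1 covers block indices up to 2^(j+1) − 1, so a few dozen entries are enough for any realistic message length.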
We call the increment functions we have described “table-based” because, to implement them, it is natural to precompute a table of the 128-bit values L∗, L$, and L[j]. Then, for each kind of increment, we just xor in the needed value from the table.
Decryption under OCB works in the expected way: given K, N, and CT = C1 C2 … Cm T, recover M in the natural way and recompute the tag T* that “should” be at the end of CT. Regard M as the (authentic) underlying plaintext for CT if T = T*. Regard the ciphertext as invalid if T ≠ T*.
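The final accept-or-reject step might look like the sketch below. It assumes τ is a whole number of bytes and that the recomputed tag T* has been obtained elsewhere. The use of a constant-time comparison (`hmac.compare_digest` from the Python standard library) is an implementation choice added here, not something the description above requires.

```python
import hmac
from typing import Optional, Tuple


def split_ciphertext(ct: bytes, tag_bytes: int) -> Tuple[bytes, bytes]:
    """Split CT = C1 ... Cm T into (C1 ... Cm, T)."""
    if len(ct) < tag_bytes:
        raise ValueError("ciphertext is shorter than the tag length")
    cut = len(ct) - tag_bytes
    return ct[:cut], ct[cut:]


def release_if_valid(plaintext: bytes, received_tag: bytes,
                     recomputed_tag: bytes) -> Optional[bytes]:
    """Return the plaintext if T = T*, otherwise None (ciphertext invalid)."""
    if hmac.compare_digest(received_tag, recomputed_tag):
        return plaintext
    return None
```

Returning nothing at all on a mismatch keeps unauthenticated plaintext from reaching the caller.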
When |M| is not a positive multiple of 128 bits we process the final block a little differently. This time we break the plaintext M into blocks M = M1...Mm M∗ where |Mi| = 128 and 0 ≤ |M∗| < 128. Computation works as in the following picture. We redefine Checksum as Checksum = M1 ⊕…⊕ Mm ⊕ M∗ 10* where the M∗ 10* notation means to append a single 1-bit and then the minimum number of 0-bits to get the string to be 128 bits.
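The padding and the redefined Checksum can be sketched as below. The sketch works at byte granularity (so the appended 1-bit becomes the byte 0x80), and it treats an empty M∗ as contributing nothing to the checksum; both are assumptions of this illustration.

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def pad_10star(m_star: bytes) -> bytes:
    """M* 10*: append a 1-bit, then 0-bits up to 128 bits (byte-aligned input)."""
    if len(m_star) >= 16:
        raise ValueError("the final fragment must be shorter than one block")
    return m_star + b"\x80" + bytes(15 - len(m_star))


def checksum(message: bytes) -> bytes:
    """M1 xor ... xor Mm, plus the padded final fragment when one exists."""
    full = len(message) // 16 * 16
    acc = bytes(16)
    for offset in range(0, full, 16):
        acc = xor(acc, message[offset:offset + 16])
    tail = message[full:]
    if tail:                                   # assumption: empty tail adds nothing
        acc = xor(acc, pad_10star(tail))
    return acc
```

For a message that is a multiple of 16 bytes this reduces to the plain xor of the blocks, as in the first case.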
Finally, I’ll describe how to compute Auth, the 128-bit string one gets by processing the associated data A. As before, we distinguish two cases, when A is or is not a multiple of 128 bits. The picture below shows the two settings. The increment function Inc is just as before; the only thing that is different is the initial value of Δ, which is Δ = Init = 0^128. When A is the empty string, the value returned is Auth = 0^128.
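A possible rendering of the Auth computation in Python follows. The description above fixes only the initial Δ, the increment function, and the empty-string result; the rest of the wiring here (xor-summing EK(Ai ⊕ Δ) over full blocks, and using Inc∗ with 10* padding for a final partial block) is an assumption taken from the published OCB3 specification. The blockcipher is a caller-supplied callable, and A is assumed to be byte-aligned.

```python
from typing import Callable, Sequence


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def ntz(i: int) -> int:
    return (i & -i).bit_length() - 1


def compute_auth(
    encipher: Callable[[bytes], bytes],  # hypothetical stand-in for E_K
    l_table: Sequence[bytes],            # L[0], L[1], ...
    l_star: bytes,                       # L*
    assoc: bytes,                        # associated data A
) -> bytes:
    delta = bytes(16)                    # Delta = Init = 0^128
    total = bytes(16)                    # running sum; stays 0^128 for empty A
    full = len(assoc) // 16
    for i in range(1, full + 1):
        block = assoc[16 * (i - 1):16 * i]
        delta = xor(delta, l_table[ntz(i)])               # Inc_i
        total = xor(total, encipher(xor(block, delta)))
    tail = assoc[16 * full:]
    if tail:
        delta = xor(delta, l_star)                        # Inc_*
        padded = tail + b"\x80" + bytes(15 - len(tail))   # A* 10*
        total = xor(total, encipher(xor(padded, delta)))
    return total
```

Because Auth depends only on A and K, it can be computed once and reused across messages that share the same associated data.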
I don’t like to change an algorithm I’ve put out; there should be a compelling reason to do so. Let me describe what was significant with each algorithm in the chain.
The changes from OCB1 to OCB2 to OCB3 aren’t really all that big, so why have I bothered to continue to refine this mode?
If you implement OCB, please use the final version, OCB3.
In time, I hope the name OCB, when understood to be a particular algorithm,
will be understood to mean OCB3.
What are some of OCB’s properties?
OCB has been designed to simultaneously have lots of desirable properties.
Listed in no particular order, these properties include the following.
AES actually comes in three flavors: AES128, AES192, and AES256. I like AES128.
If you use OCB with a blockcipher having a 64-bit blocklength
instead of a 128-bit blocklength, beware
that you’ll get a worse security bound. You’ll need to change keys
well before you encrypt 2^32 blocks (as with all well-known
modes that use a 64-bit blockcipher).
Should one write “OCB-AES” or “AES-OCB”?
I prefer OCB-AES (or OCB-AES128), going from high-level protocol on down.
Just like HMAC-SHA1.
Is OCB in standards?
I am not the first to give an integrated authenticated-encryption mode of operation. Charanjit Jutla, from IBM Research, was the first to publicly describe a correct blockcipher-based mode of operation that combines privacy and authenticity at a small increment to the cost of providing privacy alone. Jutla’s scheme appears as IACR-2000/39. He went on to publish the work in a EUROCRYPT 2001 paper. A Journal of Cryptology version appeared in 2008.
Jutla actually described two schemes: one is CBC-like (IACBC)
and one is ECB-like (IAPM). The latter
is what OCB builds on.
OCB uses high-level ideas from Jutla’s scheme
but adds in many additional tricks.
What did Gligor and Donescu do?
Virgil Gligor and Pompiliu Donescu
were working on authenticated-encryption schemes at the same time that Jutla was.
They described an authenticated-encryption mode, XCBC, in
[Gligor, Donescu; Aug 18, 2000], taken from
Gligor’s homepage.
The mode is not similar to OCB, but
it is similar to Jutla’s IACBC.
We do not know the precise history of
Gligor/Donescu and Jutla in devising their schemes;
Jutla’s scheme was publicly distributed first, but only by a little.
OCB was invented after both pieces of work made their debut.
One of the contributions of
[Gligor, Donescu] is the use
of mod-2^n arithmetic in this context. In particular,
Jutla had suggested the use of
mod-p addition, and it was non-obvious that moving into
this weaker algebraic structure would be OK.
Later versions of [Gligor, Donescu; 18 August 2000] have added in new schemes.
Following Jutla, [Gligor, Donescu; October 2000] mention
a parallelizable mode of operation similar to Jutla’s IAPM.
Following my own work,
[Gligor, Donescu; April 2001] updated their
modes to use a single key and
to deal with messages of arbitrary bit length.
What is a nonce and why do you need one?
In OCB, you must present a nonce each time you want to encrypt a string. When you decrypt, you need to present the same nonce. The nonce doesn’t have to be random or secret or unpredictable. It does have to be something new with each message you encrypt. A counter value will work for a nonce, and that is what is recommended. It doesn’t matter what the counter is initialized to. In general, a nonce is a string that is used at most one time (either for sure, or almost for sure) within some established context. In OCB, this “context” is the “session” associated to the underlying encryption key K. For example, you may distribute the key K by a session-key distribution protocol, and then start using it. The nonces should now be unique within that session. Generating new nonces is the user’s responsibility, as is communicating them to the party that will decrypt.
A nonce is nothing new or strange for an encryption mode. All encryption modes that achieve a strong notion of privacy need to have one. If there were no nonce there would be only one ciphertext for each plaintext and key, and this means that the scheme would necessarily leak information (e.g., is the plaintext for the message just received the same as the plaintext for the previous message received?). In CBC mode the nonce is called an IV (initialization vector) and it needs to be adversarially unpredictable if you’re going to meet a strong definition of security. In OCB we have a weaker requirement on the nonce, and so we refrain from calling it an IV.
What happens if you repeat the nonce? You’re going to mess up authenticity for all future messages, and you’re going to mess up privacy for the messages that use the repeated nonce. So don’t do this. It is the user’s obligation to ensure that nonces don’t repeat within a session.
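One simple way to meet this obligation is a per-key counter that refuses to wrap around. The class below is a hypothetical sketch, assuming 96-bit nonces encoded big-endian and a single sender per key; persisting the counter across restarts is left out.

```python
class NonceCounter:
    """Hands out 96-bit counter nonces, never the same one twice per instance."""

    LIMIT = 1 << 96

    def __init__(self, start: int = 0) -> None:
        if not 0 <= start < self.LIMIT:
            raise ValueError("start must fit in 96 bits")
        self._next = start              # the initial value does not matter

    def next_nonce(self) -> bytes:
        if self._next >= self.LIMIT:
            # Wrapping would repeat a nonce; the session needs a new key instead.
            raise OverflowError("nonce space exhausted for this key")
        nonce = self._next.to_bytes(12, "big")
        self._next += 1
        return nonce
```

One such counter would be created per session key and discarded together with the key.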
We note that all conventional encryption modes fail, usually quite miserably,
if you repeat their supplied IV/nonce.
Only recently (2006) did cryptographers develop a notion for an encryption
scheme being maximally “resilient” to nonce-reuse.
The notion is due to Tom Shrimpton and me and we developed a nice scheme,
SIV, to solve this problem. But it’s a conventional
composed scheme: half the speed, or worse, of OCB.
SIV makes a nice alternative to a composed scheme like CCM
or EAX, offering comparable performance but a better
security guarantee.
Is there code available? Is it free?
Yes there’s code and yes it’s free.
Ted Krovetz wrote the code, and he's currently working on refining it.
See the top-level OCB page for a link.
In correctly coding OCB, I think the biggest difficulty is
endian problems. The spec is subtly big-endian
in character, a consequence of its use of “standard”
mathematical conventions.
If your implementation doesn’t agree with the reference one,
chances are there’s some silly endian error that will
take you hours to discover.
Are OCB test vectors available?
Test vectors can be found in the
Internet Draft.
Is it safe to use a new algorithm in a product / standard?
In the past, one had to wait years
before using a new cryptographic scheme;
one needed to give cryptanalysts
a fair chance to attack the thing. Assurance in a scheme’s correctness
sprang from the absence of damaging attacks by smart people,
so you needed to wait long enough that at least a few smart people would try,
and fail, to find a damaging attack.
But this entire approach has become largely outmoded,
for schemes that are not true primitives,
by the advent of provable security.
With a provably-secure scheme assurance does not stem from a failure
to find attacks; it comes from proofs,
with their associated bounds and definitions.
Our work includes a proof that OCB-E
is secure as long as blockcipher E is secure (where E is an arbitrary block
cipher and “secure” for OCB-E and secure for E
are well-defined notions).
Of course it is conceivable that OCB-AES could fail because AES
has some huge, unknown problem.
But other issues aren’t really of concern.
Here we’re in a domain
where the underlying definition is simple and rock solid;
where there are no “random oracles” to worry about;
and where the underlying cryptographic assumptions are completely standard.
Are there competing integrated authenticated-encryption proposals?
There are, although none are as refined as OCB.
See the Krovetz-Rogaway paper
The software performance of authenticated-encryption modes
for a chart pointing to a number of AE schemes. Going back to the original
integrated (aka, one-pass) schemes from Jutla and
from Gligor-Donescu:
There are various pitfalls people run into when trying to do a homebrewed combination of privacy and authenticity. Common errors include: (1) a failure to properly perform key separation; (2) a failure to use a MAC that is secure across different message lengths; (3) omitting the IV from what is MACed; (4) encrypting the MACed plaintext as opposed to the superior method (at least with respect to general assumptions) of MACing the computed ciphertext. For all of these reasons, I cannot advise people to roll their own authenticated-encryption scheme.
To date, the most significant two-pass authenticated-encryption schemes are CCM and GCM. Some other techniques include SIV and EAX. All of these modes comprise IP-free means of performing authenticated-encryption with associated-data. I’ll focus on CCM and GCM:
We also elaborate on SIV, a deterministic or misuse-resistant authenticated-encryption mode. It was designed by Rogaway and Shrimpton to solve the “key wrap” problem. When used as a nonce-based AEAD scheme, SIV has the unusual property that it doesn’t “break” if a nonce should get reused: all that happens is that repetitions of this exact (nonce, AD, plaintext) are revealed. The mode can also be used without any nonce and, when used as such, all it reveals are repetitions in (AD, plaintext) pairs. SIV suffers from a latency problem (unavoidable for the deterministic or misuse-resistant AEAD goal): one must not only make two passes over the input, but one cannot output any ciphertext bits until all plaintext bits are read. SIV is described in a EUROCRYPT 2006 paper and an associated spec.
The main advantage of OCB over CCM, GCM, and other composed schemes is software
speed. See the performance page
for data.
Is OCB patented?
It is; I myself have received US patents
7,046,802,
7,200,227,
7,949,129, and
8,321,675 on OCB.
That said, I have recently announced that OCB is freely licensed
over a large space:
there is one license grant for open-source software,
and another for non-military software.
See the licensing page for more information.
If you want to use OCB in a manner not covered by the patent grants,
please contact me.
I license OCB under fair, reasonable, and non-discriminatory terms, with
licensees paying a modest one-time fee.
There are further patents in the AE space. I would single out those of
Gligor and Donescu (VDG) and Jutla (IBM):
6,963,976,
6,973,187,
7,093,126, and
8,107,620.
Do the claims of these patents read against OCB?
It is difficult to answer such a question.
In fact, I suspect that nobody
can give an answer. It seems extremely subjective.
Do I need a license?
Please see the separate licensing page for information on this question.
Who is the author of this webpage?
I am, Phil Rogaway, a professor in the Department of Computer Science at the University of California, Davis. I have also been a regular visitor to universities in Thailand, most often at the Department of Computer Science at Chiang Mai University.
For 20 years I’ve worked to develop and popularize a style of cryptography I call practice-oriented provable security. This framework is a perfect fit to the problem of designing an authenticated-encryption scheme. One of the “outputs” of practice-oriented provable security has been cryptographic schemes that are well-suited to standards. In particular, I am co-inventor on the schemes known as CMAC, DHIES, EAX, OAEP, PSS, and XTS, which are in standards from ANSI, IEEE, ISO, and NIST. The kind of academic work I do can be seen by looking at dblp or Google Scholar.