What is PMAC?
It's a Parallelizable MAC.
So what's a MAC?
The acronym stands for message authentication code.
A (deterministic) message authentication code is a
function of two strings.
The first we'll call the key and the second we'll call the
The value Tag= MAC(K,M)
is a string of some fixed length (e.g., 64 bits).
A MAC is usually used like this.
A sender and a receiver share a key K.
When the sender wants to send a message M to the receiver,
he sends along with M the string
When the receiver receives a message M* and
its tag Tag*, the receiver
computes the "expected" tag
Tag**=MAC(K,M*) for M*.
If the expected tag is identical to the received tag
then the receiver
assumes that the received message M* really did originate with
If the tags are different then the receiver regards the message M*
as a forgery. In that case the receiver may want
to ignore the message, or close down
the connection, or take some other action.
What's the traditional way to make a MAC?
The most well known way to
MAC uses a block cipher, say AES, and is called the
(The CBC stands for cipher block chaining.)
It's defined, for example, in
ISO 9797, ANSI X9.9, and ANSI X9.19.
In what I'll call the basic CBC MAC,
the message M
must have a number of bits which is a positive multiple of
the block length (128 bits for AES). The message is regarded as a
sequence of 128-bit words,
The MAC, let us write CBC(K,M),
is defined as C[m], where
C[i]=AES(K,C[i-1] xor M[i])
for i>0 and
and C=ZERO (the block of 128 zero-bits).
The basic CBC MAC is usually "embellished" a bit.
First, the domain is extended to arbitrary bits strings,
usually by padding with a 1-bit and then the minimal
number of 0-bits to reach a multiple of 128 bits.
Second, a particular security problem is fixed.
The basic CBC MAC works correctly when all the
strings you want to MAC have one fixed length
(this is a well-known result of
[Bellare, Kilian, and Rogaway])) but
it does not work correctly
when message lengths may vary.
(For example, given the MAC (under some unknown key) of ZERO,
do you see how to forge the string ZERO||ZERO?)
To fix this problem extra processing is often done to
C[m] (as it was defined above), such
as enciphering it under a separate key.
Third, many standards allow you to take a prefix
of the full n-bit MAC (where n is the block length).
The three changes above represent one way to fix up the basic CBC MAC.
A more recent suggestion due to
more efficiently deals with padding and length-variability.
Are there ways to MAC besides the CBC MAC?
Yes, numerous ways.
The most well-known construction is
which is due to
[Bellare, Canetti, Krawczyk].
It has been adopted in
IPSec, and was recently made a
draft FIPS standard.
Another MAC is
which is due to
[Black, Halevi, Krawczyk, Krovetz, Rogaway].
It's the reigning speed champion for modern CPUs.
Why was PMAC invented?
PMAC addresses a particular problem of the CBC MAC: that it is
HMAC is the same way.
Going back to the earlier description of the basic CBC MAC,
note that you can't compute
C[i] until you've
This "bottleneck" can be a problem if you want to MAC at
very high speeds.
Dedicated hardware will be limited in speed
by the latency of the underlying cryptographic primitive.
And modern processors which offer up lots of registers and
instruction-dispatch units will be limited by the amount
of parallelism there is to exploit in the underlying block cipher.
PMAC was designed to address this issue, being fully
In saying that PMAC is fully parallelizable we mean that you are
effectively unlimited in how much parallelism you can extract.
We didn't want to gain parallel efficiency at the cost of serial
efficiency, and PMAC is about as efficient as the CBC MAC in
In making a modern MAC, we were also careful to fix the limitations
of the basic CBC MAC, giving a construction that
works correctly across arbitrary bit strings.
PMAC was designed largely in response to NIST's modes-of-operation
we thought it would be nice to provide NIST with a parallelizable MAC that
looked reasonably conventional.
one route towards making a parallelizable MAC is the
(which you can read about in the paper on UMAC, for example).
But good MACs from the Carter-Wegman approach
can be complicated to spec or implement well, and they don't
look like conventional modes at all.
From where does PMAC spring?
PMAC follows several related papers.
[Bellare, Guerin, Rogaway]
was the first to draw attention to the issue of parallelizability in
MAC constructions and to offer up a fully parallelizable MAC.
PMAC is a descendent of their XOR MACs.
Two further constructions build on the XOR MAC,
and PMAC, in turn, builds on them.
[Bernstein] ("How to stretch random functions")
gives a deterministic version of the XOR MAC.
in their XECB construction,
demonstrate an alternative to
concatenation-style encoding of message-blocks and message-indices.
Finally, papers by
[Petrank, Rackoff] and
focussed attention on dealing nicely with
strings of varying and arbitrary bit lengths.
Who invented PMAC?
I did (Phil Rogaway). But I
worked with John Black,
and the academic paper on PMAC will authored by the two of us.
What are some of PMAC's properties?
PMAC has been designed to
simultaneously have lots of desirable properties. These include the following.
- PMAC is actually more than a MAC: its a
Roughly said, this means that if K is random and unknown to the
adversary, f(x)=PMAC(K,x) will look like
a random function.
- PMAC is fully parallelizable, as we've explained.
- PMAC works on any bit string. In particular, you don't need
to think about padding or issues of length-variability; it's been
taken care of for you.
- PMAC works with any block cipher.
- PMAC provides a tag with as many bits as you want,
up to the block length n of the underlying block cipher.
- PMAC makes an optimal number of block-cipher calls for a conventional mode
of operation, namely, roundup(|M|/n).
(Actually, the formula is wrong for the empty string, which takes one block
- PMAC is deterministic: it uses no counter/IV/nonce and no random bits.
This has been true in all traditional MACs.
- PMAC uses only a single block-cipher key, and every invocation of the
block cipher is keyed by that one key.
- PMAC only use the forward direction of the underlying block cipher.
- PMAC key setup is very cheap: one block-cipher call and, if desired, a
few XORs and conditional (128-bit) shifts.
- PMAC generates a sequence of offsets, which it does in a very cheap way.
Each offset is computed from the previous one either by xoring the previous
offset by a value looked up in a small table or by
doing a few shifts and xors.
- PMAC is nearly endian-neutral: the scheme can be implemented just as fast
on a big-endian machine or on a little-endian machine.
- PMAC doesn't need much memory to run (and a memory-minimal implementation
doesn't give up much speed).
- PMAC avoids 128-bit addition (which is endian-biased and can be expensive
in software or dedicated hardware). It uses xors instead.
- PMAC is
incremental, in the sense of
[Bellare, Goldreich, and Goldwasser],
with respect to several common operations.
- PMAC is simple to understand and implement. It uses GF(2^128) arithmetic
and a Gray code, but it all comes down to some xors and shifts. One doesn't
have to understand the mathematics to implement the scheme.
- PMAC is provably secure. It provably meets its goals, as long as the
underlying block cipher meet a standard cryptographic assumption.
How do you compare OCB and PMAC? These are completely different beasts.
OCB provides privacy+authenticity, whereas
PMAC gets you authenticity alone. You don't need PMAC if you're using OCB.
Please describe how PMAC works
Assume you're using AES (with whatever key length you want).
First a little notation. Where L is a 128-bit
string I'll write L<<1 for the left shift of L by 1 bit
(where the first bit vanishes and a 0 comes into the last bit). Write
L>>1 for the the right shift of L by 1 bit (where the last
bit vanishes and a 0 comes into the first bit). For a nonzero number i,
let ntz(i) be the number of trailing 0-bits in the binary representation
of i (so, for example, ntz(52)=2, because 52 is 110100 in binary, which
ends in two zeros). Let ZERO be a block of 128 zero-bits.
Where x is a
string of at most 128 bits, let pad(x) be x if x has
exactly 128 bits, and let it be x followed by a 1-bit followed by
127-|x| 0-bits otherwise.
Let M be the message you want to MAC, and let
K be the PMAC key, which is just an AES key.
Let L be AES(K, ZERO). Let
L(0) be L and, for i>0, let L(i) be L<<1 if
the first bit of L(i-1) is 0, and let L(i) be
(L(i-1)<<1) xor 0x00000000000000000000000000000087
otherwise. Let L(-1) be L>>1 if the last bit of L is
0, and let L(-1) be L>>1 xor
0x80000000000000000000000000000043 otherwise. Break the message M into
blocks MM...M[m] where each block except the
last one has 128 bits. The last one may have fewer than 128 bits. Now we MAC
function pmac (K, M)
Offset = ZERO
Σ = ZERO
for i = 1 to m-1 do begin
Offset = Offset xor L(ntz(i))
Σ = Σ xor AES(K, Offset xor M[i])
Σ = Σ xor pad(M[m])
if |M[m]|=n then Σ = Σ xor L(-1)
FullTag = AES(K, Σ)
return a prefix of FullTag (of the desired length)
Might the algorithm be changed?
No; the PMAC algorithm is in its final form.
The writeup may continue to evolve here and there,
but only to improve the exposition. The algorithm itself
will not be changed.
What about patents?
I have patent filings which relate to PMAC, but
will license these patents under
fair, reasonable, and non-discriminatory terms.
All companies will be offered the same,
publicly disclosed, contract.
I expect licensees to pay a modest, one-time fee.
Here is a patent-assurance letter
I wrote for the IEEE.
Where can I learn more?
paper, of course!
Back to the PMAC homepage.