eSTREAM Optimized Code HOWTO

ECRYPT NoE
estreamtesting@ecrypt.eu.org
2005-11-01

Revision history:
1.0  2005-11-01  cdecanni  first public version
0.9  2005-09-26  cdecanni  first draft

This document describes the eSTREAM testing framework
and provides guidelines on how to write and submit optimized
code.

Introduction

One of the requirements imposed on all eSTREAM stream cipher
submissions was that they should demonstrate the potential to be
superior to the AES in at least one significant aspect. An aspect
which is particularly significant for Profile I candidates is
software performance.

Software performance can be measured in many different ways,
and in order to make comparisons as fair as possible, eSTREAM has
decided to develop a testing framework. The framework has two
objectives:

- assuring that all stream cipher proposals are submitted
to the same tests under the same circumstances;
- automating the test procedure as much as possible such
that new optimized implementations can be included and tested
with as little effort as possible.

This second goal requires some cooperation from the
submitters of optimized code, and the purpose of this document is
to provide guidelines on how to write code that can easily be
integrated in the testing framework.

Disclaimer

ECRYPT is a Network of Excellence within the Information
Society Technologies (IST) Programme of the European
Commission. The information in this note is provided as is, and
no guarantee or warranty is given or implied that the
information is fit for any particular purpose. The user thereof
uses the information at his or her sole risk and
liability.

Feedback

Feedback on this document is most welcome. Send
your additions, comments, and criticisms to the following email
address: estreamtesting@ecrypt.eu.org.

The Testing Framework: An Overview

The testing framework consists of a collection of scripts
and C-code which test three aspects of the submitted code:
API compliance,
correctness, and
performance. Many of these tests have been
borrowed from the NESSIE Test Suite.

API Compliance

The eSTREAM API is specified in the files ecrypt-sync.h
and ecrypt-sync-ae.h. The framework verifies whether the code
complies with this API by performing the following tests:

- It checks that the code provides the necessary
interfaces, i.e., that it compiles and links correctly with
the test code (ecrypt-test.c).
- It checks that the ECRYPT_KEYSIZE(i) and
ECRYPT_MAXKEYSIZE macros allow key sizes to be
enumerated as specified by the API. The same holds for IV and MAC
sizes.
- It checks that calls to the same functions with the
same parameters produce the same results, no matter how they
are interleaved. A failure of this test often
indicates that the code stores data in static variables, or
that it uses uninitialized variables.
- It checks that the incremental encryption functions
ECRYPT_encrypt_blocks and
ECRYPT_encrypt_bytes produce the same
ciphertext as ECRYPT_encrypt_packet
when fed with the same plaintext. It also verifies that this
ciphertext decrypts to the original plaintext.
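
The last three checks can be illustrated with a small self-contained sketch. Everything in it is hypothetical: the TOY_ macros and toy_ functions are stand-ins invented for this example (a deliberately insecure XOR scheme, not an eSTREAM cipher), and the loop condition used to enumerate key sizes is an assumption about the convention in the API header that should be checked against ecrypt-sync.h.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the values a cipher's header would define. */
#define TOY_MAXKEYSIZE 256
#define TOY_KEYSIZE(i) (128 + (i) * 64)

/* Count the key sizes reachable by enumeration, ASSUMING that index i is
   valid for as long as TOY_KEYSIZE(i) <= TOY_MAXKEYSIZE. */
int count_key_sizes(void)
{
    int i, n = 0;
    for (i = 0; TOY_KEYSIZE(i) <= TOY_MAXKEYSIZE; ++i)
        ++n;
    return n;
}

/* Toy context: all state lives here, none in static variables. */
typedef struct {
    unsigned char key;  /* toy 8-bit "key" */
    unsigned long pos;  /* number of keystream bytes produced so far */
} toy_ctx;

void toy_keysetup(toy_ctx *ctx, unsigned char key)
{
    ctx->key = key;
    ctx->pos = 0;
}

/* Resets all per-message state, so that a context can be reused. */
void toy_ivsetup(toy_ctx *ctx)
{
    ctx->pos = 0;
}

/* NOT a real cipher: XORs each byte with a position-dependent toy keystream. */
void toy_encrypt_bytes(toy_ctx *ctx, const unsigned char *in,
                       unsigned char *out, size_t len)
{
    size_t i;
    for (i = 0; i < len; ++i) {
        unsigned char ks = (unsigned char)(ctx->pos * 31u + 7u);
        out[i] = (unsigned char)(in[i] ^ ctx->key ^ ks);
        ctx->pos++;
    }
}

/* Returns 1 if encrypting in two chunks, interleaved with an unrelated
   session, gives the same ciphertext as a single call, and if that
   ciphertext decrypts back to the plaintext. */
int check_incremental_matches_one_shot(void)
{
    unsigned char pt[64], one_shot[64], chunked[64], other[64], back[64];
    toy_ctx a, b, c;
    size_t i;

    for (i = 0; i < sizeof pt; ++i)
        pt[i] = (unsigned char)i;

    toy_keysetup(&a, 0x5a);
    toy_ivsetup(&a);
    toy_encrypt_bytes(&a, pt, one_shot, sizeof pt);

    toy_keysetup(&b, 0x5a);
    toy_ivsetup(&b);
    toy_keysetup(&c, 0xc3);
    toy_ivsetup(&c);
    toy_encrypt_bytes(&b, pt, chunked, 24);
    toy_encrypt_bytes(&c, pt, other, sizeof pt);   /* interleaved session */
    toy_encrypt_bytes(&b, pt + 24, chunked + 24, sizeof pt - 24);

    toy_ivsetup(&a);
    toy_encrypt_bytes(&a, one_shot, back, sizeof pt); /* XOR: decrypt == encrypt */

    return memcmp(one_shot, chunked, sizeof pt) == 0
        && memcmp(back, pt, sizeof pt) == 0;
}
```

With these toy definitions, count_key_sizes() returns 3 (128, 192, and 256 bits) and check_incremental_matches_one_shot() returns 1. A variant that kept the stream position in a static variable instead of in toy_ctx would fail the second check as soon as the unrelated session is interleaved.
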
Correctness

The correctness of the code on different platforms is
verified by generating and comparing test vectors. For
convenience, eSTREAM has chosen to use the same format as the
NESSIE test vectors.

The test vectors currently included in the testing
framework were generated by eSTREAM and still need to be
verified by the designers.

Performance

Stream ciphers can be deployed in various situations, each
imposing specific requirements on the efficiency of the
primitive. Hence, defining a small set of performance criteria
which reflects all relevant implementation properties of a
stream cipher is not an easy task. In the current version of the
framework, eSTREAM has limited itself to four performance
measures. More detailed tests might be added in the future,
though.

Encryption rate for long streams

This is where stream ciphers have the biggest
potential advantage over block ciphers, and hence this
figure is likely to be the most important criterion in
many applications. The testing framework measures the
encryption rate by encrypting a long stream in chunks of
about 4KB using the
ECRYPT_encrypt_blocks function. The
encryption speed, in cycles/byte, is calculated by
measuring the number of bytes encrypted in 250
µsec. Note that the time to set up the key or the IV
is not considered in this test.

Packet encryption rate

While a block cipher is likely to be a better choice
when encrypting very short packets, it is still
interesting to determine at which length a stream cipher
starts to take the lead. Moreover, stream ciphers whose
encryption speeds do not deteriorate too much for small
packets could have a distinct advantage in applications
which use a wide range of packet sizes. The packet
encryption rate is measured by applying the
ECRYPT_encrypt_packet function to
packets of different lengths. Each call to
ECRYPT_encrypt_packet includes a
separate IV setup and, if authenticated encryption is
supported, a MAC finalization step. The packet lengths
(40, 576, and 1500 bytes) were chosen to be representative
of the traffic seen on the Internet (see References).
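
The figures the framework reports for these tests follow from simple arithmetic. The sketch below is not taken from the framework's source: the function names are invented for illustration, and the 7:4:1 packet weighting used for the "Simple Imix" average is an assumption, chosen because it reproduces the weighted average in the SNOW 2.0 sample output shown later in this section.

```c
#include <stddef.h>

/* Cycles per byte from a raw tick count and the number of bytes processed. */
double cycles_per_byte(double ticks, double bytes)
{
    return ticks / bytes;
}

/* Throughput in Mbit/s for a CPU running at cpu_mhz. */
double mbps(double cpb, double cpu_mhz)
{
    return 8.0 * cpu_mhz / cpb;
}

/* Overhead (in percent) of a measurement relative to the long-stream rate. */
double overhead_percent(double cpb, double stream_cpb)
{
    return (cpb / stream_cpb - 1.0) * 100.0;
}

/* Weighted average in cycles/byte over a packet mix.
   weight[i] is the relative number of packets of length packet_len[i]. */
double mix_cycles_per_byte(const double *cycles_per_packet,
                           const double *packet_len,
                           const double *weight, size_t n)
{
    double cycles = 0.0, bytes = 0.0;
    size_t i;
    for (i = 0; i < n; ++i) {
        cycles += weight[i] * cycles_per_packet[i];
        bytes += weight[i] * packet_len[i];
    }
    return cycles / bytes;
}
```

With the SNOW 2.0 numbers from the sample output (415015 ticks for 22 blocks of 4096 bytes at 1694.8 MHz), cycles_per_byte gives about 4.61 and mbps about 2944. The 40-byte packet test (411499 ticks for 350 packets) gives about 29.39 cycles/byte, an overhead of about 538% relative to the long-stream rate, and weighting the three cycles/packet figures 7:4:1 gives about 7.35 cycles/byte.
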
Agility

When an application needs to encrypt many streams in
parallel on a single processor, its performance will not
only depend on the encryption speed of the cipher, but
also on the time spent switching from one session to
another. This overhead is typically determined by the
number of bytes of ECRYPT_ctx that need to be
stored or restored during each context switch. In order to
build a picture of the agility of the different
submissions, the testing framework performs the following
test: it first initiates a large number of sessions
(filling 16MB of RAM with ECRYPT_ctx
structures), and then encrypts streams of plaintext in
short blocks of around 256 bytes using
ECRYPT_encrypt_blocks, each time
jumping from one session to another.

Key and IV setup (+ MAC generation)

The last test in the testing framework separately
measures the efficiency of the key setup
(ECRYPT_keysetup) and the IV setup
(ECRYPT_ivsetup). Given that each
call to ECRYPT_AE_ivsetup comes
together with a call to
ECRYPT_AE_finalize, both functions
are benchmarked together in case of authenticated stream
ciphers. This is probably the least critical of the four
tests, considering that the efficiency of the IV setup is
already reflected in the packet encryption rate, and that
the time for the key setup will typically be negligible
compared to the work needed to generate and exchange the
key.

The different tests are illustrated below with an example
for SNOW 2.0. The latest results for all submissions, measured
by eSTREAM on various platforms, can be found in the section
Latest Performance Figures.

Output of performance tests

Primitive Name: SNOW-2.0
========================
Profile: SW
Key size: 128 bits
IV size: 128 bits
CPU speed: 1694.8 MHz
Cycles are measured using RDTSC instruction
Testing memory requirements:
Size of ECRYPT_ctx: 108 bytes
Testing stream encryption speed:
Encrypted 22 blocks of 4096 bytes (under 1 keys, 22 blocks/key)
Total time: 415015 clock ticks (244.87 usec)
Encryption speed (cycles/byte): 4.61
Encryption speed (Mbps): 2943.95
Testing packet encryption speed:
Encrypted 350 packets of 40 bytes (under 10 keys, 35 packets/key)
Total time: 411499 clock ticks (242.80 usec)
Encryption speed (cycles/packet): 1175.71
Encryption speed (cycles/byte): 29.39
Encryption speed (Mbps): 461.29
Overhead: 538.2%
Encrypted 120 packets of 576 bytes (under 10 keys, 12 packets/key)
Total time: 416341 clock ticks (245.66 usec)
Encryption speed (cycles/packet): 3469.51
Encryption speed (cycles/byte): 6.02
Encryption speed (Mbps): 2250.95
Overhead: 30.8%
Encrypted 50 packets of 1500 bytes (under 1 keys, 50 packets/key)
Total time: 395528 clock ticks (233.38 usec)
Encryption speed (cycles/packet): 7910.56
Encryption speed (cycles/byte): 5.27
Encryption speed (Mbps): 2570.96
Overhead: 14.5%
Weighted average (Simple Imix):
Encryption speed (cycles/byte): 7.35
Encryption speed (Mbps): 1844.62
Overhead: 59.6%
Testing key setup speed:
Did 7000 key setups (under 10 keys, 700 setups/key)
Total time: 446655 clock ticks (263.54 usec)
Key setup speed (cycles/setup): 63.81
Key setup speed (setups/second): 26561211.67
Testing IV setup speed:
Did 500 IV setups (under 10 keys, 50 setups/key)
Total time: 397912 clock ticks (234.78 usec)
IV setup speed (cycles/setup): 795.82
IV setup speed (setups/second): 2129634.19
Testing key agility:
Encrypted 270 blocks of 256 bytes (each time switching contexts)
Total time: 412653 clock ticks (243.48 usec)
Encryption speed (cycles/byte): 5.97
Encryption speed (Mbps): 2271.07
Overhead: 29.6%
End of performance measurements

Installing the Testing Framework

A tarball of the latest version of the testing framework can
always be downloaded from ECRYPT's SVN
repository. This repository contains the most recent
implementations of all stream cipher candidates, together with
test vectors, test scripts, and a few benchmark ciphers (AES in
CTR mode, RC4, SNOW 2.0).

The scripts expect a shell compatible with GNU Bash, system
utilities compatible with GNU
Coreutils, and one (or more) ANSI C compiler(s). The
following sections discuss how these requirements can be fulfilled
on various platforms.

x86 Live CD

The easiest way to run the testing framework on an x86
platform is to download eSTREAM's bootable Live CD. The CD
allows the framework to run without any installation and without
affecting the existing configuration on the host machine in any
way. The Live CD is based on Ubuntu and
includes:

- a stripped-down version of Ubuntu 5.04;
- a working copy of the testing framework;
- different versions of GCC (2.95, 3.3, 3.4, and 4.0);
- Intel C++ Compiler 8.1 for Linux;
- Microsoft Visual C++ Toolkit 2003.

To run the Live CD, complete the following steps:

1. Download the ISO file (about 400MB) and burn it on a CD.
2. If you want to use the Intel C Compiler, you will need
a license file. You can obtain a free non-commercial license
by subscribing here.
If you store the license file on a memory stick, the Live CD
will recognize it automatically.
3. Boot the CD and choose your language, keyboard layout,
etc.
4. When the screen shown in the Live CD screenshot
comes up, double-click on the ECRYPT test
suite icon. This will install the testing framework
in ~/ecrypt-test-suite, fetch
updates from the eSTREAM server if requested, and
immediately launch the testing process (see Running Tests).
The tests can be aborted at any time by pressing Ctrl+C.

The Live CD does not store anything on the hard disk. Any
changes you make will be lost after reboot, unless you
copy them manually (e.g., on a memory stick).

[Figure: Screenshot of Live CD]

GNU/Linux

The testing framework should install and run without
problems on any recent Linux distribution. Here are the
installation instructions:

1. Download and untar the tarball of the testing framework:

    $
    $

2. Correct the permissions of the scripts (this fix is
necessary because the current version of ViewCVS does
not understand the svn:executable
property).

    $
    $

3. Make sure you have a compiler (GCC) installed and
configure the framework as explained in Configuring.

Microsoft Windows

The shell and system utilities required by the testing
framework are not present on a standard Windows
platform. Fortunately, there exist several freely available
software packages which provide this functionality. The
instructions below explain how to install the framework using
the MinGW/MSYS
packages.

1. First install an ANSI C compiler. The testing
framework currently detects two compilers under
Windows:

  - Microsoft C/C++ Optimizing Compiler, which is
  included in Microsoft Visual Studio and can be
  downloaded separately from MSDN.
  - GCC, which has been ported to Windows by the MinGW project
  (amongst others). The installation program is available
  here.

2. Install the MSYS shell. The current version can be
downloaded here. If you plan to use the MinGW compiler,
make sure to install it before installing MSYS.

3. Download the tarball of the testing framework and store
it in your MSYS home directory. In a default installation,
this directory is located in C:\msys\1.0\home\%USERNAME%.
Open an MSYS terminal and extract the tarball:

    $
    $

4. Configure the framework as explained in Configuring.

If the configuration script does not detect the
Microsoft Compiler, this probably indicates that the
environment variables PATH,
INCLUDE, and LIB are not set
correctly. The correct values can be found in a file named
vcvars32.bat. You can either add
these variables to your system in
Control Panel > System > Advanced > Environment Variables,
or (for a default installation) add the
following line in C:\msys\1.0\msys.bat:

    C:\Program Files\Microsoft Visual C++ Toolkit 2003\vcvars32.bat

UNIX Platforms

The installation instructions for UNIX platforms are the
same as the ones given in the GNU/Linux section. However,
depending on your operating system, some of the scripts might
fail to run correctly because of small compatibility
problems. The easiest way to avoid this is to replace some of
the UNIX utilities by their GNU equivalents:

- GNU Bash
- GNU Make
- GNU Coreutils

With these tools installed, the framework is known to run
correctly on the following platforms:

- HP-UX 11.00 (PA-RISC version) with HP C/HP-UX Version
B.11.11.02 and/or GCC.
- Solaris 8 (SPARC edition) with SUN Forte Developer 7 C
5.4 and/or GCC.
- Tru64 UNIX V5.1B (Alpha) with Compaq C V6.5-011 and/or
GCC.

Other Systems

It is probably not too hard to make the framework run on
other systems (e.g., Mac OS X on PowerPC). As soon as we have a
chance to test it, we will update this document.

Running Tests

This section explains how to use the scripts included in the
testing framework. The three most important scripts are called
configure, run, and
collect and are located in the directory
./scripts. A fourth script,
called start, runs the three previous commands
one after another.

In order to avoid having to prefix all commands with
./scripts/, add the scripts directory to
the PATH variable:

    $

Configuring

The configure script searches the path
for compilers, tests which compiler options are supported, and
collects information about the CPU. All information is stored in
the directory ./reports-$HOSTNAME. The first part
of the script's output is reproduced below:

    $
    [...] y

If the list of executables above contains programs which
you definitely do not want the script to run and test, simply
press n, edit the file
./reports-$HOSTNAME/candidates, and start
the script again.

The final list of supported compilers and options is
stored in
./reports-$HOSTNAME/compilers. Note that
this first stage is normally only executed when the script is
launched for the very first time. The script will only perform a
new search if the file
./reports-$HOSTNAME/compilers has been
deleted.

The second part of the script provides the possibility to
enable or disable any compiler previously detected. The set of
active compilers can be modified at any time by running
configure again.

    [...] to select everything: 1,2,3

The result of the script is a list of configuration files
in ./reports-$HOSTNAME/configs, each
of which defines a compiler and a combination of flags. Which
configurations will eventually be used during the benchmarking,
and in which order, is determined by the file
./reports-$HOSTNAME/shortlist, constructed
in the next step of the script:

    [...] 5
    [...] y

If you want the script to finish immediately after all
configurations in the shortlist have been tested, answer
n. The framework's default behavior is to
start testing compiler options which are not on the list, until
the script is either interrupted by the user (pressing
Ctrl+C),
or all configuration files in ./reports-$HOSTNAME/configs have
been tested.

The last task of the configuration script is to determine
the CPU's clock frequency. The correct frequency should
automatically be detected on most platforms:

    [...] if this is correct, or enter the clock speed:

Launching Tests

The actual tests are launched with the command
run. When invoked without arguments, the
script will run through all implementations in the test suite,
compile each of them using the current compiler settings, and
perform the tests described in The Testing Framework: An Overview. This
is repeated for all compiler configurations in ./reports-$HOSTNAME/configs (or at
least those in the shortlist), and the results are stored in the
current working directory. The following example shows how
run is invoked in the
start script (all test reports are stored in
./reports-$HOSTNAME in
this case):

    $
    $

An optional argument can be used to specify the directory
which contains the implementations to be tested. If the
specified path does not point to an existing directory, the
script will check whether it matches a directory in submissions or benchmarks. The command below, for
example, will only test the implementation of SNOW 2.0 (assuming
that the scripts
directory is in the PATH):

    $
    $
    $

On some platforms, the testing framework might use the
standard clock() function to measure
timings. Unfortunately, this function has a rather low
resolution on most platforms. As a consequence, tests need to
be run for several seconds in order for the timing results to
be accurate. Depending on the number of primitives and
compiler options, the benchmarking can therefore take a very
long time. The tests can be aborted at any time by pressing
Ctrl+C,
though.

Collecting the Results

Once the tests are finished (or have been aborted by the
user), the results can be collected with the
collect command. This script traverses the
current working directory tree and creates an HTML report (named
index.html) summarizing the benchmark
results for each subdirectory it encounters. For examples of the
HTML output, see Latest Performance Figures.

Submitting Optimized Code

This section is a step-by-step guide to writing code which
can easily be integrated into the testing framework. Please make
sure to follow the steps described below before submitting
optimized code to the eSTREAM project.

Step 1: Install the Testing Framework

First, download, install, and configure the testing
framework as explained in detail in Installing the Testing Framework
and Running Tests.

Step 2: Edit Source Files

The easiest way to create a new API-compliant
implementation of a stream cipher is to copy and edit the
reference implementation included in the testing
framework:

    $
    $
    $

There are a number of issues that should be taken into
account when editing the source code:

Portability

The optimized code should compile under any ANSI C
compiler and run on any platform. This does not mean that
compiler or platform specific extensions cannot be
used. However, if non-standard constructs are used, then
the code should first check if the extensions are
supported by the compiler, and if not, provide (possibly
non-optimized) alternatives. Here is an example:

    #if defined(_MSC_VER) && (_MSC_VER >= 1300)
    /* optimized code using Microsoft Visual C++ .NET extensions */
    #elif defined(__x86_64__)
    /* code optimized for the AMD64 architecture */
    #else
    /* standard C code */
    #endif

The code should also anticipate possible endianness
and data alignment problems when running on non-x86
platforms. This involves the following measures:

- Use the UXTOY_LITTLE or
UXTOY_BIG macros defined in
ecrypt-portable.h anywhere words are stored into bytes
or the other way around. These macros will change the
order of the bytes on all platforms where it is
required.
- Align all memory accesses. Several UNIX machines
will generate bus errors when words loaded from or
stored into the memory are not aligned on multiples of
the word size. Note that the code can safely assume that
all byte strings passed to the API are aligned on
multiples of the largest word size supported on the
machine (e.g., 128 bit on modern Pentium 4
processors).

Readability

Avoid writing unnecessarily complex code, i.e.:

- Only implement the functions strictly required by
the API. Code providing additional debugging
functionality, interactive user interfaces, etc. should
be disabled (using #ifndef ECRYPT_API), or
better yet, completely removed.
- Remove unused variables and 'dead' code.
- Manually unroll loops only if this makes the code
significantly faster.

Assembly

An efficient C implementation, compiled with a
modern compiler and with the proper optimization flags, is
often pretty difficult to beat using hand-coded
assembly. However, if you feel that your implementation
could significantly benefit from assembly, then here are
some guidelines:

- Always analyze the machine code generated by your
C compiler before writing your own assembly.
- If the only purpose of the assembly is to take
advantage of SIMD instructions, consider replacing it
with C intrinsics (the MMX intrinsics defined in mmintrin.h, for example,
are supported by GNU, Intel, and Microsoft).
- Avoid the use of assembly in non-critical parts of
your implementation. The preferred approach is to use
only a few short blocks of inline assembly in the inner
loops of your algorithm. As before, make sure that the
code checks whether the compiler supports the assembly
syntax, for example:

    [...] >> 27)) + c;
    #endif

- While eSTREAM does not encourage this, plain
assembly implementations can also be submitted, provided
that they can be processed by GCC (i.e.,
.s or .S
files). Here is an example of a .S
file:

Step 3: Run Tests

Before submitting an optimized implementation, you should
make sure that (1) it interacts correctly with the testing
framework, and (2) it is indeed more efficient than the existing
code. In order to verify these conditions, create a test
directory and run the tests as described in Running Tests:

    $
    $
    $
    $

The scripts should not return any error. If they do, then
the code is probably not API compliant. The sections below
describe a number of common problems and explain how to resolve
them.

The compilation fails

If the compilation fails, check the error messages in
errors_*. Typical compilation errors
are:

- undefined reference to `ECRYPT_init':
this message indicates that your code does not
define the ECRYPT_init
function. This function should always be defined, even
if it is left empty.
- syntax error before '/' token:
replace non-standard
// comments by
/* ... */.
- syntax error before "u64":
on 32-bit platforms, u64 variables
are defined as unsigned long long. This
type exists in the ISO C99 standard, but not in ISO
C89. Therefore, if your code uses u64
variables, replace the third line of the
Makefile by:

The execution fails when trying to generate test vectors

If the script refuses to generate test vectors, this
typically indicates that the code did not pass the API
compliance tests described in API Compliance. The file errors_*
reports which tests failed. In order to correct these
problems, make sure that:

- all variables that need to be transferred from one
function call to another are stored in the
ECRYPT_ctx structure, and not in static
variables or dynamically allocated memory;
- all variables are initialized;
- the function ECRYPT_ivsetup
reinitializes all variables whose values were changed
during the encryption with previous IVs.

Another common cause of failed execution
on 64-bit machines is bus errors due to misaligned
data. When loading or storing words into the memory, always
make sure that they are aligned on multiples of the word
size (see also Portability).

The test vectors do not match

If the test vectors generated by the scripts differ from
the ones included in the testing framework, this might have
two causes:

- There is a bug in the optimized
implementation. Endianness issues are a common problem
on UNIX platforms. As explained under Portability, always
use the UXTOY_LITTLE or
UXTOY_BIG macros when translating
bytes into words or vice versa.
- There was a bug in the original reference code. In
this case new test vectors need to be generated. This can
be done by issuing the following command in the source
directory:

    $

If you do not have GCC installed, you will also need
to specify which configuration file (see Configuring) to use, for example:

The command above will generate a file called
unverified.test-vectors, which can be
renamed to verified.test-vectors once
the correctness of the test vectors has been
verified:

    $

Step 4: Submit Code

Optimized implementations which pass all tests described
above should be mailed to
estreamtesting@ecrypt.eu.org. The mail should
contain the following information:

- A .tar.gz or .zip attachment containing the following
files (and only these files):
  - an API compliant header file (i.e., the ecrypt-sync.h or
  ecrypt-sync-ae.h file);
  - the .c file (and .h files, if any) implementing the
  primitive;
  - a Makefile (this file should normally not have been
  modified);
  - a file called verified.test-vectors containing
  the correct test vectors (only if the existing test
  vectors happened to be incorrect).
- A confirmation that the test vectors included in the
testing framework have been verified.
- A note stating whether or not the implementation
should replace the existing reference code.

Latest Performance Figures

The table below links to the most recent reports produced by
the eSTREAM testing framework. These reports will regularly be
updated as new implementations are submitted. It is important to
emphasize that the current results are very preliminary: the
implementations currently included in the framework only serve as
reference code, and are not necessarily optimized.

Frequently Asked Questions

Send your questions to
estreamtesting@ecrypt.eu.org. Frequently asked
questions will be added to this section.

Further Information

For further information about the eSTREAM project, please
visit the eSTREAM webpage and the
discussion
forum.

References

Agilent Technologies. JTC 003 Mixed Packet Size Throughput.