OpenSSLNTRU announces software online for a demo of web browsing taking just 166000 Haswell cycles to generate a new one-time sntrup761 public key for each TLS 1.3 session. This demo uses
1. the Gnome web browser (client) and
2. a patched version of OpenSSL 1.1.1g using
3. a new OpenSSL ENGINE using
4. a new sntrup761 library.
This is joint work.
Authors in alphabetical order:
Daniel J. Bernstein,
Billy Bob Brumley,
Ming-Shing Chen
(leader for #4),
and Nicola Tuveri
(leader for #3).
The new keygen speed is much faster
than previously announced speeds for sntrup761 keygen.
In combination with the (recently announced) 48780 Haswell cycles for enc
and 59120 Haswell cycles for dec,
this new keygen speed
means a total of just 273900 cycles for sntrup761 keygen+enc+dec.
The TLS 1.3 integration here uses the same basic data flow as the CECPQ2 experiment carried out by Google and Cloudflare: the client generates a one-time public key, the server encapsulates to that one-time key, and the client decapsulates, obtaining a one-time session key. Beware that this data flow is designed only to protect against attacks by future quantum computers ("transitional" security); stopping active attacks will also require long-term post-quantum identity keys.
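The data flow described above can be sketched as three calls: keygen on the client, encapsulation on the server, decapsulation on the client. The sketch below is only an illustration of who calls what. It assumes a toy Diffie-Hellman-style KEM over a prime field, which is neither post-quantum nor secure and merely stands in for sntrup761; the function names `keygen`, `encap`, and `decap` are hypothetical and are not the OpenSSL or engntru API.

```python
import hashlib
import secrets

# Toy parameters (assumption): a well-known prime and a small base.
# This stands in for sntrup761 only to show the message flow.
P = 2**255 - 19
G = 2


def keygen():
    """Client: create a one-time key pair."""
    sk = secrets.randbelow(P - 2) + 1
    pk = pow(G, sk, P)
    return pk, sk


def encap(pk):
    """Server: encapsulate to the client's one-time public key."""
    r = secrets.randbelow(P - 2) + 1
    ciphertext = pow(G, r, P)
    shared = pow(pk, r, P)
    session_key = hashlib.sha256(shared.to_bytes(32, "big")).digest()
    return ciphertext, session_key


def decap(ciphertext, sk):
    """Client: recover the same session key from the ciphertext."""
    shared = pow(ciphertext, sk, P)
    return hashlib.sha256(shared.to_bytes(32, "big")).digest()


# Client -> server: one-time public key.
client_pk, client_sk = keygen()
# Server -> client: ciphertext; the server keeps its copy of the session key.
ciphertext, server_key = encap(client_pk)
# Client: decapsulate to obtain the one-time session key.
client_key = decap(ciphertext, client_sk)

print("keys match:", client_key == server_key)
```

Only the public key and the ciphertext cross the network, which is why the traffic figures quoted on this page count exactly those two objects.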
CECPQ2 used (a minor variant of)
ntruhrss701 for this data flow.
The state-of-the-art (March 2020) software for ntruhrss701
takes 272028 cycles for keygen,
26116 cycles for enc,
and 63632 cycles for dec,
for a total of 361776 cycles.
The CECPQ2 experiments showed that
ntruhrss701's CPU time
consumes very little of the overall TLS time.
The sntrup761 software here
consumes even less time.
The CECPQ2 experiments
showed a somewhat more noticeable impact of network traffic
on the slowest connections;
sntrup761 sends 2197 bytes (one-time key+ciphertext),
while ntruhrss701 sends 2276.
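Here's the comparison table
(all numbers are from SUPERCOP except for the new 166000 for sntrup761 keygen):

    system         keygen     enc     dec    total   bytes
    sntrup761      166000   48780   59120   273900    2197
    ntruhrss701    272028   26116   63632   361776    2276
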
This should put an end to the idea
that sntrup761 keygen is too slow for TLS.
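As a cross-check of the figures quoted above, the short sketch below recomputes the keygen+enc+dec totals and also evaluates the 1000*bytes+cycles metric that this page uses when discussing CECPQ1. All inputs are the cycle and byte counts stated in the text; the dictionary layout and function names are illustrative choices only.

```python
# Haswell cycle counts and bytes on the wire, as quoted in the text.
systems = {
    "sntrup761": {"keygen": 166000, "enc": 48780, "dec": 59120, "bytes": 2197},
    "ntruhrss701": {"keygen": 272028, "enc": 26116, "dec": 63632, "bytes": 2276},
}


def total_cycles(s):
    """Cycles for one keygen, one enc, and one dec."""
    return s["keygen"] + s["enc"] + s["dec"]


def combined_metric(s):
    """The 1000*bytes+cycles metric mentioned in the CECPQ1 discussion."""
    return 1000 * s["bytes"] + total_cycles(s)


for name, s in systems.items():
    print(f"{name:12} total={total_cycles(s):7} metric={combined_metric(s):8}")
```

Under these figures sntrup761 comes out lower than ntruhrss701 both in total cycles and in the combined metric.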
Both sntrup761 and ntruhrss701 are designed for IND-CCA2 security,
as recommended in most of the NISTPQC lattice submissions
and in Google's CECPQ2 announcement:
"CCA2-security is worthwhile, even though TLS can do without. ...
CPA vs CCA security is a subtle and dangerous distinction,
and if we're going to invest in a post-quantum primitive,
better it not be fragile."
Taking away IND-CCA2 security
would speed up both sntrup761 and ntruhrss701
by removing some hashing and removing (basically) a copy of enc from dec.
Google's earlier CECPQ1 experiment used
an early non-IND-CCA-secure version of NewHope
with approximately 200000 cycles of total computation
and more than 4000 bytes of network traffic
(more than 4 million in the 1000*bytes+cycles metric),
and concluded that this "would be practical to quickly deploy".
The new sntrup761 keygen speed
comes from generating 32 independent keys at once,
using Montgomery's trick for batch inversion.
This option has been pointed out before.
The new demo shows that this option
fits into a CECPQ2-type data flow in TLS 1.3.
The total latency of generating 32 keys is around two milliseconds;
keys can be generated in advance of being used,
reducing the impact on TLS latency to zero
(with or without Montgomery's trick).
Of course, one still has to generate each new key at some point,
but the new keygen speed
shows that Montgomery's trick provides excellent throughput.
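Generating keys in advance can be pictured as a small pool that is refilled one batch at a time and hands out one-time keys on demand. The sketch below is a minimal illustration under stated assumptions: `KeyPool` and `placeholder_batch_keygen` are hypothetical names, the "keys" are dummy strings, and nothing here describes how engntru actually schedules its batches.

```python
from collections import deque
from itertools import count

_serial = count()


def placeholder_batch_keygen(n):
    """Hypothetical stand-in for a batched keygen: returns n dummy key pairs."""
    pairs = []
    for _ in range(n):
        i = next(_serial)
        pairs.append((f"pk-{i}", f"sk-{i}"))
    return pairs


class KeyPool:
    """Hands out one-time key pairs, refilling a whole batch at a time."""

    def __init__(self, batch_keygen, batch_size=32):
        self._batch_keygen = batch_keygen
        self._batch_size = batch_size
        self._keys = deque()
        self.batches = 0  # number of batch computations so far

    def refill(self):
        """Compute one batch; this can run before any connection arrives."""
        self._keys.extend(self._batch_keygen(self._batch_size))
        self.batches += 1

    def take(self):
        """Return an unused key pair, computing a new batch only if empty."""
        if not self._keys:
            self.refill()
        return self._keys.popleft()


pool = KeyPool(placeholder_batch_keygen)
pool.refill()                            # done ahead of time
keys = [pool.take() for _ in range(64)]  # 64 sessions, each with a fresh key
print("batches computed:", pool.batches)
```

Each key is used once and never returned to the pool, matching the one-time-key data flow; a session only waits on keygen if the pool happens to be empty.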
Montgomery's trick replaces each batch of inversions with one shared inversion and a batch of multiplications. In the context here, there is a batch of 32 inversions mod q and a batch of 32 inversions mod 3, using 1 shared inversion mod q and 1 shared inversion mod 3. Out of the 166000 cycles per key for a batch of 32 keys, about 30000 cycles per key are spent on the shared inversions, and simply increasing the batch size further reduces this cost. With slightly more work it is possible to share transforms across various multiplications. Consequently, the current software speed is not the limit of what can be achieved.
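The idea of replacing a batch of inversions with one shared inversion plus multiplications can be sketched on plain integers mod q = 4591 (the sntrup761 modulus). This is a simplification: in sntrup761 keygen the inverted objects are polynomials, not integers, and the function name `batch_inverse` is illustrative rather than taken from the library.

```python
Q = 4591  # the prime modulus q of sntrup761


def batch_inverse(values, q):
    """Invert every element of `values` mod q using one modular inversion."""
    # Running products: prefix[i] = values[0] * ... * values[i] mod q.
    prefix = []
    acc = 1
    for v in values:
        if v % q == 0:
            raise ZeroDivisionError("zero is not invertible")
        acc = acc * v % q
        prefix.append(acc)

    # The single shared inversion (three-argument pow with -1 needs Python 3.8+).
    inv_acc = pow(acc, -1, q)

    # Walk backwards, peeling one factor off the shared inverse per step.
    inverses = [0] * len(values)
    for i in range(len(values) - 1, -1, -1):
        before = prefix[i - 1] if i > 0 else 1
        inverses[i] = inv_acc * before % q  # inverse of values[i]
        inv_acc = inv_acc * values[i] % q   # now inverse of prefix[i - 1]
    return inverses


batch = list(range(1, 33))  # a batch of 32 nonzero values
inverses = batch_inverse(batch, Q)
print(all(v * w % Q == 1 for v, w in zip(batch, inverses)))
```

For a batch of n values this uses one inversion and about 3(n-1) multiplications, so the per-element share of the inversion cost shrinks as the batch grows.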
One can also use Montgomery's trick
for some other NISTPQC submissions
that rely on inversion as part of keygen,
but the dramatic speedup for sntrup761
doesn't imply a similarly dramatic (or even nonzero) speedup
for those other submissions.
For example, ntruhrss701 already exploits the power-of-2 structure of its q
for a Hensel lift.
In the Montgomery context,
the Hensel speedup rapidly vanishes,
while multiplication speeds and other overheads
become more important.
There's some gap
in multiplication speed between sntrup761 and ntruhrss701
(sntrup761 aims for a higher security level,
uses larger polynomials,
and requires a field),
but this is only about 8000 cycles
per multiplication with the current software.
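The Hensel lift mentioned above can be illustrated on integers: an inverse modulo 2 is lifted to an inverse modulo 2^13 = 8192 (the q of ntruhrss701) by a Newton-style step that doubles the precision each round. This is a sketch on integers only; the real computation lifts polynomial inverses, and the function name is illustrative.

```python
def inverse_mod_power_of_two(a, k):
    """Return x with a * x == 1 (mod 2**k), for odd a, by Hensel lifting."""
    if a % 2 == 0:
        raise ValueError("a must be odd to be invertible mod a power of 2")
    x = 1     # a * 1 == 1 (mod 2), since a is odd
    bits = 1  # x is currently correct modulo 2**bits
    while bits < k:
        bits = min(2 * bits, k)
        modulus = 1 << bits
        # If a*x == 1 + e*2**b, then a*x*(2 - a*x) == 1 - e**2 * 2**(2*b).
        x = x * (2 - a * x) % modulus
    return x


K = 13  # 2**13 == 8192
inv = inverse_mod_power_of_two(4591, K)
print(4591 * inv % (1 << K))
```

Each round costs only a couple of multiplications, which is why a power-of-2 modulus makes a single inversion cheap; once Montgomery's trick leaves just one shared inversion per batch, that saving is spread across the whole batch.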
Demo instructions appear below.
This demo comes with no cryptographic warranties
and no other security warranties.
The software here is experimental,
and is built upon other software
with a long history of security problems,
such as OpenSSL.
The purpose of this demo
is purely to show the speed
achievable with a CECPQ2-type data flow for TLS 1.3.
The demo has two parts: a server side and a client side. We recommend running each side in its own VM.
The server side uses
stunnel for SSL termination.
It receives TLS connections,
and passes along the answers
provided by a preexisting back-end web server,
which does not need to support sntrup761.
For example, the demo site https://test761.cr.yp.to
looks just like the preexisting site
but with the extra feature of supporting sntrup761:
stunnel passes requests along through a local connection
to the preexisting back-end web server.
You can use https://test761.cr.yp.to
as the server side of this demo,
or you can set up the server side
for a web server of your choice.
The client side uses Epiphany,
the Gnome web browser,
with no modifications to the Epiphany source code.
The glib-networking library used inside Epiphany
already supports OpenSSL as an option for outgoing connections,
and is configured below to use this option.
Both sides use a version of OpenSSL 1.1.1g
patched inside libssl to support sntrup761
as experimental group 0xfe00 for TLS 1.3,
and patched inside libcrypto
to include a reference implementation of sntrup761.
The engntru ENGINE then overrides this reference implementation
with a fast implementation,
which in turn is built on top of our new libsntrup761 library.
This way of using the OpenSSL ENGINE feature
allows OpenSSL to take advantage of fast software implementations
while allowing those implementations
to be developed in separate libraries.
Various other applications that use OpenSSL
have been verified to work with this setup.
This demo focuses on
stunnel on the server side
and Epiphany on the client side.
The following instructions for setting up the server side
have been tested in a VM
running Debian 11 (Bullseye)
on a CPU supporting AVX2.
You can skip down to the client side
if you simply want to try https://test761.cr.yp.to
as the server.
As root:

    apt install wget python3 build-essential clang cmake ruby pkg-config -y
    adduser --disabled-password --gecos opensslntru opensslntru

As the new opensslntru user
(change the first three lines
for your own demo server name,
demo server address,
and preexisting back-end server address—of course,
you should use your favorite VPN
to protect the connection
from this SSL terminator to the back-end server):

    EXTERNALNAME=test761.cr.yp.to
    EXTERNALADDRESS=18.104.22.168:65024 # provide TLS service on this address
    INTERNALADDRESS=22.214.171.124:80 # use existing server on this address

    export PATH=$HOME/bin:$PATH

    cd
    wget https://www.openssl.org/source/openssl-1.1.1g.tar.gz
    wget https://ntruprime.cr.yp.to/opensslntru/openssl-1.1.1g-ntru.patch
    tar -xf openssl-1.1.1g.tar.gz
    mv openssl-1.1.1g openssl-1.1.1g-ntru
    cd openssl-1.1.1g-ntru
    patch -p1 < ../openssl-1.1.1g-ntru.patch
    ./config shared --prefix=$HOME --openssldir=$HOME -Wl,-rpath=$HOME/lib
    make -j8 # a few minutes
    make test # more minutes
    make install_sw

    cd
    wget https://ntruprime.cr.yp.to/opensslntru/libsntrup761-20200415.tar.gz
    tar -xf libsntrup761-20200415.tar.gz
    cd libsntrup761-20200415
    env USE_RPATH=RUNPATH DESTDIR=$HOME CPATH=$HOME/include LIBRARY_PATH=$HOME/lib make all install test

    cd
    wget https://ntruprime.cr.yp.to/opensslntru/engntru-20200415.tar.gz
    tar -xf engntru-20200415.tar.gz
    cd engntru-20200415
    mkdir build
    cd build
    cmake -DCMAKE_PREFIX_PATH="$HOME;$HOME/usr/local" ..
    make
    make test
    make install

    cd
    wget https://www.stunnel.org/downloads/stunnel-5.56.tar.gz
    tar -xf stunnel-5.56.tar.gz
    cd stunnel-5.56
    ./configure --prefix=$HOME --with-ssl=$HOME LDFLAGS=-Wl,-rpath=$HOME/lib
    make
    make install

    cd
    mkdir service
    cd service
    openssl req -x509 -sha256 -nodes -newkey rsa:2048 -keyout "$EXTERNALNAME.key" -days 730 -out "$EXTERNALNAME.crt" -subj "/CN=$EXTERNALNAME" -config /etc/ssl/openssl.cnf
    (
      echo "key = $EXTERNALNAME.key"
      echo "cert = $EXTERNALNAME.crt"
      echo 'foreground = yes'
      echo 'engine = engntru'
      echo 'engineDefault = ALL'
      echo '[forward]'
      echo "accept = $EXTERNALADDRESS"
      echo "connect = $INTERNALADDRESS"
      echo 'curves = SNTRUP761:X25519:P-256'
      echo 'config = MinProtocol:TLSv1.2'
      echo 'ciphers = ECDHE+CHACHA20:ECDHE+AES256:ECDHE+AES128:!aNULL:!eNULL:!LOW:!EXPORT:!DES:!3DES:!RC4:!MD5:!PSK:!SRP:!DSS:!aECDSA'
    ) > stunnel.conf

As root:

    (
      echo '[Unit]'
      echo 'Description=opensslntru forwarding'
      echo 'DefaultDependencies=no'
      echo 'After=network.target'
      echo ''
      echo '[Service]'
      echo 'Type=simple'
      echo 'User=opensslntru'
      echo 'Group=opensslntru'
      echo 'WorkingDirectory=/home/opensslntru/service'
      echo 'ExecStart=/home/opensslntru/bin/stunnel stunnel.conf'
      echo ''
      echo '[Install]'
      echo 'WantedBy=default.target'
    ) > /etc/systemd/system/opensslntru.service
    ln -s /etc/systemd/system/opensslntru.service /etc/systemd/system/multi-user.target.wants
    systemctl restart opensslntru

At this point the server should be working. Try any browser to connect to the server's external address. The certificate is self-signed; signing it with Let's Encrypt is recommended but is outside the scope of these instructions.
This setup passes SNI along from the client to the server,
so the client is free to access
any server name provided by the server.
For example, almost all
are hosted on the same back-end server
and can now be retrieved through
although for the moment
this is announced to the client (and signed)
You can advertise multiple names
on the same server
through the same stunnel configuration
by adding those names to DNS
and creating an appropriate certificate.
You can instead configure stunnel
to forward different SNI choices
to different servers with different certificates.
The following instructions for setting up the client side have been tested in a VM running Debian 10 (Buster) on a CPU supporting AVX2.
As root:

    apt install wget python3 build-essential clang cmake \
      ruby pkg-config epiphany-browser meson gnome-pkg-tools \
      libglib2.0-dev libproxy-dev \
      gsettings-desktop-schemas-dev ca-certificates -y
    adduser --disabled-password --gecos opensslntru opensslntru

As the new opensslntru user:

    export PATH=$HOME/bin:$PATH

    cd
    wget https://www.openssl.org/source/openssl-1.1.1g.tar.gz
    wget https://ntruprime.cr.yp.to/opensslntru/openssl-1.1.1g-ntru.patch
    tar -xf openssl-1.1.1g.tar.gz
    mv openssl-1.1.1g openssl-1.1.1g-ntru
    cd openssl-1.1.1g-ntru
    patch -p1 < ../openssl-1.1.1g-ntru.patch
    ./config shared --prefix=$HOME --openssldir=$HOME -Wl,-rpath=$HOME/lib
    make -j8 # a few minutes
    make test # more minutes
    make install_sw

    cd
    wget https://ntruprime.cr.yp.to/opensslntru/libsntrup761-20200415.tar.gz
    tar -xf libsntrup761-20200415.tar.gz
    cd libsntrup761-20200415
    env USE_RPATH=RUNPATH DESTDIR=$HOME CPATH=$HOME/include LIBRARY_PATH=$HOME/lib make all install test

    cd
    wget https://ntruprime.cr.yp.to/opensslntru/engntru-20200415.tar.gz
    tar -xf engntru-20200415.tar.gz
    cd engntru-20200415
    mkdir build
    cd build
    cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_PREFIX_PATH="$HOME;$HOME/usr/local" ..
    make
    make test
    make install

    cd
    git clone --branch 2.60.2 https://gitlab.gnome.org/GNOME/glib-networking.git
    cd glib-networking
    mkdir build
    cd build
    env PKG_CONFIG_PATH=$HOME/lib/pkgconfig CPATH=$HOME/include LIBRARY_PATH=$HOME/lib meson --prefix=$HOME -Dopenssl=enabled -Dgnutls=disabled ..
    ninja
    ninja install

    cd
    wget https://ntruprime.cr.yp.to/opensslntru/openssl-engntru.cnf
    export OPENSSL_CONF=$HOME/openssl-engntru.cnf
    export LD_LIBRARY_PATH=$HOME/lib
    export GIO_MODULE_DIR=$HOME/lib/x86_64-linux-gnu/gio/modules
    export ENGNTRU_DEBUG=4 # to watch engntru activating
    ln -s /etc/ssl/certs $HOME/certs
    epiphany https://test761.cr.yp.to

You should be able to browse to this demo server,
whichever other demo servers you set up above,
and other sites
(typically not using sntrup761).
The ENGNTRU_DEBUG=4 log information in the terminal
includes a note for each
a note for each
and a note for each computation of a batch of 32 keys.
Version: This is version 2020.08.17 of the "Demo" web page.