OpenSSLNTRU

This demo was announced 2020.04.16 on the pqc-forum mailing list, and updated 2020.04.23 from OpenSSL 1.1.1f to OpenSSL 1.1.1g. The same patch works for both versions of OpenSSL.


OpenSSLNTRU is software, available online, for a demo of web browsing that takes just 166000 Haswell cycles to generate a new one-time sntrup761 public key for each TLS 1.3 session. This demo uses

  1. the Gnome web browser (client) and stunnel (server) using
  2. a patched version of OpenSSL 1.1.1g using
  3. a new OpenSSL ENGINE using
  4. a new sntrup761 library.

This is joint work. Authors in alphabetical order: Daniel J. Bernstein, Billy Bob Brumley, Ming-Shing Chen (leader for #4), and Nicola Tuveri (leader for #3). Email address: authorcontact-opensslntru at box.cr.yp.to.

The new speed is much faster than previously announced speeds for sntrup761 keygen. In combination with the (recently announced) 48780 Haswell cycles for enc and 59120 Haswell cycles for dec, this new keygen speed means a total of just 273900 cycles for sntrup761 keygen+enc+dec.

The TLS 1.3 integration here uses the same basic data flow as the CECPQ2 experiment carried out by Google and Cloudflare: the client generates a one-time public key, the server encapsulates to that one-time key, and the client decapsulates, obtaining a one-time session key. Beware that this data flow is designed only to protect against attacks by future quantum computers ("transitional" security); stopping active attacks will also require long-term post-quantum identity keys.
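The three steps of this data flow can be sketched with a toy KEM. The sketch below is only an illustration of who does what: it uses a Diffie-Hellman-style construction over a small prime as a stand-in, so it is neither sntrup761 nor post-quantum nor secure, and the function names keygen, encapsulate, and decapsulate are chosen here for illustration rather than taken from the demo software.

```python
import hashlib
import secrets

# Toy parameters, assumed for illustration only (NOT sntrup761, NOT secure):
# a Diffie-Hellman-style KEM over the integers modulo the Mersenne prime 2^127 - 1.
P = 2**127 - 1
G = 3


def _session_key(shared: int) -> bytes:
    """Hash the shared group element down to a 32-byte session key."""
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


def keygen():
    """Client: generate a one-time key pair (secret key, public key)."""
    sk = secrets.randbelow(P - 3) + 2  # uniform in 2..P-2
    return sk, pow(G, sk, P)


def encapsulate(pk: int):
    """Server: produce a ciphertext for pk, plus the session key it encodes."""
    r = secrets.randbelow(P - 3) + 2
    return pow(G, r, P), _session_key(pow(pk, r, P))


def decapsulate(sk: int, ct: int) -> bytes:
    """Client: recover the session key from the ciphertext."""
    return _session_key(pow(ct, sk, P))


# One session: the public key travels client -> server,
# the ciphertext travels server -> client.
client_sk, client_pk = keygen()
ciphertext, server_key = encapsulate(client_pk)
client_key = decapsulate(client_sk, ciphertext)
print(client_key == server_key)
```

Nothing in this flow authenticates either side, which is why it gives only the "transitional" security described above.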

CECPQ2 used (a minor variant of) ntruhrss701 for this data flow. The state-of-the-art (March 2020) software for ntruhrss701 takes 272028 cycles for keygen, 26116 cycles for enc, and 63632 cycles for dec, for a total of 361776 cycles.

The CECPQ2 experiments showed that ntruhrss701's CPU time consumes very little of the overall TLS time. The new sntrup761 software here consumes even less time. The CECPQ2 experiments showed a somewhat more noticeable impact of network traffic on the slowest connections; sntrup761 sends 2197 bytes (one-time key+ciphertext) where ntruhrss701 sends 2276. Here's the comparison table (all numbers are from SUPERCOP except for the new 166000 for sntrup761 keygen):

                         ntruhrss701   sntrup761
public-key bytes                1138        1158
ciphertext bytes                1138        1039
pk+ciphertext bytes             2276        2197
keygen cycles                 272028      166000
enc cycles                     26116       48780
dec cycles                     63632       59120
keygen+enc+dec cycles         361776      273900
1000*bytes+cycles            2637776     2470900
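The derived rows of the table follow from the measured rows by simple addition. As a quick check, the following sketch recomputes them; the dictionary layout and the function name totals are choices made here, and the input numbers are copied from the table.

```python
# Measured values copied from the comparison table.
table = {
    "ntruhrss701": {"pk": 1138, "ct": 1138, "keygen": 272028, "enc": 26116, "dec": 63632},
    "sntrup761": {"pk": 1158, "ct": 1039, "keygen": 166000, "enc": 48780, "dec": 59120},
}


def totals(row):
    """Return (pk+ciphertext bytes, keygen+enc+dec cycles, 1000*bytes+cycles)."""
    traffic = row["pk"] + row["ct"]
    cycles = row["keygen"] + row["enc"] + row["dec"]
    return traffic, cycles, 1000 * traffic + cycles


for name, row in table.items():
    print(name, *totals(row))
# ntruhrss701 2276 361776 2637776
# sntrup761 2197 273900 2470900
```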

This should put an end to the idea that sntrup761 keygen is too slow for TLS.

Both sntrup761 and ntruhrss701 are designed for IND-CCA2 security, as recommended in most of the NISTPQC lattice submissions and in Google's CECPQ2 announcement: "CCA2-security is worthwhile, even though TLS can do without. ... CPA vs CCA security is a subtle and dangerous distinction, and if we're going to invest in a post-quantum primitive, better it not be fragile."

Taking away IND-CCA2 security would speed up both ntruhrss701 and sntrup761 by removing some hashing and removing (basically) a copy of enc from dec. For comparison, Google's earlier CECPQ1 experiment used an early non-IND-CCA-secure version of newhope1024, with approximately 200000 cycles of total computation and more than 4000 bytes of network traffic (more than 4 million in the 1000*bytes+cycles metric), and concluded that this "would be practical to quickly deploy".

Algorithmically, the new sntrup761 keygen speed comes from generating 32 independent keys at once, using Montgomery's trick for batch inversion. This option has been pointed out before. The new demo shows that this option fits into a CECPQ2-type data flow in TLS 1.3. The total latency of generating 32 keys is around two milliseconds; even better, keys can be generated in advance of being used, reducing the impact on TLS latency to zero (with or without Montgomery's trick). Of course, one still has to generate each new key at some point, but the new sntrup761 software shows that Montgomery's trick provides excellent throughput.
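One way to picture generating keys in advance is a pool that is refilled one batch at a time and hands out each key pair exactly once. The sketch below is a minimal model of that idea, not the demo's actual code: generate_batch is a hypothetical placeholder that returns random bytes instead of real sntrup761 keys, and the KeyPool class is invented here for illustration.

```python
import os
from collections import deque

BATCH_SIZE = 32           # batch size stated in the text
PUBLIC_KEY_BYTES = 1158   # sntrup761 public-key size from the table


def generate_batch(n):
    """Hypothetical stand-in for batched keygen.

    Returns n placeholder (secret, public) pairs made of random bytes;
    real software would run one batched sntrup761 keygen here.
    """
    return [(os.urandom(32), os.urandom(PUBLIC_KEY_BYTES)) for _ in range(n)]


class KeyPool:
    """Hands out each precomputed key pair once, refilling a batch at a time."""

    def __init__(self, batch_size=BATCH_SIZE):
        self.batch_size = batch_size
        self.keys = deque()
        self.batches_generated = 0

    def refill(self):
        self.keys.extend(generate_batch(self.batch_size))
        self.batches_generated += 1

    def take(self):
        # Falls back to generating a batch on demand if the pool ran dry.
        if not self.keys:
            self.refill()
        return self.keys.popleft()


pool = KeyPool()
pool.refill()                 # done ahead of time, off the TLS critical path
for _ in range(40):           # 40 sessions, one fresh key pair each
    secret_key, public_key = pool.take()
print(pool.batches_generated, len(pool.keys))  # 2 24
```

A session that finds the pool nonempty pays no keygen latency at all; only a session that hits an empty pool waits for a full batch.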

Montgomery's trick replaces each batch of inversions with one shared inversion and a batch of multiplications. In the context here, there is a batch of 32 inversions mod q and a batch of 32 inversions mod 3, using 1 shared inversion mod q and 1 shared inversion mod 3. Out of the 166000 cycles per key for a batch of 32 keys, about 30000 cycles per key are spent on the shared inversions, and simply increasing the batch size further reduces this cost. With slightly more work it is possible to share transforms across various multiplications. Consequently, the current software speed is not the limit of what can be achieved.
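Here is a minimal sketch of Montgomery's trick, simplified to plain integers modulo the sntrup761 modulus q = 4591. The actual keygen inverts polynomials rather than integers, so this shows only the structure of the trick: one shared inversion plus roughly three multiplications per element. The function name batch_invert is chosen here for illustration.

```python
Q = 4591  # the sntrup761 modulus q (a prime)


def batch_invert(values, q=Q):
    """Invert every element of values modulo q using a single inversion.

    Assumes every element is nonzero modulo q; otherwise the shared
    inversion fails and pow() raises ValueError.
    """
    n = len(values)
    # prefix[i] is the product of values[0..i-1] modulo q.
    prefix = [1] * (n + 1)
    for i, v in enumerate(values):
        prefix[i + 1] = prefix[i] * v % q

    # The one shared inversion (three-argument pow with -1 needs Python 3.8+).
    inv = pow(prefix[n], -1, q)

    # Walk backwards: inv always holds the inverse of prefix[i + 1].
    out = [0] * n
    for i in range(n - 1, -1, -1):
        out[i] = inv * prefix[i] % q   # the inverse of values[i]
        inv = inv * values[i] % q      # now the inverse of prefix[i]
    return out


values = list(range(1, 33))            # a batch of 32 nonzero elements
inverses = batch_invert(values)
print(all(v * w % Q == 1 for v, w in zip(values, inverses)))
```

Because the cost of the shared inversion is spread over the whole batch, doubling the batch size halves its per-element share, which is the effect described above for larger batches.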

One can also use Montgomery's trick for some other NISTPQC submissions that rely on inversion as part of keygen, but the dramatic speedup for sntrup761 doesn't imply a similarly dramatic (or even nonzero) speedup for those other submissions. In particular, the current ntruhrss701 keygen already exploits the power-of-2 structure of its q for a Hensel lift. In the Montgomery context, the Hensel speedup rapidly vanishes, while multiplication speeds and other overheads become more important. There's some gap between ntruhrss701 and sntrup761 in multiplication speed (sntrup761 aims for a higher security level, uses larger polynomials, and requires a field) but this is only about 8000 cycles per multiplication with the current software.
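To illustrate what a Hensel lift for a power-of-2 modulus looks like, the sketch below inverts an odd integer modulo 2^13 = 8192, the value of q in ntruhrss701. This is an integer analogue assumed here for illustration; the real keygen lifts a polynomial inverse, and the function name invert_mod_power_of_two is not taken from any library.

```python
def invert_mod_power_of_two(a, k):
    """Return the inverse of odd a modulo 2**k by Hensel (Newton) lifting.

    If a*x = 1 (mod 2**b), then x*(2 - a*x) is an inverse of a modulo
    2**(2*b), so each step doubles the number of correct low bits.
    """
    if a % 2 == 0:
        raise ValueError("a must be odd to be invertible modulo a power of 2")
    modulus = 1 << k
    x = 1          # a*x = 1 (mod 2) holds for every odd a
    bits = 1
    while bits < k:
        x = x * (2 - a * x) % modulus
        bits *= 2
    return x


q = 1 << 13        # 8192
a = 4591           # any odd value works; this one is just an example
x = invert_mod_power_of_two(a, 13)
print(a * x % q)   # 1
```

Starting from an inverse modulo 2, four doubling steps reach 13 bits, so no general-purpose inversion modulo q is needed. That is why batching has less left to save in this setting.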

Demo instructions appear below.

Demo overview

Warning: This demo comes with no cryptographic warranties and no other security warranties. The software here is experimental, and is built upon other software with a long history of security problems, such as OpenSSL. The purpose of this demo is purely to show the sntrup761 performance achievable with a CECPQ2-type data flow for TLS 1.3.

The demo has two parts: a server side and a client side. We recommend running each side in its own VM.

The server side uses stunnel for SSL termination. It receives TLS connections, including sntrup761 connections, and passes along the answers provided by a preexisting back-end web server, which does not need to support sntrup761 connections. For example, the demo site https://test761.cr.yp.to looks just like the preexisting site https://ntruprime.cr.yp.to, but with the extra feature of supporting sntrup761 connections. Internally, https://test761.cr.yp.to passes requests along through a local connection to the preexisting back-end web server for ntruprime.cr.yp.to. You can use https://test761.cr.yp.to as the server side of this demo, or you can set up the server side for a web server of your choice.

The client side uses Epiphany, the Gnome web browser, with no modifications to the Epiphany source code. The glib-networking library used inside Epiphany already supports OpenSSL as an option for outgoing connections, and is configured below to use this option.

Both sides use a version of OpenSSL 1.1.1g patched inside libssl to support sntrup761 as experimental group 0xfe00 for TLS 1.3, and patched inside libcrypto to include a reference implementation of sntrup761. Our new engntru library then overrides this reference implementation with a fast implementation, which in turn is built on top of our new libsntrup761. This way of using the OpenSSL ENGINE feature allows OpenSSL to take advantage of fast software implementations while allowing those implementations to be developed in separate libraries; see https://eprint.iacr.org/2018/354.

Various other applications that use OpenSSL have been verified to work with libsntrup761 via engntru. This demo focuses on stunnel on the server side and Epiphany on the client side.

Server side

The following instructions for setting up the server side have been tested in a VM running Debian 10 (Buster) on a CPU supporting AVX2. You can skip down to the client side if you simply want to try https://test761.cr.yp.to as the server.

As root:

    apt install wget python3 build-essential clang cmake ruby pkg-config -y
    adduser --disabled-password --gecos opensslntru opensslntru

As the new opensslntru user (change the first three lines to your own demo server name, demo server address, and preexisting back-end server address; of course, you should use your favorite VPN to protect the connection from this SSL terminator to the back-end server):

    EXTERNALNAME=test761.cr.yp.to
    EXTERNALADDRESS=1.2.3.4:65024  # provide TLS service on this address
    INTERNALADDRESS=5.6.7.8:80     # use existing server on this address

    export PATH=$HOME/bin:$PATH

    cd
    wget https://www.openssl.org/source/openssl-1.1.1g.tar.gz
    wget https://ntruprime.cr.yp.to/opensslntru/openssl-1.1.1g-ntru.patch
    tar -xf openssl-1.1.1g.tar.gz
    mv openssl-1.1.1g openssl-1.1.1g-ntru
    cd openssl-1.1.1g-ntru
    patch -p1 < ../openssl-1.1.1g-ntru.patch
    ./config shared --prefix=$HOME --openssldir=$HOME -Wl,-rpath=$HOME/lib
    make -j8   # a few minutes
    make test  # more minutes
    make install_sw

    cd
    wget https://ntruprime.cr.yp.to/opensslntru/libsntrup761-20200415.tar.gz
    tar -xf libsntrup761-20200415.tar.gz
    cd libsntrup761-20200415
    env USE_RPATH=RUNPATH DESTDIR=$HOME CPATH=$HOME/include LIBRARY_PATH=$HOME/lib make all install test

    cd
    wget https://ntruprime.cr.yp.to/opensslntru/engntru-20200415.tar.gz
    tar -xf engntru-20200415.tar.gz
    cd engntru-20200415
    mkdir build
    cd build
    cmake -DCMAKE_PREFIX_PATH="$HOME;$HOME/usr/local" ..
    make
    make test
    make install

    cd
    wget https://www.stunnel.org/downloads/stunnel-5.56.tar.gz
    tar -xf stunnel-5.56.tar.gz
    cd stunnel-5.56
    ./configure --prefix=$HOME --with-ssl=$HOME LDFLAGS=-Wl,-rpath=$HOME/lib
    make
    make install

    cd
    mkdir service
    cd service
    openssl req -x509 -sha256 -nodes -newkey rsa:2048 -keyout "$EXTERNALNAME.key" -days 730 -out "$EXTERNALNAME.crt" -subj "/CN=$EXTERNALNAME" -config /etc/ssl/openssl.cnf
    (
      echo "key = $EXTERNALNAME.key"
      echo "cert = $EXTERNALNAME.crt"
      echo 'foreground = yes'
      echo 'engine = engntru'
      echo 'engineDefault = ALL'
      echo '[forward]'
      echo "accept = $EXTERNALADDRESS"
      echo "connect = $INTERNALADDRESS"
      echo 'curves = SNTRUP761:X25519:P-256'
      echo 'config = MinProtocol:TLSv1.2'
      echo 'ciphers = ECDHE+CHACHA20:ECDHE+AES256:ECDHE+AES128:!aNULL:!eNULL:!LOW:!EXPORT:!DES:!3DES:!RC4:!MD5:!PSK:!SRP:!DSS:!aECDSA'
    ) > stunnel.conf

As root:

    (
      echo '[Unit]'
      echo 'Description=opensslntru forwarding'
      echo 'DefaultDependencies=no'
      echo 'After=network.target'
      echo ''
      echo '[Service]'
      echo 'Type=simple'
      echo 'User=opensslntru'
      echo 'Group=opensslntru'
      echo 'WorkingDirectory=/home/opensslntru/service'
      echo 'ExecStart=/home/opensslntru/bin/stunnel stunnel.conf'
      echo ''
      echo '[Install]'
      echo 'WantedBy=default.target'
    ) > /etc/systemd/system/opensslntru.service
    ln -s /etc/systemd/system/opensslntru.service /etc/systemd/system/multi-user.target.wants
    systemctl restart opensslntru

At this point the server should be working. Try any browser to connect to the server's external address. The certificate is self-signed; signing it with Let's Encrypt is recommended but is outside the scope of these instructions.

This stunnel configuration passes SNI along from the client to the server, so the client is free to access any server name provided by the server. For example, almost all *.cr.yp.to are hosted on the same back-end server and can now be retrieved through sntrup761, although for the moment this is announced to the client (and signed) only for test761.cr.yp.to. You can advertise multiple names on the same server through the same stunnel configuration by adding those names to DNS and creating an appropriate certificate. You can instead configure stunnel to forward different SNI choices to different servers with different certificates.

Client side

The following instructions for setting up the client side have been tested in a VM running Debian 10 (Buster) on a CPU supporting AVX2.

As root:

    apt install wget python3 build-essential clang cmake \
      ruby pkg-config epiphany-browser meson gnome-pkg-tools \
      libglib2.0-dev libproxy-dev \
      gsettings-desktop-schemas-dev ca-certificates -y
    adduser --disabled-password --gecos opensslntru opensslntru

As the new opensslntru user:

    export PATH=$HOME/bin:$PATH

    cd
    wget https://www.openssl.org/source/openssl-1.1.1g.tar.gz
    wget https://ntruprime.cr.yp.to/opensslntru/openssl-1.1.1g-ntru.patch
    tar -xf openssl-1.1.1g.tar.gz
    mv openssl-1.1.1g openssl-1.1.1g-ntru
    cd openssl-1.1.1g-ntru
    patch -p1 < ../openssl-1.1.1g-ntru.patch
    ./config shared --prefix=$HOME --openssldir=$HOME -Wl,-rpath=$HOME/lib
    make -j8   # a few minutes
    make test  # more minutes
    make install_sw

    cd
    wget https://ntruprime.cr.yp.to/opensslntru/libsntrup761-20200415.tar.gz
    tar -xf libsntrup761-20200415.tar.gz
    cd libsntrup761-20200415
    env USE_RPATH=RUNPATH DESTDIR=$HOME CPATH=$HOME/include LIBRARY_PATH=$HOME/lib make all install test

    cd
    wget https://ntruprime.cr.yp.to/opensslntru/engntru-20200415.tar.gz
    tar -xf engntru-20200415.tar.gz
    cd engntru-20200415
    mkdir build
    cd build
    cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_PREFIX_PATH="$HOME;$HOME/usr/local" ..
    make
    make test
    make install

    cd
    git clone --branch 2.60.2 https://gitlab.gnome.org/GNOME/glib-networking.git
    cd glib-networking
    mkdir build
    cd build
    env PKG_CONFIG_PATH=$HOME/lib/pkgconfig CPATH=$HOME/include LIBRARY_PATH=$HOME/lib meson --prefix=$HOME -Dopenssl=enabled -Dgnutls=disabled ..
    ninja
    ninja install

    cd
    wget https://ntruprime.cr.yp.to/opensslntru/openssl-engntru.cnf
    export OPENSSL_CONF=$HOME/openssl-engntru.cnf
    export LD_LIBRARY_PATH=$HOME/lib
    export GIO_MODULE_DIR=$HOME/lib/x86_64-linux-gnu/gio/modules
    export ENGNTRU_DEBUG=4  # to watch engntru activating
    ln -s /etc/ssl/certs $HOME/certs

    epiphany https://test761.cr.yp.to

You should be able to browse to this demo server (using sntrup761), whichever other demo servers you set up above (using sntrup761), and other sites (typically not using sntrup761 yet). The ENGNTRU_DEBUG=4 log information in the terminal includes a note for each sntrup761 keygen, a note for each sntrup761 dec, and a note for each computation of a batch of 32 keys.


Version: This is version 2020.08.17 of the "Demo" web page.