Commit 194c1564 authored by Magnus Holmgren, committed by Ritesh Raj Sarraf

Import Debian changes 3.7.3-1

parents 5a259d5c 67fbaf5f
Pipeline #280006 passed with stages in 21 seconds
2021-05-22 Niels Möller <nisse@lysator.liu.se>
* configure.ac: Bump package version, to 3.7.3.
(LIBNETTLE_MINOR): Bump minor number, to 8.4.
(LIBHOGWEED_MINOR): Bump minor number, to 6.4.
2021-05-17 Niels Möller <nisse@lysator.liu.se>
* rsa-decrypt-tr.c (rsa_decrypt_tr): Check up-front that input is
in range.
* rsa-sec-decrypt.c (rsa_sec_decrypt): Likewise.
* rsa-decrypt.c (rsa_decrypt): Likewise.
* testsuite/rsa-encrypt-test.c (test_main): Add tests with input > n.
2021-05-14 Niels Möller <nisse@lysator.liu.se>
* rsa-sign-tr.c (rsa_sec_blind): Delete mn argument.
(_rsa_sec_compute_root_tr): Delete mn argument, instead require
that input size matches key size. Rearrange use of temporary
storage, to support in-place operation, x == m. Update all
callers.
* rsa-decrypt-tr.c (rsa_decrypt_tr): Make zero-padded copy of
input, for calling _rsa_sec_compute_root_tr.
* rsa-sec-decrypt.c (rsa_sec_decrypt): Likewise.
* testsuite/rsa-encrypt-test.c (test_main): Test calling all of
rsa_decrypt, rsa_decrypt_tr, and rsa_sec_decrypt with zero input.
2021-05-06 Niels Möller <nisse@lysator.liu.se>
* pkcs1-sec-decrypt.c (_pkcs1_sec_decrypt): Check that message
length is valid, for given key size.
* testsuite/rsa-sec-decrypt-test.c (test_main): Add test cases for
calls to rsa_sec_decrypt specifying a too large message length.
2021-03-21 Niels Möller <nisse@lysator.liu.se>
* NEWS: NEWS entries for 3.7.2.
2021-03-17 Niels Möller <nisse@lysator.liu.se>
* configure.ac: Bump package version, to 3.7.2.
(LIBNETTLE_MINOR): Bump minor number, to 8.3.
(LIBHOGWEED_MINOR): Bump minor number, to 6.3.
2021-03-13 Niels Möller <nisse@lysator.liu.se>
* gostdsa-vko.c (gostdsa_vko): Use ecc_mod_mul_canonical to
compute the scalar used for ecc multiplication.
* eddsa-hash.c (_eddsa_hash): Ensure result is canonically
reduced. Two of the three call sites need that.
* ecc-gostdsa-verify.c (ecc_gostdsa_verify): Use ecc_mod_mul_canonical
to compute the scalars used for ecc multiplication.
* ecc-ecdsa-sign.c (ecc_ecdsa_sign): Ensure s output is reduced to
canonical range.
* ecc-ecdsa-verify.c (ecc_ecdsa_verify): Use ecc_mod_mul_canonical
to compute the scalars used for ecc multiplication.
* testsuite/ecdsa-verify-test.c (test_main): Add test case that
triggers an assert on 64-bit platforms, without above fix.
* testsuite/ecdsa-sign-test.c (test_main): Test case generating
the same signature.
2021-03-13 Niels Möller <nisse@lysator.liu.se>
* eddsa-verify.c (equal_h): Use ecc_mod_mul_canonical.
2021-03-11 Niels Möller <nisse@lysator.liu.se>
* ecc-mod-arith.c (ecc_mod_mul_canonical, ecc_mod_sqr_canonical):
New functions.
* ecc-internal.h: Declare and document new functions.
* curve448-eh-to-x.c (curve448_eh_to_x): Use ecc_mod_sqr_canonical.
* curve25519-eh-to-x.c (curve25519_eh_to_x): Use ecc_mod_mul_canonical.
* ecc-eh-to-a.c (ecc_eh_to_a): Likewise.
* ecc-j-to-a.c (ecc_j_to_a): Likewise.
* ecc-mul-m.c (ecc_mul_m): Likewise.
2021-02-17 Niels Möller <nisse@lysator.liu.se>
* Released Nettle-3.7.1.
2021-02-15 Niels Möller <nisse@lysator.liu.se>
* examples/nettle-openssl.c (nettle_openssl_arcfour128): Deleted
glue to openssl arcfour.
(openssl_arcfour128_set_encrypt_key)
(openssl_arcfour128_set_decrypt_key): Deleted.
* nettle-internal.h: Deleted declaration.
* examples/nettle-benchmark.c (aeads): Delete benchmarking.
2021-02-13 Niels Möller <nisse@lysator.liu.se>
* configure.ac: Bump package version, to 3.7.1.
(LIBNETTLE_MINOR): Bump minor number, to 8.2.
(LIBHOGWEED_MINOR): Bump minor number, to 6.2.
2021-02-10 Niels Möller <nisse@lysator.liu.se>
* chacha-crypt.c (_nettle_chacha_crypt_4core): Fix for the case
that counter increment should be 3 (129 <= message length <= 192).
(_nettle_chacha_crypt32_4core): Likewise.
* testsuite/chacha-test.c (test_chacha_rounds): New function, for
tests with non-standard round count. Extracted from _test_chacha.
(_test_chacha): Deleted rounds argument. Reorganized crypt/crypt32
handling. When testing message prefixes of varying length, also
encrypt the remainder of the message, to catch errors in counter
value update.
(test_main): Add a few tests with large messages (16 blocks, 1024
octets), to improve test coverage for _nettle_chacha_crypt_4core
and _nettle_chacha_crypt32_4core.
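The counter arithmetic behind the 2021-02-10 entry can be sketched in a few lines of C. This is only an illustration, not the code from chacha-crypt.c: the helper name blocks_for_length is hypothetical, and the only assumption is the 64-octet ChaCha block size. For 129 to 192 octets the helper returns 3, which is the increment the entry says was handled incorrectly.

```c
#include <assert.h>
#include <stddef.h>

#define CHACHA_BLOCK_SIZE 64

/* Hypothetical helper: number of 64-octet blocks consumed by a message
   of the given length, which is also how far the block counter has to
   advance after processing it. */
static size_t
blocks_for_length(size_t length)
{
  return (length + CHACHA_BLOCK_SIZE - 1) / CHACHA_BLOCK_SIZE;
}
```

Encrypting the remainder of the message after a prefix, as the updated test does, is what exposes a wrong increment: the next block is then generated from the wrong counter value.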
2021-01-25 Niels Möller <nisse@lysator.liu.se>
* arm/neon/salsa20-core-internal.asm: Deleted file. This ARM Neon
implementation reportedly gave a speedup of 45% on Cortex A9,
compared to the C implementation, when it was added back in 2013.
That appears to no longer be the case with more recent processors
and compilers. And it's even significantly slower than the C
implementation on some platforms, including the Raspberry Pi 4.
With the introduction of salsa20-2core.asm, performance of this
function is also less important.
* arm/neon/chacha-core-internal.asm: Deleted file, for analogous reasons.
* arm/fat/salsa20-core-internal-2.asm: Deleted file.
* arm/fat/chacha-core-internal-2.asm: Deleted file.
* fat-arm.c (_nettle_salsa20_core, _nettle_chacha_core): Delete fat setup.
2021-01-31 Niels Möller <nisse@lysator.liu.se>
New variants, contributed by Nicolas Mora.
* pbkdf2-hmac-sha384.c (pbkdf2_hmac_sha384): New file and function.
* pbkdf2-hmac-sha512.c (pbkdf2_hmac_sha512): New file and function.
* testsuite/pbkdf2-test.c (test_main): Corresponding tests.
2021-01-20 Niels Möller <nisse@lysator.liu.se>
* ecc-ecdsa-verify.c (ecc_ecdsa_verify): Fix corner case with
all-zero hash. Reported by Guido Vranken.
* testsuite/ecdsa-verify-test.c: Add corresponding test case.
2021-01-10 Niels Möller <nisse@lysator.liu.se>
* fat-ppc.c: Don't use __GLIBC_PREREQ in the same preprocessor
conditional as defined(__GLIBC_PREREQ), but move to a nested #if
conditional. Fixes compile error on OpenBSD/powerpc64, reported by
Jasper Lievisse Adriaanse.
2021-01-04 Niels Möller <nisse@lysator.liu.se>
* Released Nettle-3.7.
......
......@@ -131,7 +131,7 @@ nettle_SOURCES = aes-decrypt-internal.c aes-decrypt.c \
nettle-meta-aeads.c nettle-meta-armors.c \
nettle-meta-ciphers.c nettle-meta-hashes.c nettle-meta-macs.c \
pbkdf2.c pbkdf2-hmac-gosthash94.c pbkdf2-hmac-sha1.c \
pbkdf2-hmac-sha256.c \
pbkdf2-hmac-sha256.c pbkdf2-hmac-sha384.c pbkdf2-hmac-sha512.c \
poly1305-aes.c poly1305-internal.c \
realloc.c \
ripemd160.c ripemd160-compress.c ripemd160-meta.c \
......
NEWS for the Nettle 3.7.3 release
This is a bugfix release, fixing bugs that could make the RSA
decryption functions crash on invalid inputs.
Upgrading to the new version is strongly recommended. For
applications that want to support older versions of Nettle,
the bug can be worked around by adding a check that the RSA
ciphertext is in the range 0 < ciphertext < n, before
attempting to decrypt it.
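A minimal sketch of the range check described above, assuming the ciphertext and the modulus n are both available as big-endian octet strings of equal length. The function name ciphertext_in_range is hypothetical and not part of Nettle. An application that already holds the values as GMP integers would more likely compare them with mpz_sgn and mpz_cmp instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if 0 < c < n, otherwise 0.
   Both c and n are big-endian and exactly `length` octets long.
   The ciphertext is not secret, so an ordinary data-dependent
   comparison is acceptable here. */
static int
ciphertext_in_range(const uint8_t *c, const uint8_t *n, size_t length)
{
  uint8_t nonzero = 0;
  int less = 0;     /* set at the most significant differing octet */
  int decided = 0;
  size_t i;

  for (i = 0; i < length; i++)
    {
      nonzero |= c[i];
      if (!decided && c[i] != n[i])
        {
          less = c[i] < n[i];
          decided = 1;
        }
    }
  /* c == n leaves less at 0, so it is rejected as well. */
  return nonzero != 0 && less;
}
```

The check would run before any of the decryption functions is called, and a failure would be treated the same way as a failed decryption.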
Thanks to Paul Schaub and Justus Winter for reporting these
problems.
The new version is intended to be fully source and binary
compatible with Nettle-3.6. The shared library names are
libnettle.so.8.4 and libhogweed.so.6.4, with sonames
libnettle.so.8 and libhogweed.so.6.
Bug fixes:
* Fix crash for zero input to rsa_sec_decrypt and
rsa_decrypt_tr. Potential denial of service vector.
* Ensure that both rsa_decrypt_tr and rsa_sec_decrypt return
failure for out of range inputs, instead of either crashing,
or silently reducing input modulo n. Potential denial of
service vector.
* Ensure that rsa_decrypt returns failure for out of range
inputs, instead of silently reducing input modulo n.
* Ensure that rsa_sec_decrypt returns failure if the message
size is too large for the given key. Unlike the other bugs,
this would typically be triggered by invalid local
configuration, rather than by processing untrusted remote
data.
NEWS for the Nettle 3.7.2 release
This is a bugfix release, fixing a bug in ECDSA signature
verification that could lead to a denial of service attack
(via an assertion failure) or possibly incorrect results. It
also fixes a few related problems where scalars are required
to be canonically reduced modulo the ECC group order, but in
fact may be slightly larger.
Upgrading to the new version is strongly recommended.
Even when no assert is triggered in ecdsa_verify, ECC point
multiplication may get invalid intermediate values as input,
and produce incorrect results. It's trivial to construct
alleged signatures that result in invalid intermediate values.
It appears difficult to construct an alleged signature that
makes the function misbehave in such a way that an invalid
signature is accepted as valid, but such attacks can't be
ruled out without further analysis.
Thanks to Guido Vranken for setting up the fuzzer tests that
uncovered this problem.
The new version is intended to be fully source and binary
compatible with Nettle-3.6. The shared library names are
libnettle.so.8.3 and libhogweed.so.6.3, with sonames
libnettle.so.8 and libhogweed.so.6.
Bug fixes:
* Fixed bug in ecdsa_verify, and added a corresponding test
case.
* Similar fixes to ecc_gostdsa_verify and gostdsa_vko.
* Similar fixes to eddsa signatures. The problem is less severe
for these curves, because (i) the potentially out of range
value is derived from output of a hash function, making it
harder for the attacker to hit the narrow range of
problematic values, and (ii) the ecc operations are
inherently more robust, and my current understanding is that
unless the corresponding assert is hit, the verify
operation should complete with a correct result.
* Fix to ecdsa_sign, which with a very low probability could
return out of range signature values, which would be
rejected immediately by a verifier.
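To illustrate what "canonically reduced" means in the 3.7.2 notes, here is a toy single-word sketch. It is not how ecc_mod_mul_canonical is implemented (that function works on multi-limb numbers and its internals are not shown in this commit view). The name reduce_canonical is hypothetical, and the sketch assumes q < 2^31 and an input x < 2q, i.e. a value that is at most "slightly larger" than the modulus.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy: bring x, known to satisfy x < 2*q, into the
   canonical range 0 <= x < q with one conditional subtraction.
   The selection uses a mask rather than a branch. */
static uint32_t
reduce_canonical(uint32_t x, uint32_t q)
{
  uint32_t d = x - q;                     /* wraps around if x < q */
  uint32_t borrow = (uint32_t) (x < q);   /* 1 if the subtraction wrapped */
  uint32_t mask = (uint32_t) 0 - borrow;  /* all ones if x < q, else zero */
  return (x & mask) | (d & ~mask);
}
```

A value such as q itself or q + 1 is congruent to a small scalar but lies outside the canonical range; code that assumes the canonical range can then misbehave, which is the class of problem the release notes describe.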
NEWS for the Nettle 3.7.1 release
This is primarily a bug fix release, fixing a couple of
problems found in Nettle-3.7.
The new version is intended to be fully source and binary
compatible with Nettle-3.6. The shared library names are
libnettle.so.8.2 and libhogweed.so.6.2, with sonames
libnettle.so.8 and libhogweed.so.6.
Bug fixes:
* Fix bug in chacha counter update logic. The problem affected
ppc64 and ppc64el, with the new altivec assembly code
enabled. Reported by Andreas Metzler, after breakage in
GnuTLS tests on ppc64.
* Support for big-endian ARM platforms has been restored.
Fixes contributed by Michael Weiser.
* Fix build problem on OpenBSD/powerpc64, reported by Jasper
Lievisse Adriaanse.
* Fix corner case bug in ECDSA verify; it would produce an
incorrect result in the unlikely case of an all-zero
message hash. Reported by Guido Vranken.
New features:
* Support for pbkdf2_hmac_sha384 and pbkdf2_hmac_sha512,
contributed by Nicolas Mora.
Miscellaneous:
* Poorly performing ARM Neon code for doing single-block
Salsa20 and Chacha has been deleted. The code to do two or
three blocks in parallel, introduced in Nettle-3.7, is
unchanged.
NEWS for the Nettle 3.7 release
This release adds one new feature, the bcrypt password hashing
......
......@@ -70,12 +70,24 @@ If data is to be processed with bit operations only, endianness can be ignored
because byte-swapping on load and store will cancel each other out. Shifts
however have to be inverted. See arm/memxor.asm for an example.
3. vld1.8
3. v{ld,st}1.{8,32}
NEON's vld instruction can be used to produce endianness-neutral code. vld1.8
will load a byte sequence into a register regardless of memory endianness. This
can be used to process byte sequences. See arm/neon/umac-nh.asm for example.
In the same fashion, vst1.8 can be used to do a little-endian store. See
arm/neon/salsa and chacha routines for examples.
NOTE: vst1.x (at least on the Allwinner A20 Cortex-A7 implementation) seems to
interfere with itself on subsequent calls, slowing it down. This can be avoided
by putting calculations or loads in between two vst1.x stores.
Similarly, vld1.32 is used in chacha and salsa routines where 32-bit operands
are stored in host-endianness in RAM but need to be loaded sequentially without
the distortion introduced by vldm/vstm. Consecutive vld1.x instructions do not
seem to suffer from slowdown similar to vst1.x.
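The idea behind byte-wise loads and stores can be shown in portable C. This is an analogy rather than a translation of the NEON code: the helper names load_le32 and store_le32 are hypothetical. Because the value is assembled from individual octets at fixed positions, the result is the same on little-endian and big-endian hosts, which is the property vld1.8 and vst1.8 provide for byte sequences.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit little-endian value octet by
   octet, independent of the host's memory endianness. */
static uint32_t
load_le32(const uint8_t *p)
{
  return (uint32_t) p[0]
    | ((uint32_t) p[1] << 8)
    | ((uint32_t) p[2] << 16)
    | ((uint32_t) p[3] << 24);
}

/* Hypothetical helper: write a 32-bit value as four little-endian
   octets, again independent of host endianness. */
static void
store_le32(uint8_t *p, uint32_t x)
{
  p[0] = (uint8_t) (x & 0xff);
  p[1] = (uint8_t) ((x >> 8) & 0xff);
  p[2] = (uint8_t) ((x >> 16) & 0xff);
  p[3] = (uint8_t) ((x >> 24) & 0xff);
}
```

A plain word-sized load of the same four octets would instead give different values depending on host endianness, which corresponds to the case where the assembly needs separate IF_LE and IF_BE paths.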
4. vldm/vstm
Care has to be taken when using vldm/vstm because they have two non-obvious
......
......@@ -36,6 +36,7 @@ ifelse(`
define(`DST', `r0')
define(`SRC', `r1')
define(`ROUNDS', `r2')
define(`SRCp32', `r3')
C State, X, Y and Z representing consecutive blocks
define(`X0', `q0')
......@@ -64,10 +65,13 @@ define(`T3', `q7')
C _chacha_3core(uint32_t *dst, const uint32_t *src, unsigned rounds)
PROLOGUE(_nettle_chacha_3core)
vldm SRC, {X0,X1,X2,X3}
C loads using vld1.32 to be endianness-neutral wrt consecutive 32-bit words
add SRCp32, SRC, #32
vld1.32 {X0,X1}, [SRC]
vld1.32 {X2,X3}, [SRCp32]
vpush {q4,q5,q6,q7}
adr r12, .Lcount1
vld1.64 {Z3}, [r12]
vld1.32 {Z3}, [r12]
vadd.i64 Y3, X3, Z3 C Increment 64-bit counter
vadd.i64 Z3, Y3, Z3
......@@ -213,33 +217,49 @@ PROLOGUE(_nettle_chacha_3core)
vadd.i32 Y3, Y3, T2
vadd.i32 Z3, Z3, T3
vldm SRC, {T0,T1,T2,T3}
vld1.32 {T0,T1}, [SRC]
vadd.i32 X0, X0, T0
vadd.i32 X1, X1, T1
C vst1.8 because caller expects results little-endian
C interleave loads, calculations and stores to save cycles on stores
C use vstm when little-endian for some additional speedup
IF_BE(` vst1.8 {X0,X1}, [DST]!')
vld1.32 {T2,T3}, [SRCp32]
vadd.i32 X2, X2, T2
vadd.i32 X3, X3, T3
vstmia DST!, {X0,X1,X2,X3}
IF_BE(` vst1.8 {X2,X3}, [DST]!')
IF_LE(` vstmia DST!, {X0,X1,X2,X3}')
vadd.i32 Y0, Y0, T0
vadd.i32 Y1, Y1, T1
IF_BE(` vst1.8 {Y0,Y1}, [DST]!')
vadd.i32 Y2, Y2, T2
vstmia DST!, {Y0,Y1,Y2,Y3}
IF_BE(` vst1.8 {Y2,Y3}, [DST]!')
IF_LE(` vstmia DST!, {Y0,Y1,Y2,Y3}')
vadd.i32 Z0, Z0, T0
vadd.i32 Z1, Z1, T1
IF_BE(` vst1.8 {Z0,Z1}, [DST]!')
vadd.i32 Z2, Z2, T2
vpop {q4,q5,q6,q7}
vstm DST, {Z0,Z1,Z2,Z3}
IF_BE(` vst1.8 {Z2,Z3}, [DST]')
IF_LE(` vstm DST, {Z0,Z1,Z2,Z3}')
bx lr
EPILOGUE(_nettle_chacha_3core)
PROLOGUE(_nettle_chacha_3core32)
vldm SRC, {X0,X1,X2,X3}
add SRCp32, SRC, #32
vld1.32 {X0,X1}, [SRC]
vld1.32 {X2,X3}, [SRCp32]
vpush {q4,q5,q6,q7}
adr r12, .Lcount1
vld1.64 {Z3}, [r12]
vld1.32 {Z3}, [r12]
vadd.i32 Y3, X3, Z3 C Increment 32-bit counter
vadd.i32 Z3, Y3, Z3
......
C arm/neon/chacha-core-internal.asm
ifelse(`
Copyright (C) 2013, 2015 Niels Möller
This file is part of GNU Nettle.
GNU Nettle is free software: you can redistribute it and/or
modify it under the terms of either:
* the GNU Lesser General Public License as published by the Free
Software Foundation; either version 3 of the License, or (at your
option) any later version.
or
* the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your
option) any later version.
or both in parallel, as here.
GNU Nettle is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received copies of the GNU General Public License and
the GNU Lesser General Public License along with this program. If
not, see http://www.gnu.org/licenses/.
')
.file "chacha-core-internal.asm"
.fpu neon
define(`DST', `r0')
define(`SRC', `r1')
define(`ROUNDS', `r2')
define(`X0', `q0')
define(`X1', `q1')
define(`X2', `q2')
define(`X3', `q3')
define(`T0', `q8')
define(`S0', `q12')
define(`S1', `q13')
define(`S2', `q14')
define(`S3', `q15')
define(`QROUND', `
C x0 += x1, x3 ^= x0, x3 lrot 16
C x2 += x3, x1 ^= x2, x1 lrot 12
C x0 += x1, x3 ^= x0, x3 lrot 8
C x2 += x3, x1 ^= x2, x1 lrot 7
vadd.i32 $1, $1, $2
veor $4, $4, $1
vshl.i32 T0, $4, #16
vshr.u32 $4, $4, #16
veor $4, $4, T0
vadd.i32 $3, $3, $4
veor $2, $2, $3
vshl.i32 T0, $2, #12
vshr.u32 $2, $2, #20
veor $2, $2, T0
vadd.i32 $1, $1, $2
veor $4, $4, $1
vshl.i32 T0, $4, #8
vshr.u32 $4, $4, #24
veor $4, $4, T0
vadd.i32 $3, $3, $4
veor $2, $2, $3
vshl.i32 T0, $2, #7
vshr.u32 $2, $2, #25
veor $2, $2, T0
')
.text
.align 4
C _chacha_core(uint32_t *dst, const uint32_t *src, unsigned rounds)
PROLOGUE(_nettle_chacha_core)
vldm SRC, {X0,X1,X2,X3}
vmov S0, X0
vmov S1, X1
vmov S2, X2
vmov S3, X3
C Input rows little-endian:
C 0 1 2 3 X0
C 4 5 6 7 X1
C 8 9 10 11 X2
C 12 13 14 15 X3
C Input rows big-endian:
C 1 0 3 2 X0
C 5 4 7 6 X1
C 9 8 11 10 X2
C 13 12 15 14 X3
C even and odd columns switched because
C vldm loads consecutive doublewords and
C switches words inside them to make them BE
.Loop:
QROUND(X0, X1, X2, X3)
C In little-endian rotate rows, to get
C 0 1 2 3
C 5 6 7 4 >>> 3
C 10 11 8 9 >>> 2
C 15 12 13 14 >>> 1
C In big-endian rotate rows, to get
C 1 0 3 2
C 6 5 4 7 >>> 1
C 11 10 9 8 >>> 2
C 12 15 14 13 >>> 3
C different number of elements needs to be
C extracted on BE because of different column order
IF_LE(` vext.32 X1, X1, X1, #1')
IF_BE(` vext.32 X1, X1, X1, #3')
vext.32 X2, X2, X2, #2
IF_LE(` vext.32 X3, X3, X3, #3')
IF_BE(` vext.32 X3, X3, X3, #1')
QROUND(X0, X1, X2, X3)
subs ROUNDS, ROUNDS, #2
C Inverse rotation
IF_LE(` vext.32 X1, X1, X1, #3')
IF_BE(` vext.32 X1, X1, X1, #1')
vext.32 X2, X2, X2, #2
IF_LE(` vext.32 X3, X3, X3, #1')
IF_BE(` vext.32 X3, X3, X3, #3')
bhi .Loop
vadd.u32 X0, X0, S0
vadd.u32 X1, X1, S1
vadd.u32 X2, X2, S2
vadd.u32 X3, X3, S3
C caller expects result little-endian
IF_BE(` vrev32.u8 X0, X0
vrev32.u8 X1, X1
vrev32.u8 X2, X2
vrev32.u8 X3, X3')
vstm DST, {X0,X1,X2,X3}
bx lr
EPILOGUE(_nettle_chacha_core)
divert(-1)
define chachastate
p/x $q0.u32
p/x $q1.u32
p/x $q2.u32
p/x $q3.u32
end
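For readers less familiar with NEON, the QROUND macro in the deleted file above corresponds to the following scalar C sketch, following the step comments at the top of the macro. The function name chacha_qround is hypothetical. The assembly applies the same steps to four columns at once, one 32-bit lane per column, and builds each rotation from a vshl, a vshr and a veor. The sketch handles a single column.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits, 0 < n < 32. */
#define ROTL32(n, x) (((x) << (n)) | ((x) >> (32 - (n))))

/* Hypothetical scalar counterpart of QROUND for one column:
   x0 += x1, x3 ^= x0, x3 lrot 16
   x2 += x3, x1 ^= x2, x1 lrot 12
   x0 += x1, x3 ^= x0, x3 lrot 8
   x2 += x3, x1 ^= x2, x1 lrot 7 */
static void
chacha_qround(uint32_t x[4])
{
  x[0] += x[1]; x[3] ^= x[0]; x[3] = ROTL32(16, x[3]);
  x[2] += x[3]; x[1] ^= x[2]; x[1] = ROTL32(12, x[1]);
  x[0] += x[1]; x[3] ^= x[0]; x[3] = ROTL32(8, x[3]);
  x[2] += x[3]; x[1] ^= x[2]; x[1] = ROTL32(7, x[1]);
}
```

The quarter round test vector in RFC 8439, section 2.1.1, can be used to check a sketch like this.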
......@@ -36,6 +36,7 @@ ifelse(`
define(`DST', `r0')
define(`SRC', `r1')
define(`ROUNDS', `r2')
define(`SRCp32', `r3')
C State, even elements in X, odd elements in Y
define(`X0', `q0')
......@@ -58,11 +59,14 @@ define(`T3', `q15')
C _salsa20_2core(uint32_t *dst, const uint32_t *src, unsigned rounds)
PROLOGUE(_nettle_salsa20_2core)
vldm SRC, {X0,X1,X2,X3}
C loads using vld1.32 to be endianness-neutral wrt consecutive 32-bit words
add SRCp32, SRC, #32
vld1.32 {X0,X1}, [SRC]
vld1.32 {X2,X3}, [SRCp32]
adr r12, .Lcount1
vmov Y3, X0
vld1.64 {Y1}, [r12]
vld1.32 {Y1}, [r12]
vmov Y0, X1
vadd.i64 Y1, Y1, X2 C Increment counter
vmov Y2, X3
......@@ -180,7 +184,8 @@ C Inverse swaps and transpositions
vswp D1REG(Y0), D1REG(Y2)
vswp D1REG(Y1), D1REG(Y3)
vldm SRC, {T0,T1,T2,T3}
vld1.32 {T0,T1}, [SRC]
vld1.32 {T2,T3}, [SRCp32]
vtrn.32 X0, Y3
vtrn.32 X1, Y0
......@@ -190,17 +195,26 @@ C Inverse swaps and transpositions
C Add in the original context
vadd.i32 X0, X0, T0
vadd.i32 X1, X1, T1
C vst1.8 because caller expects results little-endian
C interleave loads, calculations and stores to save cycles on stores
C use vstm when little-endian for some additional speedup
IF_BE(` vst1.8 {X0,X1}, [DST]!')
vadd.i32 X2, X2, T2
vadd.i32 X3, X3, T3
IF_BE(` vst1.8 {X2,X3}, [DST]!')
IF_LE(` vstmia DST!, {X0,X1,X2,X3}')
vstmia DST!, {X0,X1,X2,X3}
vld1.64 {X0}, [r12]
vld1.32 {X0}, [r12]
vadd.i32 T0, T0, Y3
vadd.i64 T2, T2, X0
vadd.i32 T1, T1, Y0
IF_BE(` vst1.8 {T0,T1}, [DST]!')
vadd.i32 T2, T2, Y1
vadd.i32 T3, T3, Y2
vstm DST, {T0,T1,T2,T3}
IF_BE(` vst1.8 {T2,T3}, [DST]')
IF_LE(` vstm DST, {T0,T1,T2,T3}')
bx lr
EPILOGUE(_nettle_salsa20_2core)