Discussion:
[RFC PATCH] enable configurable SECMEM_BUFFER_SIZE [WAS Re: Please Consider Increasing SECMEM_BUFFER_SIZE To 1048576]
Matthew Summers
2017-10-21 04:20:07 UTC
Permalink
Please see the attached patch, I'll also include it in the body since
it's very short.

What this patch does is change configure.ac to allow the end user to choose
a value for secmem, or accept one of the two existing values.
Following `autoreconf` on STABLE-BRANCH-2-2 with this patch applied,
the following options are available:
./configure --enable-large-secmem # uses the default 64k
./configure --enable-large-secmem=no # uses the 32k value
./configure --enable-large-secmem=SIZE # accepts any number >= 32k
(an upper bound might be useful too)

I attempted to follow the style and structure used for
PK_UID_CACHE_SIZE via the --enable-key-cache option.

If you prefer, I can make a pull request on GitHub.



From 5fd1e60f65048b3d0ccfd309f8c195e70d9fd027 Mon Sep 17 00:00:00 2001
From: Matthew Summers <***@syapse.com>
Date: Fri, 20 Oct 2017 23:14:51 -0500
Subject: [RFC PATCH] enable configurable SECMEM_BUFFER_SIZE

---
configure.ac | 30 ++++++++++++++++++++----------
1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/configure.ac b/configure.ac
index 70c122615..46cfd855a 100644
--- a/configure.ac
+++ b/configure.ac
@@ -236,20 +236,30 @@ AC_ARG_ENABLE(selinux-support,
 AC_MSG_RESULT($selinux_support)
 
 
-AC_MSG_CHECKING([whether to allocate extra secure memory])
+#
+# Check for size of secure memory allocation. If on Linux and if > 65536, it may be
+# necessary to increase max locked memory, ulimit -l to accommodate.
+#
+AC_MSG_CHECKING([for the size of the secure memory allocation])
 AC_ARG_ENABLE(large-secmem,
-    AC_HELP_STRING([--enable-large-secmem],
-                   [allocate extra secure memory]),
-    large_secmem=$enableval, large_secmem=no)
-AC_MSG_RESULT($large_secmem)
-if test "$large_secmem" = yes ; then
-  SECMEM_BUFFER_SIZE=65536
-else
-  SECMEM_BUFFER_SIZE=32768
+    AC_HELP_STRING([--enable-large-secmem=SIZE],
+                   [set secure memory allocation to SIZE (default 65536)]),,enableval=65536)
+if test "$enableval" = "no"; then
+  enableval=32768
+elif test "$enableval" = "yes" || test "$enableval" = ""; then
+  enableval=65536
 fi
-AC_DEFINE_UNQUOTED(SECMEM_BUFFER_SIZE,$SECMEM_BUFFER_SIZE,
+changequote(,)dnl
+secmem_buffer_size=`echo "$enableval" | sed 's/[A-Za-z]//g'`
+changequote([,])dnl
+if test "$enableval" != "$secmem_buffer_size" || test "$secmem_buffer_size" -lt 32768; then
+  AC_MSG_ERROR([invalid large-secmem buffersize])
+fi
+AC_MSG_RESULT($secmem_buffer_size)
+AC_DEFINE_UNQUOTED(SECMEM_BUFFER_SIZE,$secmem_buffer_size,
                    [Size of secure memory buffer])
 
+
 AC_MSG_CHECKING([whether to enable trust models])
 AC_ARG_ENABLE(trust-models,
     AC_HELP_STRING([--disable-trust-models],
--
2.13.6

On Fri, Oct 20, 2017 at 4:42 PM, Matthew Summers
Hello,
This is following up on an issue that started around in the 2.1.17
days, and I have been carrying a small patch addressing this for my
team since we figured it out. This is all documented in an earlier set
of threads to this ML.
Would you consider increasing SECMEM_BUFFER_SIZE to 1048576 in the
case where the configure option `large_secmem` = yes, please?
The patch itself is super trivial, in both configure and
configure.ac.
You haven't explicitly referenced the rationale for this. Is this about
RSA keys in excess of 4096 bits?
Yes, we are using 4096-bit RSA keys, and we generally recommend/require that internally at this stage in the game.
Here is the start of the 1st thread: https://lists.gnutls.org/pipermail/gnupg-devel/2017-January/032486.html
Here is the start of the 2nd thread: https://lists.gnupg.org/pipermail/gnupg-devel/2017-June/032894.html
I would like to point out that this creates an issue on systems with a
low default max-locked-memory limit (ulimit -l):
gpg: Warning: using insecure memory!
mlock(0x7f5814308000, 262144) = -1 ENOMEM (Cannot allocate memory)
Thus private key material is exposed to being swapped out to permanent
storage. I am not sure a mere warning is sufficient here.
I have been adjusting this ulimit as a matter of course, so it's not been an issue. It appears that the 64k 'default' max locked memory and/or max memory size may vary from distro to distro as well.
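To make the ulimit adjustment above concrete, here is a small sketch; the limits.conf user name and values are illustrative examples, not recommendations:

```shell
# Show the current max-locked-memory limit for this shell
# (reported in KiB, or "unlimited").
ulimit -l
# To raise it persistently on a typical Linux system, entries like these
# (example values) can be added to /etc/security/limits.conf:
#   someuser  soft  memlock  2048
#   someuser  hard  memlock  2048
```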
I would like to stress that this is not a theoretical problem: were we not able to patch SECMEM_BUFFER_SIZE, we would not be able to use GnuPG to protect sensitive data as we currently do (via SaltStack).
What about an extra arg in configure.ac that would let us either specify the SECMEM_BUFFER_SIZE or set it with `--extra-large-secmem`? I'd be happy to provide a patch for that, of course.
Thanks,
Matt Summers
Rainer Perske
2017-11-16 12:36:14 UTC
Permalink
Hello

Usually you absolutely do not want to place any private data (keyrings,
sockets) on a network drive. But there are exceptions when it comes to
clustering for fail safety and the complete system (including network
components) is under your full control.

I have this situation: The user home directory of my webmailer is
located on a network file system so it can be accessed from all nodes
in the cluster.

common/homedir.c places the socket for the agent communication into the
same directory. But multiple nodes cannot share the same socket file;
this causes curious problems.

So the socket files must be node-specific, either by placing them into
a non-shared directory or by using node-specific files, so that each
node can run its own gpg-agent for a user.

For this reason, I have patched common/homedir.c to use a
nodename-specific subdirectory of the user directory for the sockets;
see below. I have been using this patch for a long time in our
production environment.

I'd like to propose incorporating this patch into GnuPG. It will
change the default location of the socket files to a subdirectory of
the previous location, but I cannot see how it could hurt, except
that you may need to restart running agents when installing this
patch.

Signed-off-by: Rainer Perske <***@uni-muenster.de>

diff -ur gnupg-2.2.2/common/homedir.c gnupg-2.2.2rp/common/homedir.c
--- gnupg-2.2.2/common/homedir.c 2000-01-01 00:00:00.000000000 +0000
+++ gnupg-2.2.2rp/common/homedir.c 2000-01-01 00:00:00.000000000 +0000
@@ -57,7 +57,9 @@
 #include <sys/stat.h> /* for stat() */
 #endif
 
-
+#ifndef HAVE_W32_SYSTEM
+#include <sys/utsname.h>
+#endif
 
 #include "util.h"
 #include "sysutils.h"
@@ -547,6 +549,9 @@
   char prefix[13 + 1 + 20 + 6 + 1];
   const char *s;
   char *name = NULL;
+#ifndef HAVE_W32_SYSTEM
+  struct utsname utsbuf;
+#endif
 
   *r_info = 0;
 
@@ -694,6 +699,21 @@
   name = xstrdup (prefix);
 
  leave:
+#ifndef HAVE_W32_SYSTEM
+  /* try hostname specific subdirectory of gnupg_homedir */
+  if (!name && !uname (&utsbuf) && utsbuf.nodename && !strchr (utsbuf.nodename, '/'))
+    {
+      name = xmalloc (strlen (gnupg_homedir ()) + 7 + strlen (utsbuf.nodename) + 1);
+      strcpy (name, gnupg_homedir ());
+      strcat (name, "/S.dir.");
+      strcat (name, utsbuf.nodename);
+      if (-1 == gnupg_mkdir (name, "-rwx") && errno != EEXIST)
+        {
+          xfree (name);
+          name = NULL;
+        }
+    }
+#endif
   /* If nothing works fall back to the homedir. */
   if (!name)
     {

Thank you very much for thinking about it.

Best regards
--
Rainer Perske
System operations dept. and director of the certification authority (WWUCA)
Center for Information Processing (university computer center)

Westfälische Wilhelms-Universität
Zentrum für Informationsverarbeitung
Rainer Perske
Röntgenstraße 7-13
48149 Münster
Germany

phone: +49 251 83-31582
fax: +49 251 83-31555
e-mail: ***@uni-muenster.de
WWW: https://www.uni-muenster.de/ZIV/Mitarbeiter/RainerPerske.shtml
office: room 006, Röntgenstraße 11
site map: http://wwwuv2.uni-muenster.de/uniplan/?action=spot&gebnr=7474

Certification Authority of the University of Münster (WWUCA):
phone: +49 251 83-31590
fax: +49 251 83-31555
e-mail: ***@uni-muenster.de
WWW: https://www.uni-muenster.de/WWUCA/

Center for Information Processing:
phone: +49 251 83-31600 (Mon-Fri 7:30-17:30)
fax: +49 251 83-31555
e-mail: ***@uni-muenster.de
WWW: https://www.uni-muenster.de/ZIV/
Werner Koch
2017-11-16 14:09:23 UTC
Permalink
Post by Rainer Perske
So the socket files must be node-specific, either by placing them into
a non-shared directory or by using node-specific files, so that each
node can run its own gpg-agent for a user.
Actually the default in 2.1 is to use a non-shared socket directory.
From the README

** Socket directory

GnuPG uses Unix domain sockets to connect its components (on Windows
an emulation of these sockets is used). Depending on the type of
the file system, it is sometimes not possible to use the GnuPG home
directory (i.e. ~/.gnupg) as the location for the sockets. To solve
this problem GnuPG prefers the use of a per-user directory below
the /run (or /var/run) hierarchy for the sockets. It is thus
suggested to create per-user directories on system or session
startup. For example the following snippet can be used in
/etc/rc.local to create these directories:

  [ ! -d /run/user ] && mkdir /run/user
  awk -F: </etc/passwd '$3 >= 1000 && $3 < 65000 {print $3}' \
    | ( while read uid rest; do
          if [ ! -d "/run/user/$uid" ]; then
            mkdir /run/user/$uid
            chown $uid /run/user/$uid
            chmod 700 /run/user/$uid
          fi
        done )

Depending on the system it is /var/run. You can use

gpgconf --list-dirs socketdir

to check whether GnuPG is actually using it. To check for problems you
may explicitly create the directory (which gpg does on the fly) using

gpgconf --verbose --create-socketdir

this uses the same code as gpg and --verbose (or --dry-run) prints
warnings if the permissions are not as expected.


Salam-Shalom,

Werner
--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.
Rainer Perske
2017-11-16 16:31:01 UTC
Permalink
Hello
Post by Werner Koch
Actually the default in 2.1 is to use a non-shared socket directory.
Thank you very much!

Unfortunately I cannot use /run/user/(userid) because it is maintained
by systemd and in my webmailer situation it can be deleted even if the
agent is still running. (systemd does not know anything about the
sessions of a webmailer.)

But your objection brings me a much better and simpler idea:

Line 544 of common/homedir.c contains:

static const char * const bases[] = { "/run", "/var/run", NULL};

Could you please prepend a configurable directory to this list?
I propose this very simple change:

static const char * const bases[] = {
GNUPG_LOCALSTATEDIR"/run",
"/run", "/var/run", NULL};

Then I can simply use your mechanism.

Best regards
--
Rainer Perske
Werner Koch
2017-11-17 07:55:07 UTC
Permalink
Post by Rainer Perske
Could you please prepend a configurable directory to this list?
static const char * const bases[] = {
GNUPG_LOCALSTATEDIR"/run",
"/run", "/var/run", NULL};
Thus for building without any options this would be

"/usr/local/var/run",
"/run",
"/var/run"

and for a distro build (--prefix=/) this would be

"/var/run",
"/run",
"/var/run"

so that we need to detect this case to get the order right again.

I assume you are not using a distro-supplied gnupg; in this case a
dedicated configure option for the first socketdir to try would be
easier because it does not break existing usages. A runtime option
can't be used here, except if we took the list of socketdirs to
try from a global config file (e.g. /etc/gnupg/socketdirs.list).


Shalom-Salam,

Werner
--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.
Daniel Kahn Gillmor
2017-11-17 23:28:01 UTC
Permalink
Post by Rainer Perske
Unfortunately I cannot use /run/user/(userid) because it is maintained
by systemd and in my webmailer situation it can be deleted even if the
agent is still running. (systemd does not know anything about the
sessions of a webmailer.)
I'm not convinced this response makes much sense. Why *would* the
system's service manager (systemd, in your case) be unaware of webmailer
sessions?

What is your webmail configuration doing that it is switching to a new
user session, but deliberately avoiding registering that user session
with the local system service manager?

Fixing this problem at the GnuPG layer seems like an odd approach.

--dkg
Rainer Perske
2017-11-17 19:03:12 UTC
Permalink
Hello,

[common/homedir.c]
Post by Rainer Perske
static const char * const bases[] = {
GNUPG_LOCALSTATEDIR"/run",
"/run", "/var/run", NULL};
[...] and for a distro build (--prefix=/) this would be
"/var/run",
"/run",
"/var/run"
so that we need to detect this case to get the order right again.
OK, I had not seen this edge case. What about this? (line 544)

static const char * const bases[] = {"/run/gnupg", "/var/run/gnupg", "/run", "/var/run", NULL};

By the way, we must not forget to enlarge char prefix[] accordingly: (line 547)

char prefix[6 + 13 + 1 + 20 + 6 + 1];
I assume you are not using a a distro supplied gnupg; in this case a
dedicated configure option for the first socketdir to try would be
easier because it does not break existing usages.
Absolutely correct. However, inventing a new configure option would
make my change request much larger and I always try to make such
changes or change requests as small as possible hoping to avoid
breaking something by accident.
A runtime option
can't be used here, except if we would take the list of socketdirs to
try from a global config files (e.g. /etc/gnupg/socketdirs.list).
I agree completely, but that would be even more work that IMHO is
probably not justified.

Have a fine weekend!
--
Rainer Perske
Rainer Perske
2017-11-19 13:59:27 UTC
Permalink
Hello,
Post by Daniel Kahn Gillmor
Post by Rainer Perske
Unfortunately I cannot use /run/user/(userid) because it is
maintained by systemd and in my webmailer situation it can be
deleted even if the agent is still running. (systemd does not know
anything about the sessions of a webmailer.)
I'm not convinced this response makes much sense. Why *would* the
system's service manager (systemd, in your case) be unaware of
webmailer sessions?
What is your webmail configuration doing that it is switching to a
new user session, but deliberately avoiding registering that user
session with the local system service manager?
Because systemd manages processes on a *single* host. I have servers
clustered and distributed over two locations for fail safety and load
distribution and a webmailer session is valid on all cluster hosts.
systemd (more exactly: "systemd-logind") cannot be used to manage
cluster-wide sessions.

So I have to fight with the fact that my webmailer is running in a
cluster, but GnuPG and its agent are not because they use
localhost-only sockets for interprocess communication and never were
designed to be used in a cluster environment.

I want to use GnuPG because it is the best software for this purpose so
my webmailer gives GnuPG an environment it is happy with.

Some more background information:

I do not need GnuPG sessions at all. If I could call gpgsm in a way
that no gpg-agent or dirmngr process and no socket file would survive
this call, this would be slower but I could live with it.

But GnuPG is now built in a way that socket files are always created
and that gpg-agent and dirmngr are always started the first time they
are needed. (You definitely have very good reasons to do so; avoiding
long startup times is one of them.) So I have to live with these files
and processes, for nearly 100,000 possible users.

Because the processes are running on single hosts, the socket files
must be placed on host-local file systems. Otherwise processes on other
hosts see the socket files but do not see the agents. Fall-back
location of the socket file is the user's home directory. In my
situation, this is a cluster-wide file system. And so I got into
trouble. This is the main cause for my patch and proposal.

To solve the problems, I must make sure that GnuPG places the socket
files on host-local file systems. My patch and proposal have this single
aim: place the socket on a host-local file system but do not allow
cluster-unaware managers like "systemd-logind" to bother with them. So
I cannot use /run/user/ or /var/run/user/, which are managed by
systemd-logind.

A general solution would be to make these directories configurable. I
do not dare to ask you to develop such a general solution.

A simple solution would be to prepend GnuPG-specific host-local
directories not managed by systemd-logind to the list of directories.
Hence my proposal. According to the Linux File System Standard,
/var/run/gnupg/ (or /run/gnupg/ on those systems using /run/ ) seems to
be the best place in my eyes.

So my proposal (prepend /run/gnupg and /var/run/gnupg to /run and
/var/run ) would solve my problem. (My webmailer can make sure that
/run/gnupg/user/<UID> exists and has the correct owner, group, and
permissions before calling gpgsm. And my cluster-aware session
management can clean these directories.)

(I know that my solution can cause multiple agents running for the same
user on different hosts concurrently. But as far as I can see you are
using proper file locking so this does not cause any problem. At least
in the last 3 years my patch (see first mail of this thread) has not
caused any problem.)

Best regards
--
Rainer Perske
Werner Koch
2017-12-12 08:45:14 UTC
Permalink
Hi,

I took your suggestion and gnupg 2.2.4 (to be released next week) will
support a new build option:

commit 17efcd2a2acdc3b7f00711272aa51e5be2476921
Author: Werner Koch <***@gnupg.org>
Date: Tue Dec 12 09:42:43 2017 +0100

build: New configure option --enable-run-gnupg-user-socket.

* configure.ac: (USE_RUN_GNUPG_USER_SOCKET): New ac_define.
* common/homedir.c (_gnupg_socketdir_internal): Add extra directories.
--

This allows to build GnuPG with an extra socketdir below /run. See
https://lists.gnupg.org/pipermail/gnupg-devel/2017-November/033250.html
for a longer explanation why this is sometimes useful.

Suggested-by: Rainer Perske
Signed-off-by: Werner Koch <***@gnupg.org>

You may use

gpgconf -v --dry-run --create-socketdir

to check permissions etc.


Shalom-Salam,

Werner
--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.
Matthew Summers
2017-11-21 18:12:56 UTC
Permalink
Hello!

What can I do to get this into good enough shape to be included in a
release? I can sign the contrib agreement. I can submit this under my
@gentoo.org dev ID if that is better than the corporate ID. This is a
severe issue for us that seems pretty easy to remedy with a patch like
this.

Thanks,
Matt
Werner Koch
2017-11-22 08:47:10 UTC
Permalink
Hi!
Post by Matthew Summers
id. This is a severe issue for us that seems pretty easy to remedy
with a patch like this.
Can you please explain the problem you are trying to solve?

For most crypto operations which temporarily require allocation of
extra chunks of data (big integer computation), the secure memory area
is enlarged on the fly.

The gpg-agent stores secret keys in a cache which is allocated in
standard memory. This can be done because the keys are stored encrypted
in this cache (using a per process ephemeral key).

Any problems creating or using more keys is either due to a memory leak
in gpg-agent or because you have a lot of active connections to the
gpg-agent using private keys.

For that latter case there are two solutions:

1. Enlarging the secure memory area. This has the problem that you
will never know how much is sufficient.

2. Enlarge the secure memory area on the fly as we do it with some of
the Libgcrypt internal mallocs (actually all xmalloc style
allocations). That would require an update to Libgcrypt and a
runtime option to gpg-agent to enable this.


Shalom-Salam,

Werner
--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.
Matthew Summers
2017-11-22 15:23:48 UTC
Permalink
Post by Werner Koch
Hi!
Post by Matthew Summers
id. This is a severe issue for us that seems pretty easy to remedy
with a patch like this.
Can you please explain the problem you are trying to solve?
Any problem creating or using more keys is either due to a memory leak
in gpg-agent or because you have a lot of active connections to the
gpg-agent using private keys.
We have a lot of active connections to the gpg-agent using a single
private key. I outlined our use case in more detail here [1].
Post by Werner Koch
1. Enlarging the secure memory area. This has the problem that you
will never know how much is sufficient.
We are enlarging the secmem_buffer using the patch attached
previously. We are determining the size using a simple empirical test
we constructed based on our known use case and concurrency
requirements. This is a fairly standard tuning process we have to go
through now so we are able to continue using GnuPG (we love it). This
test is outlined here [2].

This is the reason why we would love to see the secmem_buffer size
configurable, per the patch. It is perhaps notable that we needed to
go from ~1 MB to 2 MB when moving from 2.1.23 to 2.2.0.

Thanks for taking the time to reply and investigate this issue. I know
how busy maintaining OSS can be, and we want you all to know how much
we appreciate your efforts all these years. Many, many thanks.

Cheers!
Matt Summers

aka quantumsummers at gentoo dot oh arrrrr geeeee

[1] https://lists.gnupg.org/pipermail/gnupg-devel/2017-June/032913.html
[2] https://lists.gnutls.org/pipermail/gnupg-devel/2017-November/033227.html
Werner Koch
2017-11-22 19:46:10 UTC
Permalink
Post by Matthew Summers
We have a lot of active connections to the gpg-agent using a single
private key. I outlined our use case in more detail here [1].
Frankly I followed the discussion in June only briefly. Thanks for
pointing me again to this.
Post by Matthew Summers
We are enlarging the secmem_buffer using the patch attached
previously. We are determining the size using a simple empirical test
I would accept this as a quick workaround but we need a better solution.
Post by Matthew Summers
configurable, per the patch. It is perhaps notable that we needed to
go from ~1mb to 2mb when moving from 2.1.23 to 2.2.0.
I just checked the commits between these releases and I can't see
anything which should affect the allocation pattern. This only shows
how fragile the fixed memory approach is.
Post by Matthew Summers
very fast. With fis-gtm, we spawn so many processes that exhaust
mlock()ed space frequently enough for this to be a problem for us.
Okay, that is the same pattern.
Post by Matthew Summers
[amul] "secure" memory allocations only use the first mlock()ed memory area. Is
this what you mean by "standard memory"?
I call this all secure memory. The first chunk is mlock()ed but if we
need to allocate more chunks to satisfy memory requests from Libgcrypt
which do not expect to fail [1], we can't use mlock anymore. There are
two other properties of secmem:

- A free() wipes out the formerly allocated memory
- Some crypto code in Libgcrypt enables slower but less side-channel
leaking algorithms if the material is stored in secmem.

If possible gpg-agent also does an extra wipememory before a free to
protect against not well-behaved external memory allocators used with
Libgcrypt.

As dkg already mentioned, the protection against swapping out sensitive
data can nowadays be better achieved by using encrypted swap. Along
with there being no way to protect against suspend/resume, I view the
mlock more as a historical feature.
Post by Matthew Summers
[amul] When you say that the agent can cache a secret key, does that mean
subsequent requests for the same key will be serviced from cache?
Right. However, each running session needs a copy of the key because we
don't have reference counting. This is the problem of too many
concurrent sessions (more exactly: decrypt or sign commands).
Post by Matthew Summers
[amul] In our case, we have many active connections to the gpg-agent (via
libgpgme and gpg) as processes decrypt the secret key that encrypts the
Well, adding a limit to GPGME so that it will block in case of too many
concurrent gpg operations should be easy to implement. But that does
not solve the problem of having several gpgme-using processes running
concurrently. So let's forget this.
Post by Matthew Summers
[amul] I agree with your assessment of option #1. I think, however, that there
is a third option. In its current form, the gpg-agent accepts every request and
spawns a thread to handle it. There is no limit on the rate at which it accepts
connections. Other than crudely limiting the thread count, I don't know of a
good way to slew the agent. Do you have any suggestions? Additionally, I have
not seen any timeouts in gpg/libassuan when communicating with the gpg-agent.
Limiting the number of concurrent connections to gpg-agent is a useful
feature but it requires more thought and is unlikely to make it into 2.2
anytime soon. I opened a ticket at https://dev.gnupg.org/T3529 .
Post by Matthew Summers
[amul] If I were to prepare a patch for option #2, assuming it passes a review and
conforms to the coding standards, would it be acceptable?
We should try this. Let me implement something for testing.
https://dev.gnupg.org/T3530


Shalom-Salam,

Werner


[1] Error checking all gcry_mpi_foo functions would make the code
unreadable and require complex and hard-to-test error cleanup.

--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.
Shah, Amul
2017-11-23 09:43:31 UTC
Permalink
-----Original Message-----
Post by Matthew Summers
We have a lot of active connections to the gpg-agent using a single
private key. I outlined our use case in more detail here [1].
Frankly I followed the discussion in June only briefly. Thanks for pointing
me again to this.
Post by Matthew Summers
We are enlarging the secmem_buffer using the patch attached
previously. We are determining the size using a simple empirical test
I would accept this as a quick workaround but we need a better solution.
Post by Matthew Summers
configurable, per the patch. It is perhaps notable that we needed to
go from ~1mb to 2mb when moving from 2.1.23 to 2.2.0.
I just checked the commits between these releases and I can't see anything
which should affect the allocation pattern. This only shows how fragile the
fixed memory approach is.
As dkg already mentioned, the protection against swapping out sensitive data
can nowadays be better achieved by using encrypted swap. Along with no way to
protect against suspend/resume, I view the mlock more as a historical feature.
[amul:2] I see two issues with how libgcrypt treats xmalloc style allocations and
secure malloc style allocations.

[amul:2] Secure malloc style allocations are currently limited to the mainpool,
but xmalloc allocations can use the overflow pool(s). This means that every
xmalloc allocation consumes the limited main pool. Once the mainpool is
exhausted, xmalloc allocations continue, but secure mallocs stop.

[amul:2] An ugly solution would move lines 610-615 below line 666. This would
have the negative effect of allocating two memory pool sized chunks which could
cause problems on memory strapped systems.

576 _gcry_secmem_malloc_internal (size_t size, int xhint)
...
610   mb = mb_get_new (pool, (memblock_t *) pool->mem, size);
611   if (mb)
612     {
613       stats_update (pool, mb->size, 0);
614       return &mb->aligned.c;
615     }
616
617   /* If we are called from xmalloc style function resort to the
618    * overflow pools to return memory. We don't do this in FIPS mode,
619    * though. */
620   if (xhint && !fips_mode ())
...
666     }
667
668   return NULL;
669 }

[amul:2] Currently, when the secure malloc fails, libgcrypt reports an ENOMEM.
That is not accurate as ENOMEM would imply that the process cannot allocate
more memory. The problem is that there is no secure memory available to service
the request. libgcrypt should report a different error.

[amul:2] If we implement option 2 below, these points are moot.
Post by Matthew Summers
[amul] I agree with your assessment of option #1. I think, however,
that there is a third option. In its current form, the gpg-agent
accepts every request and spawns a thread to handle it. There is no
limit on the rate at which it accepts connections. Other than crudely
limiting the thread count, I don't know of a good way to slew the
agent. Do you have any suggestions? Additionally, I have not seen any
timeouts in gpg/libassuan when communicating with the gpg-agent.
Limiting the number of concurrent connections to gpg-agent is a useful feature
but it requires more thought and is unlikely to make it into 2.2 anytime soon.
I opened a ticket at https://dev.gnupg.org/T3529 .
[amul:2] Thank you for creating the bug. Given the trouble that a buggy agent
can cause, I concur!
Post by Matthew Summers
[amul] If I were to prepare a patch for option #2, assuming it passes
a review and conforms to the coding standards, would it be acceptable?
We should try this. Let me implement something for testing.
https://dev.gnupg.org/T3530
[amul:2] I updated the bug with the test script that I used to expose the problem.

[amul:2] While I have you reading my mail, let me thank you and the others
involved in producing this useful suite of tools. I appreciate your time and
effort.
Werner Koch
2017-11-24 09:59:57 UTC
Permalink
Post by Shah, Amul
but xmalloc allocations can use the overflow pool(s). This means that every
xmalloc allocation consumes the limited main pool. Once the mainpool is
exhausted, xmalloc allocations continue, but secure mallocs stop.
To clarify: that is xmalloc_secure. I meanwhile implemented a Libgcrypt
feature to allow expanding the secmem pool. It is also possible to
advise Libgcrypt on the size of the newly allocated pools. The latter
can be important because all calls to free need to check whether that
free affects the secmem pool - this is done by comparing the ranges
of all secmem pools - with many pools this obviously takes a little bit longer.

gpg-agent has a new option --auto-expand-secmem to enable this feature.
This is currently in the 2.2 branch but will soon be merged into master.
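Werner's new option can be tried out without touching a real keyring. In this sketch, the option name comes from this thread (available from GnuPG 2.2.4 on); the scratch GNUPGHOME layout and the 65536 pool-size hint are illustrative assumptions, not something stated in the mail:

```shell
# Create a throwaway GNUPGHOME so the real configuration stays untouched.
GNUPGHOME=$(mktemp -d)
export GNUPGHOME

# gpg-agent reads its options from gpg-agent.conf inside GNUPGHOME.
# The 65536 argument is an assumed pool-size hint for illustration.
printf 'auto-expand-secmem 65536\n' >> "$GNUPGHOME/gpg-agent.conf"

cat "$GNUPGHOME/gpg-agent.conf"
```

An agent started with this home directory would pick the option up on its next launch; consult the gpg-agent(1) manual of your installed version for the exact argument semantics.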
Post by Shah, Amul
[amul:2] Currently, when the secure malloc fails, libgcrypt reports an
ENOMEM. That is not accurate as ENOMEM would imply that the process
cannot allocate more memory. The problem is that there is no secure
memory available to service the request. libgcrypt should report a
different error.
ENOMEM does not mean it is not possible to allocate more memory. It
should always be viewed as a temporary error code. Right, a different
error code would be useful but has the disadvantage that all ENOMEM
handling code needs to be adjusted. With the auto-expand-secmem feature
any ENOMEM will anyway be a "real" ENOMEM.
Post by Shah, Amul
[amul:2] I updated the bug with the test script that I used to expose the problem.
Thanks. All has been pushed and a Libgcrypt 1.8.2 release can be done
soonish. GnuPG 2.2.4 needs to wait a few weeks.


Salam-Shalom,

Werner
--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.
Shah, Amul
2017-11-24 10:30:48 UTC
Permalink
Post by Werner Koch
Post by Shah, Amul
but xmalloc allocations can use the overflow pool(s). This means that
every xmalloc allocation consumes the limited main pool. Once the
mainpool is exhausted, xmalloc allocations continue, but secure mallocs stop.
To clarify: that is xmalloc_secure. I meanwhile implemented a Libgcrypt
feature to allow expanding the secmem pool. It is also possible to advise
Libgcrypt on the size of the newly allocated pools. The latter can be
important because all calls to free need to check whether that free affects
the secmem pool - this is done by comparing the ranges of all secmem pools -
with many pools this obviously takes a little bit longer.
gpg-agent has a new option --auto-expand-secmem to enable this feature.
This is currently in the 2.2 branch but will soon be merged into master.
[amul:3] Awesome! And thanks for the explanation.
Post by Werner Koch
ENOMEM does not mean it is not possible to allocate more memory. It should
always be viewed as a temporary error code. Right, a different error code
would be useful but has the disadvantage that all ENOMEM handling code needs
to be adjusted. With the auto-expand-secmem feature any ENOMEM will anyway be
a "real" ENOMEM.
[amul:3] I never considered ENOMEM as a transient error. Something new to think about.
Post by Werner Koch
Post by Shah, Amul
[amul:2] I updated the bug with the test script that I used to expose the problem.
Thanks. All has been pushed and a Libgcrypt 1.8.2 release can be done soonish.
GnuPG 2.2.4 needs to wait a few weeks.
[amul:3] DKG, What does one need to do to back-port these changes to stable?
File a bug, attach patches that apply cleanly to the target sources and request
the maintainer to add them?
Daniel Kahn Gillmor
2017-11-28 01:02:33 UTC
Permalink
Hi Amul --

thanks for your persistence on this, i'm glad to see new ideas and
approaches are being tried out. You're not the only person who has run
into concurrency problems with gpg-agent.
Post by Shah, Amul
Post by Werner Koch
Thanks. All has been pushed and a Libgcrypt 1.8.2 release can be done
soonish. GnuPG 2.2.4 needs to wait a few weeks.
[amul:3] DKG, What does one need to do to back-port these changes to stable?
File a bug, attach patches that apply cleanly to the target sources and request
the maintainer to add them?
Are you asking about debian stable, or the GnuPG stable branch? Since
you're asking me specifically, i assume you're asking about debian
stable (aka "stretch"). please ignore the rest of this if that isn't
what you meant ;)

At first glance, it looks like it would require patches to libgcrypt
itself, in addition to patches to GnuPG.

That would be something to coordinate for a point release perhaps, but
it could be complicated; there are several other bugfix improvements
that it would also be good to include into debian stable.

To make this stand out clearly, you probably want to start by opening a
bug report (in the debian BTS) against the gcrypt package, tagged
appropriately for the affected versions, explaining the problem,
pointing to the upstream reports and commits, and also mark it so that
it "affects" the gnupg2 source package.

make sense?

--dkg
Werner Koch
2017-11-28 06:59:32 UTC
Permalink
Post by Daniel Kahn Gillmor
At first glance, it looks like it would require patches to libgcrypt
itself, in addition to patches to GnuPG.
FWIW, changes in the repo since 1.8.1 are:

59df8d6 * sexp: Avoid a fatal error in case of ENOMEM in called functions.
f4582f8 * api: Add auto expand secmem feature
334e1a1 * tests: Add HAVE_MMAP check for MinGW.
da127f7 * Fix secmem test for machine with larger page.

The first one actually fixes a side-effect of the "out of secure memory"
state. I can do a Libgcrypt 1.8.2 at any time. Just let me know.

Backporting the required change for GnuPG to any 2.1 version will be
trivial.

Shalom-Salam,

Werner
--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.
Shah, Amul
2017-11-28 11:14:21 UTC
Permalink
thanks for your persistence on this, i'm glad to see new ideas and approaches
are being tried out. You're not the only person who has run into concurrency
problems with gpg-agent.
[amul:4] Thanks for the kind words.
Post by Shah, Amul
[amul:3] DKG, What does one need to do to back-port these changes to stable?
File a bug, attach patches that apply cleanly to the target sources
and request the maintainer to add them?
Are you asking about debian stable, or the GnuPG stable branch? Since you're
asking me specifically, i assume you're asking about debian stable (aka
"stretch"). please ignore the rest of this if that isn't what you meant ;)
[amul:4] Ack! I did mean Debian stable.
At first glance, it looks like it would require patches to libgcrypt itself,
in addition to patches to GnuPG.
That would be something to coordinate for a point release perhaps, but it
could be complicated; there are several other bugfix improvements that it
would also be good to include into debian stable.
[amul:4] If I understood Werner's last mail, Debian stable should update to
libgcrypt 1.8.1 and then back port his changes to GnuPG 2.1.
To make this stand out clearly, you probably want to start by opening a bug
report (in the debian BTS) against the gcrypt package, tagged appropriately
for the affected versions, explaining the problem, pointing to the upstream
reports and commits, and also mark it so that it "affects" the gnupg2 source
package.
[amul:4] I created #882985 for this issue. I ran reportbug from my Debian
"unstable" server, and since this is my first time through the utility, I'm
sure that I made more than one mistake.

[amul:4] I'm not sure if I set the "affects" gnupg2 correctly.

Amul
Shah, Amul
2017-11-22 13:40:15 UTC
Permalink
[amul] Hi, please excuse any Outlook reformatting.
Post by Werner Koch
Post by Matthew Summers
id. This is a severe issue for us that seems pretty easy to remedy
with a patch like this.
Can you please explain the problem you try to solve.
[amul] Similar to Matt, we're trying to use GnuPG to decrypt secret keys. If I
understood Matt's problem, they use 4k keys which exhaust the mlock()ed space
very fast. With fis-gtm, we spawn so many processes that we exhaust the
mlock()ed space frequently enough for this to be a problem for us.
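On Linux, the mlock()ed pool Amul describes is bounded by the process's RLIMIT_MEMLOCK. A quick diagnostic sketch (generic shell, not something from this thread) to see how much locked memory a process is allowed:

```shell
# Show the locked-memory limit that caps the mlock()ed secmem pool.
ulimit -l                                      # soft limit in KiB, or "unlimited"

# The same limit as the kernel reports it (Linux only; harmless elsewhere).
grep 'Max locked memory' /proc/self/limits 2>/dev/null || true
```

If `ulimit -l` is small (64 KiB is a common default), many concurrent processes each locking secure memory will hit the wall quickly, which matches the failure mode described above.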
Post by Werner Koch
For most crypto operations which temporarily require allocation of extra chunks
of data (big integer computation), the secure memory area is enlarged on the fly.
The gpg-agent stores secret keys in a cache which is allocated in standard
memory. This can be done because the keys are stored encrypted in this cache
(using a per process ephemeral key).
[amul] "secure" memory allocations only use the first mlock()ed memory area. Is
this what you mean by "standard memory"?

[amul] When you say that the agent can cache a secret key, does that mean
subsequent requests for the same key will be serviced from cache?
Post by Werner Koch
Any problems creating or using more keys is either due to a memory leak in
gpg-agent or because you have a lot of active connections to the gpg-agent
using private keys.
[amul] In our case, we have many active connections to the gpg-agent (via
libgpgme and gpg) as processes decrypt the secret key that encrypts the
fis-gtm database.
Post by Werner Koch
1. Enlarging the secure memory area. This has the problem that you
will never know how much is sufficient.
2. Enlarge the secure memory area on the fly as we do it with some of
the Libgcrypt internal mallocs (actually all xmalloc style
allocations). That would require an update to Libgcrypt and a
runtime option to gpg-agent to enable this.
[amul] I agree with your assessment of option #1. I think, however, that there
is a third option. In its current form, the gpg-agent accepts every request and
spawns a thread to handle it. There is no limit on the rate at which it accepts
connections. Other than crudely limiting the thread count, I don't know of a
good way to slew the agent. Do you have any suggestions? Additionally, I have
not seen any timeouts in gpg/libassuan when communicating with the gpg-agent.

[amul] If I were to prepare a patch for option #2, assuming it passes a review and
conforms to the coding standards, would it be acceptable?

[amul] Best Regards, Amul
Rainer Perske
2017-12-14 11:26:05 UTC
Permalink
Hello,
Post by Werner Koch
I took your suggestion and gnupg 2.2.4 (to be released next week) will
support a new build option: [...]
Großartig! Ganz, ganz herzlichen Dank!

[Great! Many, many thanks!]

Best regards
--
Rainer Perske
Abteilung Systembetrieb und Leiter der Zertifizierungsstelle (WWUCA)
Zentrum für Informationsverarbeitung (Universitätsrechenzentrum)

Westfälische Wilhelms-Universität
Zentrum für Informationsverarbeitung
Rainer Perske
Röntgenstraße 7-13
48149 Münster

Tel.: +49 251 83-31582
Fax.: +49 251 83-31555
E-Mail: ***@uni-muenster.de
WWW: https://www.uni-muenster.de/ZIV/Mitarbeiter/RainerPerske.shtml
Büro: Raum 006, Röntgenstraße 11
Lageplan: http://wwwuv2.uni-muenster.de/uniplan/?action=spot&gebnr=7474

Zertifizierungsstelle der Universität Münster (WWUCA):
Tel.: +49 251 83-31590
Fax.: +49 251 83-31555
E-Mail: ***@uni-muenster.de
WWW: https://www.uni-muenster.de/WWUCA/

Zentrum für Informationsverarbeitung (ZIV):
Tel.: +49 251 83-31600 (Mo-Fr 7:30-17:30 Uhr)
Fax.: +49 251 83-31555
E-Mail: ***@uni-muenster.de
WWW: https://www.uni-muenster.de/ZIV/
Rainer Perske
2017-12-20 11:48:50 UTC
Permalink
Hello,

I have just downloaded GnuPG 2.2.4 and will check the patch now.

Thanks again!

In my proposal, I had enlarged the character array "prefix" (now line
557 of common/homedir.c) from "13 + 1 + 20 + 6 + 1" to "6 + 13 + 1 +
20 + 6 + 1"; the additional "6" is space for the characters "/gnupg" I
had added to the prefix.

I do not understand yet how the figures 13 + 1 + 20 are calculated.
Maybe I was too cautious in enlarging the space, but I'd propose to
check whether the array is still large enough for all possible cases
(if you did not already do so).

Best regards
--
Rainer Perske
Werner Koch
2017-12-20 14:32:17 UTC
Permalink
Post by Rainer Perske
I do not understand yet how the figures 13 + 1 + 20 are calculated.
char prefix[13 + 1 + 20 + 6 + 1];

/var/run/user -> 13
/ -> 1
UID -> 20 (for a 64 bit unsigned integer)
/gnupg -> 6
nul -> 1

and indeed I did not add 6 extra bytes for the leading "/gnupg".
However, it is not a problem: A 32 bit user id (e.g. 2147483647) needs
only 10 bytes and thus there is enough space for the extra string.
If the ID were really to grow to a larger 64 bit value, this would not
work anymore and the explicit check would let it return an error.
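The arithmetic can be checked mechanically; a small shell sketch, using the path components and the 32-bit example UID from Werner's breakdown:

```shell
# Buffer declared as char prefix[13 + 1 + 20 + 6 + 1]:
echo $((13 + 1 + 20 + 6 + 1))        # 41 bytes

# Longest path for a 32-bit UID, as discussed in the mail:
p="/var/run/user/2147483647/gnupg"
echo "${#p}"                          # 30 characters, plus 1 for the nul
```

So even with the unaccounted "/gnupg", a 32-bit UID path needs 31 of the 41 available bytes; only a UID wider than 64 bits' worth of digits could overflow, and that case is rejected before formatting.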

Although I tested this, I obviously did not notice that we will end up
with a socket directory

/run/gnupg/user/gnupg/

The second gnupg would not be needed. The question is whether to remove
it or to keep it for code simplicity. I will fix the size of the
prefix, though.


Salam-Shalom,

Werner
--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.