net/tls: prevent skb_orphan() from leaking TLS plain text with offload
sk_validate_xmit_skb() and drivers depend on the sk member of
struct sk_buff to identify segments requiring encryption.
Any operation which removes or does not preserve the original TLS
socket such as skb_orphan() or skb_clone() will cause clear text
leaks.
Make the TCP socket underlying an offloaded TLS connection
mark all skbs as decrypted, if TLS TX is in offload mode.
Then in sk_validate_xmit_skb() catch skbs which have no socket
(or a socket with no validation) and decrypted flag set.
Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and
sk->sk_validate_xmit_skb are slightly interchangeable right now,
they all imply TLS offload. The new checks are guarded by
CONFIG_TLS_DEVICE because that's the option guarding the
sk_buff->decrypted member.
Second, smaller issue with orphaning is that it breaks
the guarantee that packets will be delivered to device
queues in-order. All TLS offload drivers depend on that
scheduling property. This means skb_orphan_partial()'s
trick of preserving partial socket references will cause
issues in the drivers. We need a full orphan, and as a
result netem delay/throttling will cause all TLS offload
skbs to be dropped.
Reusing the sk_buff->decrypted flag also protects from
leaking clear text when incoming, decrypted skb is redirected
(e.g. by TC).
See commit 0608c69c9a
("bpf: sk_msg, sock{map|hash} redirect
through ULP") for justification why the internal flag is safe.
The only location which could leak the flag in is tcp_bpf_sendmsg(),
which is taken care of by clearing the previously unused bit.
v2:
- remove superfluous decrypted mark copy (Willem);
- remove the stale doc entry (Boris);
- rely entirely on EOR marking to prevent coalescing (Boris);
- use an internal sendpages flag instead of marking the socket
(Boris).
v3 (Willem):
- reorganize the can_skb_orphan_partial() condition;
- fix the flag leak-in through tcp_bpf_sendmsg.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is contained in:

committed by
David S. Miller

parent
0de94de180
commit
414776621d
@@ -373,9 +373,9 @@ static int tls_push_data(struct sock *sk,
|
||||
struct tls_context *tls_ctx = tls_get_ctx(sk);
|
||||
struct tls_prot_info *prot = &tls_ctx->prot_info;
|
||||
struct tls_offload_context_tx *ctx = tls_offload_ctx_tx(tls_ctx);
|
||||
int tls_push_record_flags = flags | MSG_SENDPAGE_NOTLAST;
|
||||
int more = flags & (MSG_SENDPAGE_NOTLAST | MSG_MORE);
|
||||
struct tls_record_info *record = ctx->open_record;
|
||||
int tls_push_record_flags;
|
||||
struct page_frag *pfrag;
|
||||
size_t orig_size = size;
|
||||
u32 max_open_record_len;
|
||||
@@ -390,6 +390,9 @@ static int tls_push_data(struct sock *sk,
|
||||
if (sk->sk_err)
|
||||
return -sk->sk_err;
|
||||
|
||||
flags |= MSG_SENDPAGE_DECRYPTED;
|
||||
tls_push_record_flags = flags | MSG_SENDPAGE_NOTLAST;
|
||||
|
||||
timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
|
||||
if (tls_is_partially_sent_record(tls_ctx)) {
|
||||
rc = tls_push_partial_record(sk, tls_ctx, flags);
|
||||
@@ -576,7 +579,9 @@ void tls_device_write_space(struct sock *sk, struct tls_context *ctx)
|
||||
gfp_t sk_allocation = sk->sk_allocation;
|
||||
|
||||
sk->sk_allocation = GFP_ATOMIC;
|
||||
tls_push_partial_record(sk, ctx, MSG_DONTWAIT | MSG_NOSIGNAL);
|
||||
tls_push_partial_record(sk, ctx,
|
||||
MSG_DONTWAIT | MSG_NOSIGNAL |
|
||||
MSG_SENDPAGE_DECRYPTED);
|
||||
sk->sk_allocation = sk_allocation;
|
||||
}
|
||||
}
|
||||
|
Reference in New Issue
Block a user