[MDEV-17529] Document internal details of data-at-rest encryption Created: 2018-10-23  Updated: 2023-12-13

Status: Open
Project: MariaDB Server
Component/s: Documentation - Support, Encryption, Storage Engine - InnoDB
Fix Version/s: N/A

Type: Task Priority: Critical
Reporter: Geoff Montee (Inactive) Assignee: Ian Gilfillan
Resolution: Unresolved Votes: 1
Labels: None

Issue Links:
PartOf

 Description   

Some users would like to have internal details of MariaDB's data-at-rest encryption documented. For example:

InnoDB

  • When is InnoDB data encrypted? Is a page encrypted when it is flushed to disk?

Answer:

InnoDB pages are encrypted when they are written to disk.

  • When is InnoDB data decrypted? Is a page decrypted when it is read into the buffer pool? If so, are all of the pages in the buffer pool always in their decrypted form?

Answer:

InnoDB pages are decrypted when they are read from disk, before they are put into the buffer pool. A page stays in its decrypted form in memory for as long as it is in the buffer pool, and that page could contain columns, rows, and even tables that queries do not use.
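
As a rough illustration of the two answers above, the following toy model encrypts a page only when it is flushed and decrypts it only when it is read into the pool. It is a sketch, not InnoDB code: `ToyBufferPool`, `toy_cipher`, and `KEY` are hypothetical names, and the XOR "cipher" is only a stand-in for the real encryption plugin.

```python
from typing import Dict

KEY = 0x5A  # hypothetical single-byte key; a stand-in for a real cipher


def toy_cipher(data: bytes) -> bytes:
    """XOR stand-in for encryption; applying it twice restores the input."""
    return bytes(b ^ KEY for b in data)


class ToyBufferPool:
    """Illustrative model only; not InnoDB's actual data structures."""

    def __init__(self) -> None:
        self.disk: Dict[int, bytes] = {}    # page_no -> ciphertext "on disk"
        self.memory: Dict[int, bytes] = {}  # page_no -> plaintext in the pool

    def read_page(self, page_no: int) -> bytes:
        # Decrypt once, on the way from disk into the pool.
        if page_no not in self.memory:
            self.memory[page_no] = toy_cipher(self.disk[page_no])
        return self.memory[page_no]

    def modify_page(self, page_no: int, data: bytes) -> None:
        # Changes are applied to the plaintext copy in memory.
        self.memory[page_no] = data

    def flush_page(self, page_no: int) -> None:
        # Encrypt at write time; the in-memory copy stays plaintext.
        self.disk[page_no] = toy_cipher(self.memory[page_no])


pool = ToyBufferPool()
pool.modify_page(1, b"row data")
pool.flush_page(1)
```

In this model the whole page is plaintext while it is cached, whether or not any query touches every row on it, which matches the point made in the answer.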


Aria

Information about Aria is still needed. That information might have to come from serg or monty.

  • When is Aria data encrypted?
  • When is Aria data decrypted?

Binary Logs and Relay Logs

  • When is an event encrypted?

Events are encrypted when they are written to the IO_CACHE, regardless of whether the IO_CACHE is in memory or on disk (whether it is in memory or on disk depends on the transaction size and the values of binlog_cache_size/binlog_stmt_cache_size). This means that events are encrypted even before they are written to the physical binary log or relay log file.
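
The behaviour described above can be sketched with a toy cache that encrypts each event before storing it, and only afterwards decides whether the data stays in memory or spills to a temporary file. This is a simplified model under stated assumptions, not the server's IO_CACHE: `ToyIOCache`, `toy_encrypt`, and `KEY` are hypothetical, `cache_size` merely plays the role of binlog_cache_size/binlog_stmt_cache_size, and XOR stands in for the real cipher.

```python
import tempfile

KEY = 0x5A  # hypothetical key; XOR is only a stand-in for real encryption


def toy_encrypt(event: bytes) -> bytes:
    return bytes(b ^ KEY for b in event)


class ToyIOCache:
    """Simplified model: an in-memory buffer that spills to a temp file."""

    def __init__(self, cache_size: int) -> None:
        self.cache_size = cache_size  # plays the role of binlog_cache_size
        self.buffer = bytearray()
        self.spill = None             # temp file, created only on overflow

    def write_event(self, event: bytes) -> None:
        # Encryption happens first, independent of where the bytes end up.
        data = toy_encrypt(event)
        if self.spill is None and len(self.buffer) + len(data) > self.cache_size:
            # Cache is too small: move what we have to disk and continue there.
            self.spill = tempfile.TemporaryFile()
            self.spill.write(self.buffer)
            self.buffer.clear()
        if self.spill is not None:
            self.spill.write(data)
        else:
            self.buffer.extend(data)

    def on_disk(self) -> bool:
        return self.spill is not None

    def contents(self) -> bytes:
        if self.spill is None:
            return bytes(self.buffer)
        self.spill.seek(0)
        data = self.spill.read()
        self.spill.seek(0, 2)  # return to the end for later writes
        return data
```

With a large `cache_size` the encrypted bytes stay in memory, and with a small one they go to the temporary file; in both cases the cache never holds the plaintext event.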

  • When is an event decrypted?

Events are decrypted as they are read if a START_ENCRYPTION_EVENT is encountered in the binary log or relay log. In encrypted binary logs *and* relay logs, this START_ENCRYPTION_EVENT is the second event in the log file, right after the FORMAT_DESCRIPTION_EVENT.
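
One way to check the "second event" claim on a given log file is to read the type codes of the first two events. The sketch below does that under several assumptions that are not stated in this ticket and should be verified against the server source: the 4-byte magic `\xfebin`, a 19-byte common event header, type code 15 for FORMAT_DESCRIPTION_EVENT, and type code 164 for MariaDB's START_ENCRYPTION_EVENT. The helper names (`first_event_types`, `looks_encrypted`, `fake_event`) are hypothetical.

```python
import io
import struct
from typing import BinaryIO, List

BINLOG_MAGIC = b"\xfebin"        # assumed file magic
HEADER_LEN = 19                  # assumed common event header size
FORMAT_DESCRIPTION_EVENT = 15    # assumed type code
START_ENCRYPTION_EVENT = 164     # assumed MariaDB type code


def first_event_types(log: BinaryIO, count: int = 2) -> List[int]:
    """Return the type codes of the first `count` events in the log."""
    if log.read(4) != BINLOG_MAGIC:
        raise ValueError("not a binary log or relay log file")
    types: List[int] = []
    for _ in range(count):
        header = log.read(HEADER_LEN)
        if len(header) < HEADER_LEN:
            break
        # Assumed layout: timestamp(4) type(1) server_id(4)
        #                 event_length(4) next_position(4) flags(2)
        _, type_code, _, event_len, _, _ = struct.unpack("<IBIIIH", header)
        if event_len < HEADER_LEN:
            raise ValueError("corrupt event length")
        types.append(type_code)
        log.seek(event_len - HEADER_LEN, 1)  # skip the event body
    return types


def looks_encrypted(log: BinaryIO) -> bool:
    """True if the first two events match the order described above."""
    return first_event_types(log) == [
        FORMAT_DESCRIPTION_EVENT,
        START_ENCRYPTION_EVENT,
    ]


def fake_event(type_code: int, body: bytes = b"") -> bytes:
    """Build a synthetic event (header + body) for demonstration only."""
    header = struct.pack(
        "<IBIIIH", 0, type_code, 1, HEADER_LEN + len(body), 0, 0
    )
    return header + body


# Synthetic data, not a real log file.
sample = (
    BINLOG_MAGIC
    + fake_event(FORMAT_DESCRIPTION_EVENT, b"x" * 8)
    + fake_event(START_ENCRYPTION_EVENT, b"y" * 4)
)
print(looks_encrypted(io.BytesIO(sample)))
```

This only works because the first two events are themselves stored unencrypted; everything after START_ENCRYPTION_EVENT would need the key to be parsed.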



 Comments   
Comment by Geoff Montee (Inactive) [ 2018-10-23 ]

jplindst may be able to provide feedback on this.

Comment by Jacob Moorman (Inactive) [ 2019-04-12 ]

Geoff to pull notes from developer interview to this ticket, to facilitate clean update of the docs from primary source material; developers will be asked to review the changes.

Comment by Jacob Moorman (Inactive) [ 2019-04-23 ]

Kenneth: Geoff's notes have been merged to the ticket so this should now be ready for action.

Comment by Andrei Elkin [ 2019-04-26 ]

To the binlog encryption questions,

> When is the binary log encrypted?
When the server is launched with --encrypt-binlog set to ON and --file-key-management-filename set to
a key file; see https://mariadb.com/kb/en/library/file-key-management-encryption-plugin/ for the latter.
With such settings, the very first binlog event, Format_description_log_event, is recorded unencrypted.
It is followed by another unencrypted event, START_ENCRYPTION_EVENT. All events after that
are stored in encrypted form.

>Is it when the transaction is written to the physical binary log file?
Encryption is done per part of an event: each of the replication event's header, data, and footer parts is processed to produce encrypted blocks. The blocks are produced before their transaction gets committed, that is,
before it is ready to be logged into the binary logs. The encrypted blocks may therefore exist in files associated with
the user connection's binlog IO_CACHEs. Eventually they are extracted and flushed into a binlog file.

> Are the binlog events in their decrypted form when they are in the in-memory buffers?
No.

Comment by Geoff Montee (Inactive) [ 2019-04-26 ]

Hi Elkin,

> When is the binary log encrypted?
When the server is launched with --encrypt-binlog set to ON and --file-key-management-filename set to
a key file; see https://mariadb.com/kb/en/library/file-key-management-encryption-plugin/ for the latter.
With such settings, the very first binlog event, Format_description_log_event, is recorded unencrypted.
It is followed by another unencrypted event, START_ENCRYPTION_EVENT. All events after that
are stored in encrypted form.

Thanks for the feedback.

That tells us how to configure encryption for binary logs. We're already familiar with that process. This question was asking something a bit different. It is asking when the binary log events get encrypted, as in, at what point does MariaDB encrypt the binary log events?

>Is it when the transaction is written to the physical binary log file?
Encryption is done per part of an event: each of the replication event's header, data, and footer parts is processed to produce encrypted blocks. The blocks are produced before their transaction gets committed, that is,
before it is ready to be logged into the binary logs. The encrypted blocks may therefore exist in files associated with
the user connection's binlog IO_CACHEs. Eventually they are extracted and flushed into a binlog file.

> Are the binlog events in their decrypted form when they are in the in-memory buffers?
No.

  • Does MariaDB encrypt each binlog event immediately when it creates the event and writes it to the IO_CACHE? It sounds like the answer is "yes".
  • Does MariaDB encrypt each binlog event when it writes it to the IO_CACHE, only if the IO_CACHE contents are so big that they are written to disk (i.e. if the cache is bigger than binlog_cache_size/binlog_stmt_cache_size)? It sounds like the answer is "no".
  • Does MariaDB encrypt each binlog event when it writes it to the IO_CACHE, even if the IO_CACHE contents are small enough to fit in memory (i.e. if the cache is smaller than binlog_cache_size/binlog_stmt_cache_size)? It sounds like the answer is "yes".
  • Does MariaDB only encrypt each binlog event when it writes it to the actual binary log file? It sounds like the answer is "no", and the events are already encrypted when they are in the IO_CACHE.

I'm also a bit confused about some of the edits that you made to the issue description regarding relay logs:

In encrypted relay logs, it sounds like those can contain some metadata about the master's binary logs too, and I can't tell if the START_ENCRYPTION_EVENT is still expected to be the second event in encrypted relay logs (yes it is - Andrei confirms), or if this event occurs further in than in binary logs (right).

You say "yes" to the claim that the START_ENCRYPTION_EVENT is still expected to be the second event in encrypted relay logs, but you also say "right" that this event occurs further in than in encrypted binary logs. If the START_ENCRYPTION_EVENT is the second event in encrypted binary logs, then how can both of these be true? Do you mean to say that the START_ENCRYPTION_EVENT is still expected to be the second event in encrypted relay logs, but that it might not be the second event in some scenarios, because encrypted relay logs can contain additional optional metadata that is not in encrypted binary logs?

Thanks!
