Details
Type: New Feature
Status: Open
Priority: Minor
Resolution: Unresolved
Description
The task is to enable native BIT fields for HEAP tables.
After the change, BIT(1) NOT NULL will use 1 bit in the record instead of 1 byte.
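The storage difference can be sketched as follows. This is a minimal illustration, not MariaDB code: the names BitStorage, native_bit_storage and padded_bit_bytes are hypothetical, and the split into whole bytes plus uneven bits is assumed from the bit_len / data_length terminology used in this ticket.

```cpp
#include <cassert>

// Hypothetical helper type for illustration; not one of MariaDB's classes.
struct BitStorage {
  unsigned whole_bytes;  // full bytes kept in the record's data area
  unsigned uneven_bits;  // leftover high bits, packed as individual bits
};

// Assumed native layout: BIT(n) splits into n / 8 whole bytes
// and n % 8 uneven bits.
inline BitStorage native_bit_storage(unsigned n) {
  assert(n >= 1 && n <= 64);  // BIT columns hold 1 to 64 bits
  return BitStorage{n / 8, n % 8};
}

// Byte-padded layout: every BIT(n) is rounded up to whole bytes.
inline unsigned padded_bit_bytes(unsigned n) {
  assert(n >= 1 && n <= 64);
  return (n + 7) / 8;
}
```

Under these assumptions, native_bit_storage(1) yields zero whole bytes and one uneven bit, while padded_bit_bytes(1) is one full byte.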
While developing this, I found that BIT fields have several problems:
- Wrong key length allocation for BIT() fields whose length is a multiple of 8
- HEAP tables had several bugs in BIT field handling. These were hidden because
HEAP tables reported that they do not support BIT fields.
Summary from Claude:
(This does not include a bug that was fixed as part of MDEV-38975)
BIT column support in the HEAP/MEMORY engine exposed three correctness
bugs in BIT key handling — one in the HEAP RB-tree key builder and two pre-existing
latent bugs in Field_bit::get_key_image().
Bug 1 — hp_rb_make_key(): BIT segment over-advances the key buffer
For a BTREE (RB-tree) index on a BIT column, the per-segment code wrote the uneven
high-bits byte with *key++, decremented char_length, then still advanced key += seg->length. Because RB-tree keys are fixed-width (seg->length), this advanced the cursor one byte too far per BIT segment, misaligning all following key parts.
Fix: write the high-bits byte to key[0] without advancing, copy the remaining whole bytes to key+1 via an offset, and advance by exactly seg->length. The reassembled layout ([high-bits byte][whole bytes]) now matches both the SQL-layer key image and hp_rb_pack_key().
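The cursor arithmetic of Bug 1 can be reproduced in a small stand-alone model. This is a sketch under assumptions, not the actual hp_rb_make_key() code: BitSegment and both append functions are hypothetical, and the segment is reduced to a fixed width, one high-bits byte and its whole data bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified stand-in for a key segment on a BIT column.
struct BitSegment {
  std::size_t length;                     // fixed key width, incl. high-bits byte
  std::uint8_t high_bits;                 // value of the uneven high bits
  std::vector<std::uint8_t> whole_bytes;  // the remaining length - 1 data bytes
};

// Models the buggy pattern: the cursor moves once for the high-bits byte
// and then again by the full segment length.
std::uint8_t *append_bit_segment_buggy(std::uint8_t *key, const BitSegment &seg) {
  assert(seg.whole_bytes.size() + 1 == seg.length);
  std::size_t char_length = seg.length;
  *key++ = seg.high_bits;  // advances the cursor by one
  char_length--;
  if (char_length > 0)
    std::memcpy(key, seg.whole_bytes.data(), char_length);
  key += seg.length;       // one byte too far
  return key;
}

// Models the fix: write at key[0], copy to key + 1, advance by seg.length.
std::uint8_t *append_bit_segment_fixed(std::uint8_t *key, const BitSegment &seg) {
  assert(seg.whole_bytes.size() + 1 == seg.length);
  key[0] = seg.high_bits;
  if (!seg.whole_bytes.empty())
    std::memcpy(key + 1, seg.whole_bytes.data(), seg.whole_bytes.size());
  return key + seg.length;
}
```

With two 2-byte segments (for example two BIT(12) key parts), the fixed version ends 4 bytes into the buffer, while the buggy version ends 6 bytes in and leaves a one-byte gap before the second key part.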
Bug 2 — Field_bit::get_key_image(): return value off by one when bit_len == 0
The function unconditionally returned data_length + 1, but the leading byte only exists when there are uneven high bits (bit_len > 0). For BIT(8), BIT(16), … (any field_length % 8 == 0) and for every Field_bit_as_char, bit_len == 0, so it reported one more byte than it actually wrote.
The contract (field.h) is "number of copied bytes." Reachable consumer: JSON histogram endpoints (opt_histogram_json.cc) do out->length(bytes) after allocating only pack_length(), producing a string marked one byte too long. (In key_copy() it was harmless because the return is only used to decide trailing-space fill, which a full BIT value never needs — hence it stayed latent for years.)
Fix: track whether the leading byte was emitted and return data_length + bit_byte.
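The corrected return value can be illustrated with a simplified model. This is a hedged sketch, not the real Field_bit::get_key_image(): the function bit_key_image and its parameter list are hypothetical, and only the byte counting described above is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model of building a BIT key image.
//   bit_len     number of uneven high bits (0 when the length is a multiple of 8)
//   high_bits   value of those uneven bits, ignored when bit_len == 0
//   data        the whole data bytes of the field
// Returns the number of bytes actually written to out.
std::size_t bit_key_image(std::uint8_t *out, unsigned bit_len,
                          std::uint8_t high_bits, const std::uint8_t *data,
                          std::size_t data_length) {
  assert(bit_len < 8);
  std::size_t bit_byte = 0;
  if (bit_len > 0) {
    *out++ = high_bits;  // the leading byte exists only for uneven bits
    bit_byte = 1;
  }
  if (data_length > 0)
    std::memcpy(out, data, data_length);
  return data_length + bit_byte;  // not an unconditional data_length + 1
}
```

For a BIT(16)-like field (bit_len 0, two data bytes) the model returns 2; an unconditional data_length + 1 would have reported 3.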
Bug 3 — Field_bit::get_key_image(): data bytes read from the wrong record
When the ptr_arg parameter was added (commit 7f9b3ea, "pass ptr into more Field methods"), the read of the uneven bits was relocated to the passed-in record (ptr_arg + (bit_ptr - ptr)), but the memcpy of the data bytes still read from the field's bound ptr. The 4-arg caller in key.cc passes from_ptr = ptr_in_record(from_record), so when from_record is not the field's bound record, the bits come from one record and the data bytes from another (record[0]).
Fix: memcpy from ptr_arg instead of ptr, consistent with the bits read and with
Field_varstring::get_key_image().
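The effect of reading through the bound pointer instead of ptr_arg can be shown with a small model. This is an illustrative sketch only: BoundField and its members are hypothetical and much simpler than the real Field classes, with ptr_in_record() reduced to "same offset in another record".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model of a field bound to one record buffer (record0).
struct BoundField {
  const std::uint8_t *record0;  // the record the field is bound to
  std::size_t offset;           // position of the data bytes in any record
  std::size_t data_length;      // number of whole data bytes

  const std::uint8_t *ptr() const { return record0 + offset; }

  // Same field position, but inside a different record buffer.
  const std::uint8_t *ptr_in_record(const std::uint8_t *record) const {
    return record + offset;
  }

  // Models the bug: ptr_arg is ignored and the bound record is read.
  void copy_data_buggy(std::uint8_t *out, const std::uint8_t *ptr_arg) const {
    assert(ptr_arg != nullptr);
    std::memcpy(out, ptr(), data_length);
  }

  // Models the fix: the data bytes come from the record passed in.
  void copy_data_fixed(std::uint8_t *out, const std::uint8_t *ptr_arg) const {
    assert(ptr_arg != nullptr);
    std::memcpy(out, ptr_arg, data_length);
  }
};
```

Calling copy_data_fixed(out, f.ptr_in_record(other_record)) copies the bytes of other_record, whereas the buggy variant returns the bytes of the bound record regardless of the argument.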