Describe the bug, including details regarding any error messages, version, and platform.
There is an issue in recent Arrow releases where memory allocations made by Arrow conflict with the virtual memory region reserved for the AddressSanitizer (ASan) shadow memory on x86_64 Linux. This causes any subsequent attempt to initialize ASan (e.g. by loading a sanitized library) to fail.
This bug is currently fixed on main but is present in major releases 19.x - 21.x (and possibly earlier, though I haven't verified this).
Minimal Reproduction
The following test currently reliably fails on recent Arrow releases on x86_64 Linux:
#include <sys/mman.h>

#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>

#include <gtest/gtest.h>

#include "arrow/api.h"

TEST(Lester, LesterTest) {
  // Build a small two-column table so that Arrow performs an allocation.
  arrow::Int64Builder a, b;
  ASSERT_TRUE(a.AppendValues({1, 2}).ok());
  ASSERT_TRUE(b.AppendValues({3, 4}).ok());
  std::shared_ptr<arrow::Array> aa, bb;
  ASSERT_TRUE(a.Finish(&aa).ok());
  ASSERT_TRUE(b.Finish(&bb).ok());
  auto schema = arrow::schema(
      {arrow::field("a", arrow::int64()), arrow::field("b", arrow::int64())});
  auto table = arrow::Table::Make(schema, {aa, bb});
  (void)table;

  constexpr int PROT = PROT_READ | PROT_WRITE;
  constexpr int FLAGS =
      MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE | MAP_NORESERVE;
  // https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm#64-bit
  // This is the virtual memory region that the address sanitizer reserves for
  // its shadow memory and needs in order to work properly.
  // Note that `MAP_FIXED_NOREPLACE` tells `mmap` to fail (with EEXIST) if the
  // requested range overlaps any existing mapping.
  constexpr uintptr_t BEGIN = 0x02008fff7000ULL;
  constexpr uintptr_t END = 0x10007fff7fffULL;
  constexpr size_t LEN = static_cast<size_t>(END - BEGIN + 1);
  // The fd argument is ignored for MAP_ANONYMOUS; -1 is the portable value.
  void* shadow_region =
      mmap(reinterpret_cast<void*>(BEGIN), LEN, PROT, FLAGS, -1, 0);
  ASSERT_EQ(shadow_region, reinterpret_cast<void*>(BEGIN))
      << "mmap failed (" << errno << "): " << std::strerror(errno);
}
The test does the following:
- Performs an arrow::Table allocation.
- Makes an mmap call to reserve memory corresponding to the shadow memory reserved for the address sanitizer (Note that the MAP_FIXED_NOREPLACE flag tells mmap to fail if any region overlaps with existing allocated regions, which it does in this test).
On affected versions, the assertion fails because the initial Arrow allocation has already claimed a portion of the shadow memory, causing the second mmap call to fail with EEXIST.
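The EEXIST behaviour can be observed in isolation, without Arrow. The following is a minimal sketch, assuming Linux 4.17 or later and a glibc that defines MAP_FIXED_NOREPLACE; the function name `NoReplaceErrnoOnOccupiedAddress` and the 4096-byte length are made up for illustration. It maps one anonymous region, then requests the same address again with MAP_FIXED_NOREPLACE and reports the resulting errno.

```cpp
#include <sys/mman.h>

#include <cerrno>
#include <cstddef>

// Hypothetical helper (not part of Arrow or the test above).
// Returns the errno produced by a MAP_FIXED_NOREPLACE request over an address
// that is already mapped, 0 if the request unexpectedly succeeded, or -1 if
// the initial mapping could not be created.
int NoReplaceErrnoOnOccupiedAddress() {
  constexpr std::size_t kLen = 4096;  // assumed length, rounded up by the kernel
  constexpr int kProt = PROT_READ | PROT_WRITE;

  // First mapping: let the kernel choose the address.
  void* first = mmap(nullptr, kLen, kProt, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (first == MAP_FAILED) return -1;

  // Second mapping: demand the same, now occupied, address.
  errno = 0;
  void* second = mmap(first, kLen, kProt,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE, -1, 0);
  const int err = (second == MAP_FAILED) ? errno : 0;

  // Clean up whatever was mapped.
  if (second != MAP_FAILED) munmap(second, kLen);
  munmap(first, kLen);
  return err;
}
```

In the reproduction, Arrow's allocator plays the role of the first mapping and the shadow-range reservation plays the role of the second.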
Diagnosis
Running strace on the test confirms that libarrow.so makes an mmap call that allocates memory inside the ASan shadow range. For example:
mmap(0x3e780000000, 67108864, ...) = 0x3e780000000
The address 0x3e780000000 is within the ASan 64-bit shadow memory region [0x02008fff7000, 0x10007fff7fff]. The address provided to mmap by the allocator appears to be a bad hint. The address is also relatively inscrutable, which makes me wonder whether it comes from an uninitialized variable read.
Root Cause and Resolution
Since this bug was reproducible on recent releases but not on main, I ran a git bisect using this script and minimal test: main...lesterfan:arrow:20251014-cpp-asan-error. The bisect identified #47589 as the first commit that fixes this test. I verified this by cherry-picking the commit onto our release and confirming that our ASan issues were resolved. This suggests that the bug lies somewhere within mimalloc v2 and has since been fixed in v3.
Given that this bug prevents using Arrow with ASan-instrumented code, would it be possible to cherry-pick this commit into a maintenance release for the affected Arrow versions?
Even if not, I wanted to file this issue upstream for visibility in case other people run into similar issues. Thank you!
Component(s)
C++