Skip to content

index: avoid unsafe buffer sharing in Log_file#358

Merged
craigfe merged 1 commit intomirage:mainfrom
craigfe:guard-unsafe-scratch-buffer
Oct 11, 2021
Merged

index: avoid unsafe buffer sharing in Log_file#358
craigfe merged 1 commit intomirage:mainfrom
craigfe:guard-unsafe-scratch-buffer

Conversation

@craigfe
Copy link
Member

@craigfe craigfe commented Oct 10, 2021

#355 introduced a small optimisation to re-use a "local" scratch buffer when decoding values from the log file.

This is actually unsafe: during the merge, the asynchronous merge thread and the main writer thread can attempt concurrent reads from the log file, causing contention over the scratch buffer. This can be observed by inserting Thread.yield calls inside the Value.decode implementation and then stress-testing the interface (e.g. by running the replay benchmarks).

This commit ensures that each call to a Log_file function gets its own scratch buffer, ensuring safe concurrent access without introducing potential contention issues.

Note: I initially tried to just wrap t.scratch_buf in a mutex (0a7b05c), but this was a ~5% performance regression in the replay benchmarks due to contention over the lock. The alternative of just allocating on entry to Log_file has a relatively small cost.

@craigfe craigfe force-pushed the guard-unsafe-scratch-buffer branch from 6771e45 to 0a7b05c Compare October 10, 2021 11:40
@samoht
Copy link
Member

samoht commented Oct 10, 2021

What's the mutex impact on the performance? Wouldn't it be better to have a scratch buffer by thread?

@craigfe craigfe force-pushed the guard-unsafe-scratch-buffer branch from 0a7b05c to 5154202 Compare October 10, 2021 13:26
@craigfe
Copy link
Member Author

craigfe commented Oct 10, 2021

@samoht: I took some measurements of that :-) Yes, that mutex has a ~5% performance impact on very write-intensive workloads like the replay benchmarks when the log size is large (but I couldn't see any impact for other workloads).

I've changed the implementation to allocate new buffers on entry to Log_file. It might be better to do things per-thread indeed, but we'd need an LRU to avoid leaking memory there. It's not sufficient to just keep separate threads for the writer + merge thread, because readers in different threads can contend for the buffer as well.

mirage#355 introduced a small optimisation
to re-use a "local" scratch buffer when decoding values from the log
file.

This is actually unsafe: during the merge, the asynchronous merge thread
and the main writer thread can attempt concurrent reads from the log
file, causing contention over the scratch buffer. This can be observed
by inserting `Thread.yield` calls inside the `Value.decode`
implementation and then stress-testing the interface (e.g. by running
the replay benchmarks).

This commit ensures that each call to a `Log_file` function gets its own
scratch buffer, ensuring safe concurrent access without introducing
potential contention issues.
@craigfe craigfe force-pushed the guard-unsafe-scratch-buffer branch from 5154202 to 4e3ba1e Compare October 10, 2021 13:38
@craigfe craigfe changed the title index: guard scratch buffer in Log_file with a mutex index: avoid unsafe buffer sharing in Log_file Oct 10, 2021
@craigfe craigfe merged commit f359eb8 into mirage:main Oct 11, 2021
craigfe added a commit to craigfe/irmin that referenced this pull request Oct 12, 2021
This contains mirage/index#358, which doesn't
change the API but is an important bug-fix.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants