| title | A Sentinel for Null-Terminated Strings | |||||
|---|---|---|---|---|---|---|
| document | P3705R2 | |||||
| date | 2025-06-18 | |||||
| audience |
|
|||||
| author |
|
|||||
| toc | true | |||||
| monofont | DejaVu Sans Mono |
Consider using std::views::take to get the first five characters from a string. If we
pass a string literal with five or more characters, it looks okay:
static_assert(std::string{std::from_range, "Brubeck" | std::views::take(5)} == "Brube"); // passesHowever, if we pass it a string literal with fewer than five characters, we get in trouble:
// static_assert(std::string{std::from_range, "Dave" | std::views::take(5)} == "Dave"); // fails
using namespace std::string_view_literals;
static_assert(std::string{std::from_range, "Dave" | std::views::take(5)} == "Dave\0"sv); // passesThe reason the null terminator is included in the output here is because undecayed string literals are arrays of characters, including the null terminator, which the ranges library treats like any other array:
#include <algorithm>
#include <array>
#include <type_traits>
static_assert(std::is_same_v<std::remove_reference_t<decltype("foo")>, const char[4]>);
static_assert(std::ranges::equal("foo", std::array{'f', 'o', 'o', '\0'}));A common workaround is to wrap the string literal in std::string_view. But this has a
performance cost: despite the fact that we only care about the first five characters, we
still need to make a pass through the entire string to find the null terminator:
constexpr std::string take_five(char const* long_string) {
std::string_view const long_string_view = long_string; // read all of long_string!
return std::string{std::from_range, long_string_view | std::views::take(5)};
}This paper introduces null_sentinel to solve this problem. It ends the range when it
encounters a null:
constexpr std::string take_five(char const* long_string) {
std::ranges::subrange const long_string_range(long_string, std::null_sentinel); // lazy!
return std::string{std::from_range, long_string_range | std::views::take(5)};
}It also introduces a null_term CPO for the common case of constructing a subrange like
the one in the above example:
constexpr std::string take_five(char const* long_string) {
return std::string{std::from_range, std::views::null_term(long_string) | std::views::take(5)};
}The sentinel type matches any iterator position it at which *it is equal to a
value-initialized object of type iter_value_t<I>. This works for null-terminated
strings, but can also serve as the sentinel for any range terminated by a
value-initialized value.
For example, null_term can be used to iterate argv and environ. The following program demonstrates this:
#include <print>
extern char** environ;
int main(int argc, char** argv) {
std::println("argv: {}", std::views::null_term(argv));
std::println("environ: {}", std::views::null_term(environ));
}Output:
$ env --ignore-environment FOO=bar BAZ=quux ./test corge
argv: ["./test", "corge"]
environ: ["FOO=bar", "BAZ=quux"]
range-v3
provides
similar functionality to null_term using the name c_str-- unlike null_term, c_str
only works with char, wchar_t, and so forth.
The name c_str would not make sense here because, for example, std::views::c_str(argv)
is nonsensical and misleading as to what the range is doing (terminating the sequence of
strings rather than terminating the strings themselves).
An example from Barry Revzin's [@P2210R2] provides similar functionality under the names
zstring_sentinel and zstring.
I dislike this name for similar reasons as c_str. argv is not a "string."
Additionally, it invites confusion with [@P3655R1] zstring_view.
The advantages of these names are that they are terse, they do not include the word
"string," and the value-initialized iter_value_t value is referred to as "null," which
reflects the language we use in the standard ("null terminated byte string," "null
terminated multibyte string," "null terminated character sequence," "null pointer")
This is used by think-cell's implementation of this utility. I like it slightly less because it's more verbose but also consider it acceptable.
This is not an exhaustive list.
namespace std {
template<class I>
concept @*default-initializable-and-equality-comparable-iter-value*@ =
default_initializable<iter_value_t<I>> &&
equality_comparable_with<iter_reference_t<I>, iter_value_t<I>>; // @*exposition only*@
}namespace std {
struct null_sentinel_t {
template<input_iterator I>
requires @*default-initializable-and-equality-comparable-iter-value*@<I>
friend constexpr bool operator==(I const& it, null_sentinel_t);
};
inline constexpr null_sentinel_t null_sentinel;
}template<input_iterator I>
requires @*default-initializable-and-equality-comparable-iter-value*@<I>
friend constexpr bool operator==(I const& it, null_sentinel_t);Effects:
Equivalent to return *it == iter_value_t<I>();.
namespace std::views {
inline constexpr @*unspecified*@ null_term;
}The name null_term denotes a customization point object ([customization.point.object]).
Given a subexpression E, the expression null_term(E) is expression-equivalent to
ranges::subrange(E, null_sentinel).
Header <version> synopsis [version.syn]
#define __cpp_lib_ranges_null_sentinel 2026XXLThis proposal was originally written by Zach Laine as part of [@P2728R0], then updated and split out by Eddie Nolan.
- Generalize
null_sentinel_tto a non-Unicode-specific facility.
- Change
null_sentinel_tback to being Unicode-specific.
- Move
null_sentinel_ttostd, remove itsbasemember function, and make it useful for more than just pointers, based on SG-9 guidance.
- Simplify the complicated constraint on the comparison operator for
null_sentinel_t.
- Fix a bug in
null_sentinel_tcausing it not to satisfysentinel_forby changing itsoperator==to returnbool. - Fix a bug in
null_sentinel_twhere it did not support non-copyable input iterators by havingoperator==take input iterators by reference.
- Move
null_sentinelandnull_terminto P3705
- Fix off-by-one in textual description in motivation section
- Add example with argv and environ
- Add
unchecked_take_beforedesign alternative
- Remove
unchecked_take_beforeand pass-by-value design alternatives - Add polls and minutes from Sofia 2025 SG9 review
- Move
null_termto namespacestd::views - Add feature test macro
- Add naming discussion
- Use
iter_value_t<I>()overiter_value_t<I>{}to avoidinitializer_listinteraction
POLL: We would like something like the proposed null_sentinel regardless of the addition of a hypothetical unchecked_take_before view that may come in the future.
Outcome: No objection to unanimous consent
POLL: Always pass the iterator by reference to operator==, move null_term to namespace std::views, add a discussion about alternative names, and forward P3705R1 with those changes to LEWG for inclusion in C++29.
| SF | F | N | A | SA |
|---|---|---|---|---|
| 9 | 7 | 0 | 0 | 0 |
# Of Authors: 1
Author's Position: SF
Attendance: 18 (2 abstentions)
Outcome: Unanimous consent
POLL: Move null_sentinel_t to std:: namespace
| SF | F | N | A | SA |
|---|---|---|---|---|
| 1 | 3 | 1 | 0 | 0 |
# Of Authors: 1
Author's Position: F
Attendance: 9 (4 abstentions)
Outcome: Consensus in favor
POLL: Remove null_sentinel_t::base member function from the proposal
| SF | F | N | A | SA |
|---|---|---|---|---|
| 0 | 4 | 1 | 0 | 0 |
# Of Authors: 1
Author's Position: F
Attendance: 8 (3 abstentions)
Outcome: Consensus in favor
POLL: Separate std::null_sentinel_t from P2728 into a separate paper for
SG9 and LEWG; SG16 does not need to see it again.
+----+---+---+---+----+ | SF | F | N | A | SA | +====+===+===+===+====+ | 1 |1 |4 |2 | 1 | +----+---+---+---+----+
Attendance: 12 (3 abstentions)
Outcome: No consensus; author's discretion for how to continue.