What if C++ had explicit destruction?
Scope-based lifetime is a fantastic language feature, and it's a large contributing factor to the popularity of languages like Rust. C++ calls it RAII for historical reasons, but the general idea is that the compiler automatically inserts calls to cleanup functions (destructors) at the instant when the object ceases to be accessible (goes out of scope). This works very well for a wide variety of object types, such as allocated memory, operating system handles (e.g. files), locks/mutexes, logging, and more. The primary strength here is that the compiler doesn't let us forget to do a necessary operation.
However, destructors in C++ are also incredibly limited compared to how they could be. They cannot take any parameters, and they cannot fail. You can of course add interfaces to set parameters in advance or to explicitly clean up before the destructor so you can check for failures, but those are things that can be forgotten. The compiler always remembers to call the destructor, but it has no idea that the programmer is supposed to also do other things with the object before that point. For example on Windows it is common to want to close file handles on a separate thread due to the blocking nature of that action, but we're used to just letting file handles be implicitly closed when we're done with them on whatever thread we happen to be on, accidentally blocking as we do so.
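To make that concrete, here's a minimal sketch of the Windows situation (error handling elided; closeOnWorkerThread is a hypothetical helper of my own, not a real API): the destructor always closes synchronously and swallows any failure, while the non-blocking alternative is a separate function that nothing reminds us to call.

#include <windows.h>
#include <thread>
#include <utility>

class UniqueHandle
{
    HANDLE h_ = INVALID_HANDLE_VALUE;
public:
    explicit UniqueHandle(HANDLE h) : h_(h) {}
    UniqueHandle(UniqueHandle&& other) noexcept : h_(std::exchange(other.h_, INVALID_HANDLE_VALUE)) {}
    ~UniqueHandle()
    {
        if (h_ != INVALID_HANDLE_VALUE)
            CloseHandle(h_); // may block, and any failure is silently ignored
    }
    HANDLE release() { return std::exchange(h_, INVALID_HANDLE_VALUE); }
};

// The non-blocking cleanup we actually want - but nothing forces us to remember to use it.
void closeOnWorkerThread(UniqueHandle file)
{
    std::thread([h = file.release()] { CloseHandle(h); }).detach();
}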
I remember stumbling across an article about the Vale programming language's "Higher RAII" feature, which others have compared to linear types and type states.
An example motivation is asynchronous programming: you spin off one or more async tasks, and some of them require their results to be investigated once they complete.
With synchronous programming in C++ we can use the [[nodiscard]] attribute or similar hints for static analyzers to indicate that the result of a function call must be inspected, which should really be the default in most cases.
Exceptions can't be ignored by default; you have to explicitly write code that handles them in order to ignore them.
But when all you have is a handle to an asynchronous task that will complete later, the compiler and type system think the job is already done: you got the return value and stored it somewhere, and any exception from starting the task would have propagated already.
But now that it's off and running, who is going to stop you from forgetting about it?
Some compilers have warnings for storing a value in a variable and then not accessing the variable again, but there's no real way to inform the compiler of what ways of accessing the variable are "good enough". For example, maybe you have code to check if a task has a valid state (has been started), but you forgot to add code to actually check for when it finishes and obtain its results. Because you used the object, the compiler has no reason to think that you're forgetting something. There are so many asynchronous libraries in existence that it's unreasonable to make the compiler or static analyzer figure out all the correct ways to use them, and I think annotations would be unnecessarily arduous for cases like this.
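A small real-C++ sketch of that trap (compute is just a stand-in for the real work): the variable gets "used", so no warning fires, yet the result is never retrieved.

#include <future>

int compute(); // stand-in for whatever work we kicked off

void example()
{
    std::future<int> result = std::async(std::launch::async, compute);
    if (result.valid()) // we "used" the variable, so no tool complains...
    {
        // ...but we never call result.get(), so the value (and any stored
        // exception) is silently dropped when result goes out of scope.
    }
}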
Vale's solution is explicit destruction: the object cannot be implicitly destructed by the compiler, so if it is abandoned and left to fall out of scope, the code cannot compile.
Instead, the programmer must pick from one or more functions that take the object and end its lifetime correctly.
With our async example, we could have functions like detach, waitAndDiscard, waitAndGet, and overloads with timeouts that can fail if too much time elapses.
Really, we already have and use functions like these, but the problem is that we have to remember to do so, which can be difficult when your brain is juggling all the other concerns and hazards of asynchronous programming.
There are many other situations like this too, not just in asynchronous programming; recall the Windows file handle example I mentioned previously. Any time you need to remember to put something back, and that operation can be done in multiple ways and/or it could fail or take a long time sometimes, this applies. We just end up relying on documentation and runtime asserts instead, and we make our destructors ignore errors entirely. C++ does support throwing destructors, but as seen previously on this blog, the language is not well-equipped to handle them, and the standard library especially is not.
That's also what makes this such a tricky issue: it's sort of too late to add such a feature to C++ now, there is too much technical "debt" in the form of having to radically change almost the entire standard library just to support types which require explicit destruction. You could of course wrap such types in wrappers that can be implicitly destructed, but then you're right back to square one: you can forget to handle them properly, and the default destructor has to do something that might not be the right choice.
We already see the pain caused by this problem in std::thread and std::future, both of which are types that require explicit destruction, and instead they sometimes have to do something in their destructor which is undesirable. std::jthread sort of alleviates the std::thread problem, but things are still contentious with std::future.
As Vale points out though, std::promise is arguably worse because it can be destructed without setting a value, and this results in the destructor informing the shared state of the broken promise, which means the mistake isn't discovered until much later in an entirely different thread.
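Here's real, current C++ showing that failure mode - the mistake happens where the promise dies, but it only surfaces later, wherever the future is read:

#include <future>
#include <iostream>

int main()
{
    std::future<int> f;
    {
        std::promise<int> p;
        f = p.get_future();
        // p is destroyed here without set_value/set_exception ever being called.
    }
    try
    {
        f.get(); // throws std::future_error with std::future_errc::broken_promise
    }
    catch (const std::future_error& e)
    {
        std::cout << e.code().message() << '\n';
    }
}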
Compilers already have to perform code path analysis to validate many requirements, such as requiring every path to either return or throw, and compilers already have to know all the different places and times that objects need to be destructed.
Almost all the necessary information to fix this problem is already there, we're just missing a way to give the compiler the last piece of the puzzle it needs to say "hey, you forgot to properly dispose of this type in this code path!"
So what would C++ even be like if we had such a feature? What changes would truly be needed, and what could we do without? Let's try writing some pseudo-C++ to analyze for potential problems.
[[nodiscard]] std::future<Buffer> startLoadingResource(std::filesystem::path path)
{
    std::packaged_task<Buffer ()> task([path = std::move(path)]
    {
        // load the resource from disk and return a Buffer (body elided)
    });

    return task.get_future();
}
So far, this is actually just valid C++ that you could write today, and with all warnings enabled, major compilers don't output anything when compiling this; they happily accept it.
Of course, the result is that the startLoadingResource function just returns a std::future instance that always throws std::future_error with an error code of broken_promise, because packaged_task has to be invoked to actually do anything, and by letting it be automatically destructed we have forgotten to do our one job.
The compiler sees that we called a member function on the task variable, so it doesn't feel any need to warn us of anything - we used the variable!
Job done!
Wrong.
So, explicit destructors to the rescue, right?
Pause for a moment.
Let's think: what we want the compiler to force us to do is to pass off the task to an executor, such as another thread or a threadpool.
"That doesn't really align with our core values here at Destruct Corp, let's try to be a team player." - said an imaginary person.
To that I point out that actually, the clue is in the name of the type: the task is packaged. It isn't itself a task, it is a package that contains a task, and it actually does make sense that a valid way of destroying the package is to first remove that task and put it where it belongs, such as in the hands of another thread.
Let's run with that and start inventing stuff.
template<class R, class... ArgTypes>
class packaged_task<R (ArgTypes...)>
{
public:
    // ...the existing interface stays as it is today...

    explicit ~packaged_task() = default; // must now be called explicitly

    template<class Executor>
        requires invocable<Executor, packaged_task<R (ArgTypes...)>&&>
    void ~packaged_task(Executor&& executor)
    {
        executor(move(*this)); // hand the task off before the packaging dies
    }
};
Concepts and explicit functions aren't new, but the rest of the destructor syntax is completely fictional - destructors can't be explicit, they can't be templates, they can't have a return type, and they can't take parameters. But let's roll with this and imagine how it could work in practice. The idea is that if the default destructor isn't accessible (e.g. private, = delete) or is marked explicit, the compiler would generate an error unless all code paths called a destructor in the required places, forcing us to really be aware of all the places we are destroying the object and also how we're destroying it.
One approach we could take is to either automatically or manually mark such variables to behave like std::optional but in a compiler-aware way, where you can either let them be automatically destructed with their default destructor once they go out of scope, or some code paths can explicitly call one of the destructors early, including the default destructor if early destruction is desired. This means all existing types work the same, but if you start calling destructors explicitly the variables have to be converted into optional-likes. Perhaps this could be done with a marker on the variable declaration itself so that, for example, the compiler doesn't have to see the source of a class destructor to know whether it needs to store that information in the class layout.
But then how would you check if a variable had been destructed?
Would placement-new flip it back into a constructed state again?
This approach seems to require too many new rules and too much new syntax, so let's try something else.
Another approach is to only allow explicit destructor calls in the same places the compiler would have tried to insert implicit destructor calls (and in the same order). This sounds easier for the compiler to deal with because it already has to know what to do at those places anyway, so telling it to use a different destructor overload at those spots isn't that big of an ask. But in practice this can become unwieldy fast, and we would have discovered this with the prior approach too: there are many, many different places that compilers automatically insert implicit destructor calls. To see how this breaks down, let's try and fix our example function:
[[nodiscard]] std::future<Buffer> startLoadingResource(std::filesystem::path path)
{
    std::packaged_task<Buffer ()> task([path = std::move(path)]
    {
        // load the resource from disk and return a Buffer (body elided)
    });

    std::future<Buffer> ret(task.get_future());
    task.~packaged_task(executor); // executor: some threadpool-like callable assumed to be in scope
    return ret;
}
Seems reasonable enough at a glance, and indeed it should work in theory, but the compiler is unhappy because get_future can throw an exception, which means it has to be able to destroy the task variable in that case. Obviously with this particular code that will never happen because the only time it throws an exception is if we call it more than once, and the optimizer will end up deleting that unnecessary destructor code anyway, but the compiler frontend isn't smart enough to realize all this and requires us to handle it.
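For reference, this is the real exception path the compiler is worried about; it only fires if get_future is called a second time:

#include <future>

void demo()
{
    std::packaged_task<int ()> task([] { return 42; });
    std::future<int> first  = task.get_future(); // fine
    std::future<int> second = task.get_future(); // throws std::future_error
                                                 // (future_already_retrieved)
}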
So, what exactly are we supposed to do to handle this with our brilliant new syntax?
Well, one way is to use already-existing exception handling language features:
[[nodiscard]] std::future<Buffer> startLoadingResource(std::filesystem::path path)
{
    std::packaged_task<Buffer ()> task([path = std::move(path)]
    {
        // load the resource from disk and return a Buffer (body elided)
    });

    try
    {
        std::future<Buffer> ret(task.get_future());
        task.~packaged_task(executor);
        return ret;
    }
    catch (...)
    {
        task.~packaged_task();
        throw;
    }
}
This appeases the compiler over the get_future call, albeit in a rather ugly way, but unfortunately it also introduces an entirely new problem. The explicit destructor call inside the try block can throw any exception thrown by the executor, such as a failure to schedule the task on the threadpool. That would jump to our catch block and call the destructor again, which isn't allowed, so we still can't compile this with our imaginary compiler.
Let's undo this change and try again with a different approach:
[[nodiscard]] std::future<Buffer> startLoadingResource(std::filesystem::path path)
{
    std::packaged_task<Buffer ()> task([path = std::move(path)]
    {
        // load the resource from disk and return a Buffer (body elided)
    });

    std::future<Buffer> ret;
    try
    {
        ret = task.get_future();
    }
    catch (...)
    {
        task.~packaged_task();
        throw;
    }
    task.~packaged_task(executor);
    return ret;
}
There we go. Lucky for us, std::future has a noexcept default constructor, because the shared state is allocated by the other end.
This code satisfies our imaginary compiler and does what we want: removing either of the explicit destructor calls generates an error because the compiler thinks we forgot to destroy the packaging surrounding the packaged task.
It's rather ugly though, and the entire try/catch is only necessary for the compiler frontend and gets completely optimized out by the optimizer.
I think the ugliness could be acceptable, since not many types would need this treatment, and I have a feeling any attempts to make it less ugly would require inventing yet more new syntax.
What if instead of fixing the ugliness, we fixed the class design to preclude it? I've always found it kind of strange that you always have to use a four-step approach with std::packaged_task, where first you construct it, then you call get_future exactly once, then you pass the task off to some executor, and then finally you return or utilize the future. In fact, looking at our latest code above, it's obvious we completely forgot another possible mistake we could make: we could forget to call get_future and instead just send off the task to be executed without ever knowing its result. That's not how std::packaged_task is meant to be used, and the only reason we didn't make that mistake here is because we started with the premise of needing to return the std::future, which isn't always the case.
Let's get the compiler to protect us from ever making that mistake in any circumstance, and solve the ugliness problem at the same time!
template<class R, class... ArgTypes>
class packaged_task<R (ArgTypes...)>
{
private:
    [[nodiscard]] future<R> get_future(){ /* same as today, but private */ }

public:
    // ...the existing interface stays as it is today...

    explicit ~packaged_task() = default;

    template<class Executor>
        requires invocable<Executor, packaged_task<R (ArgTypes...)>&&>
    [[nodiscard]] auto ~packaged_task(Executor&& executor)
    {
        using executorResult_t = invoke_result_t<Executor, packaged_task<R (ArgTypes...)>&&>;
        if constexpr (is_void_v<executorResult_t>)
        {
            struct
            {
                future<R> f;
            } ret(get_future());
            executor(move(*this));
            return ret;
        }
        else
        {
            struct
            {
                future<R> f;
                executorResult_t e;
            } ret(get_future(), executor(move(*this)));
            return ret;
        }
    }
};
We make get_future private, and then we can call it in our second explicit destructor. This makes the scheduling and the future obtaining a single step, so you can't forget either one: you have to do both, or you have to explicitly say you want to do neither by calling the other destructor. This aligns perfectly with the preconditions of only being allowed to call get_future and operator() exactly one or zero times each, except now we have the compiler enforcing it for us. Technically we should now make operator() and make_ready_at_thread_exit private too, and provide a corresponding destructor for the latter.
Anyway, let's update our example client code to take advantage of the above changes:
[[nodiscard]] std::future<Buffer> startLoadingResource(std::filesystem::path path)
{
    std::packaged_task<Buffer ()> task([path = std::move(path)]
    {
        // load the resource from disk and return a Buffer (body elided)
    });

    return task.~packaged_task(executor).f; // schedule the task and take its future in one step
}
Don't you just love it when a simple change in library design dramatically simplifies client code? Though, we may have simplified a little too much - we removed the ability to obtain the future in a separate location from the location where the task is actually scheduled, which is a valid use case even if it opens up the door for making mistakes. Luckily, it's not difficult to add back that functionality while still preserving the "force me to do this right" compiler behavior:
template<class R, class... ArgTypes>
class shipping_task<R (ArgTypes...)>
{
public:
    shipping_task(packaged_task<R (ArgTypes...)> task); // takes ownership; the future was already obtained

    explicit ~shipping_task() = default;

    template<class Executor>
        requires invocable<Executor, shipping_task<R (ArgTypes...)>&&>
    auto ~shipping_task(Executor&& executor)
    {
        return executor(move(*this));
    }
};

template<class R, class... ArgTypes>
class packaged_task<R (ArgTypes...)>
{
private:
    [[nodiscard]] future<R> get_future(){ /* same as today, but private */ }

public:
    // ...the existing interface stays as it is today...

    explicit ~packaged_task() = default;

    template<class Executor>
        requires invocable<Executor, packaged_task<R (ArgTypes...)>&&>
    [[nodiscard]] auto ~packaged_task(Executor&& executor)
    {
        using executorResult_t = invoke_result_t<Executor, packaged_task<R (ArgTypes...)>&&>;
        if constexpr (is_void_v<executorResult_t>)
        {
            struct
            {
                future<R> f;
            } ret(get_future());
            executor(move(*this));
            return ret;
        }
        else
        {
            struct
            {
                future<R> f;
                executorResult_t e;
            } ret(get_future(), executor(move(*this)));
            return ret;
        }
    }

    [[nodiscard]] auto ~packaged_task(begin_shipment_t) // begin_shipment_t: a tag type, like in_place_t
    {
        struct
        {
            future<R> f;
            shipping_task<R (ArgTypes...)> t;
        } ret(get_future(), move(*this));
        return ret;
    }
};
First we create a new shipping_task type to act as the second step after our packaged_task type. It looks suspiciously similar to our first design for adding explicit destructors to packaged_task; in fact, the only difference is that shipping_task knows the future has already been obtained. Then we add another destructor to packaged_task with a tag type to differentiate it in overload resolution, and it gives us the future and the shipping_task in a single step, allowing us to then split the two apart and send them to separate areas of our codebase. Thus, we've added back that optional use pattern, just with compiler enforcement now.
Looking back at all this though, it seems what we've really done is merge some functional programming language ideas with C++ destructors: we effectively have call-once functions enforced by the compiler, but we did it by leveraging existing compiler mechanics for object lifetime and destruction. It seems to have turned out well, so far...
Anyway, let's take a look at how our two-phase task shipment process can be used with our imaginary compiler - here's a first rough draft of a new function:
[[nodiscard]] std::vector<std::future<Buffer>> startLoadingResources(std::span<std::filesystem::path const> paths)
{
    std::vector<std::future<Buffer>> futures;
    std::vector<std::shipping_task<Buffer ()>> tasks;
    futures.reserve(std::size(paths));
    tasks  .reserve(std::size(paths));
    for (const std::filesystem::path& path : paths)
    {
        std::future<Buffer> future;
        std::shipping_task<Buffer ()> shipping;
        {
            std::packaged_task<Buffer ()> packaged([path]
            {
                // load the resource from disk and return a Buffer (body elided)
            });
            std::tie(future, shipping) = packaged.~packaged_task(std::begin_shipment);
        }
        futures.emplace_back(std::move(future));
        tasks  .emplace_back(std::move(shipping));
        shipping.~shipping_task();
    }
    for (std::shipping_task<Buffer ()>& task : tasks)
    {
        std::shipping_task<Buffer ()> shipping(std::move(task));
        shipping.~shipping_task(executor);
    }
    return futures;
}
Oh no.
This is already off to a terrible start.
Firstly, std::vector has no clue about explicit destruction, since we haven't updated it, and doing so would be a daunting challenge. Secondly, every time we call reserve, std::begin_shipment, and emplace_back, the imaginary compiler yells at us about not calling an explicit destructor for std::shipping_task before the function exits via exception.
We as programmers know that no reallocations of the std::vector will occur and that std::begin_shipment will never have a reason to throw, so guarding those calls results in the optimized-out-but-required code problem again: in practice exceptions cannot be thrown from those function calls while std::shipping_task instances exist, but the compiler still wants to see us do the work of accounting for that impossibility. The other places the compiler complains about are the creation of the packaged variable and the scheduling of the std::shipping_task, which in practice absolutely can throw exceptions while std::shipping_task instances exist.
Let's take this one step at a time.
The first problem is that we need to be able to create and manage a dynamically allocated array of objects which require explicit destructor calls.
However we decide to resolve that conundrum will influence the way we solve the rest.
For a single object dynamically allocated by new, it seems reasonable that we could pass the destructor parameters to delete. But what should the syntax be? If we just write something like delete obj(executor), it's ambiguous because the type could have an overloaded operator() member function, and we end up deleting its return value instead of the object. What if we re-used the placement-new syntax? Or well, the new-with-parameters syntax. For example there is also new (std::nothrow) T and whatever other overloads of operator new are defined by the user. We could just adopt that syntax for passing parameters to the explicit destructor, for example delete (executor) obj, or delete () obj in the case of the explicit zero-arg destructor.
That's half the problem solved, we can now allocate and free one object which requires explicit destruction.
But what about the array versions of new and delete? Well, I suppose we can do the same thing there too, since you can also already do things like new (std::nothrow) T[n]{ 1, 2, 3 } - yes, C++ already lets us pass three different sets of parameters to one operation. It's not much of a stretch to then write delete[] (executor) arr or delete[] () arr.
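That claim is easy to verify in today's C++ - one new-expression carrying placement arguments, an array bound, and an initializer:

#include <new>

void demo()
{
    // placement argument, array size, and initializer - three sets of parameters
    int* values = new (std::nothrow) int[4]{ 1, 2, 3, 4 };
    delete[] values;
}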
Though, this does mean we are in all-or-nothing territory: all the objects in the array have to be destructed the same way, we can't pick and choose how each one gets destructed, and we have no hope of obtaining return values from those destructor calls.
For that we have to allocate uninitialized memory and use placement new much like how std::vector is implemented, since then we're free to call a destructor on each object individually (and we already have to do that anyway, the compiler just can't help us avoid forgetting anymore). Actually, I suppose that means std::vector won't require too many changes since it already has to explicitly call destructors, we just won't be able to store types which don't have an accessible zero-arg destructor unless we start making changes to std::vector.
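For reference, the real, present-day version of that pattern looks roughly like this (C++20, simplified, no exception safety):

#include <cstddef>
#include <memory>
#include <string>

void demo(std::size_t n)
{
    std::allocator<std::string> alloc;
    std::string* data = alloc.allocate(n);      // uninitialized storage
    for (std::size_t i = 0; i < n; ++i)
        std::construct_at(data + i, "element"); // placement construction
    for (std::size_t i = n; i-- > 0;)
        std::destroy_at(data + i);              // explicit destructor call, in reverse
    alloc.deallocate(data, n);
}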
Fortunately for us, even though our std::shipping_task type's zero-arg destructor is marked explicit, the way std::vector is implemented means it already has to perform explicit destructor calls, so I suppose it really does already "just work" for this particular use case, whether we want that or not. Maybe there will need to be a way to detect when the zero-arg destructor is marked explicit so that std::vector can sic the imaginary compiler on us, or maybe the zero-arg destructor should be = delete and replaced by a tagged destructor instead, but we can save that problem for later.
Well, that was surprisingly simple to resolve... it turns out those compiler errors our imaginary compiler generated were also themselves imaginary.
Our most recent code now compiles without issue, but the tradeoff is that now we have a bug.
If we're in the middle of having already scheduled some of the tasks when our executor throws, the std::future instances we worked so hard to create will be discarded, and the work will continue with nobody to hear it scream.
Thankfully that bug is not specific to our newly-added language features, it's something you can already encounter in real world code, so we're going to just ignore that for now as we focus back on this explicit destructor stuff.
Wait a minute, that gives me an idea!
We did originally start this blog post with the complaint that std::future is something we could accidentally forget to pull the result from - we do sometimes want to discard the result, but we want that to be explicit so that the compiler makes us choose. Let's try giving std::future some explicit destructor love.
For simplicity, I'll just focus on the non-void, non-reference version.
template<class T>
class future
{
public:
    // ...the existing interface stays as it is today...
    [[nodiscard]] T get(){ /* same as today */ }

    void ~future(decltype(std::ignore)) { /* discard/detach the shared state */ }

    [[nodiscard]] explicit T ~future()
    {
        return get(); // wait for the result and hand it to the caller
    }
};
That's something at least.
Now we can choose between f.~future(std::ignore); or taking the value returned by f.~future(), and the compiler will helpfully remind us if we don't pick one of those two options.
Unfortunately, I think we do need to tweak or replace std::vector now. As handy as [[nodiscard]] is, it only generates a compiler warning, and not everyone passes the compiler flags to turn it into an error, and some compilers might suppress it in standard library code anyway.
As it stands currently, std::vector calls the blocking destructor and discards the result, or else terminates if there was an exception. That's not really what we want.
Time for everyone's favorite, our very own home-grown hastily-assembled wrapper around a block of memory being treated as an array!
I'll spare you the less interesting implementation details and just focus on the highlights:
template<class T>
class explicit_destruct_array
{
    T* arr{};
    std::size_t len{};

public:
    // ...construction, reserve, emplace_back, element access, etc. elided...

    template<class... DestructArgs>
    [[nodiscard]] explicit(!std::is_implicitly_destructible_v<T>)
    auto ~explicit_destruct_array(DestructArgs&&... args)
    {
        [[maybe_unused]] auto freeMemory = scope_guard([&] // scope_guard: any scope-exit helper
        {
            ::operator delete(arr);
        });

        // U: whatever the element's explicit destructor returns
        using U = decltype(std::declval<T&>().~T(std::forward<DestructArgs>(args)...));
        std::exception_ptr exception{};

        if constexpr (std::is_void_v<U>)
        {
            for (std::size_t i = len; i-- > 0;) // reverse destruction order
            {
                try
                {
                    T& v = arr[i];
                    v.~T(std::forward<DestructArgs>(args)...);
                }
                catch (...)
                {
                    if (exception)
                    {
                        std::terminate();
                    }
                    exception = std::current_exception();
                }
            }
            if (exception)
            {
                std::rethrow_exception(exception);
            }
        }
        else
        {
            explicit_destruct_array<U> ret;
            try
            {
                ret.reserve(len);
            }
            catch (...)
            {
                exception = std::current_exception();
            }
            for (std::size_t i = len; i-- > 0;) // reverse destruction order
            {
                try
                {
                    T& v = arr[i];
                    ret.emplace_back(v.~T(std::forward<DestructArgs>(args)...));
                }
                catch (...)
                {
                    if (exception)
                    {
                        std::terminate();
                    }
                    exception = std::current_exception();
                }
            }
            if (exception)
            {
                ret.~explicit_destruct_array();
                std::rethrow_exception(exception);
            }
            return ret;
        }
    }
};
Conditional explicit (explicit(bool)) is a pre-existing language feature introduced in C++20, but std::is_implicitly_destructible_v is something I just made up and requires compiler support to implement.
Our templated destructor manually does what the compiler normally has to do with arrays anyway: call each destructor in reverse order, and if one of them throws, pause the exception and destroy the rest, and if another throws then terminate because C++ doesn't allow multiple exceptions to propagate up through the same stack frames (multiple exceptions can be active at once, they just can't overlap in the same stack frames).
If we get to the end, we free the block of memory, then throw the exception we paused (if any) or return normally.
The main difference here is that the destructors we call can themselves return values that we want to capture and return together for inspection, so we have a bit of code duplication and also have to be careful when allocating the array for the returned values; their destructors could also be explicit, so we re-use our own class type even if it's not really necessary to do so, just for consistency.
This does mean we need to add explicit destructor calls for it though, and here I'm just calling the default/zero-arg destructor, which might not be available... so, that will need another template parameter to allow client code to pass in an appropriate deleter type/function, which is a can of worms we're not going to get into just yet.
An issue with this approach of following in the compiler's pawsteps is that it's really unlikely for only zero or one of the objects to throw from its destructor if we're storing an array of our modified std::future, and we want to be able to gracefully handle the case where there are multiple failures (e.g. multiple missing files).
There's also a bug in our code where we continue destruction after an exception outside the catch block, which is an observable difference in behavior for code that checks std::uncaught_exceptions before deciding whether to throw - it's not difficult to fix, but it's annoying to have yet more code duplication for behavior we don't even want.
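The kind of code that can observe that difference is the well-known std::uncaught_exceptions pattern, where a destructor only throws when it isn't running as part of stack unwinding:

#include <exception>
#include <stdexcept>

class ThrowIfSafe
{
    int exceptionsAtConstruction = std::uncaught_exceptions();
public:
    ~ThrowIfSafe() noexcept(false)
    {
        if (std::uncaught_exceptions() == exceptionsAtConstruction)
            throw std::runtime_error("cleanup failed"); // not unwinding, safe to throw
        // otherwise: swallow the failure, because throwing during unwinding terminates
    }
};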
This terminate-upon-second-exception approach is therefore a no-go in my book.
How about we take advantage of std::expected so we can proceed regardless of how many failures occurred?
template<class T>
class explicit_destruct_array
{
    T* arr{};
    std::size_t len{};

public:
    // ...construction, resize, element access, etc. elided...

    template<class... DestructArgs>
    [[nodiscard]] explicit(!std::is_implicitly_destructible_v<T>)
    auto ~explicit_destruct_array(DestructArgs&&... args)
    {
        [[maybe_unused]] auto freeMemory = scope_guard([&] // scope_guard: any scope-exit helper
        {
            ::operator delete(arr);
        });

        using U = decltype(std::declval<T&>().~T(std::forward<DestructArgs>(args)...));
        explicit_destruct_array<std::expected<U, std::exception_ptr>> ret;
        try
        {
            ret.resize(len, std::expected<U, std::exception_ptr>(std::unexpect));
        }
        catch (...) // allocation failure: destruct everything, discarding results and errors
        {
            for (std::size_t i = len; i-- > 0;)
            {
                try
                {
                    arr[i].~T(std::forward<DestructArgs>(args)...);
                }
                catch (...)
                {
                }
            }
            ret.~explicit_destruct_array();
            throw;
        }
        for (std::size_t i = len; i-- > 0;) // reverse destruction order
        {
            T& v = arr[i];
            auto& r = ret[i];
            try
            {
                r.emplace(v.~T(std::forward<DestructArgs>(args)...));
            }
            catch (...)
            {
                r = std::unexpected(std::current_exception());
            }
        }
        return ret;
    }
};
A lot of removed lines and very few added - much better.
We start by preallocating the array of std::expected with empty std::exception_ptr instances, so the resize can only fail due to memory allocation failure, in which case we destruct everything and ignore return values and exceptions along the way (because what are we going to do with them without memory?) and then rethrow that memory allocation failure exception as before (so we can unload the game world, save player progress, and show the out of memory message to the player). If the resize succeeds, then we just overwrite each std::expected with the result of the corresponding destructor or its exception, and finally return the entire new array for inspection.
It's amazing how much simpler the code is just by switching to using std::expected, even though we're still dealing with exception handling along the way! In times past I would have used std::variant, which would have worked just as well here, but std::expected is a much better vocabulary type and has a nicer interface for the way it's meant to be used.
(Just pretend we already made it explicit-destruction-aware. Also yes, if the destructor we're calling is noexcept and doesn't return a value, we don't need to populate nor return an array - I leave that fix as an exercise for the reader. Also, we have to handle explicit destruction of the destructor's return value where we discard it, but that's the can of worms I mentioned earlier that we're not getting into just yet. And yes, the r.emplace call needs handling for when the destructor returns void, another exercise for the reader.)
I can hear you thinking, "wait, but we can just ignore the returned array now" - correct! But that doesn't matter because the compiler has already forced us to go through all the effort of actually getting the array of results, so we have a very high likelihood of remembering to inspect it and handle it appropriately. It's not an object type that we store somewhere for later, we want those results now. We're back in synchronous land, so implicit destructors that discard and swallow are fine. We can also factor this code out of the destructor and re-use it for implementing other operations on our array type, such as for erasing elements / shrinking resizes. Anyway, let's actually use our custom array type now:
[[nodiscard]] explicit_destruct_array<std::future<Buffer>> startLoadingResources(std::span<std::filesystem::path const> paths)
{
    std::vector<std::shipping_task<Buffer ()>> tasks;
    tasks.reserve(std::size(paths));
    explicit_destruct_array<std::future<Buffer>> futures;
    try
    {
        futures.reserve(std::size(paths));
    }
    catch (...)
    {
        futures.~explicit_destruct_array(std::ignore);
        throw;
    }
    for (const std::filesystem::path& path : paths)
    {
        try
        {
            std::future<Buffer> future;
            std::shipping_task<Buffer ()> shipping;
            {
                std::packaged_task<Buffer ()> packaged([path]
                {
                    // load the resource from disk and return a Buffer (body elided)
                });
                std::tie(future, shipping) = packaged.~packaged_task(std::begin_shipment);
            }
            futures.emplace_back(std::move(future));
            tasks  .emplace_back(std::move(shipping));
            shipping.~shipping_task();
        }
        catch (...)
        {
            futures.~explicit_destruct_array(std::ignore);
            throw;
        }
    }
    for (std::shipping_task<Buffer ()>& task : tasks)
    {
        try
        {
            std::shipping_task<Buffer ()> shipping(std::move(task));
            shipping.~shipping_task(executor);
        }
        catch (...)
        {
            futures.~explicit_destruct_array(std::ignore);
            throw;
        }
    }
    return futures;
}
The code is uglier now, but that's actually a good thing in this case: we are forced to reveal that hidden code path I mentioned previously, where we might want to do something more intelligent. For example, we could cancel all those scheduled tasks via the return value from the executor, or we could do a partial return where the caller surmises that only some of the file loads were able to be started and tries the rest again later. We don't have to do those things, but the location of that decision is now plain and obvious for all to see, and the compiler doesn't let us hide it again. Personally, I call that a win.
On to the next consideration: you may have noticed an inconsistency between the destructor behaviors for the types we've modified. Our std::packaged_task and std::shipping_task zero-arg destructors work the same as they always did, we just marked them explicit to get the compiler to remind us to make a choice each time the destructor has to be called. Our std::future zero-arg destructor however waits for the value, meaning that if we had stored it in std::vector, the code would have compiled without changes but the behavior would have been undesirable in many circumstances.
This is actually something that can already happen in real world C++: if you use std::async, the std::future it returns can block in its destructor, and there is great debate about how to handle this situation.
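For example, this innocent-looking fire-and-forget is real, current C++, and it blocks:

#include <chrono>
#include <future>
#include <thread>

void fireAndForget()
{
    std::async(std::launch::async, []
    {
        std::this_thread::sleep_for(std::chrono::seconds(5));
    });
    // The returned future is a temporary that is destroyed immediately, and its
    // destructor waits for the task - so this "asynchronous" call blocks for ~5 seconds.
}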
As it turns out, our new language features here actually provide a nuanced solution: just wrap std::future in a type with destructor(s) that behave the way you want, since now we can choose whether to wait and get a value or discard/detach the value during destruction.
template<class T, bool discard = true> // second template parameter chooses the implicit behavior
struct wrapped_future : std::future<T>
{
    using future = std::future<T>;
    using future::future;   // inherit the constructors
    using future::~future;  // inherit the destructors (invented syntax)

    ~wrapped_future() requires(discard)
    {
        future::~future(std::ignore);
    }
};
Using a requires clause on a destructor actually isn't new; in fact, it's an already existing way to have "multiple" destructors in C++ (constrained prospective destructors), you just couldn't allow more than one to exist in the final class until we started making these language changes. Inheriting constructors is also a pre-existing C++ feature, but inheriting destructors is something new I just made up since it's easy to give it similar semantics to inheriting constructors.
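For comparison, the constructor-inheriting half of that is already ordinary C++:

#include <future>

template<class T>
struct wrapped_future_today : std::future<T>
{
    using std::future<T>::future; // inherits std::future<T>'s constructors
    // ...but there is no equivalent using-declaration for destructors today.
};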
In this case, all destructors are inherited, but we override the default destructor if the second template parameter is true, and in it we use the discarding destructor of our std::future instead. This means that regardless of what the second template parameter is, the class always has a zero-arg destructor and a discarding destructor, so calling code doesn't always have to worry which one it's dealing with. Note also that I did not mark our overridden destructor as explicit, which means the implicit destruction attempts by the compiler only succeed when the second template parameter is true. We could alternatively remove that requires clause and use if constexpr inside the destructor body instead, to make sure that the zero-arg destructor is always implicit. There's lots of room here for controlling how this works on a case-by-case basis.
So far this experiment is proving to be quite interesting, but it's by no means perfect - it's not perfect in Vale either, but the point is that it doesn't have to be. It's there to help stop us from making common mistakes for the few types that require special treatment, while the rest of the language operates as normal. It can't catch all mistakes though, for example we could store a type which requires explicit destruction in a large class that does a lot of things, and the compiler only makes us deal with the destruction in the enclosing class' destructor. It has no reason to force us to call a destructor in any other circumstance, such as a member function where we were supposed to check on those futures for results but forgot. These sorts of problems are inherently very difficult to solve in a general way like this, but what we have so far is at least a little bit better than nothing at all.
As tempting as it is to start drafting up proposals to add this to the language, I already said at the start of this post that it's too late.
A change like this would require drastic changes to the language and standard library, even for as little as we tried to change.
For example, what about temporaries? Wait, more importantly, how do types which require explicit destruction interact with coroutines? Yeah, you know, the major language feature that dramatically simplifies asynchronous programming, which is the thing we've been trying to make better throughout this entire post?
Every co_await can suspend the coroutine, and then the coroutine can either be resumed or destroyed - that's not a normal unconditional exit path like a return or a throw, it's an entirely new weird conditional thing. Where would we write the explicit destructor calls to satisfy the compiler? We'd have to initially ban using this new explicit destruction stuff with coroutines that can suspend, or possibly disallow using them in coroutines entirely, until something can be figured out.
Another thing to consider is that an operation which can fail should typically be able to be retried, but destructors are final - even though we can choose which destructor to call and they can throw exceptions, the objects and subobjects still always get destroyed, all memory is freed, etcetera. This is true already in C++ with throwing destructors, the changes we made here didn't affect these rules. To retry a failed destructor we have to either back up to a higher level of abstraction in the program and re-create stuff, or the destructor's return value or exception has to include the state necessary to retry. Imagine the temptation to move the object state into an exception type before throwing it, so that whoever catches it is again stuck in the dilemma of dealing with a destructor that can fail. It's something we already have to do in certain circumstances without this new destructor stuff, but it really doesn't feel good, and it hints that destructors are not the best way to get what we want out of the compiler. They certainly come pretty close, though.
Perhaps we don't really need explicit destruction that badly. Now that we have std::expected with convenient monadic operations, it can be quite nice to write code with explicit error handling that doesn't accidentally forget to do stuff, if chaining lambdas together is something you think is nice like I do. We already saw halfway through this post that splitting out std::shipping_task from std::packaged_task helped protect us from potential misuses, and we don't actually need explicit destruction to do that. It would be nice if C++ had something like Rust where moved-from variables became "poisoned", and some static analyzers can check for situations like this already, so maybe it's not far off. Put it all together, throw in some coroutines, and you've got an entirely alternate approach to the problems I covered in this post, without the troubles introduced by explicit destructors.
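As a small sketch of that std::expected style (C++23), with hypothetical load/decompress/parse helpers standing in for real work:

#include <expected>
#include <string>

struct Buffer {};
struct Error { std::string message; };

std::expected<Buffer, Error> load(const std::string& path)
{
    if (path.empty()) return std::unexpected(Error{"empty path"});
    return Buffer{};
}
std::expected<Buffer, Error> decompress(Buffer b) { return b; }
std::expected<Buffer, Error> parse(Buffer b) { return b; }

std::expected<Buffer, Error> loadResource(const std::string& path)
{
    return load(path)
        .and_then(decompress) // each step runs only if the previous one succeeded
        .and_then(parse);     // errors short-circuit and are hard to overlook here
}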
It's nice how exploring experimental language ideas can give us a better perspective on problems like this.
So where does this leave us? Well, things are looking pretty good for Rust, I still need to dive into learning it more properly and figuring out lifetime annotations. I have a feeling they can also help solve problems like this, and I'm sure there's literature on the subject already, I just haven't sought it out yet. I'm sure there's also other languages besides Vale that have the compiler force you to not forget to do things, but it's easy to get fatigued with how many niche programming languages there are. My experience is primarily with C++ because I work in game engine development where C++ is a dominant programming language, so of course I want C++ to get better at making my job easier rather than having to jump ship to an entirely new language with its own quirks. I'm not opposed though, and the way things are going I might be writing games in Rust someday. I hope it'll be as fun as C++ has been for me all these years.