C++23 finally lets us solve the const view problem
For a while now I've been vexed by a problem I don't really know how to name or describe in simple terms. Say we have a JSON file and we are designing a program that provides a more natural way to edit that JSON file than writing JSON by hand. This program would help its user maintain proper structure within the JSON, but it would also be forgiving of erroneous structure in hand-edited or corrupted JSON documents. Once you have a deserialized JSON in memory, you could go through all the effort of loading it all into bespoke classes with rich interactions, but you immediately run into some problems. Depending on your serialization framework, you might have issues with the load and save code getting out of sync, or issues with upgrading from older versions and later downgrading again, or elements in the JSON being re-ordered unnecessarily and causing unnecessary diffs in source control repositories.
The deserialized JSON is right there. It's all already in memory. Why are we copying it again? Why not just use it as-is? We get undo/redo for free, we don't need to touch stuff that the user doesn't edit, and there's only one layer of serialization instead of two. That sounds great!
But you can't just go around writing JSON editing code like a cowboy.
We want strong typing in these parts.
We should create specialized views on the JSON structure itself, and rely on those views to make sure we don't mess up our JSON operations.
This is the main subject of this blog post: how do you create such views without running into trouble with const-correctness?
A view holds no state of its own, only a pointer or reference to a node in the parsed JSON data structure in memory.
C++ has historically struggled to mesh const-correctness with this kind of thing, as seen in almost all standard library containers having to define separate class types for their ::iterator and ::const_iterator types.
We could try that ourselves and see how far it gets us:
Attempt #1
#include <expected>
{
    Missing,
    WrongType,

};
{
    JsonNode

    [[nodiscard]]
    [[nodiscard]] std::expected<
    [[nodiscard]] std::expected<
};
It seems our JSON document describes plants, and so far we have a view that lets us interpret part of the JSON document as a bush plant, giving us a rich interface for querying its identifying tag and its level of hydration. However, we immediately run into the problem that the C++ standard library ran into: how do we add a way to have a mutable view over that part of the JSON in a similarly-rich way? If we try to add setters to this class, we can no longer construct the view from a read-only JSON node. If we make a second class for it, we either have to duplicate the getters or inherit from the immutable view. Inheritance doesn't seem so bad right now, let's give it a try:
:
{
    JsonNode& node;

    [[nodiscard]]


};
So far this all "works".
If we are given a JSON node, we can construct either one of these views if it's not const, and only the read-only view if it is.
Once we have a mutable view, we can trivially pass it to a function that takes the immutable view, and the public inheritance allows it to just work.
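To make that concrete, here is a minimal sketch of the inheritance approach. The JsonNode struct is a hypothetical stand-in for whatever node type your JSON library provides, the names BushView, MutBushView, needsWater and waterAndCheck are assumptions made for illustration, and std::optional stands in for std::expected so the sketch also builds with pre-C++23 compilers.

```cpp
#include <optional>

// Hypothetical stand-in for a parsed JSON node; a real parser's node type will differ.
struct JsonNode {
    std::optional<double> hydration;
};

// Read-only view: holds nothing but a reference to the node it interprets.
struct BushView {
    JsonNode const& node;
    explicit BushView(JsonNode const& n) : node(n) {}
    [[nodiscard]] std::optional<double> getHydrationLevel() const { return node.hydration; }
};

// Mutable view: inherits the getters, but needs a second, non-const reference,
// which shadows the inherited one and doubles the size of the view.
struct MutBushView : BushView {
    JsonNode& node;
    explicit MutBushView(JsonNode& n) : BushView(n), node(n) {}
    // Marking the setter const changes nothing: it writes through the reference.
    void setHydrationLevel(double level) const { node.hydration = level; }
};

// Takes the read-only view; a MutBushView converts implicitly via public inheritance.
[[nodiscard]] inline bool needsWater(BushView const& bush) {
    return bush.getHydrationLevel().value_or(0.0) < 0.25;
}

// Exercises both views on the same node.
[[nodiscard]] inline bool waterAndCheck() {
    JsonNode node;
    MutBushView bush(node);
    bool const dryBefore = needsWater(bush);  // no hydration value stored yet
    bush.setHydrationLevel(0.8);
    return dryBefore && !needsWater(bush);
}
```

Passing the mutable view to needsWater needs no cast at all, which is the part of this design that is worth keeping.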
There are already some minor annoyances, though:
- Adding or removing const to a view type doesn't meaningfully change anything about how we can use it. This can't really be solved, but it's good to be aware of it.
- The mutable view is twice as large as it should be due to the doubled-up references. I don't think this matters much, since these views are supposed to be ephemeral anyway, as any mutation to the JSON structure could invalidate the references. (You might consider using a custom reference type that can detect when it has been invalidated, but that's not relevant to this blog post.)
- The getters are declared far away from their corresponding setters. This is a bigger deal in my mind, since any level of distance between related code makes it easier for things to get out of sync or become annoying to develop with.
Still, there's not any real blocker yet, so let's go deeper. What if we have more-derived kinds of bushes, like berry bushes for example? (Yes, there are other ways we could handle this than with a class hierarchy, but sometimes a class hierarchy is really what you need and unavoidable, so bear with me.)
{
    Black,
    Blue,
    Rasp,

};
:
{

    [[nodiscard]] std::expected<BerryKind, BrokenReason> getBerryKind()
    [[nodiscard]] std::expected<
};
So far so good. We even get to re-use the inherited constructor to save some typing. With a non-mutable berry bush view now we can query four properties, since the earlier getters are inherited and applicable here too. Let's try making the mutable berry bush view now:
:
,
{




};
Oh, that's... well it works, but... oh dear.
This is quickly becoming a bit of a headache to reason about.
We've backed ourselves into a corner here by having to use multiple inheritance to bring in both the read-only view members and the parent class's mutable view members.
That forces us to write our own constructor by hand, and we also have to disambiguate the getNode function, since it would otherwise be ambiguous to call.
It works, but it's ugly and is reducing the maintainability and extensibility of the code.
Well, I lied. There is a case where this doesn't work at all and there's no good way to fix it. If we want to pass a mutable berry bush view to a function that takes a normal read-only bush view, the conversion is ambiguous. You might think we can just add a conversion operator overload to fix this, but nope, still ambiguous. The caller has to manually cast to one of the base classes even though the result is the same in either case.
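The ambiguity is easy to reproduce in isolation. The following is a sketch rather than the original listing: JsonNode is an empty placeholder, and the class names and the nodeOf helper are assumptions. The mutable berry bush view contains two separate BushView subobjects, so the implicit upcast has no single answer, and std::is_convertible reports as much.

```cpp
#include <type_traits>

struct JsonNode {};  // hypothetical placeholder for a parsed JSON node

struct BushView {
    JsonNode const& node;
    explicit BushView(JsonNode const& n) : node(n) {}
};
struct MutBushView : BushView {
    JsonNode& node;
    explicit MutBushView(JsonNode& n) : BushView(n), node(n) {}
};
struct BerryBushView : BushView {
    using BushView::BushView;  // re-use the inherited constructor
};
// Multiple inheritance: one BushView arrives via BerryBushView, another via MutBushView.
struct MutBerryBushView : BerryBushView, MutBushView {
    explicit MutBerryBushView(JsonNode& n) : BerryBushView(n), MutBushView(n) {}
};

// The upcast to the shared grandparent is ambiguous, so it is not implicitly convertible...
static_assert(!std::is_convertible_v<MutBerryBushView&, BushView const&>);
// ...while each direct base is still reachable.
static_assert(std::is_convertible_v<MutBerryBushView&, BerryBushView const&>);
static_assert(std::is_convertible_v<MutBerryBushView&, MutBushView const&>);

inline JsonNode const& nodeOf(BushView const& bush) { return bush.node; }

// The caller has to pick a path by hand, even though both lead to the same node.
[[nodiscard]] inline bool explicitCastWorks() {
    JsonNode node;
    MutBerryBushView view(node);
    // nodeOf(view);  // does not compile: BushView is an ambiguous base
    return &nodeOf(static_cast<MutBushView&>(view)) == &node
        && &nodeOf(static_cast<BerryBushView&>(view)) == &node;
}
```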
Virtual inheritance to the rescue, perhaps?
Alas, not entirely, since that douses our constexpr dreams.
If you don't need constexpr, then virtual inheritance can solve the ambiguous upcast issue, but it also introduces new problems: there is even more manual work in writing every constructor, especially as the class hierarchy deepens, since every virtually inherited base class has to have its constructor explicitly called by every derived class the whole way down, even where it doesn't make sense to do so.
(For example, an abstract class will never have a chance to invoke the constructors of virtually inherited classes, but the language still requires them to be written out.)
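Here is a rough sketch of what the virtual inheritance variant might look like, again with a placeholder JsonNode and assumed class names. The implicit upcast works now because there is only one BushView subobject, but notice how the most-derived class has to initialize the virtual base itself, and that in C++23 none of these constructors can be marked constexpr.

```cpp
struct JsonNode {};  // hypothetical placeholder for a parsed JSON node

struct BushView {
    JsonNode const& node;
    explicit BushView(JsonNode const& n) : node(n) {}
};
struct MutBushView : virtual BushView {
    JsonNode& mutNode;
    explicit MutBushView(JsonNode& n) : BushView(n), mutNode(n) {}
};
struct BerryBushView : virtual BushView {
    explicit BerryBushView(JsonNode const& n) : BushView(n) {}
};
struct MutBerryBushView : BerryBushView, MutBushView {
    // The most-derived class must call the virtual base's constructor directly;
    // the BushView(n) calls made by the two intermediate classes are ignored here.
    explicit MutBerryBushView(JsonNode& n)
        : BushView(n), BerryBushView(n), MutBushView(n) {}
};

inline JsonNode const& nodeOf(BushView const& bush) { return bush.node; }

// With a single shared BushView subobject the upcast is no longer ambiguous.
[[nodiscard]] inline bool implicitUpcastWorks() {
    JsonNode node;
    MutBerryBushView view(node);
    return &nodeOf(view) == &node;
}
```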
Let's start over.
No more of this separate mutable and immutable view nonsense; let's combine it all together and just have the user pass in either a const or non-const reference to the node they want to act on each time.
Attempt #2
{
    [[nodiscard]]

    [[nodiscard]]

};
{
    Black,
    Blue,
    Rasp,

};
:
{
    [[nodiscard]]

    [[nodiscard]]

};
There we go.
Now each getter and setter pair are right next to each other for easy reading, and there's no more multiple inheritance woes.
Unfortunately, just about everything else is worse.
For starters, there's no reason or use at all for instantiating any of these views, since they have no data members and all their member functions are declared static.
It's not much better than a flat C API at this point, especially since all the functions take a raw untyped JSON node reference.
That JSON node could refer to anything! A bush, an animal, the abstract concept of loss, no no no.
The type system was supposed to save us from chaos, not join it!
The major failure here is the loss of the ability to pass the views to functions as parameters.
Since everything has to operate on raw JSON nodes now, you just have to be careful to never get your higher level types confused.
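A small sketch shows how little protection is left. The JsonNode fields and the function names here are assumptions for illustration, and std::optional again stands in for std::expected. Every function is static and takes a raw node, so nothing stops you from handing it a node that was never a berry bush.

```cpp
#include <optional>

// Hypothetical stand-in for a parsed JSON node.
struct JsonNode {
    std::optional<double> hydration;
    std::optional<double> ripeness;
};

struct BushView {
    // Getter and setter sit side by side, but both take a raw node.
    [[nodiscard]] static std::optional<double> getHydrationLevel(JsonNode const& node) { return node.hydration; }
    static void setHydrationLevel(JsonNode& node, double level) { node.hydration = level; }
};

struct BerryBushView : BushView {
    [[nodiscard]] static std::optional<double> getRipeness(JsonNode const& node) { return node.ripeness; }
    static void setRipeness(JsonNode& node, double ripeness) { node.ripeness = ripeness; }
};

// Nothing in the type system ties the node to the view that is applied to it.
[[nodiscard]] inline bool anyNodeIsAccepted() {
    JsonNode notABush;                          // imagine this node describes an animal
    BerryBushView::setRipeness(notABush, 1.0);  // compiles without complaint
    return BerryBushView::getRipeness(notABush).has_value();
}
```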
But, hang on a second... isn't that interesting how in these view classes, every function's first parameter is the thing it wants to act on?
That sort of rings a bell... it's like the this pointer, right?
What was the title of this blog post again?
Attempt #3
This is where you're likely to see something you didn't even think was valid before now.
We're going to take advantage of C++23's new explicit object parameter syntax in conjunction with some classic CRTP to build our own concept of constness, and we won't even have to turn our views or their functions into templates at all.
#include <type_traits>
:
{
    JsonNode

    [[nodiscard]]


    [[nodiscard]]
};
:
{
    JsonNode& node;

    [[nodiscard]]


    [[nodiscard]]


    [[nodiscard]]
};
{
    [[nodiscard]] std::expected<

    [[nodiscard]] std::expected<

};
{
    Black,
    Blue,
    Rasp,

};
:
{
    [[nodiscard]] std::expected<BerryKind, BrokenReason> getBerryKind(

    [[nodiscard]] std::expected<

};
{
    JsonNode& node = self.getNode();

}
{


}
{
    Mut<BerryBushView> view(node);
    view.setRipeness(

    {
        view.setHydrationLevel(
    }
    exampleUsage2(view);
}
Here, we define two template classes, Const and Mut.
Each one publicly inherits from its template argument, which is intended to be the view type.
They then provide the data member and constructor for the JSON node reference, and conversion operators to convert to compatible views based on the pre-existing class hierarchy.
Then, all our views just use explicit object parameter syntax with the assumption that they will be derived by these templates, and the compiler handles all the rest for us.
Note that the views and their functions are not templates, so the definitions can be in a separate source file just the same as always, no annoying template syntax to worry about.
The views also have no data members or constructors, as they don't need them; that's all handled automatically by Const and Mut.
The implementation code just calls self.getNode() and uses it as it pleases.
You might be wondering about that AllowConversionFromMutable parameter.
That's for if you need to have two overloads of a function with one taking Const and the other taking Mut.
Normally there would be ambiguity in many cases since both can convert from a given type, but in this specific case you can just specify false for that second template parameter and it resolves all the ambiguities:
[[nodiscard]] std::expected<std::span<JsonNode
[[nodiscard]] std::expected<std::span<JsonNode >, BrokenReason> getBushes(
You might notice that this effectively gives you the power to define your own alternatives to const with your own rules and meanings, since now the language-level const has no effect here and we've re-created it ourselves.
I think that's a very interesting side effect of solving this problem and I'll be curious to see what the C++ community comes up with using it.
For now though, I'm just finally happy I can easily define and pass around strongly typed views that are simple and convenient to use.
No more fussing with inheritance issues, no more having to be careful about what functions to call on a JSON node; it can all just be strongly typed and const-correct with little to no fuss.
Mission accomplished.
Compiler Explorer demo: https://compiler-explorer.com/z/ohfMnvrYd