Thursday, February 19, 2015

SFINAE std::result_of? Yeah, right!

Return type deduction pre-decltype and auto could really make one mad. Back then, if you wanted a function object, you had to make it "adaptable", meaning it had to inherit from std::unary_ or binary_function and define its first_argument_type, second_argument_type, and result_type. There had been no concept of "transparent functors" allowing us to pass polymorphic functions to higher order functions (like std::plus<> to std::accumulate). For an example of programming in these dark ages, check out the FC++ FAQ.

Just to recap, SFINAE stands for Substitution Failure Is Not An Error. It's the rule that lets you define two overloads of a function such that and one returns decltype(f(x)) and the other decltype(g(x)), if f(x) can't be evaluated, that's not an error. If g(x) can't. either, that's OK too, as long one one of the two can be for any given x.

And so came decltype and it let us do never-before seen, amazing things, right? Well, yes, we could write things like this:
template<class X>
constexpr auto append_blah(X& x) -> decltype(x + "blah")
{
  return x + "blah";
}

int main() {
  append_blah("nah");
}
Obviously, since "nah" += "blah" isn't valid, this will fail to compile, but we'll get a nice, easy to read error message, right? Nah!
auto.cpp: In function `int main()`:
auto.cpp:11:20: error: no matching function for call to `append_blah(const char [4])`
   append_blah("nah");
                    ^
auto.cpp:11:20: note: candidate is:
auto.cpp:5:16: note: template<class X> constexpr decltype ((x + "blah")) append_blah(X)
 constexpr auto append_blah(X x) -> decltype(x + "blah")
                ^
auto.cpp:5:16: note:   template argument deduction/substitution failed:
auto.cpp: In substitution of `template<class X> constexpr decltype ((x + "blah")) append_blah(X) [with X = const char*]`:
auto.cpp:11:20:   required from here
auto.cpp:5:47: error: invalid operands of types `const char*` and `const char [5]` to binary `operator+
That's GCC's output--more error than code! Even with the most relevant section bolded, it's a mess. Clang does a better job, though.
auto.cpp:12:3: error: no matching function for call to 'append_blah'
  append_blah("nah");
  ^~~~~~~~~~~
auto.cpp:6:16: note: candidate template ignored: substitution failure [with X = char const[4]]: invalid operands to binary expression ('char const[4]' and 'const char *')
constexpr auto append_blah(X& x) -> decltype(x + "blah")
               ^                               ~~
So now, thanks to N3436, that the standard dictates that std::result_of be SFINAE, too. Has the situation for GCC improved? First, the new source:
template<class X>
constexpr std::result_of_t<std::plus<>(X, const char*)>
append_blah(X x) {
  return std::plus<>{}(x, "blah");
}

int main() {
  append_blah("nah");
}
And GCC's output:
auto.cpp: In function `int main()`:
auto.cpp:12:20: error: no matching function for call to `append_blah(const char [4])`
   append_blah("nah");
                    ^
auto.cpp:12:20: note: candidate is:
auto.cpp:5:16: note: template<class X> constexpr std::result_of_t<std::plus<void>(X, const char*)> append_blah(X)
 constexpr auto append_blah(X x)
                ^
auto.cpp:5:16: note:   template argument deduction/substitution failed:
While the error message certainly gotten shorter, GCC no longer tells us why the function failed to compile, only that it was a candidate and "template argument deduction/substitution failed"--it doesn't even give us the type of X! SFINAE doesn't mean better diagnostics, it's just an overloading technique. Still, if it decreases the readability of error messages, then having many overloads, all failing, just compounds the problem.

But clang does better, right?
auto.cpp:12:3: error: no matching function for call to 'append_blah'
  append_blah("nah");
  ^~~~~~~~~~~
auto.cpp:5:16: note: candidate template ignored: 
  substitution failure [with X = const char *]: 
    no type named 'type' in `std::result_of<std::plus<void> (const char *, const char *)>`
constexpr auto append_blah(X x)
               ^
1 error generated.
Better, yes, but not great. We still don't know why it failed. If you have deeply nested function calls, but your top-level function guarantees a return of std::result_of_t<invalid-expression>, you'll never know why it failed to compile because result_of doesn't actually evaluate the semantics; it just checks the signatures and spits out that result_of has no type.

There simply must be a way of writing this such that GCC and clang will give a minimal, correct diagnosis of the problem, right? Well, let's try using auto as our return type.
template<class X>
constexpr auto append_blah(X x) {
  return x + "blah";
}

int main() {
  append_blah("nah");
}
What does GCC have to say?
auto.cpp: In instantiation of `constexpr auto append_blah(X) [with X = const char*]`:
auto.cpp:11:20:   required from here
auto.cpp:7:12: error: invalid operands of types `const char*` and `const char [5]` to binary `operator+`
   return x + "blah";
            ^
Wow, straight to the point and totally accurate! We get a nice, minimal trail of what GCC had to instantiate to find the error, but not a hard-to-follow mess of output. How does clang handle it?
auto.cpp:7:12: error: invalid operands to binary expression (`const char *` and `const char *`)
  return x + "blah";
         ~ ^ ~~~~~~
auto.cpp:11:3: note: in instantiation of function template specialization `append_blah<const char *>` requested here
  append_blah("nah");
  ^
1 error generated.
Again, exactly what I wanted!

So in terms of getting good and useful error messages, auto or decltype(auto) gives the best, followed by decltype, and result_of gives the absolute worst. The problem is that auto can't participate in SFINAE, so you can't overload a set of functions on auto and that leaves you with std::enable_ifstd::result_of, or decltype.

Before concluding, let's look at a slightly more obfuscated example.
#include <functional>

constexpr struct plus_f {
  template<class X, class Y>
  constexpr auto operator() (X&& x, Y&& y) const
    -> decltype(std::declval<X>() + std::declval<Y>())
  {
    return std::forward<X>(x) + std::forward<Y>(y);
  }
} plus{};

template<class X>
constexpr auto append_blah(X x) -> decltype(plus(x, "blah")) {
  return plus(x, "blah");
}

int main() {
  append_blah("nah");
}
Now, in order to find the error, the compiler has to go just two levels deep, through append_blah, then to plus_f, and evaluage x + "blah". But will the error message go that deep? We know that if we used std::result_of, it would just say that result_of<plus_f(X, const char*)> has no 'type'. But this is decltype!

First, GCC:
auto.cpp: In function `int main()`:
auto.cpp:19:20: error: no matching function for call to `append_blah(const char [4])`
   append_blah("nah");
                    ^
auto.cpp:19:20: note: candidate is:
auto.cpp:14:16: note: template<class X> constexpr decltype (plus(x, "blah")) append_blah(X)
 constexpr auto append_blah(X x) -> decltype(plus(x, "blah")) {
                ^
auto.cpp:14:16: note:   template argument deduction/substitution failed:
auto.cpp: In substitution of `template<class X> constexpr decltype (plus(x, "blah")) append_blah(X) [with X = const char*]`:
auto.cpp:19:20:   required from here
auto.cpp:14:59: error: no match for call to `(const plus_f) (const char*&, const char [5])`
 constexpr auto append_blah(X x) -> decltype(plus(x, "blah")) {
                                                           ^
auto.cpp:4:18: note: candidate is:
 constexpr struct plus_f {
                  ^
auto.cpp:6:18: note: template<class X, class Y> constexpr decltype ((declval<X>() + declval<Y>())) plus_f::operator()(X&&, Y&&) const
   constexpr auto operator() (X&& x, Y&& y) const
                  ^
auto.cpp:6:18: note:   template argument deduction/substitution failed:
auto.cpp: In substitution of `template<class X, class Y> constexpr decltype ((declval<X>() + declval<Y>())) plus_f::operator()(X&&, Y&&) const [with X = const char*&; Y = const char (&)[5]]`:
auto.cpp:14:59:   required by substitution of `template<class X> constexpr decltype (plus(x, "blah")) append_blah(X) [with X = const char*]`
auto.cpp:19:20:   required from here
auto.cpp:7:35: error: invalid operands of types `const char*` and `const char [5]` to binary `operator+`
     -> decltype(std::declval<X>() + std::declval<Y>())
                                   ^
GCC feels obligated to tell us why, so in terms of having an accurate, if not in an overly-verbose, error messages, it does its job. It'll take some digging, but a user will eventually be able to find out why it failed. But curiously, clang does no better than std::result_of.
auto.cpp:19:3: error: no matching function for call to `append_blah`
  append_blah("nah");
  ^~~~~~~~~~~
auto.cpp:14:16: note: candidate template ignored:
  substitution failure [with X = const char *]:
    no matching function for call to object of type `const struct plus_f`
constexpr auto append_blah(X x) -> decltype(plus(x, "blah")) {
               ^                            ~~~~
1 error generated.
This doesn't even tell us the types of the arguments to plus_f that caused the failure! If plus_f were a more complicated function with a slight semantic error, we'd be left clueless.

Just for comparison, here's the output with GCC, using auto:
auto.cpp: In instantiation of `constexpr auto plus_f::operator()(X&&, Y&&) const [with X = const char*&; Y = const char (&)[5]]`:
auto.cpp:14:24:   required from `constexpr auto append_blah(X) [with X = const char*]`
auto.cpp:18:20:   required from here
auto.cpp:8:31: error: invalid operands of types `const char*` and `const char [5]` to binary `operator+`
     return std::forward<X>(x) + std::forward<Y>(y);
                               ^
So, when it comes to decltype vs. std::result_of, neither really produced ideal diagnostics and both can be rather unhelpful. decltype at least gives us a little more information, but only with GCC.

I discovered this while working on FU. One of my basic functions that I based the library off of had a very simple bug. Not realizing this, I wrote a small function that called another that called another that eventually landed in that one. When I tried to compile, I got several pages of errors, and they kept suggesting that the problem was in one of the mid-level functions. Assuming I might have gotten the result_of expression wrong, I decided to just use auto and let the compiler figure it out. Then, GCC and clang both told me that the next function in the call tree was the culprit. I continued to mark function by function as returning auto until I found the problem. I had tried to pass an rvalue reference to a function taking a non-const reference. After making it pass by const reference, everything else worked.

Does this mean that we should always prefer auto to result_of or decltype when we don't need SFINAE? I think it should be evaluated on a case-to-case basis. In terms of documentation, auto gives the user a pretty muddy perception on what the function will return. For something like std::plus<>, that's not a problem because addition generally doesn't change the type of its arguments too much. For something simple like std::invoke (C++17), std::result_of_t<F&&(X&&...)> gives us the clearest documentation about how it works--it perfect forwards F and the arguments, X...--but this gives us the worst diagnosis if F can't be called with the arguments. decltype would give the best error message (with clang when the error is shallow, otherwise only with GCC), but decltype(std::forward<F>(f)(std::forward<X>(x)...)) doesn't read as nicely as result_of<F&&(X&&...)>

Whether to use auto, result_of, or decltype is a matter of priorities, weighing the importance of documentation (result_of; maybe enable_if) against semantics (decltype) and diagnostics (auto). For FU, a library that depends heavily on dependent typing, I will probably over time switch entirely to using auto because determining the cause of errors is of great importance for making it usable. Users will likely be confused with messages like "result_of<multary_n_f<1,lassoc_f>(UserFunc, TypeA, TypeB, TypeC)> has no member, `type`". At the same time, this feels like poor style--an unfortunate dilemma. With any hope, GCC and clang will improve how they propagate errors over time.

For information on how to improve the diagnostics of functions using decltype, check out pfultz2's article, "Improving error messages in C++ by transporting substitution failures".

Tuesday, February 17, 2015

Common algorithm patterns.

Of the STL, <algorithm> may be the most well-used non-container library, but often requires a level of verbosity that requires just as much typing as the hand-written loop, making it not always feel so convenient. It benefits code that uses it with increased clarity and mathematical soundness, so reducing the syntactic overhead should be a goal of those using it. Today I will talk about these problems and demonstrate ways of making it more terse and sound.

Usually, I try to put all the code in my articles in one gist, but this time I will base it off my most recent work--a library called FU (Functional Utilities)--for sake of simplicity. Still, it applies as well to fit, FTL, and to some degree, range-v3, if you use any of those.

Projection


We often store large data structures for use in many areas, but a specific algorithm may only need one or two members. Say I have a list of people and I want to sort them in order of their year_of_birth. I'm sure anyone with much experience with C++ has seen code like this:
std::sort(std::begin(people), std::end(people),
          [](const Person& a, const Person& b) {
            return a.year_of_birth < b.year_of_birth;
          });
This pattern occurs so frequently than range-v3 accepts "projection" functions for algorithms such as this.
range::v3::sort(people, std::less<>{}, &Person::year_of_birth);
The projection converts its argument to make it suitable for the comparison function, std::less in this example. Internally, it ends up calculating the same thing as the lambda, extracting year_of_birth from both arguments and calling less on them.

One might ask, "but &Person::year_of_birth is not a function, so how can we pass it as one?" Internally, range-v3 and FU use a more generic concept of "invoking", which does not limit them to regular functions. See n3727, or std::invoke (C++17) for more.

This issue is not limited to <algorithm> or std::sort, it is a generic problem, so I wrote fu::proj for use with the current STL and elsewhere.

std::sort(std::begin(people), std::end(people),
          fu::proj(std::less<>{}, &Person::year_of_birth));
But since requiring a projection over less is so very common,  fu::proj_less can be more convenient.
std::sort(std::begin(people), std::end(people),
          fu::proj_less(&Person::year_of_birth));
Fit also provides a similar utility, by, although the syntax is a bit different.

std::sort(std::begin(people), std::end(people),
          fit::by(std::mem_fn(&Person::year_of_birth), _ < _));
I argue that each of the four versions above would be more clear and less error prone than the first, using a lambda. Rather than specifying exactly how to apply the arguments, we need only specify what to apply.

So why not use std::bind for examples like this?
using namespace std::placeholders;
std::sort(std::begin(people), std::end(people),
          std::bind(std::less<>{},
                    std::bind(&Person::year_of_birth, _1),
                    std::bind(&Person::year_of_birth, _2)));
With examples like that, it's no wonder people prefer lambdas! It actually becomes less readable with the overly-verbose and overly-general std::bind

Projection works well enough for comparisons, but then you have a function like std::accumulate and you only want to project the right-hand argument. Consider calculating the average age of the list of people.
auto acc = [](int accum, const Person& p) { return accum + p.age(); }
int sum = 
  std::accumulate(std::begin(people), std::end(people), 0, acc);
float ave = float(sum) / people.size();
Again, range-v3 allows projection here:
int sum = range::v3::accumulate(people, std::plus<>{}, &Person::age);
float ave = float(sum) / people.size();
One can use fu::rproj (right-projection) with the existing STL.
int sum = 
  std::accumulate(std::begin(people), std::end(people),
                  0, fu::rproj(std::plus<>{}, &Person::age));
float ave = float(sum) / people.size();
While proj and rproj will be sufficient for most <algorithm> functions, we do have one special case: std::transform or std::equal. Here, the right-hand and left-hand arguments of our function may be different types, but we still might want to use something like std::plus or std::equal_to to combine them.

For lack of a better example, consider that my struct, Person, contains a member, year_of_death. I want a sanity check that for any person, year_of_birth < year_of_death.
bool ok = std::equal(std::begin(people), std::end(people),
                     std::begin(people),
                     [](const Person& a, const Person& b) {
                       return a.time_of_birth < b.time_of_death;
                     });
With range-v3:
bool ok = range::v3::equal(people, people, std::less<>{},
                           &Person::year_of_birth, &Person::year_of_death)
For fu, I decided to call this function where the right- and left-hand arguments require different projections join instead of binary_proj or bproj.
auto pred = fu::join(std::less<>{}, &Person::year_of_birth,
                                    &Person::year_of_death);
bool ok = std::equal(std::begin(people), std::end(people),
                     std::begin(people), pred);

Although, a more obvious solution might be to use std::all_of, using fu::split(f, left, right), which produces a function, s, such that s(x) == f(left(x), right(x)).
bool ok = std::all_of(std::begin(people), std::end(people),
                      fu::split(std::less<>{}, &Person::year_of_birth,
                                               &Person::year_of_death));
So, in this instance, the lambda might actually win out on terseness, but projection does one nice thing for us: it asserts mathematical soundness. It might be very tempting to write an equality operator between a Person and an std::string to compare the person's name,  especially for use with <algorithm>s, but it makes no real sense. I also store the person's birthplace as a string, so should person == "place" be valid, too? No, it makes much more sense to define a predicate based on a function with well-known mathematical properties, like std::equal_to, and use projection to obtain the items for which equality has already been defined.

Haskell programmers like to say "if it compiles, it works!" Some of that has to do with the idea of "referential transparency", but another large part of it relates to using mathematically sound operations to assert correctness. If each component, or function, is obviously correct, and the algorithm is made of obviously correct operations, then most likely the algorithm is obviously correct.

Let's use this concept to prove the correctness of our first example, sorting by the projection of std::less on the projection of &Person::year_of_birth. It may seem unnecessary, but the rigors of mathematics require that we can formally prove these things, not just intuit or know them. 
  • We can assume the correctness of std::sort by definition, if the comparison function implements a strict weak order. Strict weak order for any comparison, comp, means that it must be irreflexive (!comp(x,x)), weakly ordered (!comp(a, b) && !comp(b, a) implies a == b), and transitive (comp(a,b) && comp(b,c) => comp(a,c)). 
  • We know by definition that std::less meets this requirement thus proj_less(f) meets this requirement if f(a) < f(b) preserves this property (also by definition). 
  • Finally, we know that f(a) == a.year_of_birth, stored as an integer, and that a set of integers sorted by < are strict-weak ordered. 
Thus, we can call that example "obviously correct." In fact, we can further state that any ordering over proj_less(f) will be correct with very few, if any, exceptions.

Mathematical soundness and obvious correctness make it easy to reason about code, but projection alone does not give us the required tools to accomplish it. We need other fundamental tools, as I will show in the next section.

The building blocks.


Say I want to construct a list of people born in Germany. Like always, we can start with the generic lambda.
std::vector<Person> germans;
std::copy_if(std::begin(people), std::end(people),
             std::back_inserter(germans),
             [](const Person& p) { return p.birth_place == "Germany"; });
Range-v3 does allow a projection for copy_if, but unfortunately, that won't much help us.
range::v3::copy_if(people, std::back_inserter(germans),
                   [](const std::string& place) { return place == "Germany"; },
                   &Person::birth_place);
It would have been less typing without the projection, although we at least know that the lambda and projection both are obviously correct and thus is the expression. To really get the desired level of terseness, we need a partial application of equality.

FU supplies a function, part, so that we could write "auto pred = fu::part(std::equal_to<>{}, "Germany")", and then one could write "pred("Hungary")", which would return false, or "pred("Germany")", true, but it's just not terse. Since one often needs to pass binary operators or partially-applied operators to functions like copy_if, FU also supplies a number of such functions. For this example, we need fu::eq, which acts similarly to std::equal_to except that fu::eq(x) returns the partial application. (But fu::eq(x,y) does compute x == y, like one would expect.) So with that, we can return to our example:
range::v3::copy_if(people, std::back_inserter(germans),
                   fu::eq("Germany"), &Person::birth_place);
Fit's placeholders utilities also recognize the importance of partial applications like this.
range::v3::copy_if(people, std::back_inserter(germans),
                   _ == "Germany", &Person::birth_place);
And with the STL using FU,
std::copy_if(std::begin(people), std::end(people),
             std::back_inserter(germans),
             fu::proj(fu::eq("Germany"), &Person::birth_place));
(Note: fu::ucompose, short for "unary composition", might be more appropriate here, but proj(f,g)(x) is roughly equivalent to ucompose(f,g)(x).)

Let's say I need all the names stored in people in a contiguous string, separated by commas. I could just call std::accumulate, starting with an empty std::string and append each one using rproj(std::plus<>{}, &Person::name), but I want to minimize the number of required allocations so I have to sum up their sizes first. It'll look something like this, with range-v3:
auto name_len = [](const Person& p) { return p.name.size(); };
size_t size = range::v3::accumulate(people, 0, std::plus<>{}, name_len);
                                                                         
std::string names;
names.reserve(size + people.size() * 2);
                                                                         
auto append = [&](const std::string& str) { names.append(str);
                                          names.append(", "); };
range::v3::for_each(people, append, &Person::name);
(For simplicity, I'm ignoring that this algorithm will append an extra ", " at the end.)

Again, we could not represent the mini-functions name_len and append as simple projections and had to resort to writing out the lambdas. However, can we represent them using fu::proj? Well, for name_len, yes.
auto name_len = fu::proj(&std::string::size, &Person::name);
Invoking &Person::name returns an std::string, and &std::string::size gives us the desired result--it's obviously correct. append, however... no, and for two reasons. Firstly, std::string::append is an overloaded function so we can't just take its address; we'd need to wrap it in a lambda, anyway. Secondly, we also need to append ", ", and while one could write a series of generic functional utilities and define append as accordingly, we'd most likely lose our "obvious correctness" with no real gains. For completeness, here is the version for the STL:
auto name_len = fu::proj(&std::string::size, &Person::name);
size_t size = std::accumulate(std::begin(people), std::end(people),
                              0, fu::rproj(std::plus<>{}, name_len));
                                                                    
std::string names;
names.reserve(size + people.size() * 2);
                                                                    
auto append = [&](const std::string& str) { names.append(str);
                                          names.append(", "); };
std::for_each(std::begin(people), std::end(people),
              fu::proj(append, &Person::name));

Conclusions


One might wonder why, in the last example, I prefer to make append operator on just an std::string and insist on projecting over &Person::name. I can trivially prove the correctness of append(str); it appends str followed by ", " to names, and it might even by useful in another part of the program. proj(append, &Person::name) is obviously correct for appending a person's name. Each unit, append and &Person::name operates on one level of abstraction; proj links them together. If the way a Person holds its name changes, I only need to change &Person::name and append remains valid. If I need to change append, the projection still remains valid.

Keeping our functions small and on one layer of abstraction makes the program more robust, aside from maintaining soundness of the algorithm. FU's proj, rproj, part, and split can be used to construct most simple predicates, relational functions, and accumulators, and can also be combined in simple, granular expressions to make more complex ones.

Links

fu::proj_less: proj_less has a very simple definition: proj(std::less<>{}), although it actually uses a helper so that it may be constexpr. Most fu functions are objects that can be implicitly partially applied. In fact, the definition of proj is actually closer to [](auto f, auto pf, auto x, auto y) { return f(pf(x), pf(y)); }, but perfect forwarding. Thus, proj(std::less<>{}, &Person::name) is also a partial application.

fu::rproj(std::plus<>): I considered adding a function to FU called accumulator = rproj(std::plus<>{}), but it didn't seem very intuitive. Please comment below if you have an opinion on that.

range::v3::copy_if: It may be preferable to use range::v3::view::remove_if(rng, std::not_fn(pred)) for the example given. One might want to use range::v3::view::filter, but it has been deprecated.

fit::by: We require std::mem_fn for Fit because it does not use a generalized invoke function.