I think this is a limitation.
From what I understand, when you use [] the pattern inside is parsed independently of the surrounding pattern and then gets "embedded" into it. So the parser is basically trying to make sense of the pattern "_ hh" on its own, but it can't know what the "_" means there, because the bd that the "_" is supposed to elongate sits outside this inner pattern.
(People who actually understand the parser code, correct me if I'm wrong lol)
Yes, the parser will fail at this point for the reasons @ritschse gives. You could do this instead:
d1 $ s "bd [ ~ hh ] hh hh"
It's not exactly the same, as you get half a step of silence rather than a continuation of the previous event. In general it will sound the same with samples, since the sample keeps playing through the rest anyway, until you e.g. put legato on.
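
If you do need the bd event itself to last one and a half steps (for example so that legato follows the longer duration), one possible way around the limitation is to flatten the pattern so the "_" sits in the same sequence as the bd it elongates. This is only a sketch and not something suggested in the posts above; it assumes a standard Tidal session where d1 and the bd and hh samples are available. The two lines are alternatives with the same timing, so evaluate one or the other:

```haskell
-- Eight equal steps instead of four: bd is held for three of them
-- (1.5 of the original steps), the first hh gets one step,
-- and the last two hh get two steps each.
d1 $ s "bd _ _ hh hh _ hh _"

-- The same timing written with @ weights (3 + 1 + 2 + 2 = 8).
d1 $ s "bd@3 hh hh@2 hh@2"
```

Here the bd is a single event running up to the first hh, so there is no rest in between.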