I think this is one of the more fundamental questions, since it gets right to the heart of what a social robot is all about: getting the user comfortable with Jibo to the point where he feels real and alive, without pre-scripted, mechanical-feeling routines. The user doesn't need a script to communicate with you... you're both human, and you communicate on an unspoken level as well as with words.
Considering that, Jibo should at some point be able to put the whole scene together and try to understand what you say at a level above what each skill allows. At a base level, he should understand what each skill, core and third-party, does and when it's needed. That way he could suggest what he thinks you need when you say:
"I'm leaving the house for a few days. Make sure to keep an eye on things."
...understanding that you'll need a broad-level security skill here, even though the skill itself only reacts to things like "turn on the alarm".
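To make the idea a bit more concrete, here is a rough sketch of how a layer above the skills might pick one from a loosely worded request. Everything in it is an assumption for illustration: the skill names, their descriptions, and the `suggest_skill` function are made up, and it is not based on Jibo's actual SDK. It only counts shared words between what you said and a short description of each skill, which is far cruder than real language understanding, but it shows the shape of "know what each skill is for, then suggest the closest one".

```python
import re

# Hypothetical skill registry: names and descriptions are invented for this sketch.
SKILLS = {
    "security": "watch the house, keep an eye on things while you are away, turn on the alarm",
    "weather": "tell the forecast, rain, temperature, what to wear today",
    "timer": "set a timer or reminder, count down minutes",
}

# Common words that carry no hint about which skill is wanted.
STOPWORDS = {
    "the", "a", "an", "to", "on", "for", "of", "i", "i'm", "you", "are",
    "or", "and", "while", "what", "few", "make", "sure",
}

def tokens(text):
    """Lowercase the text and return its set of non-stopword words."""
    words = re.findall(r"[a-z']+", text.lower())
    return {w for w in words if w not in STOPWORDS}

def suggest_skill(utterance, skills=SKILLS):
    """Return the skill whose description shares the most words with the
    utterance, or None when nothing overlaps at all."""
    spoken = tokens(utterance)
    best_name, best_score = None, 0
    for name, description in skills.items():
        score = len(spoken & tokens(description))
        if score > best_score:
            best_name, best_score = name, score
    return best_name

print(suggest_skill("I'm leaving the house for a few days. Make sure to keep an eye on things."))
```

The point of the sketch is that the skill itself never has to recognize the long sentence. The layer above it only needs a description of what the skill is for, so a third-party skill could take part just by describing itself.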
Point being, looking up a list of commands in a help area feels natural on a phone, but not with a character. It would be better to add intuition into the mix.