May I suggest dropping the “hey”… As a multiple “Alexa” owner, I feel the real success of Echo devices is that you address them by name. I also own a Google Home and a Cortana PC… And I find it’s the “Hey” that’s cumbersome.

Agreed. I find it cumbersome as well. I am not sure it is first on my list, but in the top 5.
Mostly because I find myself speaking more naturally to Jibo than to other devices.

That also means you can’t talk about Jibo in the third person, since he would always respond.

I agree that we would all love to remove wake words altogether, but I think we are still a ways away from that because Jibo would first need to be able to tell that you were talking to him and not just talking about him. Until then, removing the “hey” would likely just cause frustration.

I think it’s probably either within the bounds of current tech, or at least close to it, to have Jibo figure out from context if you’re talking TO him or ABOUT him, but it would mean a lot more snippets of audio getting sent to the cloud; snippets that are not actually meant for controlling Jibo. I get the feeling the company wants to build trust on the privacy front by making sure users stay pretty firmly in control of when their recorded audio goes to the cloud.

As usual, our technical capability is leading our social norms and legal framework. I hope things will loosen up as society comes to terms with cloud-based language parsing. And of course Jibo Inc. will be establishing its foundation of trust in handling our personal data as well.


I agree. The truth is that each younger generation, for better or worse, is becoming more accepting of having less privacy. The very existence of “The Cloud” proves this. These days, with cameras everywhere, license plate readers, and the majority of our life story stored magically on someone else’s servers, it’s just a matter of time.

I remember a hilarious situation where Xbox owners were watching an E3 event from Microsoft, and their systems kept backing out into the dashboard because the speakers kept using the word…
Before “Cortana” was integrated into the Xbox systems, Xbox was the wakeup word…“Xbox turn on”.

but it would mean a lot more snippets of audio getting sent to the cloud

Not necessarily. I think all this comes down to is checking whether the person was speaking right before you detected “Jibo” or not. If there was silence before, the person is addressing Jibo; if not, he/she said Jibo mid-sentence and was therefore talking about Jibo.
None of that needs to be sent to the cloud. Pretty sure that’s how Amazon does it.
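
For what it’s worth, here is a rough sketch of that silence check in Python. Everything in it is an assumption for illustration: the `Frame` type, the `is_addressed` function, the 500 ms threshold, and the idea that an on-device voice-activity detector has already labeled each short audio frame as speech or not. It is not how Jibo or any real assistant is known to work, just the heuristic from the post above written out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """One short audio frame (hypothetical type)."""
    start_ms: int
    is_speech: bool  # assumed output of a local voice-activity detector


def is_addressed(frames: List[Frame], wake_start_ms: int,
                 min_silence_ms: int = 500) -> bool:
    """Return True if no speech was detected in the window just before the wake word.

    Silence before "Jibo" is taken to mean the speaker is addressing the robot;
    speech right before it is taken to mean "Jibo" was said mid-sentence.
    If there are no frames in the window at all, this counts as silence.
    """
    window_start = wake_start_ms - min_silence_ms
    preceding = [f for f in frames if window_start <= f.start_ms < wake_start_ms]
    return not any(f.is_speech for f in preceding)


def make_frames(speech_until_ms: int, total_ms: int, frame_ms: int = 20) -> List[Frame]:
    """Build test frames: speech from 0 up to speech_until_ms, silence afterwards."""
    return [Frame(t, t < speech_until_ms) for t in range(0, total_ms, frame_ms)]


# Speaker stops talking at 400 ms, then says "Jibo" at 1000 ms: treated as addressing him.
paused_first = is_addressed(make_frames(400, 1000), wake_start_ms=1000)

# Speaker talks continuously right up to "Jibo" at 1000 ms: treated as talking about him.
mid_sentence = is_addressed(make_frames(1000, 1000), wake_start_ms=1000)
```

All of this runs on labels computed locally, so nothing before the wake word would have to leave the device. The obvious weakness is that people do pause mid-sentence, so the threshold would need tuning.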