How will skill developers tune their skill?


I’ve been reading up on interactive voice response (IVR) systems, since they seem very similar in design to how Jibo’s SDK approaches things, and one item always stands out: the tuning of a skill after it has launched. Most sources (e.g.) view it as a key part of the process.

Now, the Microsoft page, for example, assumes that the skill developer has access to the speech recordings and the logs, but am I incorrect in assuming that that is out of the question for Jibo? I can imagine a total privacy disaster if Jibo shared people’s recordings with anonymous skill developers.

So I was wondering: how is that planned to work? According to these sites it’s hard to guess how users will use a skill and what they’ll say to it, but without seeing that data I can’t imagine how one would improve one’s own skill.

By the same token, I have no idea how Echo or Google Home go about it. They are obviously faced with the same conundrum.


Here’s an example of how it could be done, if you follow Apple’s rules for Siri, according to Wired:

"Whenever you speak into Apple’s voice-activated personal digital assistant, it ships it off to Apple’s data farm for analysis. Apple generates a random number to represent the user and it associates the voice files with that number. This number — not your Apple user ID or email address — represents you as far as Siri’s back-end voice analysis system is concerned.

Once the voice recording is six months old, Apple “disassociates” your user number from the clip, deleting the number from the voice file. But it keeps these disassociated files for up to 18 more months for testing and product improvement purposes."

If Jibo records the information, but disassociates the personally identifiable data away from each record, leaving only the usage data, it could theoretically make that available to the developer in either a raw form or filtered in whatever way the Jibo team wishes. I think that would work fine.
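To make that disassociation step concrete, here’s a rough sketch of how it might look (all names here are invented, and TypeScript is just for illustration; this is not how Jibo actually works):

```typescript
// Sketch: store each utterance under a random pseudonym, never the
// account ID; once a record passes the retention window, delete the
// pseudonym so only anonymous usage data remains.

interface UtteranceRecord {
  pseudonym: string | null; // random ID, NOT the Jibo account
  transcript: string;
  recordedAt: number;       // epoch milliseconds
}

const SIX_MONTHS_MS = 1000 * 60 * 60 * 24 * 30 * 6;

function makePseudonym(): string {
  // stand-in for a proper random identifier
  return Math.random().toString(36).slice(2);
}

function record(transcript: string, now: number): UtteranceRecord {
  return { pseudonym: makePseudonym(), transcript, recordedAt: now };
}

// Strip the pseudonym from records older than the retention window,
// leaving only usage data that could in principle be shown to developers.
function disassociate(
  records: UtteranceRecord[],
  now: number
): UtteranceRecord[] {
  return records.map(r =>
    now - r.recordedAt > SIX_MONTHS_MS ? { ...r, pseudonym: null } : r
  );
}
```

The point is simply that the link between a person and their recordings is severed by deleting the pseudonym, not the usage data itself.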


That’s a very interesting idea, and I agree one could work it that way.

Does Siri have custom skills though? It’s one thing to do this for the internal improvement of Siri, another to provide this (securely and anonymously) to individual skill developers.


This could be an opt-in for the end user to provide data to “help make Jibo’s skills better by providing anonymous usage data to third-party developers of the skills you use.” That way it’s clear to users what they’re getting into, and they choose to agree to it themselves. What do you think?


I just scoured the web some more, and apparently Amazon does NOT share either recordings or transcripts with developers, out of security concerns.

I could see the opt-in model, but the danger is of course that it takes one bad incident and Jibo is dead in the water.


Actually, you can’t do opt-in unless you make sure nobody else ever uses the device. You can’t decide on behalf of friends who visit you that their recordings get shared with Jibo. In terms of privacy model, you have to treat it like a public device.


Yup, there are definitely larger privacy concerns because this is a multi-user device, but if crafted carefully the opt-in model could still work on a per-user basis. That way only interactions by a particular authenticated user would be subject to that user’s own opt-in decision; non-authenticated and opted-out interactions would be discarded after processing. It could work that way, though there may well be a better model out there.
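Something like this per-user filter is what I have in mind (hypothetical names, just a sketch of the idea):

```typescript
// Sketch: keep an interaction for developer analytics only if it came
// from an authenticated user who has opted in; everything else
// (guests, opted-out users) is dropped after processing.

interface Interaction {
  userId: string | null; // null = unauthenticated speaker, e.g. a guest
  transcript: string;
}

function collectForAnalytics(
  interactions: Interaction[],
  optedIn: Set<string>
): Interaction[] {
  return interactions.filter(
    i => i.userId !== null && optedIn.has(i.userId)
  );
}
```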


We’re now firmly in the realm of speculation though; I’d love to hear from the powers that be what the actual process will be. If those IVR websites are to be believed, the success or failure of a skill depends on that tuning step, and surely even more so for a device like Jibo that needs to be buttery smooth.


We don’t have plans to provide this type of usage data at this time, because we want to make sure that we are taking the privacy of Jibo users into account. We certainly know that developers want data they can use to improve their skills over time, and we will continue to consider how best to address that desire while also protecting user privacy going forward.


Oh, I see. But, doesn’t that mean skill developers will have to build their skills entirely in the dark about how users actually use them?


How about if Jibo created an API that lets a skill request data totals and other statistics about its own usage? That would allow skill learning, but in a way that removes user identity data from the skill’s awareness, since the skill only ever receives statistics computed from the protected data, never the data itself.
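As a rough sketch of what I mean (all names invented, nothing official):

```typescript
// Sketch: the platform records events against named states of a skill's
// flow; the skill developer can query only aggregate counts. Raw
// transcripts and identities never cross this boundary.

type EventKind = "success" | "misrecognition" | "abort";

class SkillStats {
  private counts = new Map<string, number>();

  private key(state: string, kind: EventKind): string {
    return `${state}:${kind}`;
  }

  // Called by the platform during normal skill operation.
  recordEvent(state: string, kind: EventKind): void {
    const k = this.key(state, kind);
    this.counts.set(k, (this.counts.get(k) ?? 0) + 1);
  }

  // The only thing a skill developer would be allowed to query: totals.
  total(state: string, kind: EventKind): number {
    return this.counts.get(this.key(state, kind)) ?? 0;
  }
}
```

A developer could then see, say, that the “askColor” step of their skill had far more misrecognitions than successes, without ever seeing what anyone actually said.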


I think those stats would be a good start for sure. That is, if you see a lot of misrecognitions, or outright aborts, in a certain part of your skill, you might try changing your grammar or what Jibo says. That would be entirely anonymous and would give you solid feedback to improve your skill.

Well, not to sound too negative, but this all folds into the big nebulous cloud of how developers upload, register, activate, and maintain their skills anyway. It’s still not clear how users would even enter my skill. Will it be Echo/Home-style, having to say “Hey Jibo, use skillname to play Pictionary”? That would be the obvious way of going about it, but also very non-Jibo-ish.