Testing specialists Applause have debuted an AI solution promising to help tackle algorithmic bias while providing the scale of data needed for robust training.
Applause has built a vast global community of testers for its app testing solution which is trusted by brands including Google, Uber, PayPal, and more. The company is leveraging this relatively unique asset to help overcome some of the biggest hurdles facing AI development.
AI News spoke with Kristin Simonini, VP of Product at Applause, about the company’s new solution and what it means for the industry ahead of her keynote at AI Expo North America later this month.
“Our customers have been needing additional support from us in the area of data collection to support their AI developments, train their system, and then test the functionality,” explains Simonini. “That latter part being more in-line with what they traditionally expect from us.”
Applause has worked predominantly with companies in the voice space, but it is increasingly expanding into areas such as gathering and labelling images and running documents through OCR.
This existing breadth of experience in areas where AI is most commonly applied today puts the company and its testers in a good position to offer truly useful feedback on where improvements can be made.
Specifically, Applause’s new solution operates across five unique types of AI engagements:
- Voice: Source utterances to train voice-enabled devices, and test those devices to ensure they understand and respond accurately.
- OCR (Optical Character Recognition): Provide documents and corresponding text to train algorithms to recognize text, and compare printed docs and the recognized text for accuracy.
- Image Recognition: Deliver photos taken of predefined objects and locations, and ensure objects are being recognized and identified correctly.
- Biometrics: Source biometric inputs like faces and fingerprints, and test whether those inputs result in an experience that’s easy to use and actually works.
- Chatbots: Give sample questions and varying intents for chatbots to answer, and interact with chatbots to ensure they understand and respond accurately in a human-like way.
“We have this ready global community that’s in a position to pull together whatever information an organisation might be looking for, do it at scale, and do it with that breadth and depth – in terms of locations, genders, races, devices, and all types of conditions – that make it possible to pull in a very diverse set of data to train an AI system.”
Examples Simonini provides of the training data Applause’s global testers can supply include voice utterances, specific documents, and images which meet set criteria like “street corners” or “cats”. A lack of such niche data sets with the necessary diversity is one of the biggest obstacles AI developers face today, and one which Applause hopes to help overcome.
A significant responsibility
Everyone involved in developing emerging technologies carries a significant responsibility. AI is particularly sensitive because everyone knows it will have a huge impact across most parts of societies around the world, but no-one can really predict how.
How many jobs will AI replace? Will it be used for killer robots? Will it make decisions on whether to launch a missile? To what extent will facial recognition be used across society? These are important questions to which no-one can give a guaranteed answer, but they’re certainly on the minds of a public that’s grown up around things like 1984 and Terminator.
One of the main concerns about AI is bias. Fantastic work by the likes of the Algorithmic Justice League has uncovered gross disparities in the effectiveness of facial recognition algorithms depending on the race and gender of each individual. For example, IBM’s facial recognition algorithm was 99.7 percent accurate when used on lighter-skinned males compared to just 65.3 percent on darker-skinned females.
Simonini highlights another study she read recently in which voice recognition accuracy for white males was over 90 percent. For African-American females, however, it was more like 30 percent.
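To make figures like these concrete, here is a minimal sketch of how accuracy can be broken down per demographic group so that a disparity becomes visible rather than being hidden in a single overall score. It is an illustration only and not Applause’s method: the function names, the group labels, and the sample utterances are all made up for the example.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} for (group, expected, recognised) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, expected, recognised in records:
        total[group] += 1
        if expected == recognised:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


def largest_gap(per_group):
    """Difference between the best-served and worst-served group."""
    return max(per_group.values()) - min(per_group.values())


# Hypothetical test results: what the tester said vs. what the system heard.
records = [
    ("group_a", "play music", "play music"),
    ("group_a", "set a timer", "set a timer"),
    ("group_a", "call mum", "call mum"),
    ("group_a", "lights off", "lights on"),
    ("group_b", "play music", "play music"),
    ("group_b", "set a timer", "set alarm"),
    ("group_b", "call mum", "call tom"),
    ("group_b", "lights off", "lights on"),
]

per_group = accuracy_by_group(records)
gap = largest_gap(per_group)
print(per_group, gap)
```

On this made-up sample the overall accuracy is 50 percent, which says nothing about the fact that one group is served three times better than the other. Reporting the per-group numbers alongside the gap is what exposes that.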
Addressing such disparities is not only necessary to prevent things such as inadvertently automating racial profiling or giving some parts of society an advantage over others, but also to allow AI to reach its full potential.
While there are many concerns, AI has a huge amount of power for good as long as it’s developed responsibly. AI can drive efficiencies to reduce our environmental impact, free up more time to spend with loved ones, and radically improve the lives of people with disabilities.
A failure of companies to take responsibility for their developments will lead to overregulation, and overregulation leads to reduced innovation. We asked Simonini whether she believes robust testing will reduce the likelihood of overregulation.
“I think it’s certainly improved the situation. I think that there’s always going to probably be some situations where people attempt to regulate, but if you can really show that effort has been put forward to get to a high level of accuracy and depth then I think it would be less likely.”
Human testing remains essential
Applause is not the only company working to reduce bias in algorithms. IBM, for example, has a tool called AI Fairness 360 which is essentially an AI itself used to scan other algorithms for signs of bias. We asked Simonini why Applause believes human testing is still necessary.
“Humans are unpredictable in how they’re going to react to something and in what manner they’re going to do it, how they choose to engage with these devices and applications,” comments Simonini. “We haven’t yet seen an advent of being able to effectively do that without the human element.”
An often-highlighted challenge with voice recognition is the wide variety of languages spoken and their regional dialects. Many American voice recognition systems even struggle with my accent from the South West of England.
Simonini adds another consideration: slang, and the need for voice services to keep up-to-date with changing vocabularies.
“Teenagers today like to, when something is hot or cool, say it’s ‘fire’ [‘lit’ I believe is another one, just to prove I’m still down with the kids],” explains Simonini. “We were able to get these devices into homes and really try to understand some of those nuances.”
Simonini then further explains the challenge of understanding the context of these nuances. In her “fire” example, there’s a very clear need to understand when there’s a literal fire and when someone is just saying that something is cool.
“How do you distinguish between this being a real emergency? My volume and my tone and everything else about how I’ve used that same voice command is going to be different.”
The growth of AI apps and services
Applause established its business in traditional app testing. Given the expected growth in AI apps and services, we asked Simonini whether Applause believes its AI testing solution will become as big as – or perhaps even bigger than – its current app testing business.
“We do talk about that; you know, how fast is this going to grow?” says Simonini. “I don’t want to keep talking about voice, but if you look statistically at the growth of the voice market vis-à-vis the growth and adoption of mobile; it’s happening at a much faster pace.”
“I think that it’s going to be a growing portion of our business but I don’t think it necessarily is going to replace anything given that those channels [such as mobile and desktop apps] will still be alive and complementary to one another.”
Simonini will be speaking at AI Expo North America on November 13th in a keynote titled Why The Human Element Remains Essential In Applied AI. We asked what attendees can expect from her talk.
“The angle that we chose to sort of speak about is really this intersection of the human and the AI and why we – given that it’s the business we’re in and what we see day-in, day-out – don’t believe that it becomes the replacement of but how it can work and complement one another.”
“It’s really a bit of where we landed when we went out to figure out whether you can replace an army of people with an army of robots and get the same results. And basically that no, there are still very human-focused needs from a testing perspective.”