The FINANCIAL — IBM on February 23 announced new and expanded cognitive APIs for developers that enhance Watson’s emotional and visual senses, further extending the capabilities of the industry’s largest and most diverse set of cognitive technologies and tools.
Three APIs, Tone Analyzer, Emotion Analysis and Visual Recognition, are now available in beta. Additionally, Text to Speech (TTS) has been updated with new emotional capabilities and is being re-released as Expressive TTS for general availability. These APIs are pushing the sensory boundaries of how humans and machines interact, and they are designed to improve how developers embed these technologies to create solutions that can think, perceive and empathize, according to IBM.
“We continue to advance the capabilities we offer developers on IBM’s Watson platform to help this community create dynamic AI infused apps and services,” said David Kenny, general manager of IBM Watson. “We are also simplifying the platform, making it easier to build, teach and deploy the technology. Together, these efforts will enable Watson to be applied in many more ways to address societal challenges.”
IBM is also adding tooling capabilities and enhancing its SDKs (Node, Java, Python, and newly introduced iOS Swift and Unity) across the Watson portfolio and adding Application Starter Kits to make it easy and fast for developers to customize and build with Watson. All APIs are available through the IBM Watson Developer Cloud on Bluemix.
New Beta APIs Advance Emotional Intelligence and Image Recognition
Building on existing Watson APIs that draw on advances in natural language processing, machine learning and deep learning, Tone Analyzer, Emotion Analysis and Visual Recognition are now available in beta.
Tone Analyzer: Tone Analyzer has deepened its analysis capabilities in this beta release in order to give users better insights about their own tone in a piece of text. Adding to its previous experimental understanding of nine traits across three tones – emotion (negative, cheerful, angry), social propensities (open, agreeable, conscientious) and writing style (analytical, confident, tentative) – Tone Analyzer now analyzes new emotions, including joy, disgust, fear, and sadness, as well as new social propensities, including extraversion and emotional range. Also new to the beta version, Tone Analyzer is moving from analyzing single words to analyzing entire sentences. This analysis is helpful in situations that require nuanced understanding. For example, in speech writing it can indicate how different remarks might come across to the audience, from exhibiting confidence and agreeableness to showing fear. In customer service, it can help analyze a variety of social, emotional and writing tones that influence the effectiveness of an exchange.
Watson Ecosystem Partner Connectidy has developed a relationship science platform that uses the Tone Analyzer beta to help users understand how messages to potential matches may come across. Dineen Tallering, President of Connectidy, says, “Through the analysis of authentic language in real time, Tone Analyzer provides people with an unprecedented level of perspective into how their emotions and social propensities play out in their written word. This is a critical piece of emotional intelligence because it enables us to continually educate users on how they appear to others. We are able to advance past static algorithms to achieve a level of cognitive insight that continuously learns and helps guide our users towards greater self awareness and better choices.”
Emotion Analysis: IBM has added Emotion Analysis as a new beta function within the AlchemyLanguage suite of APIs. Emotion Analysis uses natural language processing techniques to analyze external content and help users better understand the emotions of others. Developers can now go beyond identifying positive and negative sentiments and distinguish a broader range of emotions, including joy, fear, sadness, disgust and anger. With this deeper understanding, Emotion Analysis can help identify new insights in areas like customer reviews, surveys, and social media posts. For example, in addition to knowing whether product reviews are negative or positive, businesses can now identify whether a change in a product feature prompted reactions of joy, anger or sadness among customers.
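As a rough illustration of the product-review scenario, the sketch below tallies which of the five emotions dominates each review in a batch. It does not call any Watson or AlchemyLanguage API: the dictionary layout, the function names and the scores are all assumptions invented for this example, standing in for whatever per-emotion scores a service would return.

```python
from collections import Counter

# The five emotions named in the announcement.
EMOTIONS = ("joy", "fear", "sadness", "disgust", "anger")


def dominant_emotion(scores):
    """Return the emotion with the highest score for one review.

    `scores` is a hypothetical mapping of emotion name to a 0..1 score.
    Missing emotions count as 0.0; ties go to the earliest entry in EMOTIONS.
    """
    return max(EMOTIONS, key=lambda emotion: scores.get(emotion, 0.0))


def summarize(reviews):
    """Count how often each emotion is the dominant one across reviews."""
    return Counter(dominant_emotion(scores) for scores in reviews)


# Made-up scores for three reviews posted after a product change.
reviews_after_change = [
    {"joy": 0.10, "fear": 0.05, "sadness": 0.20, "disgust": 0.15, "anger": 0.70},
    {"joy": 0.80, "fear": 0.02, "sadness": 0.05, "disgust": 0.03, "anger": 0.10},
    {"joy": 0.05, "fear": 0.10, "sadness": 0.15, "disgust": 0.10, "anger": 0.60},
]

summary = summarize(reviews_after_change)
print(summary.most_common())
```

Comparing such a tally for reviews written before and after a feature change is one simple way a business might see whether the change shifted reactions toward anger or joy.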
Visual Recognition: Moving beyond visual capabilities that allow systems to understand and tag an image, Visual Recognition is available now in beta and can be trained to recognize and classify images based on training material.
While other visual search engines can tag images with a fixed set of classifiers or generic terms, Visual Recognition allows developers to train Watson around custom classifiers for images – the same way users can teach Watson natural language classification – and build apps that visually identify unique concepts and ideas. This means that Visual Recognition is now customizable with results tailored to each user’s specific needs. For example, a retailer might create a tag specific to a style of its pants in the new spring line so it can identify when an image appears in social media of someone wearing those pants.
Watson Integrates Emotional IQ into its Text to Speech API
To further advance emotional capabilities for cognitive systems, IBM has also incorporated emotional IQ into its existing Text to Speech API and is releasing Expressive TTS for general availability.
Expressive Text to Speech: Resulting from 12 years of research and development, Expressive TTS is now generally available and incorporates emotional IQ into the existing Watson TTS API. Cognitive systems can for the first time generate and deliver an advanced level of adaptive emotion in vocal interactions, meaning computers can not only understand natural language, tone and context, but respond with the appropriate inflection.
Previously, automated systems relied on a pre-determined, rules-based corpus of words. This approach was characterized by limited emotional cues, such as “good news equals a raised tone” or “bad news equals a slowed tone.” In creating Expressive TTS, IBM studied and decided on a specific set of expressive styles to frame this speech capability. To do this, the research team made significant enhancements to IBM’s existing synthesis engine, incorporating ideas from machine learning to allow for seamless switching across expressive styles. Developers now have more flexibility in building cognitive systems that can demonstrate sensitivity in human interactions.
These new and expanded services are part of IBM’s open Watson platform that now includes more than 30 Watson services and is available through the IBM Watson Developer Cloud on Bluemix. With a community of more than 80,000 developers, students, entrepreneurs and tech enthusiasts currently tapping into the cognitive computing platform to prototype and build cloud-based cognitive computing applications, these advancements are the latest example of IBM’s commitment to empowering the developer community to build cognitive enabled apps and businesses with Watson.
IBM Watson: Pioneering a New Era of Computing
Watson represents a new era in computing called cognitive computing, where systems understand the world the way humans do: through senses, learning, and experience. Watson continuously learns from previous interactions, gaining in value and knowledge over time. With the help of Watson, organizations are leveraging cognitive computing to transform industries, help professionals do their jobs better, and solve important challenges. To advance Watson, IBM has three dedicated business units: Watson, established for the development of cloud-delivered cognitive computing technologies that represent the commercialization of “artificial intelligence” or “AI” across a variety of industries; Watson Health, dedicated to improving the ability of doctors, researchers, insurers and other related health organizations to surface new insights from data and to deliver personalized healthcare; and Watson IoT, focused on making sense of data embedded in more than 9 billion connected devices operating in the world today, which generate 2.5 quintillion bytes of new data daily.