Microsoft is expanding its Cognitive Services with a personalization service, handwriting recognition APIs and more – TechCrunch

As part of a rather odd batch of announcements ahead of its flagship Build developer conference next week, Microsoft today announced a series of new pre-built machine learning models for its Cognitive Services platform. These include an API for building personalization features, a form recognition tool to automate data entry, a handwriting recognition API, and an enhanced speech recognition service that focuses on transcribing conversations.

The most important of these new services may be Personalizer. After all, there are few applications and websites that do not try to offer their users personalized features. This is difficult, in part, because it often involves building models based on data stored in various silos. With Personalizer, Microsoft is betting on reinforcement learning, a machine learning technique that does not require the kind of labeled training data typically used in machine learning. Instead, the reinforcement learning agent constantly tries to find the best way to achieve a given goal based on what users do. Microsoft claims to be the first company to offer such a service and has been testing it itself on Xbox, where it saw a 40% increase in engagement with its content after introducing the service.
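To make the idea of learning from user behavior rather than from labeled data more concrete, here is a minimal sketch of an epsilon-greedy bandit, one of the simplest reinforcement-style techniques. It is not Microsoft's Personalizer API or its actual algorithm; the class name, the content categories, and the click-through rates are all hypothetical and exist only for illustration.

```python
import random


class EpsilonGreedyRanker:
    """Toy learner (hypothetical): picks one action, then learns from a reward."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {a: 0 for a in self.actions}
        self.values = {a: 0.0 for a in self.actions}  # running mean reward

    def choose(self):
        # Explore a random action occasionally; otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.values[a])

    def reward(self, action, value):
        # Incremental mean update driven only by observed feedback,
        # not by a pre-labeled training set.
        self.counts[action] += 1
        self.values[action] += (value - self.values[action]) / self.counts[action]


def simulate(rounds=5000, seed=0):
    # Assumed click-through rates standing in for real user behavior.
    click_rates = {"news": 0.05, "sports": 0.10, "games": 0.30}
    ranker = EpsilonGreedyRanker(click_rates, epsilon=0.1, seed=seed)
    users = random.Random(seed + 1)
    for _ in range(rounds):
        action = ranker.choose()
        clicked = users.random() < click_rates[action]
        ranker.reward(action, 1.0 if clicked else 0.0)
    return ranker


ranker = simulate()
best = max(ranker.values, key=ranker.values.get)
print("highest estimated reward:", best)
print("times each option was shown:", ranker.counts)
```

The learner is never told which option is "correct"; it only sees whether a simulated user clicked, and it gradually shifts toward the option that earns the most reward. A production service would also take context about the user and the content into account, which this sketch leaves out.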

The handwriting recognition API, officially called Ink Recognizer, can automatically recognize handwriting, common shapes and the layout of inked documents. This is something Microsoft has long been interested in as it developed its Windows 10 inking capabilities, so it is not surprising that the company is now packaging it as a cognitive service. Indeed, Microsoft Office 365 and Windows already use exactly this service, so we are talking about a fairly robust system. With this new API, developers can now integrate these same features into their own applications.

Conversation Transcription does exactly what its name implies: it transcribes conversations, and it is part of Microsoft's existing speech-to-text features in the Cognitive Services lineup. It can identify different speakers, transcribe the conversation in real time and even handle crosstalk. It already integrates with Microsoft Teams and other meeting software.

Also new is Form Recognizer, an API that makes it easier to extract text and data from forms and business documents. This may not sound very exciting, but it solves a very common problem. The service needs only five samples to learn how to extract data, and users do not have to do the tedious manual labeling that is often necessary to build such systems.

Form Recognizer is also coming to Cognitive Services containers, which allow developers to run these models outside of Azure and on their edge devices. The same applies to the existing speech-to-text and text-to-speech services, as well as to the existing Anomaly Detector.

In addition, the company also announced today that its Neural Text-to-Speech, Computer Vision Read and Text Analytics Named Entity Recognition APIs are now generally available.

Some of these existing services are also getting feature updates: the Neural Text-to-Speech service now supports five voices, while the Computer Vision API can now recognize more than 10,000 concepts, scenes and objects, as well as 1 million celebrities, up from 200,000 in a previous version (are there really that many celebrities?).
