Microsoft's Artificial Intelligence and Research group has reached a dramatic milestone: its speech recognition system is now as accurate as a human transcriptionist. The system posted a record-setting word error rate of 5.9 percent, down from 6.3 percent a month earlier.
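For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn a system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. It is not Microsoft's evaluation code, and the function name `word_error_rate` and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper (hypothetical name), not Microsoft's scoring tool.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up example: one wrong word out of four reference words.
    print(word_error_rate("yeah uh-huh i see", "yeah uh i see"))  # 0.25
```

By this definition, a 5.9 percent rate means roughly six word errors for every 100 words in the reference transcript.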
Interestingly, the AI system had the most difficulty transcribing the sound “uh-huh.” Other than that, it generally missed the same sounds as humans.
Reaching this level took 2,000 hours of training with Microsoft’s Computational Network Toolkit (CNTK). Still, Microsoft says it has more work to do on distinguishing between speakers and filtering out background noise.