Step by Step to create an Azure Cognitive Services – Text to Speech API Application


Sathish Nadarajan
SharePoint MVP
Published On :   01 Mar 2018



In this article, let us see how to create an Azure Cognitive Services – Text to Speech API application using C# and the Bing Speech API.

Before getting started, we need an Azure subscription. Microsoft also offers a 30-day trial.

As usual, let us go through the procedure step by step.

1. Log in to the azure.microsoft.com portal with a valid Azure account.

https://azure.microsoft.com/en-us/try/cognitive-services/

2. The home page looks like the one below.

clip_image002

3. Click on Login and log in with valid credentials.

clip_image004

clip_image006

4. We can see the list of APIs provided by Microsoft.

clip_image008

5. Add the Bing Speech API (the trial is valid for 30 days).

6. We will get the endpoint and the keys for the Speech API. Make a note of the endpoint and one of the two keys. Either key can be used.

clip_image010

7. With this done, come back to Visual Studio.

8. Open Visual Studio and create a new Console App project.

clip_image012

9. Add the NuGet package for Azure Speech Recognition, “Microsoft.ProjectOxford.SpeechRecognition-x64”. The project was called Project Oxford before it became Cognitive Services, which is why the NuGet package name says ProjectOxford instead of Cognitive Services.

clip_image014

clip_image016

clip_image018

10. Rebuild the application.

11. Execute the application and we will see an exception.

12. If we get an app break exception, refer to the previous article.

13. Now, edit Program.cs and paste the code below.

 using CognitiveServicesTTS;
 using System;
 using System.Media;
 using System.Threading;
 using System.Threading.Tasks;
 
 namespace CognitiveServices.Demo
 {
     class Program
     {
         static void Main(string[] args)
         {
             // Get the input from the user
             Console.WriteLine("Enter the Text to Speak : ");
             string input = Console.ReadLine();
 
             // Call the Speak method and wait for its task, so that any
             // exception surfaces here instead of being silently lost
             Speak(input).GetAwaiter().GetResult();
             Console.ReadLine();
         }
 
         public static async Task Speak(string speech)
         {
             // The Authentication class ships with the TTSClient code,
             // which is available in the attached source code.
             // Paste the key which we got from the Azure portal
             Authentication auth = new Authentication("**********");
             string accessToken = auth.GetAccessToken();
 
             // Endpoint URL
             string uri = "https://speech.platform.bing.com/synthesize";
             var speaker = new Synthesize();
 
             // Subscribe to the OnAudioAvailable event
             speaker.OnAudioAvailable += Speaker_OnAudioAvailable;
 
             // Various options for the speech format
             var options = new Synthesize.InputOptions
             {
                 RequestUri = new Uri(uri),
                 Text = speech,
                 VoiceType = Gender.Female,
                 Locale = "en-US",
                 VoiceName = "Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)",
                 OutputFormat = AudioOutputFormat.Riff16Khz16BitMonoPcm,
                 AuthorizationToken = "Bearer " + accessToken
             };
 
             await speaker.Speak(CancellationToken.None, options);
         }
 
         // Triggered once the audio response is available
         private static void Speaker_OnAudioAvailable(object sender, GenericEventArgs<System.IO.Stream> e)
         {
             // Play the returned WAV stream synchronously, then release it
             SoundPlayer player = new SoundPlayer(e.EventData);
             player.PlaySync();
             e.EventData.Dispose();
         }
     }
 }
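
The Authentication and Synthesize classes come from the attached source, so their internals are not shown in this article. As a rough sketch only, this is how the two HTTP requests behind them could be built by hand. The class name BingSpeechRequests, the method names, the token endpoint URL, the header names and the "TTSDemo" user agent are assumptions based on my recollection of the Bing Speech REST API and are not taken from the attached source. Only the synthesize URL, the voice name and the output format come from the code above.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper, not part of the attached TTSClient source.
public static class BingSpeechRequests
{
    // Assumed token endpoint of the Bing Speech API.
    public const string TokenUri = "https://api.cognitive.microsoft.com/sts/v1.0/issueToken";

    // Builds the SSML body that carries the text, locale, gender and voice name.
    // XElement escapes special characters in the text (for example & becomes &amp;).
    public static string BuildSsml(string text, string locale, string gender, string voiceName)
    {
        XNamespace xml = XNamespace.Xml;
        var speak = new XElement("speak",
            new XAttribute("version", "1.0"),
            new XAttribute(xml + "lang", locale),
            new XElement("voice",
                new XAttribute(xml + "lang", locale),
                new XAttribute(xml + "gender", gender),
                new XAttribute("name", voiceName),
                text));
        return speak.ToString(SaveOptions.DisableFormatting);
    }

    // Request that exchanges the subscription key for a short-lived access token.
    public static HttpRequestMessage BuildTokenRequest(string subscriptionKey)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, TokenUri);
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
        request.Content = new StringContent(string.Empty);
        return request;
    }

    // Request that posts the SSML to the synthesize endpoint and asks for 16 kHz mono WAV.
    public static HttpRequestMessage BuildSynthesisRequest(Uri endpoint, string accessToken, string ssml)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, endpoint);
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", accessToken);
        request.Headers.Add("X-Microsoft-OutputFormat", "riff-16khz-16bit-mono-pcm");
        request.Headers.Add("User-Agent", "TTSDemo");
        request.Content = new StringContent(ssml, Encoding.UTF8, "application/ssml+xml");
        return request;
    }
}
```

With an HttpClient, we would presumably send the token request first, read the response body as the access token, and pass that token to BuildSynthesisRequest. The response stream of the second call would then be the WAV audio that SoundPlayer plays in the code above.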
 

14. Execute the code and a console window will pop up asking for the text to speak.
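
In step 13 the key is pasted directly into Program.cs. If we would rather keep it out of the source file, one option is to read it from an environment variable. The sketch below assumes a variable named SPEECH_API_KEY and a helper class named SpeechConfig. Both names are made up for this example and are not defined by Azure or by the attached source.

```csharp
using System;

// Hypothetical helper for this example only.
public static class SpeechConfig
{
    // Assumed variable name; pick any name you like.
    public const string KeyVariable = "SPEECH_API_KEY";

    // Returns the trimmed key, or throws if the variable is missing or blank.
    public static string ReadKey()
    {
        string key = Environment.GetEnvironmentVariable(KeyVariable);
        if (string.IsNullOrWhiteSpace(key))
        {
            throw new InvalidOperationException(
                "Set the " + KeyVariable + " environment variable to one of the two keys from the portal.");
        }
        return key.Trim();
    }
}
```

The Authentication line in Program.cs could then read new Authentication(SpeechConfig.ReadKey()) instead of containing the key itself.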

Download the Source HERE

Happy Coding,

Sathish Nadarajan.
