Step-by-Step Guide to Creating an Azure Cognitive Services Text to Speech API Application

Sathish Nadarajan
Solution Architect
March 1, 2018

In this article, let us see how to create an Azure Cognitive Services Text to Speech application in C# using the Bing Speech API.

Before getting started, we need an Azure subscription. Microsoft also offers a 30-day trial.

As usual, let us go through the procedure step by step.

1. Log in to the portal with a valid Azure account.

2. The home page will look like the one below.


3. Click on Login and log in with valid credentials.



4. We can see the list of APIs provided by Microsoft.


5. Add the Bing Speech API (the trial is valid for 30 days).

6. We will get the endpoint and the keys for the Speech API. Make a note of the endpoint and one of the keys. Either of the two keys can be used.


7. With the endpoint and key noted, come back to Visual Studio.

8. Open Visual Studio and create a new project of type Console App.


9. Add the NuGet package for Azure Speech Recognition, "Microsoft.ProjectOxford.SpeechRecognition-x64". The service was originally named Project Oxford before it became Cognitive Services, which is why the NuGet package name contains ProjectOxford instead of Cognitive Services.




10. Rebuild the application.

11. Execute the application. We will see the exception below.

12. If an app break exception occurs, refer to the previous article.

13. Now, edit Program.cs and paste the code below.

 using CognitiveServicesTTS;
 using System;
 using System.Media;
 using System.Threading;
 using System.Threading.Tasks;

 namespace CognitiveServices.Demo
 {
     class Program
     {
         static void Main(string[] args)
         {
             // Get the input from the user
             Console.WriteLine("Enter the Text to Speak : ");
             string input = Console.ReadLine();

             // Call the Speak method and wait for it to finish
             Speak(input).Wait();
         }

         public static async Task Speak(string speech)
         {
             // The Authentication class ships with the TTSClient code in the attached source.
             // Paste the key which we got from the Azure portal.
             Authentication auth = new Authentication("**********");
             string accessToken = auth.GetAccessToken();

             // Endpoint URL noted in step 6 (left empty here; new Uri("") will throw until it is filled in)
             string uri = "";

             var speaker = new Synthesize();

             // Subscribe to the OnAudioAvailable event
             speaker.OnAudioAvailable += Speaker_OnAudioAvailable;

             // Options for the speech request
             var options = new Synthesize.InputOptions
             {
                 RequestUri = new Uri(uri),
                 Text = speech,
                 VoiceType = Gender.Female,
                 Locale = "en-US",
                 VoiceName = "Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)",
                 OutputFormat = AudioOutputFormat.Riff16Khz16BitMonoPcm,
                 AuthorizationToken = "Bearer " + accessToken
             };

             await speaker.Speak(CancellationToken.None, options);
         }

         // Triggered once the audio response is available
         private static void Speaker_OnAudioAvailable(object sender, GenericEventArgs<System.IO.Stream> e)
         {
             // Play the returned RIFF (WAV) stream and block until playback ends
             SoundPlayer player = new SoundPlayer(e.EventData);
             player.PlaySync();
         }
     }
 }

14. Execute the code. The console window will prompt for the text to speak.
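The options set in step 13 (Locale, VoiceType, VoiceName and Text) are most likely turned into an SSML document before being sent to the service. The sketch below shows one way such a body could be built. The class and method names (SsmlSketch, BuildSsml) are hypothetical and are not part of the attached source, and the exact markup produced by the Synthesize class may differ.

```csharp
using System.Xml.Linq;

// Hypothetical helper, for illustration only; not part of the attached TTSClient source.
public static class SsmlSketch
{
    // Builds a <speak><voice>text</voice></speak> document from the same
    // values that step 13 passes through Synthesize.InputOptions.
    public static string BuildSsml(string locale, string gender, string voiceName, string text)
    {
        var speak = new XElement("speak",
            new XAttribute("version", "1.0"),
            new XAttribute(XNamespace.Xml + "lang", locale),
            new XElement("voice",
                new XAttribute(XNamespace.Xml + "lang", locale),
                new XAttribute(XNamespace.Xml + "gender", gender),
                new XAttribute("name", voiceName),
                text)); // XElement escapes characters such as < and & in the text

        // Emit the markup on a single line, without indentation
        return speak.ToString(SaveOptions.DisableFormatting);
    }
}
```

For example, SsmlSketch.BuildSsml("en-US", "Female", "Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)", input) would produce a body for the text typed at the prompt. Building the markup with XElement rather than string concatenation keeps user input from breaking the XML.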

Download the Source HERE
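The Authentication class used in step 13 comes from the attached source, so its internals are not shown in this article. As a rough sketch of what such a class typically does, the code below exchanges the subscription key for a short-lived access token over HTTP. The token URL and the names TokenSketch, BuildTokenRequest and FetchTokenAsync are assumptions made for this illustration; the attached class may use a different endpoint or approach.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only; not the attached Authentication class.
public static class TokenSketch
{
    // Assumed token endpoint; check the endpoint shown in the Azure portal for your key.
    public const string TokenUri = "https://api.cognitive.microsoft.com/sts/v1.0/issueToken";

    // Builds the POST request that carries the subscription key in a header.
    public static HttpRequestMessage BuildTokenRequest(string subscriptionKey)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, TokenUri);
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
        request.Content = new StringContent(string.Empty); // the request body is empty
        return request;
    }

    // Sends the request and returns the response body, which is the token text.
    public static async Task<string> FetchTokenAsync(HttpClient client, string subscriptionKey)
    {
        using (var request = BuildTokenRequest(subscriptionKey))
        using (var response = await client.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

The returned token is what step 13 prefixes with "Bearer " and passes as AuthorizationToken. Keeping the request construction in its own method makes it easy to inspect the header without calling the service.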

Happy Coding,

Sathish Nadarajan.

Author Info

Sathish Nadarajan
Solution Architect
Sathish is a Microsoft MVP for SharePoint (Office Servers and Services) with 13+ years of experience in Microsoft technologies. He holds a Master's degree in Computer Aided Design and Business
