Voice Transcription

Learn how to transcribe audio messages using powerful neural network models.

The Voice Transcription extension converts an audio message into text.

Before you begin

This extension uses a third-party API service, Rev.ai, to transcribe your audio messages.
Create an account with Rev.ai and get the Access Token you need to configure this extension.

Settings

  1. Log in to the CometChat Dashboard and select your app.
  2. On the Extensions page, add the Voice Transcription extension.
  3. Go to the Installed tab and open the Settings for this extension.
  4. Enter the Rev.ai Access Token and click Save.

How does it work?

Once the extension is enabled for your app and configured, recipients receive the transcription details in the message metadata.

The transcription is added to the message after it has been sent, as an update to the original message, so you need to implement the onMessageEdited listener to receive it. See the Edit a Message page under the Messaging section of each SDK for details.
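
As a rough illustration of the listener described above, the following JavaScript sketch stores each transcription as it arrives. The `transcriptions` map and the standalone `onMessageEdited` function are hypothetical names chosen for this example, and the commented-out wiring at the end assumes the CometChat JavaScript SDK's `addMessageListener` and `MessageListener`; check the Edit a Message page of your SDK for the exact registration code.

```javascript
// Hypothetical in-memory store: transcriptions keyed by message ID.
const transcriptions = new Map();

// Hypothetical callback: runs when the SDK reports an edited message.
// It only relies on the message's getId() and getMetadata() methods.
function onMessageEdited(message) {
  const metadata = message.getMetadata() || {};
  const extensions = (metadata["@injected"] || {}).extensions || {};
  const transcription = extensions["voice-transcription"];
  if (transcription && typeof transcription.transcribed_message === "string") {
    transcriptions.set(message.getId(), transcription.transcribed_message);
  }
}

// Assumed wiring with the CometChat JavaScript SDK (not executed here):
// CometChat.addMessageListener(
//   "VOICE_TRANSCRIPTION_LISTENER",
//   new CometChat.MessageListener({ onMessageEdited })
// );
```

Keeping the lookup in a plain function makes it easy to exercise with a stub message object before connecting it to the SDK.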

Here is a sample response:

"@injected": {
  "extensions": {
    "voice-transcription": {
      "transcribed_message": "This is a test"
    }
  }
}

If the data is missing, it means that the extension has timed out.
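
One way to handle the missing-data case is to fall back to a placeholder. The sketch below is only an illustration: `transcriptOrFallback` and its default placeholder text are hypothetical, and the function works on a plain metadata object shaped like the sample response above.

```javascript
// Hypothetical helper: returns the transcript, or a fallback string when the
// extension data is missing (for example, because the extension timed out).
function transcriptOrFallback(metadata, fallback = "Transcription unavailable") {
  const text =
    metadata?.["@injected"]?.extensions?.["voice-transcription"]?.transcribed_message;
  return typeof text === "string" ? text : fallback;
}

// Metadata shaped like the sample response.
const sampleMetadata = {
  "@injected": {
    extensions: {
      "voice-transcription": { transcribed_message: "This is a test" }
    }
  }
};

console.log(transcriptOrFallback(sampleMetadata)); // prints "This is a test"
console.log(transcriptOrFallback({}));             // prints "Transcription unavailable"
```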

Implementation

At the recipient's end, call the getMetadata() method on the message object to fetch the metadata, then read the transcribed message from it.

JavaScript

var metadata = message.getMetadata();
if (metadata != null) {
  var injectedObject = metadata["@injected"];
  if (injectedObject != null && injectedObject.hasOwnProperty("extensions")) {
    var extensionsObject = injectedObject["extensions"];
    if (
      extensionsObject != null &&
      extensionsObject.hasOwnProperty("voice-transcription")
    ) {
      var voiceTranscriptionObject = extensionsObject["voice-transcription"];
      var transcribed_message = voiceTranscriptionObject["transcribed_message"];
    }
  }
}

Java

JSONObject metadata = message.getMetadata();
if (metadata != null) {
  // optJSONObject returns null instead of throwing when a key is missing
  JSONObject injectedObject = metadata.optJSONObject("@injected");
  if (injectedObject != null) {
    JSONObject extensionsObject = injectedObject.optJSONObject("extensions");
    if (extensionsObject != null) {
      JSONObject transcriptionObject = extensionsObject.optJSONObject("voice-transcription");
      if (transcriptionObject != null) {
        // Empty string if the field is absent
        String transcribedMessage = transcriptionObject.optString("transcribed_message");
      }
    }
  }
}

Kotlin

val metadata = message.metadata
if (metadata != null && metadata.has("@injected")) {
  val injectedObject = metadata.getJSONObject("@injected")
  if (injectedObject.has("extensions")) {
    val extensionsObject = injectedObject.getJSONObject("extensions")
    if (extensionsObject.has("voice-transcription")) {
      val transcriptionObject = extensionsObject.getJSONObject("voice-transcription")
      // Empty string if the field is absent
      val transcribedMessage = transcriptionObject.optString("transcribed_message")
    }
  }
}

Swift

// Audio messages are media messages, so cast to MediaMessage.
// Optional binding avoids force-unwrapping missing keys.
if let mediaMessage = message as? MediaMessage,
   let metadata = mediaMessage.metaData,
   let injectedObject = metadata["@injected"] as? [String: Any],
   let extensionsObject = injectedObject["extensions"] as? [String: Any],
   let transcriptionObject = extensionsObject["voice-transcription"] as? [String: Any] {
  let transcribedMessage = transcriptionObject["transcribed_message"] as? String
}
