Transcribe audio to text

Amplify iOS v1 is deprecated as of June 1st, 2024. No new features or bug fixes will be added. Dependencies may become outdated and potentially introduce compatibility issues.

Please use the latest version (v2) of Amplify Library for Swift to get started. Refer to the upgrade guide for instructions on upgrading your application to the latest version.

Amplify libraries should be used for all new cloud-connected applications. If you are currently using the AWS Mobile SDK for iOS, you can access the documentation here.

Set up the backend

If you haven't already done so, run amplify init inside your project and then amplify add auth (we recommend selecting the default configuration).

Run amplify add predictions and select Convert. Then use the following answers:

? What would you like to convert?
  Translate text into a different language
  Generate speech audio from text
❯ Transcribe text from audio
? Provide a friendly name for your resource
  <Enter a friendly name here>
? What is the source language? (Use arrow keys)
  <Select your default source language>
? Who should have access?
  Auth users only
❯ Auth and Guest users

Working with the API

The following examples convert speech to text. To override the source language you selected when adding this resource with the Amplify CLI, pass a language in the options object, as shown below.

import Amplify

func speechToText(speech: URL) {
    // The language set here overrides the source language chosen in the CLI.
    let options = PredictionsSpeechToTextRequest.Options(
        defaultNetworkPolicy: .auto,
        language: .usEnglish,
        pluginOptions: nil
    )

    Amplify.Predictions.convert(speechToText: speech, options: options) { event in
        switch event {
        case let .success(result):
            print(result.transcription)
        case let .failure(error):
            print(error)
        }
    }
}

import Amplify
import Combine

func speechToText(speech: URL) -> AnyCancellable {
    let options = PredictionsSpeechToTextRequest.Options(
        defaultNetworkPolicy: .auto,
        language: .usEnglish,
        pluginOptions: nil
    )

    // Combine variant of the same call: subscribe to the operation's
    // result publisher instead of passing a result listener.
    let sink = Amplify.Predictions.convert(speechToText: speech, options: options)
        .resultPublisher
        .sink {
            if case let .failure(error) = $0 {
                print(error)
            }
        }
        receiveValue: { result in
            print(result.transcription)
        }
    return sink
}
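
The Combine variant returns an AnyCancellable, and a Combine subscription is cancelled as soon as its AnyCancellable is deallocated, so the caller needs to keep the returned value alive until the result arrives. The following is a minimal sketch of one way to do that, not part of the Amplify API: `fakeSpeechToText` and `TranscriptionModel` are hypothetical names, and the stand-in publisher emits a fixed string so the example runs without a configured backend.

```swift
import Combine
import Foundation

// Hypothetical stand-in for the Combine-based transcription call above.
// It publishes a fixed string so the sketch runs without Amplify or a backend.
func fakeSpeechToText(speech: URL) -> AnyPublisher<String, Error> {
    Just("hello world")
        .setFailureType(to: Error.self)
        .eraseToAnyPublisher()
}

final class TranscriptionModel {
    // Storing the cancellables keeps each subscription alive until the
    // model is deallocated or the set is emptied.
    private var cancellables = Set<AnyCancellable>()
    private(set) var transcription = ""

    func transcribe(speech: URL) {
        fakeSpeechToText(speech: speech)
            .sink(
                receiveCompletion: { completion in
                    if case let .failure(error) = completion {
                        print(error)
                    }
                },
                receiveValue: { [weak self] text in
                    self?.transcription = text
                }
            )
            .store(in: &cancellables)
    }
}

let model = TranscriptionModel()
model.transcribe(speech: URL(fileURLWithPath: "recording.wav"))
print(model.transcription)
```

In an app, the same pattern would apply to the value returned by the Combine-based speechToText(speech:) function: store it in a property or a Set<AnyCancellable> owned by an object that outlives the request.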
