# Incorporate machine learning
Amplify allows you to identify text on an image, identify labels on an image, translate text, and synthesize speech from text with the `@predictions` directive.
Note: The `@predictions` directive requires an S3 storage bucket, configured via `amplify add storage` or by setting the `predictionsBucket` property when using CDK.
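If you define the API with the CDK, the bucket is wired up through the construct's props. Below is a minimal sketch, assuming the `@aws-amplify/graphql-api-construct` package; the stack and construct names are illustrative:

```js
import { Duration, Stack } from 'aws-cdk-lib';
import { Bucket } from 'aws-cdk-lib/aws-s3';
import {
  AmplifyGraphqlApi,
  AmplifyGraphqlDefinition
} from '@aws-amplify/graphql-api-construct';

// Illustrative stack wiring a bucket to the @predictions directive
export class PredictionsApiStack extends Stack {
  constructor(scope, id, props) {
    super(scope, id, props);

    // Bucket the @predictions resolvers will read objects from
    const predictionsBucket = new Bucket(this, 'PredictionsBucket');

    new AmplifyGraphqlApi(this, 'PredictionsApi', {
      definition: AmplifyGraphqlDefinition.fromFiles('schema.graphql'),
      authorizationModes: {
        defaultAuthorizationMode: 'API_KEY',
        apiKeyConfig: { expires: Duration.days(30) }
      },
      predictionsBucket
    });
  }
}
```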
## Identify text on an image
To configure text recognition on an image, use the `identifyText` action in the `@predictions` directive.
```graphql
type Query {
  recognizeTextFromImage: String @predictions(actions: [identifyText])
}
```
In your GraphQL query, you can pass in an S3 `key` for the image. At the moment, this directive works only with objects located within the `public/` folder of your S3 bucket. The `public/` prefix is automatically added to the `key` input. For instance, in the example below, `public/myimage.jpg` will be used as the input.
```graphql
query RecognizeTextFromImage {
  recognizeTextFromImage(input: { identifyText: { key: "myimage.jpg" } })
}
```
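To execute this query from the JS library, you can use the generated client in the same way as the full example later in this section. A rough sketch, assuming the query has been code-generated into `./graphql/queries`:

```js
import { generateClient } from 'aws-amplify/api';
import { recognizeTextFromImage } from './graphql/queries';

const client = generateClient();

// "myimage.jpg" resolves to public/myimage.jpg in the storage bucket
const response = await client.graphql({
  query: recognizeTextFromImage,
  variables: { input: { identifyText: { key: 'myimage.jpg' } } }
});
console.log(response.data.recognizeTextFromImage); // the detected text
```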
## Identify labels on an image
To configure label recognition on an image, use the `identifyLabels` action in the `@predictions` directive.
```graphql
type Query {
  recognizeLabelsFromImage: [String] @predictions(actions: [identifyLabels])
}
```
In your GraphQL query, you can pass in an S3 `key` for the image. At the moment, this directive works only with objects located within the `public/` folder of your S3 bucket. The `public/` prefix is automatically added to the `key` input. For instance, in the example below, `public/myimage.jpg` will be used as the input.
The query below will return a list of identified labels. Review Detecting Labels in the Amazon Rekognition documentation for the full list of supported labels.
```graphql
query RecognizeLabelsFromImage {
  recognizeLabelsFromImage(input: { identifyLabels: { key: "myimage.jpg" } })
}
```
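Calling it from the JS library follows the same pattern; a sketch, again assuming code-generated queries:

```js
import { generateClient } from 'aws-amplify/api';
import { recognizeLabelsFromImage } from './graphql/queries';

const client = generateClient();

const response = await client.graphql({
  query: recognizeLabelsFromImage,
  variables: { input: { identifyLabels: { key: 'myimage.jpg' } } }
});

// The field resolves to a list of labels, e.g. ["Coffee", "Cup"]
for (const label of response.data.recognizeLabelsFromImage) {
  console.log(label);
}
```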
## Translate text
To configure text translation, use the `translateText` action in the `@predictions` directive.
```graphql
type Query {
  translate: String @predictions(actions: [translateText])
}
```
The query below will return the translated string. Populate the `sourceLanguage` and `targetLanguage` parameters with one of the Supported Language Codes, and pass in the text to translate via the `text` parameter.
```graphql
query TranslateText {
  translate(
    input: {
      translateText: {
        sourceLanguage: "en"
        targetLanguage: "de"
        text: "Translate me"
      }
    }
  )
}
```
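From the JS library, the call looks like this; a sketch under the same code-generation assumption:

```js
import { generateClient } from 'aws-amplify/api';
import { translate } from './graphql/queries';

const client = generateClient();

const response = await client.graphql({
  query: translate,
  variables: {
    input: {
      translateText: {
        sourceLanguage: 'en',
        targetLanguage: 'de',
        text: 'Translate me'
      }
    }
  }
});
console.log(response.data.translate); // the translated string
```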
## Synthesize speech from text
To configure Text-to-Speech synthesis, use the `convertTextToSpeech` action in the `@predictions` directive.
```graphql
type Query {
  textToSpeech: String @predictions(actions: [convertTextToSpeech])
}
```
The query below will return a presigned URL with the synthesized speech. Populate the `voiceID` parameter with one of the Supported Voice IDs, and pass in the text to synthesize via the `text` parameter.
```graphql
query ConvertTextToSpeech {
  textToSpeech(
    input: {
      convertTextToSpeech: {
        voiceID: "Nicole"
        text: "Hello from AWS Amplify!"
      }
    }
  )
}
```
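Because the field resolves to a presigned URL, a browser client can hand the result straight to an `Audio` element. A sketch, assuming code-generated queries:

```js
import { generateClient } from 'aws-amplify/api';
import { textToSpeech } from './graphql/queries';

const client = generateClient();

const response = await client.graphql({
  query: textToSpeech,
  variables: {
    input: {
      convertTextToSpeech: {
        voiceID: 'Nicole',
        text: 'Hello from AWS Amplify!'
      }
    }
  }
});

// The resolver returns a presigned URL to the synthesized audio
const audio = new Audio(response.data.textToSpeech);
audio.play();
```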
## Combining Predictions actions
You can also combine multiple Predictions actions into a sequence. The following action sequences are supported:

- `identifyText -> translateText -> convertTextToSpeech`
- `identifyLabels -> translateText -> convertTextToSpeech`
- `translateText -> convertTextToSpeech`
In the example below, `speakTranslatedImageText` identifies text from an image, then translates it into another language, and finally converts the translated text to speech.
```graphql
type Query {
  speakTranslatedImageText: String
    @predictions(actions: [identifyText, translateText, convertTextToSpeech])
}
```
An example of that query looks like this:
```graphql
query SpeakTranslatedImageText {
  speakTranslatedImageText(
    input: {
      identifyText: { key: "myimage.jpg" }
      translateText: { sourceLanguage: "en", targetLanguage: "es" }
      convertTextToSpeech: { voiceID: "Conchita" }
    }
  )
}
```
A code example using the JS Library is shown below:
```js
import React, { useState } from 'react';
import { Amplify } from 'aws-amplify';
import { uploadData, getUrl } from 'aws-amplify/storage';
import { generateClient } from 'aws-amplify/api';
import config from './amplifyconfiguration.json';
import { speakTranslatedImageText } from './graphql/queries';

/* Configure Amplify */
Amplify.configure(config);

const client = generateClient();

function SpeakTranslatedImage() {
  const [src, setSrc] = useState('');
  const [img, setImg] = useState('');

  // Upload the selected file to S3, then run the pipeline query on it
  function putS3Image(event) {
    const file = event.target.files[0];
    uploadData({ key: file.name, data: file })
      .result.then(async (result) => {
        setSrc(await speakTranslatedImageTextOP(result.key));
        setImg((await getUrl({ key: result.key })).url.toString());
      })
      .catch((err) => console.log(err));
  }

  return (
    <div className="Text">
      <div>
        <h3>Upload Image</h3>
        <input
          type="file"
          accept="image/jpeg"
          onChange={(event) => {
            putS3Image(event);
          }}
        />
        <br />
        {img && <img src={img} alt="uploaded" />}
        {src && (
          <div>
            <audio id="audioPlayback" controls>
              <source id="audioSource" type="audio/mp3" src={src} />
            </audio>
          </div>
        )}
      </div>
    </div>
  );
}

// Runs identifyText -> translateText -> convertTextToSpeech on the uploaded image
async function speakTranslatedImageTextOP(key) {
  const inputObj = {
    translateText: { sourceLanguage: 'en', targetLanguage: 'es' },
    identifyText: { key },
    convertTextToSpeech: { voiceID: 'Conchita' }
  };
  const response = await client.graphql({
    query: speakTranslatedImageText,
    variables: { input: inputObj }
  });
  return response.data.speakTranslatedImageText;
}

function App() {
  return (
    <div className="App">
      <h1>Speak Translated Image</h1>
      <SpeakTranslatedImage />
    </div>
  );
}

export default App;
```
## How it works
Definition of the `@predictions` directive:
```graphql
directive @predictions(actions: [PredictionsActions!]!) on FIELD_DEFINITION

enum PredictionsActions {
  identifyText # uses Amazon Rekognition to detect text
  identifyLabels # uses Amazon Rekognition to detect labels
  convertTextToSpeech # uses Amazon Polly in a Lambda to output a presigned URL to synthesized speech
  translateText # uses Amazon Translate to translate text from source to target language
}
```
`@predictions` creates resources to communicate with Amazon Rekognition, Translate, and Polly. For each action, the following is created:
- An IAM policy for each service (e.g. an Amazon Rekognition `detectText` policy)
- An AppSync VTL function
- An AppSync DataSource
Finally, a pipeline resolver is created for the query or field. The pipeline resolver is composed of the AppSync functions defined by the action list provided in the directive.