# Incorporate machine learning
Amplify allows you to identify text on an image, identify labels on an image, translate text, and synthesize speech from text with the `@predictions` directive.
Note: The `@predictions` directive requires an S3 storage bucket, configured either via `amplify add storage` or by setting the `predictionsBucket` property when using CDK.
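With CDK, the bucket is passed to the API construct directly. The sketch below shows one way this could look; it assumes the `@aws-amplify/graphql-api-construct` package, and the stack, construct, and file names are illustrative, not generated:

```typescript
import * as path from 'path';
import { Stack, StackProps, Duration } from 'aws-cdk-lib';
import * as s3 from 'aws-cdk-lib/aws-s3';
import { Construct } from 'constructs';
import {
  AmplifyGraphqlApi,
  AmplifyGraphqlDefinition
} from '@aws-amplify/graphql-api-construct';

export class PredictionsApiStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Bucket that will hold the images under the public/ prefix
    const predictionsBucket = new s3.Bucket(this, 'PredictionsBucket');

    new AmplifyGraphqlApi(this, 'PredictionsApi', {
      definition: AmplifyGraphqlDefinition.fromFiles(
        path.join(__dirname, 'schema.graphql')
      ),
      authorizationModes: {
        defaultAuthorizationMode: 'API_KEY',
        apiKeyConfig: { expires: Duration.days(30) }
      },
      // Bucket consulted by the @predictions directive
      predictionsBucket
    });
  }
}
```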
## Identify text on an image
To configure text recognition on an image, use the `identifyText` action in the `@predictions` directive.
```graphql
type Query {
  recognizeTextFromImage: String @predictions(actions: [identifyText])
}
```
In your GraphQL query, you can pass in an S3 key for the image. At the moment, this directive works only with objects located within the `public/` folder of your S3 bucket. The `public/` prefix is automatically added to the `key` input. For instance, in the example below, `public/myimage.jpg` will be used as the input.
```graphql
query RecognizeTextFromImage($input: RecognizeTextFromImageInput!) {
  recognizeTextFromImage(input: { identifyText: { key: "myimage.jpg" } })
}
```
## Identify labels on an image
To configure label recognition on an image, use the `identifyLabels` action in the `@predictions` directive.
```graphql
type Query {
  recognizeLabelsFromImage: [String] @predictions(actions: [identifyLabels])
}
```
In your GraphQL query, you can pass in an S3 key for the image. At the moment, this directive works only with objects located within the `public/` folder of your S3 bucket. The `public/` prefix is automatically added to the `key` input. For instance, in the example below, `public/myimage.jpg` will be used as the input.
The query below will return a list of identified labels. Review Detecting Labels in the Amazon Rekognition documentation for the full list of supported labels.
```graphql
query RecognizeLabelsFromImage($input: RecognizeLabelsFromImageInput!) {
  recognizeLabelsFromImage(input: { identifyLabels: { key: "myimage.jpg" } })
}
```
## Translate text
To configure text translation, use the `translateText` action in the `@predictions` directive.
```graphql
type Query {
  translate: String @predictions(actions: [translateText])
}
```
The query below will return the translated string. Populate the `sourceLanguage` and `targetLanguage` parameters with one of the Supported Language Codes. Pass in the text to translate via the `text` parameter.
```graphql
query TranslateText($input: TranslateTextInput!) {
  translate(
    input: {
      translateText: {
        sourceLanguage: "en"
        targetLanguage: "de"
        text: "Translate me"
      }
    }
  )
}
```
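From a client, the same query can be issued with the Amplify JS library's generated API client. The sketch below separates out a small helper that builds the variables object; the actual `client.graphql` call is shown only in the comment because it requires a configured backend and codegen output (the `translate` query document in `./graphql/queries` is an assumption):

```javascript
// Pure helper: builds the variables object for the TranslateText query above.
function buildTranslateVariables(text, sourceLanguage, targetLanguage) {
  return {
    input: {
      translateText: { sourceLanguage, targetLanguage, text }
    }
  };
}

// With a configured Amplify backend (illustrative; requires aws-amplify and
// `amplify codegen` output in ./graphql/queries):
//
//   import { generateClient } from 'aws-amplify/api';
//   import { translate } from './graphql/queries';
//
//   const client = generateClient();
//   const response = await client.graphql({
//     query: translate,
//     variables: buildTranslateVariables('Translate me', 'en', 'de')
//   });
//   console.log(response.data.translate); // the translated string

console.log(JSON.stringify(buildTranslateVariables('Translate me', 'en', 'de')));
```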
## Synthesize speech from text
To configure text-to-speech synthesis, use the `convertTextToSpeech` action in the `@predictions` directive.
```graphql
type Query {
  textToSpeech: String @predictions(actions: [convertTextToSpeech])
}
```
The query below will return a presigned URL with the synthesized speech. Populate the `voiceID` parameter with one of the Supported Voice IDs. Pass in the text to synthesize via the `text` parameter.
```graphql
query ConvertTextToSpeech($input: ConvertTextToSpeechInput!) {
  textToSpeech(
    input: {
      convertTextToSpeech: {
        voiceID: "Nicole"
        text: "Hello from AWS Amplify!"
      }
    }
  )
}
```
## Combining Predictions actions
You can also combine multiple Predictions actions into a sequence. The following action sequences are supported:

- `identifyText -> translateText -> convertTextToSpeech`
- `identifyLabels -> translateText -> convertTextToSpeech`
- `translateText -> convertTextToSpeech`
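For example, the last sequence (translation chained directly into speech synthesis, with no image input) could be declared as follows; the field name `speakTranslatedText` is illustrative, not generated:

```graphql
type Query {
  speakTranslatedText: String
    @predictions(actions: [translateText, convertTextToSpeech])
}
```

A query against such a field would pass `translateText` and `convertTextToSpeech` inputs, analogous to the three-action example in this section.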
In the example below, `speakTranslatedImageText` identifies text from an image, then translates it into another language, and finally converts the translated text to speech.
```graphql
type Query {
  speakTranslatedImageText: String
    @predictions(actions: [identifyText, translateText, convertTextToSpeech])
}
```
An example query looks like this:
```graphql
query SpeakTranslatedImageText($input: SpeakTranslatedImageTextInput!) {
  speakTranslatedImageText(
    input: {
      identifyText: { key: "myimage.jpg" }
      translateText: { sourceLanguage: "en", targetLanguage: "es" }
      convertTextToSpeech: { voiceID: "Conchita" }
    }
  )
}
```
A code example using the Amplify JS library is shown below:
```jsx
import React, { useState } from 'react';
import { Amplify } from 'aws-amplify';
import { uploadData, getUrl } from 'aws-amplify/storage';
import { generateClient } from 'aws-amplify/api';
import config from './amplifyconfiguration.json';
import { speakTranslatedImageText } from './graphql/queries';

/* Configure Amplify */
Amplify.configure(config);

const client = generateClient();

function SpeakTranslatedImage() {
  const [src, setSrc] = useState('');
  const [img, setImg] = useState('');

  function putS3Image(event) {
    const file = event.target.files[0];
    uploadData({
      key: file.name,
      data: file
    })
      .result.then(async (result) => {
        setSrc(await speakTranslatedImageTextOP(result.key));
        setImg((await getUrl({ key: result.key })).url.toString());
      })
      .catch((err) => console.log(err));
  }

  return (
    <div className="Text">
      <div>
        <h3>Upload Image</h3>
        <input
          type="file"
          accept="image/jpeg"
          onChange={(event) => {
            putS3Image(event);
          }}
        />
        <br />
        {img && <img src={img}></img>}
        {src && (
          <div>
            <audio id="audioPlayback" controls>
              <source id="audioSource" type="audio/mp3" src={src} />
            </audio>
          </div>
        )}
      </div>
    </div>
  );
}

async function speakTranslatedImageTextOP(key) {
  const inputObj = {
    translateText: {
      sourceLanguage: 'en',
      targetLanguage: 'es'
    },
    identifyText: { key },
    convertTextToSpeech: { voiceID: 'Conchita' }
  };
  const response = await client.graphql({
    query: speakTranslatedImageText,
    variables: { input: inputObj }
  });
  return response.data.speakTranslatedImageText;
}

function App() {
  return (
    <div className="App">
      <h1>Speak Translated Image</h1>
      <SpeakTranslatedImage />
    </div>
  );
}
export default App;
```
## How it works
Definition of the `@predictions` directive:
```graphql
directive @predictions(actions: [PredictionsActions!]!) on FIELD_DEFINITION
enum PredictionsActions {
  identifyText # uses Amazon Rekognition to detect text
  identifyLabels # uses Amazon Rekognition to detect labels
  convertTextToSpeech # uses Amazon Polly in a lambda to output a presigned url to synthesized speech
  translateText # uses Amazon Translate to translate text from source to target language
}
```
`@predictions` creates resources to communicate with Amazon Rekognition, Translate, and Polly. For each action, the following is created:

- An IAM policy for each service (e.g. an Amazon Rekognition `detectText` policy)
- An AppSync VTL function
- An AppSync DataSource
Finally, a pipeline resolver is created for the query or field. The pipeline resolver is composed of AppSync functions which are defined by the action list provided in the directive.