Amazon Transcribe

Overview

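Amazon Transcribe is the AWS speech-to-text service: it converts audio into text.

Modes: batch transcription jobs and real-time streaming
Language: this workshop uses English (for example en-US or en-GB); Transcribe also supports many other languages
Output: JSON containing the transcript text and per-word timestamps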

Batch Transcription

Start Transcription Job

# Start transcription job
aws transcribe start-transcription-job \
  --transcription-job-name <JOB_NAME> \
  --language-code en-US \
  --media-format wav \
  --media '{"MediaFileUri":"s3://<BUCKET>/<KEY>"}' \
  --output-bucket-name <BUCKET> \
  --region <REGION>
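Before submitting a job it can help to validate the inputs locally, since a malformed job name or media URI is rejected by the API. The sketch below assumes the job-name constraint from the StartTranscriptionJob API reference (1 to 200 characters drawn from letters, digits, `.`, `_` and `-`); the helper names `is_valid_job_name` and `is_s3_uri` are hypothetical and not part of any AWS SDK.

```python
import re

# Assumed constraint from the StartTranscriptionJob API reference:
# 1-200 characters, limited to letters, digits, '.', '_' and '-'.
_JOB_NAME_RE = re.compile(r"[0-9a-zA-Z._-]{1,200}")


def is_valid_job_name(name: str) -> bool:
    """Return True if the name fits the assumed job-name pattern."""
    return _JOB_NAME_RE.fullmatch(name) is not None


def is_s3_uri(uri: str) -> bool:
    """Return True for URIs shaped like s3://<bucket>/<key>."""
    if not uri.startswith("s3://"):
        return False
    bucket, _, key = uri[len("s3://"):].partition("/")
    return bool(bucket) and bool(key)
```

A check such as `is_valid_job_name("lexi-transcribe-001") and is_s3_uri("s3://my-bucket/audio/a.wav")` could run before calling the CLI or SDK. It only catches shape errors; it does not confirm that the object exists.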

Check Job Status

# Get job status
aws transcribe get-transcription-job \
  --transcription-job-name <JOB_NAME> \
  --region <REGION>

# Or list all jobs
aws transcribe list-transcription-jobs \
  --region <REGION>

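The get-transcription-job response includes:

TranscriptionJobStatus (QUEUED, IN_PROGRESS, COMPLETED, FAILED)
Transcript.TranscriptFileUri (location of the transcript JSON, present once the job has completed)
CreationTime, CompletionTime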
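Because a job moves through QUEUED and IN_PROGRESS before reaching COMPLETED or FAILED, callers usually poll the status. The following is a minimal polling sketch, not the workshop's own implementation; `wait_for_job` and its parameters are hypothetical names, and the status lookup is passed in as a callable so the loop can be exercised without AWS credentials.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def wait_for_job(get_status: Callable[[], str],
                 poll_seconds: float = 5.0,
                 timeout_seconds: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_status() until the job is COMPLETED or FAILED.

    Raises TimeoutError if no terminal status is seen in time.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still {status} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

With boto3, `get_status` could be a lambda that calls `client.get_transcription_job(TranscriptionJobName=job_name)` and returns `['TranscriptionJob']['TranscriptionJobStatus']` from the response.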

Get Transcription Result

# Download transcript from S3
aws s3 cp s3://<BUCKET>/<TRANSCRIPT_KEY> transcript.json

# Or fetch the same object with the lower-level s3api command
aws s3api get-object \
  --bucket <BUCKET> \
  --key <TRANSCRIPT_KEY> \
  transcript.json

cat transcript.json
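The downloaded file is easier to use once the text and word timings are pulled out of the JSON. This sketch assumes the usual batch transcript layout (`results.transcripts[0].transcript` for the full text and `results.items` for per-word entries); the function name `extract_transcript` is hypothetical and the sample document is illustrative, not real workshop output.

```python
import json


def extract_transcript(doc: dict) -> tuple[str, list[tuple[str, float, float]]]:
    """Return (full_text, [(word, start_s, end_s), ...]) from a transcript document."""
    results = doc["results"]
    full_text = results["transcripts"][0]["transcript"]
    words = []
    for item in results["items"]:
        # Punctuation items carry no start_time/end_time, so skip them.
        if item.get("type") != "pronunciation":
            continue
        word = item["alternatives"][0]["content"]
        words.append((word, float(item["start_time"]), float(item["end_time"])))
    return full_text, words


# Illustrative sample in the assumed layout, not output from a real job.
sample = json.loads("""
{
  "jobName": "example-job",
  "status": "COMPLETED",
  "results": {
    "transcripts": [{"transcript": "Hello world."}],
    "items": [
      {"type": "pronunciation", "start_time": "0.0", "end_time": "0.42",
       "alternatives": [{"confidence": "0.99", "content": "Hello"}]},
      {"type": "pronunciation", "start_time": "0.42", "end_time": "0.9",
       "alternatives": [{"confidence": "0.98", "content": "world"}]},
      {"type": "punctuation",
       "alternatives": [{"confidence": "0.0", "content": "."}]}
    ]
  }
}
""")
```

For a real file, `extract_transcript(json.load(open("transcript.json")))` would return the same pair of values.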

Delete Transcription Job

# Delete job
aws transcribe delete-transcription-job \
  --transcription-job-name <JOB_NAME> \
  --region <REGION>

Real-time Streaming Transcription

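import asyncio

# boto3 does not expose a 'transcribe-streaming' client at the time of writing;
# this version uses the separate amazon-transcribe package
# (pip install amazon-transcribe).
from amazon_transcribe.client import TranscribeStreamingClient
from amazon_transcribe.handlers import TranscriptResultStreamHandler
from amazon_transcribe.model import TranscriptEvent


class PrintingHandler(TranscriptResultStreamHandler):
    async def handle_transcript_event(self, transcript_event: TranscriptEvent):
        """Print each partial and final result as it arrives"""
        for item in transcript_event.transcript.results:
            if not item.alternatives:
                continue
            label = "Partial" if item.is_partial else "Final"
            print(f"{label}: {item.alternatives[0].transcript}")


class TranscribeStreamingService:
    def __init__(self, region: str = 'ap-southeast-1'):
        self.client = TranscribeStreamingClient(region=region)

    async def transcribe_stream(self, audio_chunks):
        """Transcribe an async iterable of 16 kHz, 16-bit PCM chunks in real time"""

        stream = await self.client.start_stream_transcription(
            language_code='en-US',
            media_sample_rate_hz=16000,
            media_encoding='pcm',
        )

        async def send_audio():
            # Forward every chunk, then signal the end of the audio
            async for chunk in audio_chunks:
                await stream.input_stream.send_audio_event(audio_chunk=chunk)
            await stream.input_stream.end_stream()

        # Send audio and consume transcript events concurrently
        handler = PrintingHandler(stream.output_stream)
        await asyncio.gather(send_audio(), handler.handle_events())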
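Streaming transcription consumes raw PCM audio in small pieces, not as one file, so the caller has to decide how much audio goes into each chunk. The sketch below shows the arithmetic for 16 kHz, 16-bit audio and a simple async chunker; mono input, the 100 ms default, and the names `chunk_size_bytes` and `pcm_chunks` are assumptions made for illustration.

```python
import asyncio
from typing import AsyncIterator

SAMPLE_RATE_HZ = 16000   # matches media_sample_rate_hz=16000 used in this section
BYTES_PER_SAMPLE = 2     # 16-bit PCM
CHANNELS = 1             # mono (assumed)


def chunk_size_bytes(chunk_ms: int = 100) -> int:
    """Bytes of raw PCM audio covering chunk_ms milliseconds."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * chunk_ms // 1000


async def pcm_chunks(pcm: bytes, chunk_ms: int = 100) -> AsyncIterator[bytes]:
    """Yield successive fixed-duration chunks of a PCM byte buffer."""
    size = chunk_size_bytes(chunk_ms)
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]
        await asyncio.sleep(0)  # give other tasks a turn between chunks
```

At these settings 100 ms of audio is 3200 bytes. A live microphone source would pace chunks in real time, whereas this buffer-based version yields them as fast as the consumer accepts them.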

Transcribe API Example (Python)

import boto3
import uuid
from typing import Optional

class TranscribeService:
    def __init__(self, region: str = 'ap-southeast-1'):
        self.client = boto3.client('transcribe', region_name=region)
        self.bucket_name = 'lexi-be-speakingaudiobucket'
    
    def transcribe_audio(self, audio_url: str) -> str:
        """Start a batch job for the audio at audio_url (an S3 URI) and return the job name"""
        
        # A UUID suffix keeps job names unique within the account and region
        job_name = f"lexi-transcribe-{uuid.uuid4()}"
        
        response = self.client.start_transcription_job(
            TranscriptionJobName=job_name,
            Media={'MediaFileUri': audio_url},
            MediaFormat='wav',
            LanguageCode='en-US',
            OutputBucketName=self.bucket_name,
            OutputKey=f'transcripts/{job_name}.json'
        )
        
        return response['TranscriptionJob']['TranscriptionJobName']
    
    def get_transcription(self, job_name: str) -> Optional[str]:
        """Return the transcript file URI once the job is COMPLETED, otherwise None"""
        
        response = self.client.get_transcription_job(
            TranscriptionJobName=job_name
        )
        
        job = response['TranscriptionJob']
        status = job['TranscriptionJobStatus']
        
        if status == 'COMPLETED':
            return job['Transcript']['TranscriptFileUri']
        
        if status == 'FAILED':
            # Surface the failure so callers do not keep polling a dead job
            raise RuntimeError(job.get('FailureReason', 'transcription job failed'))
        
        # QUEUED or IN_PROGRESS: the transcript is not ready yet
        return None
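The URI returned for a completed job is an HTTPS address, while `aws s3 cp` and the S3 SDK calls want a bucket and key. The sketch below splits such a URI, assuming the path-style form `https://s3.<region>.amazonaws.com/<bucket>/<key>` that is typically returned when an output bucket is specified; the helper name `split_transcript_uri` is hypothetical, and other URI shapes (for example virtual-hosted style) are not handled.

```python
from urllib.parse import unquote, urlparse


def split_transcript_uri(uri: str) -> tuple[str, str]:
    """Split a path-style S3 HTTPS URI into (bucket, key)."""
    path = unquote(urlparse(uri).path).lstrip("/")
    bucket, _, key = path.partition("/")
    if not bucket or not key:
        raise ValueError(f"not a path-style S3 URI: {uri}")
    return bucket, key
```

The resulting pair can be passed to `s3.get_object(Bucket=bucket, Key=key)` on a boto3 S3 client to read the transcript JSON.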

Monitor Transcribe Metrics

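# The metric names below are as recorded in the workshop. To see which metrics
# your account actually reports, list them first:
# aws cloudwatch list-metrics --namespace AWS/Transcribe --region <REGION>

# View successful transcription jobs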
aws cloudwatch get-metric-statistics \
  --namespace AWS/Transcribe \
  --metric-name SuccessfulTranscriptionJobs \
  --start-time 2026-05-01T00:00:00Z \
  --end-time 2026-05-02T00:00:00Z \
  --period 3600 \
  --statistics Sum \
  --region <REGION>

# View failed jobs
aws cloudwatch get-metric-statistics \
  --namespace AWS/Transcribe \
  --metric-name FailedTranscriptionJobs \
  --start-time 2026-05-01T00:00:00Z \
  --end-time 2026-05-02T00:00:00Z \
  --period 3600 \
  --statistics Sum \
  --region <REGION>
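`get-metric-statistics` returns a `Datapoints` list with one entry per period, and CloudWatch does not guarantee that the entries are in chronological order. A small sketch for sorting and totalling them follows; the function names are hypothetical, and the sample response is illustrative data in the shape boto3's `get_metric_statistics` returns, not real measurements.

```python
from datetime import datetime, timezone


def hourly_series(response: dict, statistic: str = "Sum") -> list[tuple[datetime, float]]:
    """Return (timestamp, value) pairs sorted by time."""
    points = response.get("Datapoints", [])
    return sorted((dp["Timestamp"], float(dp[statistic])) for dp in points)


def total(response: dict, statistic: str = "Sum") -> float:
    """Add up the chosen statistic across all returned periods."""
    return sum(value for _, value in hourly_series(response, statistic))


# Illustrative response, deliberately out of order.
sample_response = {
    "Label": "ExampleMetric",
    "Datapoints": [
        {"Timestamp": datetime(2026, 5, 1, 2, tzinfo=timezone.utc), "Sum": 3.0, "Unit": "Count"},
        {"Timestamp": datetime(2026, 5, 1, 1, tzinfo=timezone.utc), "Sum": 2.0, "Unit": "Count"},
    ],
}
```

An empty `Datapoints` list simply means nothing was reported for that metric name, dimension set, and time range, which is worth ruling out before concluding that no jobs ran.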

Troubleshooting

Issue: Job failed

# Show why the job failed
aws transcribe get-transcription-job \
  --transcription-job-name <JOB_NAME> \
  --region <REGION> \
  --query 'TranscriptionJob.FailureReason'

Issue: Audio file not found

# Check S3 file
aws s3 ls s3://<BUCKET>/<KEY>

# Or upload file
aws s3 cp <LOCAL_FILE> s3://<BUCKET>/<KEY>

Issue: Unsupported format

# Convert audio format
ffmpeg -i input.mp3 -acodec pcm_s16le -ar 16000 output.wav

# Or submit the file in another supported format and set --media-format to match
# Supported: mp3, mp4, wav, flac, ogg, amr, webm, m4a
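A mismatch between the file and the declared media format is a common cause of this error, so the format can be derived from the object key, with no need to hard-code it. This is a minimal sketch: `media_format_for` is a hypothetical helper, the set mirrors the formats named in this section plus `m4a`, and a file extension is only a hint about the actual codec inside the file.

```python
from pathlib import PurePosixPath

# Formats named in this section, plus m4a (assumed to be accepted by the API).
SUPPORTED_FORMATS = {"mp3", "mp4", "wav", "flac", "ogg", "amr", "webm", "m4a"}


def media_format_for(key: str) -> str:
    """Map an S3 key or file name to a media format based on its extension."""
    ext = PurePosixPath(key).suffix.lower().lstrip(".")
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported or missing extension in {key!r}")
    return ext
```

The returned value can be passed as `MediaFormat` in the SDK call or as `--media-format` on the CLI.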

Next Steps

Continue to Amazon Polly to learn how to synthesize speech from text.

Checklist

Audio file uploaded to S3
Transcription job created successfully
Transcription job completed
Transcript result retrieved
Real-time streaming test successful