Internship Report > Workshop > AI & Voice Integration

AI & Voice Integration

Overview

Lexi uses three main AWS services for AI and voice processing:

Amazon Bedrock - AI conversation (Claude 3 Haiku or Nova Lite)
Amazon Transcribe - Speech-to-Text (STT)
Amazon Polly - Text-to-Speech (TTS)

Speaking Flow Process

User speaks
    ↓
Frontend records → sends audio stream
    ↓
Lambda receives audio
    ↓
Transcribe: Audio → Text
    ↓
Bedrock: Text + Context → AI Response
    ↓
Comprehend: Language analysis
    ↓
Polly: Response → Audio
    ↓
Lambda sends audio to frontend
    ↓
Frontend plays audio
    ↓
Save session to DynamoDB

AI & Voice Content

AI & Voice is divided into 3 main parts:

Amazon Bedrock - AI conversation
- List available models
- Invoke models
- Monitor usage
- Cost optimization
Amazon Transcribe - Speech-to-Text
- Batch transcription
- Real-time streaming
- Check job status
- Get transcription result
Amazon Polly - Text-to-Speech
- Synthesize speech
- List available voices
- SSML support
- Cost optimization

Checklist

After completing the AI & Voice section, you should:

Understand the complete Speaking Flow process
Know how to use Amazon Bedrock for AI conversation
Understand how Transcribe converts speech to text
Know how Polly synthesizes speech from text
Understand how to integrate these 3 services with Lambda
Understand costs and optimization strategies

📸 TODO: Add detailed Speaking Flow diagram

📸 TODO: Add screenshot of Bedrock model invocation

Next Steps

You’ve completed the AI & Voice section! Continue to CI/CD Pipeline to set up automated deployment.