
1. Azure OpenAI
1.1. GPT-3.5 Turbo & GPT-4
1.2. GPT-4 Turbo with Vision
1.3. DALL-E
1.4. Whisper
1.4.1. Speech-to-Text
1.5. Text-to-Speech
1.6. Embeddings
2. Language
2.1. Named Entity Recognition (NER)
2.2. PII and PHI Detection
2.3. Language Detection
2.4. Sentiment Analysis
3. Vision
3.1. Optical Character Recognition (OCR)
3.2. Image Analysis
3.3. Video Analysis
4. Custom Vision
4.1. Submit your own sets of images and label them to train your own model.
5. Face
5.1. Face liveness detection (anti-deepfake)
5.2. Face Detection
5.2.1. Face ID
5.2.1.1. Creates a unique numeric value/ID for a face
5.2.2. Face landmarks
5.2.2.1. 27 predefined points (eyes, nose, eyebrows, lips)
5.2.3. Attributes
5.2.3.1. Accessories
5.2.3.2. Blur
5.2.3.3. Exposure
5.2.3.4. Glasses
5.2.3.5. Head pose
5.2.3.6. Mask
5.2.3.6.1. Wearing a mask
5.2.3.7. Noise
5.2.3.8. Occlusion
5.2.3.9. QualityForRecognition
5.2.4. Face rectangle
5.3. Face Recognition
5.3.1. Group faces
5.3.2. Identification
5.3.2.1. Takes one or several Face IDs and a PersonGroup/LargePersonGroup, performs one-to-many matching
5.3.3. Verification
5.3.3.1. Takes one Face ID and a Person Object, performs one-to-one matching
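The difference between verification (one-to-one) and identification (one-to-many) can be illustrated with a toy sketch. This is not the Face API: the feature vectors, the cosine-similarity scoring, the 0.8 threshold, and the names `verify` and `identify` are all assumptions made for illustration only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(face, person, threshold=0.8):
    """One-to-one: does this face match this single person? (hypothetical helper)"""
    return cosine(face, person) >= threshold

def identify(face, person_group, threshold=0.8):
    """One-to-many: best match in the group, or None if nobody clears the threshold."""
    best_name, best_score = None, threshold
    for name, vector in person_group.items():
        score = cosine(face, vector)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Made-up vectors standing in for enrolled people and a probe face.
group = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
probe = [0.9, 0.1, 0.0]

print(verify(probe, group["bob"]))  # one-to-one check against a single person
print(identify(probe, group))       # one-to-many search across the group
```

Verification answers a yes/no question about one candidate, while identification searches a whole group and may return no match at all.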
6. Bot Service
6.1. Bot Framework SDK
6.2. Power Virtual Agents
6.3. Bot Framework Composer
7. Azure AI Search
7.1. Formerly Cognitive Search
7.2. Indexing Engine
7.3. Query Engine
7.4. Chunking and vectorization of larger pools of data
7.5. Semantic ranking
7.6. AI Enrichment
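As a rough illustration of the chunking step in 7.4, the sketch below splits text into fixed-size, overlapping chunks. It is a generic technique, not the Azure AI Search implementation; the function name `chunk_text` and the `size`/`overlap` values are assumptions. In a real pipeline each chunk would then be sent to an embedding model to be vectorized.

```python
def chunk_text(text, size=200, overlap=50):
    """Split text into chunks of at most `size` characters.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a boundary still appears intact in at least one chunk.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks

# Tiny example so the overlap is easy to see.
for chunk in chunk_text("abcdefghij", size=4, overlap=2):
    print(chunk)
```

Character-based splitting is the simplest option; splitting on sentences or tokens is a common refinement.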
8. Immersive Reader
8.1. Standalone web app, improves reading accessibility
8.2. Isolate text content
8.3. Display pictures for common words
8.4. Highlight parts of speech
8.5. Read content aloud
8.6. Translate content in real-time
8.7. Split words into syllables
9. Speech
9.1. Speech-to-text
9.2. Real-time speech-to-text
9.3. Synthesize text to speech
9.3.1. Uses Speech Synthesis Markup Language (SSML)
9.4. Speech Translation
9.4.1. Real-time, multilingual translation of speech
9.5. Batch transcription
9.6. Spoken language identification
9.7. Speaker recognition
9.8. Pronunciation assessment
9.9. Intent recognition
9.10. Build with Speech Studio, the Speech SDK, the Speech CLI, or the REST APIs
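For 9.3.1, a minimal sketch of what an SSML document looks like, built with the Python standard library only. The helper name `build_ssml` is hypothetical, and the voice name `en-US-JennyNeural` and locale `en-US` are assumed defaults; check the voice list for your Speech resource before relying on them.

```python
from xml.sax.saxutils import escape, quoteattr

def build_ssml(text, voice="en-US-JennyNeural", lang="en-US"):
    """Wrap plain text in a minimal SSML <speak>/<voice> document.

    The text is XML-escaped so characters such as & and < stay valid.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang={quoteattr(lang)}>'
        f'<voice name={quoteattr(voice)}>{escape(text)}</voice>'
        '</speak>'
    )

ssml = build_ssml("Fish & chips, please.")
print(ssml)
```

The resulting string is what would be passed to a text-to-speech request in place of plain text when control over voice, language, or prosody is needed.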
10. Translator
10.1. Text Translator
10.2. Document Translation
10.3. Custom Translator
10.3.1. Build customized models to translate domain- and industry-specific language, terminology, and style.
11. Content Safety
11.1. Detects harmful user-generated and AI-generated content in apps and services
12. Document Intelligence
12.1. Formerly Form Recognizer
12.2. Document analysis with prebuilt models for common document types and custom models
12.2.1. Invoice
12.2.2. Receipt
12.2.3. Identity
12.2.4. US Mortgage
12.2.5. Health insurance cards
12.2.6. Contract
12.2.7. Credit/Debit Card
12.2.8. Marriage Certificate
12.2.9. US Tax Forms
13. Video Indexer
13.1. Built on Face, Translator, Azure AI Vision, and Speech
13.2. Extract insights from videos using Azure AI Video Indexer video and audio models
13.2.1. Deep search
13.2.2. Content creation
13.2.3. Accessibility
13.2.4. Monetization
13.2.5. Content moderation
13.2.6. Recommendations
13.3. Video models
13.3.1. Face detection
13.3.2. Celebrity identification
13.3.3. Account-based face identification
13.3.4. Thumbnail extraction for faces
13.3.5. OCR
13.3.6. Visual content moderation
13.3.7. Labels identification
13.3.8. Scene segmentation
13.3.9. Shot detection
13.3.10. Black frame detection
13.3.11. Keyframe extraction
13.3.12. Rolling credits
13.3.13. Editorial shot type detection
13.3.14. Observed people tracking
13.3.15. Matched person
13.3.16. Object detection
13.3.17. Slate detection
13.3.18. Textual logo detection