Build agentic AI solutions with DeepSeek-R1, CrewAI, and Amazon SageMaker AI

In this post, we demonstrate how you can deploy an LLM such as DeepSeek-R1—or another FM of your choice—from popular model hubs like SageMaker JumpStart or Hugging Face Hub to SageMaker AI for real-time inference. We explore inference frameworks like Hugging Face TGI which helps streamline deployment while integrating built-in performance optimizations to minimize latency and maximize throughput. Additionally, we showcase how the SageMaker developer-friendly Python SDK simplifies endpoint orchestration, allowing seamless experimentation and scaling of LLM-powered applications.