Responsibilities:
Analyze requirements for AI system scalability and performance.
Develop infrastructure to support large-scale AI training and deployment.
Use AI tools to optimize computational efficiency.
Collaborate with engineering teams to ensure system reliability.
Communicate technical designs to stakeholders.
Requirements:
Proficiency in Python, Linux, and container orchestration tools like Kubernetes.
Solid understanding of distributed computing systems.
Knowledge of cloud platforms like GCP or Azure.
Excellent problem-solving and technical skills.
Bachelor’s or Master’s degree in Computer Science, AI, or a related field.