Overview

Join our Strategic Planning and Architecture (SPARC) team within Microsoft's Azure Hardware Systems & Infrastructure (AHSI) organization and be a part of the organization behind Microsoft's expanding Cloud Infrastructure and responsible for powering Microsoft's "Intelligent Cloud" mission. We are seeking a Master's/PhD student to join us in Cambridge in winter/spring 2025 to work on model compression and optimization for LLMs, covering topics such as post-training quantization and quantization-aware training. You will join a welcoming and highly interdisciplinary team and work on creative and challenging problems during your internship.

Qualifications

Required/Minimum Qualifications:
Be enrolled in a Master's/PhD program in Computer Science/Machine Learning or a related discipline
Substantial experience with LLM quantization and model compression
Substantial knowledge of low-precision data types such as floating-point formats, integer formats, and block floats

Other Requirements:
Cloud Background Check

Preferred/Additional Qualifications:
PyTorch, Python, hands-on experience in SW tool development
Outstanding communication skills

Responsibilities

Research and develop quantization flows for LLM inference and training
Design, implement, and evaluate the performance of quantized SOTA LLMs
Write and present your findings in technical documents or presentations

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.

Industry leading healthcare
Educational resources
Discounts on products and services
Savings and investments
Maternity and paternity leave
Generous time away
Giving programs
Opportunities to network and connect