Apple is where individual imaginations gather together, committing to the values that lead to great work. Every new product we build, service we create, or Apple Store experience we deliver is the result of us making each other’s ideas stronger. That happens because every one of us shares a belief that we can make something wonderful and share it with the world, changing lives for the better. It’s the diversity of our people and their thinking that inspires the innovation that runs through everything we do. When we bring everybody in, we can do the best work of our lives. Here, you’ll do more than join something; you’ll add something.

In the Siri Attention and Invocation team, we act as the front door to our users’ interactions with Siri on almost every shipping Apple device. We work hard to make sure that Siri responds only when intended, in an efficient and privacy-preserving manner.

Description

We are looking for an intern to explore speech synthesis and audio generation techniques. The ideal candidate will be very familiar with audio generation or text-to-speech synthesis.

Key responsibilities:
- Develop audio generation and speech synthesis methods
- Build automated evaluation pipelines to assess the quality of the synthetic data
- Optimize the developed models for efficient inference

Minimum Qualifications

- A Bachelor's degree or higher in a technical discipline (e.g. Computer Science, Engineering, Mathematics, Physics), with exposure to scientific computing
- Demonstrable experience in training deep learning systems on multiple GPUs in PyTorch
- Demonstrable experience in audio, text-to-speech, and speech-to-text technologies
- Knowledge of the state of the art in audio generation, e.g. autoregressive vs. non-autoregressive systems

Preferred Qualifications

- Demonstrable experience with diffusion and/or autoregressive audio generation models
- Publications in audio generation at well-known conferences