Overview

We are seeking a talented and motivated MSc/PhD student for a summer internship (3-4 months) to work on an exciting project that leverages advanced generative artificial intelligence (GenAI) techniques, including Large Language Models (LLMs), to interface with chip development/production data and generate actionable insights. This position offers a unique opportunity to apply machine learning (ML) and data science skills in a semiconductor fab setting.

Key Tasks

Develop and optimize retrieval-augmented generation (RAG) pipelines - build ingestion routes for datasets generated in a chip product development/production environment, including technical reports, logs, etc.
Integrate vector databases - enable efficient document searching, chunking and embedding retrieval by the model.
Model Development and Fine-Tuning - use open-source GenAI tools to interface with the data related to chip development/production. Train, fine-tune, or adapt GenAI models to extract meaningful inferences.
Collaborate with engineers and data scientists - validate the AI-generated outputs, align ML solutions with chip development/production requirements and refine the LLMs for domain-specific tasks.
Inference and Insights Generation - develop methodologies to extract insights about chip development/production, identify anomalies, and predict operational issues.
Benchmark performance - validate the model's outputs and ensure alignment with the project goals.
Reporting & Documentation - document project progress and present findings to cross-functional teams and stakeholders.
Demonstration and dissemination - demonstrate the proof-of-concept GenAI model in a live demo setting and disseminate the internship results.

Qualifications and training

A current MSc or PhD student in Computer Science, Electronics Engineering, Data Science, Machine Learning, or a related field.

Essential Skills and experience

Strong programming skills in Python and familiarity with ML libraries such as TensorFlow, PyTorch, or scikit-learn.
Experience with data preprocessing and handling large datasets (e.g., using Pandas, NumPy).
Hands-on experience with LLMs or NLP frameworks.
Familiarity with RAG pipelines, including vector embeddings and document retrieval techniques.

Desirable Skills and experience

Experience with vector databases.

Pragmatic is committed to equity, equality, diversity, and inclusion; we strive to welcome everyone and create inclusive teams. We celebrate difference and encourage everyone to be themselves at work. Please let us know if you would like any adjustments to our application and interview process.