Required Qualifications:
* Extensive experience in optimising AI chip architectures and AI systems, and deep familiarity with mainstream heterogeneous computing software and hardware architectures. Comprehensive expertise spanning applications, foundational software, and chip design.
* Hands-on experience in at least one of the following areas: numerical computation, compilation, algorithm and chip co-design, runtime systems, or shared memory management.
* Solid understanding of AI industry application scenarios, mainstream models, and algorithm development trends, with the ability to derive chip-layer requirements from these insights.
* Expertise in analysing workload sensitivity to micro-architecture features, evaluating performance trade-offs, and providing recommendations to optimise both micro-architecture and application software.
* Familiarity with the performance impact of various compute, memory, and communication configurations, as well as hardware and software implementation choices for AI acceleration.
* Proficiency with GPU compute APIs such as CUDA or OpenCL, and experience using GPU/NPU-optimised libraries to improve performance.
* Practical experience in developing deep learning frameworks, compilers, or system software.
* Strong background in compiler optimisation techniques; familiarity with LLVM and MLIR is a plus.
* Proficiency in software development using C/C++ and Python.
Desired Qualifications:
* Relevant experience in multiple subfields of AI, including application algorithms, frameworks, runtime systems, modelling and simulation, and compilers.
* In-depth understanding of innovative methods, platforms, and tools used by leading AI manufacturers, with proven experience in translating academic or research achievements into commercial products.
* Experience with GPU acceleration using AMD or NVIDIA GPUs.
* Expertise in developing inference backends and compilers for GPU or NPU systems.
* Proficiency with AI/ML inference frameworks such as ONNX Runtime, IREE, or TVM.
* Practical experience deploying AI models in production environments.