On behalf of Huawei, a world-renowned information and communication technology company, we are seeking passionate and talented individuals to join our team as an Advanced Engineer (High-Efficiency AI Computing).
As an Advanced Engineer for High-Efficiency AI Computing, you will be at the forefront of shaping the next generation of AI hardware. Bridging the gap between algorithmic innovation and silicon design, you will play a critical role in pushing the boundaries of low-precision and sparse computing. Your work will directly influence the microarchitectural evolution of our flagship AI accelerators, driving massive improvements in energy efficiency and computational performance for large-scale AI models.
Job Description:
- Algorithmic Innovation: Spearhead research into optimal quantization methodologies for advanced low-precision data formats, driving the strategic adoption and expansion of our low-precision computing ecosystem to secure a competitive edge in numerical computing.
- Kernel Architecture: Architect and deploy high-performance, highly scalable kernels for low-precision and sparse computing (e.g., GEMM, FlashAttention), explicitly tailored to maximize the microarchitectural strengths of our in-house AI accelerators.
- Hardware-Software Co-Design: Identify systemic performance bottlenecks and drive the microarchitectural evolution of next-generation AI chips, significantly enhancing energy efficiency and throughput.
- Partner closely with IC design teams to deliver comprehensive technical specifications, rigorous benchmark reports, and high-value patent proposals.
Skills / Qualifications:
- Education: Ph.D. in Computer Science, Electronic Engineering, Automation, or a closely related technical field.
- Technical Expertise: Deep foundational knowledge in computer architecture, specifically regarding GPU/NPU microarchitecture implementation.
- Domain Knowledge: Profound, demonstrable expertise in algorithm research focused on low-precision quantization and sparsity.
- Proven Impact: A strong track record in inference optimization for Large Language Models (LLMs) or multi-modal models, coupled with hands-on high-performance kernel development.
- System-Level Design: Concrete experience in system-level architectural design for high-performance AI inference accelerators.
- Academic Excellence: A robust publication record in tier-1 architecture and AI conferences (e.g., ISCA, MICRO, HPCA, ASPLOS, NeurIPS, CVPR).
- Professional Attributes: Exceptional cross-functional collaboration and communication skills, with the ability to build consensus across algorithm, silicon design, software, and testing teams.