Full-Time Job: AI Inference - Compression Engineer at Beijing Foreign Enterprise Management Consultants Co.,Ltd.


AI Inference - Compression Engineer

Beijing Foreign Enterprise Management Consultants Co.,Ltd.

Undisclosed

Full-Time

Singapore


Job Location

Singapore

Job Description

Responsibilities

On behalf of Huawei, a world-renowned information and communication technology company, we are seeking passionate and talented individuals to join our team as an AI Inference & Compression Engineer.

Key Responsibilities

LLM Inference Acceleration. Research and develop advanced compression algorithms to accelerate LLM serving. Focus on KV cache optimization, model quantization, and resolving memory bandwidth bottlenecks during autoregressive decoding.
Classical Codec Development. Design and implement advanced video compression algorithms, focusing on improving Rate–Distortion (RD) performance, optimizing entropy coding, and enhancing quantization design for real-world applications.
AI-Based Media Coding. Develop and optimize AI-based video coding components, including AI-based loop filters, optical flow, and intelligent rate control.
Model Deployment & Fusion. Bridge the gap between AI research and production. Optimize deep learning models for efficient inference and ensure seamless integration of compression algorithms into deployment frameworks (e.g., vLLM).
Performance & Quality Evaluation. Conduct rigorous objective (e.g., PSNR, VMAF) and subjective visual quality assessments for video systems, as well as perplexity, zero-shot benchmark, latency, and throughput analysis for LLM systems.

Required Qualifications

Master’s or PhD in Computer Science, Electronic Engineering, Mathematics, or related fields (PhD preferred).
Solid understanding of video coding fundamentals, including prediction, transform coding, quantization, and entropy coding, with hands-on experience in standards such as H.265/HEVC, AV1, or H.266/VVC.
Strong understanding of Transformer architectures and attention mechanisms, as well as key performance bottlenecks in generative AI inference, particularly memory bandwidth constraints (“memory wall”).
Strong proficiency in Python and C/C++. Hands-on experience building, training, and modifying models using frameworks such as PyTorch or TensorFlow.

Preferred Qualifications

ISP Knowledge. Familiarity with Image Signal Processing flow, such as demosaicing, denoising, and tone mapping.
Image Processing. Experience in computer vision-based image enhancement (e.g., de-blurring, artifact removal, or HDR).
Hardware Optimization. Knowledge of SIMD, CUDA, or other hardware acceleration techniques for video and tensor processing.

Important Notice

Never share your bank or credit card details when applying for a job. Avoid making any payments or completing unrelated surveys. If anything seems suspicious, please report this job ad immediately.
