Full-Time Software Engineer Job at Desay SV, West Region (Singapore) - Maukerja

Software Engineer

Desay SV

Undisclosed

Jurong East, West Region (Singapore)

Working Location

  • Jurong East, West Region, Singapore

Job Description

Responsibilities

Desay SV Automotive Singapore Pte. Ltd. is an innovative organization committed to exploring frontier technologies. While the company has a strong background in automotive electronics, this role is exclusively focused on advancing applications in large language models and on-device AI inference.



Duties/Responsibilities

  • On-Device Inference Engine Development. Design, develop, and optimize LLM inference engines for embedded, mobile, and edge devices, covering operator development, graph optimization, memory management, and multi-backend adaptation.
  • Model Compression & Lightweight Deployment. Research and apply quantization (INT4/INT8/FP16), pruning, distillation, and KV Cache compression techniques to achieve efficient inference on resource-constrained hardware.
  • Heterogeneous Hardware Optimization. Conduct operator-level performance tuning for ARM CPU, NPU, GPU, and DSP; use profiling tools to identify bottlenecks and continuously improve inference throughput and latency.
  • LLM Inference Acceleration. Participate in building LLM inference acceleration solutions (including speculative decoding, continuous batching, and KV Cache optimization) to improve model response efficiency on edge devices.
  • Cloud–Edge Collaboration. Collaborate on cloud AI Infra and on-device deployment pipelines: model export (ONNX/TorchScript), training–inference consistency validation, and joint cloud–edge inference architecture design.
  • Track Frontier LLM Developments. Stay current with cutting-edge LLM research; explore feasible paths for applying the latest model capabilities (e.g., reasoning models, multimodal models) to real-world embedded product scenarios.


Requirements

  • Master’s degree or above in Computer Vision, Machine Learning, Automation, or a related field.
  • C++ Proficiency (Core Requirement). Expert-level C++ with deep understanding of memory models, concurrency, and low-level optimization. Proficient in Python for model conversion, evaluation scripts, and training tooling.
  • Cloud AI Infra or Embedded Inference Framework Experience. Hands-on experience with either: (a) large-scale GPU training cluster operations and optimization, or (b) core module development in on-device inference frameworks such as MNN, TNN, NCNN, or ExecuTorch.
  • Large Model Algorithm Fundamentals. Solid understanding of Transformer attention mechanisms, KV Cache, continuous batching, and speculative decoding. Familiar with mainstream open-source model architectures including LLaMA, Qwen, Gemma, and Mistral.
  • Embedded Systems & Heterogeneous Hardware. Understanding of embedded system principles and heterogeneous hardware architectures (ARM, Snapdragon, MTK, Apple Silicon). Experience with driver adaptation or BSP is a plus.
  • Engineering Discipline. Proficient in Linux development environments; experienced with performance profiling (perf, Instruments, Snapdragon Profiler), unit testing, and CI/CD workflows.
