600+ Reliability Jobs - June 2026 - Urgent Hiring

Undisclosed

Singapore

Posted
11 days ago
Undisclosed

Singapore

  • Plaud is a bootstrapped, skyrocketing, profitable company with a $250M revenue run rate achieved in just three years.
  • Define the next-gen paradigm for human-AI interaction.
  • Gain exposure to cutting-edge AI for Pro tools and play a direct role in our global expansion. ...
Posted
11 days ago
Undisclosed

KL City

  • Design and operate the SRE practice for Managed offerings, including on-call processes, SLA frameworks, incident response playbooks, and post-incident review (PIR) processes.
  • Build and maintain observability infrastructure: centralised logging (correlation IDs), metrics dashboards, distributed tracing, and alerting for the Predator/Instinct platform stack.
  • Define and track SLOs (Service Level Objectives) and error budgets for real-time transaction processing pipelines, targeting high TPS and low round-trip latency. ...
Posted
11 days ago
Undisclosed

KL City

  • Design and operate the SRE practice for Managed offerings, including on-call processes, SLA frameworks, incident response playbooks, and post-incident review (PIR) processes.
  • Build and maintain observability infrastructure: centralised logging (correlation IDs), metrics dashboards, distributed tracing, and alerting for the Predator/Instinct platform stack.
  • Define and track SLOs (Service Level Objectives) and error budgets for real-time transaction processing pipelines, targeting high TPS and low round-trip latency. ...
Posted
11 days ago
Undisclosed

KL City

  • Design and operate the SRE practice for Managed offerings, including on-call processes, SLA frameworks, incident response playbooks, and post-incident review (PIR) processes
  • Build and maintain observability infrastructure: centralised logging (correlation IDs), metrics dashboards, distributed tracing, and alerting for the Predator/Instinct platform stack
  • Define and track SLOs (Service Level Objectives) and error budgets for real-time transaction processing pipelines, targeting high TPS and low round-trip latency ...
Posted
11 days ago
Undisclosed

Singapore

  • Develop deep technical expertise in your assigned product area and tech stack.
  • Own production deployment, configuration, and release processes
  • Drive performance, reliability, and operability through continuous improvement ...
Posted
19 days ago
Undisclosed

KL City

  • Support the operation and maintenance of overseas cloud-based services, ensuring platform stability, reliability, and performance; proactively identify and resolve system bottlenecks.
  • Follow internal operational processes, taking ownership of incident management, service request management, problem management, and change management.
  • Be responsible for platform software upgrades, as well as the deployment, maintenance, and optimization of core systems. ...
Posted
19 days ago
Undisclosed
  • Lead preventive maintenance planning and execution.
  • Review and assure required spare parts and inventory levels.
  • Develop action plans (mitigation, corrective, and preventive) for breakdowns and issues. ...
Posted
19 days ago
Undisclosed

Singapore

  • Our Client:
  • Leading consumer internet platform
  • Large-scale distributed architecture with high concurrency and massive traffic scenarios ...
Posted
12 days ago
Undisclosed

Singapore

  • Advanced knowledge of core AWS services: EC2, ECS/EKS, Lambda, S3, RDS/Aurora, DynamoDB, VPC, ELB/ALB/NLB, Route53, IAM.
  • Designing multi-AZ and multi-region highly available architectures.
  • Strong understanding of networking in AWS (subnets, routing tables, NAT, security groups, NACLs, VPC peering, PrivateLink). ...
Posted
19 days ago

SANMINA-SCI SYSTEMS SINGAPORE PTE. LTD.

SGD3,000 - SGD3,000 Per Month

Singapore

  • Responsible for installing in-house reliability test capability and supporting reliability test requirements.
  • Responsible for communicating and coordinating with external labs to support customer-specified reliability tests or reliability studies.
  • Participate in New Project Introduction (NPI) reviews and planning to deliver reliable PCBs that meet customer requirements and specifications. ...
Posted
20 days ago
SGD7,000 - SGD7,000 Per Month

Singapore

  • Unix or Linux administration and performance tuning skills, with 0 to 5 years of experience leading services in a large-scale *nix environment.
  • Java and JVM runtime configuration and troubleshooting, or proficiency in Python, Go, or another scripting language.
  • Experience with DevOps tools, processes, and culture. ...
Posted
20 days ago

Centre For Strategic Infocomm Technologies (CSIT)

Undisclosed

Singapore

  • Implement, operate and optimise modernised and virtualised network infrastructure to ensure scalable, sustainable and secure operations
  • Implement architectural standards, reference models and guidelines that enhance security, resiliency and scalability
  • Evaluate and implement network technologies that meet users’ needs and align with industry developments ...
Posted
20 days ago
Undisclosed
  • Design, implement, and maintain VMware Cloud Foundation (VCF) infrastructure to support GE’s organizational requirements.
  • Manage and troubleshoot VCF resource availability, including compute, memory, and storage (SAN and vSAN), for up to 160 ESXi hosts and more than 1,200 VMs across multiple clusters located in both SG and MY, using tools such as NSX-T, vCenter, ESXi 8.x, and VMware vSphere Cluster availability.
  • Experience in integration with backup services using NetBackup (NBU) such as HotAdd for image backup/restore and Media to file level backup/restore to support business application VMs requirements, including full, incremental, and ad-hoc backups. ...
Posted
20 days ago
Undisclosed

KL City

  • Uphold Platform Integrity: Ensure reliability, availability, and performance, proactively addressing issues before they arise.
  • Innovate Continuously: Harness the latest tools for automation, performance enhancement, and operational efficiency.
  • Incident Management Leadership: Steer incident response and root cause analysis, reflecting our uncompromising commitment to excellence. ...
Posted
14 days ago
Undisclosed

Singapore

  • At least 5 years' relevant experience in supporting and/or implementing payments and securities settlement systems with a minimum of 5 years of experience in Application support and operations, or application security, site reliability engineering, or solution architecture
  • Strong knowledge of security principles and best practices in software development, security frameworks, vulnerability assessment tools, and penetration testing methodologies
  • Understanding of enterprise architecture patterns and integration technologies ...
Posted
21 days ago
Undisclosed

Malacca City

  • Moderate knowledge of maintaining and repairing reliability equipment (electronic and mechanical).
  • Good interpersonal, time management, and communication skills.
  • Skills in analysing and resolving complex issues. ...
Posted
14 days ago
SGD8,000 - SGD8,000 Per Month

Singapore

  • This is a short-term position of up to one year.
  • You provide front-line (L1) reliability and operational support across the Saudi Wealth Management platform landscape, spanning SAMA regulatory technologies (e.g., Watheeq, SARIE and ZATCA e-Invoicing), front-facing platforms such as RM Plus, core banking (Temenos T24) and payment platforms (TPH, GTX and SecPay). You balance day-to-day service stability with deep understanding of how these technologies enable Saudi business workflows, ensuring resilient operations, compliant outcomes and high-quality client service.
  • Monitor the health and availability of Saudi WM platforms (RM Plus, T24, TPH, GTX, SecPay and regulatory services) using dashboards and alerts to detect and respond to issues early. ...
Posted
21 days ago
Undisclosed

Singapore

  • Advanced knowledge of core AWS services: EC2, ECS/EKS, Lambda, S3, RDS/Aurora, DynamoDB, VPC, ELB/ALB/NLB, Route53, IAM.
  • Designing multi-AZ and multi-region highly available architectures.
  • Strong understanding of networking in AWS (subnets, routing tables, NAT, security groups, NACLs, VPC peering, PrivateLink). ...
Posted
21 days ago
Undisclosed

Singapore

  • Advanced knowledge of core AWS services: EC2, ECS/EKS, Lambda, S3, RDS/Aurora, DynamoDB, VPC, ELB/ALB/NLB, Route53, IAM.
  • Designing multi-AZ and multi-region highly available architectures.
  • Strong understanding of networking in AWS (subnets, routing tables, NAT, security groups, NACLs, VPC peering, PrivateLink). ...
Posted
21 days ago
Undisclosed

Singapore

  • A strong believer in automating DevOps and SRE aspects such as infrastructure provisioning, deployment, observability, incident lifecycle, and uptime SLAs.
  • Bold to challenge, open to being challenged, curious to learn and grow
  • Using Infrastructure as Code tooling like Terraform or Ansible to manage AWS resources ...
Posted
21 days ago
Undisclosed

Singapore

  • Advanced knowledge of core AWS services: EC2, ECS/EKS, Lambda, S3, RDS/Aurora, DynamoDB, VPC, ELB/ALB/NLB, Route53, IAM.
  • Designing multi-AZ and multi-region highly available architectures.
  • Strong understanding of networking in AWS (subnets, routing tables, NAT, security groups, NACLs, VPC peering, PrivateLink). ...
Posted
21 days ago
Undisclosed

Singapore

  • Ensuring that the Solace Cloud Services are healthy and reliable, and that SLAs are being met
  • Design and implement our infrastructure tooling, observability, and automation
  • Contribute to making the production operations more efficient, less error-prone, etc. ...
Posted
2 days ago
Undisclosed

Singapore

  • 5–10 years of experience in an SRE or SRE-related operations role, including 3+ years supporting e-commerce, financial services, or large-scale SaaS platforms.
  • Excellent infrastructure troubleshooting and analytical problem-solving skills.
  • Strong hands-on experience with observability and monitoring tools such as Splunk, Dynatrace, or equivalent, with a proven ability to triage and investigate complex issues. ...
Posted
a day ago
Undisclosed
  • Scope and define tasks: arrange tests and samples by priority to meet delivery deadlines for production or R&D products.
  • Tool operation: operate the microscope, mini-chamber, MST (Media Servo Tester), and blade tester, and ensure high-quality tests are performed.
  • Tool maintenance: manage spare parts inventory, perform PM on schedule, troubleshoot blade tester and mini-chamber. ...
Posted
11 hours ago
Undisclosed
  • About the Job
  • Lead Platform Reliability Engineer (PRE) is responsible for engineering, operating, and maintaining GEL’s internal container platform and its supporting infrastructure, with a strong focus on reliability, resiliency, and security.
  • As a Senior PRE within GEL’s Infrastructure team, you will play a pivotal role in designing, building, and operating distributed container hosting solutions using Broadcom’s Tanzu product. Your mission is to safeguard and continuously enhance cloud-native applications and services that power the organization’s container ecosystem. You will serve as a subject matter expert for Level 3 support, working closely with cross-functional teams to troubleshoot complex issues, optimize platform performance, and guide application teams in adopting reliability best practices. ...
Posted
16 days ago
Undisclosed
  • About the Job
  • Senior Platform Reliability Engineer (PRE) is responsible for engineering, operating, and maintaining GEL’s internal container platform and its supporting infrastructure, with a strong focus on reliability, resiliency, and security.
  • As a Senior PRE within GEL’s Infrastructure team, you will play a pivotal role in designing, building, and operating distributed container hosting solutions using Broadcom’s Tanzu product. Your mission is to safeguard and continuously enhance cloud-native applications and services that power the organization’s container ecosystem. You will serve as Level 2 support, working closely with cross-functional teams to troubleshoot complex issues, optimize platform performance, and guide application teams in adopting reliability best practices. ...
Posted
16 days ago
SGD7,500 - SGD8,500 Per Month

Singapore

  • 7+ years of strong experience in Production Support / SRE / BizOps (L2 Operations - hands-on troubleshooting, monitoring, and incident handling)
  • Hands-on expertise in Linux (commands, system operations)
  • Strong scripting skills in Shell / Python / Jython ...
Posted
5 days ago
SGD8,000 - SGD8,800 Per Month

Singapore

  • Production Support / SRE (L2 Operations)
  • Linux & Scripting Expertise
  • Monitoring & Observability Tools ...
Posted
5 days ago
SGD5,000 - SGD6,000 Per Month

Singapore

  • Permanent position, Salary up to $6,000 with AWS + Bonus
  • Working location at Jurong Island
  • Experience in Equipment Reliability and Maintenance is a MUST ...
Posted
6 days ago