What The Role Is
You play an important role in providing Information and Communications Technology (ICT) engineering services to the Ops Systems Sustainment Centre in HTX.
What You Will Be Working On
Lead the Implementation and Deployment of Ransomware-Resilient Recovery Solutions
- Design, implement and deploy end-to-end recovery architectures to restore systems in the event of ransomware attacks
- Work with system owners to define Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO)
- Architect tiered recovery models, e.g. Critical Information Infrastructure (CII) systems vis-a-vis Significant Information Infrastructure (SII), aligned to RTO/RPO requirements
- Design solutions supporting immutable backups, offline/air-gapped storage, and isolated recovery environments (IRE)
- Define and document clean recovery paths ensuring restoration from known-good, uncompromised backups
- Define and maintain recovery blueprints for different system tiers, including OS, middleware, application, and data layers
- Document and hand over recovery solutions to O&S teams with clear SOPs and technical guides
- Keep abreast of new/emerging technologies to future-proof solutions
Enable Recovery Operations and Readiness
- Develop SOPs, runbooks, and validation procedures for system recovery
- Plan and execute recovery drills and tests to verify restoration speed and data integrity
- Lead the identification and remediation of gaps in recovery readiness
- Oversee backup validation processes, including routine test restores, and improve readiness through post-drill reviews and continuous improvement
Function as Technical Authority and Solution Owner
- Function as the technical expert for recovery solutions post-deployment
- Provide in-depth guidance on backup and recovery operations across Ops Systems platforms
- Define and maintain backup retention and recovery validation standards
- Collaborate with infrastructure and application teams to ensure seamless integration of recovery processes into operations
- Contribute technical expertise to the Ops Systems Engineering unit’s undertakings, such as O&S expertise for flagship programmes, and governance and checklists for O&S gatekeeping of AORs
- Advise on budget or capacity planning for storage, retention, and recovery environments
Coach and guide junior engineers in the Engineering unit
- Provide coaching to junior officers or project team members [no formal people management responsibilities]
- Guide engineers on the application of knowledge, and translation of knowledge into viable solutions
Manage any other tasks as assigned by the supervisor
What We Are Looking For
Tertiary qualification in Computer Science, Information Technology, Electrical and Electronics Engineering or equivalent
Minimum 10 years of experience in IT infrastructure, systems engineering, or operations, with at least 5 years in backup and recovery leadership
Proven expertise in designing and implementing ransomware-resilient system recovery strategies or equivalent
Track record of delivering RTO/RPO-aligned recovery capabilities
Strong leadership and collaboration skills across multi-disciplinary teams
Detail-oriented, structured, and proactive in identifying and mitigating recovery risks
Skilled in communicating complex recovery concepts to both technical and non-technical stakeholders
Ability to cope with a reasonably high level of stress
Ability to work both in a team and independently
Good grasp of IT industry best practices and processes, and keeps abreast of advances in technology
All new appointees will be appointed on a two-year contract in the first instance.
Only shortlisted candidates will be notified, within 4 weeks of the closing of the advertisement.