Lead in-depth investigations into incidents originating from a Line of Business (LoB) application team or an infrastructure team (e.g., network, compute, or storage), as well as into security breaches, performance degradation, and outages, to uncover their root causes.
Deploy, manage, and operate scalable, highly available, and fault-tolerant systems on cloud platforms such as AWS, Azure, and GCP.
Develop and implement incident response procedures and protocols to effectively mitigate and contain network incidents. Provide guidance and support to incident response teams during critical incidents, ensuring timely resolution and minimal impact on business operations.
Implement and maintain CI/CD pipelines.
Monitor system performance, troubleshoot issues, and optimize performance through load testing and capacity planning.
Automate infrastructure deployment, configuration, and management following ‘Infrastructure as Code’ practices.
Work closely with the engineering and product teams.
Minimum Qualifications
At least 2 years of experience in a lead role.
At least 5 years of experience in DevOps, network infrastructure engineering, or a related field.
Proficient in multi-cloud environments (AWS preferred), Docker, and Terraform.
Experience in building and maintaining CI/CD pipelines.
Working knowledge of networking.
Knowledge of resource monitoring, logging, high availability, redundancy, autoscaling, and failover.
Good verbal and written communication skills in English and Indonesian.
Good teamwork skills; capable of collaborating with other team members.
Eager to learn, understand, and apply new technologies.