Job Scope:
We are seeking a highly skilled DevOps (Development Operations) Engineer to join our team. The ideal candidate will have extensive experience deploying and managing mobile applications, with a specific focus on quick commerce and online grocery delivery services. You will be responsible for ensuring the reliability, scalability, and performance of our application infrastructure.
Key Responsibilities:
- Infrastructure Management: Design, implement, and manage scalable and reliable cloud infrastructure, ensuring high availability and disaster recovery capabilities.
- Incident Management: Lead incident response efforts, including root cause analysis, post-mortem reviews, and developing strategies to prevent future incidents.
- Monitoring and Logging: Implement and maintain monitoring, logging, and alerting solutions to ensure system health and performance, proactively resolving infrastructure issues before they impact end users.
- Security: Implement and maintain security best practices across all environments, including conducting regular security audits and compliance checks.
- CI/CD Pipeline: Develop, maintain, and optimize continuous integration/continuous deployment (CI/CD) pipelines for mobile applications, automating deployment processes and workflows to enhance efficiency.
- Performance Optimization: Analyze and optimize system and database performance, implementing improvements to ensure optimal performance under varying load conditions and managing scaling strategies.
- Collaboration: Collaborate closely with the development team to ensure seamless integration and deployment of new features, participating in agile ceremonies and contributing to sprint planning, reviews, and retrospectives.
- Documentation and Training: Create and maintain comprehensive documentation of systems, processes, and configurations, providing training and support to development teams on DevOps best practices and tools.
Requirements:
• 3 to 5 years of experience in a DevOps role, preferably at a startup with one or more mobile applications.
• Availability to respond to incidents at any time.
• Solid understanding of networking, security, and database management.
• Proven experience with DigitalOcean, AWS, and/or GCP.
• Strong knowledge of containerization and orchestration tools such as Docker and Kubernetes.
• Proficiency in scripting languages such as Bash, Python, or Ruby.
• Experience with monitoring tools such as Prometheus, Grafana, and the ELK stack.
• Strong problem-solving skills and the ability to work under pressure.
• Excellent communication and collaboration skills.