We seek a highly skilled and experienced Engineer for our Engineering team. You will be at the forefront of ensuring the seamless operation and scalability of our services. Your expertise will contribute significantly to the reliability and efficiency of our platforms and will influence both our product and our site reliability engineering practices, helping ensure high system availability, scalability, and performance.
*Please note that this role is hybrid (2 days per week in the office).
Responsibilities:
• Develop and maintain tools and automation systems for deployment, monitoring, alerting, and incident response to reduce manual interventions and improve the efficiency of our operations.
• Collaborate closely with software engineering teams to understand product requirements, provide technical guidance for infrastructure design, and support scalable, durable, and reliable services.
• Participate in incident handling by investigating and resolving production incidents to minimize impact and ensure system stability.
• Lead incident response and post-mortem analysis to prevent future outages and improve response strategies.
• Perform capacity planning and optimization to ensure the platform meets performance and scalability targets.
• Conduct regular system and performance analysis, identify areas for improvement, and implement solutions to enhance efficiency and stability.
• Troubleshoot and resolve complex system issues, including performance bottlenecks, network connectivity problems, and infrastructure failures.
• Implement and maintain security best practices throughout the platform, ensuring compliance with industry standards and regulations.
• Advocate for SRE best practices across the organization and contribute to setting service level objectives (SLOs) and service level indicators (SLIs).
• Mentor junior SRE team members and contribute to team growth and skill development.
• Participate in on-call rotations to provide support for the production environment, responding to and resolving incidents promptly.
Requirements:
• 5+ years of relevant work experience as a Platform Engineer, SRE, or in a similar role, implementing operational processes and tools and managing large-scale cloud application environments.
• Strong knowledge of system architecture and networking concepts, specifically designing scalable and fault-tolerant systems.
• Very good knowledge of Automated Build Systems (e.g., Jenkins, ArgoCD) for building and managing CI/CD pipelines.
• Proven experience in "Everything as code": infrastructure (e.g., Terraform, Crossplane, Packer) and configuration management (e.g., Ansible). Scripting is second nature.
• Proficiency in at least one programming language (e.g., Python, Go, Java) and experience with scripting languages (e.g., Bash, PowerShell).
• Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes) and cloud platforms (e.g., AWS, Azure, GCP).
• Solid understanding of Linux-based systems, including administration, troubleshooting, and performance tuning.
• Knowledge of monitoring and logging frameworks (e.g., Prometheus, ELK Stack) and experience with implementing observability practices.
• Excellent problem-solving and analytical skills, with the ability to troubleshoot complex issues and provide effective solutions.
• Strong communication and collaboration skills, with the ability to work effectively in cross-functional teams.
• Proficiency in English, both written and spoken.
We are the pioneers and trailblazers of a global IT Market Category (DEX) that is shaping the future of how the world works, giving our customers’ IT Teams total digital visibility across their enterprise. Our innovative solutions integrate real-time analytics, automation, and employee feedback across all endpoints. This enables IT teams to solve complex technical challenges, create ever more productive workplaces, and deliver happy, satisfied employees in the digital workplace.
With over 1000 employees across 5 continents, Nexthink operates as One Team, connecting, collaborating and innovating to continuously grow. We call our employees ‘Nexthinkers’ and our commitment to diversity, inclusion, and equity is second to none. We currently have over 75 nationalities working with us, from all cultures and backgrounds, speaking many different languages.
If you are looking for a change and like a nice atmosphere, lots of challenges, and having fun while working, this is a great opportunity for you! Check what we offer:
• 💼 Permanent Contract and a competitive compensation package (Stock Options also included).
• 📍 Amazing centrally located offices near the Bernabeu Stadium.
• 🩺 Private Health Insurance (Sanitas) and daily meal vouchers of 11 EUR, both entirely covered by us.
• 🏡 Hybrid work model balancing office and remote work, with a structured approach for new hires to foster connections and onboarding.
• 🏖️ Flexible Hours and unlimited vacation (employees have unlimited paid time off on top of the 23 days of holidays we offer) plus 3 company-paid volunteer days.
• 🤸 Up to 25 EUR per month for a gym subscription.
• 🛴 Flexible retribution plan for kindergarten & transport tickets.
• 🧑‍🏫 Reimbursement of up to 50% of the cost of English & Spanish classes.
• 🍉 Fresh fruit, cookies, and occasionally some soft drinks as well.
• 🍕 Regular company and team events like Pizza talks, Team Building activities, Christmas parties, Meetups hosted at the office, and more!
• 📣 Bonuses for referring successful hires after three months of continuous employment.
• 🚚 We offer a relocation package to people who are coming from another country.
Please note that not all the benefits listed above are available for temporary, contract, and internship roles. To ensure you have the most up-to-date information, we recommend checking with your Recruitment Partner.