Apply Online
Interested candidates should apply before 10/12/2023.
Your CV must be in PDF format, with your full name as the file name.
Hiring for SRE-Cloud Native Engineer (Python/Go, Bare Metal)
Location: Pune
Must Haves:
- Bachelor's and/or Master's degree in CS/EE or a related field
- 5+ years of hands-on experience as an SRE with a focus on cloud native technologies
- Hands-on programming experience in one or more languages, including Golang/Python.
- Well versed in preparing implementation plans, code walk-throughs, and solution designs.
- Well versed in developing code with a distributed team and resolving code merge issues.
- Hands-on experience deploying, managing and troubleshooting Kubernetes clusters and components.
- Strong experience configuring and administering Linux systems in cloud/SaaS production environments.
- Systematic problem-solving approach to troubleshooting, and a desire to resolve the root causes of common problems in 24×7 environments.
Preferred Qualifications:
- Experience delivering infrastructure as code – Ansible, Terraform, Git, Jenkins, Helm, ArgoCD.
- Good understanding of test-driven development, continuous integration and delivery.
- Good understanding of DNS, DHCP, LDAP, NFS, Kerberos, PAM, PXE, SNMP, SSH, HTTP/S, and NTP, and of troubleshooting network performance issues.
- Experience with monitoring and logging systems such as Prometheus, Grafana, Nagios, and ELK, and the ability to identify new technologies as appropriate.
- Experience tuning and optimizing storage solutions including Object Storage and NFS.
- Knowledge of virtualization and multiple hypervisor technologies, as well as cloud computing platforms such as AWS, Azure, and GCP.
- Experience configuring and maintaining web servers, load balancers, databases, storage systems and messaging systems.
- A passion for designing for high availability and scale, with the discipline and desire for extensive automation.
- Strong communication skills with the ability and willingness to work with diverse teams and customers across multiple time zones.
How you will make an impact:
- Assume broad responsibilities for successful delivery of our services in a hybrid model, including but not limited to deployment, configuration, integrations, and ongoing operations.
- Deploy, administer, and manage multiple Kubernetes clusters, both on-prem and in private cloud environments.
- Lead efforts to triage, debug and fix issues related to network, storage, scheduling, applications, and systems, for proactive and reactive incident resolution and root cause analysis.
- Develop and continuously improve platform capabilities for observability, monitoring, notifications, logging, tracing and continuous delivery with reduced toil.
- Develop standard solutions that enable consistency in service delivery and engage with multiple cross-functional teams to solve problems that impact service levels.
- Collaborate with the platform engineers for continuous automation of fleet-wide infrastructure and application deployments.
- Determine and set SLOs for the service, build the processes and tools to measure and implement them, and prevent recurring problems and undesirable service conditions.
- Participate in on-call rotation responsibilities.