DevOps Engineer/SRE
Responsibilities:
Design, develop, and maintain CI/CD pipelines using GitLab CI/CD to support a robust and efficient software delivery lifecycle.
Manage and maintain infrastructure on cloud platforms (especially GCP) using Infrastructure as Code (IaC) practices with tools such as Terraform.
Deploy and operate applications on Kubernetes (GKE), including configuration for autoscaling, security, and cost optimization.
Utilize tools such as ArgoCD to implement GitOps and automate deployments across multiple environments.
Build and manage monitoring and alerting systems using Prometheus, Grafana, Cloud Monitoring (formerly Stackdriver), and other observability tools.
Collaborate with Development, QA, and Security teams to ensure systems are performant, secure, and auditable.
Manage and maintain database systems such as PostgreSQL, MySQL, and MongoDB with a focus on performance, availability, and backup strategies.
Investigate and resolve production issues related to system performance, reliability, and scalability.
Contribute to the design of Disaster Recovery and High Availability strategies.
Advocate for and lead the adoption of DevOps best practices within the team, and provide technical mentorship to peers.
Qualifications:
Bachelor's degree or higher in Computer Science, Computer Engineering, Information Technology, or a related field.
Minimum 3 years of experience in DevOps, Site Reliability Engineering, or Infrastructure Engineering roles.
Proficiency with cloud platforms, especially Google Cloud Platform (GCP) and its services (e.g., GKE, Cloud SQL, VPC, Load Balancer, IAM, Cloud Storage).
Strong experience in building and maintaining CI/CD pipelines using GitLab CI.
Solid understanding of Infrastructure as Code (IaC) using Terraform or Pulumi.
Hands-on experience managing Kubernetes clusters (GKE), including deployments, workloads, and cluster operations.
Proficient in writing and managing YAML configuration files, and experienced with tools like ArgoCD, Helm, Kustomize, Vault, Istio, Envoy, and Kong.
Strong knowledge of monitoring/logging systems, including Prometheus, Grafana, and cloud-native observability tools.
Operational experience with relational and NoSQL databases such as PostgreSQL, MySQL, and MongoDB.
Solid understanding of Linux systems and the ability to write shell scripts (bash/sh) for automation and system troubleshooting.
Good understanding of security principles including IAM, secret management, and network policy configuration.
Experience working with large-scale systems involving high availability, scalability, and microservices architecture.
Excellent communication skills and the ability to collaborate effectively with cross-functional teams (e.g., Development, QA, Product, Security).
Hybrid