Site Reliability Engineer

We’re looking for a driven, motivated Site Reliability Engineer to join our growing team. The ideal candidate can take on the job functions and meets the qualifications listed below.

Job Functions

  • Run the production environment by monitoring availability and taking a holistic view of system health
  • Build software and systems to manage platform infrastructure and applications
  • Improve reliability, quality, and time-to-market of our suite of software solutions
  • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
  • Provide primary operational support and engineering for multiple large distributed software applications
  • Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
  • Partner with development teams to improve services through rigorous testing and release procedures
  • Participate in system design consulting, platform management, and capacity planning
  • Create sustainable systems and services through automation and uplifts
  • Balance feature development speed and reliability with well-defined service level objectives

Qualifications

  • 8+ years of experience with Azure/AWS
  • 4+ years of experience with DevOps CI/CD
  • 4+ years of experience with Kubernetes / Docker / Linux / PowerShell / Bash
  • Ability to program (structured and OO) in one or more high-level languages
  • Experience with distributed storage technologies such as NFS, HDFS, Ceph, and S3, as well as dynamic resource management frameworks (Mesos, Kubernetes, YARN)
  • A proactive approach to spotting problems, areas for improvement, and performance bottlenecks

Nice to have

  • Knowledge of databases such as Cassandra, Neo4j, or Druid
Job Category: AWS, Azure, DevOps
Job Type: Full Time
Job Location: Remote