Site Reliability Engineer

Job Posting, November 28, 2018

The Site Reliability Engineer is an advocate for continuous improvement of systems, internal processes, and techniques. This position requires experience building reliable systems and high-quality infrastructure, along with an understanding of the importance of availability and scalability. Securing systems and preferring outcomes over output are critical. Working with teams and helping others succeed is essential. You aren’t afraid to bring your skills and expertise to create inventive solutions.


●  Work with the Engineering teams to design and build a secure, available, and scalable infrastructure (using SaaS, IaaS, PaaS, and IaC)

●  Configure the network so that it is secure, accessible, and manageable

●  Implement system and service monitoring for mission-critical systems

●  Maintain the connection between monitoring solutions and incident alerting systems

●  Evolve the auto-remediation system to automatically resolve production support incidents before passing them to on-call engineers

●  Reduce the need for manual human-system interaction

●  Work with Engineering and QA teams to support environments and CI/CD pipelines

●  Mentor Schedulicity team members on networking, infrastructure, and security

●  Work with vendors as needed

●  Participate in the on-call rotation and help identify potential production issues

System Knowledge Requirements:

●  Windows Server and Linux system administration

●  Network best practices and configuration

●  Firewall/ACL management

●  Identity and access management

●  System and service monitoring solutions (e.g. CloudWatch, New Relic)

●  Incident alerting solutions (e.g. PagerDuty)

●  Server instance (VM) management

●  VPC best practices and configuration

●  VPN setup and support

●  DNS configuration (e.g. Cloudflare, Route 53)

●  TLS/SSL knowledge and best practices

●  System security best practices

●  Cloud storage and CDN solutions (e.g. S3, CloudFront, Akamai)

●  Scripting to support automation (e.g. PowerShell, Bash)

●  Experience with AWS, GCP, or Azure

Qualifications/Work Experience Requirements:

●  BS/BA in CS/CE/EE or 5+ years of experience as a site reliability engineer, infrastructure manager, or system administrator

●  Familiarity with modern security standards (e.g. PCI DSS)

●  Experience working in an agile environment

Open Date: November 28, 2018

Close Date: When it’s filled

Where: Bozeman, Montana (Home Office)

Interested? Email your resume to
