| Site Reliability Engineer
 | Job Locations | US-TX-Dallas
 
 | ID | 2025-3472
 
 | Category | Technology-Engineering
 
 | Position Type | Regular Full-Time
 
 | Overview
 
 ViaPath is seeking a Site Reliability Engineer for our Enterprise Operations department. SRE personnel combine engineering experience and an innate drive to improve existing systems and processes with the creativity to develop novel solutions to evolving challenges. The SRE is responsible for the availability and reliability of critical platform services and applications, including launching product updates, locating production errors and issues, and building integrations that improve users' experience. The SRE will support our Product Engineering pipeline, cloud, and datacenter environments. This position requires participation in an on-call rotation to provide 24/7 operations support. This is a hybrid position (office/home) based out of one of the following ViaPath offices: Altoona, PA; Dallas, TX; Fruitland, ID; Mobile, AL; or Pittsburgh, PA.
 
 | Responsibilities
 
 
     - Run the production environment by monitoring availability and taking a holistic view of system health
     - Build software and systems to manage platform infrastructure and applications
     - Improve reliability, quality, and time-to-market of our suite of software solutions
     - Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
     - Provide primary operational support and engineering for multiple large, distributed software applications
     - Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
     - Partner with development teams to improve services through rigorous testing and release procedures
     - Participate in system design consulting, platform management, and capacity planning
     - Balance feature development speed and reliability with well-defined service level objectives
 
 | Qualifications
 
 
     - Bachelor's degree in computer science or other highly technical, scientific discipline preferred; equivalent years of related experience will be considered in lieu of a degree.
     - A minimum of 2 years of site reliability, NetOps, DevOps, or similar experience, including responsibility for supporting production systems.
     - Experience administering Linux, including installing, configuring, and maintaining Linux operating systems. Ability to analyze and resolve problems associated with operating systems, hardware, applications, and software.
     - A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.
     - Excellent written and verbal communication skills. Ability to actively listen and to identify essential issues. Ability to read and interpret technical instructions and documentation.
     - Excellent problem-solving skills.
 
     Preferred Experience with the following technologies
 
     - Experience with the core AWS services, including ALB, ELB, EC2, RDS, and S3, is preferred.
     - Experience with distributed storage technologies like NFS, HDFS, and S3, as well as dynamic resource management frameworks (Kubernetes, Cinc, Jenkins), is preferred.
     - Coding experience beyond simple scripts. Ability to program (structured and OO) in one or more high-level languages, such as Python, Java, C/C++, Ruby, and JavaScript.
     - Previous success in technical engineering
     - Cinc
     - GitLab
     - Kubernetes
     - Proxmox
     - MySQL |