[Remote] Infrastructure Operations Engineer

Worldwide Salaried Open

Note: This is a remote position open to candidates in the USA. Voltage Park is an enterprise AI factory offering scalable compute power and AI infrastructure. The company is seeking a highly skilled Infrastructure Operations Engineer to ensure the stability, scalability, and performance of its compute, storage, and platform infrastructure, supporting AI/ML workloads at scale.

Responsibilities

  • At the direction of the Manager of Infrastructure Operations, design, build, and roll out new platforms and patterns to minimize incidents and enable customer-facing and internal features
  • Deploy updates and improvements to support both Voltage Park’s internal and end-customer use cases
  • Collaborate with colleagues in the Infrastructure Engineering, Network Operations, Customer Success, and Software and Platform Development teams
  • Participate in the on-call rotation, which is evenly distributed across all team members in a primary/secondary pattern: you serve as primary, then move to the secondary position

Skills

  • 8+ years working with Linux as a server/hosting platform; extra points for Ubuntu experience
  • 5+ years of experience with AWS
  • 2+ years of experience with Kubernetes and strong container fundamentals
  • 2+ years of experience with Terraform and Ansible
  • 2+ years of experience with network-attached storage management (via NFS, Ceph, or other protocols); extra points for experience with VAST storage systems
  • Experience with monitoring systems (Prometheus, ELK stack)
  • Familiarity with the GitOps workflow
  • Software development experience using Python, Go, Bash, or other languages for automation and for connecting systems and APIs
  • Deep networking fundamentals; extra points for experience with datacenter-level networks, 400Gb Ethernet, and InfiniBand
  • Experience building and delivering complex systems
  • Effective at navigating tradeoffs between design, risk, cost, and outcomes
  • Comfortable with navigating ambiguity
  • Strong written and oral communication
  • Experience with bare-metal hardware troubleshooting and provisioning; extra points for working with Dell hardware
  • Experience with GPU servers, either in bare-metal form or under virtualization
  • Deep experience with network switches, routers, and firewalls, particularly SONiC switches, Palo Alto firewalls, and Juniper Networks equipment
  • Experience with VAST storage systems

Company Overview

  • Voltage Park is a cloud platform providing on-demand and reserved GPU infrastructure for AI and machine learning workloads. It is a sub-organization of Lightning AI. It was founded in 2023 and is headquartered in Berkeley, California, USA, with a workforce of 51-200 employees. Its website is https://voltagepark.com/.