Jobs


DevOps Engineer

VAST Data

Software Engineering
Reykjavík, Iceland
Posted on Jan 29, 2026

  • Engineering
  • Iceland - Reykjavík
  • Senior
  • Full-time

Description

VAST Data is the data platform company for the AI era. We are building the enterprise software infrastructure to capture, catalog, refine, enrich, and protect massive datasets and make them available for real-time data analysis and AI training and inference.

This is a great opportunity to be part of one of the fastest-growing infrastructure companies in history, an organization at the center of the revolution in artificial intelligence.

"VAST's data management vision is the future of the market." — Forbes

Our Iceland team is developing a next-generation cloud resource management platform that provides a unified API surface for managing infrastructure across multiple cloud providers. We're building systems that abstract complexity while maintaining the power and flexibility that enterprise customers demand.

Join us and discover just how VAST the possibilities are.

We're looking for a DevOps Engineer to join our engineering team in Iceland and help shape how our platform is deployed, operated, and observed in real-world environments.

You'll work at the intersection of infrastructure, reliability, and developer enablement—designing and operating Kubernetes-based environments, building deployment and release workflows, and establishing best practices for staging and production as the platform evolves.

This is a highly hands-on role with significant autonomy. You'll collaborate closely with backend engineers, influence architectural and operational decisions, and play a key role in defining how we run and scale production systems. You'll also be encouraged to leverage modern tooling—including AI-assisted workflows—to debug, maintain, and continuously improve the reliability of our environments.

What You'll Do:

- Design, deploy, and evolve Kubernetes-based environments across development, staging, and production

- Own and improve the release workflow, including CI/CD pipelines and Git-based deployment practices

- Build and maintain containerized workloads, Docker images, and Kubernetes runtimes

- Operate and improve our GitOps setup using Argo CD

- Define and implement monitoring, alerting, and observability best practices across services

- Develop and maintain Prometheus metrics, dashboards, and alerting rules

- Work with distributed tracing and service mesh technologies (e.g. Linkerd) to improve reliability and visibility

- Collaborate closely with backend engineers to ensure services are production-ready from day one

- Identify gaps in our infrastructure and proactively propose improvements to reliability, scalability, and developer experience

- Automate repetitive operational tasks and environment setup using infrastructure-as-code and scripting

- Help define what "good" production and staging environments should look like as the platform matures

- Leverage AI tools to assist with debugging, incident analysis, root-cause investigation, and operational decision-making in staging and production environments

Requirements

Required:

- Strong hands-on experience with Kubernetes in real-world environments

- Solid understanding of Docker, container images, and containerized application workflows

- Experience deploying and operating applications in Kubernetes clusters

- Familiarity with CI/CD pipelines and release workflows

- Experience working with Git-based workflows and infrastructure repositories

- Ability to reason about system reliability, failure modes, and operational best practices

- Comfortable taking ownership and driving initiatives independently

- Good communication skills in English

Preferred:

- Experience with Helm and/or Kustomize

- Experience using Argo CD or other GitOps tools

- Familiarity with Prometheus, Grafana, and Alertmanager

- Experience with distributed tracing and observability tooling

- Familiarity with service meshes (Linkerd, Istio, or similar)

- Experience operating and scaling production-grade environments

- Background in automating infrastructure and environment provisioning

- Experience working in early-stage or evolving platform environments

- Strong comfort using AI-assisted tools to help manage, debug, and maintain staging and production systems

- Experience using AI for log analysis, alert triage, configuration validation, and incident investigation

- Ability to apply AI tools pragmatically to accelerate operational work while maintaining system understanding and accountability

- Curiosity and openness toward integrating AI into day-to-day SRE and DevOps workflows

- Experience using AI-assisted development tools (GitHub Copilot, Claude Code, Cursor, or similar)

Why Join Us?

- Impact: Be a key contributor to a platform that will serve enterprise customers globally

- Growth: Work alongside experienced engineers and expand your skills in cloud-native development

- Autonomy: Take ownership of significant features and architectural decisions

- Culture: Join a collaborative team that values quality, simplicity, and pragmatic solutions

- Location: Work from our Reykjavík office with a team passionate about building great software