Why this role matters:
We are building cutting-edge AI solutions within the boundaries of the Italian National Cloud Strategy. You won’t just be clicking buttons in AWS; you will be architecting compliant, self-hosted AI platforms on sovereign infrastructure. This is a role for an engineer who understands how to build cloud-native reliability without relying on public hyperscalers.
Role Summary
As a DevOps Engineer for Sovereign AI, you will design and maintain the infrastructure that powers our AI/ML workloads on Italian National Cloud providers (e.g., PSN, Aruba, TIM Enterprise). Your focus will be on self-managed Kubernetes, strict Data Sovereignty, and implementing open-source MLOps toolchains that function independently of US-based public clouds.
Key Responsibilities
1. Sovereign Infrastructure & Orchestration
- Deploy and manage production-grade Kubernetes clusters on private cloud or national provider infrastructure (using Rancher, OpenShift, or Kubespray).
- Manage the underlying virtualization layer (e.g., OpenStack or VMware vSphere) when a deployment requires access below Kubernetes, down to bare metal.
- Ensure high availability and disaster recovery within the specific zones/regions of the national provider.
2. Self-Hosted MLOps
- Because managed services such as SageMaker or Vertex AI are not available to us, you will architect and maintain a self-hosted MLOps stack using tools like Kubeflow, MLflow, or Polyaxon.
- Configure and optimize MinIO or Ceph for S3-compatible object storage to handle large training datasets locally.
- Manage container registries (Harbor) located strictly within Italian borders.
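To give a flavor of the day-to-day work, here is a minimal Python sketch of a guardrail that flags container images pulled from anywhere other than an approved in-country registry. The host `harbor.example.it` and the helper names are hypothetical placeholders, not our actual setup, and the parsing follows Docker's usual convention for telling a registry host apart from a repository path.

```python
# Hypothetical allowlist: the Harbor host name is a placeholder, not a real endpoint.
ALLOWED_REGISTRIES = {"harbor.example.it"}


def registry_of(image: str) -> str:
    """Return the registry host of an image reference.

    Follows the common Docker convention: the first path component is a
    registry only if it contains a dot or a colon, or is "localhost".
    Anything else is treated as a Docker Hub image.
    """
    first, sep, _ = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


def disallowed_images(images):
    """Return the image references that are not served by an allowed registry."""
    return [img for img in images if registry_of(img) not in ALLOWED_REGISTRIES]


if __name__ == "__main__":
    manifest_images = [
        "harbor.example.it/ml/trainer:1.0",
        "nginx:1.25",
        "ghcr.io/org/tool:2",
    ]
    for img in disallowed_images(manifest_images):
        print(f"not from an approved registry: {img}")
```

In practice a check like this would more likely live in an admission controller or a CI step; the sketch only illustrates the rule being enforced.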
3. Compliance & Security (GDPR/AgID)
- Strictly enforce Data Sovereignty principles; ensure no data leaves Italy/EU.
- Implement security standards compliant with AgID (Agenzia per l’Italia Digitale) and ACN (Agenzia per la Cybersicurezza Nazionale) guidelines.
- Manage strict network policies (Calico/Cilium) and air-gapped or proxy-restricted environments.
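As an illustration of the kind of egress control described above, the following Python sketch checks destination addresses against an allowlist of CIDR ranges. The ranges shown are placeholders chosen for the example (a private range and a documentation range), not real provider networks, and the function names are hypothetical. Real enforcement would sit in Calico or Cilium policies; this only shows the logic such a policy encodes.

```python
import ipaddress

# Placeholder CIDR ranges for illustration only; real values would come
# from the provider's documented in-country network ranges.
ALLOWED_EGRESS = [
    ipaddress.ip_network(cidr) for cidr in ("10.0.0.0/8", "192.0.2.0/24")
]


def is_egress_allowed(dest: str) -> bool:
    """Return True if the destination IP falls inside an allowed range."""
    addr = ipaddress.ip_address(dest)
    return any(addr in net for net in ALLOWED_EGRESS)


def blocked_destinations(destinations):
    """Return the destinations that a default-deny egress policy would reject."""
    return [d for d in destinations if not is_egress_allowed(d)]


if __name__ == "__main__":
    for dest in blocked_destinations(["10.1.2.3", "192.0.2.55", "8.8.8.8"]):
        print(f"egress would be denied: {dest}")
```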
4. GPU & Hardware Optimization
- Configure NVIDIA vGPU or PCI passthrough on virtualized national cloud instances.
- Optimize the AI stack (CUDA drivers, NVIDIA Container Toolkit) for maximum performance on constrained infrastructure.
- Apply experience with serverless GPU usage where it fits the workload.
Tech Stack & Skills
The environment is “Cloud Native” but relies heavily on open-source and self-hosted equivalents of public cloud services.
| Domain | Technologies |
| --- | --- |
| Cloud Environment | Italian National Cloud (PSN, TIM, Aruba, Almaviva) |
| Orchestration | Red Hat OpenShift, SUSE Rancher, or vanilla Kubernetes |
| Virtualization | OpenStack, KVM, VMware |
| Storage | Ceph, Rook, GlusterFS, MinIO (S3-compatible) |
| AI/ML Platform | Kubeflow (crucial), MLflow, JupyterHub |
| CI/CD | GitLab CI (self-hosted), Jenkins, ArgoCD |
| Observability | Prometheus, Grafana, Loki (PLG stack) |
Qualifications
Required:
- 3+ years of experience in System Engineering, DevOps, or SRE.
- Mastery of Kubernetes: You must know how to deploy and fix K8s when you don’t have a “Support” button from Google or Amazon.
- Deep experience with Linux system administration (RHEL, Ubuntu, CentOS).
- Understanding of Data Sovereignty and GDPR regulations in a technical context.
- Proficiency in Python and Bash scripting.
Preferred:
- Experience migrating workloads from AWS/Azure to Private/National Clouds.
- Knowledge of GitOps principles (using ArgoCD or Flux).
- Experience working with Public Sector clients or heavily regulated industries (Finance, Healthcare).
- Italian language fluency (often required for documentation with national providers).
What We Offer
- Competitive salary tailored to the Italian market.
- Opportunity to work on high-impact projects within the National Strategic framework.
- [Welfare Plans / Buoni Pasto / CCNL Metalmeccanico or Commercio level].
- Training budget for Kubernetes (CKA/CKS) and Red Hat certifications.