Zakaria EL BAZI

Site Reliability Engineer

DevOps/SRE engineer with 7+ years of experience specializing in Kubernetes, cloud infrastructure, and large-scale platform reliability. Certified in Kubernetes (CKA, CKAD, CGOA, KCNA, KCSA), AWS, Azure, and Terraform.

Paris, France · zakaria [at] elbazi [dot] co

Experience

Paris, France

Site Reliability Engineer

Tenable

Tenable Identity Exposure

  • Built an internal CLI to automate daily SRE tasks on the legacy single-tenant Azure platform, reducing toil and standardizing operations
  • Operate at scale across thousands of VMs and hundreds of AKS clusters with reliable, automated workflows
  • Modernized observability with OpenTelemetry — consolidated Grafana Agent + multiple Prometheus instances + Azure Monitor Agents into a single-agent pipeline
  • Maintain and harden Azure while contributing to the platform's migration to AWS
  • Designed and executed a zero-downtime migration of 1000+ AKS clusters from ingress-nginx to Traefik
Azure · AKS · AWS · OpenTelemetry · Kubernetes · Traefik
Paris, France

DevOps Lead Engineer

NetApp — Spot Ocean for Apache Spark

  • Engineered resilient multi-cloud infrastructure using Terraform, Kubernetes/Helm, and CI/CD pipelines
  • Achieved SOC2 compliance, optimized cloud costs, and ensured 24/7 reliability
  • Developed tooling and CLI in Go to enhance developer experience
  • Partnered with Solutions Architects for customer-specific implementations
Terraform · Kubernetes · Helm · Go · CI/CD · SOC2
Paris, France

Data & DevOps Engineer

Société Générale

  • Migrated the Hortonworks big data environment to the Cloudera Data Platform
  • Maintained CI/CD pipelines (Jenkins, Nexus, SonarQube, Ansible/AWX)
  • Monitoring stack built on Elastic Stack
  • Kubernetes/Airflow proof of concept (POC) for ML-based workflows
  • Managed private-cloud infrastructure with Terraform
Jenkins · Terraform · Cloudera · Kubernetes · Airflow · Elastic Stack
Bordeaux, France

Data & DevOps Engineer

Excelerate Systems

  • Built secure big data architectures using Elastic Stack, SearchGuard, Postgres, Airflow, Docker, Kubernetes, and AWS
  • Solution architect for SearchGuard — security, alerting, and compliance for Elasticsearch
  • Consultant and trainer in the above technologies
Elasticsearch · SearchGuard · AWS · Docker · Kubernetes
Bordeaux, France

Deep Learning Engineer

Excelerate Systems

  • Computer Vision & Deep Learning on the edge for smart connected devices
Computer Vision · Deep Learning · Edge Computing
Bordeaux, France

Visiting Professor

IONIS Education Group

  • ESME-Sudria — Ecole Spéciale de Mécanique et d'Electricité
  • ISEGCOM — La grande école de communication
  • KEDGE Business School — e-MBA track
Rabat, Morocco

Machine Learning Engineer

AIOX Labs — AI Studio

  • Built an intelligent travel recommendation system using collaborative filtering with KNN and a microservice architecture
ML · KNN · Microservices

Education

Ingénieur d'État (Master of Engineering)

INSEA — Institut National de Statistique et d'Economie Appliquée

Data Engineering & Business Intelligence

Classes Préparatoires (CPGE)

CNC — Maths / Physics (MP)

Preparatory classes for grandes écoles

Certifications

CKA — Certified Kubernetes Administrator
CKAD — Certified Kubernetes Application Developer
KCNA / KCSA / CGOA — Kubernetes & GitOps Certifications
AWS Solutions Architect — Amazon Web Services
Terraform Associate — HashiCorp Certified
Azure Fundamentals — Microsoft Certified
Azure Data Fundamentals — Microsoft Certified
Big Data Developer — Mastery Award for Students

Skills

Cloud & Infrastructure

AWS · Azure · Terraform · Private Cloud

Containers & Orchestration

Kubernetes · Docker · Helm · AKS · EKS

CI/CD & DevOps

Jenkins · GitHub Actions · ArgoCD · Ansible · SonarQube

Observability

OpenTelemetry · Prometheus · Grafana · Thanos · Elastic Stack

Data Engineering

Apache Spark · Hadoop · Airflow · Cloudera · Elasticsearch · PostgreSQL

Programming & Tools

Go · Python · Bash · Git · Linux

Languages

French (professional) · English (professional) · Arabic (native) · Tamazight (native)

Projects, Talks & Publications

Open Source

CronJob Scale-Down Operator

Kubernetes operator that auto-scales down Deployments & StatefulSets during specific time windows

Talk

Low-Cost, Unlimited Metrics Storage with Thanos

Devoxx Morocco 2024 · Slides

Talk

Introduction: Machine Learning & Deep Learning

Sci-Land 2, FSTS · Feb 2019 · Slides

Talk

Machine Learning in Production

Talk'y Digital, XPR_CAMP · 2019 · Slides

Publication

Artificial Intelligence in CyberSecurity

Research publication

Blog

Technical Articles

K8s, Azure, Terraform and more

Honors & Awards

1st Prize, Innovative Project of the Year 2018 · AUSIM Maroc, Innov'it competition
