Federico Cipriani

Summary

Lifelong programmer with 7+ years across Software Engineering, DevOps, and Platform roles. I specialize in Kubernetes, CI/CD, and distributed systems, with a strong focus on cost-aware infrastructure, observability, stress testing, and chaos engineering. Experienced Linux user passionate about scalable, secure systems and developer enablement. Feel free to ask me anything!

Education

Electronic Technician

Otto Krause Buenos Aires, Argentina

Space Systems Engineering

Universidad Nacional de San Martín (UNSAM) Buenos Aires, Argentina

Languages

I am an EU citizen (Italian) with native proficiency in Spanish and C2 proficiency in English.

Experience

Site Reliability Engineer

Argus Labs Remote: Buenos Aires, Argentina
02/2025 - Today

Migrated Pulumi Stacks from TypeScript to Golang to improve maintainability and performance.

Developed and maintained Kubernetes Custom Resource Definitions (CRDs) to power a bespoke Operator controller.

Designed and operated CI/CD pipelines using GitHub Actions, ArgoCD, AWS (ECR, EKS), n8n, and Buildkite.

Provisioned cloud-native observability and messaging stacks, including Groundcover, Jaeger, Prometheus, and NATS.

Adopted a Platform Engineering approach to empower developers through self-service infrastructure.

Contributed to Architecture Decision Records (ADRs) as part of ongoing technical collaboration.

Founder, Engineer

Engineered Software Remote: Buenos Aires, Argentina
10/2024 - Today

Built the project around sharing technical insights and business analysis through curated content and original writing.

Explored and prototyped multiple SaaS ideas focused on developer tools, technical content platforms, and the GenAI tech boom.

Designed and deployed self-hosted infrastructure for landing pages, forms, CI/CD pipelines, and observability.

Maintain a regular ideation and research loop to refine product-market fit and architectural approaches.

Site Reliability Engineer

Yuno Remote/Hybrid: Buenos Aires, Argentina & Bogotá, Colombia
06/2022 - 12/2024

Managed multiple AWS accounts with Terraform, including peering for specific purposes.

Provisioned and maintained several EKS clusters and managed their service mesh.

Engaged with cloud provider teams to assess technical and cost-fit solutions for our needs.

Designed dashboards and monitors for SLOs, collaborating on SLI definitions with various teams.

Implemented near real-time Change Data Capture (CDC) pipelines for the Data team's BI needs.

Met weekly to analyze cloud spending and plan FinOps automation strategies for AWS, Datadog, and Snowflake.

Co-led Reliability Track sessions, leveraging Testkube and building a Python module to scale up EKS resources on demand.

Developed several FastAPI backends to integrate Argo, AWS, Datadog, Kubernetes, Opsgenie, and Slack.

Implemented Ansible roles for diverse tasks, including CI/CD pipelines and managing PostgreSQL roles and grants.

Collaborated with SecOps on PCI-DSS and ISO 27001 compliance audits to meet industry standards.

Site Reliability Engineer

Agot AI (Acquired) Remote/Hybrid: Buenos Aires, Argentina & Pittsburgh, US
03/2021 - 05/2022

Provisioned a hybrid on-premises cloud with K3s on NVIDIA Jetson and inference servers (with the LGTM stack).

Review and refactoring of technology stacks in both FrontEnd and BackEnd areas.

Quality control and code sanitization of all projects through GitLab CI pipeline jobs (ShellCheck, Pytest, etc.).

Migration of EC2 instances to AWS Batch and MWAA to optimize Machine Learning SDLC pre-processing pipelines.

Research and benchmarking of cloud-based Machine Learning training tools (SageMaker, MLflow, and Kubeflow).

Budget and compatibility planning for on-premises inference servers.

Software Engineer

Democracia en Red Remote: Buenos Aires, Argentina
11/2020 - 04/2021

Development of full-stack solutions with Svelte and FastAPI technologies for a civic software platform.

Integration with Blockchain Federal Argentina (BFA) Proof of Authority (PoA) network for an e-voting system.

Provisioning of resources in both DigitalOcean and Amazon Web Services (AWS) public clouds.

Prototyping, implementation, and maintenance of standardized solutions.

Software Engineer

Nevrona Remote: Buenos Aires, Argentina
04/2019 - 12/2020

Served as Lead Developer and Architect for the FrontEnd & Middleware team.

Co-led development of a comprehensive work and social networking platform from inception.

Designed and built administrative interface (backoffice) for platform management.

Authored technical documentation and established FrontEnd and general development guidelines.

Conducted technical interviews and recruitment for both BackEnd and FrontEnd teams.

Systems Administrator

APLI On-site: Buenos Aires, Argentina
05/2017 - 12/2019

Remote and on-site support of server infrastructure.

Development of scripts and utilities for the shell in Bash, Perl, and Python.

Monitoring and alarm systems with Nagios, Icinga, Nagstamon, and Prometheus.

Built a Raspberry Pi-based monitor wall for a Network Operations Center (NOC).

Licenses & certifications

I currently hold the following certified exams and courses:

HashiCorp Certified

  • Terraform Associate (003).

Linux Foundation

Cloud-native & Automation

  • Certified Argo Project Associate (CAPA).
  • Certified GitOps Associate (CGOA).
  • Certified Kubernetes Administrator (CKA).
  • Certified Kubernetes Application Developer (CKAD).
  • Kubernetes and Cloud Native Associate (KCNA).
  • Kubernetes and Cloud Native Security Associate (KCSA).
  • Scaling Cloud Native Applications with KEDA (LFEL1014).

Security & Compliance

  • Ethics for Open Source Development (LFC104).
  • Developing Secure Software (LFD121).
  • Security Self-Assessments for Open Source Projects (LFEL1005).
  • Securing Projects with OpenSSF Scorecard (LFEL1006).
  • Automating Supply Chain Security - SBOMs and Signatures (LFEL1007).
  • XSS Exploits and Defenses (LFEL1010).

See the above-listed certifications and all of my other verified credentials on Credly and CertDirectory.

Technical Portfolio

Here’s a brief list of some of the most interesting projects I’ve worked on, which I may still be actively involved in.

Wormhole VAA Observer

The Wormhole VAA Observer is an ongoing project that monitors and analyzes Verified Action Approvals (VAAs) within the Wormhole cross-chain messaging protocol. It features REST and gRPC microservices, developed in Rust, showcasing both protocol architectures. I've also provided infrastructure-as-code (IaC) manifests, along with a guide, for deploying the solution on my own home lab Kubernetes cluster. For me, it was an excellent project to apply methodologies like Architecture Decision Records (ADRs), Domain-Driven Design (DDD), and the Repository pattern, among other techniques.

Hacklab

Hacklab is an Infrastructure-as-Code (IaC) project aimed at automating the provisioning, configuration, and deployment of a hybrid Kubernetes cluster that spans both on-premises nodes and a cloud VPS connected via WireGuard. By leveraging declarative tools such as Nix, Terraform, Helm, and Kustomize, Hacklab manages the entire lifecycle of infrastructure components, including Kubeadm+K3s clusters, GPU-enabled workloads, and the LGTM observability stack. The project includes production-ready Kubernetes manifests for a variety of services, such as MinIO object storage, HashiCorp Vault, and a comprehensive observability suite featuring Prometheus, Loki, Grafana, Tempo, and Alloy. Additionally, I’ve incorporated experimental integrations with Kubeflow for machine learning workflows, exploring potential use cases in a highly scalable setup for a Cloud Development Environment (CDE) home lab.

Dotfiles

Includes an opinionated, modular setup to provision a racked server, personal laptop, and Raspberry Pi cluster.
Supports optional modules for Kubernetes, CUDA, container runtimes, and development tools using Nix Flakes.

Skillset

Cloud & Infrastructure

Ansible, Argo, AWS Ecosystem (ACM, CE, CloudFormation, CloudWatch, CodeArtifact, EC2, ECR, ElastiCache, EKS, IAM, KMS, Lambda, RDS, Route53, S3, Secrets Manager, SQS, VPC), CI/CD, FinOps, Infrastructure as Code (IaC), Istio, GitOps, GitHub Actions, Helm, Kustomize, Kubernetes, K3s, KEDA, NixOS, Service Mesh, Terraform, Terragrunt, Testkube, Velero.

Monitoring & Reliability

Alertmanager, Alloy, Business Continuity Plan (BCP), Chaos Mesh, Datadog, Distributed Tracing, Fluentd, Grafana, Prometheus, Loki, Observability Instrumentation, Incident Response Management, Opsgenie, Post Mortem Writing, Chaos Engineering, RPO/RTO, Statuspage, Stress Testing.

Security & Compliance

AIDE, CIS, CVEs/CWEs, Encryption, GDPR, HIDS/NIDS, LUKS, NIST Compliance, MITRE ATT&CK, Nmap, OWASP, PCI-DSS, SAST, SBOMs, STRIDE, Security Self-Assessment, TCPDump/LibPCap, WAF, Wireshark, Zero Trust Architecture.

Data Engineering

Airbyte, CDC, DVC, DynamoDB, Elasticsearch, ELT/ETL, Kafka, Kubeflow, MLOps, MSK, MySQL, MWAA, PeerDB, PostgreSQL, SageMaker.

Programming & Frameworks

ACID, Axum, CLEAN, CQRS, Deno, Express/Polka, FastAPI, Haskell, Nix/Flakes, Node.js, Python, Rust, Shell, Svelte/Kit, Typer, TypeScript.

Work Culture & Practices

Agile, Architecture Decision Records (ADRs), Follow-The-Sun (GDSE), KPI/OKR Alignment, Async Communication, Meeting Minutes & Outcome-Based Agendas.