Data Science meets
Full-Stack Execution.

Diego Villagran Salazar — Data Scientist & Full-Stack Developer.
I build machine learning systems, analytics products, and scalable web apps that create measurable business impact.

View Projects

Featured Projects

Selected systems // 2024—2025

SYS_001

NYC Ride-Hailing Analytics Dashboard

Challenge

Transportation stakeholders needed reliable insight into Uber and Lyft trip patterns, pricing, and airport operations in New York City.

Solution

Built an interactive Streamlit analytics platform with predictive ML models, geospatial maps, and multi-tab operational dashboards.

Impact

Delivered fare prediction with R² > 0.85 and airport classification with 92% accuracy for practical decision support.

SYS_002

India Air Quality ETL Intelligence System

Challenge

Raw environmental data from hundreds of IoT sensors was fragmented and difficult to convert into policy-ready insights.

Solution

Designed a cloud ETL architecture using Azure Databricks, PySpark, PostgreSQL, and BI reporting for continuous analytics.

Impact

Processed 2M+ daily records from 500+ sensors and transformed noisy streams into consistent, actionable health indicators.

SYS_003

Code Master — Interactive Learning Platform

Challenge

Students needed a more engaging and structured way to practice coding with feedback and measurable progress.

Solution

Developed a gamified full-stack platform with Astro/Django architecture, secure auth, and learning-oriented UX.

Impact

Scaled to 10k+ users and earned 2nd place in the 2024 EdTech Innovation Awards.

SYSTEMS ARCHITECTURE // HIGH-FIDELITY INTERFACES // EDITORIAL DESIGN // SCALABLE ENGINEERING

Systems & Capabilities

Machine Learning Pipelines

From preprocessing and feature engineering to training, evaluation, and deployment of predictive models.

Data Engineering

ETL orchestration with Python, PySpark, SQL, and cloud platforms for reliable high-volume analytics workflows.

Analytics Products

Interactive dashboards and decision systems with Streamlit and Power BI focused on real-world business metrics.

Web Platform Development

Scalable full-stack applications with Next.js, React, TypeScript, and cloud-ready deployment practices.

Working Principles

“Turn complex data into clear decisions and scalable products.”

01. Impact over output

I prioritize measurable outcomes: model accuracy, decision quality, processing speed, and business value.

02. End-to-end ownership

I build complete systems, from data collection and cleaning to production deployment and monitoring.

03. Clarity at scale

Good architecture keeps complexity contained so teams can iterate quickly without breaking reliability.

Technical Stack & Tooling

Python, Pandas, NumPy, Scikit-learn, TensorFlow, PyTorch, PySpark, SQL, PostgreSQL, Power BI, Streamlit, Next.js, React, TypeScript, Tailwind CSS, Docker, Kubernetes, Azure, AWS, Google Cloud, Git, GitHub

Let’s build something
intelligent and useful.

diegovillasal@gmail.com
2026 — Diego Villagran