Wen-Kai "SansWord" Huang
"A data platform builder interested in AI systems and their mathematical foundations."
Senior Software Engineer with 9+ years of experience designing and operating large-scale distributed systems and global data infrastructure on AWS and GCP, serving ~900M monthly active users across the US, APAC, and Europe. Specialized in high-throughput, fault-tolerant pipelines (>20K msgs/sec), scalable data platforms, and cloud-native migrations. Proven technical leader and mentor recognized for cross-functional collaboration and driving large-scale impact across globally distributed teams. Currently exploring agentic development and harness engineering through hands-on projects and public writing, applying measurement discipline (cost, model comparison, workflow design) to AI-assisted software engineering.
- Led end-to-end delivery of Yahoo's data infrastructure evolution to GCP, heading a dedicated team of 5+ engineers that built a new GCP-native ingestion platform on Cloud Composer, Airflow, and Dataproc and improved data freshness from 3+ hours to under 60 minutes across the US, APAC, and Europe.
- Contributed to Yahoo's hybrid cloud data infrastructure, supporting a zero-downtime AWS migration with Kubernetes-based auto-scaling and maintaining the multi-region lambda architecture serving recommendation, ads, and downstream systems across News, Sports, and Finance.
- Identified and resolved large-scale query inefficiencies, building aggregation and caching layers that reduced data scanned from 725 GB to 121 KB and cut query costs by 70% on Yahoo's BigQuery infrastructure.
- Drove an on-call reliability initiative, resolving root causes and improving monitoring to reduce open incident tickets from 1,000+ to zero; established practices subsequently adopted by the broader on-call team.
- Led cross-functional collaboration between ML research and engineering to productionize ranking models; co-invented a patented salient entity algorithm (US11803605B2).
- Drove early AWS adoption across Yahoo engineering, building blue/green deployment and auto-scaling patterns via ECS that were later adopted by global teams.
- Contributed to migrating 80 TB from in-house object storage to S3 with zero downtime, implementing dual-write/dual-read patterns to ensure data consistency and a safe cutover.
- Scaled Pigeon, a high-availability Apache Pulsar-based message bus sustaining 20K msgs/sec with data consistency guarantees; drove migration from Apache ActiveMQ and improved MTBF from 3.5 days to 90 days (~26×).
- Core developer of Cupid, a fault-tolerant multi-tenant discount service; invented a flexible JSON-based rule engine that lets teams configure promotions independently, driving coupon redemption growth of 10K/day and boosting annual sales events.
Public write-up of 3 weeks going from AI skeptic to shipping 3 projects with Claude; reflects on where AI fills knowledge gaps vs. where senior judgment still steers design decisions (e.g. introducing a CubeDriver interface to decouple BLE from pointer/touch input).
Self-authored study guide covering interview skills, a full worked leaderboard design (Redis sorted sets, fan-out, sharding, rank problem), and a companion note on using AI to prepare for system design interviews.
Measured notes on Claude Code workflows: 7.8× cost delta between Sonnet 4.6 and Opus 4.7 on an identical feature run; per-story-point cost analysis; definition of harness engineering.
Real-time smart-cube solve analyzer with BLE ingestion, live 3D rendering, phase detection (CFOP/Roux), and opt-in Firestore cloud sync. Deliberately picked an unfamiliar stack to stress-test AI-assisted workflows on greenfield code.
Co-invented methods to identify salient entities in articles at scale and apply them in production recommendation systems at Yahoo.
| Languages | Python, Java, SQL, Shell, Groovy |
| Cloud — GCP | Dataproc, Cloud Composer, Dataflow, Airflow, BigQuery |
| Cloud — AWS | S3, ECS, ELB, ElastiCache |
| Infrastructure | Terraform, Kubernetes, Docker, Redis |
| Stream Processing | Apache Storm, Apache Pulsar, Kafka |
| AI Coding Practice | Agentic Programming, Harness Engineering |
| Writing & Learning | Technical blog posts, study guides, public writing |