Stack
Technologies I use to build GPU-accelerated AI systems.
GPU Computing & HPC
High-Performance Computing and GPU-Accelerated Systems
- CUDA (Expert)
Custom kernel development and optimization
- A100 / H100 (Expert)
NVIDIA enterprise GPU architectures
- NCCL (Expert)
Multi-GPU communication optimization
- TensorRT (Advanced)
GPU inference optimization
- Triton (Advanced)
GPU kernel development framework
- cuDNN (Advanced)
Deep learning GPU primitives
AI Infrastructure
Distributed AI Systems and Machine Learning Platforms
- PyTorch (Expert)
Distributed training and inference
- Transformers (Expert)
Large language model architectures
- Distributed Training (Expert)
Multi-node, multi-GPU scaling
- Model Optimization (Expert)
Quantization, pruning, distillation
- MLOps (Advanced)
Production ML pipeline management
- Inference Serving (Advanced)
High-throughput model serving
Systems Programming
Low-level systems and performance optimization
- Rust (Expert)
Systems programming and cryptography
- Python (Expert)
AI/ML development and automation
- C++ (Advanced)
Performance-critical applications
- SPDM Protocol (Advanced)
Hardware security attestation
- Cryptography (Advanced)
ECDSA, secure protocols
Cloud & Infrastructure
Enterprise-scale deployment and orchestration
- Kubernetes (Expert)
GPU workload orchestration
- Docker (Expert)
Containerized GPU applications
- AWS (Advanced)
EC2 P4/P5 GPU instances
- Slurm (Advanced)
HPC cluster management
- Prometheus (Advanced)
GPU metrics and monitoring