Francisco Mendes (theRTLmaker)

Low-level systems engineer working at the boundary of hardware, GPUs and ML performance

👋 About me

I work close to the metal.

My background is in CPU architecture, verification and memory systems, but my day-to-day curiosity lives in low-level programming: GPU kernels, performance modeling, and how real workloads stress real hardware.

I care about understanding systems end-to-end, from cache lines and warps up to training loops and frameworks, and using that understanding to make things faster and more predictable.

🔭 Current focus

100 Days of CUDA

A hands-on deep dive into GPU programming, kernel design and performance behavior.

Repo: https://github.com/theRTLmaker/CUDA_in_100_days

🌱 Actively building depth in

CUDA and GPU kernel optimization
GPU memory hierarchies and profiling
Performance modeling and benchmarking
ML training workloads and system bottlenecks

⚙️ Technical toolkit

Low-level C++ and performance-oriented Python
CUDA programming and GPU profiling tools
CPU microarchitecture, caches and coherency
SystemVerilog and hardware-software interfaces

🧠 Interests

GPU and accelerator programming
ML systems and performance engineering
Hardware-aware software design
Debugging at uncomfortable layers
Making abstractions earn their keep

📬 Connect

Always up for conversations about GPUs, low-level systems, performance engineering or how software really hits the hardware.
