🌐 Web UI First · v0.2.1 on PyPI

nsys-ai

Open Nsight Systems profiles directly in a browser-first timeline UI, then drill into terminal tools when you need them.

$ pip install nsys-ai
GitHub PyPI Python MIT
nsys-ai profile.nsys-rep
GPU 0 | 39.0s – 42.0s
39.127s  39.532s  39.937s
N0 [───────── Iteration 142 ──────────────────────────────────────────────────]
N1 [── forward ──────────] [── backward ─────────────────────────────────]
N2 [Attn] [MLP] [Norm] [Attn▸bwd] [MLP▸bwd] [AllReduce]
───────────────────────────────────────────────────────
S7 flash_fwd 26ms ████████░░████████████░██░░░████████████████░░██████████████████████
S12 ██░░████░░░░░░░░██░░░░░░████████░░░░░░░░░░░░░░░██████
S15 █████████████████████████░░░░░░░░░███████████████████████████████
───────────────────────────────────────────────────────
39.128s→39.154s │ 26.2ms [S7] │ flash_fwd_splitkv_kernel
└─ Iteration 142
   └─ forward
      └─ ▸ Attention
         ▶ flash_fwd_splitkv_kernel 26.2ms

🚀 Try It Now

Get started in 60 seconds with a real GPU training profile

Quick Start (Example 20: Megatron-LM DistCA)

# 1. Install
$ pip install nsys-ai

# 2. Clone & download a real profile (~700 MB)
$ git clone https://github.com/GindaChen/nsys-ai && cd nsys-ai
$ cd examples/example-20-megatron-distca
$ python download_data.py

# 3. Open the web viewer (default command)
$ nsys-ai output/megatron_distca.nsys-rep

# 4. Optional: profile overview in terminal
$ nsys-ai info output/megatron_distca.sqlite

# 5. Optional: terminal timeline TUI
$ nsys-ai timeline output/megatron_distca.sqlite --gpu 4 --trim 39 42

See all examples → examples/

🌐

Web Timeline

Default experience: multi-GPU browser viewer with progressive rendering, NVTX hierarchy bars, pinch-to-zoom, trackpad pan, and AI chat. No --trim required.

🖥️

Timeline TUI

Perfetto-style horizontal timeline with per-stream kernel visualization, NVTX hierarchy bars, and time-cursor navigation. Zoom, pan, and bookmark positions.

🌲

Tree TUI

Interactive NVTX hierarchy browser. Expand/collapse nodes, see kernel durations, filter by name or threshold, and navigate with vim-style keybindings.

🌐

HTML Exports

Generate interactive HTML viewers, Perfetto JSON traces for ui.perfetto.dev, and flat CSV/JSON exports for scripting and sharing.
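
As a rough illustration of what a Perfetto-loadable JSON trace looks like, the sketch below converts kernel records into the Chrome Trace Event format that ui.perfetto.dev accepts. The helper name `kernels_to_trace_events` and the record keys (`name`, `start_ns`, `end_ns`, `gpu`, `stream`) are assumptions made for this example and are not nsys-ai's actual exporter or schema.

```python
import json


def kernels_to_trace_events(kernels):
    """Convert kernel records to Chrome Trace Event "complete" events.

    `kernels` is an iterable of dicts with hypothetical keys:
    name, start_ns, end_ns, gpu, stream.
    """
    events = []
    for k in kernels:
        events.append({
            "name": k["name"],
            "ph": "X",                                   # complete event: start + duration
            "ts": k["start_ns"] / 1000.0,                # trace timestamps are in microseconds
            "dur": (k["end_ns"] - k["start_ns"]) / 1000.0,
            "pid": k["gpu"],                             # one "process" row per GPU
            "tid": k["stream"],                          # one "thread" row per stream
        })
    return {"traceEvents": events}


# Made-up sample record, loosely modeled on the timeline preview above.
sample = [{
    "name": "flash_fwd_splitkv_kernel",
    "start_ns": 39_128_000_000,
    "end_ns": 39_154_200_000,
    "gpu": 0,
    "stream": 7,
}]
trace = kernels_to_trace_events(sample)
trace_json = json.dumps(trace, indent=2)
```

Writing `trace_json` to a file gives something that can be opened in ui.perfetto.dev. Mapping GPUs to `pid` and streams to `tid` is one possible layout choice, not necessarily the one nsys-ai uses.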

🔍

Smart Search

Search kernels and NVTX annotations by name, with hierarchical search support. Filter by GPU, time window, and parent NVTX scope.
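
To make the filter semantics concrete, here is a minimal sketch of a search over kernel records by name, GPU, time window, and parent NVTX scope. The function `search_kernels` and the record keys (`name`, `gpu`, `start_s`, `end_s`, `nvtx_path`) are hypothetical and chosen only for illustration; nsys-ai's real search may behave differently.

```python
def search_kernels(kernels, name=None, gpu=None, window=None, parent=None):
    """Return kernel records matching every filter that is provided.

    name   : case-insensitive substring of the kernel name
    gpu    : GPU index
    window : (start_s, end_s); keeps kernels that intersect the window
    parent : NVTX range name that must appear in the kernel's NVTX path
    """
    hits = []
    for k in kernels:
        if name is not None and name.lower() not in k["name"].lower():
            continue
        if gpu is not None and k["gpu"] != gpu:
            continue
        if window is not None and not (k["start_s"] < window[1] and k["end_s"] > window[0]):
            continue
        if parent is not None and parent not in k["nvtx_path"]:
            continue
        hits.append(k)
    return hits


# Made-up sample data for the example.
kernels = [
    {"name": "flash_fwd_splitkv_kernel", "gpu": 0, "start_s": 39.128, "end_s": 39.154,
     "nvtx_path": ["Iteration 142", "forward", "Attention"]},
    {"name": "flash_bwd_kernel", "gpu": 0, "start_s": 39.600, "end_s": 39.650,
     "nvtx_path": ["Iteration 142", "backward", "Attention"]},
    {"name": "ncclAllReduce", "gpu": 1, "start_s": 39.900, "end_s": 39.930,
     "nvtx_path": ["Iteration 142", "backward"]},
]

forward_flash = search_kernels(kernels, name="flash", gpu=0,
                               window=(39.0, 42.0), parent="forward")
```

Here `forward_flash` contains only the forward-pass flash kernel, because the backward kernel fails the parent-scope filter and the NCCL kernel fails the name and GPU filters.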

📊

Overlap Analysis

Compute vs NCCL overlap, collective breakdowns by type, and automatic training iteration detection for distributed workloads.
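
One way to think about compute vs NCCL overlap is as the intersection of two sets of time intervals. The sketch below merges each set and sums the intersecting time; the function names and the `(start, end)` tuple layout are assumptions for this example and do not describe nsys-ai's internal implementation.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def overlap_time(compute, nccl):
    """Total time during which at least one compute kernel and one NCCL kernel both run."""
    a, b = merge_intervals(compute), merge_intervals(nccl)
    i = j = 0
    total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


# Made-up intervals in arbitrary time units.
compute = [(0, 10), (5, 20), (30, 40)]
nccl = [(15, 35)]

overlapped = overlap_time(compute, nccl)
nccl_total = sum(end - start for start, end in merge_intervals(nccl))
hidden_fraction = overlapped / nccl_total
```

With these sample intervals, 10 of the 20 units of NCCL time coincide with compute, so half of the communication is hidden behind compute.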

🤖

AI Analysis

Optional AI module for auto-commentary on kernel distributions, NVTX annotation suggestions, and bottleneck detection.

Keyboard-Driven Navigation

Navigation

← →  Pan through time
↑ ↓  Select stream
Tab  Snap to next kernel
+ -  Zoom in / out
a  Toggle time mode

Analysis

/  Filter by name
m  Min duration filter
d  Demangled names
C  Config panel
h  Help overlay

Bookmarks

B  Save bookmark
'  Bookmark list
, .  Cycle bookmarks
`  Jump back
[ ]  Range bookmark

📚 Documentation

Comprehensive guides for Nsight Systems profiling

Interactive Tools