GMP 调度模型详解

面试题：GMP 指的是什么？为什么要有 P？

GMP 是 Go 语言运行时调度器的核心模型，是理解 Go 并发机制的关键。

GMP 三要素

G (Goroutine)

定义：用户态轻量级线程
特点：栈大小可伸缩（2KB-1GB）
状态：运行、阻塞、等待等多种状态
调度：由 Go 运行时调度，而非操作系统

M (Machine)

定义：系统线程（OS Thread）
职责：执行 Goroutine 代码
数量：默认限制为 10000 个
管理：由 Go 运行时创建和销毁

P (Processor)

定义：逻辑处理器
职责：维护本地 Goroutine 队列
数量：默认等于 CPU 核心数
作用：提供执行上下文

详细架构图

点击查看完整代码实现

┌─────────────────────────────────────────────────────────────┐
│                    Go Scheduler                              │
├─────────────────────────────────────────────────────────────┤
│  ┌─────┐    ┌─────┐    ┌─────┐    ┌─────┐    ┌─────┐       │
│  │  P  │    │  P  │    │  P  │    │  P  │    │ ... │       │
│  │ ┌─┐ │    │ ┌─┐ │    │ ┌─┐ │    │ ┌─┐ │    │     │       │
│  │ │G│ │    │ │G│ │    │ │G│ │    │ │G│ │    │     │       │
│  │ ├─┤ │    │ ├─┤ │    │ ├─┤ │    │ ├─┤ │    │     │       │
│  │ │G│ │    │ │G│ │    │ │G│ │    │ │G│ │    │     │       │
│  │ ├─┤ │    │ ├─┤ │    │ ├─┤ │    │ ├─┤ │    │     │       │
│  │ │G│ │    │ │G│ │    │ │G│ │    │ │G│ │    │     │       │
│  └─│─┘ │    └─│─┘ │    └─│─┘ │    └─│─┘ │    └─────┘       │
│    │   │      │   │      │   │      │   │                  │
│  ┌─▼─┐ │    ┌─▼─┐ │    ┌─▼─┐ │    ┌─▼─┐ │                  │
│  │ M │ │    │ M │ │    │ M │ │    │ M │ │                  │
│  └───┘ │    └───┘ │    └───┘ │    └───┘ │                  │
└─────────────────────────────────────────────────────────────┘
│                                                               │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │               Global Run Queue                         │ │
│  │  ┌─┐  ┌─┐  ┌─┐  ┌─┐  ┌─┐  ┌─┐  ┌─┐  ┌─┐  ┌─┐        │ │
│  │  │G│  │G│  │G│  │G│  │G│  │G│  │G│  │G│  │G│  ...   │ │
│  │  └─┘  └─┘  └─┘  └─┘  └─┘  └─┘  └─┘  └─┘  └─┘        │ │
│  └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

::: :::

核心数据结构

点击查看完整代码实现

// 简化的 G 结构
type g struct {
    stack       stack     // 栈信息
    stackguard0 uintptr   // 栈保护
    goid        int64     // goroutine ID
    gopc        uintptr   // 创建位置
    startpc     uintptr   // 函数起始地址
    waiting     sudog     // 等待信息
    param       unsafe.Pointer // 参数
    atomicstatus uint32   // 状态
    // ... 更多字段
}

// 简化的 M 结构
type m struct {
    g0      *g       // 调度栈
    curg    *g       // 当前运行的 g
    p       puintptr // 当前关联的 P
    nextp   puintptr // 下一个 P
    park    note     // 休眠信号
    alllink *m       // 链表指针
    spinning bool    // 是否在自旋
    // ... 更多字段
}

// 简化的 P 结构
type p struct {
    id          int32        // P 的 ID
    status      uint32       // P 的状态
    link        puintptr     // 链表指针
    m           muintptr     // 关联的 M
    mcache      *mcache      // 内存缓存
    runqhead    uint32       // 本地队列头
    runqtail    uint32       // 本地队列尾
    runq        [256]guintptr // 本地运行队列
    runnext     guintptr     // 下一个要运行的 G
    // ... 更多字段
}

::: :::

为什么需要 P？

1. 历史演进

GM 模型的问题（Go 1.0 之前）：

┌─────────────────────────────────────┐
│           Global Run Queue          │
│  ┌─┐  ┌─┐  ┌─┐  ┌─┐  ┌─┐  ┌─┐   │
│  │G│  │G│  │G│  │G│  │G│  │G│   │
│  └─┘  └─┘  └─┘  └─┘  └─┘  └─┘   │
└─────────────────────────────────────┘
   │    │    │    │    │    │
   ▼    ▼    ▼    ▼    ▼    ▼
┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐
│  M  │ │  M  │ │  M  │ │ ... │
└─────┘ └─────┘ └─────┘ └─────┘

GM 模型的缺陷：

单一全局队列：竞争激烈，需要频繁加锁
M 直接调度 G：上下文切换开销大
缺乏局部性：无法利用 CPU 缓存
负载不均：无法做到工作窃取

2. GMP 模型的优势

引入 P 后的改进：

点击查看完整代码实现

// 模拟 GMP 模型的工作方式
package main

import (
    "fmt"
    "runtime"
    "sync"
    "sync/atomic"
    "time"
)

// 模拟 GMP 调度
type GMPSimulator struct {
    // 全局队列
    globalQueue chan int
    
    // P 的本地队列
    localQueues []chan int
    
    // 统计信息
    totalTasks    int64
    processedTasks int64
    
    // P 的数量
    numP int
}

func NewGMPSimulator(numP, queueSize int) *GMPSimulator {
    sim := &GMPSimulator{
        globalQueue: make(chan int, queueSize*numP),
        localQueues: make([]chan int, numP),
        numP:        numP,
    }
    
    // 为每个 P 创建本地队列
    for i := 0; i < numP; i++ {
        sim.localQueues[i] = make(chan int, queueSize)
    }
    
    return sim
}

// 模拟任务生产者
func (sim *GMPSimulator) producer(numTasks int) {
    for i := 0; i < numTasks; i++ {
        atomic.AddInt64(&sim.totalTasks, 1)
        
        // 随机选择一个 P 的本地队列
        pIndex := i % sim.numP
        
        select {
        case sim.localQueues[pIndex] <- i:
            // 成功加入本地队列
        default:
            // 本地队列满了，加入全局队列
            sim.globalQueue <- i
        }
    }
    
    // 关闭所有队列
    close(sim.globalQueue)
    for i := 0; i < sim.numP; i++ {
        close(sim.localQueues[i])
    }
}

// 模拟 M 处理器（工作者）
func (sim *GMPSimulator) worker(pIndex int, wg *sync.WaitGroup) {
    defer wg.Done()
    
    processTask := func(task int) {
        // 模拟处理任务
        time.Sleep(time.Microsecond)
        atomic.AddInt64(&sim.processedTasks, 1)
        if task%1000 == 0 {
            fmt.Printf("P%d 处理任务 %d\n", pIndex, task)
        }
    }
    
    for {
        select {
        // 优先从本地队列获取任务
        case task, ok := <-sim.localQueues[pIndex]:
            if !ok {
                goto checkGlobal
            }
            processTask(task)
            
        default:
            // 本地队列空，从全局队列获取
            goto checkGlobal
        }
        continue
        
    checkGlobal:
        select {
        case task, ok := <-sim.globalQueue:
            if !ok {
                // 全局队列也关闭了，尝试工作窃取
                if sim.tryWorkStealing(pIndex, processTask) {
                    continue
                }
                return // 没有更多工作
            }
            processTask(task)
            
        default:
            // 全局队列也空，尝试工作窃取
            if !sim.tryWorkStealing(pIndex, processTask) {
                time.Sleep(time.Microsecond) // 自旋等待
            }
        }
    }
}

// 工作窃取
func (sim *GMPSimulator) tryWorkStealing(thisPIndex int, processTask func(int)) bool {
    // 尝试从其他 P 的队列窃取任务
    for i := 0; i < sim.numP; i++ {
        if i == thisPIndex {
            continue
        }
        
        select {
        case task, ok := <-sim.localQueues[i]:
            if ok {
                fmt.Printf("P%d 从 P%d 窃取任务 %d\n", thisPIndex, i, task)
                processTask(task)
                return true
            }
        default:
            continue
        }
    }
    return false
}

func demonstrateGMPAdvantages() {
    fmt.Println("=== GMP 模型优势演示 ===")
    
    numP := runtime.NumCPU()
    numTasks := 10000
    
    sim := NewGMPSimulator(numP, 256)
    
    start := time.Now()
    
    // 启动生产者
    go sim.producer(numTasks)
    
    // 启动工作者
    var wg sync.WaitGroup
    for i := 0; i < numP; i++ {
        wg.Add(1)
        go sim.worker(i, &wg)
    }
    
    wg.Wait()
    duration := time.Since(start)
    
    fmt.Printf("处理 %d 个任务耗时: %v\n", numTasks, duration)
    fmt.Printf("实际处理任务数: %d\n", atomic.LoadInt64(&sim.processedTasks))
    fmt.Printf("吞吐量: %.2f tasks/ms\n", float64(numTasks)/float64(duration.Nanoseconds())*1000000)
}

func main() {
    demonstrateGMPAdvantages()
}

:::

3. P 的核心作用

1. 提供执行上下文

// P 包含了 G 执行所需的资源
type executionContext struct {
    mcache      *mcache      // 内存分配缓存
    deferpool   [5][]*defer  // defer 池
    sudogcache  []*sudog     // sudog 缓存
    sudogbuf    [128]*sudog  // sudog 缓冲区
    timers      *timers      // 定时器
}

2. 实现本地调度

点击查看完整代码实现

// 本地队列减少全局队列竞争
func localScheduling() {
    // 1. 优先从本地队列获取
    if gp := runqget(_p_); gp != nil {
        return gp
    }
    
    // 2. 从全局队列获取
    if gp := globrunqget(_p_, 0); gp != nil {
        return gp
    }
    
    // 3. 尝试工作窃取
    if gp := runqsteal(_p_, nil); gp != nil {
        return gp
    }
    
    return nil
}

:::

3. 支持工作窃取

点击查看完整代码实现

// 工作窃取算法
func runqsteal(_p_ *p, p2 *p) *g {
    if p2 == nil {
        // 随机选择一个 P
        p2 = allp[fastrand()%uint32(gomaxprocs)]
    }
    
    // 从 p2 的队列尾部窃取一半任务
    n := len(p2.runq)
    if n == 0 {
        return nil
    }
    
    // 窃取一半
    h := n / 2
    for i := 0; i < h; i++ {
        gp := p2.runq[n-1-i]
        p2.runq[n-1-i] = nil
        _p_.runq[len(_p_.runq)] = gp
    }
    
    return _p_.runq[0]
}

::: :::

GMP 调度流程

点击查看完整代码实现

// 简化的调度流程
func schedule() {
    _g_ := getg()
    _p_ := _g_.m.p.ptr()
    
    // 1. 检查全局队列（每 61 次）
    if gp := globrunqget(_p_, 1); gp != nil {
        execute(gp)
        return
    }
    
    // 2. 检查本地队列
    if gp := runqget(_p_); gp != nil {
        execute(gp)
        return
    }
    
    // 3. 执行工作窃取
    if gp := findrunnable(); gp != nil {
        execute(gp)
        return
    }
    
    // 4. 进入空闲状态
    stopm()
}

::: :::

性能对比

1. 锁竞争对比

点击查看完整代码实现

// GM 模型：全局锁竞争
var globalMutex sync.Mutex
var globalQueue []int

func gmModelEnqueue(task int) {
    globalMutex.Lock()
    globalQueue = append(globalQueue, task)
    globalMutex.Unlock()
}

// GMP 模型：本地队列减少竞争
func gmpModelEnqueue(p *processor, task int) {
    if len(p.localQueue) < cap(p.localQueue) {
        p.localQueue = append(p.localQueue, task)
    } else {
        // 本地队列满，加入全局队列
        globalMutex.Lock()
        globalQueue = append(globalQueue, task)
        globalMutex.Unlock()
    }
}

::: :::

2. 缓存局部性

func demonstrateCacheLocality() {
    // P 的存在使得同一个 P 上的 Goroutine 
    // 更可能在同一个 CPU 核心上运行
    // 提高 CPU 缓存命中率
    
    type Processor struct {
        id          int
        mcache      []byte    // 内存缓存
        timers      []Timer   // 定时器
        runqueue    []Goroutine
    }
    
    // 每个 P 维护自己的资源，减少跨核心访问
}

面试要点

1. GMP 的定义

G: Goroutine，用户态线程
M: Machine，系统线程
P: Processor，逻辑处理器，提供执行上下文

2. P 存在的原因

解决 GM 模型的全局队列竞争问题
提供本地调度，减少锁竞争
支持工作窃取，实现负载均衡
提供执行上下文，包含内存缓存等资源

3. 调度优势

本地队列优先：减少全局竞争
工作窃取：实现负载均衡
缓存局部性：提高 CPU 缓存命中率
可扩展性：支持大量 Goroutine

4. 关键数量关系

P 的数量：默认等于 CPU 核心数（可通过 GOMAXPROCS 调整）
M 的数量：按需创建，默认最大 10000
G 的数量：可以有数百万个

理解 GMP 模型是掌握 Go 并发编程的基础，也是 Go 高性能的重要保障。

GMP 调度模型详解 ​

面试题：GMP 指的是什么？为什么要有 P？ ​

GMP 三要素 ​

G (Goroutine) ​

M (Machine) ​

P (Processor) ​

详细架构图 ​

核心数据结构 ​

为什么需要 P？ ​

1. 历史演进 ​

2. GMP 模型的优势 ​

3. P 的核心作用 ​

GMP 调度流程 ​

性能对比 ​

1. 锁竞争对比 ​

2. 缓存局部性 ​

面试要点 ​

1. GMP 的定义 ​

2. P 存在的原因 ​

3. 调度优势 ​

4. 关键数量关系 ​

GMP 调度模型详解

面试题：GMP 指的是什么？为什么要有 P？

GMP 三要素

G (Goroutine)

M (Machine)

P (Processor)

详细架构图

核心数据结构

为什么需要 P？

1. 历史演进

2. GMP 模型的优势

3. P 的核心作用

GMP 调度流程

性能对比

1. 锁竞争对比

2. 缓存局部性

面试要点

1. GMP 的定义

2. P 存在的原因

3. 调度优势

4. 关键数量关系