
Go Basics: map insert and update (Part 2)

Published: 2022-03-08 Editor: jiaochengji.com


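<h3>Go Basics: map insert and update (Part 2)</h3> <ul><li><ul><li>Preface</li><li>Environment</li><li>Differences between makemap_small and makemap</li><li>Adding elements (without triggering growth)</li><li>What does the map look like in memory right up to the point of growth?</li><li>Growth</li><li>Summary</li></ul></li></ul>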

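The previous article, "Go Basics: map, preliminaries (Part 1)", introduced the map data structure. This one walks through the code that implements insert and update. The two are implemented almost identically, so I analyze them together. If you want to read the source with detailed annotations, see my GitHub; criticism and corrections are welcome. My plan is to analyze the commonly used data structures one by one, so if you are interested in the same thing, feel free to contact me.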

My debugging environment is described in detail in Part 1, so here I only show the code I used to debug insert and update.

<pre><code class="lang-go hljs">package main

import (
	"fmt"
	"strconv"
)

func main() {
	m1 := make(map[string]string, 9)
	fmt.Println(m1)
	for i := 0; i < 20; i++ {
		str := strconv.Itoa(i)
		m1[str] = str
	}
}
</code></pre>

As usual, compile it and see what the declaration on line 9 actually does.

<pre><code class="lang-bash hljs">go tool compile -N -l -S main.go > main.txt </code></pre>

The output is long, so I only list the important parts:

<pre><code class="lang-bash hljs">"".main STEXT size=394 args=0x0 locals=0x98
0x0000 00000 (main.go:8) TEXT "".main(SB), ABIInternal, $152-0
....
0x0036 00054 (main.go:9) LEAQ type.map[string]string(SB), AX
0x003d 00061 (main.go:9) PCDATA $0, $0
0x003d 00061 (main.go:9) MOVQ AX, (SP)
0x0041 00065 (main.go:9) MOVQ $9, 8(SP)
0x004a 00074 (main.go:9) MOVQ $0, 16(SP)
0x0053 00083 (main.go:9) CALL runtime.makemap(SB)
....
0x0107 00263 (main.go:13) PCDATA $0, $5
0x0107 00263 (main.go:13) LEAQ type.map[string]string(SB), DX
0x010e 00270 (main.go:13) PCDATA $0, $4
0x010e 00270 (main.go:13) MOVQ DX, (SP)
0x0112 00274 (main.go:13) PCDATA $0, $6
0x0112 00274 (main.go:13) MOVQ "".m1+56(SP), BX
0x0117 00279 (main.go:13) PCDATA $0, $4
0x0117 00279 (main.go:13) MOVQ BX, 8(SP)
0x011c 00284 (main.go:13) PCDATA $0, $0
0x011c 00284 (main.go:13) MOVQ CX, 16(SP)
0x0121 00289 (main.go:13) MOVQ AX, 24(SP)
0x0126 00294 (main.go:13) CALL runtime.mapassign_faststr(SB)
....
</code></pre> <ul><li>Line 9 of the listing calls <code>runtime.makemap</code> to do some initialization. I set the map's initial capacity to a value greater than 8 so that this function is used; otherwise <code>runtime.makemap_small</code> would be called.</li><li>Line 22 of the listing calls <code>runtime.mapassign_faststr</code>, which corresponds to the assignment <code>m1[str] = str</code> on line 13 of <code>main.go</code>.</li></ul>

Now that we know which functions are involved, we can find them in <code>$go_sdk_path/src/runtime/map.go</code> and <code>$go_sdk_path/src/runtime/map_faststr.go</code> and step through them with breakpoints.

<h2>Differences between makemap_small and makemap</h2>

The code of <code>makemap_small</code> is as follows:

<pre><code class="lang-go hljs">// makemap_small implements Go map creation for make(map[k]v) and
// make(map[k]v, hint) when hint is known to be at most bucketCnt
// at compile time and the map needs to be allocated on the heap.
func makemap_small() *hmap {
	h := new(hmap)
	h.hash0 = fastrand()
	return h
}
</code></pre>

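The implementation is very simple: it only sets a hash seed. Everything else, such as allocating bucket memory, is deferred until a value is actually assigned. When is this function called? As the comment says: "hint is known to be at most bucketCnt at compile time and the map needs to be allocated on the heap". bucketCnt is 8, which is why my sample code sets the initial capacity to 9.
I will skip this case, because in real applications you should specify a capacity anyway to avoid the performance cost of frequent growth. The code of <code>makemap</code> is as follows: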

<pre><code class="lang-go hljs">// makemap implements Go map creation for make(map[k]v, hint).
// If the compiler has determined that the map or the first bucket
// can be created on the stack, h and/or bucket may be non-nil.
// If h != nil, the map can be created directly in h.
// If h.buckets != nil, bucket pointed to can be used as the first bucket.
func makemap(t *maptype, hint int, h *hmap) *hmap {
	mem, overflow := math.MulUintptr(uintptr(hint), t.bucket.size)
	// Does the request exceed the maximum allocatable virtual memory, or overflow uintptr?
	if overflow || mem > maxAlloc {
		hint = 0
	}

	// initialize Hmap
	if h == nil {
		h = new(hmap)
	}
	// random hash seed
	h.hash0 = fastrand()

	// Find the size parameter B which will hold the requested # of elements.
	// For hint < 0 overLoadFactor returns false since hint < bucketCnt.
	// Compute B; the number of buckets is 1 << B.
	B := uint8(0)
	// Keep incrementing B until hint no longer exceeds the load factor.
	for overLoadFactor(hint, B) {
		B++
	}
	h.B = B

	// allocate initial hash table
	// if B == 0, the buckets field is allocated lazily later (in mapassign)
	// If hint is large zeroing this memory could take a while.
	if h.B != 0 {
		var nextOverflow *bmap
		// Allocate the buckets for this B, including the overflow buckets.
		h.buckets, nextOverflow = makeBucketArray(t, h.B, nil)
		if nextOverflow != nil {
			h.extra = new(mapextra)
			// nextOverflow points to the memory of the first overflow bucket.
			h.extra.nextOverflow = nextOverflow
		}
	}

	return h
}
</code></pre>
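The loop that picks B can be tried in isolation. The following is a minimal sketch, assuming the constants `bucketCnt = 8`, `loadFactorNum = 13` and `loadFactorDen = 2` from `runtime/map.go`; `chooseB` is a hypothetical helper name, not a runtime function, and `1 << B` stands in for `bucketShift(B)`.

```go
package main

import "fmt"

const (
	bucketCnt     = 8  // slots per bucket
	loadFactorNum = 13 // loadFactorNum/loadFactorDen = 6.5
	loadFactorDen = 2
)

// overLoadFactor mirrors the runtime check: it reports whether count
// items placed in 1<<B buckets exceed the load factor.
func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt && uintptr(count) > loadFactorNum*((uintptr(1)<<B)/loadFactorDen)
}

// chooseB (hypothetical helper) repeats the loop in makemap:
// it returns the smallest B that is not overloaded for hint.
func chooseB(hint int) uint8 {
	B := uint8(0)
	for overLoadFactor(hint, B) {
		B++
	}
	return B
}

func main() {
	for _, hint := range []int{0, 8, 9, 13, 14, 100} {
		B := chooseB(hint)
		fmt.Printf("hint=%d B=%d buckets=%d\n", hint, B, uint(1)<<B)
	}
}
```

For the hint of 9 used in this article the sketch yields B = 1, i.e. 2 buckets, which matches the map layout described further down.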

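As the comments suggest, the job of this code is to compute B and allocate the buckets. <code>overLoadFactor</code> uses a load factor of 6.5 to find the smallest B for which the requested capacity count does not exceed (1 << B) * 6.5. Everyone is probably familiar with load factors: in Java it is 0.75, so why is it 6.5 in Go? In Go the value is the maximum average number of entries per 8-slot bucket. The top of <code>runtime/map.go</code> has test data, and 6.5 was chosen as an overall trade-off. You may have noticed that <code>maptype</code> is used when allocating the memory for the buckets. Let's look at its code, which also helps in understanding the structure of a map: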

<pre><code class="lang-go hljs">type maptype struct {
	typ _type
	// type of the map's key
	key *_type
	// type of the map's value
	elem *_type
	// type of a bucket
	bucket *_type // internal type representing a hash bucket
	// function for hashing keys (ptr to key, seed) -> hash
	hasher func(unsafe.Pointer, uintptr) uintptr
	// size of the key type
	keysize uint8 // size of key slot
	// size of the value type
	elemsize uint8 // size of elem slot
	// total size of everything in one bucket
	bucketsize uint16 // size of bucket
	flags      uint32
}
</code></pre>

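<code>makemap</code> uses the bucket's size in <code>math.MulUintptr(uintptr(hint), t.bucket.size)</code>. This size is identical to <code>bucketsize</code> in <code>maptype</code>; both are 272 (Part 1 explains why), so the amount of memory to allocate can be computed. Looking closely at the fields of <code>maptype</code>, you can see that it defines the basic layout of the map, and the later code that allocates bucket memory relies on it.
<code>makemap</code> calls <code>makeBucketArray</code> to allocate the memory. Let's take a brief look at its details: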

<pre><code class="lang-go hljs">// makeBucketArray initializes a backing array for map buckets.
// 1<<b is the minimum number of buckets to allocate.
// dirtyalloc should either be nil or a bucket array previously
// allocated by makeBucketArray with the same t and b parameters.
// If dirtyalloc is nil a new backing array will be alloced and
// otherwise dirtyalloc will be cleared and reused as backing array.
func makeBucketArray(t *maptype, b uint8, dirtyalloc unsafe.Pointer) (buckets unsafe.Pointer, nextOverflow *bmap) {
	base := bucketShift(b)
	nbuckets := base
	// For small b, overflow buckets are unlikely.
	// Avoid the overhead of the calculation.
	if b >= 4 {
		// Add on the estimated number of overflow buckets
		// required to insert the median number of elements
		// used with this value of b.
		nbuckets += bucketShift(b - 4)
		sz := t.bucket.size * nbuckets
		up := roundupsize(sz)
		if up != sz {
			nbuckets = up / t.bucket.size
		}
	}

	if dirtyalloc == nil {
		// Allocate nbuckets buckets, overflow buckets included, so all buckets are
		// contiguous in memory even though they are logically distinct.
		buckets = newarray(t.bucket, int(nbuckets))
		// The next two lines are my debugging code, not part of the runtime source.
		b0 := (*dmap)(add(buckets, uintptr(0)*uintptr(t.bucketsize)))
		println(b0.debugOverflows)
	} else {
		// dirtyalloc was previously generated by
		// the above newarray(t.bucket, int(nbuckets))
		// but may not be empty.
		buckets = dirtyalloc
		size := t.bucket.size * nbuckets
		if t.bucket.ptrdata != 0 {
			memclrHasPointers(buckets, size)
		} else {
			memclrNoHeapPointers(buckets, size)
		}
	}

	if base != nbuckets {
		// We preallocated some overflow buckets.
		// To keep the overhead of tracking these overflow buckets to a minimum,
		// we use the convention that if a preallocated overflow bucket's overflow
		// pointer is nil, then there are more available by bumping the pointer.
		// We need a safe non-nil pointer for the last overflow bucket; just use buckets.
		// position of the first overflow bucket
		nextOverflow = (*bmap)(add(buckets, base*uintptr(t.bucketsize)))
		// position of the last bucket
		last := (*bmap)(add(buckets, (nbuckets-1)*uintptr(t.bucketsize)))
		// The last bucket's overflow pointer is set to the start of the bucket array.
		// A ring? The official comment says it only needs to be a safe non-nil pointer.
		last.setoverflow(t, (*bmap)(buckets))
	}
	return buckets, nextOverflow
}
</code></pre>

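Note the optimization at the <code>b >= 4</code> check: when B is less than 4, that is, when the requested capacity fits within (1 << B) * 6.5 for such a small B, overflow buckets are very unlikely to be needed. When B is 4 or more, the code estimates a number of overflow buckets, and the <code>newarray</code> call allocates the memory for all buckets in one go. The two debugging lines right after <code>newarray</code> are not in the runtime source; I added them only for debugging, a trick described in Part 1. Further down, the position of the first overflow bucket is computed from <code>bucketsize</code>, and <code>last.setoverflow</code> applies a trick: the overflow pointer of the last bucket is set to the start address of the bucket array. Later posts in this series explain why.
OK, the map's data now looks like this: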

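There are only 2 buckets, and every tophash slot in each bucket holds the default value 0. Keys and values are all empty at this point, so they are not shown. <code>extra</code> has no value because B is less than 4, so no overflow buckets were preallocated.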
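The bucket arithmetic in `makeBucketArray` can be sketched as follows. This is a simplified illustration only: `bucketCounts` is a hypothetical helper, and it deliberately ignores the `roundupsize` adjustment that the runtime applies afterwards, so for B of 4 or more the real overflow count may be larger.

```go
package main

import "fmt"

// bucketCounts (hypothetical helper) returns the number of regular buckets
// and the estimated number of preallocated overflow buckets for a given B.
// The roundupsize adjustment done by the runtime is left out on purpose.
func bucketCounts(b uint8) (base, overflow uintptr) {
	base = uintptr(1) << b
	if b >= 4 {
		// Same estimate as nbuckets += bucketShift(b - 4).
		overflow = uintptr(1) << (b - 4)
	}
	return base, overflow
}

func main() {
	for _, b := range []uint8{1, 3, 4, 5, 8} {
		base, ovf := bucketCounts(b)
		fmt.Printf("B=%d regular=%d preallocated overflow=%d\n", b, base, ovf)
	}
}
```

With B = 1, as in this article's example, the sketch reports 2 regular buckets and no preallocated overflow buckets, which is why `extra` stays empty.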

<h2>Adding elements (without triggering growth)</h2>

As shown above, <code>m1[str] = str</code> corresponds to <code>mapassign_faststr</code> in <code>map_faststr.go</code>. In the first insertion both the key and the value are of type string.

<pre><code class="lang-go hljs">func mapassign_faststr(t *maptype, h *hmap, s string) unsafe.Pointer {
	//d := (*dmap)(unsafe.Pointer(uintptr(h.buckets)))
	//bucketD := uintptr(0)
	//for bucketD < bucketShift(h.B)+3 {
	//	flag := false
	//	for _, debugKey := range d.debugKeys {
	//		if debugKey == "" {
	//			continue
	//		}
	//		if flag == false {
	//			print("bucket:")
	//			println(bucketD)
	//		}
	//		print("key:")
	//		println(debugKey)
	//		flag = true
	//	}
	//	bucketD++
	//	d = (*dmap)(unsafe.Pointer(uintptr(h.buckets) + bucketD*uintptr(t.bucketsize)))
	//}
	//println()
	// Check the third bit: if it is 1, another goroutine is writing to the map.
	if h.flags&hashWriting != 0 {
		throw("concurrent map writes")
	}
	key := stringStructOf(&s)
	// compute the hash of the key
	hash := t.hasher(noescape(unsafe.Pointer(&s)), uintptr(h.hash0))

	// Set hashWriting after calling t.hasher for consistency with mapassign.
	// mark the map as being written
	h.flags ^= hashWriting

	if h.buckets == nil {
		h.buckets = newobject(t.bucket) // newarray(t.bucket, 1)
	}

again:
	// Find which bucket the key falls into; each bucket is a bmap chained
	// to its overflow buckets like a linked list.
	mask := bucketMask(h.B)
	bucket := hash & mask
	// if a grow is in progress
	if h.growing() {
		// copy from oldbuckets into the newly allocated buckets
		growWork_faststr(t, h, bucket)
	}
	// address of the selected bmap
	b := (*bmap)(unsafe.Pointer(uintptr(h.buckets) + bucket*uintptr(t.bucketsize)))
	// tophash value of the key
	top := tophash(hash)

	var insertb *bmap          // bmap to insert into
	var inserti uintptr        // slot index inside that bmap
	var insertk unsafe.Pointer // address where the key is stored

	// find a free slot for the key
bucketloop:
	for {
		for i := uintptr(0); i < bucketCnt; i++ {
			if b.tophash[i] != top {
				if isEmpty(b.tophash[i]) && insertb == nil {
					insertb = b
					inserti = i
				}
				// initially every slot is 0, i.e. emptyRest
				if b.tophash[i] == emptyRest {
					break bucketloop
				}
				continue
			}
			// tophash matches, but two different keys can share a tophash,
			// so go on to compare the keys themselves
			// position of the key inside the bucket
			k := (*stringStruct)(add(unsafe.Pointer(b), dataOffset+i*2*sys.PtrSize))
			// keys of different length cannot be equal
			if k.len != key.len {
				continue
			}
			// equal if the pointers are identical, or else if the bytes compare equal
			if k.str != key.str && !memequal(k.str, key.str, uintptr(key.len)) {
				continue
			}
			// already have a mapping for key. Update it.
			// found the same key, so the value will be updated
			inserti = i
			insertb = b
			goto done
		}
		// Inserting a 9th entry into a bucket reaches here, but no overflow bucket exists yet.
		ovf := b.overflow(t)
		if ovf == nil {
			break
		}
		b = ovf
	}

	// Did not find mapping for key. Allocate new cell & add entry.

	// If we hit the max load factor or we have too many overflow buckets,
	// and we're not already in the middle of growing, start growing.
	// Grow if the count exceeds the load factor, or if it does not but there are too many
	// overflow buckets. This is like Java's HashMap: too many red-black trees also hurt
	// lookups, because ideally a map lookup should be O(1).
	if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) {
		hashGrow(t, h)
		goto again // Growing the table invalidates everything, so try again
	}

	if insertb == nil {
		// all current buckets are full, allocate a new one.
		insertb = h.newoverflow(t, b)
		inserti = 0 // not necessary, but avoids needlessly spilling inserti
	}
	// store the tophash value in its tophash slot
	insertb.tophash[inserti&(bucketCnt-1)] = top // mask inserti to avoid bounds checks

	// store the key in the bmap
	// dataOffset gives the aligned position where the keys start
	// the stride is 2*sys.PtrSize because a string header takes 16 bytes
	insertk = add(unsafe.Pointer(insertb), dataOffset+inserti*2*sys.PtrSize)
	// store new key at insert position
	// this memory holds the key
	*((*stringStruct)(insertk)) = *key
	// one more key
	h.count++

done:
	// done does not care whether this is an update or an insert; it only needs the slot
	// locate the memory where the value is stored
	elem := add(unsafe.Pointer(insertb), dataOffset+bucketCnt*2*sys.PtrSize+inserti*uintptr(t.elemsize))
	if h.flags&hashWriting == 0 {
		throw("concurrent map writes")
	}
	// restore the flag
	h.flags &^= hashWriting
	return elem
}
</code></pre>
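<ol><li>The commented-out block at the top of the function is my debugging code; it prints the keys stored in each bucket.</li><li>The <code>hashWriting</code> check and the <code>h.flags ^= hashWriting</code> statement both operate on the write flag, which shows that a map does not support concurrent writes from multiple goroutines.</li><li><code>key := stringStructOf(&s)</code> converts the key to a <code>stringStruct</code>, so later code can use its <code>len</code> and its data pointer <code>str</code> directly. This is also an optimization: it saves computing <code>len(str)</code> later.</li><li><code>noescape</code> hides the pointer to the key from escape analysis, and <code>t.hasher</code> computes the hash of the key.</li><li>The <code>again</code> block mainly works out which bucket the key should fall into. Why is it a labeled block? Because when growth is triggered, the bucket has to be recomputed. <code>bucketMask(h.B)</code> computes the bucket mask, which is <code>bucketShift(h.B) - 1</code>; here B is 1, so <code>bucketShift</code> gives binary <code>10</code> and the mask is binary <code>1</code>. ANDing the mask with the hash gives the number of the bucket that should hold the key. This is another optimization: bit operations are cheap. The value computed by <code>tophash(hash)</code> is what goes into one of the 8 tophash slots at the front of the bucket. Ignore the growth case (<code>h.growing()</code>) for now.</li><li>The <code>bucketloop</code> block and the code after it find where the key and value are stored and write the key into the bucket's memory. There are 2 for loops: the outer one walks the bucket and its overflow buckets, the inner one walks the 8 tophash slots of a bucket. An empty slot is remembered as the insertion candidate, and when the scan reaches an <code>emptyRest</code> slot it breaks out of <code>bucketloop</code> and falls through to the insertion code. Past the <code>continue</code>, the top 8 bits of the hash are equal, so the bucket may already contain this key; the code compares the key lengths first and then the memory. Ignore the <code>overLoadFactor</code> / <code>hashGrow</code> check for now. If no free slot was found, <code>h.newoverflow</code> allocates an overflow bucket and the key is stored in its first slot. Finally <code>insertb.tophash[...] = top</code> writes the tophash value into its slot.</li><li>Why does the scan only stop when a slot equals <code>emptyRest</code>? A later post in this series will cover that.</li></ol>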
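The two values derived from the hash, the bucket number and the tophash, can be illustrated with a small sketch. It assumes a 64-bit hash and uses made-up hash values instead of the runtime's real hasher; `bucketIndex` and `topHash` are hypothetical names, and `minTopHash = 5` is the constant from `runtime/map.go` that keeps tophash values apart from the slot state markers.

```go
package main

import "fmt"

// minTopHash: smaller tophash values are reserved as slot state markers
// (emptyRest, emptyOne, ...), so real tophash values are shifted above them.
const minTopHash = 5

// bucketIndex (hypothetical helper) keeps the low B bits of the hash,
// like hash & bucketMask(h.B).
func bucketIndex(hash uint64, B uint8) uint64 {
	return hash & (uint64(1)<<B - 1)
}

// topHash (hypothetical helper) takes the top 8 bits of a 64-bit hash and
// moves them out of the reserved range if necessary.
func topHash(hash uint64) uint8 {
	top := uint8(hash >> 56)
	if top < minTopHash {
		top += minTopHash
	}
	return top
}

func main() {
	// Made-up hash values, for illustration only.
	for _, hash := range []uint64{0xAB00000000000003, 0x0200000000000004} {
		fmt.Printf("hash=%#016x bucket=%d tophash=%d\n", hash, bucketIndex(hash, 1), topHash(hash))
	}
}
```

The low bits pick the bucket and the high bits pick the tophash, so the two values are taken from opposite ends of the same hash.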

The <code>done</code> block is fairly straightforward: it computes the memory location where the value is stored and marks the write as finished.

<pre><code class="lang-go hljs">done:
	// done does not care whether this is an update or an insert; it only needs the slot
	// locate the memory where the value is stored
	elem := add(unsafe.Pointer(insertb), dataOffset+bucketCnt*2*sys.PtrSize+inserti*uintptr(t.elemsize))
	if h.flags&hashWriting == 0 {
		throw("concurrent map writes")
	}
	// restore the flag
	h.flags &^= hashWriting
	return elem
}
</code></pre>
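Because the runtime throws "concurrent map writes" when it sees the write flag in an unexpected state, user code that writes to one map from several goroutines has to serialize those writes itself. The following is one possible sketch using `sync.Mutex`; `safeMap` and `fill` are hypothetical names chosen for this example, and other approaches (for instance `sync.Map`) exist.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// safeMap (hypothetical type) guards a plain map with a mutex so that
// only one goroutine at a time executes the map assignment.
type safeMap struct {
	mu sync.Mutex
	m  map[string]string
}

// set inserts or updates a key while holding the lock.
func (s *safeMap) set(k, v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[k] = v
}

// fill writes workers*perWorker distinct keys from several goroutines
// and returns the final number of entries.
func fill(workers, perWorker int) int {
	s := &safeMap{m: make(map[string]string, 9)}
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				key := strconv.Itoa(w*perWorker + i)
				s.set(key, key)
			}
		}(w)
	}
	wg.Wait()
	return len(s.m)
}

func main() {
	fmt.Println(fill(4, 100))
}
```

The lock is held only around the assignment itself, which is the part that ends up in the runtime's assignment routine.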

What does the map's memory layout look like now?

<h2>What does the map look like in memory right up to the point of growth?</h2>

Why does the map grow even though neither of the 2 buckets' chains is completely full? Because the load factor kicks in:

<pre><code class="lang-go hljs">// overLoadFactor reports whether count items placed in 1<<B buckets is over loadFactor.
func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt && uintptr(count) > loadFactorNum*(bucketShift(B)/loadFactorDen)
}
</code></pre>

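<code>count</code> is 13 at this point and the threshold is 13 * (2 / 2) = 13, so the map has not grown yet. When the next key, "13", is inserted, count + 1 = 14 exceeds 13 and growth is triggered.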
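The threshold arithmetic can be replayed with a short simulation. This is a sketch under simplifying assumptions: it only models the load-factor check (not `tooManyOverflowBuckets`, and not the incremental evacuation), and `growthPoints` is a hypothetical helper that assumes every insertion uses a new, distinct key.

```go
package main

import "fmt"

const (
	bucketCnt     = 8
	loadFactorNum = 13
	loadFactorDen = 2
)

// overLoadFactor mirrors the runtime check, with 1<<B in place of bucketShift(B).
func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt && uintptr(count) > loadFactorNum*((uintptr(1)<<B)/loadFactorDen)
}

// growthPoints (hypothetical helper) inserts n distinct keys into a map
// that starts with the given B and returns the insertion numbers (1-based)
// at which the load-factor check would trigger growth.
func growthPoints(n int, startB uint8) []int {
	var points []int
	B := startB
	for count := 0; count < n; count++ {
		// Same test as in mapassign_faststr: would count+1 items overload 1<<B buckets?
		if overLoadFactor(count+1, B) {
			points = append(points, count+1)
			B++ // the bucket array doubles
		}
	}
	return points
}

func main() {
	fmt.Println(growthPoints(20, 1))
	fmt.Println(growthPoints(60, 1))
}
```

For the 20 insertions of this article's example, starting from B = 1, the only growth happens on the 14th insertion, which is the key "13".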

As mentioned above, growth happens when the key "13" is inserted. Let's look at how it works in detail:

<pre><code class="lang-go hljs">// Did not find mapping for key. Allocate new cell & add entry.

// If we hit the max load factor or we have too many overflow buckets,
// and we're not already in the middle of growing, start growing.
// Grow if the count exceeds the load factor, or if it does not but there are too many
// overflow buckets. This is like Java's HashMap: too many red-black trees also hurt
// lookups, because ideally a map lookup should be O(1).
if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) {
	d := (*dmap)(unsafe.Pointer(uintptr(h.buckets)))
	bucketD := uintptr(0)
	for bucketD < bucketShift(h.B)+3 {
		flag := false
		for i, debugKey := range d.debugKeys {
			if debugKey == "" {
				continue
			}
			println(d.tophash[i])
			if flag == false {
				print("bucket:")
				println(bucketD)
			}
			print("key:")
			println(debugKey)
			flag = true
		}
		bucketD++
		d = (*dmap)(unsafe.Pointer(uintptr(h.buckets) + bucketD*uintptr(t.bucketsize)))
	}
	println()
	hashGrow(t, h)
	goto again // Growing the table invalidates everything, so try again
}
</code></pre>

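In the two nested for loops above a bucket with a free slot was found, but <code>overLoadFactor(h.count+1, h.B)</code> determines that the map has to grow. The block between that condition and the <code>hashGrow</code> call is my debugging code, which prints the map's memory layout at that moment. <code>hashGrow</code> performs the actual growth, and then execution jumps back to <code>again</code> to recompute which bucket the key falls into.
Let's see what <code>hashGrow</code> does: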

<pre><code class="lang-go hljs">func hashGrow(t *maptype, h *hmap) {
	// If we've hit the load factor, get bigger.
	// Otherwise, there are too many overflow buckets,
	// so keep the same number of buckets and "grow" laterally.
	bigger := uint8(1)
	// If growth was triggered by too many overflow buckets rather than by the
	// load factor, do not double the size.
	if !overLoadFactor(h.count+1, h.B) {
		bigger = 0
		h.flags |= sameSizeGrow
	}
	oldbuckets := h.buckets
	// when actually growing, double the number of buckets
	newbuckets, nextOverflow := makeBucketArray(t, h.B+bigger, nil)

	flags := h.flags &^ (iterator | oldIterator)
	if h.flags&iterator != 0 {
		flags |= oldIterator
	}
	// commit the grow (atomic wrt gc)
	h.B += bigger
	h.flags = flags
	h.oldbuckets = oldbuckets
	h.buckets = newbuckets
	h.nevacuate = 0
	h.noverflow = 0

	if h.extra != nil && h.extra.overflow != nil {
		// Promote current overflow buckets to the old generation.
		if h.extra
</code></pre>
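The first decision in `hashGrow`, and the reason the caller jumps back to `again`, can be sketched in a few lines. This is an illustration only: `nextB` and `bucketFor` are hypothetical helpers, the hash values are made up, and nothing here models how entries are moved to the new buckets.

```go
package main

import "fmt"

const (
	bucketCnt     = 8
	loadFactorNum = 13
	loadFactorDen = 2
)

// overLoadFactor mirrors the runtime check, with 1<<B in place of bucketShift(B).
func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt && uintptr(count) > loadFactorNum*((uintptr(1)<<B)/loadFactorDen)
}

// nextB (hypothetical helper) mirrors the start of hashGrow: B grows by one
// when the load factor is exceeded, otherwise the grow keeps the same size.
func nextB(count int, B uint8) (newB uint8, sameSize bool) {
	if !overLoadFactor(count+1, B) {
		return B, true
	}
	return B + 1, false
}

// bucketFor (hypothetical helper) is hash & bucketMask(B).
func bucketFor(hash uint64, B uint8) uint64 {
	return hash & (uint64(1)<<B - 1)
}

func main() {
	newB, same := nextB(13, 1)
	fmt.Printf("count=13 B=1 -> newB=%d sameSize=%v\n", newB, same)

	// After B changes, the same hash may select a different bucket,
	// which is why the bucket is recomputed after growing.
	for _, hash := range []uint64{0x5, 0x7} {
		fmt.Printf("hash=%#x old bucket=%d new bucket=%d\n", hash, bucketFor(hash, 1), bucketFor(hash, newB))
	}
}
```

With one more bit in the mask, a hash either keeps its bucket number or moves to the old number plus the old bucket count, as the two sample hashes show.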