
47. Don't Let Locks Throttle Performance

wenjianzhang edited this page Nov 23, 2019 · 1 revision

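The built-in RLock and Lock

RLock does not block other readers, so it is tempting to assume that its cost is small or even negligible. The code below shows how large the impact actually is.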

package lock

import (
	"fmt"
	"sync"
	"testing"
)

var cache map[string]string

const NUM_OF_READER int = 40
const READ_TIMES = 100000

func init() {
	cache = make(map[string]string)

	cache["a"] = "aa"
	cache["b"] = "bb"
}

func lockFreeAccess() {

	var wg sync.WaitGroup
	wg.Add(NUM_OF_READER)
	for i := 0; i < NUM_OF_READER; i++ {
		go func() {
			for j := 0; j < READ_TIMES; j++ {
				_, err := cache["a"]
				if !err {
					fmt.Println("Nothing")
				}
			}
			wg.Done()
		}()
	}
	wg.Wait()
}

func lockAccess() {

	var wg sync.WaitGroup
	wg.Add(NUM_OF_READER)
	m := new(sync.RWMutex)
	for i := 0; i < NUM_OF_READER; i++ {
		go func() {
			for j := 0; j < READ_TIMES; j++ {

				m.RLock()
				_, err := cache["a"]
				if !err {
					fmt.Println("Nothing")
				}
				m.RUnlock()
			}
			wg.Done()
		}()
	}
	wg.Wait()
}

func BenchmarkLockFree(b *testing.B) {
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		lockFreeAccess()
	}
}

func BenchmarkLock(b *testing.B) {
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		lockAccess()
	}
}

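The code above initializes a map as a mock cache and reads it from two functions: one takes a read lock, the other takes no lock at all. (Reading without a lock is safe here only because nothing writes to the map after init.) Run the benchmarks to compare the time each one takes: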

$ go test -bench=.  
goos: darwin
goarch: amd64
pkg: github.com/wenjianzhang/golearning/src/ch46/lock
BenchmarkLockFree-4          100          10176144 ns/op
BenchmarkLock-4               10         117721790 ns/op
PASS
ok      github.com/wenjianzhang/golearning/src/ch46/lock        8.439s

The two functions differ by roughly an order of magnitude. Next, look at where the CPU time goes:

$ go test -bench=. -cpuprofile=cpu.prof
goos: darwin
goarch: amd64
pkg: github.com/wenjianzhang/golearning/src/ch46/lock
BenchmarkLockFree-4          200          10080805 ns/op
BenchmarkLock-4               10         117811130 ns/op
PASS
ok      github.com/wenjianzhang/golearning/src/ch46/lock        4.523s

Running go test -bench=. -cpuprofile=cpu.prof writes the cpu.prof file.

Use **go tool pprof cpu.prof** to open it:

$ go tool pprof cpu.prof  

This prints console output like the following:

Type: cpu
Time: Nov 23, 2019 at 4:05pm (CST)
Duration: 4.49s, Total samples = 11.45s (254.80%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) 

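Use **top -cum** to list the most expensive functions, sorted by cum.

As a reminder, cum is the cumulative time: the time spent in a function plus everything it calls.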

Showing nodes accounting for 8.46s, 73.89% of 11.45s total
Dropped 10 nodes (cum <= 0.06s)
Showing top 10 nodes out of 29
      flat  flat%   sum%        cum   cum%
     1.09s  9.52%  9.52%      5.47s 47.77%  github.com/wenjianzhang/golearning/src/ch46/lock.lockFreeAccess.func1
     3.98s 34.76% 44.28%      4.60s 40.17%  runtime.mapaccess2_faststr
     0.09s  0.79% 45.07%      3.12s 27.25%  github.com/wenjianzhang/golearning/src/ch46/lock.lockAccess.func1
         0     0% 45.07%      2.62s 22.88%  runtime.goexit0
         0     0% 45.07%      2.62s 22.88%  runtime.mcall
     1.71s 14.93% 60.00%      1.72s 15.02%  runtime.findnull
         0     0% 60.00%      1.72s 15.02%  runtime.funcname
         0     0% 60.00%      1.72s 15.02%  runtime.gostringnocopy
         0     0% 60.00%      1.72s 15.02%  runtime.isSystemGoroutine
     1.59s 13.89% 73.89%      1.59s 13.89%  sync.(*RWMutex).RLock

Use **list lockAccess** to see the per-line cost of lockAccess:

(pprof) list lockAccess
Total: 10.30s
ROUTINE ======================== github.com/wenjianzhang/golearning/src/ch46/lock.lockAccess.func1 in /Users/zhangwenjian/Code/golearning/src/ch46/lock/lock_test.go
      30ms      2.78s (flat, cum) 26.99% of Total
         .          .     41:   var wg sync.WaitGroup
         .          .     42:   wg.Add(NUM_OF_READER)
         .          .     43:   m := new(sync.RWMutex)
         .          .     44:   for i := 0; i < NUM_OF_READER; i++ {
         .          .     45:           go func() {
      10ms       10ms     46:                   for j := 0; j < READ_TIMES; j++ {
         .          .     47:
         .      1.29s     48:                           m.RLock()
      20ms      220ms     49:                           _, err := cache["a"]
         .          .     50:                           if !err {
         .          .     51:                                   fmt.Println("Nothing")
         .          .     52:                           }
         .      1.25s     53:                           m.RUnlock()
         .          .     54:                   }
         .       10ms     55:                   wg.Done()
         .          .     56:           }()
         .          .     57:   }
         .          .     58:   wg.Wait()
         .          .     59:}
         .          .     60:
(pprof) 

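The read lock is clearly expensive: of the 2.78s cumulative time in this function, m.RLock() accounts for 1.29s and m.RUnlock() for 1.25s, while the map read itself takes only 220ms.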

sync.Map

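  • Suited to read-heavy, write-light workloads where the set of keys is relatively stable
  • Trades space for time and maps values indirectly through pointers, so it uses more memory than the built-in map
  • Safe for concurrent use by multiple goroutines
  • Internally split into a read-only region and a read-write region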
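As a rough illustration of the read-heavy, stable-key case described above, here is a minimal sketch that moves the two-entry cache from the earlier benchmark into a sync.Map. The helper names newCache and lookup are hypothetical and do not come from the original repository; only Store, Load, and LoadOrStore are actual sync.Map methods.

```go
package main

import (
	"fmt"
	"sync"
)

// newCache (hypothetical helper) returns a sync.Map preloaded with the
// same two entries as the benchmark cache.
func newCache() *sync.Map {
	m := &sync.Map{}
	m.Store("a", "aa")
	m.Store("b", "bb")
	return m
}

// lookup (hypothetical helper) loads a key and converts the value back to
// a string. sync.Map stores interface{} values, so a type assertion is needed.
func lookup(m *sync.Map, key string) (string, bool) {
	v, ok := m.Load(key)
	if !ok {
		return "", false
	}
	s, ok := v.(string)
	return s, ok
}

func main() {
	cache := newCache()

	// Several goroutines read concurrently without any explicit lock.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if _, ok := lookup(cache, "a"); !ok {
				fmt.Println("Nothing")
			}
		}()
	}
	wg.Wait()

	// LoadOrStore keeps the existing value when the key is already present.
	v, loaded := cache.LoadOrStore("a", "zz")
	fmt.Println(v, loaded) // aa true
}
```

The type assertion in lookup is the price of the interface{}-based API; a wrapper like this keeps that detail out of calling code.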

Concurrent Map

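  • Suited to workloads where both reads and writes are frequent
  • Splits one large map into several smaller maps; each read or write locks only the small map its key belongs to, which lowers the chance that two operations contend for the same region and improves read and write throughput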
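The splitting idea can be sketched as follows. This is not the implementation benchmarked on this page; ShardedMap, shard, shardFor, and the shard count of 16 are all assumptions made for illustration, and FNV-1a is just one convenient hash from the standard library.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// shardCount is an arbitrary illustrative value, not taken from the source.
const shardCount = 16

// shard is one small map guarded by its own lock.
type shard struct {
	mu    sync.RWMutex
	items map[string]string
}

// ShardedMap (hypothetical name) spreads keys across independent shards.
type ShardedMap struct {
	shards [shardCount]*shard
}

func NewShardedMap() *ShardedMap {
	m := &ShardedMap{}
	for i := range m.shards {
		m.shards[i] = &shard{items: make(map[string]string)}
	}
	return m
}

// shardFor hashes the key to pick the shard that owns it.
func (m *ShardedMap) shardFor(key string) *shard {
	h := fnv.New32a()
	h.Write([]byte(key))
	return m.shards[h.Sum32()%shardCount]
}

// Set locks only the shard that owns key, not the whole map.
func (m *ShardedMap) Set(key, val string) {
	s := m.shardFor(key)
	s.mu.Lock()
	s.items[key] = val
	s.mu.Unlock()
}

func (m *ShardedMap) Get(key string) (string, bool) {
	s := m.shardFor(key)
	s.mu.RLock()
	v, ok := s.items[key]
	s.mu.RUnlock()
	return v, ok
}

func (m *ShardedMap) Del(key string) {
	s := m.shardFor(key)
	s.mu.Lock()
	delete(s.items, key)
	s.mu.Unlock()
}

func main() {
	m := NewShardedMap()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			key := fmt.Sprintf("k%d", id)
			m.Set(key, "v")
			m.Get(key)
		}(i)
	}
	wg.Wait()
	_, ok := m.Get("k3")
	fmt.Println(ok) // true
}
```

Two goroutines block each other only when their keys hash to the same shard, so more shards generally means less contention at the cost of more memory.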

Define a Map interface with the common Set, Get, and Del methods; each kind of map then only needs to implement these three methods.

type Map interface {
	Set(key interface{}, val interface{})
	Get(key interface{}) (interface{}, bool)
	Del(key interface{})
}

For sample code, see the ch46 directory of https://github.com/wenjianzhang/golearning. A brief overview follows.

RWLockMap uses **sync.RWMutex**

  • Get takes RLock
  • Set takes Lock
  • Del takes Lock
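A minimal sketch of what such a type could look like, assuming a plain built-in map behind the mutex. The field names and the NewRWLockMap constructor are guesses for illustration; the repository's version may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// Map is the interface defined earlier on this page.
type Map interface {
	Set(key interface{}, val interface{})
	Get(key interface{}) (interface{}, bool)
	Del(key interface{})
}

// RWLockMap guards a built-in map with a sync.RWMutex.
// Field names are assumptions for this sketch.
type RWLockMap struct {
	mu sync.RWMutex
	m  map[interface{}]interface{}
}

// NewRWLockMap is a hypothetical constructor.
func NewRWLockMap() *RWLockMap {
	return &RWLockMap{m: make(map[interface{}]interface{})}
}

// Get takes the read lock, so concurrent readers do not block each other.
func (r *RWLockMap) Get(key interface{}) (interface{}, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	v, ok := r.m[key]
	return v, ok
}

// Set takes the write lock, which excludes all readers and writers.
func (r *RWLockMap) Set(key interface{}, val interface{}) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.m[key] = val
}

// Del also takes the write lock.
func (r *RWLockMap) Del(key interface{}) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.m, key)
}

// Compile-time check that RWLockMap satisfies Map.
var _ Map = (*RWLockMap)(nil)

func main() {
	var m Map = NewRWLockMap()
	m.Set("a", "aa")
	v, ok := m.Get("a")
	fmt.Println(v, ok) // aa true
	m.Del("a")
	_, ok = m.Get("a")
	fmt.Println(ok) // false
}
```

Because keys are interface{} values, using an unhashable key such as a slice panics at run time, just as it would with any built-in map keyed by interface{}.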

sync.Map is called directly

  • Get calls Load
  • Set calls Store
  • Del calls Delete

ConcurrentMap

  • Get calls Get
  • Set calls Set
  • Del calls Del
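The mapping onto sync.Map is thin enough to sketch in a few lines. SyncMapAdapter is a hypothetical name chosen for this example, not necessarily the one used in the repository.

```go
package main

import (
	"fmt"
	"sync"
)

// Map is the interface defined earlier on this page.
type Map interface {
	Set(key interface{}, val interface{})
	Get(key interface{}) (interface{}, bool)
	Del(key interface{})
}

// SyncMapAdapter (hypothetical name) forwards each method to sync.Map.
// The zero value is ready to use because the zero sync.Map is.
type SyncMapAdapter struct {
	m sync.Map
}

func (s *SyncMapAdapter) Get(key interface{}) (interface{}, bool) {
	return s.m.Load(key)
}

func (s *SyncMapAdapter) Set(key interface{}, val interface{}) {
	s.m.Store(key, val)
}

func (s *SyncMapAdapter) Del(key interface{}) {
	s.m.Delete(key)
}

// Compile-time check that SyncMapAdapter satisfies Map.
var _ Map = (*SyncMapAdapter)(nil)

func main() {
	var m Map = &SyncMapAdapter{}
	m.Set("a", "aa")
	v, ok := m.Get("a")
	fmt.Println(v, ok) // aa true
	m.Del("a")
	_, ok = m.Get("a")
	fmt.Println(ok) // false
}
```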

Equal numbers of reader and writer goroutines

const (
	NumOfReader = 100
	NumOfWriter = 100
)
$ go test -bench=.                     
goos: darwin
goarch: amd64
pkg: github.com/wenjianzhang/golearning/src/ch46/maps
BenchmarkSyncmap/map_with_RWLock-4                   200           7733864 ns/op
BenchmarkSyncmap/sync.map-4                          200           8198908 ns/op
BenchmarkSyncmap/concurrent_map-4                    300           5475214 ns/op
PASS
ok      github.com/wenjianzhang/golearning/src/ch46/maps        7.649s

map_with_RWLock and sync.map perform about the same, while concurrent_map is faster.

More writes than reads
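For readers who want to reproduce this kind of comparison, the workload can be sketched as a function that starts NumOfWriter writer goroutines and NumOfReader reader goroutines against any Map. This is an assumption about the shape of the harness, not the repository's code: runWorkload, mutexMap, and the per-goroutine operation count are all hypothetical, and in a real benchmark runWorkload would be called inside the usual `for i := 0; i < b.N; i++` loop once per implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

const (
	NumOfReader = 100
	NumOfWriter = 100
)

// Map is the interface defined earlier on this page.
type Map interface {
	Set(key interface{}, val interface{})
	Get(key interface{}) (interface{}, bool)
	Del(key interface{})
}

// mutexMap is a deliberately simple stand-in implementation so that the
// sketch is runnable on its own.
type mutexMap struct {
	mu sync.Mutex
	m  map[interface{}]interface{}
}

func newMutexMap() *mutexMap {
	return &mutexMap{m: make(map[interface{}]interface{})}
}

func (s *mutexMap) Set(key interface{}, val interface{}) {
	s.mu.Lock()
	s.m[key] = val
	s.mu.Unlock()
}

func (s *mutexMap) Get(key interface{}) (interface{}, bool) {
	s.mu.Lock()
	v, ok := s.m[key]
	s.mu.Unlock()
	return v, ok
}

func (s *mutexMap) Del(key interface{}) {
	s.mu.Lock()
	delete(s.m, key)
	s.mu.Unlock()
}

// runWorkload (hypothetical) starts the given number of writers and readers,
// each performing ops operations, and returns once all of them finish.
func runWorkload(m Map, readers, writers, ops int) {
	var wg sync.WaitGroup
	for i := 0; i < writers; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for j := 0; j < ops; j++ {
				m.Set(strconv.Itoa(j), id)
			}
		}(i)
	}
	for i := 0; i < readers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < ops; j++ {
				m.Get(strconv.Itoa(j))
			}
		}()
	}
	wg.Wait()
}

func main() {
	m := newMutexMap()
	runWorkload(m, NumOfReader, NumOfWriter, 100)
	_, ok := m.Get("99")
	fmt.Println(ok) // true
}
```

Changing the two constants is then enough to move between the equal, write-heavy, and read-heavy scenarios.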

const (
	NumOfReader = 100
	NumOfWriter = 200
)
$ go test -bench=.
goos: darwin
goarch: amd64
pkg: github.com/wenjianzhang/golearning/src/ch46/maps
BenchmarkSyncmap/map_with_RWLock-4                   100          15719999 ns/op
BenchmarkSyncmap/sync.map-4                          100          13498850 ns/op
BenchmarkSyncmap/concurrent_map-4                    200           7318006 ns/op
PASS
ok      github.com/wenjianzhang/golearning/src/ch46/maps        5.679s

concurrent_map is still the fastest.

More reads than writes

const (
	NumOfReader = 200
	NumOfWriter = 100
)
$ go test -bench=.
goos: darwin
goarch: amd64
pkg: github.com/wenjianzhang/golearning/src/ch46/maps
BenchmarkSyncmap/map_with_RWLock-4                   100          10347642 ns/op
BenchmarkSyncmap/sync.map-4                          200           8800578 ns/op
BenchmarkSyncmap/concurrent_map-4                    300           4990815 ns/op
PASS
ok      github.com/wenjianzhang/golearning/src/ch46/maps        6.325s

concurrent_map is again the fastest.

Overwhelmingly read-heavy

const (
	NumOfReader = 10
	NumOfWriter = 1
)
$ go test -bench=.
goos: darwin
goarch: amd64
pkg: github.com/wenjianzhang/golearning/src/ch46/maps
BenchmarkSyncmap/map_with_RWLock-4                  5000            243022 ns/op
BenchmarkSyncmap/sync.map-4                        10000            111173 ns/op
BenchmarkSyncmap/concurrent_map-4                  10000            176643 ns/op
PASS
ok      github.com/wenjianzhang/golearning/src/ch46/maps        5.043s

In this case sync.Map outperforms the others.

Summary
