INGOREME: Hnsw index #21186

Draft
wants to merge 171 commits into
base: main
171 commits
2773950
datalink return int64
cpegeric Jan 9, 2025
aa01e64
compile usearch
cpegeric Jan 10, 2025
5652837
update Makefile
cpegeric Jan 10, 2025
61efd95
update
cpegeric Jan 10, 2025
4dd9b74
update
cpegeric Jan 10, 2025
615ea54
update
cpegeric Jan 10, 2025
2554068
add thirdparties lib path
cpegeric Jan 10, 2025
8f104ff
update
cpegeric Jan 10, 2025
d4326f5
fix docker
cpegeric Jan 10, 2025
e8abc9d
Merge branch 'main' into hnsw_index
mergify[bot] Jan 10, 2025
e86ed1e
ignore license check
cpegeric Jan 10, 2025
5330453
Merge branch 'hnsw_index' of github.com:cpegeric/matrixone into hnsw_…
cpegeric Jan 10, 2025
53d7ea7
update test
cpegeric Jan 10, 2025
a2d0d7c
license
cpegeric Jan 10, 2025
cc40f44
usearch testing
cpegeric Jan 10, 2025
69983d1
fix ut
cpegeric Jan 10, 2025
1a34e07
ut depends on cgo thirdparties
cpegeric Jan 10, 2025
2f9a252
add static library
cpegeric Jan 10, 2025
d4408e3
musl not compatiable with g++
cpegeric Jan 10, 2025
566b0c5
musl
cpegeric Jan 13, 2025
0e60220
musl bug fix
cpegeric Jan 13, 2025
f200286
comments
cpegeric Jan 13, 2025
f61eac5
Merge branch 'main' into hnsw_index
cpegeric Jan 13, 2025
283a712
musl update
cpegeric Jan 14, 2025
e7009f2
Merge branch 'hnsw_index' of github.com:cpegeric/matrixone into hnsw_…
cpegeric Jan 14, 2025
34d92ae
add DEBUG_OPT
cpegeric Jan 14, 2025
fe3c33e
revert DEBUG_OPT
cpegeric Jan 14, 2025
c4f6a37
parser support hnsw index
cpegeric Jan 14, 2025
d16d7c3
Merge branch 'main' into hnsw_index
cpegeric Jan 14, 2025
c938f9c
update
cpegeric Jan 15, 2025
57fd28c
update x86_64 cmake
cpegeric Jan 15, 2025
c5b4169
x86_64 optimize with simsimd
cpegeric Jan 15, 2025
754fde1
Makefile
cpegeric Jan 15, 2025
dc7eaaa
update
cpegeric Jan 15, 2025
f604dd1
update
cpegeric Jan 15, 2025
0148c42
update
cpegeric Jan 15, 2025
96dcd88
update CC
cpegeric Jan 15, 2025
0a30324
disable simsimd
cpegeric Jan 15, 2025
9c6f5e1
disable simsimd for now
cpegeric Jan 15, 2025
ed7e9df
update
cpegeric Jan 15, 2025
5fa6289
Merge branch 'main' into hnsw_dev
cpegeric Jan 15, 2025
1a7fd43
update
cpegeric Jan 15, 2025
70467df
compile code update
cpegeric Jan 15, 2025
762eb05
bug fix
cpegeric Jan 16, 2025
ff587c4
update
cpegeric Jan 16, 2025
2ff6e02
bug fix table type name crashed with ivfflat table type name
cpegeric Jan 16, 2025
7628429
quantization
cpegeric Jan 16, 2025
f53cb3d
update
cpegeric Jan 16, 2025
86d9151
Merge branch 'main' into hnsw_dev
cpegeric Jan 16, 2025
e481029
table_function call end() when end of input
cpegeric Jan 16, 2025
07d2873
apply end
cpegeric Jan 17, 2025
06e103b
table_function
cpegeric Jan 17, 2025
c5bfb42
update makefile
cpegeric Jan 17, 2025
1fad511
enable openmp on linux
cpegeric Jan 17, 2025
9e131ab
check avx512fp16
cpegeric Jan 17, 2025
8b4419c
check avx512fp16
cpegeric Jan 17, 2025
70c026f
checkt avx512fp16
cpegeric Jan 17, 2025
2e1bf6e
Merge branch 'main' into hnsw_index
cpegeric Jan 17, 2025
5ecdebd
merge fix
cpegeric Jan 17, 2025
098411b
table function
cpegeric Jan 17, 2025
f4d8dc3
add types.go
cpegeric Jan 17, 2025
0893107
update
cpegeric Jan 17, 2025
a8886dd
check type when init
cpegeric Jan 17, 2025
80266cd
create index
cpegeric Jan 20, 2025
f9ed8f7
update
cpegeric Jan 20, 2025
d93552f
usearch index config
cpegeric Jan 20, 2025
4503a83
update
cpegeric Jan 20, 2025
63d87f0
update
cpegeric Jan 20, 2025
117f5d5
checksum
cpegeric Jan 20, 2025
64e7bb5
update
cpegeric Jan 20, 2025
cd9ec9a
update
cpegeric Jan 20, 2025
77001af
add cache
cpegeric Jan 21, 2025
f58eead
update
cpegeric Jan 21, 2025
e1a1c7c
update
cpegeric Jan 21, 2025
1ae70a0
update
cpegeric Jan 21, 2025
d9d1ecd
update
cpegeric Jan 21, 2025
c152363
update
cpegeric Jan 21, 2025
5ada712
update
cpegeric Jan 21, 2025
bd9270c
sync.Once
cpegeric Jan 21, 2025
b7ce43d
update
cpegeric Jan 21, 2025
a13ca24
expired
cpegeric Jan 21, 2025
b0857fe
update
cpegeric Jan 21, 2025
897d543
update
cpegeric Jan 21, 2025
a9a3aca
update
cpegeric Jan 21, 2025
42fa247
update
cpegeric Jan 21, 2025
b239c5b
delete from map when Load from database failed
cpegeric Jan 21, 2025
ac155fc
update
cpegeric Jan 21, 2025
48eef00
remove test
cpegeric Jan 21, 2025
7519be1
update
cpegeric Jan 21, 2025
89f519f
update
cpegeric Jan 21, 2025
3b64525
defer ticker.Stop
cpegeric Jan 21, 2025
025d180
update
cpegeric Jan 22, 2025
146711b
search
cpegeric Jan 22, 2025
52792e6
check house keep
cpegeric Jan 22, 2025
a664c09
time in micro
cpegeric Jan 22, 2025
ac21d2f
checksum
cpegeric Jan 22, 2025
42ed6af
thread safe heap
cpegeric Jan 22, 2025
45537f9
int64 id
cpegeric Jan 22, 2025
c0806e9
search with multiple threads
cpegeric Jan 22, 2025
192a8cf
l2sq always positive
cpegeric Jan 22, 2025
5796cf8
unixmicro
cpegeric Jan 22, 2025
ac72b40
add LastUpdate
cpegeric Jan 22, 2025
3efdaf7
remove cache by key
cpegeric Jan 22, 2025
974a0b2
destroy remove all keys
cpegeric Jan 22, 2025
58c6685
bug fix erros.Join
cpegeric Jan 22, 2025
84caf0b
generalized vector index cache
cpegeric Jan 23, 2025
0590238
update
cpegeric Jan 23, 2025
e055e04
cleanup
cpegeric Jan 23, 2025
de53357
cleanup
cpegeric Jan 23, 2025
d43cbe5
cleanup
cpegeric Jan 23, 2025
a676937
cleanup
cpegeric Jan 23, 2025
07bf903
update
cpegeric Jan 23, 2025
1ba7762
cleanup
cpegeric Jan 23, 2025
5fec6ea
comments
cpegeric Jan 23, 2025
334c406
update comments
cpegeric Jan 23, 2025
0aed0de
comments
cpegeric Jan 23, 2025
d8d8036
limit number of threads for search mini-indexes
cpegeric Jan 23, 2025
914418d
bug fix
cpegeric Jan 23, 2025
a57863e
fix indexes is empty
cpegeric Jan 23, 2025
ccbfb38
rename
cpegeric Jan 23, 2025
f570782
cleanup
cpegeric Jan 23, 2025
4d47db9
mocksearch
cpegeric Jan 23, 2025
a2541d5
cache expired
cpegeric Jan 23, 2025
ea0b729
bug fix
cpegeric Jan 23, 2025
88f4b59
more tests
cpegeric Jan 23, 2025
a06d54a
bug fix remove GetIndex
cpegeric Jan 23, 2025
24b0351
race test
cpegeric Jan 23, 2025
d83c819
cleanup
cpegeric Jan 24, 2025
271235b
test
cpegeric Jan 24, 2025
8284468
check status
cpegeric Jan 24, 2025
dd828f3
remove lock from expired
cpegeric Jan 24, 2025
490add5
update tests
cpegeric Jan 24, 2025
1d519f6
add hnsw test
cpegeric Jan 24, 2025
4dc919d
add index file
cpegeric Jan 24, 2025
05b0109
tests
cpegeric Jan 24, 2025
aa7a7bb
update
cpegeric Jan 24, 2025
08bc50b
test Once
cpegeric Jan 24, 2025
6a9c34e
add build test
cpegeric Jan 24, 2025
2800609
stub newHnswAlgo
cpegeric Jan 24, 2025
1a52395
hnsw_create test
cpegeric Jan 24, 2025
d3eb0e6
TODO load file
cpegeric Jan 25, 2025
306eccf
fallocate
cpegeric Jan 25, 2025
54cf8bd
add filesize and load chunk without order by
cpegeric Jan 25, 2025
75a42bc
fix tests
cpegeric Jan 25, 2025
e28845a
update config
cpegeric Jan 27, 2025
03ad23c
hnsw_search
cpegeric Jan 27, 2025
78d250a
ef_search
cpegeric Jan 27, 2025
2befcc6
ef_search
cpegeric Jan 27, 2025
190f34a
ef_search
cpegeric Jan 27, 2025
fa59add
remove ef_search sys var
cpegeric Jan 27, 2025
5b50e29
apply index
cpegeric Jan 27, 2025
2c9b492
update
cpegeric Jan 28, 2025
2ef55cb
update
cpegeric Jan 28, 2025
237f173
working
cpegeric Jan 28, 2025
785ca2d
check limit
cpegeric Jan 28, 2025
3653963
Merge branch 'hnsw_dev' into hnsw_index
cpegeric Jan 28, 2025
3b00137
sca test
cpegeric Jan 28, 2025
6d4731f
Merge branch 'hnsw_dev' into hnsw_index
cpegeric Jan 28, 2025
5d9c883
sca test
cpegeric Jan 28, 2025
f83a202
Merge branch 'hnsw_dev' into hnsw_index
cpegeric Jan 28, 2025
0e30165
sca
cpegeric Jan 28, 2025
52483c0
Merge branch 'hnsw_dev' into hnsw_index
cpegeric Jan 28, 2025
1bdf77c
Merge branch 'main' into hnsw_index
cpegeric Jan 28, 2025
3bf59a6
Merge branch 'main' into hnsw_dev
cpegeric Jan 28, 2025
5795fff
merge fix
cpegeric Jan 30, 2025
2ecb896
Merge branch 'hnsw_dev' into hnsw_index
cpegeric Jan 30, 2025
9213dcb
update cross apply
cpegeric Jan 31, 2025
f151772
remove cross apply
cpegeric Jan 31, 2025
3a8a028
search test
cpegeric Jan 31, 2025
b87300a
reindex
cpegeric Jan 31, 2025
4b9f9b5
comment unused code
cpegeric Jan 31, 2025
1 change: 1 addition & 0 deletions .licenserc.yml
@@ -14,6 +14,7 @@ header:
- 'test/'
- 'optools/'
- 'cgo/external/'
- 'thirdparties/'
- 'pkg/frontend/test/'
- 'pkg/vm/engine/tae/mergesort/sort.go'
- 'pkg/vm/engine/tae/mergesort/heap.go'
61 changes: 41 additions & 20 deletions Makefile
@@ -41,17 +41,27 @@
# where am I
ROOT_DIR = $(shell dirname $(realpath $(lastword $(MAKEFILE_LIST))))
BIN_NAME := mo-service
UNAME_S := $(shell uname -s)
UNAME_S := $(shell uname -s | tr A-Z a-z)
UNAME_M := $(shell uname -m)
GOPATH := $(shell go env GOPATH)
GO_VERSION=$(shell go version)
BRANCH_NAME=$(shell git rev-parse --abbrev-ref HEAD)
LAST_COMMIT_ID=$(shell git rev-parse --short HEAD)
BUILD_TIME=$(shell date +%s)
MO_VERSION=$(shell git symbolic-ref -q --short HEAD || git describe --tags --exact-match)
GO_MODULE=$(shell go list -m)
MUSL_DIR=$(ROOT_DIR)/musl
MUSL_CC=$(MUSL_DIR)/bin/musl-gcc
MUSL_VERSION:=1.2.5

# check the MUSL_TARGET from https://musl.cc
# make MUSL_TARGET=aarch64-linux musl to cross make the aarch64 linux executable
ifeq ($(MUSL_TARGET),)
MUSL_TARGET=$(UNAME_M)-$(UNAME_S)
#MUSL_TARGET=x86_64-linux
endif
MUSL_NAME=$(MUSL_TARGET)-musl-cross
MUSL_DIR=$(ROOT_DIR)/$(MUSL_NAME)
MUSL_TAR=$(MUSL_NAME).tgz
MUSL_CC=$(MUSL_DIR)/bin/$(MUSL_TARGET)-musl-gcc
MUSL_CXX=$(MUSL_DIR)/bin/$(MUSL_TARGET)-musl-g++

# cross compilation has been disabled for now
ifneq ($(GOARCH)$(TARGET_ARCH)$(GOOS)$(TARGET_OS),)
@@ -95,11 +105,12 @@ pb: vendor-build generate-pb fmt
# build mo-service
###############################################################################

THIRDPARTIES_INSTALL_DIR=$(ROOT_DIR)/thirdparties/install
RACE_OPT :=
DEBUG_OPT :=
CGO_DEBUG_OPT :=
CGO_OPTS :=
GOLDFLAGS=-ldflags="-X '$(GO_MODULE)/pkg/version.GoVersion=$(GO_VERSION)' -X '$(GO_MODULE)/pkg/version.BranchName=$(BRANCH_NAME)' -X '$(GO_MODULE)/pkg/version.CommitID=$(LAST_COMMIT_ID)' -X '$(GO_MODULE)/pkg/version.BuildTime=$(BUILD_TIME)' -X '$(GO_MODULE)/pkg/version.Version=$(MO_VERSION)'"
CGO_OPTS :=CGO_CFLAGS="-I$(THIRDPARTIES_INSTALL_DIR)/include"
GOLDFLAGS=-ldflags="-extldflags '-L$(THIRDPARTIES_INSTALL_DIR)/lib -Wl,-rpath,$(THIRDPARTIES_INSTALL_DIR)/lib' -X '$(GO_MODULE)/pkg/version.GoVersion=$(GO_VERSION)' -X '$(GO_MODULE)/pkg/version.BranchName=$(BRANCH_NAME)' -X '$(GO_MODULE)/pkg/version.CommitID=$(LAST_COMMIT_ID)' -X '$(GO_MODULE)/pkg/version.BuildTime=$(BUILD_TIME)' -X '$(GO_MODULE)/pkg/version.Version=$(MO_VERSION)'"
TAGS :=

ifeq ($(GOBUILD_OPT),)
@@ -110,42 +121,51 @@ endif
cgo:
@(cd cgo; ${MAKE} ${CGO_DEBUG_OPT})

.PHONY: thirdparties
thirdparties:
@(cd thirdparties; ${MAKE})

# build mo-service binary
.PHONY: build
build: config
build: config cgo thirdparties
$(info [Build binary])
$(CGO_OPTS) go build $(TAGS) $(RACE_OPT) $(GOLDFLAGS) $(DEBUG_OPT) $(GOBUILD_OPT) -o $(BIN_NAME) ./cmd/mo-service

# https://wiki.musl-libc.org/getting-started.html
# https://musl.cc/
.PHONY: musl-install
musl-install:
ifeq ("$(UNAME_S)","Linux")
ifeq ("$(UNAME_S)","linux")
ifeq ("$(wildcard $(MUSL_CC))","")
@rm -rf /tmp/musl-$(MUSL_VERSION) musl-$(MUSL_VERSION).tar.gz
@curl -SfL "https://musl.libc.org/releases/musl-$(MUSL_VERSION).tar.gz" -o /tmp/musl-$(MUSL_VERSION).tar.gz
@tar -zxf /tmp/musl-$(MUSL_VERSION).tar.gz -C $(ROOT_DIR)
@cd musl-$(MUSL_VERSION) && ./configure --prefix=$(MUSL_DIR) --syslibdir=$(MUSL_DIR)/syslib && $(MAKE) && $(MAKE) install
@rm -rf musl-$(MUSL_VERSION) /tmp/musl-$(MUSL_VERSION).tar.gz
@rm -rf /tmp/$(MUSL_TAR)
@echo "https://musl.cc/${MUSL_TAR}"
@curl -SfL "https://musl.cc/$(MUSL_TAR)" -o /tmp/$(MUSL_TAR)
@tar -zxf /tmp/$(MUSL_TAR) -C $(ROOT_DIR)
endif
endif

.PHONY: musl-cgo
musl-cgo: musl-install
@(cd $(ROOT_DIR)/cgo; CC=$(MUSL_CC) ${MAKE} ${CGO_DEBUG_OPT})


musl-thirdparties: musl-install
@(cd $(ROOT_DIR)/thirdparties; MUSL=ON CC=$(MUSL_CC) CXX=$(MUSL_CXX) ${MAKE} ${CGO_DEBUG_OPT})

.PHONY: musl
musl: override CGO_OPTS += CC=$(MUSL_CC)
musl: override GOLDFLAGS:=-ldflags="--linkmode 'external' --extldflags '-static' -X '$(GO_MODULE)/pkg/version.GoVersion=$(GO_VERSION)' -X '$(GO_MODULE)/pkg/version.BranchName=$(BRANCH_NAME)' -X '$(GO_MODULE)/pkg/version.CommitID=$(LAST_COMMIT_ID)' -X '$(GO_MODULE)/pkg/version.BuildTime=$(BUILD_TIME)' -X '$(GO_MODULE)/pkg/version.Version=$(MO_VERSION)'"
musl: override GOLDFLAGS:=-ldflags="--linkmode 'external' --extldflags '-static -L$(THIRDPARTIES_INSTALL_DIR)/lib -lstdc++ -Wl,-rpath,$(THIRDPARTIES_INSTALL_DIR)/lib' -X '$(GO_MODULE)/pkg/version.GoVersion=$(GO_VERSION)' -X '$(GO_MODULE)/pkg/version.BranchName=$(BRANCH_NAME)' -X '$(GO_MODULE)/pkg/version.CommitID=$(LAST_COMMIT_ID)' -X '$(GO_MODULE)/pkg/version.BuildTime=$(BUILD_TIME)' -X '$(GO_MODULE)/pkg/version.Version=$(MO_VERSION)'"
musl: override TAGS := -tags musl
musl: musl-install musl-cgo config
musl: musl-install musl-cgo config musl-thirdparties
musl:
$(info [Build binary(musl)])
$(CGO_OPTS) go build $(TAGS) $(RACE_OPT) $(GOLDFLAGS) $(DEBUG_OPT) $(GOBUILD_OPT) -o $(BIN_NAME) ./cmd/mo-service

# build mo-tool
.PHONY: mo-tool
mo-tool: config
mo-tool: config cgo thirdparties
$(info [Build mo-tool tool])
$(CGO_OPTS) go build -o mo-tool ./cmd/mo-tool
$(CGO_OPTS) go build $(GOLDFLAGS) -o mo-tool ./cmd/mo-tool

# build mo-service binary for debugging with go's race detector enabled
# produced executable is 10x slower and consumes much more memory
@@ -162,9 +182,9 @@ debug: build
# Excluding frontend test cases temporarily
# Argument SKIP_TEST to skip a specific go test
.PHONY: ut
ut: config
ut: config cgo thirdparties
$(info [Unit testing])
ifeq ($(UNAME_S),Darwin)
ifeq ($(UNAME_S),darwin)
@cd optools && ./run_ut.sh UT $(SKIP_TEST)
else
@cd optools && timeout 60m ./run_ut.sh UT $(SKIP_TEST)
@@ -226,8 +246,9 @@ clean:
rm -f $(BIN_NAME)
rm -rf $(ROOT_DIR)/vendor
rm -rf $(MUSL_DIR)
rm -rf /tmp/musl-$(MUSL_VERSION).tar.gz
rm -rf /tmp/$(MUSL_TAR)
$(MAKE) -C cgo clean
$(MAKE) -C thirdparties clean

###############################################################################
# static checks
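The musl variables in the Makefile diff above chain into each other, so a worked expansion may help when reviewing the download URL and compiler paths. The following is a minimal Go sketch of that expansion only; `muslToolchain` and `resolveMusl` are hypothetical names for illustration and are not part of the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// muslToolchain mirrors the Makefile variables MUSL_NAME, MUSL_TAR,
// MUSL_CC and MUSL_CXX, plus the URL passed to curl in musl-install.
// The type itself is hypothetical and exists only for this illustration.
type muslToolchain struct {
	Name, Tar, URL, CC, CXX string
}

// resolveMusl expands the variables in the same order as the Makefile.
// An empty target falls back to "<uname -m>-<lowercased uname -s>",
// which is what MUSL_TARGET=$(UNAME_M)-$(UNAME_S) yields after the
// `tr A-Z a-z` change to UNAME_S.
func resolveMusl(rootDir, target, unameM, unameS string) muslToolchain {
	if target == "" {
		target = unameM + "-" + strings.ToLower(unameS)
	}
	name := target + "-musl-cross" // MUSL_NAME
	dir := rootDir + "/" + name    // MUSL_DIR
	tar := name + ".tgz"           // MUSL_TAR
	return muslToolchain{
		Name: name,
		Tar:  tar,
		URL:  "https://musl.cc/" + tar,
		CC:   dir + "/bin/" + target + "-musl-gcc",
		CXX:  dir + "/bin/" + target + "-musl-g++",
	}
}

func main() {
	// Native build on an x86_64 Linux host.
	native := resolveMusl("/src/matrixone", "", "x86_64", "Linux")
	fmt.Println(native.URL)
	fmt.Println(native.CC)

	// Cross build, as in `make MUSL_TARGET=aarch64-linux musl`.
	cross := resolveMusl("/src/matrixone", "aarch64-linux", "x86_64", "Linux")
	fmt.Println(cross.CXX)
}
```

The sketch also shows why the `ifeq` checks in the diff now compare against `linux` and `darwin`: `UNAME_S` is lowercased before it is used anywhere else.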
2 changes: 2 additions & 0 deletions go.mod
@@ -28,6 +28,7 @@ require (
github.com/confluentinc/confluent-kafka-go/v2 v2.4.0
github.com/containerd/cgroups/v3 v3.0.1
github.com/cpegeric/pdftotext-go v0.0.0-20241112123704-49cb86a3790e
github.com/detailyang/go-fallocate v0.0.0-20180908115635-432fa640bd2e
github.com/docker/go-units v0.5.0
github.com/dolthub/maphash v0.1.0
github.com/dslipak/pdf v0.0.2
@@ -81,6 +82,7 @@ require (
github.com/ti-mo/netfilter v0.5.2
github.com/tidwall/btree v1.7.0
github.com/tidwall/pretty v1.2.1
github.com/unum-cloud/usearch/golang v0.0.0-20250109115700-cfb5132dfb5a
go.uber.org/automaxprocs v1.5.3
go.uber.org/ratelimit v0.2.0
go.uber.org/zap v1.24.0
4 changes: 4 additions & 0 deletions go.sum
@@ -191,6 +191,8 @@ github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ3
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/detailyang/go-fallocate v0.0.0-20180908115635-432fa640bd2e h1:lj77EKYUpYXTd8CD/+QMIf8b6OIOTsfEBSXiAzuEHTU=
github.com/detailyang/go-fallocate v0.0.0-20180908115635-432fa640bd2e/go.mod h1:3ZQK6DMPSz/QZ73jlWxBtUhNA8xZx7LzUFSq/OfP8vk=
github.com/dgraph-io/badger v1.6.0/go.mod h1:zwt7syl517jmP8s94KqSxTlM6IMsdhYy6psNgSztDR4=
github.com/dgrijalva/jwt-go v3.2.0+incompatible/go.mod h1:E3ru+11k8xSBh+hMPgOLZmtrrCbhqsmaPHjLKYnJCaQ=
github.com/dgryski/go-farm v0.0.0-20190423205320-6a90982ecee2/go.mod h1:SqUrOPUnsFjfmXRMNPybcSiG0BgUW2AuFH8PAnS2iTw=
@@ -801,6 +803,8 @@ github.com/ugorji/go v1.1.4/go.mod h1:uQMGLiO92mf5W77hV/PUCpI3pbzQx3CRekS0kk+RGr
github.com/ugorji/go v1.1.7/go.mod h1:kZn38zHttfInRq0xu/PH0az30d+z6vm202qpg1oXVMw=
github.com/ugorji/go/codec v0.0.0-20181204163529-d75b2dcb6bc8/go.mod h1:VFNgLljTbGfSG7qAOspJ7OScBnGdDN/yBr0sguwnwf0=
github.com/ugorji/go/codec v1.1.7/go.mod h1:Ax+UKWsSmolVDwsd+7N3ZtXu+yMGCf907BLYF3GoBXY=
github.com/unum-cloud/usearch/golang v0.0.0-20250109115700-cfb5132dfb5a h1:pFQofxPrJxZWs1bgfTv4AdCLi+fmG1kIy5QTgdpCJok=
github.com/unum-cloud/usearch/golang v0.0.0-20250109115700-cfb5132dfb5a/go.mod h1:NxBpQibuBBeA/V8RGbrNzVAv4OyWWL5yNao7mVz656k=
github.com/urfave/negroni v1.0.0/go.mod h1:Meg73S6kFm/4PpbYdq35yYWoCZ9mS/YSx+lKnmiohz4=
github.com/valyala/bytebufferpool v1.0.0/go.mod h1:6bBcMArwyJ5K/AmCkWv1jt77kVWyCJ6HpOuEn7z0Csc=
github.com/valyala/fasthttp v1.6.0/go.mod h1:FstJa9V+Pj9vQ7OJie2qMHdwemEDaDiSdBnvPM1Su9w=
Expand Down
3 changes: 2 additions & 1 deletion optools/compose_bvt/Dockerfile.tester
@@ -8,5 +8,6 @@ COPY . /matrixone
RUN cd mo-tester && sed -i 's/127.0.0.1/cn0/g' mo.yml

ENV LC_ALL 'C.UTF-8'
ENV LD_LIBRARY_PATH '/matrixone/thirdparties/install/lib'

CMD ["/matrixone/optools/compose_bvt/entrypoint.sh"]
CMD ["/matrixone/optools/compose_bvt/entrypoint.sh"]
3 changes: 3 additions & 0 deletions optools/images/Dockerfile
@@ -21,8 +21,11 @@ RUN make build

FROM ubuntu:22.04

ENV LD_LIBRARY_PATH '/lib'

COPY --from=builder /go/src/github.com/matrixorigin/matrixone/mo-service /mo-service
COPY --from=builder /go/src/github.com/matrixorigin/matrixone/etc /etc
COPY --from=builder /go/src/github.com/matrixorigin/matrixone/thirdparties/install/lib /lib

# Install some utilities used for debugging or by startup script
RUN apt-get update && apt-get install -y \
6 changes: 4 additions & 2 deletions optools/run_ut.sh
@@ -94,13 +94,15 @@ function run_tests(){
logger "INF" "Ingore code coverage $(echo ${leave_out[@]}|tr " " "|")"
local cover_profile='profile.raw'
make cgo
make thirdparties
THIRDPARTIES_INSTALL_DIR=${BUILD_WKSP}/thirdparties/install
if [[ $SKIP_TESTS == 'race' ]]; then
logger "INF" "Run UT without race check"
CGO_CFLAGS="-I${BUILD_WKSP}/cgo" CGO_LDFLAGS="-L${BUILD_WKSP}/cgo -lmo" go test -short -v -json -tags matrixone_test -p ${UT_PARALLEL} -timeout "${UT_TIMEOUT}m" $test_scope | tee $UT_REPORT
CGO_CFLAGS="-I${BUILD_WKSP}/cgo -I${THIRDPARTIES_INSTALL_DIR}/include" CGO_LDFLAGS="-Wl,-rpath,${THIRDPARTIES_INSTALL_DIR}/lib -L${THIRDPARTIES_INSTALL_DIR}/lib -L${BUILD_WKSP}/cgo -lmo" go test -short -v -json -tags matrixone_test -p ${UT_PARALLEL} -timeout "${UT_TIMEOUT}m" $test_scope | tee $UT_REPORT

else
logger "INF" "Run UT with race check"
CGO_CFLAGS="-I${BUILD_WKSP}/cgo" CGO_LDFLAGS="-L${BUILD_WKSP}/cgo -lmo" go test -short -v -json -tags matrixone_test -p ${UT_PARALLEL} -timeout "${UT_TIMEOUT}m" -race $test_scope | tee $UT_REPORT
CGO_CFLAGS="-I${BUILD_WKSP}/cgo -I${THIRDPARTIES_INSTALL_DIR}/include" CGO_LDFLAGS="-Wl,-rpath,${THIRDPARTIES_INSTALL_DIR}/lib -L${THIRDPARTIES_INSTALL_DIR}/lib -L${BUILD_WKSP}/cgo -lmo" go test -short -v -json -tags matrixone_test -p ${UT_PARALLEL} -timeout "${UT_TIMEOUT}m" -race $test_scope | tee $UT_REPORT
fi
}

78 changes: 78 additions & 0 deletions pkg/catalog/secondary_index_utils.go
@@ -22,6 +22,7 @@ import (

"github.com/matrixorigin/matrixone/pkg/common/moerr"
"github.com/matrixorigin/matrixone/pkg/sql/parsers/tree"
"github.com/matrixorigin/matrixone/pkg/vectorindex"
)

// Index Algorithm names
@@ -31,6 +32,7 @@ const (
MoIndexIvfFlatAlgo = tree.INDEX_TYPE_IVFFLAT // used for IVF flat index on Vector/Array columns
MOIndexMasterAlgo = tree.INDEX_TYPE_MASTER // used for Master Index on VARCHAR columns
MOIndexFullTextAlgo = tree.INDEX_TYPE_FULLTEXT // used for Fulltext Index on VARCHAR columns
MoIndexHnswAlgo = tree.INDEX_TYPE_HNSW // used for HNSW Index on Vector/Array columns
)

// ToLower is used for before comparing AlgoType and IndexAlgoParamOpType. Reason why they are strings
@@ -69,13 +71,22 @@ func IsFullTextIndexAlgo(algo string) bool {
return _algo == MOIndexFullTextAlgo.ToString()
}

func IsHnswIndexAlgo(algo string) bool {
_algo := ToLower(algo)
return _algo == MoIndexHnswAlgo.ToString()
}

// ------------------------[START] IndexAlgoParams------------------------
const (
IndexAlgoParamLists = "lists"
IndexAlgoParamOpType = "op_type"
IndexAlgoParamOpType_l2 = "vector_l2_ops"
//IndexAlgoParamOpType_ip = "vector_ip_ops"
//IndexAlgoParamOpType_cos = "vector_cosine_ops"
HnswM = "m"
HnswEfConstruction = "ef_construction"
HnswQuantization = "quantization"
HnswEfSearch = "ef_search"
)

const (
@@ -119,6 +130,27 @@ func IndexParamsToStringList(indexParams string) (string, error) {
res += fmt.Sprintf(" %s = %s ", IndexAlgoParamLists, val)
}

if val, ok := result[HnswM]; ok {
res += fmt.Sprintf(" %s = %s ", HnswM, val)
}

if val, ok := result[HnswEfConstruction]; ok {
res += fmt.Sprintf(" %s = %s ", HnswEfConstruction, val)
}

if val, ok := result[HnswEfSearch]; ok {
res += fmt.Sprintf(" %s = %s ", HnswEfSearch, val)
}

if val, ok := result[HnswQuantization]; ok {
val = ToLower(val)
_, ok := vectorindex.QuantizationValid(val)
if !ok {
return "", moerr.NewInternalErrorNoCtxf("invalid quantization '%s'", val)
}
res += fmt.Sprintf(" %s '%s' ", HnswQuantization, val)
}

if opType, ok := result[IndexAlgoParamOpType]; ok {
opType = ToLower(opType)
if opType != IndexAlgoParamOpType_l2 {
@@ -214,6 +246,52 @@ func indexParamsToMap(def interface{}) (map[string]string, error) {
return nil, moerr.NewInternalErrorNoCtx("invalid list. list must be > 0")
}

if len(idx.IndexOption.AlgoParamVectorOpType) > 0 {
opType := ToLower(idx.IndexOption.AlgoParamVectorOpType)
if opType != IndexAlgoParamOpType_l2 {
//opType != IndexAlgoParamOpType_ip &&
//opType != IndexAlgoParamOpType_cos &&

return nil, moerr.NewInternalErrorNoCtx(fmt.Sprintf("invalid op_type. not of type '%s'",
IndexAlgoParamOpType_l2))
//IndexAlgoParamOpType_ip, IndexAlgoParamOpType_cos,
}
res[IndexAlgoParamOpType] = idx.IndexOption.AlgoParamVectorOpType
} else {
res[IndexAlgoParamOpType] = IndexAlgoParamOpType_l2 // set l2 as default
}
case tree.INDEX_TYPE_HNSW:
if idx.IndexOption.HnswM < 0 {
return nil, moerr.NewInternalErrorNoCtx("invalid M. hnsw.M must be >= 0")
}
if idx.IndexOption.HnswEfConstruction < 0 {
return nil, moerr.NewInternalErrorNoCtx("invalid ef_construction. hnsw.ef_construction must be >= 0")
}
if idx.IndexOption.HnswEfSearch < 0 {
return nil, moerr.NewInternalErrorNoCtx("invalid ef_search. hnsw.ef_search must be >= 0")
}
if len(idx.IndexOption.HnswQuantization) > 0 {
_, ok := vectorindex.QuantizationValid(idx.IndexOption.HnswQuantization)
if !ok {
return nil, moerr.NewInternalErrorNoCtx("invalid hnsw quantization.")
}
}

// When HnswM, HnswEfConstruction or HnswEfSearch is 0, omit the key so that the usearch default value is used
if idx.IndexOption.HnswM > 0 {
res[HnswM] = strconv.FormatInt(idx.IndexOption.HnswM, 10)
}
if idx.IndexOption.HnswEfConstruction > 0 {
res[HnswEfConstruction] = strconv.FormatInt(idx.IndexOption.HnswEfConstruction, 10)
}
if idx.IndexOption.HnswEfSearch > 0 {
res[HnswEfSearch] = strconv.FormatInt(idx.IndexOption.HnswEfSearch, 10)
}

if len(idx.IndexOption.HnswQuantization) > 0 {
res[HnswQuantization] = idx.IndexOption.HnswQuantization
}

if len(idx.IndexOption.AlgoParamVectorOpType) > 0 {
opType := ToLower(idx.IndexOption.AlgoParamVectorOpType)
if opType != IndexAlgoParamOpType_l2 {
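For reviewers, the HNSW branch of `indexParamsToMap` and the matching section of `IndexParamsToStringList` can be read as one round trip: validate, drop unset values, then render in a fixed order. Below is a minimal standalone Go sketch of that flow. `hnswOptions`, `hnswParamsToMap`, `hnswParamsToString` and the `validQuantization` set are hypothetical stand-ins; the real accepted quantization names come from `vectorindex.QuantizationValid`, which this diff does not show. Unlike the diff, the sketch lowercases the quantization before storing it.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// hnswOptions is a hypothetical stand-in for the HNSW fields of tree.IndexOption.
type hnswOptions struct {
	M, EfConstruction, EfSearch int64
	Quantization                string
}

// validQuantization is an ASSUMED set used only for this sketch.
var validQuantization = map[string]bool{"f32": true, "f16": true, "i8": true}

// hnswParamsToMap validates the options and keeps only the values that
// were set. A value of 0 means "not set", so the key is omitted and the
// underlying library default applies.
func hnswParamsToMap(o hnswOptions) (map[string]string, error) {
	if o.M < 0 || o.EfConstruction < 0 || o.EfSearch < 0 {
		return nil, errors.New("m, ef_construction and ef_search must be >= 0")
	}
	res := make(map[string]string)
	if q := strings.ToLower(o.Quantization); q != "" {
		if !validQuantization[q] {
			return nil, fmt.Errorf("invalid quantization '%s'", q)
		}
		res["quantization"] = q
	}
	ints := map[string]int64{"m": o.M, "ef_construction": o.EfConstruction, "ef_search": o.EfSearch}
	for key, v := range ints {
		if v > 0 {
			res[key] = strconv.FormatInt(v, 10)
		}
	}
	return res, nil
}

// hnswParamsToString renders the map in a fixed key order, using the
// same " key = value " and " quantization 'value' " shapes as the diff.
func hnswParamsToString(p map[string]string) string {
	var sb strings.Builder
	for _, key := range []string{"m", "ef_construction", "ef_search"} {
		if v, ok := p[key]; ok {
			fmt.Fprintf(&sb, " %s = %s ", key, v)
		}
	}
	if v, ok := p["quantization"]; ok {
		fmt.Fprintf(&sb, " %s '%s' ", "quantization", v)
	}
	return sb.String()
}

func main() {
	params, err := hnswParamsToMap(hnswOptions{M: 48, EfSearch: 64, Quantization: "F16"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", hnswParamsToString(params))
}
```

Rendering from a fixed key list rather than ranging over the map keeps the output deterministic, which matters if the string ends up in `SHOW CREATE TABLE` style output that tests compare verbatim.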
21 changes: 21 additions & 0 deletions pkg/catalog/types.go
@@ -316,12 +316,14 @@ const (
const (
UniqueIndexSuffix = "unique_"
FullTextIndexSuffix = "fulltext_"
HnswIndexSuffix = "hnsw_"
SecondaryIndexSuffix = "secondary_"
PrefixIndexTableName = "__mo_index_"
IndexTableNamePrefix = PrefixIndexTableName
UniqueIndexTableNamePrefix = PrefixIndexTableName + UniqueIndexSuffix
SecondaryIndexTableNamePrefix = PrefixIndexTableName + SecondaryIndexSuffix
FullTextIndexTableNamePrefix = PrefixIndexTableName + FullTextIndexSuffix
HnswIndexTableNamePrefix = PrefixIndexTableName + HnswIndexSuffix

/************ 0. Regular Secondary Index ************/

@@ -367,6 +369,25 @@ const (
FullTextIndex_TabCol_Word = "word"
FullTextIndex_TabCol_Id = "doc_id"
FullTextIndex_TabCol_Position = "pos"

/************ 4. HNSW Index *************/

// HNSW Table Types
// NOTE: avoid duplicate TblType name with IVFFLAT or other index
Hnsw_TblType_Metadata = "hnsw_meta"
Hnsw_TblType_Storage = "hnsw_index"

// HNSW Storage - Column names
Hnsw_TblCol_Storage_Index_Id = "index_id"
Hnsw_TblCol_Storage_Chunk_Id = "chunk_id"
Hnsw_TblCol_Storage_Data = "data"
Hnsw_TblCol_Storage_Tag = "tag"

// HNSW Metadata - Column names
Hnsw_TblCol_Metadata_Index_Id = "index_id"
Hnsw_TblCol_Metadata_Timestamp = "timestamp"
Hnsw_TblCol_Metadata_Checksum = "checksum"
Hnsw_TblCol_Metadata_Filesize = "filesize"
)

const (
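The storage columns (`index_id`, `chunk_id`, `data`) and metadata columns (`timestamp`, `checksum`, `filesize`) above suggest that a serialized index file is stored as chunks and verified on load. The following Go sketch shows one way such a scheme could work. It is not the PR's implementation: the fixed chunk size, the SHA-256 checksum, the offset rule `chunk_id * chunkSize`, and all type and function names are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// metaRow and chunkRow are hypothetical row shapes modeled on the
// hnsw_meta and hnsw_index column names in the diff.
type metaRow struct {
	IndexID   string
	Timestamp int64 // assumed to be Unix microseconds
	Checksum  string
	Filesize  int64
}

type chunkRow struct {
	IndexID string
	ChunkID int64
	Data    []byte
}

// splitIndex cuts a serialized index into fixed-size chunks and records
// the total size and checksum. chunkSize must be > 0.
func splitIndex(indexID string, ts int64, file []byte, chunkSize int) (metaRow, []chunkRow) {
	sum := sha256.Sum256(file)
	meta := metaRow{IndexID: indexID, Timestamp: ts, Checksum: hex.EncodeToString(sum[:]), Filesize: int64(len(file))}
	var chunks []chunkRow
	for off, id := 0, int64(0); off < len(file); off, id = off+chunkSize, id+1 {
		end := off + chunkSize
		if end > len(file) {
			end = len(file)
		}
		chunks = append(chunks, chunkRow{IndexID: indexID, ChunkID: id, Data: file[off:end]})
	}
	return meta, chunks
}

// assembleIndex rebuilds the file from chunks in any order. Knowing the
// file size up front lets each chunk be copied to chunk_id * chunkSize,
// so the rows do not need to be sorted before loading.
func assembleIndex(meta metaRow, chunks []chunkRow, chunkSize int) ([]byte, error) {
	buf := make([]byte, meta.Filesize)
	var written int64
	for _, c := range chunks {
		off := c.ChunkID * int64(chunkSize)
		if c.ChunkID < 0 || off+int64(len(c.Data)) > meta.Filesize {
			return nil, errors.New("chunk outside of file bounds")
		}
		copy(buf[off:], c.Data)
		written += int64(len(c.Data))
	}
	if written != meta.Filesize {
		return nil, errors.New("missing or duplicate chunk data")
	}
	sum := sha256.Sum256(buf)
	if hex.EncodeToString(sum[:]) != meta.Checksum {
		return nil, errors.New("checksum mismatch")
	}
	return buf, nil
}

func main() {
	file := []byte("serialized hnsw index bytes")
	meta, chunks := splitIndex("idx-1", 1737936000000000, file, 8)
	// Simulate rows coming back from the table in arbitrary order.
	chunks[0], chunks[len(chunks)-1] = chunks[len(chunks)-1], chunks[0]
	got, err := assembleIndex(meta, chunks, 8)
	fmt.Println(err == nil, bytes.Equal(got, file), len(chunks))
}
```

Keeping the checksum and file size in a separate metadata row means a loader can reject a partially written or stale index before handing the bytes to the index library.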