title
对比Mysql学习ES

[TOC]

1. 对比Mysql学习ES

1.1. 概念

Elasticsearch：index--> type(7.X废除)-->doc--> field

MySQL: 数据库 --> 表 --> 行 --> 列

在ES 7.X的版本中，已经去掉了type的概念，一个index只有一个默认type，就是_doc。这样index可以理解为数据库或者表。

ES 实例：对应 MySQL 实例中的一个 Database。
Index 对应 MySQL 中的 Table 。
Document 对应 MySQL 中表的记录。

1.2. 常用命令

1.2.1. 常用数据类型

1.2.1.1. 文本类型

Mysql	描述	ES	描述
text	最大4G	text	长度不限制,默认分词,支持全文检索，不可以排序
varchar	最大65532字节，uft8下最大为21844字符	keyword	最大32766字节，默认不分词。只能通过精确值搜索到。如果用于过滤、排序、聚合，就设置为此类型

1.2.1.2. 数字类型

Mysql	描述	ES	描述
int	整数型，4字节取值范围(-2 147 483 648，2 147 483 647)	integer	整数型
bigint	整数型，8字节取值范围(-9,223,372,036,854,775,808，9 223 372 036 854 775 807)	long	整数型
float	浮点型，4字节	float	浮点型，4字节
double	浮点型，8字节	double	浮点型，8字节

1.2.1.3. 时间类型

Mysql	描述	ES	描述
datetime	8字节，取值范围1000-01-01 00:00:00/9999-12-31 23:59:59	date	日期，支持格式： yyyy-MM-dd HH:mm:ss yyyy-MM-dd epoch_millis（毫秒值）
timestamp	4字节，表示从1970-01-01 00:00:00到当前的时间秒数结束时间为结束时间是第 2147483647 秒，北京时间 2038-1-19 11:14:07 会随着时区变化，而变化	同上	同上

1.2.1.4. 布尔类型

Mysql	描述	ES	描述
tinyint(1)	tinyint，设置长度为1，就表示布尔类型值有：true、false、null	boolean	布尔类型，表示true、false

1.2.2. CRUD

1.2.2.1. 创建索引

1.2.2.1.1. Mysql

CREATE TABLE `表名`  (
  `字段名1` 字段类型(字段长度) COMMENT '字段描述',
  `字段名2` 字段类型(字段长度) DEFAULT 字段默认值 COMMENT '字段描述',
  PRIMARY KEY (`id`)
)

1.2.2.1.2. ES

PUT 索引名
{
  "mappings": {
    "properties": {
      "字段名":{
        "type": "字段数据类型",
        "index": true/false
      }
    }
  }
}

type 表示当前字段的数据类型
index 表示是否作为索引，可以作为搜索条件过滤，默认值为true

示例：
PUT car
{
  "mappings": {
    "properties": {
      "name":{
        "type": "text"
      },
      "brand":{
      	"type":"keyword"
      },
      "price":{
      	"type":"double"
      },
      "mic":{
      	"type":"boolean"
      },
      "image":{
      	"type":"keyword",
      	"index":false
      }
   }
 }
}

1.2.2.2. 查看索引-字段结构及设置

1.2.2.2.1. Mysql

desc 表名;

1.2.2.2.2. ES

GET 索引名

1.2.2.3. 修改索引字段

1.2.2.3.1. Mysql

ALTER TABLE 表名
ADD COLUMN 字段名 字段类型(字段长度);

1.2.2.3.2. ES

PUT 索引名/_mapping
{
	"properties":{
		"字段名":{
			"type":"字段类型"
		}
	}
}

1.2.2.4. 插入数据

1.2.2.4.1. Mysql

insert info 表名(字段1,字段2) values(字段值1,字段值2);

1.2.2.4.2. ES

POST 索引名/_doc
{
	"字段名":"字段值"
}

1.2.2.4.2.1. 批量插入数据

POST _bulk
{"index": {"_index": "索引名"}}
{"字段名":"字段值"}
{"index": {"_index": "索引名"}}
{"字段名":"字段值","字段名":"字段值"}

1.2.2.5. 更新数据-先删再加

这种更新方式，相当于先删除docment，再添加，之前的数据是无法保留的

POST 索引名/_doc/索引ID
{
	"字段名":"字段值"
}

1.2.2.6. 更新数据-update方式

数据保留，只修改指定字段的值

POST 索引名/_update/索引ID
{
	"doc":{
		"字段名":"字段值"
	}
}

1.2.2.7. 删除数据-根据文档ID删除

1.2.2.7.1. Mysql

delete from 表名 where id = 待删除id;

1.2.2.7.2. ES

DELETE 索引名/_doc/索引id

1.2.2.8. 删除数据-根据查询条件删除

1.2.2.8.1. Mysql

-- 删除表全部数据
delete from 表名;
-- 根据匹配字段值，删除数据
delete from 表名 where 字段1 = 字段1值;

1.2.2.8.2. ES

# 这是删除全部数据的方式
POST 索引名/_delete_by_query
{
  "query": {
    "match_all": {}
  }
}

1.2.2.9. 删除索引

1.2.2.9.1. Mysql

drop table 表名;

1.2.2.9.2. ES

DELETE 索引名

1.2.2.10. 批量删除索引

1.2.2.10.1. Mysql

drop table 表名1,表名2;

1.2.2.10.2. ES

多个索引删除，使用逗号分隔

DELETE 索引名1,索引名2

1.2.2.11. 查询

1.2.2.11.1. 查看所有的表

1.2.2.11.1.1. Mysql

SHOW TABLES;
或者
SELECT
	table_name,
	table_comment,
	table_rows 
FROM
	information_schema.`TABLES` 
WHERE
	table_schema = 'test';

1.2.2.11.1.2. ES

# 查看所有的索引
GET _cat/indices?v 
# 查看所有以c开头的索引
GET _cat/indices/c*?v

1.2.2.11.2. 无查询条件

1.2.2.11.2.1. Mysql

-- 类似于Mysql的全表查询
select * from 表名;

1.2.2.11.2.2. ES

GET 索引名/_search
{}

1.2.2.11.3. 单字段查询-精准查询

1.2.2.11.3.1. Mysql

-- 类似于Mysql的where 单字段查询
select * from 表名 where 字段名=字段值;

1.2.2.11.3.2. ES

GET 索引名/_search
{
  "query": {
    "match": {
      "字段名": "字段值"
    }
  }
}

1.2.2.11.4. 单字段查询-模糊查询

1.2.2.11.4.1. Mysql

-- 类似于Mysql的where 模糊查询  %表示任意多个字符，包含0个字符。_表示单个字符
select * from 表名 where 字段名 like '%字段值%';

1.2.2.11.4.2. ES

prefix：根据前缀查询

wildcard：通配符查询，

* 表示匹配任意个字符
? 表示匹配单个字符

# prefix前缀查询
GET 索引名/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "prefix": {
            "字段名": "字段值"
          }
        }
      ]
    }
  }
}

# wildcard查询
GET 索引名/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "wildcard": {
            "字段名": "*字段值*"
          }
        }
      ]
    }
  }
}

1.2.2.11.5. 多字段查询

1.2.2.11.5.1. Mysql

-- 类似于Mysql的where 多字段and查询
select * from 表名 where 字段名1 = 字段值1 and 字段值2 = 字段值2;

1.2.2.11.5.2. ES

GET 索引名/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "字段名1": "字段值1"
          }
        },{
          "match_phrase": {
            "字段名2": "字段值2"
          }
        }
      ]
    }
  }
}

1.2.2.11.6. 分页查询

1.2.2.11.6.1. Mysql

-- ${offset}表示偏移量，0表示第一行。${count}表示返回最大行数
select * from 表名 limit ${offset},${count};

1.2.2.11.6.2. ES

GET 索引名/_search
{
	"from": 开始位置，0表示第一条,
	"size": 查询文档数
}

1.2.2.11.7. 排序

1.2.2.11.7.1. Mysql

-- asc表示正序 desc表示倒序
select * from 表名 order by 字段 asc/desc;

1.2.2.11.7.2. ES

GET 索引名/_search
{
 "sort": [
   {
     "字段名": {
       "order": "排序类型（asc/desc）"
     }
   }
 ]
}

1.2.2.11.8. 只显示某些字段

1.2.2.11.8.1. Mysql

select 字段1,字段2 from 表名;

1.2.2.11.8.2. ES

GET 索引/_search
{
  "_source": ["字段1","字段2"]
}

1.2.2.11.9. 查询名词解释

1.2.2.11.9.1. query

查询关键字，类似于mysql的where关键字

1.2.2.11.9.2. bool

布尔过滤器，当需要使用多个过滤器时，只需要它们放到bool过滤器中即可。

bool由以下部分组成：

{
	"bool":{
		"must":[],
		"should":[],
		"must_not":[]
	}
}

must: 所有的语句都必须（must）匹配，类似于mysql中的AND
must_not：所有的语句都不能（must not）匹配，类似于mysql中的NOT
should：至少一个语句匹配，类似\于mysql中的OR

1.2.2.11.9.3. match

用于执行全文查询的标准查询，包括模糊匹配和短语或接近查询。

类似于等号查询，但是需要分查询字段是keyword类型，还是text类型。

text默认分词，所以使用match查询会根据查询条件分词查询，分词只要匹配到就展示出来。

keyword不分词，使用match就相当于精准匹配。

示例
GET car/_search
{
  "query": {
    "match": {
      "name": "马 X3"
    }
  }
}
结果：
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 3.219012,
    "hits" : [
      {
        "_index" : "car",
        "_type" : "_doc",
        "_id" : "4HUYfHgBkg6JQRd1IVc6",
        "_score" : 3.219012,
        "_source" : {
          "name" : "宝马 X3",
          "brand" : "宝马",
          "price" : 10.2,
          "mic" : false,
          "image" : ""
        }
      },
      {
        "_index" : "car",
        "_type" : "_doc",
        "_id" : "4XUYfHgBkg6JQRd1IVc6",
        "_score" : 1.3419306,
        "_source" : {
          "name" : "宝马 X5",
          "brand" : "宝马",
          "price" : 40.1,
          "mic" : false,
          "image" : ""
        }
      }
    ]
  }
}

1.2.2.11.9.4. match_phrase

match_phrase 称为短语搜索，要求所有的分词必须同时出现在文档中，同时位置必须紧邻一致。

1.2.2.11.9.5. match_phrase_prefix

与match_phrase查询类似，但是会对最后一个Token在倒排序索引列表中进行通配符搜索

1.2.2.11.9.6. multi_match

和match类似，可以匹配多个字段，多个字段是or的关系

1.2.2.11.9.7. query_string

和match_phrase类似，唯一区别的是，分词只要匹配上即可。不需要连续，顺序还可以调换。

类似于match、multi_match的集合，查看示例

OR表示两者取一，AND表示两者同时满足，NOT表示不是
# 查询车名 包含马或者长的。
GET car/_search
{
  "query": {
    "query_string": {
      "fields": ["name"],
      "query": "马 OR 长"
    }
  }
}

# 车名包含马，且包含X3的
GET car/_search
{
  "query": {
    "query_string": {
      "fields": ["name"],
      "query": "马 AND X3"
    }
  }
}

# 车名包含马，不包含X3的
GET car/_search
{
  "query": {
    "query_string": {
      "fields": ["name"],
      "query": "马 NOT X3"
    }
  }
}

1.2.2.11.9.8. range

区间查询，查询时间或者范围内查询

示例

# 查询价格10-20万的车
GET car/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 10,
        "lte": 20
      }
    }
  }
}

1.2.2.11.9.9. trem

term是代表完全匹配，也就是精确查询，搜索前不会再对搜索词进行分词拆解。term属于精确匹配，只能查单个词

示例：

# 精确匹配到，车型匹配为宝马的数据
GET car/_search
{
  "query": {
    "term": {
      "brand": {
        "value": "宝马"
      }
    }
  }
}

1.2.2.11.9.10. trems

类似于term，可以匹配多个单词,多个单词间表示或的意思

示例：

# 可以查询到宝马 X3 和宝马X5,因为是或的关系，所以都查询出来了
GET car/_search
{
  "query": {
    "terms": {
      "name": ["宝","X3"]
    }
  }
}

# 查询到宝马 X3，需要用到bool must
GET car/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "name": "宝"
          }
        },
        {
          "term": {
            "name": "马"
          }
        },
        {
          "term": {
            "name": "x3"
          }
        }
      ]
    }
  }
}
疑问点：为什么宝马要分开，X3 要写成x3,这是由分词决定的。查看分词逻辑
GET car/_analyze
{
  "text" : "宝马 X3"
}
分词结果：
{
  "tokens" : [
    {
      "token" : "宝",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "<IDEOGRAPHIC>",
      "position" : 0
    },
    {
      "token" : "马",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "<IDEOGRAPHIC>",
      "position" : 1
    },
    {
      "token" : "x3",
      "start_offset" : 3,
      "end_offset" : 5,
      "type" : "<ALPHANUM>",
      "position" : 2
    }
  ]
}

1.2.2.12. 分组查询

1.2.2.12.1. count查询

1.2.2.12.1.1. Mysql

select 分组字段,count(0) from 表名 group by 分组字段;

1.2.2.12.1.2. ES

使用aggs关键字

示例

# 获取各品牌车辆数量
select brand,count(0) from car group by brand;

# ES 统计各品牌车型数量
GET car/_search
{
  "size": 0, 
  "aggs": {
    "brandAgg": {
      "terms": {
        "field": "brand"
      }
    }
  }
}

size:0 的含义，不显示命中(hits)的文档信息

1.2.2.12.2. 指标聚合

max、min、sum、avg这几个函数功能和mysql一致，分别查询最大值，最小值，总数，平均值

示例

# 获取每个品牌最高的价格
GET car/_search
{
  "size": 0,
  "aggs": {
    "brandAgg": {
      "terms": {
        "field": "brand"
      },
      "aggs": {
        "maxPrice": {
          "max": {
            "field": "price"
          }
        }
      }
    }
  }
}

1.3. 分词

1.3.1. 分词插件

1.3.1.1. IK Analysis

https://github.com/medcl/elasticsearch-analysis-ik/releases，选择对应es版本的ik版本。

把插件zip解压到%ES_HOME%/plugins/ 文件夹下，然后重启es即可。注意：zip解压后需要删除

查看分词效果：

# 使用ik后的中文分词
GET _analyze
{
  "analyzer": "ik_smart",
  "text":"宝马 X5"
}
# 结果
{
  "tokens" : [
    {
      "token" : "宝马",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "x5",
      "start_offset" : 3,
      "end_offset" : 5,
      "type" : "LETTER",
      "position" : 1
    }
  ]
}

# 使用es默认的分词
GET _analyze
{
  "analyzer": "default",
  "text":"宝马 X5"
}
# 结果
{
  "tokens" : [
    {
      "token" : "宝",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "<IDEOGRAPHIC>",
      "position" : 0
    },
    {
      "token" : "马",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "<IDEOGRAPHIC>",
      "position" : 1
    },
    {
      "token" : "x5",
      "start_offset" : 3,
      "end_offset" : 5,
      "type" : "<ALPHANUM>",
      "position" : 2
    }
  ]
}

1.3.1.1.1. 创建索引时，指定分词逻辑

PUT 索引名
{
  "settings":{
  	"index.analysis.analyzer.default.type":"ik_smart"
  },
  "mappings": {
    "properties": {
      "字段名":{
        "type": "字段数据类型",
        "index": true/false
      }
    }
  }
}

2. 最后

2.1. 基础数据

2.1.1. Mysql


DROP TABLE IF EXISTS `car`;
CREATE TABLE `car`  (
  `id` int(10) NOT NULL AUTO_INCREMENT COMMENT '主键ID',
  `name` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL COMMENT '车型',
  `brand` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL COMMENT '品牌',
  `price` double(10, 2) NULL DEFAULT NULL COMMENT '价格',
  `mic` smallint(1) NULL DEFAULT NULL COMMENT '是否国产：1-是 0-否',
  `image` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL DEFAULT NULL COMMENT '照片',
  PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 9 CHARACTER SET = utf8mb4 COLLATE = utf8mb4_general_ci COMMENT = '汽车表' ROW_FORMAT = Dynamic;

INSERT INTO `car` VALUES (1, '奥迪 A4', '奥迪', 20.10, 0, NULL);
INSERT INTO `car` VALUES (2, '奥迪 A6', '奥迪', 30.10, 0, NULL);
INSERT INTO `car` VALUES (3, '宝马 X3', '宝马', 10.20, 0, NULL);
INSERT INTO `car` VALUES (4, '宝马 X5', '宝马', 40.10, 0, NULL);
INSERT INTO `car` VALUES (5, '比亚迪 秦Plus', '比亚迪', 10.58, 1, NULL);
INSERT INTO `car` VALUES (6, '哈弗 H6', '长城', 8.56, 1, NULL);
INSERT INTO `car` VALUES (7, '长安 CS75', '长安', 9.34, 1, NULL);
INSERT INTO `car` VALUES (8, '五菱宏光', '五菱', 3.20, 1, NULL);

2.1.2. ES

PUT car
{
  "mappings": {
    "properties": {
      "name":{
        "type": "text"
      },
      "brand":{
      	"type":"keyword"
      },
      "price":{
      	"type":"double"
      },
      "mic":{
      	"type":"boolean"
      },
      "image":{
      	"type":"keyword",
      	"index":false
      }
   }
 }
}

PUT _bulk
{"index": {"_index": "car"}}
{"name":"奥迪 A4","brand":"奥迪","price":20.10,"mic":false,"image":""}
{"index": {"_index": "car"}}
{"name":"奥迪 A6","brand":"奥迪","price":30.10,"mic":false,"image":""}
{"index": {"_index": "car"}}
{"name":"宝马 X3","brand":"宝马","price":10.20,"mic":false,"image":""}
{"index": {"_index": "car"}}
{"name":"宝马 X5","brand":"宝马","price":40.10,"mic":false,"image":""}
{"index": {"_index": "car"}}
{"name":"比亚迪 秦Plus","brand":"比亚迪","price":10.58,"mic":true,"image":""}
{"index": {"_index": "car"}}
{"name":"哈弗 H6","brand":"长城","price":8.56,"mic":true,"image":""}
{"index": {"_index": "car"}}
{"name":"长安 CS75","brand":"长安","price":9.34,"mic":true,"image":""}
{"index": {"_index": "car"}}
{"name":"五菱宏光","brand":"五菱","price":3.20,"mic":true,"image":""}

2.2. ES官方文档

https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html

Files

2023-02-06-对比Mysql学习ES.md

Latest commit

History