Liker: a full-text index service based on Lucene 7.1, built to replace SQL LIKE - ptxu/Liker GitHub Wiki
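At work I ran into the following situation: a MySQL table held tens of millions of rows, and some of its columns had to support fuzzy matching. LIKE queries were far too slow for this, but I did not want to run a heavyweight service such as Solr, so I used the Lucene 7.1 library to implement a simple full-text index service that replaces SQL LIKE.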
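Keeping things simple and focused on the core requirement (replacing SQL LIKE), Liker provides the following: create, delete, update, and query operations on the index, exposed through an HTTP interface and a Thrift interface that hide the implementation details.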
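Running mvn install packages the project as an executable jar, and java -jar Liker-1.0.0.jar starts the program. The HTTP service binds to port 80 and the Thrift service binds to port 81; both ports can be changed in the SystemConfig.properties configuration file. The online API documentation is available at http://127.0.0.1/swagger-ui.html .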
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '[{
"doc": {
"fields": [{
"fieldName": "id",
"fieldValue": "123456",
"indexed": false
},
{
"fieldName": "name",
"fieldValue": "你好,世界!",
"indexed": true
}]
},
"type": "Add"
}]' 'http://127.0.0.1/addIndexTask'
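type is the task type: Add creates an index entry, Update modifies one, and Delete removes one. doc is a document, the equivalent of one row (one record) in a MySQL table, and the fields it contains correspond to the table's columns (mainly the columns that need fuzzy matching, plus the primary key). If indexed is true, the field is tokenized and indexed but not stored, because keeping the original value is pointless for a column that only needs fuzzy matching. If indexed is false, the field is stored but neither tokenized nor indexed (for example, the primary key).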
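The field rules above can be sketched in Java as a small payload builder for the addIndexTask endpoint. This is only an illustration: the class AddTaskPayload and its methods are hypothetical names that are not part of Liker, the JSON is assembled by hand rather than with a JSON library, and the sample value is the ASCII string from the Update example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Liker) that builds the JSON body for /addIndexTask. */
class AddTaskPayload {

    /** One entry of doc.fields: the column name, its value, and the indexed flag. */
    static String field(String name, String value, boolean indexed) {
        return "{\"fieldName\":\"" + escape(name) + "\","
             + "\"fieldValue\":\"" + escape(value) + "\","
             + "\"indexed\":" + indexed + "}";
    }

    /** Wraps the fields in one task of type "Add", inside the JSON array the endpoint takes. */
    static String addTask(List<String> fields) {
        return "[{\"doc\":{\"fields\":[" + String.join(",", fields) + "]},\"type\":\"Add\"}]";
    }

    /** Minimal JSON string escaping: handles backslash and double quote only. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        List<String> fields = new ArrayList<>();
        // Primary key: indexed=false, so it is stored but not tokenized or indexed.
        fields.add(field("id", "123456", false));
        // Fuzzy-match column: indexed=true, so it is tokenized and indexed but not stored.
        fields.add(field("name", "Hello,World", true));
        System.out.println(addTask(fields));
    }
}
```

The printed string can be passed as the -d argument of the curl command shown earlier. For real data, a JSON library would be a safer choice than the two-character escaping used here.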
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '[{
"doc": {
"fields": [{
"fieldName": "id",
"fieldValue": "123456",
"indexed": false
},
{
"fieldName": "name",
"fieldValue": "Hello,World",
"indexed": true
}]
},
"targetField": {
"fieldName": "id",
"fieldValue": "123456"
},
"type": "Update"
}]' 'http://127.0.0.1/addIndexTask'
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '[{
"targetField": {
"fieldName": "id",
"fieldValue": "123456"
},
"type": "Delete"
}]' 'http://127.0.0.1/addIndexTask'
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{
"currentPage": 1,
"keywords": [
{
"fieldName": "name",
"fieldValue": "你好",
"tokenized": true
}
],
"pageSize": 10,
"responseFieldNames": [
"id"
]
}' 'http://127.0.0.1/fullTextSearch'
Response Body
{
"message": "success",
"data": [
{
"score": 0.8630463,
"doc": {
"fields": [
{
"fieldName": "id",
"fieldValue": "123456",
"indexed": true
}
]
}
}
],
"code": 200
}
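The same search can be issued from Java without curl. The following is a minimal sketch that uses only the JDK's HttpURLConnection; the class SearchClient and its methods are hypothetical names that are not part of Liker, the request shape is copied from the curl example, and the tokenized flag is simply set to true as in that example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Hypothetical client (not part of Liker) for the /fullTextSearch endpoint. */
class SearchClient {

    /** Builds a request body with one keyword and one response field. */
    static String searchBody(String fieldName, String text,
                             int currentPage, int pageSize, String responseField) {
        return "{\"currentPage\":" + currentPage
             + ",\"keywords\":[{\"fieldName\":\"" + escape(fieldName)
             + "\",\"fieldValue\":\"" + escape(text) + "\",\"tokenized\":true}]"
             + ",\"pageSize\":" + pageSize
             + ",\"responseFieldNames\":[\"" + escape(responseField) + "\"]}";
    }

    /** Minimal JSON string escaping: handles backslash and double quote only. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** POSTs the JSON body and returns the raw response text; throws on HTTP errors. */
    static String post(String url, String json) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setRequestProperty("Accept", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String body = searchBody("name", "Hello", 1, 10, "id");
        System.out.println(body);
        // Only contact the server when a URL is given, e.g. http://127.0.0.1/fullTextSearch
        if (args.length > 0) {
            System.out.println(post(args[0], body));
        }
    }
}
```

Run without arguments, the program only prints the request body; passing the endpoint URL as the first argument sends the request to a running Liker instance and prints the raw JSON response.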
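import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

// FullTextIndexService, IndexTask, IndexTaskType, Document, Field, IndexRequestParam,
// Keyword, AddTaskResponseResult, SearchResponseResult and SystemConfig come from the
// Liker project; import them from the packages they live in.

/**
 * @ClassName: ThriftClient
 * @Description: Thrift client
 * @author xupengtao
 * @date 2018-01-24 10:10:06
 *
 */
public class ThriftClient {
    /**
     * @Title: main
     * @Description: main
     * @param args
     *            String[]
     */
    public static void main(String[] args) {
        // testAddIndexTask();
        testFullTextSearch();
    }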
    /**
     * @Title: testAddIndexTask
     * @Description: Tests adding an index task
     */
    private static void testAddIndexTask() {
        TTransport transport = null;
        try {
            // Connect to the local service on the configured Thrift port (81 by default)
            transport = new TSocket("127.0.0.1", SystemConfig.getInstance().getThriftPort());
            transport.open();
            // Use TBinaryProtocol as the wire protocol
            TProtocol protocol = new TBinaryProtocol(transport);
            FullTextIndexService.Client client = new FullTextIndexService.Client(protocol);
            // Build one Add task whose document has two fields
            List<IndexTask> tasks = new ArrayList<>();
            IndexTask task = new IndexTask();
            task.setType(IndexTaskType.Add);
            Document doc = new Document();
            List<Field> fields = new ArrayList<>();
            // Primary key: stored, not indexed
            Field field = new Field();
            field.setFieldName("id");
            field.setFieldValue("123456");
            field.setIndexed(false);
            fields.add(field);
            // Fuzzy-match column: tokenized and indexed, not stored
            field = new Field();
            field.setFieldName("name");
            field.setFieldValue("你好,世界!");
            field.setIndexed(true);
            fields.add(field);
            doc.setFields(fields);
            task.setDoc(doc);
            tasks.add(task);
            // Call the service's addIndexTask method
            AddTaskResponseResult response = client.addIndexTask(tasks);
            System.out.println(response.toString());
        }
        catch (Exception e) {
            e.printStackTrace();
        }
        finally {
            if (transport != null) {
                transport.close();
            }
        }
    }
    /**
     * @Title: testFullTextSearch
     * @Description: Tests full-text search
     */
    private static void testFullTextSearch() {
        TTransport transport = null;
        try {
            // Connect to the local service on the configured Thrift port (81 by default)
            transport = new TSocket("127.0.0.1", SystemConfig.getInstance().getThriftPort());
            transport.open();
            // Use TBinaryProtocol as the wire protocol
            TProtocol protocol = new TBinaryProtocol(transport);
            FullTextIndexService.Client client = new FullTextIndexService.Client(protocol);
            // Search the name field for "你好" and return only the id field of each hit
            IndexRequestParam param = new IndexRequestParam();
            param.setCurrentPage(1);
            param.setPageSize(100);
            param.setKeywords(Arrays.asList(new Keyword("name", "你好", true)));
            param.setResponseFieldNames(Arrays.asList("id"));
            // Call the service's fullTextSearch method
            SearchResponseResult response = client.fullTextSearch(param);
            System.out.println(response.toString());
        }
        catch (Exception e) {
            e.printStackTrace();
        }
        finally {
            if (transport != null) {
                transport.close();
            }
        }
    }
}