Liker: a full-text index service based on Lucene 7.1, built to replace SQL LIKE - ptxu/Liker GitHub Wiki

Design background

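At work I ran into this situation: a MySQL table holds tens of millions of rows, and some of its columns must support fuzzy matching. LIKE queries are very slow at that scale, but I did not want to run a heavyweight service such as Solr, so I used the Lucene 7.1 library to implement a simple full-text index service that replaces SQL LIKE.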

Main features

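To keep things simple and focused on the core need (replacing SQL LIKE), the service provides create, delete, update, and query operations on the index, exposed through an HTTP interface and a Thrift interface that hide the implementation details.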

Installation and deployment

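Run mvn install to build an executable jar, then start the program with java -jar Liker-1.0.0.jar. The HTTP service binds to port 80 and the Thrift service binds to port 81; both ports can be changed in the SystemConfig.properties configuration file. The online API documentation is available at http://127.0.0.1/swagger-ui.html .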

HTTP interface examples

1. Create index

curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '[{
	"doc": {
		"fields": [{
			"fieldName": "id",
			"fieldValue": "123456",
			"indexed": false
		},
		{
			"fieldName": "name",
			"fieldValue": "你好,世界!",
			"indexed": true
		}]
	},
	"type": "Add"
}]' 'http://127.0.0.1/addIndexTask'

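type is the task type: Add creates an index entry, Update modifies one, and Delete removes one. doc is the document, equivalent to one row (one record) in a MySQL table, and the fields in doc correspond to the table's columns (mainly the columns that need fuzzy matching, plus the primary key). If indexed is true, the field is tokenized and indexed but not stored (for a column that only needs fuzzy matching, storing the original value serves no purpose). If indexed is false, the field is neither tokenized nor indexed, but it is stored (for example, the primary key).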
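The request body used above can also be assembled in code. The following is a minimal sketch, not part of Liker: the class AddTaskPayload, its nested FieldSpec type, and the escape and toJson helpers are hypothetical names introduced only for illustration. It mirrors the fieldName, fieldValue, indexed, and type keys from the curl example, and its escaping covers only backslashes and double quotes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Liker): builds the JSON body of an Add task. */
final class AddTaskPayload {

    /** One column of the row, mirroring the fieldName/fieldValue/indexed keys. */
    static final class FieldSpec {
        final String name;
        final String value;
        final boolean indexed;

        FieldSpec(String name, String value, boolean indexed) {
            this.name = name;
            this.value = value;
            this.indexed = indexed;
        }
    }

    /** Escapes backslashes and double quotes only; control characters are not handled. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Serializes the fields as a one-element task array with type "Add". */
    static String toJson(List<FieldSpec> fields) {
        StringBuilder sb = new StringBuilder("[{\"doc\":{\"fields\":[");
        for (int i = 0; i < fields.size(); i++) {
            FieldSpec f = fields.get(i);
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"fieldName\":\"").append(escape(f.name))
              .append("\",\"fieldValue\":\"").append(escape(f.value))
              .append("\",\"indexed\":").append(f.indexed).append('}');
        }
        return sb.append("]},\"type\":\"Add\"}]").toString();
    }

    public static void main(String[] args) {
        List<FieldSpec> row = new ArrayList<>();
        // Primary key: stored so it can be returned, but not tokenized or indexed.
        row.add(new FieldSpec("id", "123456", false));
        // Fuzzy-match column: tokenized and indexed, original value not stored.
        row.add(new FieldSpec("name", "你好,世界!", true));
        System.out.println(toJson(row));
    }
}
```

The printed string is what curl sends with -d in the example above; a production client would more likely use a JSON library than hand-built strings.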

2. Update index

curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '[{
	"doc": {
		"fields": [{
			"fieldName": "id",
			"fieldValue": "123456",
			"indexed": false
		},
		{
			"fieldName": "name",
			"fieldValue": "Hello,World",
			"indexed": true
		}]
	},
	"targetField": {
		"fieldName": "id",
		"fieldValue": "123456"
	},
	"type": "Update"
}]' 'http://127.0.0.1/addIndexTask'

3. Delete index

curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '[{
	
	"targetField": {
		"fieldName": "id",
		"fieldValue": "123456"
	},
	"type": "Delete"
}]' 'http://127.0.0.1/addIndexTask'
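The three requests above differ in which parts they carry: Add sends only doc, Update sends doc plus targetField, and Delete sends only targetField. The sketch below encodes that pattern as a client-side check. It is inferred solely from these examples; the class TaskShape and its check method are hypothetical, and the server may accept or reject other combinations.

```java
/** Hypothetical client-side check (not part of Liker), inferred from the examples above. */
final class TaskShape {

    /** Task types, spelled as in the "type" key of the request body. */
    enum Type { Add, Update, Delete }

    /** Returns null when the combination matches the examples, otherwise a short reason. */
    static String check(Type type, boolean hasDoc, boolean hasTargetField) {
        switch (type) {
            case Add:
                return hasDoc ? null : "Add needs a doc";
            case Update:
                if (!hasDoc) {
                    return "Update needs a doc with the new field values";
                }
                return hasTargetField ? null : "Update needs a targetField to locate the old document";
            case Delete:
                return hasTargetField ? null : "Delete needs a targetField";
            default:
                return "unknown type";
        }
    }

    public static void main(String[] args) {
        // An Update without targetField does not match the example shape.
        System.out.println(check(Type.Update, true, false));
        // A Delete with only targetField matches, so this prints null.
        System.out.println(check(Type.Delete, false, true));
    }
}
```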

4. Full-text search

curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{
  "currentPage": 1,
  "keywords": [
    {
      "fieldName": "name",
      "fieldValue": "你好",
      "tokenized": true
    }
  ],
  "pageSize": 10,
  "responseFieldNames": [
    "id"
  ]
}' 'http://127.0.0.1/fullTextSearch'

Response Body

{
  "message": "success",
  "data": [
    {
      "score": 0.8630463,
      "doc": {
        "fields": [
          {
            "fieldName": "id",
            "fieldValue": "123456",
            "indexed": true
          }
        ]
      }
    }
  ],
  "code": 200
}
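For callers that prefer plain Java over curl, the same search request can be sent with the JDK's HttpURLConnection. This is a sketch under assumptions: the class SearchRequestSketch and its buildBody and post methods are hypothetical names, the body builder does no JSON escaping (it assumes values without quotes or backslashes), and the URL is the default address used in the curl example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Hypothetical HTTP client sketch (not part of Liker) for the fullTextSearch endpoint. */
final class SearchRequestSketch {

    /** Builds the search body for one keyword; values are assumed to need no escaping. */
    static String buildBody(String fieldName, String keyword, int currentPage,
                            int pageSize, String responseField) {
        return "{\"currentPage\":" + currentPage
                + ",\"keywords\":[{\"fieldName\":\"" + fieldName
                + "\",\"fieldValue\":\"" + keyword + "\",\"tokenized\":true}]"
                + ",\"pageSize\":" + pageSize
                + ",\"responseFieldNames\":[\"" + responseField + "\"]}";
    }

    /** POSTs the JSON body and returns the response body as a UTF-8 string. */
    static String post(String url, String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setRequestProperty("Accept", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String body = buildBody("name", "你好", 1, 10, "id");
        System.out.println(body);
        // Only contact the server when a URL is passed,
        // e.g. http://127.0.0.1/fullTextSearch
        if (args.length > 0) {
            System.out.println(post(args[0], body));
        }
    }
}
```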

Thrift interface example

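import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

// The project's own classes (SystemConfig, FullTextIndexService, IndexTask, IndexTaskType,
// Document, Field, Keyword, IndexRequestParam, AddTaskResponseResult, SearchResponseResult)
// must also be imported; their packages depend on the project layout and are not shown here.

/**
 * @ClassName: ThriftClient
 * @Description: Thrift client
 * @author xupengtao
 * @date 2018-01-24 10:10:06 AM
 *
 */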
public class ThriftClient {

    /**
     * @Title: main
     * @Description: main
     * @param args
     *            String[]
     */
    public static void main(String[] args) {
        // testAddIndexTask();
        testFullTextSearch();
    }

    /**
     * @Title: testAddIndexTask
     * @Description: Test adding an index task
     */
    private static void testAddIndexTask() {
        TTransport transport = null;
        try {
            // Connect to the local service on the configured Thrift port (81 by default)
            transport = new TSocket("127.0.0.1", SystemConfig.getInstance().getThriftPort());
            transport.open();
            // Use TBinaryProtocol as the transport protocol
            TProtocol protocol = new TBinaryProtocol(transport);
            FullTextIndexService.Client client = new FullTextIndexService.Client(protocol);
            // Call the service's addIndexTask method
            List<IndexTask> tasks = new ArrayList<>();
            IndexTask task = new IndexTask();
            task.setType(IndexTaskType.Add);
            Document doc = new Document();
            List<Field> fields = new ArrayList<>();
            Field field = new Field();
            field.setFieldName("id");
            field.setFieldValue("123456");
            field.setIndexed(false);
            fields.add(field);

            field = new Field();
            field.setFieldName("name");
            field.setFieldValue("你好,世界!");
            field.setIndexed(true);
            fields.add(field);
            doc.setFields(fields);
            task.setDoc(doc);

            tasks.add(task);
            AddTaskResponseResult response = client.addIndexTask(tasks);
            System.out.println(response.toString());
        }
        catch (Exception e) {
            e.printStackTrace();
        }
        finally {
            if (transport != null) {
                transport.close();
            }
        }
    }

    /**
     * @Title: testFullTextSearch
     * @Description: Test full-text search
     */
    private static void testFullTextSearch() {
        TTransport transport = null;
        try {
            // Connect to the local service on the configured Thrift port (81 by default)
            transport = new TSocket("127.0.0.1", SystemConfig.getInstance().getThriftPort());
            transport.open();
            // Use TBinaryProtocol as the transport protocol
            TProtocol protocol = new TBinaryProtocol(transport);
            FullTextIndexService.Client client = new FullTextIndexService.Client(protocol);
            // Call the service's fullTextSearch method
            IndexRequestParam param = new IndexRequestParam();
            param.setCurrentPage(1);
            param.setPageSize(100);
            param.setKeywords(Arrays.asList(new Keyword("name", "你好", true)));
            param.setResponseFieldNames(Arrays.asList("id"));
            SearchResponseResult response = client.fullTextSearch(param);
            System.out.println(response.toString());
        }
        catch (Exception e) {
            e.printStackTrace();
        }
        finally {
            if (transport != null) {
                transport.close();
            }
        }
    }
}