elasticsearch rails - mindpin/docs GitHub Wiki
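Elasticsearch is an open-source, distributed, RESTful search engine built on Lucene. It is designed for cloud environments and offers real-time search that is stable, reliable, fast, and easy to install and use. It supports indexing data as JSON over HTTP.
We build a website or application and want to add search, and then it hits us: getting search to work is hard. We want our search solution to be fast. We want zero configuration and a completely free search schema. We want to index data simply by sending JSON over HTTP. We want our search server to be always available. We want to start with one machine and scale to hundreds. We want real-time search, simple multi-tenancy, and a solution built for the cloud. Elasticsearch aims to solve all of these problems and more.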
brew install elasticsearch
elasticsearch --config=/usr/local/opt/elasticsearch/config/elasticsearch.yml
Visit http://localhost:9200. If the request succeeds, the installation is complete.
Add the following to the Gemfile:
gem 'elasticsearch-model'
gem 'elasticsearch-rails'
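Note: elasticsearch-model comes with built-in pagination integration. If your Gemfile includes a pagination gem such as will_paginate or kaminari, list it before elasticsearch-model and elasticsearch-rails.
Add the following code to each model that needs search: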
class University < ActiveRecord::Base
  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks
end
With the modules included, we can write the search method:
def self.search(search)
  __elasticsearch__.search(search)
end
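This is a very simple search that queries directly with the argument passed in. We can use the query DSL to make the query fit our business needs more closely. The following is a search I needed for records whose status is 1 and whose name field matches the given text: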
def self.search_filter(params)
  __elasticsearch__.search(
    "query": {
      "filtered": {
        "filter": {
          "bool": {
            # A Ruby hash keeps only the last of two identical keys,
            # so both clauses go into a single "must" array.
            "must": [
              { "term": { "status": 1 } },
              { "query": { "match": { "name": params } } }
            ]
          }
        }
      }
    }
  )
end
Next we define the index mapping for the model, which is mainly for Elasticsearch's use:
mapping dynamic: false do
  indexes :name
  indexes :tag
end
Moving on: a model can be serialized to JSON, and the as_indexed_json method is where we control that. We can write it like this:
def as_indexed_json(options={})
  self.as_json(
    only: [:id, :name, :description, :status],
    include: { tags: { only: [:name] } }
  )
end
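The include part handles associations, while only lists the model's own attributes. With these changes the model's search setup is essentially complete. If you run a search now, though, it will probably still return nothing, because the data first has to be imported into Elasticsearch with this command: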
rake environment elasticsearch:import:model CLASS='your_model_name' FORCE=y
app/models/concerns/searchable.rb
module Searchable
  extend ActiveSupport::Concern

  included do
    include Elasticsearch::Model
    include Elasticsearch::Model::Callbacks
    Searchable.enabled_models.add(self)
  end

  def self.enabled_models
    @_enabled_models ||= Set.new
  end

  def as_indexed_json(options={})
    as_json(except: [:id, :_id])
  end
end
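The `Searchable.enabled_models` registry records every class that includes the concern. The pattern can be illustrated without Rails or Elasticsearch. The sketch below is a plain-Ruby stand-in that uses the `self.included` hook instead of `ActiveSupport::Concern`; the module and class names are hypothetical.

```ruby
require "set"

# Plain-Ruby stand-in for the Searchable concern: no Rails or Elasticsearch needed.
module SearchableRegistry
  # Every class that includes the module is recorded in this set.
  def self.enabled_models
    @enabled_models ||= Set.new
  end

  # Ruby calls this hook each time the module is included somewhere.
  def self.included(base)
    enabled_models.add(base)
  end
end

class Article; include SearchableRegistry; end
class Comment; include SearchableRegistry; end

puts SearchableRegistry.enabled_models.map(&:name).sort.inspect
```

A maintenance task could iterate over such a registry, for example to re-import every searchable model in one pass.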
app/models/concerns/standard_search.rb
module StandardSearch
  extend ActiveSupport::Concern

  included do
    include Searchable
  end

  module ClassMethods
    def standard_search(q)
      param = {
        :query => {
          :multi_match => {
            :fields   => standard_fields,
            :type     => "cross_fields",
            :query    => q,
            :analyzer => "standard",
            :operator => "and"
          }
        }
      }
      self.search(param).records.all
    end

    def standard(*fields)
      standard_fields.merge fields
      settings :index => {:number_of_shards => 1} do
        mappings :dynamic => "false" do
          fields.each do |f|
            indexes f, :analyzer => "chinese"
          end
        end
      end
    end

    def standard_fields
      @_standard_fields ||= Set.new
    end
  end
end
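To make the request body built by `standard_search` easier to inspect, the following sketch constructs the same `multi_match` hash outside of any model. The helper name `cross_fields_query` and the sample field names are hypothetical.

```ruby
require "json"
require "set"

# Hypothetical helper mirroring the body built in standard_search.
def cross_fields_query(fields, text)
  {
    query: {
      multi_match: {
        fields:   fields.to_a,     # explicit array, so a Set serializes as a JSON list
        type:     "cross_fields",  # treat the listed fields as one combined field
        query:    text,
        analyzer: "standard",
        operator: "and"            # every term must appear in at least one of the fields
      }
    }
  }
end

fields = Set.new([:name, :desc])
puts JSON.generate(cross_fields_query(fields, "red apple"))
```

Printing the body this way is a quick check that the field list and operator are what the model declared before sending anything to the cluster.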
app/models/concerns/pinyin_search.rb
module PinyinSearch
  extend ActiveSupport::Concern

  included do
    include StandardSearch
    delegate :pinyin_fields, :to => :class
    before_save :save_pinyin_fields
  end

  private

  def save_pinyin_fields
    self.pinyin_fields.each do |field|
      value = self.send field
      self.send "#{field}_pinyin=", PinYin.of_string(value).join
      self.send "#{field}_abbrev=", PinYin.abbr(value)
    end
  end

  module ClassMethods
    def pinyin(*fields)
      standard(*fields)
      ext_fields = fields.select do |field|
        self.fields.include?(field.to_s) &&
          self.fields[field.to_s].type == String
      end.each do |f|
        pinyin_fields_from(f).each do |fd|
          field fd, :type => String
        end
        index_pinyin_field(f)
      end
      pinyin_fields.concat ext_fields
    end

    def pinyin_analysis
      {
        :analyzer => {
          :pinyin => {
            :type      => "custom",
            :tokenizer => "lowercase",
            :filter    => ["kc_ngram"]
          }
        },
        :filter => {
          :kc_ngram => {
            :type     => "nGram",
            :min_gram => 1,
            :max_gram => 128
          }
        }
      }
    end

    def pinyin_search(q)
      fields = pinyin_fields.map {|f| pinyin_fields_from(f)}.flatten
      param = {
        :query => {
          :multi_match => {
            :fields   => fields,
            :type     => "phrase",
            :query    => q,
            :analyzer => "standard"
          }
        }
      }
      self.search(param).records.all
    end

    def pinyin_fields
      @_pinyin_fields ||= []
    end

    private

    def pinyin_fields_from(field)
      %W[#{field}_pinyin #{field}_abbrev]
    end

    def index_pinyin_field(field)
      ext_fields = pinyin_fields_from(field)
      settings :index => {:number_of_shards => 1}, :analysis => self.pinyin_analysis do
        mappings :dynamic => "false" do
          ext_fields.each do |f|
            indexes f, :analyzer => "pinyin"
          end
        end
      end
    end
  end
end
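The `pinyin` analyzer above passes each token through an n-gram filter, which is what lets a partially typed pinyin string match. The sketch below is a conceptual model of that filter, not Elasticsearch's implementation; the function name `ngrams` and the sample word are illustrative assumptions.

```ruby
# Conceptual model of an n-gram token filter: emit every substring of the
# token whose length lies between min_gram and max_gram.
def ngrams(token, min_gram: 1, max_gram: 128)
  grams = []
  (0...token.length).each do |start|
    (min_gram..max_gram).each do |len|
      break if start + len > token.length
      grams << token[start, len]
    end
  end
  grams
end

indexed = ngrams("beijing")
puts indexed.include?("bei")   # a prefix typed by the user
puts indexed.include?("jing")  # a fragment from the middle or end
puts indexed.include?("bj")    # initials are not a contiguous substring
```

Because initials such as "bj" never appear among the n-grams of the full pinyin, the concern stores a separate `_abbrev` field alongside the `_pinyin` field.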
app/models/<model that needs pinyin search>.rb
class KnowledgeNetStore::Point
  include PinyinSearch
  # Fields that should support full-text search
  pinyin :name
end
KnowledgeNetStore::Point.class_eval do
  include PinyinSearch
  pinyin :name
end
That is, it is applied to mind-map nodes to add pinyin search (plus standard search) on the name field.
app/models/concerns/searchable.rb
module Searchable
  extend ActiveSupport::Concern

  included do
    include Elasticsearch::Model
    __elasticsearch__.client = Elasticsearch::Client.new host: "http://localhost:9200", log: true
    Searchable.enabled_models.add(self)
    after_create {Indexer.perform_async(:index, self.id.to_s, self.class.name)}
    after_update {Indexer.perform_async(:update, self.id.to_s, self.class.name)}
    after_destroy {Indexer.perform_async(:delete, self.id.to_s, self.class.name)}
  end

  def self.enabled_models
    @_enabled_models ||= Set.new
  end

  def as_indexed_json(options={})
    as_json(except: [:id, :_id])
  end

  module ClassMethods
    def custom_analysis
      {
        :analyzer => {
          :chargram => {
            :type      => :custom,
            :tokenizer => :chargram,
            :filter    => [:lowercase]
          }
        },
        :tokenizer => {
          :chargram => {
            :type        => :nGram,
            :min_gram    => 1,
            :max_gram    => 20,
            :token_chars => [:letter, :digit]
          }
        }
      }
    end
  end
end
app/models/concerns/vote_search_config.rb
module VoteSearchConfig
  extend ActiveSupport::Concern

  included do
    include Searchable
    settings :index => {:number_of_shards => 1}, :analysis => custom_analysis do
      mappings :dynamic => "false" do
        indexes :title, :analyzer => :chargram
      end
    end
  end

  def as_indexed_json(options={})
    as_json(only: [:title])
  end

  module ClassMethods
    def page_search(query, page = 1, per = 20)
      # Accept nil, "" or a string page number; fall back to the first page.
      page = page.blank? ? 1 : page.to_i
      param = {
        # `from` is an offset counted in hits, not a page index.
        :from => (page - 1) * per,
        :size => per,
        :query => {
          :multi_match => {
            :type   => :best_fields,
            :query  => query,
            :fields => [:title]
          }
        },
        :highlight => {
          :pre_tags  => ["<em class='highlight'>"],
          :post_tags => ["</em>"],
          :fields    => {:title => {}}
        }
      }
      Vote.search(param)
    end
  end
end
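In the Elasticsearch search API, `from` is the number of hits to skip rather than a page index, so a page number has to be multiplied by the page size before it is sent. A minimal sketch of that conversion follows; the helper name `page_window` is hypothetical.

```ruby
# Hypothetical helper: convert a 1-based page number into Elasticsearch's
# `from` (number of hits to skip) and `size` (hits per page).
def page_window(page, per = 20)
  page = page.to_i          # nil, "" and non-numeric strings become 0
  page = 1 if page < 1      # treat missing or invalid input as the first page
  { from: (page - 1) * per, size: per }
end

p page_window(1)
p page_window("3", 10)
p page_window(nil)
```

With a page size of 10, page 3 starts at hit 20, which is the value `from` should carry.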
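The search code in pinIdea was written before KC's. It is visibly messy, which makes it hard for anyone new to the project to understand and modify.
From these two projects we can see that, beyond basic full-text search, we also want pinyin search from ES.
In terms of extensions, we may also add the following settings: pre_tags and post_tags (placed before and after matched terms for highlighting) and multi-field search (no example in the existing projects).
Other extensions can be added later as actual needs arise.
As for optimization, the functionality already meets our needs and nothing needs to be extended for now. Simply converting the code into a Rails Engine is enough.
It would be better still to declare the elasticsearch-related gems as dependencies in the gemspec.
We could also add the corresponding controller actions, or provide a default search page and JSON responses.
Going further, we could provide helpers that generate the search box, the submit form, and so on.
Recommendation: build a Rails Engine gem, mainly to simplify integration and cut down on repeated development time.
Its main features are standard search and pinyin search (both already implemented).
More complex extensions can be added to this gem step by step, driven by concrete needs.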
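The gemspec dependencies suggested above could be declared roughly as follows. This is only a sketch: the gem name, version, summary, and author are placeholders, and it assumes the `PinYin` module used in the concerns comes from the ruby-pinyin gem.

```ruby
require "rubygems"

# Hypothetical engine gem; name, version, summary and author are placeholders.
spec = Gem::Specification.new do |s|
  s.name    = "search_engine_sketch"
  s.version = "0.0.1"
  s.summary = "Standard and pinyin search concerns packaged as a Rails Engine"
  s.authors = ["placeholder"]
  s.files   = []

  # Runtime dependencies the search concerns rely on.
  s.add_dependency "elasticsearch-model"
  s.add_dependency "elasticsearch-rails"
  s.add_dependency "ruby-pinyin" # assumed source of the PinYin module
end

puts spec.dependencies.map(&:name).inspect
```

Declaring the dependencies in the gemspec means a host application only needs to add the engine gem to its Gemfile.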