solr optimize - yaokun123/php-wiki GitHub Wiki

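Solr optimize

I. The commit and optimize concepts when building a Solr index

As most people know, Solr has the concepts of commit and optimize when you submit index updates. Let's look at each of them:

1. commit

When you submit index updates to Solr, the index only changes once a commit has run. That does not mean you have to commit after every submission: if the update is not urgent, you can submit several times and then run a single commit.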

The <commit> operation writes all documents loaded since the last commit to one or more segment files on
the disk. Before a commit has been issued, newly indexed content is not visible to searches. The commit
operation opens a new searcher, and triggers any event listeners that have been configured.
--------------------- 
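In other words, commit writes all pending document updates into the index; newly indexed content does not become visible to searches until the commit has run.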
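
The "submit several times, commit once" pattern described above can be sketched as follows. This is a minimal illustration, not a client implementation: the function name `plan_update_requests`, the host `localhost:8983`, and the sample documents are assumptions made for the example (the core name `db` is borrowed from the example URL on this page), and the code only builds the requests, it does not send them.

```python
import json
from urllib.parse import urlencode


def plan_update_requests(core_url, batches):
    """Return (url, body) pairs: one uncommitted update per batch, then one commit.

    Hypothetical helper for illustration; it only plans the requests.
    """
    update_url = core_url.rstrip("/") + "/update"
    requests = []
    for batch in batches:
        # Each batch is posted without commit, so it is not yet visible to searches.
        url = update_url + "?" + urlencode({"wt": "json"})
        requests.append((url, json.dumps(batch)))
    # A single commit at the end makes every batch visible at once.
    commit_url = update_url + "?" + urlencode({"commit": "true", "wt": "json"})
    requests.append((commit_url, None))
    return requests


if __name__ == "__main__":
    plan = plan_update_requests(
        "http://localhost:8983/solr/db",  # assumed host and core
        [[{"id": "1", "title": "a"}], [{"id": "2", "title": "b"}]],
    )
    for url, body in plan:
        print(url, body)
```

Each planned body would be posted with a JSON content type; only the last request carries `commit=true`, so a new searcher is opened once instead of once per batch.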

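2. optimize

optimize is somewhat like defragmenting a hard disk. To speed up searches, it reorganizes the index by merging its segments and physically removes documents that have been deleted or superseded by an update. Note that Solr has no in-place update operation, only add and delete: when a document is deleted or replaced, the old copy is merely marked as deleted and the new version is written as a new document. optimize is the step that actually purges the marked copies, which is why the index first grows and then shrinks while it runs. Because optimize writes a completely new index structure, you need to reserve free disk space of about twice the size your index has at commit time.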
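
The mark-as-deleted and merge behaviour can be illustrated with a toy in-memory model. This is a sketch under simplifying assumptions, not how Lucene stores segments: segments are plain Python lists, and the names `update`, `optimize`, and `stored_docs` are invented for the example.

```python
def update(segments, doc_id, value):
    """Model an update as delete + add: flag the old copy, append a new one."""
    for segment in segments:
        for doc in segment:
            if doc["id"] == doc_id and not doc["deleted"]:
                doc["deleted"] = True  # the old copy stays stored, only flagged
    # The new version is written into a fresh small segment.
    segments.append([{"id": doc_id, "value": value, "deleted": False}])


def optimize(segments):
    """Merge all segments into one new segment, copying only live documents."""
    merged = [dict(doc) for segment in segments for doc in segment
              if not doc["deleted"]]
    return [merged]


def stored_docs(segments):
    """Count every stored document, including ones flagged as deleted."""
    return sum(len(segment) for segment in segments)


if __name__ == "__main__":
    index = [[{"id": "1", "value": "a", "deleted": False},
              {"id": "2", "value": "b", "deleted": False}]]
    update(index, "1", "a2")
    print(len(index), stored_docs(index))  # 2 segments, 3 stored docs
    index = optimize(index)
    print(len(index), stored_docs(index))  # 1 segment, 2 stored docs
```

In this model `optimize` builds the merged segment while the old segments still exist, which mirrors why extra disk space is needed during a real optimize.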

The <optimize> operation requests Solr to merge internal data structures in order to improve search
performance. For a large index, optimization will take some time to complete, but by merging many small
segment files into a larger one, search performance will improve. If you are using Solr’s replication mechanism 
to distribute searches across many systems, be aware that after an optimize, a complete index will need to be 
transferred. In contrast, post-commit transfers are usually much smaller.
--------------------- 
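In short, optimize merges internal data structures to improve search performance. On a large index it takes a while to complete, but merging many small segment files into a larger one makes searches faster. Note that if you use replication, a complete index has to be transferred after an optimize, whereas the transfers that follow a commit are usually much smaller.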

Reference: http://xiaofeng.iteye.com/blog/1299148

II. optimized on the Solr statistics page

In addition, the reference explains some runtime parameters:

| Optional Attribute | Description |
| --- | --- |
| waitSearcher | Default is true. Blocks until a new searcher is opened and registered as the main query searcher, making the changes visible. |
| expungeDeletes | (commit only) Default is false. Merges segments that have more than 10% deleted docs, expunging them in the process. |
| maxSegments | (optimize only) Default is 1. Merges the segments down to no more than this number of segments. |

example: http://solr5.test.pingansec.com/solr/db/update?optimize=true&waitFlush=true&wt=json&_=1541554644867&maxSegments=8&expungeDeletes=true
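
A request URL like the one above can also be assembled programmatically. The following is a minimal sketch, assuming a hypothetical helper named `build_update_url` and an assumed local host; it uses the parameter spelling `waitSearcher`, which is the name Solr accepts, and enforces the "commit only" and "optimize only" restrictions listed in the attribute descriptions.

```python
from urllib.parse import urlencode


def build_update_url(core_url, action, wait_searcher=True,
                     max_segments=None, expunge_deletes=None):
    """Build a Solr /update URL for a commit or an optimize (illustrative helper)."""
    if action not in ("commit", "optimize"):
        raise ValueError("action must be 'commit' or 'optimize'")
    if max_segments is not None and action != "optimize":
        raise ValueError("maxSegments applies to optimize only")
    if expunge_deletes is not None and action != "commit":
        raise ValueError("expungeDeletes applies to commit only")

    params = {
        action: "true",
        "waitSearcher": str(wait_searcher).lower(),
        "wt": "json",
    }
    if max_segments is not None:
        params["maxSegments"] = str(max_segments)
    if expunge_deletes is not None:
        params["expungeDeletes"] = str(expunge_deletes).lower()
    return core_url.rstrip("/") + "/update?" + urlencode(params)


if __name__ == "__main__":
    # Assumed host and core name, for illustration only.
    print(build_update_url("http://localhost:8983/solr/db", "optimize", max_segments=8))
    print(build_update_url("http://localhost:8983/solr/db", "commit", expunge_deletes=True))
```

Rejecting mismatched combinations up front is a design choice of this sketch; it keeps a commit-only flag from being silently attached to an optimize request.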
