FUN WITH INDICES - pantheon-systems/search_api_pantheon GitHub Wiki

REQUIREMENTS

  • A Drupal 8/9 site hosted on Pantheon

  • A local development environment as described by the LOCAL DEVELOPMENT ENVIRONMENT docs page.

  • The following modules installed, with privileges to change their configuration:

    • devel
    • facets
    • search_api
    • search_api_solr
    • search_api_page
    • views and views_ui
    • search_api_pantheon
    • search_api_solr_admin
    • search_api_solr_devel
    • search_api_pantheon_admin
    • search_api_spellcheck
    • search_api_autocomplete
    • search_api_attachments

SETUP

  1. Navigate to Configuration ➡️ Search ➡️ Search API. Click "Pantheon Search" to view its settings.

    Simple Content Index

  2. Click the "Pantheon Search Admin" tab to see all the configuration files currently on the server: synonyms, stopwords, accents, the language-specific versions of those files, and schema.xml with its supporting files.

    Search Admin

  3. Update these config files to the latest version by clicking "Post Solr Schema". You should see a "solr schema updated" message. In a local environment, this posts the schema to the local Docker instance. On Pantheon, it posts the schema to Pantheon Search.

A simple content index

  1. Navigate to Configuration ➡️ Search ➡️ Search API and click "Add Index".

    Add Index

  2. Call this "Simple Content Index". Check the "Content" datasource, and make sure the index is enabled and uses the "Pantheon Search" server. Click "Save and add fields" at the bottom of the page.

    Simple Content Index

  3. On the "Fields" admin page, click "Add fields".

  4. Add a "rendered content" field, configuring the field to render content using the "search index" display.

    Add Rendered HTML Field

  5. Add a "URL" field, configuring the field to use an absolute URL.

    Add a URL field generating an absolute URL

  6. When you return to the fields admin page, you will have to refresh it to see your added fields, and you will be notified that you have unsaved changes. Save your changes.

  7. Create an excerpt. Under the "Processors" tab, choose "Highlight", then configure it at the bottom of the page to "always" return highlighted fields. Check "Create excerpt" and exclude any field that is not "rendered item".

  8. Save your changes to the index and reindex the content from the "view" page.
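
To make the excerpt in step 7 more concrete, here is a minimal Python sketch of the general idea: find the search term in the rendered text, keep some context around it, and wrap the match in markup. This is not the Highlight processor's implementation. The `make_excerpt` helper, the `<strong>` wrapper, and the context size are assumptions chosen for illustration.

```python
import re
from typing import Optional


def make_excerpt(text: str, keyword: str, context: int = 30,
                 prefix: str = "<strong>", suffix: str = "</strong>") -> Optional[str]:
    """Hypothetical helper: return a snippet around the first match of keyword."""
    match = re.search(re.escape(keyword), text, flags=re.IGNORECASE)
    if match is None:
        return None  # nothing to highlight

    # Keep up to `context` characters on each side of the match.
    start = max(match.start() - context, 0)
    end = min(match.end() + context, len(text))

    snippet = (text[start:match.start()]
               + prefix + match.group(0) + suffix
               + text[match.end():end])

    # Mark the places where the surrounding text was cut off.
    return ("…" if start > 0 else "") + snippet + ("…" if end < len(text) else "")


excerpt = make_excerpt("We serve Thai curry daily.", "thai", context=9)
print(excerpt)
```

The match is case-insensitive, so a search for "thai" highlights "Thai" while preserving the original capitalization in the excerpt.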

Reindexing content

  1. On the View tab, Drupal shows a graph of how much of your content is indexed. Right now it should be zero. If it is not zero, clear the content out of the index. Then click "Index now" until the graph reaches 100%.

A simple search page

  1. Navigate to Configuration ➡️ Search ➡️ Search API Pages.

  2. Choose "Add Search Page".

  3. Title your page "Simple Content Search" and choose the Simple Content Index you created in the previous section. Click "Next".

  4. Set up the search page to search rendered output and give it a path of "simple-search". The parse mode should be "direct query" and it should display content in "Search Result highlighting input". The field it searches should be "rendered html output".

    Search Api Page Configuration

  5. Navigate to the Simple Search page. Searching for "thai" should bring back at least one result.

Boosting newer documents

  1. Navigate back to your Simple Content Index. Edit the list of fields and, in the Content section, add the "Authored On/Created" and "Changed" date fields.

  2. On the "Processors" tab, check the "Boost more recent dates" box. At the bottom of the page is a config tab for the processor, with a setting under each of "created" and "changed" that gives more weight to newer documents. Change the boost for each to "5.0" and save.

  3. Whenever the field list changes, you will need to reindex content before searching. Follow the directions above to reindex content.
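
For intuition about what a recency boost does, the sketch below implements Solr's `recip(x, m, a, b)` function, which computes `a / (m * x + b)` and is commonly applied to a document's age. Whether the "Boost more recent dates" processor uses exactly this function, and how the "5.0" setting maps onto it, is not stated in this guide, so treat the constants and the `recency_score` helper as assumptions.

```python
from datetime import datetime, timedelta, timezone

MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000  # milliseconds in an average year


def recip(x: float, m: float, a: float, b: float) -> float:
    """Solr-style reciprocal function: a / (m * x + b)."""
    return a / (m * x + b)


def recency_score(changed: datetime, now: datetime) -> float:
    """Hypothetical helper: 1.0 for a brand-new document, decaying with age."""
    age_ms = (now - changed).total_seconds() * 1000
    # With m = 1 / MS_PER_YEAR, a one-year-old document scores about half.
    return recip(age_ms, 1 / MS_PER_YEAR, 1.0, 1.0)


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
fresh = recency_score(now, now)
year_old = recency_score(now - timedelta(days=365.25), now)
print(fresh, year_old)
```

A score like this is typically multiplied into, or added to, the text relevance score, so that of two documents matching equally well, the more recently created or changed one ranks higher.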

Autocomplete

  1. To add autocomplete to the simple search page, navigate to your Simple Content Index, select the "Autocomplete" tab, check the box beside the Simple Search page, and save. After you clear the Drupal cache and return to the simple search page, the search field offers autocomplete suggestions.

A simple View-based search page

This tutorial assumes that you have a working knowledge of Views and of creating them in Drupal.

Create a new view for your search index with a "page" display, making sure of the following:

  1. The view should query the content index you created. Indices have the word "Index" in the source's title. In this case, it is called "Index simple content index".

  2. The view should return a list of fields. Choose the field that contains the excerpt highlighting the query.

  3. Sort the results by RELEVANCY in DESCENDING order.

  4. Until the view returns what you want, make sure the results are not cached.

  5. Be careful with languages. Languages and search don't always work like you think they do. If you're not getting any search results, try adding your site default language to the search URL.

Spellcheck/"Did You Mean…" functionality

  1. Verify that the search_api_spellcheck module is installed and enabled, and that you have admin access to its configuration.

  2. Navigate back to your Simple Content Index.

  3. Edit your fields and add an "aggregated field":

    • Aggregation type: concatenation

    • Contained Fields: Content => body (and any other field from which you want to draw suggestions)

  4. Once the field is added to the list, change the field type to "spellcheck".

  5. Once that field is set up, add a similar field for suggestions, but change the field type to "Suggester".

  6. Anytime you change the fields config, you will need to re-index.

  7. Edit your view and add a header or footer containing "Search API Spellcheck" and/or "Search API Suggestions".

"More Like This" functionality

"More like this" is a block that locates similarly keyed items from a source item ID. To add "More Like this" create a view from the index in question and instead of displaying the results in a page, just display the results in a block. Under "Contextual Filters" add the "more like this" context ual filter that creates the default argument with the "content id from URL". You can now place this block on a page and it will show related items where placed.

Alternate Query Types

When you edit your view and open the "Search Fulltext" filter, there is a select list labeled "Parse Mode". Changing the parse mode lets you query the index in non-traditional ways.

The Extended DisMax query parser is designed to process simple phrases (without complex syntax) entered by users and to search for individual terms across several fields using different weighting (boosts) based on the significance of each field. Additional options enable users to influence the score based on rules specific to each use case (independent of user input). In other words, it makes querying more like Google, with the ability to include and exclude items using a simple phrase syntax. For example, "this NOT that BUT ALSO the other" would return THIS and THE OTHER as long as THAT is not included in the results.
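
As a rough illustration of the per-field weighting described above, this sketch builds the request parameters Solr expects for an EDisMax query: `defType=edismax`, the user's text in `q`, and boosted fields in `qf`. Drupal generates the real request when you pick the parse mode, so this is only a sketch. The `edismax_params` helper, the field names, and the boost values are assumptions.

```python
from urllib.parse import parse_qs, urlencode


def edismax_params(user_query: str, field_boosts: dict[str, float]) -> dict[str, str]:
    """Hypothetical helper: Solr parameters for an Extended DisMax query."""
    # qf lists the fields to search, each with an optional ^boost.
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return {"defType": "edismax", "q": user_query, "qf": qf}


# "-that" excludes a term; the quotes keep "the other" together as a phrase.
params = edismax_params('this -that "the other"',
                        {"title": 2.0, "rendered_item": 1.0})
query_string = urlencode(params)
print(query_string)
```

Here a match in the (assumed) `title` field counts twice as much as a match in `rendered_item`, which is the kind of trade-off worth tuning against your own content.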

You can use EDisMax in conjunction with "fuzziness" and "sloppiness" to handle natural language and misspellings more effectively. Tweak the settings and indices by trial and error to find the best combination of features for your site's content and user base.
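
In Lucene query syntax, which Solr uses, fuzziness is written as `term~N` (match terms within N edits) and sloppiness as `"some phrase"~N` (allow the phrase's words to be up to N positions apart). The sketch below only formats such strings. The helper names are hypothetical, and the parse mode settings in the view normally produce this syntax for you.

```python
def fuzzy(term: str, max_edits: int = 1) -> str:
    """Hypothetical helper: format a fuzzy term such as 'curry~1'."""
    if max_edits not in (0, 1, 2):
        # Lucene fuzzy matching supports an edit distance of at most 2.
        raise ValueError("max_edits must be 0, 1, or 2")
    return f"{term}~{max_edits}"


def sloppy(phrase: str, slop: int = 2) -> str:
    """Hypothetical helper: format a sloppy phrase such as '"thai curry"~2'."""
    if slop < 0:
        raise ValueError("slop must not be negative")
    return f'"{phrase}"~{slop}'


# A misspelled term can still match nearby spellings.
print(fuzzy("cury", 1))
# Words of the phrase may appear a couple of positions apart.
print(sloppy("thai curry", 2))
```

Higher values match more loosely and usually return more, but less precise, results, which is why trial and error against real queries is the practical way to choose them.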

Searching Attached Files

You can use the search_api_attachments module to search inside files such as PDFs, MS Word documents, and text files. Add the module to your composer file. The Tika library it uses to extract text from the files is already installed on the server.