Tips & Tricks: How do I configure the shop search with Smartstore MegaSearch?

As an introduction, we recommend first reading the article Plug-in: Smartstore MegaSearch "Find instead of search":

https://smartstore.com/de/smartstore-megasearch-finden-statt-suchen

The MegaSearch plug-in replaces the simple standard search in Smartstore and offers the many possibilities of a full-text search engine based on Lucene.Net. In contrast to the standard search, MegaSearch does not work directly with the database but places a file-based search index in between, with the aim of making the search as fast and flexible as possible. MegaSearch currently includes one index for catalog data and one for forum data. The MegaSearch Plus plug-in extends MegaSearch with localized, language-dependent data, multistore support and access restrictions, as well as product and specification attributes. This post explains the various and sometimes complex setting options of the search using MegaSearch.

A note on the terms used. Lucene.Net is a .NET port of the original Apache Lucene search engine library. Product attributes (also known as product variants) are characteristics of a product that the buyer can select, such as size and color. Specification attributes, on the other hand, are not selectable and are displayed purely as text on product detail pages. Both types of attributes can be filtered within the search.

A word in advance on the subject of search speed. It depends on several factors: besides the configuration, also on the data volume and especially the data quality. If large amounts of uncleaned data are rashly pumped into a store, this can considerably impair the performance of the search. In tests, we tended to get good results when searching just under three million products, but saw performance degradation when faceting over 40,000 specification attributes or product groups, whatever the point of the latter may be. In other words, a growing amount of data to be faceted affects search performance more than a large amount of product or catalog data affects the actual full-text search. The term speed is to be understood in relative terms here. With a usual amount of data the search is very fast, and when a search setting is changed, a difference in speed is hardly noticeable, since it lies in the millisecond range.

The settings that affect the search are essentially divided into three areas: the general search settings, the MegaSearch configuration and, in individual cases, the editing of objects such as specification attributes.

General search settings

These settings apply to both the standard search and MegaSearch. They can be found at Configuration > Settings > Search.

The Search Mode determines when a search term counts as matched and a search hit is therefore obtained. The setting affects both the number of hits and the speed of the search. "Is equal to" (exact match) is the fastest, with comparatively few hits; "Starts with" is somewhat slower, with significantly more hits; and "Includes" is the slowest, with very many hits.
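The effect of the three search modes on the number of hits can be illustrated with a small sketch. It is a toy model, not MegaSearch code: the product names are made up, and comparing the term word by word is an assumption made only for illustration.

```python
# Toy model of the three search modes; not how MegaSearch is implemented.
# Product names and the word-by-word comparison are illustrative assumptions.
PRODUCT_NAMES = ["red blanket", "redwood table", "coloured lamp", "blue cover"]


def matches(term: str, name: str, mode: str) -> bool:
    """Return True if any word of the name matches the term in the given mode."""
    term = term.lower()
    words = name.lower().split()
    if mode == "equals":
        return any(word == term for word in words)
    if mode == "starts_with":
        return any(word.startswith(term) for word in words)
    if mode == "contains":
        return any(term in word for word in words)
    raise ValueError(f"unknown mode: {mode}")


def search_by_mode(term: str, mode: str) -> list[str]:
    """Return all product names that match the term in the given mode."""
    return [name for name in PRODUCT_NAMES if matches(term, name, mode)]


if __name__ == "__main__":
    for mode in ("equals", "starts_with", "contains"):
        print(mode, search_by_mode("red", mode))
```

Each mode returns a superset of the hits of the stricter one, which is why the hit list grows from "Is equal to" to "Includes".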

Search Fields defines the data fields to be searched. The fewer fields are searched, the faster the search, although the speed gain is usually comparatively small. The product name (or the topic title in forums) is not selectable, because it is always searched. The field for product tags is of special importance here: if a product should be found via a term that does not appear in any of the available fields, a product tag can be created for that term and assigned to the product. If you keep coming across a searched term in the search log (plug-in Smartstore search logging) that has no hits or too few hits, it is worth considering assigning a corresponding product tag to the product in question.

The Default sort order sets the default sort order of the search hits. "Best Results" ranks hits with high relevance before those with lower relevance. The relevance or scoring value of a search hit is not a percentage match to the search term. There is no such thing. It is therefore also not possible to filter search hits by, e.g., a match of 70 percent or more.

"Open product at SKU, MPN or GTIN directly" checks, before the search is run, whether the search term matches one of the product identifiers mentioned. In case of a match, the product page in question is opened directly, so no list of search hits is displayed.
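The behavior described above can be sketched as follows. This is a minimal illustration and not Smartstore code: the `Product` class, the sample catalog, the URLs and the case-insensitive exact comparison are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Product:
    """Hypothetical product record used only for this sketch."""
    name: str
    sku: str
    mpn: str
    gtin: str
    url: str


CATALOG = [
    Product("Wool blanket", "BL-100", "WB-2024", "4006381333931", "/wool-blanket"),
    Product("Table lamp", "LA-200", "TL-77", "4006381333948", "/table-lamp"),
]


def find_by_identifier(term: str, catalog: list[Product]) -> Optional[Product]:
    """Return the product whose SKU, MPN or GTIN equals the term (assumed case-insensitive)."""
    needle = term.strip().lower()
    for product in catalog:
        if needle in (product.sku.lower(), product.mpn.lower(), product.gtin.lower()):
            return product
    return None


def handle_search(term: str, catalog: list[Product]) -> str:
    """Open the product page on an identifier match, otherwise show the hit list."""
    product = find_by_identifier(term, catalog)
    if product is not None:
        return f"redirect:{product.url}"
    return f"results:{term}"


if __name__ == "__main__":
    print(handle_search("bl-100", CATALOG))
    print(handle_search("blanket", CATALOG))
```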

We will skip the settings for instant search (Search-As-You-Type), since they are self-explanatory, and go directly to the result filtering. MegaSearch offers so-called drill-down faceting, where the search result is narrowed down more and more as further filters are applied, including a display of the number of hits to be expected when applying the respective filter. Very many product groups, product attributes or specification attributes to be faceted can slow down faceting, and thus the search, because all combinations of these values have to be formed and checked for search hits. Reducing the value for "Maximum number of filters" (default is 20) can speed up this process if it causes the upper limit to be reached sooner. Assume that there are tens of thousands of product groups with no or very few products in them. Depending on the current search term, the faceting may then have to check thousands of empty product groups before it reaches the first one with a relevant product, and this happens on all pages on which facets are offered, i.e. in addition to the search also on all product group and manufacturer pages. This effect is further amplified if the catalog setting "Include products from subcategories" is activated.
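Drill-down faceting with hit counts can be sketched in a few lines. The sample data, field names and the way the upper limit is applied (keeping only the most frequent values) are assumptions for illustration; MegaSearch computes facets inside the Lucene.Net index and may work differently in detail.

```python
from collections import Counter

# Made-up sample data for illustration.
PRODUCTS = [
    {"name": "Wool blanket", "category": "Living room", "color": "red"},
    {"name": "Fleece blanket", "category": "Living room", "color": "blue"},
    {"name": "Picnic blanket", "category": "Outdoor", "color": "red"},
    {"name": "Baby blanket", "category": "Living room", "color": "red"},
]


def apply_filters(products: list[dict], filters: dict) -> list[dict]:
    """Keep only the products that match every active filter (drill-down)."""
    return [p for p in products if all(p.get(k) == v for k, v in filters.items())]


def facet_counts(products: list[dict], field: str, max_filters: int = 20) -> list[tuple]:
    """Count the hits per facet value and return at most max_filters values."""
    counts = Counter(p[field] for p in products if field in p)
    return counts.most_common(max_filters)


if __name__ == "__main__":
    hits = PRODUCTS  # pretend all four products matched the term "blanket"
    print(facet_counts(hits, "category"))
    narrowed = apply_filters(hits, {"category": "Living room"})
    print(facet_counts(narrowed, "color"))
```

Every additional facet field means another pass of counting over the current hits, which is why a very large number of values to facet weighs more heavily than the size of the catalog itself.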

Inactive facets can also be hidden completely, although those for product groups are always displayed. In addition, "Include unavailable products" lets you display unavailable products by default.

MegaSearch Configuration

These settings can be found at Plugins > MegaSearch.

The grid shows information about the catalog index and the forum index. On the right there is a menu with commands such as "Reindex" (rebuild the index) and "Refresh" (transfer the data of changed products to the index). A scheduled task ensures that the respective index is updated at a certain time interval. For large amounts of data this interval should not be too short, because besides product data, further metadata of product groups, product attributes, specification attributes etc. must also be transferred to the index. "Show Settings" shows index-specific settings, some of which only take effect after a reindexing.
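The difference between the two commands can be pictured with a deliberately simplified index, here a plain dictionary from product ID to terms. All names are hypothetical; the real index is a file-based Lucene.Net index.

```python
# Simplified picture of "Reindex" versus "Refresh"; all names are hypothetical.
def to_terms(product: dict) -> list[str]:
    """Derive index terms from the product name (very simplified)."""
    return product["name"].lower().split()


def reindex(products: list[dict]) -> dict:
    """Build the whole index from scratch: cost grows with the catalog size."""
    return {p["id"]: to_terms(p) for p in products}


def refresh(index: dict, changed: list[dict], deleted_ids: tuple = ()) -> dict:
    """Transfer only changed products to the index: cost grows with the number of changes."""
    for product in changed:
        index[product["id"]] = to_terms(product)
    for product_id in deleted_ids:
        index.pop(product_id, None)
    return index


if __name__ == "__main__":
    catalog = [{"id": 1, "name": "Wool blanket"}, {"id": 2, "name": "Table lamp"}]
    index = reindex(catalog)
    refresh(index, changed=[{"id": 2, "name": "Floor lamp"}], deleted_ids=(1,))
    print(index)
```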

Filters for product and specification attributes can be activated at this point (assuming MegaSearch Plus), so that their data is included in the search index and the corresponding filters are displayed in the frontend. The option "Ignore filtering at product level" is a special case for specification attributes. Whether a certain specification attribute can be filtered in the frontend can be set at both attribute and product level. With this option, MegaSearch is instructed to ignore the "Allow filtering" setting at product level altogether, which can speed up the faceting of very many specification attributes. We recommend setting "Allow filtering" only at attribute level and not at product level wherever possible, because this makes it easier to work with (many) specification attributes.
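The interplay of the two levels can be sketched as a small decision function. How exactly a product-level setting overrides the attribute-level setting is an assumption here; the function and parameter names are hypothetical.

```python
from typing import Optional


def is_filterable(
    attribute_allows: bool,
    product_level_allows: Optional[bool],
    ignore_product_level: bool,
) -> bool:
    """Decide whether a specification attribute is offered as a filter.

    Assumption for this sketch: a product-level value, if present, overrides
    the attribute-level value unless product-level settings are ignored.
    """
    if ignore_product_level or product_level_allows is None:
        return attribute_allows
    return product_level_allows


if __name__ == "__main__":
    # With the option active, only the attribute-level setting counts.
    print(is_filterable(True, False, ignore_product_level=True))
    print(is_filterable(True, False, ignore_product_level=False))
```

With the option active, the decision can be made once per attribute instead of once per attribute assignment, which is the plausible source of the speed gain.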

Top product groups and top manufacturers are displayed as links in the instant search so that the search hits can be narrowed down directly to the desired product group or manufacturer. The search field weighting is an important tool for determining the order of search hits. A higher weighting value for a field ranks a product higher in the search results, provided that the hit was achieved via the field in question. The differences between the individual weighting values should not be chosen too large, as this can lead to unwanted results.
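The principle of field weighting can be shown with a strongly simplified scoring model: a product's score is the sum of the weights of all fields in which the term occurs. The field names, weights and sample products are assumptions, and real Lucene scoring takes further factors into account.

```python
# Strongly simplified scoring model; field names and weights are assumptions.
FIELD_WEIGHTS = {"name": 3.0, "short_description": 2.0, "tags": 1.0}

PRODUCTS = [
    {"name": "Red blanket", "short_description": "Warm wool", "tags": "cozy"},
    {"name": "Wool blanket", "short_description": "Soft and warm", "tags": "red"},
    {"name": "Table lamp", "short_description": "Red shade", "tags": ""},
]


def score(product: dict, term: str, weights: dict) -> float:
    """Sum the weights of all fields that contain the term as a word."""
    term = term.lower()
    return sum(
        weight
        for field, weight in weights.items()
        if term in product.get(field, "").lower().split()
    )


def rank(products: list[dict], term: str, weights: dict) -> list[str]:
    """Return the names of all hits, highest score first."""
    scored = [(score(p, term, weights), p["name"]) for p in products]
    hits = [(s, name) for s, name in scored if s > 0]
    return [name for s, name in sorted(hits, key=lambda item: -item[0])]


if __name__ == "__main__":
    print(rank(PRODUCTS, "red", FIELD_WEIGHTS))
```

If the weight of one field is chosen far larger than all others, a single hit in that field outranks any combination of hits elsewhere, which illustrates why very large gaps between the weights can produce unwanted orderings.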

"Active Indexes" sets the activated indexes. If you don't use a forum in the frontend, you can deactivate the forum index at this point, as well as the associated scheduled task that updates the forum index. With "Always recreate index", MegaSearch is instructed to always perform a reindexing instead of an update. In that case, product data updates no longer need to be tracked and processed in the background.

Alternative suggestions for a search term are optionally displayed under "Did you mean?". For this purpose, the product names are stored as so-called N-grams in a separate search index. With hundreds of thousands of products, this indexing can be disproportionately time-consuming. It therefore makes sense to use "Maximum proposals to index" to set an upper limit.
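How N-grams make such suggestions possible can be sketched as follows. The sketch compares character trigrams with a simple overlap measure; the padding character, the threshold and the function names are assumptions, and the suggestion index actually used by MegaSearch will differ in detail.

```python
from typing import Optional


def ngrams(text: str, n: int = 3) -> set[str]:
    """Split a text into overlapping character n-grams (padded at both ends)."""
    padded = f"_{text.lower()}_"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def similarity(a: str, b: str) -> float:
    """Share of common n-grams among all n-grams of both texts (Jaccard index)."""
    grams_a, grams_b = ngrams(a), ngrams(b)
    union = grams_a | grams_b
    if not union:
        return 0.0
    return len(grams_a & grams_b) / len(union)


def did_you_mean(term: str, names: list[str], max_indexed: int = 1000,
                 threshold: float = 0.3) -> Optional[str]:
    """Suggest the most similar name among the first max_indexed names."""
    candidates = names[:max_indexed]  # upper limit keeps the suggestion index small
    best = max(candidates, key=lambda name: similarity(term, name), default=None)
    if best is None or similarity(term, best) < threshold:
        return None
    return best


if __name__ == "__main__":
    print(did_you_mean("blankte", ["blanket", "table", "lamp"]))
```

Each name is expanded into many N-grams, so the suggestion index grows much faster than the number of names, which is why an upper limit is useful for large catalogs.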

By default, when searching for multiple terms, search hits must contain at least one of the searched terms (logical OR). The more terms are entered, the more search hits are obtained. The MegaSearch option "Hits contain all searched terms" causes the opposite: all terms must be contained in the search hit (logical AND). The more terms are entered, the fewer search hits are obtained. This option is of interest especially for very large datasets with very many search hits. The default sort order "Best Results" does position the most relevant hits at the top, but in practice this is often not very helpful when there are very many hits. This is mainly due to the rather coarse Lucene scoring (on which "Best Results" is based), where as a rule relatively few distinct scores are calculated. In other words, with many hits there are also many that achieve the same score. When activating "Hits contain all searched terms", note that it increases the likelihood of cases where the instant search returns no hits but the search page does. This is the case if the hits come about via a field that is ignored in the instant search for performance reasons, e.g. the long description. Furthermore, the more search terms are entered, the slower the faceting generally becomes. With "Hits contain all searched terms" activated, this effect is stronger.
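The difference between the OR and the AND combination is easy to demonstrate on a handful of made-up texts. The sketch only models the matching logic; the `require_all` parameter is a hypothetical stand-in for the option described above.

```python
# Made-up documents; the matching is a simplified word comparison.
DOCS = {
    1: "red wool blanket",
    2: "red table lamp",
    3: "blue wool scarf",
}


def search(query: str, docs: dict, require_all: bool = False) -> list[int]:
    """Return the IDs of documents that contain any (OR) or all (AND) query terms."""
    terms = query.lower().split()
    if not terms:
        return []
    combine = all if require_all else any
    return [
        doc_id
        for doc_id, text in docs.items()
        if combine(term in text.lower().split() for term in terms)
    ]


if __name__ == "__main__":
    print(search("red wool", DOCS))                    # OR: every doc with "red" or "wool"
    print(search("red wool", DOCS, require_all=True))  # AND: only docs with both terms
```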

The settings for Text Analysis are aimed at advanced users and control how the texts to be searched are processed internally by Lucene.Net. Lucene.Net does not compare the search term directly with, e.g., the product name, but with the terms derived from it. Consequently, it is not the original text that is stored in the index, but the terms determined from it with the help of a so-called analyzer. Similarly, when a search term is entered, it is also broken down into terms to make it comparable with the index. With regard to internal processing, a distinction is therefore made between the search phase and the indexing phase.

The setting "Text Analysis" sets the "default analyzer", which is always used when no special analyzer is defined for a field (fallback). By default, MegaSearch always processes texts based on language (recommended); only in special cases should a different analyzer be selected here (experimental). For the fields SKU, EAN and manufacturer product number this has no effect, because these texts are always analyzed as keywords (KeywordAnalyzer). The Minimum word length specifies the minimum length of the terms emitted by the analyzer. Shorter terms are ignored during both the search and the indexing phase. With a very large search index or very many products, it can be useful to increase this value to reduce "hit noise", i.e. to achieve fewer but more accurate hits.
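The fallback logic and the minimum word length can be pictured like this. The analyzers are crude stand-ins for the Lucene.Net ones, the field names are hypothetical, and exempting keyword fields from the minimum word length is an assumption of this sketch.

```python
import re


def default_analyzer(text: str) -> list[str]:
    """Crude stand-in for a language-based analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_analyzer(text: str) -> list[str]:
    """Emit the whole value as one single term, like a keyword analysis."""
    return [text]


# Hypothetical field names; fields not listed here fall back to the default analyzer.
FIELD_ANALYZERS = {"sku": keyword_analyzer, "ean": keyword_analyzer, "mpn": keyword_analyzer}


def analyze(field: str, text: str, min_length: int = 2) -> list[str]:
    """Analyze a field value and drop terms shorter than min_length."""
    analyzer = FIELD_ANALYZERS.get(field, default_analyzer)
    terms = analyzer(text)
    if analyzer is keyword_analyzer:
        return terms  # assumption: keyword fields are not length-filtered
    return [term for term in terms if len(term) >= min_length]


if __name__ == "__main__":
    print(analyze("name", "A red XL blanket", min_length=3))
    print(analyze("sku", "XLB-A9"))
```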

"Advanced Text Analysis" enables further options for even finer control. "Divergent Text Analysis" specifies a word separation and filtering that differs from the "default analyzer" above, and it is used specifically for the text analysis of the product name, which is important for the search. The product name is a special case in terms of text analysis, because its text type can differ significantly from store to store. "Red living room blanket with check pattern for cozy evenings" would be an example of a descriptive name, and "XLB-A9.Cistus Incanus Pulver" an example of a designating, identifier-like name. If the product names are predominantly of the designating kind, it may make sense to use "Divergent Text Analysis" in order to avoid the names being chopped up too much or individual terms being filtered out. Example: for the product name "XLB-A9.Cistus Incanus Pulver", the MegaSearch default settings (without advanced text analysis) generate the following terms: xlb, a9, cistus, incanus, pulv. With respect to the first two terms, this tends to lead to too many hits that are perceived as too inaccurate when "xlb-a9.cistus" is searched for. With the value Whitespace, "Divergent Text Analysis" instead emits the following terms: xlb-a9.cistus, incanus, pulv. For the same search term, a more accurate hit list can be expected here. "Advanced Text Analysis" also allows you to maintain lists for special cases, including abbreviations, synonyms, dictionary additions and exceptions for word compounds. This can be important for special, frequently occurring terms. A store for computer hardware could, for example, maintain "notebook,laptop,convertible,mobilecomputer" as synonyms.
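The difference between the two kinds of word separation, and the effect of a synonym list, can be reproduced with a short sketch. It uses a shortened product name, leaves out stemming and stop-word filtering, and only approximates the behavior of the actual analyzers; all function names are hypothetical.

```python
import re

NAME = "XLB-A9.Cistus Incanus"  # shortened version of the example name
SYNONYMS = "notebook,laptop,convertible,mobilecomputer".split(",")


def split_standard_like(text: str) -> list[str]:
    """Approximate a standard word separation: break at every non-alphanumeric character."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_whitespace(text: str) -> list[str]:
    """Whitespace separation: break at spaces only, keep hyphens and dots."""
    return text.lower().split()


def expand_synonyms(term: str, synonym_group: list[str]) -> set[str]:
    """Return every term of the group if the term belongs to it, else only the term."""
    term = term.lower()
    return set(synonym_group) if term in synonym_group else {term}


if __name__ == "__main__":
    print(split_standard_like(NAME))   # many short terms, hence more but less exact hits
    print(split_whitespace(NAME))      # the identifier stays intact
    print(sorted(expand_synonyms("Laptop", SYNONYMS)))
```

With whitespace separation, a search for "xlb-a9.cistus" matches exactly one term instead of three short ones, which explains the more accurate hit list.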

Search settings when editing objects

Using MegaSearch Plus, you can specify for product and specification attributes whether filters for the relevant attribute should be offered in the frontend. In the case of specification attributes, this can also be set during product editing for individual assignments of attributes to products. However, it is recommended to make this setting directly at the attribute, because this makes it easier to work with attributes. If you observe performance degradation as a result of too many attributes, you should activate the already mentioned option "Ignore filtering at product level" and disable "Allow filtering" for those specification attributes that do not necessarily need to be filterable in the frontend.

For the display of the search filters, you can choose between checkboxes, color boxes, image boxes and "Numeric range". The latter is only available for specification attributes and requires numeric values to be stored in the options. This filter type can be used, for example, to define color ranges or color shades, which are then filtered in the frontend via a from-to selection. This type is also useful for attributes with a lot of options, where a single selection would be too confusing. The setting "Index Option Names" causes the names of options to be included in the search index so that the related products can also be found via them.
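A from-to selection over numeric option values amounts to a simple range check. The attribute name, sample products and function name below are made up for illustration.

```python
from typing import Optional

# Made-up products with a numeric specification attribute.
PRODUCTS = [
    {"name": "Blanket S", "width_cm": 90},
    {"name": "Blanket M", "width_cm": 140},
    {"name": "Blanket L", "width_cm": 200},
    {"name": "Gift card"},  # has no value for the attribute
]


def filter_by_range(products: list[dict], attribute: str,
                    minimum: Optional[float] = None,
                    maximum: Optional[float] = None) -> list[dict]:
    """Keep products whose numeric attribute value lies within [minimum, maximum]."""
    result = []
    for product in products:
        value = product.get(attribute)
        if value is None:
            continue  # products without a numeric value cannot match a range
        if minimum is not None and value < minimum:
            continue
        if maximum is not None and value > maximum:
            continue
        result.append(product)
    return result


if __name__ == "__main__":
    hits = filter_by_range(PRODUCTS, "width_cm", minimum=100, maximum=200)
    print([p["name"] for p in hits])
```

One range control replaces a long list of individual option checkboxes, which is why this type suits attributes with very many options.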

Outlook

Further options and extensions for MegaSearch that are currently being planned:

  • Sorting by recommendation, i.e. according to the order specified by the merchant. This option currently exists only for product lists of product groups, not in the search.
  • Search index for Page Builder stories.
  • Department search that provides a link to the department.