Today my job was to find out why a customer’s search results are so weird, and why they get even weirder with more specific search terms.
The symptom: if you search for “washing machine” you get a ton of washing machines, but if you search for “washing machine $brand”, you get all washing machines and all machines from $brand.
To debug this, I finally found a way to get an explanation of what is happening: the explain query. With it you can ask Elasticsearch/OpenSearch (ES/OS) how the score for a given search term and document is calculated, which gives you an idea of where to optimise. The linked article is great; two small additions:
- You don’t need to target a specific field; a simple query across all fields works as well
- The query is sent as a POST request
POST /$indexName/_explain/$documentId
{
  "query": {
    "query_string": {
      "query": "my search query"
    }
  }
}
Turns out: Elasticsearch gives great search results and the order is awesome too!
BUT…
But Shopware throws the scoring away, not just once but, in my opinion, twice:
// \Shopware\Core\Content\Product\SalesChannel\Listing\ProductListingLoader::load
public function load(Criteria $origin, SalesChannelContext $context): EntitySearchResult
{
    $origin->addState(Criteria::STATE_ELASTICSEARCH_AWARE);
    $criteria = clone $origin;

    $this->addGrouping($criteria);
    $this->handleAvailableStock($criteria, $context);

    $ids = $this->productRepository->searchIds($criteria, $context);
    /** @var list<string> $keys */
    $keys = $ids->getIds();

    $aggregations = $this->productRepository->aggregate($criteria, $context);
    // no products found, no need to continue
    if (empty($keys)) {
        return new EntitySearchResult(
            ProductDefinition::ENTITY_NAME,
            0,
            new ProductCollection(),
            $aggregations,
            $origin,
            $context->getContext()
        );
    }

    $mapping = array_combine($keys, $keys);
In line 10 the score is still available in $ids, but by line 27 at the latest it is gone.
And in the hydrator, the same happens:
// \Shopware\Elasticsearch\Framework\DataAbstractionLayer\ElasticsearchEntitySearchHydrator::hydrate
public function hydrate(EntityDefinition $definition, Criteria $criteria, Context $context, array $result): IdSearchResult
{
    if (!isset($result['hits'])) {
        return new IdSearchResult(0, [], $criteria, $context);
    }

    $hits = $this->extractHits($result);

    $data = [];
    foreach ($hits as $hit) {
        $id = $hit['_id'];

        $data[$id] = [
            'primaryKey' => $id,
            'data' => array_merge(
                $hit['_source'] ?? [],
                ['id' => $id, '_score' => $hit['_score']]
            ),
        ];
    }
I don’t know exactly why, but my understanding is that $result contains nicely ordered products, and after the foreach that order is gone. I assume PHP handles the explicitly set keys in some way and reorders them. MAYBE(!) this is only a debugging artefact!
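The key-ordering assumption can be checked in isolation with a few lines of plain PHP. The following is only a minimal sketch with made-up ids and scores (not real Shopware data, and not the actual hydrator), reproducing just the "key the array by id inside a foreach" pattern:

```php
<?php
// Hypothetical hits, already sorted by score the way the search engine returns them.
$hits = [
    ['_id' => 'c3', '_score' => 9.1],
    ['_id' => 'a1', '_score' => 4.7],
    ['_id' => 'b2', '_score' => 1.2],
];

// Same pattern as in the hydrator: build an array keyed by the document id.
$data = [];
foreach ($hits as $hit) {
    $id = $hit['_id'];
    $data[$id] = ['primaryKey' => $id, '_score' => $hit['_score']];
}

// PHP arrays are ordered maps: keys are kept in insertion order, not sorted.
var_dump(array_keys($data));
```

Since PHP keeps array keys in insertion order, the ids come out in the same order as the hits went in. If the order still changes in a real run, that points to something after this loop, or to a debugging artefact, rather than to the key assignment itself.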
I hope I am allowed to debug this further and fix it. Stay tuned for more.
Great post, Fabian!
Although I cannot judge the scoring topic you are addressing, maybe I can add something about the result count growing with more specific queries. The reason for this is the default “OR” search used by Shopware.
If you can influence the Elasticsearch query and maybe even use a staged approach for your search, results will become much more relevant even before scoring.
Some more details are explained here: https://www.searchhub.io/quick-start-with-ocss-creating-a-silver-bullet
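The effect of the default operator can be tried out directly against the index, independent of Shopware. The request body below is only a sketch for experimenting, sent to `POST /$indexName/_search`: `default_operator` is a standard parameter of the `query_string` query, the search text is a placeholder, and whether the query that Shopware generates can be switched to AND in the same way is an open question here.

```json
{
  "query": {
    "query_string": {
      "query": "washing machine $brand",
      "default_operator": "AND"
    }
  }
}
```

With `"default_operator": "OR"` (the `query_string` default) a document only has to match one of the terms, which is why adding a term makes the result set larger. With AND, every term has to match, so adding a term can only narrow the results.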