feat(container): update image getmeili/meilisearch to v1.12.0 #1156

spicerabot · 2024-12-21T10:06:36Z

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| getmeili/meilisearch | minor | `v1.10.3` -> `v1.12.0` |

Warning

Some dependencies could not be looked up. Check the Dependency Dashboard for more information.

Release Notes

meilisearch/meilisearch (getmeili/meilisearch)

`v1.12.0`: 🦗

Compare Source

Meilisearch v1.12 introduces significant indexing speed improvements, almost halving the time required to index large datasets. This release also introduces new settings to customize and potentially further increase indexing speed.

🧰 All official Meilisearch integrations (including SDKs, clients, and other tools) are compatible with this Meilisearch release. Integration deployment happens between 4 and 48 hours after a new version becomes available.

Some SDKs might not include all new features. Consult the project repository for detailed information. Is a feature you need missing from your chosen SDK? Create an issue letting us know you need it, or, for open-source karma points, open a PR implementing it (we'll love you for that ❤️).

New features and updates 🔥

Improve indexing speed

Indexing time is improved across the board!

Performance is maintained or better on smaller machines
On bigger machines with multiple cores and good IO, Meilisearch v1.12 is much faster than Meilisearch v1.11
- More than twice as fast for raw document insertion tasks.
- More than x4 as fast for incrementally updating documents in a large database.
- Embeddings generation was also improved up to x1.5 for some workloads.

The new indexer also makes task cancellation faster.

Done by @dureuill, @ManyTheFish, and @Kerollmops in #4900.

New index settings: use `facetSearch` and `prefixSearch` to improve indexing speed

v1.12 introduces two new index settings: facetSearch and prefixSearch.

Both settings allow you to skip parts of the indexing process. This leads to significant improvements to indexing speed, but may negatively impact search experience in some use cases.

Done by @ManyTheFish in #5091

`facetSearch`

Use this setting to toggle facet search:

curl \
  -X PUT 'http://localhost:7700/indexes/books/settings/facet-search' \
  -H 'Content-Type: application/json' \
  --data-binary 'true'

The default value for facetSearch is true. When set to false, this setting disables facet search for all filterable attributes in an index.

`prefixSearch`

Use this setting to configure the ability to search a word by prefix on an index:

curl \
  -X PUT 'http://localhost:7700/indexes/books/settings/prefix-search' \
  -H 'Content-Type: application/json' \
  --data-binary '"disabled"'

prefixSearch accepts one of the following values:

"indexingTime": enables prefix processing during indexing. This is the default Meilisearch behavior
"disabled": deactivates prefix search completely

Disabling prefix search means the query `he` will no longer match the word `hello`. This may significantly impact search result relevancy, but speeds up the indexing process.
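As a rough illustration of the two settings above, the following Python sketch builds the same two `PUT` requests as the curl commands without sending them. The base URL, the `books` index, and the helper name `settings_request` are assumptions for illustration; only the routes and values come from the release notes.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7700"  # assumed local instance, as in the curl examples


def settings_request(index_uid, setting, value):
    """Build (but do not send) a PUT request for one index sub-setting."""
    return urllib.request.Request(
        url=f"{BASE_URL}/indexes/{index_uid}/settings/{setting}",
        data=json.dumps(value).encode("utf-8"),  # the body is a bare JSON value
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Trade search features for indexing speed on the hypothetical "books" index.
facet_req = settings_request("books", "facet-search", False)
prefix_req = settings_request("books", "prefix-search", "disabled")

# To apply them against a running instance, pass each request to
# urllib.request.urlopen(...).
```

Serializing the value with `json.dumps` keeps the body valid JSON for both the boolean and the string setting.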

New API route: `/batches`

The new /batches endpoint allows you to query information about task batches.

GET /batches returns a list of batch objects:

curl  -X GET 'http://localhost:7700/batches'

This endpoint accepts the same parameters as the GET /tasks route, allowing you to narrow down which batches you want to see. Parameters used with GET /batches apply to the tasks, not the batches themselves. For example, GET /batches?uid=0 returns batches containing tasks with a taskUid of 0, not batches with a batchUid of 0.

You may also query GET /batches/:uid to retrieve information about a single batch object:

curl  -X GET 'http://localhost:7700/batches/BATCH_UID'

/batches/:uid does not accept any parameters.
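The two forms of the route can be sketched as a small URL helper. This is only an illustration: the base URL and the function name `batches_url` are assumptions, and the `uid` filter is copied from the example in the notes rather than checked against the full parameter list.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:7700"  # assumed local instance


def batches_url(batch_uid=None, **task_filters):
    """Return the URL for one batch, or for a task-filtered list of batches."""
    if batch_uid is not None:
        # The single-batch route takes no query parameters.
        if task_filters:
            raise ValueError("/batches/:uid does not accept any parameters")
        return f"{BASE_URL}/batches/{batch_uid}"
    query = urlencode(task_filters)
    return f"{BASE_URL}/batches" + (f"?{query}" if query else "")


all_batches = batches_url()
batches_with_task_0 = batches_url(uid=0)  # filters on tasks, not on batch uid
single_batch = batches_url(batch_uid=160)
```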

Batch objects contain the following fields:

{
  "uid": 160,
  "progress": {
    "steps": [
      {
        "currentStep": "processing tasks",
        "finished": 0,
        "total": 2
      },
      {
        "currentStep": "indexing",
        "finished": 2,
        "total": 3
      },
      {
        "currentStep": "extracting words",
        "finished": 3,
        "total": 13
      },
      {
        "currentStep": "document",
        "finished": 12300,
        "total": 19546
      }
    ],
    "percentage": 37.986263
  },
  "details": {
    "receivedDocuments": 19547,
    "indexedDocuments": null
  },
  "stats": {
    "totalNbTasks": 1,
    "status": {
      "processing": 1
    },
    "types": {
      "documentAdditionOrUpdate": 1
    },
    "indexUids": {
      "mieli": 1
    }
  },
  "duration": null,
  "startedAt": "2024-12-12T09:44:34.124726733Z",
  "finishedAt": null
}
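The release notes do not say how `percentage` is derived from `steps`. One reading, offered here as an assumption rather than documented behavior, is that each step is nested inside the unit of work the previous step is currently processing. Under that reading, the sample above works out to roughly the reported value:

```python
def nested_percentage(steps):
    """Estimate overall progress, treating each step as nested in the previous one.

    Assumes every step has a non-zero "total".
    """
    fraction = 0.0
    weight = 1.0  # share of the whole job represented by one unit of the current step
    for step in steps:
        weight /= step["total"]
        fraction += step["finished"] * weight
    return fraction * 100


# The "steps" array from the sample batch object above.
steps = [
    {"currentStep": "processing tasks", "finished": 0, "total": 2},
    {"currentStep": "indexing", "finished": 2, "total": 3},
    {"currentStep": "extracting words", "finished": 3, "total": 13},
    {"currentStep": "document", "finished": 12300, "total": 19546},
]

estimate = nested_percentage(steps)  # lands within rounding distance of 37.986263
```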

Additionally, task objects now include a new field, batchUid. Use this field together with /batches/:uid to retrieve data on a specific batch.

{
  "uid": 154,
  "batchUid": 142,
  "indexUid": "movies_test2",
  "status": "succeeded",
  "type": "documentAdditionOrUpdate",
  "canceledBy": null,
  "details": {
    "receivedDocuments": 1,
    "indexedDocuments": 1
  },
  "error": null,
  "duration": "PT0.027766819S",
  "enqueuedAt": "2024-12-02T14:07:34.974430765Z",
  "startedAt": "2024-12-02T14:07:34.99021667Z",
  "finishedAt": "2024-12-02T14:07:35.017983489Z"
}

Done by @irevoire in #5060, #5070, #5080

Other improvements

New query parameter for GET /tasks: reverse. If reverse is set to true, tasks will be returned in reversed order, from oldest to newest tasks. Done by @irevoire in #5048
Phrase searches with showMatchesPosition set to true give a single location for the whole phrase, by @flevi29 in #4928
New Prometheus metrics by @PedroTurik in #5044
When a query finds matching terms in document fields with array values, Meilisearch now includes an indices field to _matchesPosition specifying which array elements contain the matches by @LukasKalbertodt in #5005
⚠️ Breaking vectorStore change: field distribution no longer contains _vectors. Its value used to be incorrect, and there is no current use case for the fixed, most likely empty, value. Done as part of #4900
Improve error message by adding index name in #5056 by @airycanon

Fixes 🐞

Return appropriate error when primary key is greater than 512 bytes, by @flevi29 in #4930
Fix issue where numbers were segmented in different ways depending on tokenizer, by @dqkqd in https://github.com/meilisearch/charabia/pull/311
Fix pagination when embedding fails by @dureuill in https://github.com/meilisearch/meilisearch/pull/5063
Fix issue causing Meilisearch to ignore stop words in some cases by @ManyTheFish in #5062
Fix phrase search with attributesToSearchOn in #5062 by @ManyTheFish

Misc

Dependencies updates
- Update benchmarks to match the new crates subfolder by @Kerollmops in #5021
- Fix the benchmarks by @irevoire in #5037
- Bump Swatinem/rust-cache from 2.7.1 to 2.7.5 in #5030
- Update charabia v0.9.2 by @ManyTheFish in #5098
- Update mini-dashboard to v0.2.16 version by @curquiza in #5102
CIs and tests
- Improve performance of delete_index.rs by @DerTimonius in #4963
- Improve performance of create_index.rs by @DerTimonius in #4962
- Improve performance of get_documents.rs by @PedroTurik in #5025
- Improve performance of formatted.rs by @PedroTurik in #5043
- Fix the path used in the flaky tests CI by @Kerollmops in #5049
Misc
- Rollback the Meilisearch Kawaii logo by @Kerollmops in #5017
- Add image source label to Dockerfile by @wuast94 in #4990
- Hide code complexity into a subfolder by @Kerollmops in #5016
- Internal tool: implement offline upgrade from v1.10 to v1.11 by @irevoire in #5034
- Internal tool: implement offline upgrade from v1.11 to v1.12 by @ManyTheFish in #5146
- Meilisearch is now able to retrieve Katakana words from a Hiragana query by @tats-u in https://github.com/meilisearch/charabia/pull/312
- Improve error handling when writing into LMDB by @Kerollmops in https://github.com/meilisearch/meilisearch/pull/5089

❤️ Thanks again to our external contributors:

`v1.11.3`: 🐿️

Compare Source

What's Changed

For REST/OpenAI/ollama autoembedders users: Retry if deserialization of remote response failed by @dureuill in https://github.com/meilisearch/meilisearch/pull/5058

Full Changelog: meilisearch/meilisearch@v1.11.2...v1.11.3

`v1.11.1`: 🐿️

What's Changed

Add 3s timeout to embedding requests made during search by @dureuill in https://github.com/meilisearch/meilisearch/pull/5039

Full Changelog: meilisearch/meilisearch@v1.11.0...v1.11.1

`v1.11.0`: 🐿️

Compare Source

Meilisearch v1.11 introduces AI-powered search performance improvements thanks to binary quantization and various usage changes, all of which are steps towards a future stabilization of the feature. We have also improved federated search usage following user feedback.

🧰 All official Meilisearch integrations (including SDKs, clients, and other tools) are compatible with this Meilisearch release. Integration deployment happens between 4 and 48 hours after a new version becomes available.

Some SDKs might not include all new features. Consult the project repository for detailed information. Is a feature you need missing from your chosen SDK? Create an issue letting us know you need it, or, for open-source karma points, open a PR implementing it (we'll love you for that ❤️).

New features and updates 🔥

Experimental - AI-powered search improvements

This release is Meilisearch's first step towards stabilizing AI-powered search and introduces a few breaking changes to its API. Consult the PRD for full usage details.

Done by @dureuill in #4906, #4920, #4892, and #4938.

⚠️ Breaking changes

When performing AI-powered searches, hybrid.embedder is now a mandatory parameter in GET and POST /indexes/{:indexUid}/search
As a consequence, it is now mandatory to pass hybrid even for pure semantic searches
embedder is now a mandatory parameter in GET and POST /indexes/{:indexUid}/similar
Meilisearch now ignores semanticRatio and performs a pure semantic search for queries that include vector but not q
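A client-side check for these rules might look like the sketch below. It is not part of Meilisearch or any SDK; the function name, the messages, and the `"default"` embedder name are all hypothetical.

```python
def check_search_payload(payload):
    """Return a list of problems with a search payload under the v1.11 rules above."""
    problems = []
    hybrid = payload.get("hybrid")
    uses_ai = hybrid is not None or "vector" in payload
    if uses_ai and not (hybrid or {}).get("embedder"):
        problems.append("hybrid.embedder is mandatory for AI-powered searches")
    if "vector" in payload and "q" not in payload and hybrid and "semanticRatio" in hybrid:
        problems.append("semanticRatio is ignored when vector is sent without q")
    return problems


# A pure semantic search that worked without "hybrid" before v1.11.
old_style = {"vector": [0.1, 0.2, 0.3]}
# The same search with the now-mandatory embedder ("default" is a made-up name).
new_style = {"vector": [0.1, 0.2, 0.3], "hybrid": {"embedder": "default"}}

old_problems = check_search_payload(old_style)
new_problems = check_search_payload(new_style)
```

Running a check like this over stored queries before upgrading can surface calls that would start failing.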

Addition & improvements

The default model for OpenAI is now text-embedding-3-small instead of text-embedding-ada-002
This release introduces a new embedder option: documentTemplateMaxBytes. Meilisearch will truncate a document's template text when it goes over the specified limit
Fields in documentTemplate include a new field.is_searchable property. The default document template now filters out both empty fields and fields not in the searchable attributes list:

v1.11:

{% for field in fields %}
  {% if field.is_searchable and not field.value == nil %}
    {{ field.name }}: {{ field.value }}\n
  {% endif %}
{% endfor %}

v1.10:

{% for field in fields %}
  {{ field.name }}: {{ field.value }}\n
{% endfor %}

Embedders using the v1.10 document template will continue working as before. The new default document template will only work with newly created embedders.
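To make the difference between the two default templates concrete, here is a plain-Python approximation of what each one keeps. It does not use a Liquid engine and ignores whitespace details; the function name and the sample fields are invented for illustration.

```python
def render_default_template(fields, version="v1.11"):
    """Approximate the default document template for the given version."""
    lines = []
    for field in fields:
        keep = field["is_searchable"] and field["value"] is not None
        if version == "v1.11" and not keep:
            continue  # v1.11 drops empty and non-searchable fields
        value = "" if field["value"] is None else field["value"]
        lines.append(f"{field['name']}: {value}")
    return "\n".join(lines)


# Hypothetical document fields.
fields = [
    {"name": "title", "value": "Batman", "is_searchable": True},
    {"name": "internal_id", "value": 42, "is_searchable": False},
    {"name": "overview", "value": None, "is_searchable": True},
]

new_text = render_default_template(fields, "v1.11")
old_text = render_default_template(fields, "v1.10")
```

In this approximation the v1.11 output contains only the `title` line, while the v1.10 output contains all three fields.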

Vector database indexing performance improvements

v1.11 introduces a new embedder option, binaryQuantized:

curl \
  -X PATCH 'http://localhost:7700/indexes/movies/settings' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "embedders": {
      "image2text": {
        "binaryQuantized": true
      }
    }
  }'

Enable binary quantization to convert embeddings of floating point numbers into embeddings of boolean values. This will negatively impact the relevancy of AI-powered searches but significantly improve performance in large collections with more than 100 dimensions.

In our benchmarks, this reduced the size of the database by a factor of 10 and divided the indexing time by a factor of 6 with little impact on search times.

[!WARNING]
Enabling this feature will update all of your vectors to contain only 1s or -1s, significantly impacting relevancy.

You cannot revert this option once you enable it. Before setting binaryQuantized to true, Meilisearch recommends testing it in a smaller or duplicate index in a development environment.
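The relevancy cost is easy to see in a toy example. The sketch below assumes quantization keeps only the sign of each dimension, which matches the "1s or -1s" description but is not a statement of Meilisearch's exact rule; the vectors are made up.

```python
def binarize(vector):
    """Map each dimension to 1 or -1 by sign (assumed rule; zero maps to -1)."""
    return [1 if x > 0 else -1 for x in vector]


# Two clearly different embeddings...
a = [0.9, 0.1, -0.2]
b = [0.1, 0.8, -0.7]

# ...that become indistinguishable once quantized.
qa = binarize(a)
qb = binarize(b)
```

Both vectors quantize to `[1, 1, -1]`, so any ranking difference between them is lost. With many dimensions fewer such collisions occur, which fits the note that the option is aimed at embeddings with more than 100 dimensions.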

Done by @irevoire in #4941.

Federated search improvements

Facet distribution and stats for federated searches

This release adds two new federated search options, facetsByIndex and mergeFacets. These allow you to request a federated search for facet distributions and stats data.

Facet information by index

To obtain facet distribution and stats for each separate index, use facetsByIndex when querying the POST /multi-search endpoint:

POST /multi-search
{
  "federation": {
    "limit": 20,
    "offset": 0,
    "facetsByIndex": {
      "movies": ["title", "id"],
      "comics": ["title"]
    }
  },
  "queries": [
    {
      "q": "Batman",
      "indexUid": "movies"
    },
    {
      "q": "Batman",
      "indexUid": "comics"
    }
  ]
}

The multi-search response will include a new field, facetsByIndex, with facet data separated per index:

{
  "hits": […],
  …
  "facetsByIndex": {
      "movies": {
        "distribution": {
          "title": {
            "Batman returns": 1
          },
          "id": {
            "42": 1
          }
        },
        "stats": {
          "id": {
            "min": 42,
            "max": 42
          }
        }
      },
     …
  }
}

Merged facet information

To obtain facet distribution and stats for all indexes merged into a single result, use both facetsByIndex and mergeFacets when querying the POST /multi-search endpoint:

POST /multi-search
{
  "federation": {
    "limit": 20,
    "offset": 0,
    "facetsByIndex": {
      "movies": ["title", "id"],
      "comics": ["title"]
    },
    "mergeFacets": {
      "maxValuesPerFacet": 10
    }
  },
  "queries": [
    {
      "q": "Batman",
      "indexUid": "movies"
    },
    {
      "q": "Batman",
      "indexUid": "comics"
    }
  ]
}

The response includes two new fields, facetDistribution and facetStats:

{
  "hits": […],
  …
  "facetDistribution": {
    "title": {
      "Batman returns": 1,
      "Batman: the killing joke":
    },
    "id": {
      "42": 1
    }
  },
  "facetStats": {
    "id": {
      "min": 42,
      "max": 42
    }
  }
}
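Conceptually, the merged response relates to the per-index one roughly as in the sketch below. This is a local approximation and not Meilisearch's implementation: it assumes counts for the same facet value are summed across indexes and that values are kept in ascending order before `maxValuesPerFacet` is applied. The count for the comics entry is made up.

```python
from collections import Counter


def merge_facet_distributions(facets_by_index, max_values_per_facet=None):
    """Sum per-index facet counts into one distribution per facet."""
    merged = {}
    for index_facets in facets_by_index.values():
        for facet, counts in index_facets["distribution"].items():
            merged.setdefault(facet, Counter()).update(counts)
    result = {}
    for facet, counts in merged.items():
        values = sorted(counts)  # assumed ordering before truncation
        if max_values_per_facet is not None:
            values = values[:max_values_per_facet]
        result[facet] = {value: counts[value] for value in values}
    return result


# Shaped like the facetsByIndex response; the "comics" count is hypothetical.
facets_by_index = {
    "movies": {"distribution": {"title": {"Batman returns": 1}, "id": {"42": 1}}},
    "comics": {"distribution": {"title": {"Batman: the killing joke": 1}}},
}

merged = merge_facet_distributions(facets_by_index, max_values_per_facet=10)
```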

Done by @dureuill in #4929.

Experimental — New `STARTS WITH` filter operator

Enable the experimental feature to use the STARTS WITH filter operator:

curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "containsFilter": true
  }'

Use the STARTS WITH operator when filtering:

curl \
  -X POST http://localhost:7700/indexes/movies/search \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "filter": "hero STARTS WITH spider"
  }'

🗣️ This is an experimental feature, and we need your help to improve it! Share your thoughts and feedback on this GitHub discussion.

Done by @Kerollmops in #4939.

Other improvements

Language support and localizedAttributes settings by @ManyTheFish in #4937
- Add ISO-639-1 variants
- Convert ISO-639-1 into ISO-639-3
Add a German language tokenizer by @luflow in meilisearch/charabia#303 and in #4945
Improve Turkish language support by @tkhshtsh0917 in meilisearch/charabia#305 and in #4957
Upgrade "batch failed" log to error level in #4955 by @dureuill.
Update the search UI: remove the forced capitalized fields, by @curquiza in #4993

Fixes 🐞

⚠️ When using federated search, query.facets was silently ignored at the query level, but should not have been. It now returns the appropriate error. Use federation.facetsByIndex instead if you want facets to be applied during federated search.
Prometheus /metrics now returns the route pattern instead of the real route when reporting the total number of HTTP requests, by @irevoire in #4839
Truncate values at the end of a list of facet values when the number of facet values is larger than maxValuesPerFacet. For example, setting maxValuesPerFacet to 2 could result in ["blue", "red", "yellow"] being truncated to ["blue", "yellow"] instead of ["blue", "red"]. By @dureuill in #4929
Improve the task cancellation when vectors are used, by @irevoire in #4971
Swedish support: the characters å, ä, ö are no longer normalized to a and o. By @ManyTheFish in #4945
Update rhai to fix an internal error when updating documents with a function (experimental) by @irevoire in #4960
Fix the bad experimental search queue size by @irevoire in #4992
Do not send empty edit document by function by @irevoire in #5001
Display vectors when no custom vectors were ever provided by @dureuill in #5008

Misc

Dependencies updates
- Security dependency upgrade: bump quinn-proto from 0.11.3 to 0.11.8 by @dependabot in #4911
CIs and tests
- Make the tests run faster by @irevoire in #4808
Documentation
- Fix broken links in README by @iornstein in #4943
Misc
- Allow Meilitool to upgrade from v1.9 to v1.10 without a dump in some conditions, by @dureuill in #4912
- Fix bench by adding embedder by @dureuill in #4954
- Revamp analytics by @irevoire in #5011

❤️ Thanks again to our external contributors:

Meilisearch: @iornstein.
Charabia: @luflow, @tkhshtsh0917.

Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.

If you want to rebase/retry this PR, check this box

This PR has been generated by Renovate Bot.

spicerabot · 2024-12-21T10:07:11Z

--- kubernetes/apps/selfhosted/hoarder/app Kustomization: flux-system/hoarder HelmRelease: selfhosted/hoarder

+++ kubernetes/apps/selfhosted/hoarder/app Kustomization: flux-system/hoarder HelmRelease: selfhosted/hoarder

@@ -66,13 +66,13 @@

               MEILI_NO_ANALYTICS: 'true'
             envFrom:
             - secretRef:
                 name: hoarder-secret
             image:
               repository: getmeili/meilisearch
-              tag: v1.10.3@sha256:9d1b9b02fe6c68f60b54ce40092d8078f051b9341c400c90f907607636b7c9c1
+              tag: v1.12.0@sha256:27f831c41cb735a9c2314b61a33a14552367af6b3e4bcf840f21fd0e64e37a8a
             resources:
               limits:
                 memory: 128Mi
               requests:
                 cpu: 10m
     ingress:

spicerabot · 2024-12-21T10:07:12Z

--- HelmRelease: selfhosted/hoarder Deployment: selfhosted/hoarder

+++ HelmRelease: selfhosted/hoarder Deployment: selfhosted/hoarder

@@ -79,13 +79,13 @@

       - env:
         - name: MEILI_NO_ANALYTICS
           value: 'true'
         envFrom:
         - secretRef:
             name: hoarder-secret
-        image: getmeili/meilisearch:v1.10.3@sha256:9d1b9b02fe6c68f60b54ce40092d8078f051b9341c400c90f907607636b7c9c1
+        image: getmeili/meilisearch:v1.12.0@sha256:27f831c41cb735a9c2314b61a33a14552367af6b3e4bcf840f21fd0e64e37a8a
         name: meilisearch
         resources:
           limits:
             memory: 128Mi
           requests:
             cpu: 10m

spicerabot bot added renovate/container type/minor labels Dec 21, 2024

spicerabot bot requested a review from spiceratops as a code owner December 21, 2024 10:06

spicerabot bot assigned spiceratops Dec 21, 2024

spicerabot bot added the area/kubernetes label Dec 21, 2024

feat(container): update image getmeili/meilisearch to v1.12.0

b21bee4

spicerabot bot force-pushed the renovate/getmeili-meilisearch-1.x branch from 9000ac0 to b21bee4 Compare December 23, 2024 11:06

spicerabot bot changed the title ~~feat(container): update image getmeili/meilisearch to v1.11.3~~ feat(container): update image getmeili/meilisearch to v1.12.0 Dec 23, 2024
