feat(container): update image getmeili/meilisearch to v1.12.0 #1156

Open
wants to merge 1 commit into
base: main
from

Conversation

spicerabot[bot]
Contributor

@spicerabot spicerabot bot commented Dec 21, 2024

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| getmeili/meilisearch | minor | v1.10.3 -> v1.12.0 |

Warning

Some dependencies could not be looked up. Check the Dependency Dashboard for more information.


Release Notes

meilisearch/meilisearch (getmeili/meilisearch)

v1.12.0: 🦗

Compare Source

Meilisearch v1.12 introduces significant indexing speed improvements, almost halving the time required to index large datasets. This release also introduces new settings to customize and potentially further increase indexing speed.

🧰 All official Meilisearch integrations (including SDKs, clients, and other tools) are compatible with this Meilisearch release. Integration deployment happens between 4 and 48 hours after a new version becomes available.

Some SDKs might not include all new features. Consult the project repository for detailed information. Is a feature you need missing from your chosen SDK? Create an issue letting us know you need it, or, for open-source karma points, open a PR implementing it (we'll love you for that ❤️).

New features and updates 🔥

Improve indexing speed

Indexing time is improved across the board!

  • Performance is maintained or better on smaller machines
  • On bigger machines with multiple cores and good IO, Meilisearch v1.12 is much faster than Meilisearch v1.11
    • More than twice as fast for raw document insertion tasks.
    • More than 4x as fast for incrementally updating documents in a large database.
    • Embeddings generation was also improved by up to 1.5x for some workloads.

The new indexer also makes task cancellation faster.

Done by @​dureuill, @​ManyTheFish, and @​Kerollmops in #​4900.

New index settings: use facetSearch and prefixSearch to improve indexing speed

v1.12 introduces two new index settings: facetSearch and prefixSearch.

Both settings allow you to skip parts of the indexing process. This leads to significant improvements to indexing speed, but may negatively impact search experience in some use cases.

Done by @​ManyTheFish in #​5091

facetSearch

Use this setting to toggle facet search:

curl \
  -X PUT 'http://localhost:7700/indexes/books/settings/facet-search' \
  -H 'Content-Type: application/json' \
  --data-binary 'true'

The default value for facetSearch is true. When set to false, this setting disables facet search for all filterable attributes in an index.
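
As a rough sketch, the same settings update can be prepared from Python with only the standard library. The helper name `facet_search_request` and the default base URL are illustrative assumptions, not part of any Meilisearch SDK; the request is built but not sent.

```python
import json
import urllib.request


def facet_search_request(index_uid, enabled, base_url="http://localhost:7700"):
    """Build (but do not send) the PUT request that toggles facetSearch.

    `facet_search_request` is a hypothetical helper written for this example.
    """
    return urllib.request.Request(
        url=f"{base_url}/indexes/{index_uid}/settings/facet-search",
        data=json.dumps(enabled).encode("utf-8"),  # JSON body: true or false
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = facet_search_request("books", False)
print(req.get_method(), req.full_url, req.data)
```

Passing `req` to `urllib.request.urlopen` would send it to a running instance.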

prefixSearch

Use this setting to configure the ability to search a word by prefix on an index:

curl \
  -X PUT 'http://localhost:7700/indexes/books/settings/prefix-search' \
  -H 'Content-Type: application/json' \
  --data-binary '"disabled"'

prefixSearch accepts one of the following values:

  • "indexingTime": enables prefix processing during indexing. This is the default Meilisearch behavior
  • "disabled": deactivates prefix search completely

Disabling prefix search means the query he will no longer match the word hello. This may significantly impact search result relevancy, but speeds up the indexing process.
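
The effect on matching can be pictured with a toy model. This is only an illustration of the behaviour described above, not how Meilisearch matches internally; the function `matches` is hypothetical and ignores typo tolerance, ranking, and tokenization.

```python
def matches(query, word, prefix_search="indexingTime"):
    """Toy model: an exact match always counts; a prefix match counts
    only when prefix search is enabled ("indexingTime")."""
    if query == word:
        return True
    return prefix_search == "indexingTime" and word.startswith(query)


print(matches("he", "hello"))                 # True: prefix search enabled
print(matches("he", "hello", "disabled"))     # False: prefixes no longer match
print(matches("hello", "hello", "disabled"))  # True: whole words still match
```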

New API route: /batches

The new /batches endpoint allows you to query information about task batches.

GET /batches returns a list of batch objects:

curl -X GET 'http://localhost:7700/batches'

This endpoint accepts the same parameters as the GET /tasks route, allowing you to narrow down which batches you want to see. Parameters used with GET /batches apply to the tasks, not the batches themselves. For example, GET /batches?uid=0 returns batches containing tasks with a taskUid of 0, not batches with a batchUid of 0.

You may also query GET /batches/:uid to retrieve information about a single batch object:

curl -X GET 'http://localhost:7700/batches/BATCH_UID'

/batches/:uid does not accept any parameters.
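
A minimal sketch of building these URLs in Python, assuming the default local address used in the curl examples. `batches_url` is a hypothetical helper; its keyword arguments are passed through as task filters, as described above.

```python
from urllib.parse import urlencode


def batches_url(base_url="http://localhost:7700", **task_filters):
    """Build the GET /batches URL. Filters apply to the tasks in each batch,
    not to the batches themselves."""
    query = urlencode(task_filters)
    return f"{base_url}/batches" + (f"?{query}" if query else "")


print(batches_url())       # http://localhost:7700/batches
print(batches_url(uid=0))  # http://localhost:7700/batches?uid=0
```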

Batch objects contain the following fields:

{
  "uid": 160,
  "progress": {
    "steps": [
      {
        "currentStep": "processing tasks",
        "finished": 0,
        "total": 2
      },
      {
        "currentStep": "indexing",
        "finished": 2,
        "total": 3
      },
      {
        "currentStep": "extracting words",
        "finished": 3,
        "total": 13
      },
      {
        "currentStep": "document",
        "finished": 12300,
        "total": 19546
      }
    ],
    "percentage": 37.986263
  },
  "details": {
    "receivedDocuments": 19547,
    "indexedDocuments": null
  },
  "stats": {
    "totalNbTasks": 1,
    "status": {
      "processing": 1
    },
    "types": {
      "documentAdditionOrUpdate": 1
    },
    "indexUids": {
      "mieli": 1
    }
  },
  "duration": null,
  "startedAt": "2024-12-12T09:44:34.124726733Z",
  "finishedAt": null
}
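
To make the nested progress field easier to read, the steps can be flattened into a one-line summary. The sketch below uses a shortened copy of the example object; `describe_progress` is a hypothetical helper, and treating a missing or null `progress` as "no information" is an assumption.

```python
def describe_progress(batch):
    """Summarise a batch's progress steps as 'percentage (step a/b > step c/d)'."""
    progress = batch.get("progress")
    if not progress:  # assumption: progress may be absent or null
        return "no progress information"
    steps = " > ".join(
        f'{step["currentStep"]} {step["finished"]}/{step["total"]}'
        for step in progress["steps"]
    )
    return f'{progress["percentage"]:.1f}% ({steps})'


batch = {
    "uid": 160,
    "progress": {
        "steps": [
            {"currentStep": "processing tasks", "finished": 0, "total": 2},
            {"currentStep": "indexing", "finished": 2, "total": 3},
        ],
        "percentage": 37.986263,
    },
}
print(describe_progress(batch))  # 38.0% (processing tasks 0/2 > indexing 2/3)
```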

Additionally, task objects now include a new field, batchUid. Use this field together with /batches/:uid to retrieve data on a specific batch.

{
  "uid": 154,
  "batchUid": 142,
  "indexUid": "movies_test2",
  "status": "succeeded",
  "type": "documentAdditionOrUpdate",
  "canceledBy": null,
  "details": {
    "receivedDocuments": 1,
    "indexedDocuments": 1
  },
  "error": null,
  "duration": "PT0.027766819S",
  "enqueuedAt": "2024-12-02T14:07:34.974430765Z",
  "startedAt": "2024-12-02T14:07:34.99021667Z",
  "finishedAt": "2024-12-02T14:07:35.017983489Z"
}
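
A small sketch of following a task to its batch. `batch_url_for_task` is a hypothetical helper; handling a missing or null `batchUid` is an assumption about tasks that have not been batched yet.

```python
def batch_url_for_task(task, base_url="http://localhost:7700"):
    """Return the /batches/:uid URL for a task, or None if it has no batchUid."""
    batch_uid = task.get("batchUid")
    if batch_uid is None:  # assumption: may be null or missing
        return None
    return f"{base_url}/batches/{batch_uid}"


task = {"uid": 154, "batchUid": 142, "indexUid": "movies_test2"}
print(batch_url_for_task(task))  # http://localhost:7700/batches/142
```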

Done by @​irevoire in #​5060, #​5070, #​5080

Other improvements

  • New query parameter for GET /tasks: reverse. If reverse is set to true, tasks will be returned in reversed order, from oldest to newest tasks. Done by @​irevoire in #​5048
  • Phrase searches with showMatchesPosition set to true give a single location for the whole phrase. Done by @​flevi29 in #​4928
  • New Prometheus metrics by @​PedroTurik in #​5044
  • When a query finds matching terms in document fields with array values, Meilisearch now includes an indices field to _matchesPosition specifying which array elements contain the matches by @​LukasKalbertodt in #​5005
  • ⚠️ Breaking vectorStore change: field distribution no longer contains _vectors. Its value used to be incorrect, and there is no current use case for the fixed, most likely empty, value. Done as part of #​4900
  • Improve error message by adding index name in #​5056 by @​airycanon

Fixes 🐞

Misc

❤️ Thanks again to our external contributors:

v1.11.3: 🐿️

Compare Source

What's Changed

Full Changelog: meilisearch/meilisearch@v1.11.2...v1.11.3

v1.11.2: 🐿️

Compare Source

What's Changed

Full Changelog: meilisearch/meilisearch@v1.11.1...v1.11.2

v1.11.1: 🐿️

Compare Source

What's Changed

Full Changelog: meilisearch/meilisearch@v1.11.0...v1.11.1

v1.11.0: 🐿️

Compare Source

Meilisearch v1.11 introduces AI-powered search performance improvements thanks to binary quantization and various usage changes, all of which are steps towards a future stabilization of the feature. We have also improved federated search usage following user feedback.

🧰 All official Meilisearch integrations (including SDKs, clients, and other tools) are compatible with this Meilisearch release. Integration deployment happens between 4 and 48 hours after a new version becomes available.

Some SDKs might not include all new features. Consult the project repository for detailed information. Is a feature you need missing from your chosen SDK? Create an issue letting us know you need it, or, for open-source karma points, open a PR implementing it (we'll love you for that ❤️).

New features and updates 🔥

Experimental - AI-powered search improvements

This release is Meilisearch's first step towards stabilizing AI-powered search and introduces a few breaking changes to its API. Consult the PRD for full usage details.

Done by @​dureuill in #​4906, #​4920, #​4892, and #​4938.

⚠️ Breaking changes

  • When performing AI-powered searches, hybrid.embedder is now a mandatory parameter in GET and POST /indexes/{:indexUid}/search
  • As a consequence, it is now mandatory to pass hybrid even for pure semantic searches
  • embedder is now a mandatory parameter in GET and POST /indexes/{:indexUid}/similar
  • Meilisearch now ignores semanticRatio and performs a pure semantic search for queries that include vector but not q
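
As a hedged sketch of the breaking changes above, the request bodies below always carry an embedder name. The helper names and the embedder name "default" are hypothetical, and the `id` field for the /similar route is my reading of that route and should be checked against the API reference.

```python
import json


def semantic_search_body(vector, embedder):
    """Body for a pure semantic search: no q, but hybrid.embedder is still required."""
    return {"vector": vector, "hybrid": {"embedder": embedder}}


def similar_body(document_id, embedder):
    """Body for the /similar route, where embedder is now mandatory."""
    return {"id": document_id, "embedder": embedder}


print(json.dumps(semantic_search_body([0.1, 0.2], "default")))
print(json.dumps(similar_body(42, "default")))
```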

Additions & improvements

  • The default model for OpenAI is now text-embedding-3-small instead of text-embedding-ada-002
  • This release introduces a new embedder option: documentTemplateMaxBytes. Meilisearch will truncate a document's template text when it goes over the specified limit
  • Fields in documentTemplate include a new field.is_searchable property. The default document template now filters out both empty fields and fields not in the searchable attributes list:

v1.11:

{% for field in fields %}
  {% if field.is_searchable and not field.value == nil %}
    {{ field.name }}: {{ field.value }}\n
  {% endif %}
{% endfor %}

v1.10:

{% for field in fields %}
  {{ field.name }}: {{ field.value }}\n
{% endfor %}

Embedders using the v1.10 document template will continue working as before. The new default document template will only work with newly created embedders.
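
The difference between the two default templates can be approximated in plain Python. This only mimics the filtering logic; it is not the template engine Meilisearch uses, whitespace handling may differ, and the field dictionaries are made up for the example.

```python
def _text(value):
    """Render nil (None) as an empty string, roughly as a template would."""
    return "" if value is None else str(value)


def render_v1_10(fields):
    """v1.10 default: every field is rendered."""
    return "\n".join(f'{f["name"]}: {_text(f["value"])}' for f in fields)


def render_v1_11(fields):
    """v1.11 default: skip non-searchable fields and nil values."""
    return "\n".join(
        f'{f["name"]}: {_text(f["value"])}'
        for f in fields
        if f["is_searchable"] and f["value"] is not None
    )


fields = [
    {"name": "title", "value": "Batman returns", "is_searchable": True},
    {"name": "overview", "value": None, "is_searchable": True},
    {"name": "id", "value": 42, "is_searchable": False},
]
print(render_v1_11(fields))  # title: Batman returns
```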

Vector database indexing performance improvements

v1.11 introduces a new embedder option, binaryQuantized:

curl \
  -X PATCH 'http://localhost:7700/indexes/movies/settings' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "embedders": {
      "image2text": {
        "binaryQuantized": true
      }
    }
  }'

Enable binary quantization to convert embeddings of floating point numbers into embeddings of boolean values. This will negatively impact the relevancy of AI-powered searches but significantly improve performance in large collections with more than 100 dimensions.

In our benchmarks, this reduced the size of the database by a factor of 10 and divided the indexing time by a factor of 6 with little impact on search times.

Warning
Enabling this feature will update all of your vectors to contain only 1s or -1s, significantly impacting relevancy.

You cannot revert this option once you enable it. Before setting binaryQuantized to true, Meilisearch recommends testing it in a smaller or duplicate index in a development environment.
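
To see why relevancy suffers, here is a toy version of binary quantization that keeps only the sign of each component. The function `binarize` and its handling of zero are assumptions made for illustration; the actual conversion inside Meilisearch may differ.

```python
def binarize(embedding):
    """Map each float to 1 or -1 by sign (zero is treated as -1 here, by assumption)."""
    return [1 if x > 0 else -1 for x in embedding]


print(binarize([0.12, -0.5, 0.0, 3.2]))  # [1, -1, -1, 1]

# Two clearly different embeddings collapse to the same quantized vector,
# which is the information loss the warning above refers to.
print(binarize([0.9, -0.1]) == binarize([0.1, -0.9]))  # True
```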

Done by @​irevoire in #​4941.

Federated search improvements

Facet distribution and stats for federated searches

This release adds two new federated search options, facetsByIndex and mergeFacets. These allow you to request a federated search for facet distributions and stats data.

Facet information by index

To obtain facet distribution and stats for each separate index, use facetsByIndex when querying the POST /multi-search endpoint:

POST /multi-search
{
  "federation": {
    "limit": 20,
    "offset": 0,
    "facetsByIndex": {
      "movies": ["title", "id"],
      "comics": ["title"]
    }
  },
  "queries": [
    {
      "q": "Batman",
      "indexUid": "movies"
    },
    {
      "q": "Batman",
      "indexUid": "comics"
    }
  ]
}

The multi-search response will include a new field, facetsByIndex, with facet data separated per index:

{
  "hits": [],
  
  "facetsByIndex": {
      "movies": {
        "distribution": {
          "title": {
            "Batman returns": 1
          },
          "id": {
            "42": 1
          }
        },
        "stats": {
          "id": {
            "min": 42,
            "max": 42
          }
        }
      }
  }
}

Merged facet information

To obtain facet distribution and stats for all indexes merged into a single result, use both facetsByIndex and mergeFacets when querying the POST /multi-search endpoint:

POST /multi-search
{
  "federation": {
    "limit": 20,
    "offset": 0,
    "facetsByIndex": {
      "movies": ["title", "id"],
      "comics": ["title"]
    },
    "mergeFacets": {
      "maxValuesPerFacet": 10
    }
  },
  "queries": [
    {
      "q": "Batman",
      "indexUid": "movies"
    },
    {
      "q": "Batman",
      "indexUid": "comics"
    }
  ]
}
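
Building the body as a Python dictionary and serializing it avoids hand-written JSON mistakes such as trailing commas. This is a sketch only: `federated_facets_body` is a hypothetical helper, and it assumes one query per index listed in facetsByIndex, as in the examples above.

```python
import json


def federated_facets_body(query, facets_by_index, merge_max_values=None):
    """Build a POST /multi-search body. mergeFacets is added only when
    merge_max_values is given; otherwise facets stay separated per index."""
    federation = {"limit": 20, "offset": 0, "facetsByIndex": facets_by_index}
    if merge_max_values is not None:
        federation["mergeFacets"] = {"maxValuesPerFacet": merge_max_values}
    return {
        "federation": federation,
        "queries": [{"q": query, "indexUid": uid} for uid in facets_by_index],
    }


body = federated_facets_body(
    "Batman",
    {"movies": ["title", "id"], "comics": ["title"]},
    merge_max_values=10,
)
print(json.dumps(body, indent=2))
```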

The response includes two new fields, facetDistribution and facetStats:

{
  "hits": [],
  
  "facetDistribution": {
    "title": {
      "Batman returns": 1,
      "Batman: the killing joke":
    },
    "id": {
      "42": 1
    }
  },
  "facetStats": {
    "id": {
      "min": 42,
      "max": 42
    }
  }
}

Done by @​dureuill in #​4929.

Experimental — New STARTS WITH filter operator

Enable the experimental feature to use the STARTS WITH filter operator:

curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "containsFilter": true
  }'

Use the STARTS WITH operator when filtering:

curl \
  -X POST http://localhost:7700/indexes/movies/search \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "filter": "hero STARTS WITH spider"
  }'
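
The two request bodies can also be produced from Python. This is a minimal sketch: `starts_with_filter` is a hypothetical helper, and it does no quoting, so values containing spaces would need additional handling.

```python
import json


def starts_with_filter(attribute, prefix):
    """Build a STARTS WITH filter expression (no quoting or escaping is done)."""
    return f"{attribute} STARTS WITH {prefix}"


# Body for PATCH /experimental-features/ (same flag as in the curl example)
enable_body = json.dumps({"containsFilter": True})
# Body for POST /indexes/movies/search
search_body = json.dumps({"filter": starts_with_filter("hero", "spider")})

print(enable_body)
print(search_body)
```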

🗣️ This is an experimental feature, and we need your help to improve it! Share your thoughts and feedback on this GitHub discussion.

Done by @​Kerollmops in #​4939.

Other improvements

Fixes 🐞

  • ⚠️ When using federated search, query.facets was silently ignored at the query level, but should not have been. It now returns the appropriate error. Use federation.facetsByIndex instead if you want facets to be applied during federated search.
  • Prometheus /metrics returns the route pattern instead of the real route when returning the HTTP requests total by @​irevoire in #​4839
  • Truncate values at the end of a list of facet values when the number of facet values is larger than maxValuesPerFacet. For example, setting maxValuesPerFacet to 2 could result in ["blue", "red", "yellow"] being truncated to ["blue", "yellow"] instead of ["blue", "red"]. By @​dureuill in #​4929
  • Improve the task cancellation when vectors are used, by @​irevoire in #​4971
  • Swedish support: the characters å, ä, ö are no longer normalized to a and o. By @​ManyTheFish in #​4945
  • Update rhai to fix an internal error when updating documents with a function (experimental) by @​irevoire in #​4960
  • Fix the bad experimental search queue size by @​irevoire in #​4992
  • Do not send empty edit document by function by @​irevoire in #​5001
  • Display vectors when no custom vectors were ever provided by @​dureuill in #​5008

Misc

❤️ Thanks again to our external contributors:


Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR has been generated by Renovate Bot.

@spicerabot
Contributor Author

spicerabot bot commented Dec 21, 2024

--- kubernetes/apps/selfhosted/hoarder/app Kustomization: flux-system/hoarder HelmRelease: selfhosted/hoarder

+++ kubernetes/apps/selfhosted/hoarder/app Kustomization: flux-system/hoarder HelmRelease: selfhosted/hoarder

@@ -66,13 +66,13 @@

               MEILI_NO_ANALYTICS: 'true'
             envFrom:
             - secretRef:
                 name: hoarder-secret
             image:
               repository: getmeili/meilisearch
-              tag: v1.10.3@sha256:9d1b9b02fe6c68f60b54ce40092d8078f051b9341c400c90f907607636b7c9c1
+              tag: v1.12.0@sha256:27f831c41cb735a9c2314b61a33a14552367af6b3e4bcf840f21fd0e64e37a8a
             resources:
               limits:
                 memory: 128Mi
               requests:
                 cpu: 10m
     ingress:

@spicerabot
Contributor Author

spicerabot bot commented Dec 21, 2024

--- HelmRelease: selfhosted/hoarder Deployment: selfhosted/hoarder

+++ HelmRelease: selfhosted/hoarder Deployment: selfhosted/hoarder

@@ -79,13 +79,13 @@

       - env:
         - name: MEILI_NO_ANALYTICS
           value: 'true'
         envFrom:
         - secretRef:
             name: hoarder-secret
-        image: getmeili/meilisearch:v1.10.3@sha256:9d1b9b02fe6c68f60b54ce40092d8078f051b9341c400c90f907607636b7c9c1
+        image: getmeili/meilisearch:v1.12.0@sha256:27f831c41cb735a9c2314b61a33a14552367af6b3e4bcf840f21fd0e64e37a8a
         name: meilisearch
         resources:
           limits:
             memory: 128Mi
           requests:
             cpu: 10m
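
The image reference in the diff combines a tag with a pinned digest. A small sketch of splitting it apart for review follows; `split_image_ref` is a hypothetical helper and deliberately simple, so it does not handle registries with a port number.

```python
def split_image_ref(image):
    """Split 'name:tag@digest' into (name, tag, digest).

    Simplification: assumes the last ':' before '@' separates name and tag.
    """
    name_and_tag, _, digest = image.partition("@")
    name, sep, tag = name_and_tag.rpartition(":")
    if not sep:  # no tag present
        name, tag = name_and_tag, ""
    return name, tag, digest


ref = split_image_ref(
    "getmeili/meilisearch:v1.12.0@sha256:"
    "27f831c41cb735a9c2314b61a33a14552367af6b3e4bcf840f21fd0e64e37a8a"
)
print(ref)
```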

@spicerabot spicerabot bot force-pushed the renovate/getmeili-meilisearch-1.x branch from 9000ac0 to b21bee4 December 23, 2024 11:06
@spicerabot spicerabot bot changed the title from "feat(container): update image getmeili/meilisearch to v1.11.3" to "feat(container): update image getmeili/meilisearch to v1.12.0" Dec 23, 2024