[pull] master from milvus-io:master #1

Open
wants to merge 102 commits into
base: master
102 commits
1cb5a19
build(deps): bump tqdm from 4.65.0 to 4.66.3 (#2064)
dependabot[bot] May 7, 2024
b183480
fix: Remove params for property vars (#2068)
XuanYang-cn May 7, 2024
0a7327d
feat: Support major compaction in ManualCompaction (#2015)
wayblink May 7, 2024
97f12ae
Change sparse related errors to ParamError (#2066)
zhengbuqian May 7, 2024
77bff51
modified example_tls1.py through MilvusClient (#2065)
nish112022 May 8, 2024
f4d11e3
remove scipy dependency for sparse while still supporting scipy spars…
zhengbuqian May 8, 2024
7e8187a
Support milvuslite (#2073)
junjiejiangjjj May 8, 2024
b362af0
enhance: Enable set_properties and describe_database api (#2082)
weiliu1031 May 13, 2024
502ef37
modified example_tls2.py through MilvusClient (#2077)
nish112022 May 14, 2024
3198bd2
enhance: Make bulk_writer's requirments optional (#2086)
XuanYang-cn May 14, 2024
2772640
fix sparse: accpet int/float wrapped in string (#2094)
zhengbuqian May 15, 2024
59154d7
enhance: Expand grpcio version to latest (#2093)
XuanYang-cn May 15, 2024
fa67aa0
fix the str function of the extra list (#2099)
SimFG May 17, 2024
473f62f
Added grpc as a valid protocol for uri (#2090)
brunocfnba May 17, 2024
02fa0f8
feat: allowing search iterator on sparse float vector field (#2104)
zhengbuqian May 27, 2024
8b3b32e
fix: wrong expr handling due to and/or priorities(#2113) (#2114)
MrPresent-Han Jun 4, 2024
977c0cd
Accept list of single row scipy.sparse object as input for insert/sea…
zhengbuqian Jun 4, 2024
e95f5a5
fix: use the existed index_name (#2107)
smellthemoon Jun 5, 2024
70a2a86
enhance: print search result more elegantly (#2123)
longjiquan Jun 12, 2024
50d2e19
change the load status check interval to 0.2s (#2121)
SimFG Jun 12, 2024
bdd07cc
Support float16/bfloat16/sparse vector for bulkwriter (#2127)
yhmo Jun 14, 2024
02c7472
Fix a bug of bulkwriter (#2133)
yhmo Jun 14, 2024
21844e9
feat: disable installation of milvus-lite on windows platform (#2131)…
Raysilience Jun 17, 2024
d0047c7
enhance: Check PyMilvus on Windows platform (#2136)
XuanYang-cn Jun 17, 2024
cddbf61
fix: drop_index got multiple values for keyword argument (#2139)
XuanYang-cn Jun 17, 2024
f7a4839
enhance: update milvus-proto and correct index.drop() (#2142)
XuanYang-cn Jun 18, 2024
08eff03
enhance: Update readme to the latest (#2146)
XuanYang-cn Jun 24, 2024
7c09c5c
feat: support group_size parameter for search_group_by (#2130)
MrPresent-Han Jun 24, 2024
97b8ca3
Export indexed rows for describe_index (#2148)
xiaocai2333 Jun 25, 2024
0318663
build(deps): bump urllib3 from 1.26.18 to 1.26.19 (#2140)
dependabot[bot] Jun 26, 2024
99ad63e
Refine the error message for type mismatches during data insertion (#…
xiaocai2333 Jun 27, 2024
a5281f3
fix: Coding style by latest ruff (#2159)
XuanYang-cn Jul 2, 2024
9a5e4aa
Add database operations to MilvusClient (#2152)
ashkrisk Jul 2, 2024
131989a
enhance: Remove the logic to set replica_number=1 by default (#2163)
weiliu1031 Jul 4, 2024
8e0a27b
enhance: enable setting properties during create database (#2168)
weiliu1031 Jul 11, 2024
6625af7
enhance: upsert support autoid (#2173)
smellthemoon Jul 12, 2024
1b0215c
fix: Edit setuptools upper version for py380 (#2182)
XuanYang-cn Jul 18, 2024
567e2c0
Fix comment in iterator implementation (#2187)
ashkrisk Jul 22, 2024
62434ca
build(deps): bump certifi from 2023.7.22 to 2024.7.4 (#2170)
dependabot[bot] Jul 22, 2024
2a76769
enhance:expose reduce_stop_for_best to users(#2181) (#2183)
MrPresent-Han Jul 22, 2024
8c55e62
Don't export rows info in index params (#2190)
xiaocai2333 Jul 25, 2024
8b3488d
fix: remove limitation clustering key can not be primary key (#2194)
wayblink Jul 25, 2024
9f3214c
enhance: hide zero values when printing (#2200)
SimFG Jul 26, 2024
05aeb1f
fix: unclear error msg for varchar field
XuanYang-cn Aug 1, 2024
3029ed3
enhance: use info level for retry message (#2212)
XuanYang-cn Aug 5, 2024
9ff4b2c
Limit milvus-lite files to end with .db (#2214)
junjiejiangjjj Aug 5, 2024
66c2362
fix: select a single column consisting of a list of column names
Aug 2, 2024
76d4085
Bulkinsert supports importing binlog (#2222)
yhmo Aug 15, 2024
caf8e1f
feat: add page_retain_order param during search with offset
PwzXxm Aug 15, 2024
38fa108
Rename is_major to is_clustering (#2219)
wayblink Aug 17, 2024
42a66f1
enhance: Support load with Field Partial load (#2228)
congqixia Aug 17, 2024
00c922d
enhance: support null and default value (#2234)
smellthemoon Aug 27, 2024
4a65d03
feat: support the mmap_enable param in the field schema (#2238)
SimFG Aug 27, 2024
6d36396
enhance: loose the upper limit for grpcio (#2241)
XuanYang-cn Aug 29, 2024
560adde
enhance: Make load parameter naming normal (#2243)
congqixia Aug 30, 2024
6cc2e55
fix: move page_retain_order to the same level as radius (#2249)
PwzXxm Sep 3, 2024
67f9883
add strict_group_size and rank_group_scorer for hybrid_search(#2253) …
MrPresent-Han Sep 5, 2024
c967c39
fix: not report error when setting default_value=None (#2251)
smellthemoon Sep 5, 2024
87da0f1
enhance: Bulkinsert support csv (#2247)
OxalisCu Sep 5, 2024
baaca92
add hybrid_search for MilvusClient (#2258)
czs007 Sep 10, 2024
11d2fc0
feat: support keyword text match (#2256)
longjiquan Sep 13, 2024
da51ba1
enhance: support print iterator info(#2261) (#2262)
MrPresent-Han Sep 23, 2024
24d8bc6
fix: not put 'default_value' in dict (#2272)
smellthemoon Oct 8, 2024
a7c20e2
fix not check None Type in json and support no need to pass in None w…
smellthemoon Oct 8, 2024
f288c74
remove analyzer_params, added enable_tokenizer and tokenizer_params (…
zhengbuqian Oct 8, 2024
d255ef1
feat(pymilvus/settings.py): Load configuration without altering the e…
laipz8200 Oct 8, 2024
7784050
Update proto to get is_sorted field for GetQuerySegmentInfo (#2280)
xiaocai2333 Oct 8, 2024
f068b1a
support mvcc and break-down-continue for iterator(#2278) (#2279)
MrPresent-Han Oct 9, 2024
b51ebce
fix: upsert rows when set autoid==true fail (#2286)
smellthemoon Oct 10, 2024
4ed5d3e
support new Function feature (#2257)
zhengbuqian Oct 11, 2024
21b3760
Upgrade the bulkWriter cloud API call from v1 to v2 (#2245)
lentitude2tk Oct 11, 2024
60ce3af
feat: allow empty sparse row (#2291)
zhengbuqian Oct 11, 2024
077a045
fix: reset offset to zero after seek(#2292) (#2293)
MrPresent-Han Oct 12, 2024
70118a4
fix: remove duplicate field name check in pymilvus (#2294)
zhengbuqian Oct 12, 2024
967f94a
enhance: enable compatible for iterator(#2278) (#2297)
MrPresent-Han Oct 12, 2024
76de0ab
enhance: Enable bulkwriter to support import v2 (#2295)
bigsheeper Oct 12, 2024
d5a3e59
fix: row based insert/upsert when there is Function in schema (#2298)
zhengbuqian Oct 15, 2024
2040ac6
Improve embedding retrieval performance (#2300)
yhmo Oct 16, 2024
0a31f3b
fix: add BM25 to supported metric type of search iterator (#2301)
zhengbuqian Oct 18, 2024
bb69c1d
fix: simplified the logic to check if the insert/request data matches…
zhengbuqian Oct 21, 2024
c6e0326
fix: Passing messages to code in ParamError (#2304)
XuanYang-cn Oct 21, 2024
4f64af7
fix: Fix the f16 and bf16 dump error & add all type test for csv & cs…
OxalisCu Oct 21, 2024
0fc6b28
enhance: handling cp file for query iterator(#2306) (#2307)
MrPresent-Han Oct 22, 2024
fffeabe
enhance: handling abormal iterator cp file(#2306) (#2310)
MrPresent-Han Oct 23, 2024
377ad60
fix: entity_is_sparse_matrix: each row should be dict or list (#2309)
zhengbuqian Oct 24, 2024
cb4cbc6
Supports filling elements through templates for expression (#2317)
xiaocai2333 Oct 29, 2024
3ee9e10
Fixed typo in comment of step 1 (#2302)
Armaggheddon Oct 29, 2024
47c71af
fix: bulkwriter to skip function output fields (#2319)
zhengbuqian Nov 4, 2024
55800a6
enhance: Rename tokenizer_params to analyzer_params (#2323)
aoiasd Nov 5, 2024
23ca4e3
feat: Add compact, get_server_version and flush api (#2326)
czs007 Nov 6, 2024
650f7cd
Corrected grammar and consistency in error messages (#2289)
Ahmetyasin Nov 6, 2024
43c7a09
modify strict_group_size parameter name(#2328) (#2329)
MrPresent-Han Nov 7, 2024
5bca197
fix queryIterator question (#2316)
lentitude2tk Nov 7, 2024
2d9d661
enhance: Update the template expression proto to improve transmission…
xiaocai2333 Nov 7, 2024
0bf680f
Fix typo and correct grammar (#2333)
CaoHaiNam Nov 7, 2024
2d58eef
Update return type of describe_role to Dict (#2332) (#2337)
CaoHaiNam Nov 8, 2024
2279517
Add text embedding function (#2335)
junjiejiangjjj Nov 8, 2024
e4505ef
Use filter_params for milvus client (#2320)
xiaocai2333 Nov 8, 2024
3110139
enhance: Reorganize the examples (#2340)
XuanYang-cn Nov 11, 2024
5ec0a60
enhance: RBAC Custom Privilege Group API (#2342)
shaoting-huang Nov 13, 2024
47d40d0
fix insert to avoid describe collection on every insert call (#2347)
zhengbuqian Nov 14, 2024
016ff55
Remove unnecessary invoke of describe_collection() (#2345)
yhmo Nov 14, 2024
6 changes: 4 additions & 2 deletions .github/mergify.yml
Original file line number Diff line number Diff line change
Expand Up @@ -4,10 +4,12 @@ pull_request_rules:
- or:
- base=master
- base~=2\.\d
- "status-success=Run Python Tests (3.8)"
- "status-success=Run Python Tests (3.8, windows-latest)"
- "status-success=Run Python Tests (3.12, windows-latest)"
- "status-success=Run Python Tests (3.8, ubuntu-latest)"
- "status-success=Run Python Tests (3.12, ubuntu-latest)"
- "status-success=Run Check Proto (3.8)"
- "status-success=Code lint check (3.8)"
- "status-success=Run Python Tests (3.12)"
- "status-success=Run Check Proto (3.12)"
- "status-success=Code lint check (3.12)"
actions:
Expand Down
3 changes: 2 additions & 1 deletion .github/workflows/check_milvus_proto.yml
Expand Up @@ -26,10 +26,11 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -e .
pip install -e ".[dev]"

- name: Try generate proto
run: |
git submodule update --init
make gen_proto
make check_proto_product

4 changes: 2 additions & 2 deletions .github/workflows/code_checker.yml
Expand Up @@ -19,12 +19,12 @@ jobs:
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: check pyproject.toml install
- name: Check pyproject.toml install
run: |
pip install -e .
- name: Install requirements
run: |
pip install -r requirements.txt
pip install -e ".[dev]"
- name: Run pylint
shell: bash
run: |
Expand Down
5 changes: 3 additions & 2 deletions .github/workflows/pull_request.yml
Expand Up @@ -8,10 +8,11 @@ on:
jobs:
build:
name: Run Python Tests
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.8, 3.12]
os: [ubuntu-latest, windows-latest]
runs-on: ${{ matrix.os }}

steps:
- name: Checkout code
Expand All @@ -28,7 +29,7 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -e ".[test]"
pip install -e ".[dev]"

- name: Test with pytest
run: |
Expand Down
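
The new `os` matrix axis is why the required checks in `.github/mergify.yml` change in this PR: each matrix combination reports its own status. A minimal sketch of how the four check names line up with the matrix, assuming the check name is the job name followed by the matrix values in parentheses (a convention inferred from the mergify entries in this diff, not taken from the workflow itself). The helper `check_names` is hypothetical, and versions are written as strings for illustration:

```python
from itertools import product

# Matrix values copied from the pull_request.yml hunk above.
# Versions are kept as strings here so "3.12" is not treated as a float.
matrix = {
    "python-version": ["3.8", "3.12"],
    "os": ["ubuntu-latest", "windows-latest"],
}
job_name = "Run Python Tests"


def check_names(job_name, matrix):
    """Build one check name per matrix combination (hypothetical helper).

    Assumes the format "<job name> (<value>, <value>)", with values in the
    order the matrix keys are declared.
    """
    keys = list(matrix)
    combos = product(*(matrix[key] for key in keys))
    return [f"{job_name} ({', '.join(combo)})" for combo in combos]


names = check_names(job_name, matrix)
for name in names:
    # Same shape as the conditions listed in .github/mergify.yml
    print(f"status-success={name}")
```

Under that assumption, two Python versions times two operating systems give four required checks, which matches the four `Run Python Tests (...)` conditions in the mergify hunk.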
9 changes: 5 additions & 4 deletions Makefile
Expand Up @@ -2,12 +2,12 @@ unittest:
PYTHONPATH=`pwd` python3 -m pytest tests --cov=pymilvus -v

lint:
PYTHONPATH=`pwd` black pymilvus --check
PYTHONPATH=`pwd` ruff check pymilvus
PYTHONPATH=`pwd` python3 -m black pymilvus --check
PYTHONPATH=`pwd` python3 -m ruff check pymilvus

format:
PYTHONPATH=`pwd` black pymilvus
PYTHONPATH=`pwd` ruff check pymilvus --fix
PYTHONPATH=`pwd` python3 -m black pymilvus
PYTHONPATH=`pwd` python3 -m ruff check pymilvus --fix

codecov:
PYTHONPATH=`pwd` pytest --cov=pymilvus --cov-report=xml tests -x -v -rxXs
Expand All @@ -25,6 +25,7 @@ get_proto:
git submodule update --init

gen_proto:
pip install -e ".[dev]"
cd pymilvus/grpc_gen && ./python_gen.sh

check_proto_product: gen_proto
Expand Down
11 changes: 5 additions & 6 deletions README.md
Expand Up @@ -27,7 +27,7 @@ The following collection shows Milvus versions and recommended PyMilvus versions
| 2.1.\* | 2.1.3 |
| 2.2.\* | 2.2.15 |
| 2.3.\* | 2.3.7 |
| 2.4.\* | 2.4.0 |
| 2.4.\* | 2.4.9 |


## Installation
Expand All @@ -37,12 +37,13 @@ You can install PyMilvus via `pip` or `pip3` for Python 3.8+:
```shell
$ pip3 install pymilvus
$ pip3 install pymilvus[model] # for milvus-model
$ pip3 install pymilvus[bulk_writer] # for bulk_writer
```

You can install a specific version of PyMilvus by:

```shell
$ pip3 install pymilvus==2.3.7
$ pip3 install pymilvus==2.4.9
```

You can upgrade PyMilvus to the latest version by:
Expand All @@ -62,8 +63,6 @@ $ git submodule update --init

Q2. How to generate python files from milvus-proto?

**Before generating python files, please install requirements in `requirements.txt`**

A2.
```shell
$ make gen_proto
Expand Down Expand Up @@ -94,10 +93,10 @@ Q6. How to run unittests?

A6
```shell
$ pip install ".[test]"
$ pip install ".[dev]"
$ make unittest
```
Q7. `zsh: no matches found: pymilvus[model]` in mac, how do I solve this?
Q7. `zsh: no matches found: pymilvus[model]`, how do I solve this?

A7
```shell
Expand Down
1 change: 1 addition & 0 deletions examples/README.md
@@ -0,0 +1 @@
# Examples
File renamed without changes.
89 changes: 89 additions & 0 deletions examples/bm25.py
@@ -0,0 +1,89 @@
from pymilvus import (
    MilvusClient,
    Function,
    FunctionType,
    DataType,
)

fmt = "\n=== {:30} ===\n"
collection_name = "doc_in_doc_out"
milvus_client = MilvusClient("http://localhost:19530")

has_collection = milvus_client.has_collection(collection_name, timeout=5)
if has_collection:
    milvus_client.drop_collection(collection_name)

schema = milvus_client.create_schema()
schema.add_field("id", DataType.INT64, is_primary=True, auto_id=False)
schema.add_field("document_content", DataType.VARCHAR, max_length=9000, enable_analyzer=True)
schema.add_field("sparse_vector", DataType.SPARSE_FLOAT_VECTOR)

bm25_function = Function(
    name="bm25_fn",
    input_field_names=["document_content"],
    output_field_names="sparse_vector",
    function_type=FunctionType.BM25,
)
schema.add_function(bm25_function)

index_params = milvus_client.prepare_index_params()
index_params.add_index(
    field_name="sparse_vector",
    index_name="sparse_inverted_index",
    index_type="SPARSE_INVERTED_INDEX",
    metric_type="BM25",
    params={"bm25_k1": 1.2, "bm25_b": 0.75},
)

ret = milvus_client.create_collection(collection_name, schema=schema, index_params=index_params, consistency_level="Strong")
print(ret)

print(fmt.format(" all collections "))
print(milvus_client.list_collections())

print(fmt.format(f"schema of collection {collection_name}"))
print(milvus_client.describe_collection(collection_name))

rows = [
    {"id": 1, "document_content": "hello world"},
    {"id": 2, "document_content": "hello milvus"},
    {"id": 3, "document_content": "hello zilliz"},
]

print(fmt.format("Start inserting entities"))
insert_result = milvus_client.insert(collection_name, rows, progress_bar=True)
print(fmt.format("Inserting entities done"))
print(insert_result)

texts_to_search = ["zilliz"]
search_params = {
    "metric_type": "BM25",
    "params": {}
}
print(fmt.format("Start search and retrieve several fields"))
result = milvus_client.search(collection_name, texts_to_search, limit=3, output_fields=["document_content"], search_params=search_params)
for hits in result:
    for hit in hits:
        print(f"hit: {hit}")

print(fmt.format("Start query by specifying primary keys"))
query_results = milvus_client.query(collection_name, ids=[3])
print(query_results[0])

upsert_ret = milvus_client.upsert(collection_name, {"id": 2, "document_content": "hello milvus again"})
print(upsert_ret)

print(fmt.format("Start query by specifying filtering expression"))
query_results = milvus_client.query(collection_name, filter="document_content == 'hello milvus again'")
for ret in query_results:
    print(ret)

print(f"Start deleting by primary key in collection {collection_name}")
delete_result = milvus_client.delete(collection_name, ids=[3])
print(delete_result)

print(fmt.format("Start query by specifying filtering expression"))
query_results = milvus_client.query(collection_name, filter="document_content == 'hello zilliz'")
print(f"Query results after deletion: {query_results}")

milvus_client.drop_collection(collection_name)
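
The `bm25_k1` and `bm25_b` values passed to `add_index` in `examples/bm25.py` have the same names as the two tuning parameters of the standard Okapi BM25 formula: `k1` controls term-frequency saturation and `b` controls document-length normalization. A minimal pure-Python sketch of that textbook formula follows, run on the three example documents. It assumes whitespace tokenization and a Lucene-style IDF; it is not Milvus's implementation, and the server's analyzer and scoring may differ. The function name `bm25_scores` is hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with textbook Okapi BM25.

    Assumptions for illustration only: lowercase whitespace tokenization
    and IDF = ln(1 + (N - df + 0.5) / (df + 0.5)).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


docs = ["hello world", "hello milvus", "hello zilliz"]
scores = bm25_scores("zilliz", docs)
for doc, score in zip(docs, scores):
    print(f"{score:.4f}  {doc}")
```

In this sketch only "hello zilliz" gets a non-zero score for the query "zilliz", since the other two documents do not contain the term.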