[DRIVERS-2926] [PYTHON-4577] BSON Binary Vector Subtype Support #1813
Merged: blink1073 merged 48 commits into mongodb:master from caseyclements:DRIVERS-2926-BSON-Binary-Vectors on Oct 1, 2024
Changes from 1 commit
Commits
245c869 First commit on DRIVERS-2926-BSON-Binary-Vectors (caseyclements)
031cd8c Turns dtype into enum. Adds handling of padding, __eq__. Removal of n… (caseyclements)
8d4e8a2 Added docstring and comments (caseyclements)
2df0d6b Changed order of BinaryVector and Binary in bson._ENCODERS to get tes… (caseyclements)
315a115 Changed order of BinaryVector and Binary in bson._ENCODERS to get tes… (caseyclements)
d74314d json_util dumps/loads of BinaryVector (caseyclements)
27f13c8 Added bson_corpus tests. Needs more, and review of json_util (caseyclements)
263f8c7 Removed BinaryVector as separate class. Instead, Binary includes as_v… (caseyclements)
f8bcdef Stop setting _USD_C to False (caseyclements)
5435785 mypy fixes (caseyclements)
5c4d152 Removed stub vector.json for bson_corpus tests (caseyclements)
f86d040 More tests (caseyclements)
adcb945 Added description of subtype 9 to bson.Binary docstring (caseyclements)
7986cc5 Addressed comments in docstrings. (caseyclements)
26b8398 Eased string comparison of exception in xfail in test_bson (caseyclements)
28de28a Updates to docstrings of BinaryVector and BinaryVectorDtype (caseyclements)
68235b8 Simplified expected exeption case. Will be refactored with yaml anyway.. (caseyclements)
e2a1a3c Added draft of test runner (caseyclements)
bf9758a Added test cases: padding, and overflow (caseyclements)
e1590aa Merge branch 'master' into DRIVERS-2926-BSON-Binary-Vectors (caseyclements)
c4c7af7 Cast Path to str (caseyclements)
de5a245 Simplified as_vector API (caseyclements)
43bcce4 Added test case: list of floats with dtype int8 raises exception (caseyclements)
41ee0bb Set default padding to 0 in test runner (caseyclements)
9d52aeb Updated test_bson for new as_vector API (caseyclements)
0d34464 Updated resync-specs.sh to include bson-binary-vector (caseyclements)
1d49656 Updated resync-specs.sh and test cases (caseyclements)
2af0ca4 Broke tests into 3 files by dtype (caseyclements)
c93bae1 Update bson/binary.py (caseyclements)
f374b5a Removed json from test_bson_binary_vector and its jsons (caseyclements)
0db9866 Addition of Provision (BETA) specifiers change references to 4.10 (caseyclements)
0532803 Add references to from_vector() and as_vector() (caseyclements)
3edeef6 Add subtype number in changelog (caseyclements)
d199597 Raise ValueErrors not AssertionErrors. Bumped from 4.9 to 4.10 (caseyclements)
abc7cd3 Docstring for as_vector (caseyclements)
4550c20 Add slots for BinaryVector (caseyclements)
99d44e1 Check subtype before decoding (caseyclements)
001636d Try slots with default padding (caseyclements)
637c474 Removed slots arg (caseyclements)
2d511f6 Update dataclass (caseyclements)
17e1d33 Remove unompressed kwarg from as_vector (caseyclements)
ce5f3e3 Changed TypeError to ValueError (caseyclements)
edfe972 Updates after removing uncompressed (caseyclements)
8aaa2f6 Fixed expected exceptions in invalid test cases (caseyclements)
dfb322c Merge branch 'master' into DRIVERS-2926-BSON-Binary-Vectors (blink1073)
8946daf padding in now Optional[int] = None (caseyclements)
9397129 padding does need to be an integer (caseyclements)
913403b Removed unneeded ugly TYPE_FROM_HEX = {key.value: key for key in Bina… (caseyclements)
@@ -195,9 +195,9 @@ class UuidRepresentation:


 VECTOR_SUBTYPE = 9
-"""BSON binary subtype for densely packed vector data.
+"""**(BETA)** BSON binary subtype for densely packed vector data.

-.. versionadded:: 4.9
+.. versionadded:: 4.10
 """


@@ -207,7 +207,7 @@ class UuidRepresentation:


 class BinaryVectorDtype(Enum):
-    """Datatypes of vector subtype.
+    """**(BETA)** Datatypes of vector subtype.

     :param FLOAT32: (0x27) Pack list of :class:`float` as float32
     :param INT8: (0x03) Pack list of :class:`int` in [-128, 127] as signed int8
@@ -233,7 +233,7 @@ class BinaryVectorDtype(Enum):

 @dataclass
 class BinaryVector:
-    """Vector of numbers along with metadata for binary interoperability.
+    """**(BETA)** Vector of numbers along with metadata for binary interoperability.

     :param data: Sequence of numbers representing the mathematical vector.
     :param dtype: The data type stored in binary
@@ -257,7 +257,7 @@ class Binary(bytes):
     the difference between what should be considered binary data and
     what should be considered a string when we encode to BSON.

-    Subtype 9 provides a space-efficient representation of 1-dimensional vector data.
+    **(BETA)** Subtype 9 provides a space-efficient representation of 1-dimensional vector data.
     Its data is prepended with two bytes of metadata.
     The first (dtype) describes its data type, such as float32 or int8.
     The second (padding) prescribes the number of bits to ignore in the final byte.
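The byte layout described in the `Binary` docstring above (one dtype byte, one padding byte, then the packed numbers) can be sketched with only the standard library. This is an illustration, not PyMongo's implementation: `VectorDtype` and `pack_vector` are hypothetical names, the dtype values are the ones quoted in the `BinaryVectorDtype` docstring, and little-endian byte order and the 0 to 7 padding range are assumptions not stated in this diff.

```python
import struct
from enum import Enum


class VectorDtype(Enum):
    """Hypothetical stand-in for BinaryVectorDtype (values quoted from its docstring)."""

    INT8 = 0x03
    FLOAT32 = 0x27


def pack_vector(values, dtype, padding=0):
    """Return a subtype 9 style payload: dtype byte, padding byte, packed numbers."""
    # Assumption: padding counts bits ignored in the final byte, so it fits in 0..7.
    if not 0 <= padding <= 7:
        raise ValueError("padding must be between 0 and 7")
    metadata = struct.pack("<BB", dtype.value, padding)
    if dtype is VectorDtype.INT8:
        fmt = f"<{len(values)}b"  # signed 8-bit integers
    elif dtype is VectorDtype.FLOAT32:
        fmt = f"<{len(values)}f"  # 32-bit floats, little-endian (assumed)
    else:
        raise ValueError(f"unsupported dtype: {dtype}")
    try:
        data = struct.pack(fmt, *values)
    except struct.error as exc:
        # Out-of-range ints, or floats passed with INT8, end up here.
        raise ValueError(f"values do not fit {dtype.name}: {exc}") from exc
    return metadata + data


payload = pack_vector([1, -1, 127], VectorDtype.INT8)
print(payload.hex())
```

In the PR itself this role is played by `Binary.from_vector`, which wraps the result in a `Binary` with `subtype=VECTOR_SUBTYPE`.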
@@ -278,8 +278,8 @@ class Binary(bytes):
     .. versionchanged:: 3.9
        Support any bytes-like type that implements the buffer protocol.

-    .. versionchanged:: 4.9
-       Addition of vector subtype.
+    .. versionchanged:: 4.10
+       **(BETA)** Addition of vector subtype.
     """

     _type_marker = 5
@@ -405,7 +405,7 @@ def from_vector(
         dtype: BinaryVectorDtype,
         padding: Optional[int] = 0,
     ) -> Binary:
-        """Create a BSON :class:`~bson.binary.Binary` of Vector subtype from a list of Numbers.
+        """**(BETA)** Create a BSON :class:`~bson.binary.Binary` of Vector subtype from a list of Numbers.

         To interpret the representation of the numbers, a data type must be included.
         See :class:`~bson.binary.BinaryVectorDtype` for available types and descriptions.
@@ -435,7 +435,7 @@ def from_vector(
         return cls(metadata + data, subtype=VECTOR_SUBTYPE)

     def as_vector(self, uncompressed: Optional[bool] = False) -> BinaryVector:
-        """From the Binary, create a list of numbers, along with dtype and padding.
+        """**(BETA)** From the Binary, create a list of numbers, along with dtype and padding.


         :param uncompressed: If true, return the true mathematical vector.
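Going the other way, `as_vector` returns the list of numbers together with dtype and padding. A rough standard-library sketch of that decoding step follows. It is not the code from this PR: `unpack_vector` is a hypothetical helper, the dtype bytes are the ones quoted in the `BinaryVectorDtype` docstring, and little-endian byte order is an assumption. Errors are raised as `ValueError`, in line with the commit "Raise ValueErrors not AssertionErrors".

```python
import struct

# dtype bytes quoted in the BinaryVectorDtype docstring
INT8 = 0x03
FLOAT32 = 0x27


def unpack_vector(payload):
    """Split a subtype 9 style payload into (values, dtype, padding)."""
    if len(payload) < 2:
        raise ValueError("payload must start with dtype and padding bytes")
    dtype, padding = struct.unpack_from("<BB", payload)
    body = payload[2:]
    if dtype == INT8:
        # One signed byte per element.
        values = list(struct.unpack(f"<{len(body)}b", body))
    elif dtype == FLOAT32:
        # Four bytes per element, little-endian (assumed).
        if len(body) % 4:
            raise ValueError("float32 data length must be a multiple of 4")
        values = list(struct.unpack(f"<{len(body) // 4}f", body))
    else:
        raise ValueError(f"unsupported dtype byte: {dtype:#04x}")
    return values, dtype, padding


values, dtype, padding = unpack_vector(b"\x03\x00\x01\xff\x7f")
print(values, hex(dtype), padding)
```

The library version additionally checks that the `Binary` has the vector subtype before decoding (commit 99d44e1); the sketch skips that because it works on raw bytes.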
Suggested change: Optional[int] -> int