Releases: Blosc/python-blosc2
Release 3.0.0 beta4
Changes from 3.0.0-beta.3 to 3.0.0-beta.4
- Many new examples in the documentation. Now, the documentation is more complete and has a better structure.
  Have a look at our new docs at: https://www.blosc.org/python-blosc2/index.html
  For a guide on using UDFs, check out: https://www.blosc.org/python-blosc2/reference/autofiles/lazyarray/blosc2.lazyudf.html
  If interested in asynchronously fetching parts of an array, take a look at: https://www.blosc.org/python-blosc2/reference/autofiles/proxy/blosc2.Proxy.afetch.html
  Finally, there is a new tutorial on optimizing reductions in large NDArray objects: https://www.blosc.org/python-blosc2/getting_started/tutorials/04.reductions.html
  Special thanks to @omaech and @martaiborra for the excellent work on the documentation and examples, and to @numfocus for their support in making this possible!
- New CParams, DParams and Storage dataclasses for better handling of parameters in the library. You can now use these dataclasses to pass parameters to the library and get better error handling. See here. Thanks to @martaiborra for the excellent implementation.
- Better support for CParams in Proxy and C2Array instances. Compression parameters are now propagated more faithfully from Caterva2 datasets to the Proxy and C2Array instances, giving a more accurate view of the codecs and filters originally used in the datasets. Thanks to @FrancescAlted for the implementation.
- Many improvements in ruff linting and code style. Thanks to @DimitriPapadopoulos for the excellent work in this area.
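The point about dataclasses giving better error handling can be illustrated with a small sketch. The class below is hypothetical: it is not blosc2.CParams, and its field names and ranges are assumptions chosen only for the illustration. It shows the general benefit of a dataclass over a plain dict of parameters: misspelled or out-of-range values fail at construction time.

```python
from dataclasses import dataclass


@dataclass
class SketchCParams:
    """Hypothetical stand-in for a compression-parameters dataclass."""

    clevel: int = 1    # compression level; the 0..9 range is an assumption
    nthreads: int = 1  # worker threads, must be positive
    typesize: int = 8  # bytes per item, must be positive

    def __post_init__(self):
        # Validate once, at construction time, so mistakes surface early.
        if not 0 <= self.clevel <= 9:
            raise ValueError(f"clevel must be in 0..9, got {self.clevel}")
        if self.nthreads < 1:
            raise ValueError(f"nthreads must be >= 1, got {self.nthreads}")
        if self.typesize < 1:
            raise ValueError(f"typesize must be >= 1, got {self.typesize}")


params = SketchCParams(clevel=5, nthreads=4)

# A misspelled field is rejected by the generated __init__ instead of being
# silently ignored, as can happen with a plain dict of parameters.
try:
    SketchCParams(clevle=5)
except TypeError as exc:
    typo_error = str(exc)

# An out-of-range value is rejected by __post_init__.
try:
    SketchCParams(clevel=42)
except ValueError as exc:
    range_error = str(exc)
```

For the real field names and defaults, refer to the CParams, DParams and Storage entries in the library reference.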
Release 3.0.0 beta3
Changes from 3.0.0-beta.1 to 3.0.0-beta.3
- Revamped documentation. Now, it is more complete and has a better structure. Thanks to Oumaima Ech Chdig (@omaech), our newcomer to the Blosc team. Also, thanks to NumFOCUS for their support in this task.
- New Proxy class to access other arrays, while providing caching. This is useful, for example, when you have a big array and want to access a small part of it, but also want to cache the accessed data for later use. See its doc.
- Lazy expressions can accept proxies as operands.
- Read-ahead support for reading super-chunks from disk. This allows for overlapping reads and computations, which can be a big performance boost for some workloads.
- New BLOSC_LOW_MEM envar for keeping memory usage to a minimum while evaluating expressions. This makes it possible to evaluate expressions on very large arrays, even if the memory is limited (at the expense of performance).
- Fine-tuned block sizes for the internal compute engine.
- Better CPU cache size guessing for Linux and macOS.
- Build tooling has been modernized and now uses pyproject.toml and scikit-build-core for managing dependencies and building the package. Thanks to @LecrisUT for the excellent guidance in this area.
- Many code cleanups and syntax improvements. Thanks to @DimitriPapadopoulos.
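The caching idea behind the Proxy class mentioned above can be sketched in plain Python. This is a conceptual illustration only: ChunkCachingProxy is a hypothetical name, it does not use the blosc2 API, and the real Proxy may cache differently. It fetches fixed-size chunks from a source on first access and serves later reads from the cache.

```python
class ChunkCachingProxy:
    """Hypothetical proxy that caches fixed-size chunks of a sliceable source."""

    def __init__(self, source, chunk_len):
        self.source = source      # any sequence supporting len() and slicing
        self.chunk_len = chunk_len
        self.cache = {}           # chunk index -> list of items
        self.fetches = 0          # number of reads that hit the source

    def _chunk(self, nchunk):
        # Read the chunk from the source only the first time it is needed.
        if nchunk not in self.cache:
            start = nchunk * self.chunk_len
            self.cache[nchunk] = list(self.source[start:start + self.chunk_len])
            self.fetches += 1
        return self.cache[nchunk]

    def __getitem__(self, key):
        start, stop, step = key.indices(len(self.source))
        if step != 1:
            raise ValueError("only contiguous slices are supported in this sketch")
        out = []
        if start >= stop:
            return out
        first, last = start // self.chunk_len, (stop - 1) // self.chunk_len
        for nchunk in range(first, last + 1):
            base = nchunk * self.chunk_len
            chunk = self._chunk(nchunk)
            # Keep only the part of this chunk that falls inside the slice.
            out.extend(chunk[max(start - base, 0):min(stop - base, self.chunk_len)])
        return out


big = range(1_000_000)
proxy = ChunkCachingProxy(big, chunk_len=1000)
first = proxy[2500:3500]  # touches chunks 2 and 3, both fetched from the source
again = proxy[2600:2700]  # lies inside chunk 2, served from the cache
```

After both reads, only two chunks have been fetched from the source; the second read costs no additional fetch.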
Release 2.7.1
Changes from 2.7.0 to 2.7.1
- Updated to latest C-Blosc2 2.15.1.
  Fixes SIGKILL issues when using the blosc2 library on old Intel CPUs.
Release 3.0.0 beta 1
Changes from 2.6.2 to 3.0.0-beta.1
- New evaluation engine (based on numexpr) for NDArray instances. Now, you can evaluate expressions like a + b + 1 where a and b are NDArray instances. This is a powerful feature that allows for efficient computations on compressed data, and supports advanced features like reductions, filters, user-defined functions and broadcasting (still in beta). See this example.
- As a consequence of the above, there are many new functions to operate with, and evaluate, NDArray instances. See the function section docs for more information.
- Support for NumPy 2.0.0 is here! Now, the wheels are built with NumPy 2.0.0. If you want to use NumPy 1.x, you can still do so by installing NumPy 1.23 and up.
- Support for memory mapping in SChunk and NDArray instances. This allows mapping super-chunks stored on disk and accessing them as if they were in memory. If curious, see some benchmarks here. Thanks to @JanSellner for the excellent implementation, both in the C and the Python libraries.
- Internal C-Blosc2 updated to 2.15.0.
- 32-bit platforms are officially unsupported now. If you need support for 32-bit platforms, please use the python-blosc 1.x series.
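To give an intuition for what evaluating a + b + 1 on compressed data involves, here is a minimal sketch that works chunk by chunk. It is not the library's engine: zlib stands in for Blosc2, the array module stands in for NDArray, and all function names and the chunk length are assumptions made for the illustration. The key idea is that only one chunk per operand is decompressed at any time.

```python
import zlib
from array import array

CHUNK_LEN = 1000  # items per chunk (arbitrary for this sketch)


def compress_chunks(values):
    """Split a list of ints into chunks and compress each one with zlib."""
    chunks = []
    for i in range(0, len(values), CHUNK_LEN):
        raw = array("q", values[i:i + CHUNK_LEN]).tobytes()
        chunks.append(zlib.compress(raw))
    return chunks


def decompress_chunk(chunk):
    """Decompress one chunk back into an array of 64-bit signed ints."""
    out = array("q")
    out.frombytes(zlib.decompress(chunk))
    return out


def eval_a_plus_b_plus_1(a_chunks, b_chunks):
    """Evaluate a + b + 1 one chunk at a time, returning compressed chunks."""
    result = []
    for ca, cb in zip(a_chunks, b_chunks):
        a, b = decompress_chunk(ca), decompress_chunk(cb)
        r = array("q", (x + y + 1 for x, y in zip(a, b)))
        result.append(zlib.compress(r.tobytes()))
    return result


a = compress_chunks(list(range(5000)))
b = compress_chunks(list(range(5000, 10000)))
c = eval_a_plus_b_plus_1(a, b)

# Decompress the result only to inspect it.
total = [v for chunk in c for v in decompress_chunk(chunk)]
```

Because operands and result stay compressed outside the inner loop, peak memory is bounded by a few chunks rather than by the full arrays.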
Release 2.7.0
Changes from 2.6.2 to 2.7.0
- Updated to latest C-Blosc2 2.15.0.
- Deprecated LazyExpr.evaluate().
- Fixed _check_rc function. See #187.
Release 2.6.2
Changes from 2.6.1 to 2.6.2
- Protection when platforms have just one CPU. This caused the internal number of threads to be 0, producing a division by zero.
- Updated to latest C-Blosc2 2.14.3.
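The single-CPU problem described above can be pictured with a small sketch. The halving heuristic and both function names here are assumptions made for illustration; the notes do not say how the library derives its thread count. The sketch only shows how a derived thread count can reach 0 on a one-CPU machine and how clamping it to 1 avoids the division by zero.

```python
def pick_nthreads(ncpus):
    """Hypothetical heuristic: half of the CPUs, but never fewer than one."""
    return max(1, ncpus // 2)


def items_per_thread(nitems, ncpus):
    # Without the max(1, ...) clamp, ncpus == 1 would yield 0 threads
    # and this division would raise ZeroDivisionError.
    return nitems // pick_nthreads(ncpus)


single_cpu_share = items_per_thread(100, ncpus=1)
eight_cpu_share = items_per_thread(100, ncpus=8)
```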
Release 2.6.1
Changes from 2.6.0 to 2.6.1
- Updated to latest C-Blosc2 2.14.1. This was necessary to be able to load dynamic plugins on Windows.
Release 2.6.0
Changes from 2.5.1 to 2.6.0
- [EXP] New evaluation engine (based on numexpr) for NDArray instances. Now, you can evaluate expressions like a + b + 1 where a and b are NDArray instances. This is a powerful feature that allows for efficient computations on compressed data. See this example to see how it works. Thanks to @omaech for her help with the pow function.
- As a consequence of the above, there are many new functions to operate with NDArray instances. See the function section in NDArray API for more information.
- Support for NumPy 2.0.0 is here! Now, the wheels are built with NumPy 2.0.0rc1. Please tell us in case you see any issues with this new version.
- Add **kwargs to the load_tensor() function. This allows passing additional parameters to the deserialization function. Thanks to @jasam-sheja.
- Fix vlmeta.to_dict() not honoring tuple encoding. Thanks to @ivilata.
- Check that chunks/blocks computation does not allow a 0 in blocks. Thanks to @ivilata.
- Many improvements in ruff rules and others. Thanks to @DimitriPapadopoulos.
- Remove printing large arrays in notebooks (they use too much RAM in recent versions of Jupyter notebook).
- Updated to latest C-Blosc2 2.14.0.
Release 2.5.1
Changes from 2.5.0 to 2.5.1
- Updated to latest C-Blosc2 2.13.1.
- Fixed bug in b2nd.h.
Release 2.5.0
Changes from 2.4.0 to 2.5.0
- Updated to latest C-Blosc2 2.13.0.
- Added the filter INT_TRUNC for integer truncation.
- Added some optimizations for zstd.
- Now the grok library is initialized when loading the plugin from C-Blosc2.
- Improved doc.
- Support for slices in blosc2.get_slice_nchunks() when using SChunk objects.
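To show why an integer truncation filter such as INT_TRUNC can help compression, here is a conceptual sketch. It does not call blosc2: zlib stands in for the codec, and the helper names and the choice of dropping 8 bits are assumptions for the illustration. Zeroing the least-significant bits discards precision but makes the byte stream far more regular, so it compresses better.

```python
import random
import struct
import zlib


def truncate_low_bits(values, nbits):
    """Zero the nbits least-significant bits of each non-negative integer."""
    mask = ~((1 << nbits) - 1)
    return [v & mask for v in values]


def compressed_size(ints):
    """Pack ints as little-endian 32-bit values and return the zlib size."""
    return len(zlib.compress(struct.pack(f"<{len(ints)}i", *ints)))


random.seed(0)  # fixed seed so the run is reproducible
values = [random.randrange(1 << 31) for _ in range(10_000)]
truncated = truncate_low_bits(values, 8)

size_before = compressed_size(values)
size_after = compressed_size(truncated)  # smaller: the low byte is always 0
```

The trade-off is lossy: each value may differ from the original by up to 2**nbits - 1, so truncation only makes sense when that precision is not needed.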
Release 2.4.0
Changes from 2.3.2 to 2.4.0
- Updated to latest C-Blosc2 2.12.0.
- Added blosc2.get_slice_nchunks() to get the array of chunk indexes needed to get a slice of a Blosc2 container.
- Added grok codec plugin.
- Added imported target with pkg-config to support Windows.
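The idea behind blosc2.get_slice_nchunks() can be sketched for the one-dimensional case. The function below is a hypothetical helper that does not call the library, and its name and signature are assumptions. It computes which chunk indexes overlap a half-open item range, which is the information needed to fetch only the chunks a slice touches.

```python
def slice_chunk_indexes(start, stop, chunk_len):
    """Return indexes of the chunks overlapping the range [start, stop)."""
    if stop <= start:
        return []
    first = start // chunk_len
    last = (stop - 1) // chunk_len  # chunk holding the last requested item
    return list(range(first, last + 1))


needed = slice_chunk_indexes(2500, 3500, chunk_len=1000)
```

For multidimensional containers, the same reasoning would apply per dimension, with the per-dimension chunk indexes then combined into flat chunk numbers.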