Releases: unionai-oss/pandera

Release v0.21.0: Reduce import and schema creation runtime, add docsearch search bar

15 Nov 00:48
9667234

⭐️ Highlights

This release optimizes import and schema-creation runtime: importing pandera and creating a schema (without running any validation) now takes ~5 ms, down from more than 800 ms. It also switches the docs to docsearch for a better search experience.
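To check the import-time figure on your own machine, a rough sketch like the one below can help. It times an import in a fresh interpreter so that previously cached modules do not skew the result. The helper name `time_cold_import` and its `extra` parameter are hypothetical and not part of pandera; the example call uses the standard-library `json` module so the sketch runs anywhere, and you would pass `"pandera"` instead (with pandera installed) to measure this release.

```python
import subprocess
import sys


def time_cold_import(module: str, extra: str = "") -> float:
    """Return the seconds needed to import `module` in a fresh interpreter.

    `extra` is optional Python source executed right after the import and
    included in the measurement (for example, code that builds a schema).
    """
    snippet = (
        "import time\n"
        "start = time.perf_counter()\n"
        f"import {module}\n"
        f"{extra}\n"
        "print(time.perf_counter() - start)\n"
    )
    # A subprocess guarantees a cold import: nothing is cached in sys.modules.
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        check=True,
    )
    # The timing is the last line printed by the child process.
    return float(result.stdout.strip().splitlines()[-1])


if __name__ == "__main__":
    # Swap "json" for "pandera" to measure pandera itself (assumes it is installed).
    print(f"cold import: {time_cold_import('json') * 1000:.1f} ms")
```

Absolute numbers depend on hardware, Python version, and disk cache state, so comparing two pandera versions in the same environment is more informative than any single reading.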

What's Changed

New Contributors

Full Changelog: v0.20.4...v0.21.0

Release v0.20.4: Bugfixes to polars & pyspark backends and more

03 Sep 15:32
ba6cd87

What's Changed

  • Bugfix/1732: Fix misleading error when columns are missing and lazy=True by @benlee1284 in #1752
  • Bugfix/1644: refactor geopandas and pyarrow dtypes to avoid top-level import by @cosmicBboy in #1753
  • regex column errors should report the correct column name by @cosmicBboy in #1754
  • bugfix/1657: use rename instead of select in polars check backend by @cosmicBboy in #1757
  • make sure registered checks supports error kwarg by @cosmicBboy in #1756
  • make sure optional generic types are supported by @cosmicBboy in #1758
  • fix: SQLModel table model not validated by @AlpAribal in #1696
  • Restore accidentally-deleted use of "breakpoint()" by @deepyaman in #1763
  • Swap types-pkg_resources with types-setuptools by @deepyaman in #1779
  • Add support for Spark Connect dataframes by @filipeo2-mck in #1775
  • feat: select_columns reorders columns by default by @ldacey in #1783
  • Update Polars dtype test to generate more examples by @deepyaman in #1770
  • bugfix/1784 polars DataFrameModel.to_json_schema() fails on DateTime column by @AlpAribal in #1789
  • fix pd.ArrowDtype use in pandera engine for old pd versions by @cosmicBboy in #1792
  • Reexport polars function to match pyright expectation by @gab23r in #1797

New Contributors

Full Changelog: v0.20.3...v0.20.4

Release v0.20.3: polars integration cleanup, docs updates, bugfixes

17 Jul 19:53
9b30a8e

What's Changed

Full Changelog: v0.20.2...v0.20.3

Release v0.20.2: Complete pyarrow coverage, support polars v1

16 Jul 13:08
50949b0

⭐️ Highlights

What's Changed

New Contributors

Full Changelog: v0.20.1...v0.20.2

Release v0.20.1: Bugfix for pyarrow dependency error

27 Jun 18:41
44a9763

What's Changed

Full Changelog: v0.20.0...v0.20.1

Release v0.20.0: Pyarrow dtype support

26 Jun 16:58
a041e06

⭐️ Highlights

  • Pandera now supports pyarrow datatypes in the pandera validation engine! Big shoutout to @aaravind100 for the heavy lifting here.
  • Added compatibility for numpy v2.
  • Added compatibility for polars v1.
  • pandera.SchemaModel is now deprecated; use pandera.DataFrameModel instead.

What's Changed

New Contributors

Full Changelog: v0.19.2...v0.20.0

Release 0.19.3: Polars dtype bugfixes

14 May 02:45
d2bfed0

What's Changed

  • bugfix: timezone-agnostic datetime in polars works in DataFrameModel by @cosmicBboy in #1638
  • Bugfix/1631: Series[Annotated[...]] DataFrameModel types should correctly create a DataFrameSchema by @cosmicBboy in #1633
  • Add missing pandas import line. by @kyleweise in #1635

New Contributors

Full Changelog: v0.19.2...v0.19.3

Release v0.19.2: Bugfix on correctly checking nullable Floats

08 May 21:28
63140c9

What's Changed

  • bugfix: nullable check float dtype handles nan and null by @cosmicBboy in #1627

Full Changelog: v0.19.1...v0.19.2

Release 0.19.1: Bugfixes and docs fixes

08 May 02:09
0faae07

What's Changed

New Contributors

Full Changelog: v0.19.0...v0.19.1

Release 0.19.0: Polars validation support

06 May 14:28
612d25c

✨ Highlights ✨

📣 Pandera now supports validation of polars.DataFrame and polars.LazyFrame 🐻‍❄️!

You can now do this:

import pandera.polars as pa
import polars as pl


class Schema(pa.DataFrameModel):
    state: str
    city: str
    price: int = pa.Field(in_range={"min_value": 5, "max_value": 20})


lf = pl.LazyFrame(
    {
        'state': ['FL','FL','FL','CA','CA','CA'],
        'city': [
            'Orlando',
            'Miami',
            'Tampa',
            'San Francisco',
            'Los Angeles',
            'San Diego',
        ],
        'price': [8, 12, 10, 16, 20, 18],
    }
)
Schema.validate(lf).collect()

And of course you can do functional validation with decorators like so:

from pandera.typing.polars import LazyFrame

@pa.check_types
def function(lf: LazyFrame[Schema]) -> LazyFrame[Schema]:
    return lf.filter(pl.col("state").eq("CA"))

function(lf).collect()

You can read more about the integration here. Not all pandera features are supported at this point, but we'll add them gradually depending on community demand and contributions. To learn more about what's currently supported, check out this table.

Special shoutout to @AndriiG13 and @FilipAisot for their contributions on the built-in checks and polars datatypes, respectively, and to @evanrasmussen9, @baldwinj30, @obiii, @Filimoa, @philiporlando, @r-bar, @alkment, @jjfantini, and @robertdj for their early feedback and bug reports during the 0.19.0 beta.

What's Changed

New Contributors

Full Changelog: v0.18.3...v0.19.0