Releases: unionai-oss/pandera

0.7.1: Add unique option to DataFrameSchema

13 Sep 00:28
f0ddcbf

Enhancements

  • add support for Any annotation in schema model (#594)
  • add support for timezone-aware datetime strategies (#595)
  • unique keyword arg: replace and deprecate allow_duplicates (#580)
  • Add support for empty data type annotation in SchemaModel (#602)
  • support frictionless primary keys with multiple fields (#608)

Bugfixes

  • unify typing.DataFrame class definitions (#576)
  • schemas with multi-index columns correctly report errors (#600)
  • strategies module supports undefined checks in regex columns (#599)
  • fix validation of check raising error without message (#613)

Docs Improvements

  • Tutorial: docs/scaling - Bring Pandera to Spark and Dask (#588)

Repo Improvements

  • use virtualenv instead of conda in ci (#578)

Dependency Changes

  • remove frictionless from core pandera deps (#609)
  • docs/requirements.txt pin setuptools (#611)

Contributors

🎉🎉 Big shout out to all the contributors on this release 🎉🎉

0.7.0: Pandera Type System Overhaul

06 Aug 02:30

Enhancements

  • Add support for frictionless schemas (#454)
  • decouple pandera and pandas dtypes (#559)
  • Unify dataframe definitions to fix auto-complete (#576)
  • Report all failure cases when coercing dtypes fails (#584)

Bugfixes

  • Handle case of pandas.DataFrame with pandas.MultiIndex in pandera.error_formatters.reshape_failure_cases (#560)
  • Add 'ordered.setter' decorator (#567)
  • Fix decorators on classmethods (#568)
  • better handling of datetime/timedelta in serialize/deserialize (#585)

Docs Improvements

  • Update contributing guide ccca82f
  • Add documentation build to contributing guide 361fec0
  • Fix virtualenv instructions in contributing guide ed74a65
  • Feature/coroutines docs (#570)
  • Add frictionless documentation (#579)
  • use python primitive types in docs where possible (#581)

Repo Improvements

  • Add typing to un-annotated functions (#569)
  • use virtualenv instead of conda in ci (#578)

Contributors

Big shout out to ✨ @mattHawthorn, @vinisalazar, @cristianmatache, @TColl, @jeffzi, @admackin, and @benkeesey ✨ for your contributions on this release 🎉🎉🎉

0.6.5: Support coroutines, regex matching on non-str column names, bugfixes

13 Jul 19:21

Enhancements

  • Raise error if check_obj.index is MultiIndex when using pandera.Index (#483)
  • support decorators for coroutines (#546)
  • added py.typed and typed Series descriptor (#543)
  • select non-str column names with regex=True (#551)

Bugfixes

  • check decorators support non-DataFrame types (#510)
  • lazy validation correctly reports all errors (#528)
  • don't drop duplicates for series failure cases (#535)
  • custom dataframe-level checks don't corrupt data-synthesis strategy (#550)

Contributors

Thanks to @jekwatt @cristianmatache @lkadin for your first-time contributions! 🎉🎉🎉

0.6.4: Support dataframe-level checks in SchemaModel Config, Bugfixes

08 May 16:08

New Features

  • Allow attaching registered dataframe checks by using Config field names (#478)

Bugfixes

  • alias propagation works correctly on empty subclass (#446)
  • Add missing inplace arg to SchemaModel's validate (#450)
  • fix: check_types decorator now returns results from validate (#458)
  • Dataframe schemas in yaml do not require any field (#479)
  • coerce=True and pandas_dtype=None should be a noop (#476)

Docs Improvements

  • update documentation css to fit mobile (#447)
  • add copy button to docs (#448)
  • link documentation to github (#449)

Infrastructure Changes

0.6.3: Bugfixes, update docs

28 Mar 02:28

New Features

  • add new method SchemaModel.to_yaml to serialize SchemaModels to yaml (#428)

Bugfixes

  • preserve pandas extension types during validation (#443)
  • Fix to_yaml serialization dropping global checks (#428) 🎉 first contribution @antonl 🎉
  • fix empty data type not supported for serialization (#435)
  • fix empty SchemaModel (#434)
  • add doc about attributes excluded by SchemaModel (#436) @jeffzi
  • fix DataFrameSchema comparison with non-DataFrameSchema (#431) @jeffzi
  • schema serialization handles non-PandasDtype (#424)
  • pa.Object coerce should preserve object type (#423)

Documentation

0.6.2: SchemaModel and synthesis bugfixes

16 Feb 17:17

New Features

  • Add SchemaModel column name access through class attributes (#388) @jespercodes @jeffzi 🎉
  • Parametrized PandasExtensionType types (#389) @jeffzi 🎉
  • adding filter argument to strict parameter (#401) @ktroutman
  • feature/341: improve str and repr methods for schemas (#413)

Bugfixes

  • fix py3.6 optional + literal dtypes in SchemaModel (#379) @jeffzi 🎉
  • Fix minimally required packaging version (#380) contribution #1️⃣ @probberechts 🎉
  • prevent mypy Check getattr error for registered checks 920a98c
  • Compatibility with numpy 1.20 (#395) @jeffzi
  • dataframe strategies can generate regex columns (#402)
  • bugfix: df data synthesis with size=None, fix CI (#410)
  • bugfix: SeriesSchema raises SchemaErrors on lazy validation (#412)

Repo Improvements

  • improvements to local CI (#409) @jeffzi
  • feature/414: improve contributing docs and add to sphinx docs (#416)

0.6.1: coercion and required column bugfixes

07 Jan 01:03
bfdb118

Bugfix Release

This release contains two bugfixes:

  • coerce nullable str column handles all na (#366)
  • non-required columns that are not in dataframe are not coerced (#368)

0.6.0: Data Synthesis Strategies, Schema Enhancements

17 Dec 20:07

🎉🎉🎉 Thanks to @jeffzi, @ktroutman, @m1so for your contributions! 🎉🎉🎉

Enhancements

  • Improve memory efficiency of validation process (#360)
  • Add column order validation (#352)
  • Implement data synthesis strategies using hypothesis (#344)
  • Add support for aliases in SchemaModel (#329)
  • Add support for optional name validation of single-index (#326)
  • Move columns to multiindex: add reset_index, set_index method to DataFrameSchema (#319)
  • Add support for Python 3.9 (#307)

Bugfixes

  • typing.DataFrame should expect annotation input (#318)

Deprecations

  • SchemaErrors.schema_errors has been changed to failure_cases, and the schema_errors attribute now contains a list of dicts containing schema errors and reason codes. This is a breaking change, but is a minor part of the API and is fairly straightforward to fix (#360).

Documentation Improvements

  • Add required columns documentation for schema models (#362)
  • Fix docs: schema examples (#347)
  • Add documentation for dataframeschema transformations (#333)
  • Fix deprecated SchemaErrorReport references in docs (#310)
  • Fix SchemaModel dtype example (#309)

Repo Improvements

  • Update logo 69c6e56
  • Add flynt to pre-commit hooks (#325)
  • Use generic zenodo link for citation information c4f4fe7

0.5.1: bugfix - add packaging dependency

28 Nov 21:15

pandera relied on the packaging package to get version information to determine pandas legacy status. This was an implicit sub-dependency of one of pandera's dependencies, which was apparently dropped and led to a bug: #335. This bugfix version explicitly adds packaging.
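
To illustrate why a version check of this kind leans on packaging, here is a hedged sketch. The LEGACY_CUTOFF value and the is_legacy helper are hypothetical and do not reproduce pandera's actual check; only packaging.version.parse is a real API.

```python
from packaging import version

# Hypothetical cutoff: versions below this are treated as "legacy".
LEGACY_CUTOFF = version.parse("1.0.0")


def is_legacy(installed: str) -> bool:
    """Hypothetical helper: compare a version string against the cutoff."""
    return version.parse(installed) < LEGACY_CUTOFF


# Plain string comparison orders versions incorrectly, because "9" sorts
# after "2" character by character; parsed versions compare numerically.
string_order_says_newer = "0.9.0" > "0.25.3"
parsed_order_says_older = version.parse("0.9.0") < version.parse("0.25.3")
```

Because the module is imported at runtime, it has to be declared as a direct dependency rather than assumed to arrive through another package.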

0.5.0: Class-based API for DataFrame Typing

25 Oct 13:24
db31b10

Enhancements

  • Implement class-based API for pydantic-style schema definitions 786b504. Big thanks to @jeffzi 🎉
  • Add inplace=False argument to schema.validate method to prevent mutation of original dataframe 586ebf3.
  • Make pandera optional extensions [hypothesis], [io], [all] available c4716a0. Thanks @amitripshtos and @jeffzi 🎉
  • Add support for complex number data types 50e86e4 thanks @ferhah 🎉
  • Add support for numpy scalar types a519db5
  • Add check_io decorator for checking inputs and outputs of a function 913cbd7
  • Throw SchemaError with column name instead of ValueError for nulls in int series f7b03e3 thanks @TheCleric 🎉

Bugfixes

  • Fix io.to_script and to_yaml: ignore Checks with lambda functions when serializing da9c3a5 thanks @ferhah 🎉

Deprecations

  • Drop support for Python 3.5 91e21a2
  • Deprecate transformers argument in DataFrameSchema init 89c3c91

Documentation Improvements

Repo Improvements