Releases: scikit-hep/awkward
Version 2.0.7
New features
- feat: add ability to forget length of typetracer created with `typetracer_from_report` by @douglasdavis in #2141
- feat: start hardening nplike signatures by @agoose77 in #2148
- feat: implement all ufuncs on TypeTracer. by @jpivarski in #2149
- feat: use `None` for unknown lengths (1 of 2) by @agoose77 in #2168
- feat: coerce backends to same zero-copy type (2 of 2) by @agoose77 in #2175
- feat: growable buffer move_to method by @ianna in #2178
- feat: add `ak.merge_union_of_records` by @agoose77 in #2185
- feat: add support for histogram module by @agoose77 in #2190
- feat: add `ak.approx_equal` by @agoose77 in #2198
Bug-fixes and performance
- fix: re-order cases to handle NumPy scalar types properly by @agoose77 in #2125
- fix: specify `dtype` for buffers in `from_rdataframe`. by @agoose77 in #2145
- fix: unify typestr with `_repr` by @agoose77 in #2158
- fix: update `type_to_name` for Layout builder `cxx_14` target by @ianna in #2165
- fix: Layout builders clean and length bug fixes by @ianna in #2171
- fix: support `mask_identity=True` for `axis=None` in `ptp`, `std`, etc. by @agoose77 in #2172
- fix: preserve dimensions for `keepdims=True`, `axis=None` reductions by @agoose77 in #2177
- fix: some usages of `len(layout)` under typetracer by @agoose77 in #2181
- fix: rdataframe memory check by @ianna in #2155
- fix: rework parameter merging rules by @agoose77 in #2179
- fix: don't raise `NotImplementedError` when reading empty array from Parquet by @dsavoiu in #2194
- fix: ignore object arrays by @agoose77 in #2206
- fix: ak.values_astype now turns 'unknown' type into the requested type. by @jpivarski in #2196
Other
- refactor: remove `to_arraylib` by @agoose77 in #2128
- refactor: move `Singleton` to its own module by @agoose77 in #2131
- refactor: move kernel logic to _kernels by @agoose77 in #2132
- refactor: use `nplike.asarray` by @agoose77 in #2134
- refactor: remove dead code by @agoose77 in #2139
- refactor: make `nplike.zeros` et al. use Array API type signatures by @agoose77 in #2137
- refactor: move typetracer ufunc handling to backend [1 of 2] by @agoose77 in #2150
- refactor: split `_nplikes.py` into `_nplikes/*.py` [2 of 2] by @agoose77 in #2152
- refactor: drop `UnknownScalar`, harden unknown scalar behavior by @agoose77 in #2154
- docs: fix bold docstring due to indent by @agoose77 in #2122
- docs: disable IPyParallel & load dependencies in reverse order by @agoose77 in #2126
- docs: improve error message by @agoose77 in #2201
- docs: add dsavoiu as a contributor for code by @allcontributors in #2204
- ci: test NumPy < 1.17 by @agoose77 in #2142
- ci: run header-only tests by @agoose77 in #2169
- chore: apply changes from flake8 by @agoose77 in #2130
- chore: isolate nox dependencies in `noxfile` by @agoose77 in #2136
- chore: drop unused nplike functions by @agoose77 in #2138
- chore: add helpful message if cpp install is not prepared by @agoose77 in #2146
- chore: move to Ruff by @henryiii in #2153
- chore(deps): bump pypa/cibuildwheel from 2.11.4 to 2.12.0 by @dependabot in #2133
- chore: update Ruff version by @henryiii in #2174
- chore: ruff rewrite `not in` by @henryiii in #2176
- chore: ruff rewrite dicts by @henryiii in #2183
New Contributors
Full Changelog: v2.0.6...v2.0.7
Version 2.0.6
New features
- feat: expose typetracer in public backend API by @agoose77 in #2066
- feat: add byteorder argument to `to_buffers` by @agoose77 in #2095
- feat: add exception for missing field by @agoose77 in #2120
Bug-fixes and performance
- fix: support scalars in tuple (and list) arguments provided to `__array_function__` by @agoose77 in #2045
- fix: support option-in-record for `fill_none` by @agoose77 in #2065
- fix: support unzipping `ak.Record` by @agoose77 in #2077
- fix: render keyword and varargs by @agoose77 in #2074
- fix: don't try to re-wrap `array_function` overload results by @agoose77 in #2079
- fix: support merging of `RegularArray` and `NumpyArray` by @agoose77 in #2063
- fix: correct NumPy zero-size broadcasting by @agoose77 in #2083
- fix: implement explicit translation for NEP-18 by @agoose77 in #2089
- fix: listarray - slicing expects scalars by @ianna in #2069
- fix: off-by-one error in `run_lengths` by @agoose77 in #2093
- fix: broken links due to cpp split by @agoose77 in #2087
- fix: unflatten should accept non-packed `counts` by @agoose77 in #2097
- fix: remove string casting from `ak.to_layout` by @agoose77 in #2098
- fix: support categorical counts in `ak.unflatten` by @agoose77 in #2099
- fix: use pickleable closure for `ak.mixin_class_method` by @agoose77 in #2102
- fix: be more permissive with sort translation by @agoose77 in #2112
- fix: merging 1D `NumpyArray` with option by @agoose77 in #2105
- fix: support `is_indexed` types in `ak.fill_none` by @agoose77 in #2111
- fix: use `object.__new__(ak.Array)` for pickling constructor by @agoose77 in #2113
- fix: remove Long64_t from common header-only by @ianna in #2084
- fix: `TypeTracerArray` binary operators, `ak.Array.__str__` with a typetracer, attempts to call `touch_data` on non-typetracers, ...? by @jpivarski in #2115
- fix: add `ScalarType` and treat bare strings as char arrays by @agoose77 in #2116
- fix: ensure Exception if branch evaluates for Awkward type by @agoose77 in #2019
Other
- refactor: add `@final` to contents, types, and forms by @agoose77 in #2033
- refactor: remove `kind` and `order` args to sorting protocols by @agoose77 in #2090
- docs: remove reference to sorting implementation by @agoose77 in #2114
- test: fix on win32 by @agoose77 in #2117
- ci: remove link checker by @agoose77 in #2075
- chore: update pyodide-build by @agoose77 in #2060
- chore: update pre-commit hooks by @pre-commit-ci in #2039
- chore(deps): bump pypa/cibuildwheel from 2.11.3 to 2.11.4 by @dependabot in #2038
- chore(deps): bump mymindstorm/setup-emsdk from 11 to 12 by @dependabot in #2119
Full Changelog: v2.0.5...v2.0.6
Version 2.0.5
New features
(none!)
Bug-fixes and performance
- fix: remove unused keyword arg by @agoose77 in #2046
- fix: support `regular_to_jagged` in `Content._recursively_apply`/`ak.transform` by @agoose77 in #2048
- fix: read behavior from highlevel `ak.ArrayBuilder` by @agoose77 in #2052
- fix: rebuild invalid `check` pointers in `ArrayBuilder` by @agoose77 in #2055
Other
- docs: correct canonical URL by @agoose77 in #2040
- docs: correct docstring for `ak.metadata_from_parquet` by @agoose77 in #2050
- test: rename tests to use identifiers by @agoose77 in #2044
- chore: fix poor rename by @agoose77 in #2049
- chore: increase awkward-cpp version for #2055. by @jpivarski in #2056
Full Changelog: v2.0.4...v2.0.5
Version 2.0.4
This follows quickly on 2.0.3, which removed a feature and a function argument. Removing the feature is still the right thing to do (see the 2.0.3 release notes), but the function argument needs to go through a deprecation cycle, since libraries like dask-awkward pass arguments through to Awkward. Removing `flatten_records` as an argument introduces an error, even if the surviving case of `flatten_records=False` is desired.
This will also be a good exercise of the deprecation schedule in 2.x.
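As a rough illustration of why a pass-through argument needs a deprecation cycle, the sketch below keeps accepting a deprecated keyword, warns when it is given, and only rejects the value whose behavior no longer exists. The names `total`, `passthrough`, and `_UNSET` are hypothetical and are not Awkward Array's actual implementation.

```python
import warnings

_UNSET = object()  # sentinel: distinguishes "not passed" from an explicit value


def total(values, *, flatten_records=_UNSET):
    """Hypothetical reducer that still accepts a deprecated keyword."""
    if flatten_records is not _UNSET:
        if flatten_records:
            # The removed behavior cannot be honored any more.
            raise ValueError("flatten_records=True is no longer supported")
        # The surviving case keeps working, but callers are asked to drop it.
        warnings.warn(
            "flatten_records is deprecated and will be removed",
            DeprecationWarning,
            stacklevel=2,
        )
    return sum(values)


def passthrough(values, **kwargs):
    """Stand-in for a downstream library that forwards its arguments."""
    return total(values, **kwargs)
```

With this shape, `passthrough([1, 2, 3], flatten_records=False)` still returns a result and emits a `DeprecationWarning`, instead of failing because the keyword vanished.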
New features
(none!)
Bug-fixes and performance
Other
(none!)
Full Changelog: v2.0.3...v2.0.4
Version 2.0.3
Backward-incompatible changes
- The `flatten_records` argument of all reducers (`ak.all`, `ak.any`, ..., `ak.var`) has effectively been removed: setting it now raises an error (PR #2020). This argument applies a reducer to all contents of a record, merging fields, and it had to be removed to properly implement `axis=None`. The old default, `flatten_records=False`, is now the only behavior, and to get the equivalent of `flatten_records=True`, you can use ak.ravel:
ak.sum(array, flatten_records=True)
becomes
ak.sum(ak.ravel(array))
Note: yanked from PyPI in favor of 2.0.4.
New features
- feat: add data-touch reporting to the type-tracer. by @jpivarski in #2027
Bug-fixes and performance
- fix: extend TypeTracerArray with `__eq__`, `__ne__`, and `__array_ufunc__`. by @jpivarski in #2021
- fix: add support for Long64_t by @ianna in #2023
- fix: replace protocol with direct subclass by @agoose77 in #2029
- fix: support `UnknownLength` in `ak.types.ArrayType` by @agoose77 in #2031
- refactor!: use exclusively `axis=-1` reduction for `axis=None` by @agoose77 in #2020
Other
- refactor: add array comparison test helper by @agoose77 in #2024
- docs: add sitemap by @agoose77 in #2026
- ci: drop pages deployment by @agoose77 in #2025
- ci: fix flake8 warning by @agoose77 in #2030
- chore: update pre-commit hooks by @pre-commit-ci in #2022
Full Changelog: v2.0.2...v2.0.3
Version 2.0.2
Version 2.0.1
New features
Bug-fixes and performance
- fix: missed a NumpyArray.raw call without an underscore. by @jpivarski in #1993
- fix: add `Record.copy` by @agoose77 in #1996
- fix: support empty record arrays in `ak.to_numpy` by @agoose77 in #2012
- fix: widen input support for `ak.type()` by @agoose77 in #2009
Other
- docs: restore branch preview by @agoose77 in #1994
- docs: add note in README.md about pip installing through git. by @jpivarski in #2005
- docs: redirect paths for user-guide by @agoose77 in #2007
- docs: remove mention of numexpr by @agoose77 in #2011
- docs: first pass on subset of user guide by @agoose77 in #2010
- ci: output linkcheck information by @agoose77 in #1987
- ci: wip for deployment on AWS by @agoose77 in #2002
- chore: update pre-commit hooks by @pre-commit-ci in #1999
- chore: remove old `num` kernels by @agoose77 in #1998
- chore: add ABI version number to AwkwardForth by @jpivarski in #2001
Full Changelog: v2.0.0...v2.0.1
Version 2.0.0
Version 2.0.0 of Awkward Array
The Awkward Array version 2 project started in June of 2021 and has been developed alongside version 1 updates. For most of that time, it was available as a submodule, `awkward._v2`, so that it could be tested with the same tests as version 1 and could be experimented upon by early adopters.
The usual list of pull request titles would not be useful as release notes because the changes from 1.10.2 to 2.0.0 are too extensive. But here's a list of their PR numbers:
#884, #895, #896, #914, #957, #958, #959, #962, #977, #1025, #1031, #1036, #1045, #1059, #1063, #1072, #1073, #1074, #1079, #1082, #1092, #1099, #1101, #1109, #1110, #1111, #1116, #1117, #1119, #1121, #1122, #1123, #1124, #1125, #1130, #1131, #1132, #1134, #1135, #1137, #1138, #1140, #1141, #1142, #1143, #1145, #1146, #1147, #1148, #1149, #1150, #1153, #1154, #1156, #1159, #1160, #1161, #1162, #1164, #1165, #1183, #1184, #1201, #1203, #1204, #1206, #1207, #1211, #1214, #1215, #1217, #1218, #1219, #1220, #1221, #1222, #1225, #1226, #1227, #1228, #1229, #1233, #1234, #1240, #1242, #1245, #1248, #1259, #1270, #1276, #1279, #1289, #1290, #1292, #1293, #1294, #1296, #1297, #1300, #1301, #1304, #1306, #1307, #1309, #1312, #1317, #1321, #1327, #1329, #1338, #1340, #1346, #1347, #1351, #1352, #1354, #1355, #1356, #1359, #1360, #1364, #1365, #1367, #1368, #1369, #1370, #1372, #1373, #1374, #1376, #1378, #1380, #1381, #1383, #1384, #1385, #1387, #1390, #1392, #1393, #1394, #1395, #1397, #1398, #1399, #1401, #1404, #1407, #1408, #1409, #1410, #1412, #1413, #1415, #1416, #1418, #1419, #1421, #1422, #1425, #1426, #1427, #1428, #1429, #1430, #1431, #1432, #1433, #1434, #1435, #1437, #1440, #1443, #1444, #1445, #1446, #1447, #1449, #1455, #1456, #1457, #1458, #1462, #1464, #1465, #1467, #1468, #1469, #1470, #1474, #1475, #1476, #1478, #1484, #1485, #1486, #1487, #1490, #1491, #1492, #1493, #1494, #1496, #1497, #1498, #1499, #1502, #1503, #1505, #1508, #1510, #1513, #1514, #1515, #1516, #1518, #1519, #1520, #1521, #1523, #1524, #1527, #1531, #1532, #1533, #1535, #1536, #1537, #1538, #1539, #1540, #1541, #1542, #1543, #1544, #1550, #1555, #1556, #1559, #1560, #1561, #1562, #1564, #1565, #1566, #1567, #1568, #1572, #1573, #1576, #1579, #1581, #1589, #1593, #1597, #1598, #1602, #1603, #1604, #1605, #1607, #1609, #1610, #1613, #1614, #1615, #1616, #1617, #1618, #1619, #1620, #1621, #1625, #1627, #1629, #1632, #1636, #1641, #1642, #1645, #1649, #1650, #1651, #1652, #1653, #1661, #1665, 
#1666, #1671, #1673, #1674, #1675, #1677, #1679, #1689, #1691, #1692, #1695, #1698, #1699, #1700, #1706, #1708, #1712, #1715, #1716, #1717, #1721, #1722, #1723, #1725, #1730, #1731, #1732, #1733, #1739, #1740, #1743, #1744, #1746, #1748, #1749, #1750, #1751, #1752, #1754, #1757, #1758, #1759, #1760, #1761, #1763, #1768, #1769, #1770, #1773, #1774, #1776, #1777, #1779, #1781, #1783, #1787, #1788, #1795, #1796, #1797, #1798, #1800, #1801, #1803, #1804, #1811, #1812, #1813, #1815, #1816, #1822, #1825, #1826, #1827, #1829, #1830, #1831, #1832, #1835, #1836, #1837, #1838, #1841, #1844, #1845, #1848, #1851, #1852, #1853, #1854, #1856, #1857, #1858, #1859, #1860, #1861, #1863, #1867, #1869, #1871, #1873, #1876, #1877, #1878, #1880, #1881, #1891, #1892, #1894, #1895, #1897, #1898, #1900, #1905, #1907, #1908, #1911, #1912, #1913, #1915, #1919, #1920, #1921, #1922, #1928, #1930, #1934, #1938, #1939, #1940, #1942, #1943, #1946, #1948, #1949, #1950, #1951, #1952, #1953, #1954, #1955, #1956, #1959, #1960, #1962, #1965, #1966, #1968, #1970, #1971, #1972, #1974, #1976, #1977, #1979, #1981, #1982, #1983, #1985, #1986
Full Changelog: v1.10.2...v2.0.0
Despite the long list of PRs, the high-level interface changes from version 1 to version 2 were kept to a minimum. For the most part, the Awkward 1.x API is fine, but the internal implementation needed an overhaul to prevent technical debt.
The work was done by the Awkward Array developers:
- @agoose77
- @henryiii
- @ianna
- @ioanaif
- @jpivarski
- @ManasviGoyal
- @swishdiff
In particular, most of the translation from version 1 to version 2 was the work of @ioanaif, the build/deployment was from @henryiii and @agoose77, the Awkward-RDataFrame bridge and other C++ interface from @ianna, GrowableBuffer/LayoutBuilder from @ManasviGoyal, and the CUDA and JAX foundations were laid by @swishdiff.
Additionally, we had help from:
Summary of changes
Nearly all of the code is written in Python now. Exceptions are the "kernel" functions, GrowableBuffer, LayoutBuilder, ArrayBuilder, AwkwardForth, and dynamically generated C++ code for RDataFrame.
Maintains performance because any algorithms that scale with the size of an array are implemented in compiled "kernel" functions.
Split into two packages: `awkward-cpp` for the C++ part (infrequently updated, binary distribution for most platforms and Python versions) and `awkward`, the Python part (frequently updated).
Virtual arrays and Partitions (collectively, "lazy arrays") have been removed in favor of dask-awkward.
Awkward Arrays can be converted to and from RDataFrame, generating C++ for ROOT to JIT-compile so that iteration over Awkward Array input is fast (adapted from the Numba implementation).
Auto-differentiation of functions on Awkward Arrays using JAX. (But not JAX JIT-compilation.)
Suite of header-only C++ that does not depend on Awkward Arrays, but can be used to produce them and quickly get them from C++ to Python. The header-only suite includes GrowableBuffer and LayoutBuilder.
New documentation website (https://awkward-array.org/), based on JupyterBooks and following the NumPy/SciPy/Pandas style and organization, with a notebook that can be executed in your web browser.
More expressive error-messages, highlighting the `ak.*` function that was in progress when the error occurred, with its arguments. (That is, highlighting `ak.*` functions as the granularity of feedback to users of Awkward Array, rather than making you search through the stack trace to the hand-off from your code to ours.)
Brackets are always balanced in the console representation of an array:
>>> ak.Array([
... [{"x": 1.1, "y": [1]}, {"x": 2.2, "y": [1, 2]}],
... [],
... [{"x": 3.3, "y": [1, 2, 3]}],
... ])
<Array [[{x: 1.1, y: [1]}, {...}], ...] type='3 * var * {x: float64, y: var...'>
as opposed to
<Array [[{x: 1.1, y: [1]}, ... y: [1, 2, 3]}]] type='3 * var * {"x": float64, "y...'>
in version 1. Also, `show` methods for values
[[{x: 1.1, y: [1]}, {x: 2.2, y: [1, 2]}],
[],
[{x: 3.3, y: [1, 2, 3]}]]
and types
3 * var * {
x: float64,
y: var * int64
}
This extended `show` output is the default representation in Jupyter.
Round-trip fidelity in `ak.to_arrow`/`ak.from_arrow`: no Awkward Array metadata is lost. Same for `ak.to_parquet`/`ak.from_parquet`, to the extent that pyarrow can read and write Parquet.
Parquet column selection using wildcards.
Data exported with version 1 `ak.to_buffers` can be imported by version 2 `ak.from_buffers`, with custom `buffer_keys`.
The majority of version 1 tests have been ported to version 2, to ensure that the interface and functionality don't change, except where intended (e.g. organizing naming conventions).
Consistent handling of date-time and time-delta types (matches NumPy's system).
Improved `ak.to_json`/`ak.from_json` arguments (for converting non-JSON types NaN, infinity, complex numbers) and using a known JSONSchema to accelerate `ak.from_json`. Removed ambiguities about newline-delimited JSON (requires explicit argument).
The world's fastest Avro file reader in Python, `ak.from_avro_file` (uses AwkwardForth).
"nan" versions of NumPy functions, such as `np.nansum`, `np.nanmean`, `np.nanstd`.
Renamed `ak.to_pandas` → `ak.to_dataframe`, to clarify distinction from awkward-pandas.
Organized `Type` and `Form` objects better, more consistent.
Clear specification of NumPy dtypes that can be used in Awkward Arrays (bool, numbers, including complex, and date-time/time-delta).
Organized naming conventions throughout the codebase, such as `keys` versus `fields` versus `recordlookup`.
Carefully examined the public API (all modules, functions, classes, and methods that don't start with an underscore) to be sure that we can support it going forward. Any removal or change of an interface will require a minor version number increase and a deprecation cycle, on the order of months. (New features and bug-fixes can be immediate, on patch releases.)
Flags and "configuration" function arguments are now keyword-only (order independent).
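In plain Python, keyword-only arguments are declared by placing them after a bare `*` in the signature. The function below is a hypothetical illustration of that convention, not an Awkward Array function.

```python
def reduce_total(values, *, skip_none=False, scale=1):
    """Hypothetical function: everything after `*` must be passed by name."""
    kept = [v for v in values if not (skip_none and v is None)]
    return scale * sum(kept)


# Flags can be given in any order, but only by name.
result = reduce_total([1, None, 2], scale=10, skip_none=True)

# Passing a flag positionally is rejected with a TypeError.
try:
    reduce_total([1, 2], True)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Because the flags cannot be passed positionally, new ones can be added later without silently changing the meaning of existing calls.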
Started adding Python type hints (nowhere near complete, but started).
Removed the `Identities` from array nodes. They were never fully implemented: a placeholder for a feature that won't be developed within Awkward Array (SQL-style JOINs).
TypeTracerArray does a "dry run" of a calculation to predict its type at the end. Used to build a computation graph for dask-awkward.
Equivalent but ungainly type combinations, such as "option-type of option-type of X" or "union-type containing union-types," have been outlawed with tools to squash them into a canonical layout. Operations on the data now have fewer possibilities to worry about.
Simplified the semantics of `nbytes`.
Clarified `ak.ravel` and `ak.flatten`'s treatments of missing data.
Added missing ArrayBuilder methods in Numba.
Set up framework for performing `ak.*` operations i...
Version 2.0.0rc8
This will very likely be the last pre-release before the final 2.0.0 release. If all goes well, that will be six hours after now. ("Now" is 16:00 UTC, December 9, 2022, so the final release will likely be at 22:00 UTC.)
New features
(none!)
Bug-fixes and performance
- fix!: always broadcast `with_field` assignments against existing array by @agoose77 in #1962
- fix: replace `axis_wrap_if_negative` with `maybe_posaxis`, simpler and more correct by @jpivarski in #1986
Other
- refactor: hide L3 API by @agoose77 in #1983
- docs: revamp README.md, CONTRIBUTING.md, and move 'papers and talks'. by @jpivarski in #1985
Full Changelog: v2.0.0rc7...v2.0.0rc8
Version 2.0.0rc7
New features
(none!)
Bug-fixes and performance
- fix: EmptyArray.is_numpy should be False. by @jpivarski in #1971
- fix: add return_value='simplified' to ak.transform and revamp ak.firsts/ak.singletons by @jpivarski in #1968
- fix: don't try to support Awkward 1.x pickles. by @jpivarski in #1974
- fix: ak_from_parquet by @ioanaif in #1977
- fix: ak.Record dict constructor should retain type. by @jpivarski in #1981
Other
- refactor: rename `Form` to `form_cls` by @agoose77 in #1976
- refactor: hide Content recursion entry points in `ak._do` submodule. by @jpivarski in #1972
- docs: Jim's documentation touch-ups (API sidebar, obsolete kernels intro) by @jpivarski in #1982
- ci: build(deps): bump pypa/gh-action-pypi-publish from 1.5.1 to 1.6.1 by @dependabot in #1948
- ci: build(deps): bump pypa/cibuildwheel from 2.11.2 to 2.11.3 by @dependabot in #1965
- ci: build(deps): bump pypa/gh-action-pypi-publish from 1.6.1 to 1.6.4 by @dependabot in #1970
- chore: fix RTD configuration by @agoose77 in #1979
Full Changelog: v2.0.0rc6...v2.0.0rc7