Skip to content

Python Polars 1.17.0

Compare
Choose a tag to compare
@github-actions github-actions released this 08 Dec 10:26
5f6bc77

🚀 Performance improvements

  • Add fast paths for series.arg_sort and dataframe.sort (#19872)
  • Much faster Series construction from subclasses of standard Python types (#20166)
  • Utilize the RangedUniqueKernel for Enum/Categorical (#20150)
  • Reduce memory copy when scanning from Python objects (#20142)
  • Construct Series for bytes/binary data 10x faster when dtype not explicitly set (#20157)
  • Don't instantiate validity mask when unneeded in Parquet (#20149)

✨ Enhancements

  • Retry with reloaded credentials on cloud error (#20185)
  • Support reading Enum dtype from csv (#20188)
  • Improve dtype inference and load for DataFrame cols constructed from Python Enum values (#20180)
  • Allow sorting of lists and arrays (#20169)
  • Add maintain_order parameter to joins (#20026)
  • Allow for to_datetime / strftime to automatically parse dates with single-digit hour/minute/second (#20144)
  • Issue warning when using to_struct() without a list of field names (#20158)
  • Experimental cloud write support (#20129)
  • Add lazy support for pl.select (#20091)
  • Enable view arrow export in write_delta (#20092)

🐞 Bug fixes

  • Don't trigger length check in array construction (#20205)
  • Allow row encoding for 32-bit architectures (e.g. WASM) (#20186)
  • Properly project unordered column in parquet prefiltered (#20189)
  • Csv stop simd cache if eol char is hit (#20199)
  • Estimated size for object (#20191)
  • Respect parallel argument in parquet (#20187)
  • Only validate UTF-8 for selected items when all below len 128 (#20183)
  • Serialize categories of Enum in arrow metadata (#20181)
  • Don't use RLE encoding for Parquet Boolean (#20172)
  • Invalid bitwise_xor for ScalarColumn (#20140)
  • Series construct with large nested u64 (#20167)
  • Add temporal feature gate in is_elementwise_top_level (#20177)
  • Column name mismatch or not found in Parquet scan with filter (#20178)
  • Raise if apply returns different types (#20168)
  • Deal with masked out list elements (#20161)
  • Fix index out of bounds in uniform_hist_count (#20133)
  • Implement arg_sort for Null series (#20135)
  • Handle slice pushdown in PythonUDF GroupBy (#20132)
  • Check shape for *_horizontal functions (#20130)
  • Properly coerce types in lists (#20126)
  • Incorrect aggregation of empty groups after slice (#20127)
  • DataFrame .get_column after drop_in_place (#20120)
  • Subtraction with underflow on empty FixedSizeBinaryArray (#20109)
  • Materialize smallest dyn ints to use feature gate for i8/i16 (#20108)
  • Return null instead of 0. for rolling_std when window contains a single element and ddof=1 and there are nulls elsewhere in the Series (#20077)
  • Only slice after sort when slice is smaller than frame length (#20084)
  • Preserve Series name in __rpow__ operation (#20072)
  • Allow nested is_in() in when()/then() for full-streaming (#20052)

📖 Documentation

  • Add more Rust examples to User Guide (#20194)
  • Expand plotting docs (#19719)
  • Fix Rust examples in user guide (#20075)
  • Update by param description for rolling_*_by functions (#19715)
  • Correct supported compression formats (#20085)
  • Specify strictness in cast (#20067)

📦 Build system

  • Upgrade sqlparser-rs from version 0.49 to 0.52 (#20110)
  • Bump memmap2 to version 0.9 (#20105)
  • Bump object_store to version 0.11 (#20102)
  • Bump fs4 to version 0.12 (#20101)
  • Bump thiserror to version 2 (#20097)
  • Bump atoi_simd to version 0.16 (#20098)
  • Bump chrono-tz to 0.10 (#20094)
  • Update Rust dependency ndarray to 0.16 (#20093)
  • Bump Rust toolchain to nightly-2024-11-28 (#20064)

🛠️ Other improvements

  • Deprecate ddof parameter for correlation coefficient (#20197)
  • Move Bitwise aggregations to FunctionExpr (#20193)
  • Add ragged lines test (#20182)
  • Set delta version check higher (#20153)
  • Fix typo in assertion in datatype copy test (#20121)
  • Move horizontal methods to polars-ops (#20134)
  • Remove useless SeriesTrait::get implementations (#20136)
  • Add a bunch more automated row encoding sortedness tests (#20056)

Thank you to all our contributors for making this release possible!
@DzenanJupic, @MarcoGorelli, @YichiZhang0613, @alexander-beedie, @coastalwhite, @dependabot, @dependabot[bot], @flowlight0, @henryharbeck, @iharthi, @ion-elgreco, @jqnatividad, @lukapeschke, @lukemanley, @mcrumiller, @nameexhaustion, @ptiza, @ritchie46, @siddharth-vi, @stijnherfst, @stinodego and @wsyxbcl