-
Notifications
You must be signed in to change notification settings - Fork 837
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds documentation and example recommending Vec<ArrayRef> over ChunkedArray #6527
Conversation
…native to a ChunkedArray abstraction."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this example could be drastically simplified by using the iterators fully, e.g.
let values: Vec<f32> = batches.iter().flat_map(|x| x.column(0).as_primitive::<Float32Type>().values()).collect();
let values: Vec<&str> = batches.iter().flat_map(|x| x.column(1).as_string()).map(Option::unwrap).collect();
I can't see an obvious reason to collect into separate Vecs, this just adds additional overheads, and is less concise.
I think it is also worth highlighting that many use-cases won't need to cast down from the ArrayRef at all, with most kernels accept &dyn Array
.
arrow/examples/chunked_arrays.rs
Outdated
// chunked_array_by_index is an array of two Vec<ArrayRef> where each Vec<ArrayRef> is a column | ||
let mut chunked_array_by_index = [Vec::new(), Vec::new()]; | ||
for batch in &batches { | ||
for (i, array) in batch.columns().iter().enumerate() { | ||
chunked_array_by_index[i].push(array.clone()); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Collecting into separate Vec like this is unnecessary, see below
Thanks @tustvold - I've updated the text, simplified the example as suggested and moved it to arrow_array. (I also deleted the superfluous example I had created). |
@@ -21,7 +21,7 @@ | |||
|
|||
- [`builders.rs`](builders.rs): Using the Builder API | |||
- [`collect.rs`](collect.rs): Using the `FromIter` API | |||
- [`dynamic_types.rs`](dynamic_types.rs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated change - this example's description was missing from the README.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, thank you
Which issue does this PR close?
Closes #5295.
Rationale for this change
As described in issue.
What changes are included in this PR?
Adds documentation and an example
Are there any user-facing changes?
Only documentation and example.