[Website] Improve project description #44474

Closed
ianmcook opened this issue Oct 18, 2024 · 4 comments

@ianmcook
Member

ianmcook commented Oct 18, 2024

The Apache Arrow project descriptions that appear prominently at the top of the website and the GitHub repo do not match, and neither has been updated in quite some time. Currently the description on the website is:

A cross-language development platform for in-memory analytics

and the description on GitHub is:

A multi-language toolbox for accelerated data interchange and in-memory processing

Given the immense growth in the adoption of Arrow since we last updated these descriptions, and the current status of the Arrow format as a de facto standard with no directly comparable alternatives, I think it would be appropriate for us to be somewhat bolder in how we introduce the project. I also think the description should mention that Arrow is a format in addition to a toolbox. And I think we should prefer simpler words ("fast" over "accelerated"; "toolbox" over "development platform").

Following this rationale, I propose that we change the description on both the website and GitHub to:

The universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics

Thoughts?

Component(s)

Website

@kou
Member

kou commented Oct 19, 2024

+1

(We may want to add "zero-copy" as a columnar format modifier.)

@ianmcook
Member Author

(We may want to add "zero-copy" as a columnar format modifier.)

I agree that it's important to highlight the fact that Arrow can enable zero-copy data interchange. But it might be difficult to incorporate "zero-copy" into this succinct description in a way that is accurate. Many successful applications of Arrow for data interchange are not truly "zero-copy"; instead they minimize the number of copies made while eliminating slow, computationally expensive serialization/deserialization and transposition steps. That is too much to say in a succinct description, so I think we would be better off explaining it in other text below the description (which we already do to some extent, although maybe it could be improved).

@kou
Member

kou commented Oct 21, 2024

It makes sense.

@ianmcook ianmcook added this to the 19.0.0 milestone Oct 21, 2024
kou pushed a commit to apache/arrow-site that referenced this issue Oct 22, 2024
Part 1 of 2 of apache/arrow#44474.

This updates the Apache Arrow project description that appears
prominently at the top of the [website](https://arrow.apache.org).
kou pushed a commit that referenced this issue Oct 22, 2024
Part 2 of 2 of #44474.

This updates the Apache Arrow project description that appears in the GitHub repo **About** information.
* GitHub Issue: #44474

Authored-by: Ian Cook
Signed-off-by: Sutou Kouhei
@kou
Member

kou commented Oct 22, 2024

Issue resolved by pull request #44492

@kou kou closed this as completed Oct 22, 2024
kou pushed a commit that referenced this issue Oct 24, 2024
…44522)

This is a follow-up to apache/arrow-site#549 and
#44492. This updates the project
description in a few other places where it appears prominently in the
website and docs.
* GitHub Issue: #44474