Vector arithmetic #153
base: dev
Hi Daniel,
thank you very much for your contribution. df.Math is a very interesting function, but is it possible for you to do a little rework? I don't want to reduce it to only math.
- A footprint like
  Apply2Element(resultcol string, f func(e1, e2 Element) Element, col1, col2 string) DataFrame
  gives more flexibility and can work even on string Series.
- Maybe
  Apply2Float64(resultcol string, f func(e1, e2 float64) Element, col1, col2 string) DataFrame
  is also handy, especially in combination with the "math" package.
- Automatic coercion can cause subtle bugs in user code. Please don't use it. If needed, this can be done in f.
- With df.Filter it is already possible to select one or more rows of a DataFrame.
- Maybe we should add a df.Head function to select only the first.
@@ -12,6 +12,8 @@ This project adheres to [Semantic Versioning](http://semver.org/).
- Combining filters with AND
- User-defined filters
- Concatination of Dataframes
- Math for vector operations on multiple columns
Please move it to the 0.12.0 section
Will do 👌
Thanks for your comments. I agree this can be reworked to accommodate more than math. However, I'd be very sorry to see the math-specific string flavors of
Counter-proposal: What if instead we split
No coercion? Is the request not to do automatic coercion a gota policy set in stone? I thought there was already automatic coercion in gota. Are you sure you wouldn't want it in, when it's only automatic coercion in one direction ( There's also type coercion in Pandas and R, and people seem to be able to handle it. I think a nice pile of warnings in the documentation would suffice, at least for me, and what we get for it is agility and API clarity. (And the reason I'm using gota in the first place is because idiomatic Go doesn't let me express a complex high level thought succinctly enough to do it often.) All of that said, I'm flexible here, and it's your project. :)
FindElem
As for
As an aside, for when there are many rows, I was thinking of adding
Thank you for your detailed explanation. Let me explain my opinion about coercion. In many cases it can be a great thing. It is especially useful when dealing with AI, because sooner or later all variables are floats and losing one or two digits of precision is not a problem. In other use cases this would not be acceptable; think of financial services. And it can cause bugs like #154. That bug was in gota; other bugs can be in user code. It's all about the use case. The idea of your counter-proposal is good. You know my opinion about coercion, but let's give it a try. Go for it.
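The precision concern can be made concrete with a small sketch. It assumes nothing about gota; it only shows what a silent int64-to-float64 conversion does to a large integer, such as an amount stored in the smallest currency unit. The helper roundTrip is illustrative.

```go
package main

import "fmt"

// roundTrip converts an int64 to float64 and back, mimicking what an
// automatic int-to-float coercion would do to a stored integer value.
func roundTrip(n int64) int64 {
	return int64(float64(n))
}

func main() {
	// float64 has a 53-bit significand, so 2^53+1 is not representable.
	big := int64(1)<<53 + 1
	fmt.Println(big, roundTrip(big), big == roundTrip(big))
	// prints: 9007199254740993 9007199254740992 false
}
```

Small integers survive the round trip unchanged, which is why this kind of bug tends to show up only with large values.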
Hmm... I admit, in my particular case, the only reason I'm using Go is because I have to. So at least one benefit would be that people don't have to leave their preferred language (which is great for other purposes) to manipulate tabular data in a readable way. The company I work for has services written in Go, and we need a compact way to express quite a large collection of high level operations on tables. I don't think idiomatic Go is the best language for doing that (partly because of static typing, but mostly because the
Do you mean "go for it" as-written, or without any automatic
Adds new Math method to dataframe.DataFrame capable of computing n-ary arithmetic functions against entire selected columns, storing the result in a new column (or replacing an existing one). Supports int and float64 types. Supports operator specification by string (e.g., "+", "/", etc.) or by a unary, binary, or ternary int or float64 function (e.g., for supplying a float64 function from Go's math package). For example:
There are more examples in the docs and tests.
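Selecting an operator by string, as described above, can be sketched as a lookup table of functions. This is only an illustration of the idea over plain float64 slices; the names binaryOps and mathColumns are hypothetical and do not reproduce the PR's Math method or its signature.

```go
package main

import "fmt"

// binaryOps maps an operator string to a float64 function. The table and
// its name are illustrative assumptions, not taken from the PR.
var binaryOps = map[string]func(a, b float64) float64{
	"+": func(a, b float64) float64 { return a + b },
	"-": func(a, b float64) float64 { return a - b },
	"*": func(a, b float64) float64 { return a * b },
	"/": func(a, b float64) float64 { return a / b },
}

// mathColumns (hypothetical) looks up op and applies it element-wise
// to two equally long columns.
func mathColumns(op string, col1, col2 []float64) ([]float64, error) {
	f, ok := binaryOps[op]
	if !ok {
		return nil, fmt.Errorf("unknown operator %q", op)
	}
	if len(col1) != len(col2) {
		return nil, fmt.Errorf("column lengths differ: %d vs %d", len(col1), len(col2))
	}
	out := make([]float64, len(col1))
	for i := range col1 {
		out[i] = f(col1[i], col2[i])
	}
	return out, nil
}

func main() {
	res, err := mathColumns("/", []float64{10, 9}, []float64{4, 3})
	if err != nil {
		panic(err)
	}
	fmt.Println(res) // [2.5 3]
}
```

An unknown operator is reported as an error rather than a panic, so a typo in the operator string fails at the call site.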
This PR also adds new FindElem method to dataframe.DataFrame which lets a user pull a particular series.Element out of a DataFrame by specifying a column and value to select a row (assumed to be unique), and another column to find a particular value within that row. For example, the following line will search through the "Metric" column of each row for a value "envoy_cluster_upstream_rq_active", and then it will return the series.Element from that row corresponding to the "Value" column:
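The lookup described above can also be sketched independently of gota. This is a minimal illustration over a plain slice of rows, not the PR's own example: the row type, the findElem helper and its signature are assumptions, and the sample values are made up.

```go
package main

import "fmt"

// row is one table row keyed by column name (a stand-in for a DataFrame row).
type row map[string]string

// findElem is a hypothetical stand-in for the lookup described in the PR.
// It returns the value in valueCol from the first row whose keyCol equals
// keyVal, and false if no such row or column exists.
func findElem(rows []row, keyCol, keyVal, valueCol string) (string, bool) {
	for _, r := range rows {
		if r[keyCol] == keyVal {
			v, ok := r[valueCol]
			return v, ok
		}
	}
	return "", false
}

func main() {
	// Sample data invented for illustration.
	rows := []row{
		{"Metric": "some_other_metric", "Value": "3"},
		{"Metric": "envoy_cluster_upstream_rq_active", "Value": "7"},
	}
	v, ok := findElem(rows, "Metric", "envoy_cluster_upstream_rq_active", "Value")
	fmt.Println(v, ok) // 7 true
}
```

The scan is linear in the number of rows and stops at the first match, which matches the stated assumption that the selected row is unique.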