You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
A lot of features matrices in practice have small number of non-zero entries per row. E.g. data that come from one-hot encoding have exactly one non-zero entry per row. These can be handled nicely by CategoricalMatrix if all the non-zero entries are one. However, this is not always the case, e.g. data that comes from sklearn.preprocessing.SplineTransformer. These would be nicely supported by ELLPACK format which is a natural generalization of CategoricalMatrix.
Another option is to support Sliced Ellpack (SELL) format which can support general sparse matrix relatively well and make SplitMatrix consists of just a dense matrix and a SELL matrix.
The text was updated successfully, but these errors were encountered:
A lot of features matrices in practice have small number of non-zero entries per row. E.g. data that come from one-hot encoding have exactly one non-zero entry per row. These can be handled nicely by
CategoricalMatrix
if all the non-zero entries are one. However, this is not always the case, e.g. data that comes from sklearn.preprocessing.SplineTransformer. These would be nicely supported by ELLPACK format which is a natural generalization ofCategoricalMatrix
.Another option is to support Sliced Ellpack (SELL) format which can support general sparse matrix relatively well and make
SplitMatrix
consists of just a dense matrix and a SELL matrix.The text was updated successfully, but these errors were encountered: