Hub Miner v1.1

@datapoet datapoet released this 22 Jan 15:28
· 5 commits to master since this release

This version offers some improvements over the first official release of Hub Miner, a hubness-aware machine learning library for high-dimensional data analysis.

This is new in v1.1:

  • BibTeX support for all algorithm implementations, making all of them easy to reference (via the algref package).
  • Two more hubness-aware approaches (meta-metric-learning and feature construction).
  • An implementation of Hit-Miss networks for analysis.
  • Several minor bug fixes.
  • The following instance selection methods were added: HMScore, Carving, Iterative Case Filtering, ENRBF.
  • The following clustering quality indexes were added: Fowlkes-Mallows, Calinski-Harabasz, PBM, G+, Tau, Point-Biserial, Hubert's statistic, McClain-Rao, C-root-k.
  • Some more experimental scripts have been included.
  • Extensions to the estimation of hubness risk.
  • Alias and weighted reservoir methods for weight-proportional random selection.
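
For readers unfamiliar with the last item, the two techniques for weight-proportional random selection can be sketched as follows. This is an illustrative sketch only, not Hub Miner's actual implementation or API: the class name `WeightedSelectionSketch`, the nested `AliasTable`, and the method `weightedReservoir` are hypothetical names chosen for this example. It assumes Vose's variant of the alias method (draws with replacement) and the key-based A-Res scheme for the weighted reservoir (draws without replacement in one pass).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.PriorityQueue;
import java.util.Random;

/** Illustrative sketch; names are hypothetical and not part of Hub Miner. */
final class WeightedSelectionSketch {

    /** Vose's alias method: O(n) setup, O(1) per draw, with replacement. */
    public static final class AliasTable {
        private final double[] prob;  // chance of keeping the drawn column
        private final int[] alias;    // fallback index for each column

        public AliasTable(double[] weights) {
            int n = weights.length;
            double sum = 0;
            for (double w : weights) sum += w;
            if (n == 0 || sum <= 0) {
                throw new IllegalArgumentException("need a positive total weight");
            }
            prob = new double[n];
            alias = new int[n];
            double[] scaled = new double[n];
            Deque<Integer> small = new ArrayDeque<>();
            Deque<Integer> large = new ArrayDeque<>();
            for (int i = 0; i < n; i++) {
                scaled[i] = weights[i] * n / sum;  // average column height is 1
                if (scaled[i] < 1.0) small.push(i); else large.push(i);
            }
            // Pair each under-full column with an over-full one.
            while (!small.isEmpty() && !large.isEmpty()) {
                int s = small.pop();
                int l = large.pop();
                prob[s] = scaled[s];
                alias[s] = l;
                scaled[l] = (scaled[l] + scaled[s]) - 1.0;
                if (scaled[l] < 1.0) small.push(l); else large.push(l);
            }
            // Leftovers are full columns (up to rounding error).
            while (!large.isEmpty()) prob[large.pop()] = 1.0;
            while (!small.isEmpty()) prob[small.pop()] = 1.0;
        }

        public int sample(Random rng) {
            int column = rng.nextInt(prob.length);
            return rng.nextDouble() < prob[column] ? column : alias[column];
        }
    }

    /**
     * A-Res weighted reservoir: one pass, returns up to k distinct indices.
     * Each item gets key u^(1/w); the k largest keys are kept.
     */
    public static int[] weightedReservoir(double[] weights, int k, Random rng) {
        if (k <= 0) return new int[0];
        // Min-heap of {key, index} pairs ordered by key.
        PriorityQueue<double[]> heap =
                new PriorityQueue<>((a, b) -> Double.compare(a[0], b[0]));
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] <= 0) continue;  // zero-weight items are never selected
            double key = Math.pow(rng.nextDouble(), 1.0 / weights[i]);
            if (heap.size() < k) {
                heap.add(new double[] {key, i});
            } else if (key > heap.peek()[0]) {
                heap.poll();
                heap.add(new double[] {key, i});
            }
        }
        int[] out = new int[heap.size()];
        int j = 0;
        for (double[] entry : heap) out[j++] = (int) entry[1];
        return out;
    }
}
```

The alias table suits repeated draws from a fixed weight vector, while the reservoir suits a single pass over items whose total count or total weight is not known in advance.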

Hub Miner offers detailed experimental frameworks for supervised and unsupervised learning, as well as for metric learning, data reduction, and learning with feature or label noise. Many hubness-aware approaches are available for experimentation, along with a decent set of standard baselines for comparison.

This release of Hub Miner is OpenML-compatible: classification experiments can be run as networked experiments via OpenML, fetching the data and the splits and uploading the raw results.

Hub Miner supports many standard data formats and also offers basic support for handling textual and image data.

Exploratory analysis and data visualization tools are available for many aspects of data analysis, with an emphasis on evaluating the consequences of high dimensionality and hubness in particular.
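
As a rough illustration of what evaluating hubness involves, the sketch below computes the k-occurrence count of each point (how often it appears in other points' k-nearest-neighbor lists) and the skewness of that distribution, which is a common summary of hubness. It is a minimal brute-force sketch under stated assumptions, not Hub Miner's own code: the class `HubnessSketch` and its methods are hypothetical names, and plain Euclidean distance is assumed.

```java
import java.util.Arrays;

/** Illustrative sketch; names are hypothetical and not part of Hub Miner. */
final class HubnessSketch {

    /** Counts how many times each point occurs in other points' k-NN lists. */
    static int[] kOccurrences(double[][] data, int k) {
        int n = data.length;
        int[] counts = new int[n];
        for (int i = 0; i < n; i++) {
            final double[] dist = new double[n];
            Integer[] order = new Integer[n];
            for (int j = 0; j < n; j++) {
                order[j] = j;
                // A point is never its own neighbor.
                dist[j] = (i == j) ? Double.POSITIVE_INFINITY
                                   : squaredEuclidean(data[i], data[j]);
            }
            Arrays.sort(order, (a, b) -> Double.compare(dist[a], dist[b]));
            for (int r = 0; r < Math.min(k, n - 1); r++) {
                counts[order[r]]++;
            }
        }
        return counts;
    }

    static double squaredEuclidean(double[] a, double[] b) {
        double sum = 0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    /** Third standardized moment; large positive values indicate a few strong hubs. */
    static double skewness(int[] counts) {
        int n = counts.length;
        double mean = 0;
        for (int c : counts) mean += c;
        mean /= n;
        double m2 = 0;
        double m3 = 0;
        for (int c : counts) {
            double d = c - mean;
            m2 += d * d;
            m3 += d * d * d;
        }
        m2 /= n;
        m3 /= n;
        return m2 == 0 ? 0 : m3 / Math.pow(m2, 1.5);
    }
}
```

The brute-force neighbor search is quadratic in the number of points, which is acceptable for a small exploratory sample but not for large data sets.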
