A stupidly small and fast programming language detection model.
import { detectLanguage } from 'hylang'

const input = `
function square(x) {
  return x * x
}
`
console.log(`Predicted language: ${detectLanguage(input)}`)
The Hylang evaluation set is sampled from the starcoderdata dataset.
Hylang is 30.4 kB packed and achieves 74.5% accuracy.
The language detector is implemented as a bag-of-words model trained on the starcoderdata dataset.
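To make the bag-of-words idea concrete, here is a minimal sketch of how such a classifier might score a snippet. The vocabulary, labels, weights, tokenizer, and the function names `tokenize`, `bagOfWords`, and `predict` are all hypothetical toy values chosen for illustration; they are not Hylang's real parameters or internals.

```javascript
// Hypothetical toy parameters, NOT Hylang's real weights or vocabulary.
const vocab = ['function', 'def', 'return', 'import', 'self']
const labels = ['javascript', 'python']
// weights[labelIndex][vocabIndex]: one row per language, one column per word.
const weights = [
  [2.0, -1.5, 0.3, 0.2, -1.0], // javascript
  [-1.5, 2.0, 0.3, 0.4, 1.5], // python
]
const bias = [0.0, 0.0]

// Split source text into word-like tokens (an assumed, simplistic tokenizer).
function tokenize(text) {
  return text.match(/[A-Za-z_]+/g) || []
}

// Count how often each vocabulary word occurs; word order is discarded.
function bagOfWords(text) {
  const counts = new Array(vocab.length).fill(0)
  for (const token of tokenize(text)) {
    const index = vocab.indexOf(token)
    if (index !== -1) counts[index] += 1
  }
  return counts
}

// Score each language as bias + dot(weights, counts) and return the best label.
function predict(text) {
  const counts = bagOfWords(text)
  let best = 0
  let bestScore = -Infinity
  for (let i = 0; i < labels.length; i++) {
    let score = bias[i]
    for (let j = 0; j < counts.length; j++) score += weights[i][j] * counts[j]
    if (score > bestScore) {
      bestScore = score
      best = i
    }
  }
  return labels[best]
}

console.log(predict('function square(x) { return x * x }')) // javascript
console.log(predict('def square(self, x):\n    return x * x')) // python
```

Because only word counts matter, the model needs no parser and a single pass over the input, which is consistent with the small size and speed described above.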
Training is done in Python with PyTorch. Model weights are exported to params.json
so they can be used in JavaScript.
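As a rough illustration of consuming exported weights on the JavaScript side, the sketch below parses a JSON string and builds a classifier from it. The layout (`labels`, `vocab`, `weights`, `bias`) and the names `paramsJson`, `loadModel`, and `classify` are assumptions made for this example; the actual structure of params.json is not described here and may differ.

```javascript
// Stand-in for the contents of params.json. This layout is an assumption:
// the real file's keys and shapes are not documented in this README.
const paramsJson = JSON.stringify({
  labels: ['javascript', 'python'],
  vocab: { function: 0, def: 1, const: 2, self: 3 },
  weights: [
    [1.8, -1.2, 1.1, -0.9],
    [-1.4, 1.9, -0.7, 1.6],
  ],
  bias: [0.1, -0.1],
})

// Parse exported weights and return a classifier closed over them.
function loadModel(jsonText) {
  const { labels, vocab, weights, bias } = JSON.parse(jsonText)
  // A Map avoids accidental hits on inherited object properties.
  const vocabIndex = new Map(Object.entries(vocab))
  return function classify(text) {
    const scores = bias.slice()
    for (const token of text.match(/[A-Za-z_]+/g) || []) {
      const column = vocabIndex.get(token)
      if (column === undefined) continue // word not in the vocabulary
      for (let row = 0; row < labels.length; row++) {
        scores[row] += weights[row][column]
      }
    }
    return labels[scores.indexOf(Math.max(...scores))]
  }
}

const classify = loadModel(paramsJson)
console.log(classify('const f = function () {}')) // javascript
```

In Node.js the JSON text could instead come from `fs.readFileSync('params.json', 'utf8')`; keeping the weights in plain JSON means no PyTorch dependency is needed at inference time.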