@georgepar, follow the script below to extract the dataset. I will write a function that, given a word, returns its embedding. In the weights folder you will find the trained weights for each of the three experiments. The embeddings to use are glove42B.300d.
import numpy as np
# Bunch is importable from sklearn.utils; sklearn.datasets.base is gone in recent scikit-learn
from sklearn.utils import Bunch

from .utils import _get_as_pd


def fetch_MEN(which="all", form="natural"):
    """
    Fetch MEN dataset for testing similarity and relatedness

    Parameters
    ----------
    which : "all", "test" or "dev"
    form : "lem" or "natural"

    Returns
    -------
    data : sklearn.utils.Bunch
        dictionary-like object. Keys of interest:
        'X': matrix with one pair of words per row,
        'y': vector with scores

    Published at http://clic.cimec.unitn.it/~elia.bruni/MEN.html.
    """
    if which == "dev":
        data = _get_as_pd('https://www.dropbox.com/s/c0hm5dd95xapenf/EN-MEN-LEM-DEV.txt?dl=1',
                          'similarity', header=None, sep=" ")
    elif which == "test":
        data = _get_as_pd('https://www.dropbox.com/s/vdmqgvn65smm2ah/EN-MEN-LEM-TEST.txt?dl=1',
                          'similarity/EN-MEN-LEM-TEST', header=None, sep=" ")
    elif which == "all":
        data = _get_as_pd('https://www.dropbox.com/s/b9rv8s7l32ni274/EN-MEN-LEM.txt?dl=1',
                          'similarity', header=None, sep=" ")
    else:
        raise RuntimeError("Not recognized which parameter")

    if form == "natural":
        # Remove last two chars from first two columns
        data = data.apply(lambda x: [y if isinstance(y, float) else y[0:-2] for y in x])
    elif form != "lem":
        raise RuntimeError("Not recognized form argument")

    # np.float was removed from NumPy; the builtin float is equivalent here
    return Bunch(X=data.values[:, 0:2].astype("object"),
                 y=data.values[:, 2:].astype(float) / 5.0)
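
The comment above mentions a function that, given a word, returns its embedding. A minimal sketch of what that could look like is below. It assumes the GloVe vectors are stored in the usual plain-text layout (one token per line followed by its space-separated values). The names `load_glove` and `get_embedding`, the file name in the usage note, and the zero-vector fallback for unknown words are illustrative assumptions, not part of this repository.

```python
import io

import numpy as np


def load_glove(fileobj, vocab=None):
    """Parse GloVe-style text: each line is a token followed by its vector values.

    If `vocab` is given, only words in it are kept, which saves memory
    when just the dataset words are needed.
    """
    embeddings = {}
    for line in fileobj:
        if not line.strip():
            continue  # skip blank lines
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if vocab is not None and word not in vocab:
            continue
        embeddings[word] = np.asarray(values, dtype=np.float32)
    return embeddings


def get_embedding(word, embeddings, dim=300):
    """Return the vector for `word`; fall back to zeros if it is out of vocabulary.

    The zero-vector fallback is an assumption made for this sketch.
    """
    return embeddings.get(word, np.zeros(dim, dtype=np.float32))


# Tiny in-memory stand-in for the real file, using 3-dimensional vectors.
sample = io.StringIO("sun 0.1 0.2 0.3\nsunlight 0.2 0.1 0.4\n")
vectors = load_glove(sample)
known = get_embedding("sun", vectors, dim=3)
unknown = get_embedding("notaword", vectors, dim=3)
```

With the real vectors, the same functions could be called as `load_glove(open("glove.42B.300d.txt", encoding="utf-8"), vocab=set(data.X.ravel()))`, where the file name is assumed and `data` is the result of `fetch_MEN()`. Restricting the load to the dataset vocabulary avoids holding the full embedding table in memory.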