forked from facebookresearch/fastText
MLE-1053 rebase asapp fixes #4
Open: jweese-asapp wants to merge 29 commits into ASAPP-fixes from MLE-1053
Summary: In the class Dictionary of fastText, merge the two methods named `computeSubwords`, by making the third argument an optional pointer. Reviewed By: EdouardGrave Differential Revision: D9766500 fbshipit-source-id: ab12c432b371cf5b924660e12e79a5d7cea708e2
Summary: Hi everyone, and thanks for this wonderful library. I'm relatively new to it, and I found myself struggling a bit when trying to obtain reproducible results, e.g. in order to find the best parameters. I found the perfect answer in a 2016 issue here on your repo (facebookresearch#116) and I thought it could be useful to add it to the FAQs. I'm sending you two PRs:
- this one, in which I added the FAQ
- a second one, in which I modified the description in src/args.cc for the "thread" param
Of course feel free to choose which one to keep (or eventually to trash both of them). Thanks! Leonardo
Pull Request resolved: facebookresearch#633 Differential Revision: D9814563 Pulled By: EdouardGrave fbshipit-source-id: 83e4b7a7163b9013aef144dedd9b4bd5945bafdf
Summary: The first link at https://fasttext.cc/docs/en/pretrained-vectors.html doesn't work. This fixes it. Pull Request resolved: facebookresearch#590 Reviewed By: piotr-bojanowski Differential Revision: D9489391 Pulled By: EdouardGrave fbshipit-source-id: f1e1f0fe6a52d3d12d7a3dbf608848d68daa6c3f
Summary: Conforming to Facebook c++ style https://our.intern.facebook.com/intern/wiki/CppStyle Reviewed By: piotr-bojanowski Differential Revision: D10126506 fbshipit-source-id: 8389b652697addf7176d5d8defddbcb22dab3526
Summary: This diff adds a new fasttext command, `print-label-scores`, that displays the precision/recall score for each individual label. It gets the labels predicted above a given threshold and computes the scores from them. For example, the question "vinegar softens the bite of raw onions ?" has two labels: "vinegar" and "onions". If fastText predicts two labels above the threshold, "pickling" and "onions", we obtain: "onions" has a precision of 100%, "pickling" a precision of 0%, "onions" has a recall of 100%, and "vinegar" a recall of 0%. Reviewed By: EdouardGrave Differential Revision: D9991570 fbshipit-source-id: 63cff90f57659d51f5aa1f10243d40e253445aa6
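The arithmetic in the example above can be checked with a small sketch. This is plain Python, not the fastText implementation; the function name `per_label_scores` and its input format are assumptions made for illustration only.

```python
from collections import defaultdict


def per_label_scores(examples):
    """Compute per-label precision and recall.

    `examples` is an iterable of (gold_labels, predicted_labels) pairs,
    each a set of label strings (a hypothetical input format).
    """
    gold = defaultdict(int)       # times each label appears as a true label
    predicted = defaultdict(int)  # times each label is predicted
    correct = defaultdict(int)    # times a predicted label is also a true label
    for gold_labels, predicted_labels in examples:
        for label in gold_labels:
            gold[label] += 1
        for label in predicted_labels:
            predicted[label] += 1
            if label in gold_labels:
                correct[label] += 1
    # Precision is defined for labels that were predicted at least once,
    # recall for labels that occur at least once as a true label.
    precision = {label: correct[label] / n for label, n in predicted.items()}
    recall = {label: correct[label] / n for label, n in gold.items()}
    return precision, recall


# The example from the commit message: gold labels "vinegar" and "onions",
# predicted labels "pickling" and "onions".
precision, recall = per_label_scores(
    [({"vinegar", "onions"}, {"pickling", "onions"})]
)
```

With this input, `precision` maps "onions" to 1.0 and "pickling" to 0.0, and `recall` maps "onions" to 1.0 and "vinegar" to 0.0, matching the percentages quoted in the summary.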
Summary: The python binding for `predict` function was broken by the previous diff. The issue was reported here : facebookresearch#670 Reviewed By: EdouardGrave Differential Revision: D10868209 fbshipit-source-id: 77a2e38a74356973eedb28aa5fa348acd39c0aef
Summary: Pull Request resolved: facebookresearch#610 Reviewed By: EdouardGrave Differential Revision: D12900420 Pulled By: Celebio fbshipit-source-id: 7001549031dbdc904436ae2d2432470e0a5669ff
…tion issues on some platforms Summary: the issue was reported here : facebookresearch#666 Reviewed By: EdouardGrave Differential Revision: D12900614 fbshipit-source-id: 04303eb1442b0ab7956c5a4b56d9d57eeb004961
Summary: Recently, a diff from Celebio added a new feature "test-label" that calculates precision/recall/f1-score for every label. This is a very useful feature; however, it makes the FastText class overcomplicated. I propose a refactoring of the model testing and metrics calculation code. It introduces a MetricsAccumulator class, which is responsible for collecting stats on a dataset and calculating final metrics. Moving this functionality to a separate class simplifies the model testing code in the FastText class. The same FastText::test method can be used to compute both regular and per-label metrics. This design allows MetricsAccumulator to be extended to implement different types of metrics. As a result, it would be much easier to add other kinds of metrics in the future. Pull Request resolved: facebookresearch#672 Reviewed By: EdouardGrave Differential Revision: D12901046 Pulled By: Celebio fbshipit-source-id: 9dcf10de950e7fb9179c4400570d2fd7b9b1879c
…n in fasttext
Summary: This diff follows up on the pull-request diff `Refactor model testing and metrics code`:
- merging classes LabelMetricsAccumulator and MetricsAccumulator into one: Meter
- putting back removed function signatures in fasttext.h and marking them as deprecated
- removal of f1 score from results (that will be added again later)
- simplifying main.cc thanks to the new api
Reviewed By: EdouardGrave Differential Revision: D12903111 fbshipit-source-id: eb4116b207aad1713754c136e2a064e9517fdb57
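The design idea, one accumulator that serves both overall and per-label metrics, could look roughly like the sketch below. It is a hypothetical Python analogue, not the C++ `Meter` class from this diff: the method names `log`, `precision`, and `recall`, and the optional `label` argument, are assumptions made for illustration.

```python
import math
from collections import defaultdict


class Meter:
    """Hypothetical single accumulator for overall and per-label metrics."""

    def __init__(self):
        # Each counter is a list of [gold, predicted, correct].
        self._overall = [0, 0, 0]
        self._per_label = defaultdict(lambda: [0, 0, 0])

    def log(self, gold_labels, predicted_labels):
        """Record the gold and predicted label sets of one example."""
        for label in gold_labels:
            self._overall[0] += 1
            self._per_label[label][0] += 1
        for label in predicted_labels:
            hit = 1 if label in gold_labels else 0
            self._overall[1] += 1
            self._overall[2] += hit
            self._per_label[label][1] += 1
            self._per_label[label][2] += hit

    def _counts(self, label):
        # label=None selects the overall counters; unknown labels get zeros.
        if label is None:
            return self._overall
        return self._per_label.get(label, [0, 0, 0])

    def precision(self, label=None):
        _, predicted, correct = self._counts(label)
        return correct / predicted if predicted else math.nan

    def recall(self, label=None):
        gold, _, correct = self._counts(label)
        return correct / gold if gold else math.nan


meter = Meter()
meter.log({"vinegar", "onions"}, {"pickling", "onions"})
```

Because the same counters back both views, a caller can feed every test example through `log` once and then ask for `meter.precision()` or `meter.precision("onions")` without a second pass over the data. Returning NaN for a label that was never predicted is a choice made for this sketch only.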
Summary: fasttext binary has now the command test-label to display the score for each individual label. This diff adds the corresponding python binding. Reviewed By: EdouardGrave Differential Revision: D11589785 fbshipit-source-id: 809dd4a57750f05b68d6e576a58596b13fdc5d31
Summary: The coverage option allows compiling in coverage mode in order to get execution metrics. Reviewed By: EdouardGrave Differential Revision: D11659859 fbshipit-source-id: 0d831571e00fadf2002d6b074a89ff76fa7dcfe1
Summary: "compute precision/recall for each label" commit removed the function multilinePredict in the python bindings. However this causes performance issues. This diff is putting back the function by adapting it to the new api of `predict` function in fasttext.h. The issue was reported here : facebookresearch#673 Reviewed By: EdouardGrave Differential Revision: D12900565 fbshipit-source-id: 880cc428810e755021958e6427a5e6c4f2b43e79
Summary: Currently circleci tests fail for two reasons:
- when the vm tries to install a specific version of pybind for testing fasttext version on pypi
- when the vm has a compiler/stl version that needs explicit includes for stdexcept
Reviewed By: EdouardGrave Differential Revision: D12956910 fbshipit-source-id: 8272415b41d54d880a37777a81a316741a5b920f
Summary: Please be aware that this pull request was automatically created using [gtf](https://github.com/schneiderl/gtf) - a typo fixing bot. You should be able to merge this with no other problems. In case the proposed changes do not make sense I would be glad to hear about it. Pull Request resolved: facebookresearch#662 Reviewed By: piotr-bojanowski Differential Revision: D12959092 Pulled By: Celebio fbshipit-source-id: dcab01ffb1bad30e17f1ce9cad27d801edf66c99
Summary: The issue was reported here : facebookresearch#678 gcc 4.8.5 seems to not support `auto` as lambda parameters. Reviewed By: piotr-bojanowski Differential Revision: D13136421 fbshipit-source-id: e1770c80f78f1b6578b8750059fe8c9220265f24
Summary: Suggested by user willianpaixao : facebookresearch#674 Reviewed By: piotr-bojanowski Differential Revision: D13136722 fbshipit-source-id: 4ea07342ed659d312280fd6d9087376e9c9a82d0
Summary: This diff removes the print capabilities from fasttext and defines a new api.
- `predictLine` extracts predictions from exactly one line of the input stream.
- the deprecated `printLabelStats` is removed as [js bindings don't use it](https://www.facebook.com/groups/1174547215919768/?multi_permalinks=2328051983902613&comment_id=2360179150689896)
- `ngramVectors` is now deprecated by the addition of `getNgramVectors`. The `Vector` class remains copy-free but move semantics have been added.
- `analogies` is now deprecated by `getAnalogies`. When called, the fastText class lazy-precomputes word vectors.
- `findNN` is now deprecated by `getNN`. When called, the fastText class lazy-precomputes word vectors.
- `trainThread` and `printInfo` functions are now private.
- `supervised`, `cbow`, `skipgram`, `selectEmbeddings`, `precomputeWordVectors` are now deprecated and will be private in the future.
- `saveVectors`, `saveOutput` and `saveModel` without arguments are now deprecated by their equivalent with filename as string argument.
Reviewed By: EdouardGrave Differential Revision: D13083799 fbshipit-source-id: f557ed7c141a90a6171045fe118ac16c195c824f
Summary: In some environments, `python setup.py install` fails to install pybind11. The solution is given by `pybind11`'s repository and consists of calling `pip install pybind11` via subprocess. The issue and the solution were reported here: facebookresearch#512 Reviewed By: EdouardGrave Differential Revision: D13167381 fbshipit-source-id: 4ee7835a07e503d00728857242e085bc7de53c14
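A minimal sketch of the workaround described above, assuming it runs near the top of a `setup.py`. The helper names `pip_install_command` and `ensure_module` are hypothetical; the actual change in this commit is not shown on this page and may be structured differently.

```python
import importlib
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command tied to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]


def ensure_module(module_name, package=None):
    """Import `module_name`, installing it with pip first if it is missing.

    Returns True if an install was attempted, False if the module was
    already importable.
    """
    try:
        importlib.import_module(module_name)
        return False
    except ImportError:
        # Run pip in a subprocess so the install uses the same interpreter.
        subprocess.check_call(pip_install_command(package or module_name))
        importlib.invalidate_caches()
        importlib.import_module(module_name)
        return True
```

In a build script, a call such as `ensure_module("pybind11")` would go before the first `import pybind11`. Using `sys.executable -m pip` rather than a bare `pip` keeps the install inside the active environment.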
Summary: Argument names were missing from the fasttext.h file making it harder to read as an api. This commit adds their names. `loadVectors` function's argument is now a `const std::string &` instead of `std::string`. Reviewed By: EdouardGrave Differential Revision: D13180989 fbshipit-source-id: 81b63763047514ff13b60eb0cf7992601d33f188
Summary: The buffer vector should be normalized when added to the query vector. Reviewed By: EdouardGrave Differential Revision: D13192638 fbshipit-source-id: faa46d339e7cc0d149ccc5826fa7197ccfd81635
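To illustrate what normalizing the buffer vector before adding it means, here is a small sketch in plain Python. It is not the fastText code: the helper name `add_normalized`, the small epsilon guarding against division by zero, and the A minus B query used in the example are all assumptions.

```python
import math


def norm(vector):
    """Euclidean (L2) norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vector))


def add_normalized(query, buffer, sign=1.0):
    """Return query + sign * buffer / ||buffer||.

    The epsilon avoids dividing by zero for an all-zero buffer
    (an assumption of this sketch).
    """
    scale = sign / (norm(buffer) + 1e-8)
    return [q + scale * b for q, b in zip(query, buffer)]


# Build a query from two word vectors of very different magnitudes.
query = [0.0, 0.0]
query = add_normalized(query, [3.0, 4.0])          # adds about [0.6, 0.8]
query = add_normalized(query, [0.0, 10.0], -1.0)   # subtracts about [0.0, 1.0]
```

Without the normalization, the second vector (norm 10) would dominate the first (norm 5). With it, each word contributes a unit-length direction to the query.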
Summary: The new option for the `loss` parameter computes the loss as the sum of the cross-entropy of each independent unit of the output. Reviewed By: EdouardGrave Differential Revision: D10853638 fbshipit-source-id: dc4c56e25c89c9da1a33bda1b29db781080794fd
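One reading of "a sum of cross-entropy of each independent unit" is a binary cross-entropy applied to a sigmoid of every output score, summed over all outputs. The sketch below follows that reading in plain Python. It is an assumption about the formulation, not the C++ code of this diff, and `one_vs_all_loss` is a hypothetical name.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def one_vs_all_loss(scores, target_indices):
    """Sum of independent binary cross-entropies, one per output unit.

    `scores` are raw output scores (one per label); `target_indices` is
    the set of indices of the true labels for this example.
    """
    loss = 0.0
    for index, score in enumerate(scores):
        p = sigmoid(score)
        if index in target_indices:
            loss += -math.log(p)        # unit should fire
        else:
            loss += -math.log(1.0 - p)  # unit should stay off
    return loss


# Two labels with uninformative scores; label 0 is the true one.
uninformed = one_vs_all_loss([0.0, 0.0], {0})
# Scores that agree with the target give a lower loss.
confident = one_vs_all_loss([4.0, -4.0], {0})
```

Because each unit is scored independently, several labels can be active for the same example, unlike a softmax where probabilities compete and sum to one.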
Summary: one-vs-all loss option is now available for python Reviewed By: piotr-bojanowski Differential Revision: D13232380 fbshipit-source-id: 08c7f500fd7206132d0905f33b79e2f3dc745db2
Summary: Pull Request resolved: facebookresearch#659 Reviewed By: piotr-bojanowski Differential Revision: D13318793 Pulled By: Celebio fbshipit-source-id: 3b8fd28172b0291b2df93a526de630b01df680e8
Summary: Re-licensing fastText to MIT Reviewed By: piotr-bojanowski Differential Revision: D13415080 fbshipit-source-id: 6708849531fe7559cde273a3024660bc8b3b3750
Summary: Hi, it looks like in some cases the `language` is not defined, which results in 404 links. This fix defaults the language to `en`. Pull Request resolved: facebookresearch#581 Reviewed By: piotr-bojanowski Differential Revision: D13397407 Pulled By: Celebio fbshipit-source-id: 5604039e9a4104ecadfbd8978ffe5a15317e5c56
Attempt 2. Strictly a rebase of fixes onto FB master branch.