How to create code2vec input #186
Comments
Hi @messiGao, I think there is some confusion here: the exception is raised by TensorFlow, while the java command that you mentioned does not involve TensorFlow at all. May I also ask what kinds of tasks you are looking into? Best,
I want to use the "--test" option to export <TEST_FILE>.vectors, but I don't know what kind of TEST_FILE is correct. When I asked GPT-4, the answer was to use the JavaExtractor to convert my test.java to test.txt.
Additionally, my aim is to store a Java codebase in a vector database so that I can run similarity searches and retrieve the code files relevant to my query.
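Once every code file has a vector, the retrieval step described above reduces to ranking the stored vectors by cosine similarity to the query vector. The following is a minimal sketch in plain Python, not something taken from code2vec itself. It assumes each vector is available as a line of whitespace-separated floats; the names parse_vector, cosine, and top_k are hypothetical, and a real vector database would replace the linear scan.

```python
import math


def parse_vector(line):
    """Parse one line of whitespace-separated floats (assumed format)."""
    return [float(x) for x in line.split()]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, stored, k=3):
    """stored maps a label (e.g. a file name) to its vector.

    Returns the k (label, score) pairs with the highest cosine similarity.
    """
    scored = [(label, cosine(query, vec)) for label, vec in stored.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

Calling top_k(query_vector, stored_vectors, k=5) would return the five most similar labels with their scores; how the query itself is turned into a vector is outside this sketch.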
Hi @messiGao, please see https://github.com/neulab/code-bert-score. It will allow you to use the Huggingface library with that model and a BERT-like framework. Best,
I have a similar dilemma with regard to creating embeddings of C# code using a code2vec model I have trained. As
Hi @asyed79gatech, I believe that you haven't run the However, in general, I recommend using the newer https://github.com/neulab/code-bert-score project. It is based on Huggingface, which is actively maintained. Best,
Hi @urialon, thanks for your prompt response. I thought we only needed to run the preprocess.sh script when training the code2vec model. Right now, I already have a trained, released model and want it to generate embeddings for a vector store.
Hello, have you resolved your issue? How can Java source code be converted into the input format required by code2vec?
Hello, I encountered the same issue. Have you resolved it?
I ran the following command:
java -cp JavaExtractor-0.0.1-SNAPSHOT.jar JavaExtractor.App --max_path_length 8 --max_path_width 2 --dir test.java > file.txt
and then:
python3 code2vec.py --load models/java14_model/saved_model_iter8.release --test file.txt
but I got this error:
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expect 201 fields but have 4 in record
[[{{node IteratorGetNext}}]]
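The "Expect 201 fields but have 4 in record" message suggests that the reader wants every line of the test file to contain a fixed number of space-separated fields, whereas the file passed to --test had a line with only 4. A minimal sketch for checking a file before passing it to --test follows. It assumes fields are separated by single spaces and that empty padding fields count; the number 201 is taken only from the error message, and the names count_fields and find_bad_lines are hypothetical.

```python
import sys

EXPECTED_FIELDS = 201  # assumption: copied from the error message, not from the model config


def count_fields(line):
    """Count single-space-separated fields, keeping empty (padding) fields."""
    return len(line.rstrip("\n").split(" "))


def find_bad_lines(lines, expected=EXPECTED_FIELDS):
    """Return (line_number, field_count) for every line with the wrong count."""
    return [
        (number, count_fields(line))
        for number, line in enumerate(lines, start=1)
        if count_fields(line) != expected
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 check_fields.py file.txt
    with open(sys.argv[1], encoding="utf-8") as handle:
        for number, count in find_bad_lines(handle):
            print(f"line {number}: {count} fields, expected {EXPECTED_FIELDS}")
```

If the check reports lines with far fewer fields than expected, the file is probably raw extractor output that has not yet gone through the preprocessing step mentioned earlier in the thread.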