Ncbi metadata download #18

Open
wants to merge 4 commits into base: dev

Conversation

@evezeyl (Contributor) commented Nov 11, 2022

Please do a quick prettifying/spelling pass and pull this to GitHub.

@karinlag (Member)

I think I sort of understand what you are doing, but let's chat when you are back

@evezeyl (Contributor, Author) commented Dec 2, 2022

@karinlag Can you have a second look?
I made the improvements.

@karinlag (Member) commented Dec 5, 2022

It is almost there. You lose me with the "this"es where you introduce the yaml files :) sort that out, and I'll give it a final readthrough :)

```
conda activate ncbimeta
```

To use ncbimeta, you will need to use an API (application programming interface) key from NCBI.
Member commented:

An API is a way of asking a different computer to send you information. The key is a way for NCBI to associate you with the specific request.
Just concept sorting/clarification.
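
To make the API/key distinction concrete, here is a minimal sketch of how a key travels with a request to NCBI's E-utilities. It only builds the request URL and sends nothing. The search term, email address and key are placeholders (assumptions, not values from this pull request), and `build_esearch_url` is a hypothetical helper, not part of NCBImeta.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Public E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term, email, api_key=None):
    """Build an E-utilities esearch URL; no request is sent here."""
    params = {"db": db, "term": term, "email": email, "retmode": "json"}
    if api_key:
        # The key is what lets NCBI associate the request with your account.
        params["api_key"] = api_key
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


url = build_esearch_url(
    db="biosample",
    term="Providencia[Organism]",  # placeholder search term
    email="you@example.org",       # placeholder
    api_key="YOUR_API_KEY",        # placeholder, never commit a real key
)
print(url)
```

The API is the endpoint being asked for information; the key is just one extra parameter that identifies who is asking.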

per second than you would be able to perform without it.

To get an API key from NCBI :
- login (or create an account)
Member commented:

log in to what?

- login (or create an account)
- account > account settings : there you can create an API key you will use - copy it and keep it safe

We will need to create a configuration file (.yalm) for each metadata download
Member commented:

yaml

A trick is to create a simple search in NCBI ex:
![screenshot1](./searchfield1.png)

This search field will help you build your "metadata download configuration file" (.yalm).
Member commented:

yaml


Compare it to the [.yalm configuration file](./providencia_metadata.yaml)

The `.yalm` file allow to define the destination of the download (OUTPUT_DIR),
Member commented:

yaml?

your NCBI identification (EMAIL and API_KEY), the name of the sqlite database it will
download the metadata to (DATABASE), the tables it will create within the database
(TABLES) and the columns/fields it will create and download data to for each
table in the sqlite database (TABLE_COLUMNS, indentation: table name you want,
Member commented:

indentation?
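
As a rough illustration of the nesting the excerpt describes, the sketch below models the configuration as a Python mapping and checks that the keys named in the text are present. The key names come from the excerpt; the nested shape of `TABLE_COLUMNS`, all values, and the `check_config` helper are assumptions for illustration, and the real tool may require keys the excerpt does not list.

```python
# Keys named in the excerpt above.
REQUIRED_KEYS = ("OUTPUT_DIR", "EMAIL", "API_KEY", "DATABASE", "TABLES", "TABLE_COLUMNS")


def check_config(config):
    """Return a list of problems found in a config mapping (empty if none)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    columns = config.get("TABLE_COLUMNS", {})
    for table in config.get("TABLES", []):
        # Every table listed in TABLES needs its own entry under TABLE_COLUMNS.
        if table not in columns:
            problems.append(f"no columns defined for table: {table}")
    return problems


# Hypothetical values; the nesting mirrors "table name, then its columns".
example_config = {
    "OUTPUT_DIR": "output/",
    "EMAIL": "you@example.org",
    "API_KEY": "YOUR_API_KEY",
    "DATABASE": "providencia.sqlite",
    "TABLES": ["BioSample"],
    "TABLE_COLUMNS": {
        "BioSample": [
            {"BioSampleAccession": "BioSample, accession"},
        ],
    },
}

print(check_config(example_config))  # prints []
```

In the YAML file the same nesting is expressed through indentation: the table name sits one level below `TABLE_COLUMNS`, and each column sits one level below its table.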


So, you see that you can define your own fields, that were not provided in the other examples found in the [Biosample example](https://github.com/ktmeaton/NCBImeta/tree/master/schema).

To define those fields you need to access at which hierarchy of the `.xml` file you downloaded as helper.
Member commented:

This sentence does not make sense to me.

Member commented:

And on reread I am completely lost here. I have no idea of how to get to the yaml file I need for the search.
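
One way to make the "hierarchy of the `.xml` file" step less opaque is to list every element and attribute path in a downloaded record, so a field can be picked by its position in the tree. The sketch below does that with the standard library. The XML snippet is invented for illustration, and writing paths as comma-separated names is an assumption about how a column is mapped to its location.

```python
import xml.etree.ElementTree as ET

# Invented record, only loosely shaped like a BioSample entry.
SAMPLE_XML = """
<BioSampleSet>
  <BioSample accession="SAMN00000000">
    <Description>
      <Organism taxonomy_name="Providencia"/>
    </Description>
    <Attributes>
      <Attribute attribute_name="host">Homo sapiens</Attribute>
    </Attributes>
  </BioSample>
</BioSampleSet>
"""


def list_paths(element, prefix=()):
    """Yield (path, kind) for `element`, its attributes and all descendants."""
    path = prefix + (element.tag,)
    yield ", ".join(path), "element"
    for name in element.attrib:
        yield ", ".join(path + (name,)), "attribute"
    for child in element:
        yield from list_paths(child, path)


root = ET.fromstring(SAMPLE_XML)
for path, kind in list_paths(root):
    print(f"{kind:9} {path}")
```

Each printed line is a candidate location for a custom column, which may help when comparing the downloaded record with the fields wanted in the configuration file.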

```
conda activate NCBImeta
NCBImeta --config providencia_metadata.yaml --flat
```

Member commented:

Can it also give csv?
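
The question is not answered in this thread. Independently of what NCBImeta itself offers, any table in the resulting sqlite database can be written to CSV with Python's standard library. This is a minimal sketch under stated assumptions: the table and column names are hypothetical, and the demo uses an in-memory database rather than a real download.

```python
import csv
import io
import sqlite3


def table_to_csv(connection, table, out):
    """Write every row of `table` to the file-like object `out` as CSV."""
    # Table names cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = connection.execute(f"SELECT * FROM {quoted}")
    writer = csv.writer(out)
    # First row: column names taken from the query description.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


# Demo on an in-memory database with a hypothetical table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE BioSample (BioSampleAccession TEXT, Host TEXT)")
demo.execute("INSERT INTO BioSample VALUES ('SAMN00000000', 'Homo sapiens')")

buffer = io.StringIO()
table_to_csv(demo, "BioSample", buffer)
print(buffer.getvalue())
```

For a real database, replace the in-memory connection with `sqlite3.connect(path_to_database)` and pass a file opened with `newline=""` instead of the `StringIO` buffer.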


To use ncbimeta, you will need to use an API (application programming interface) key from NCBI.
An API key is an authentication code that will serves to tell NCBI that you are the user
making requests to download metadata via ncbimeta software. Using an NCBI API key
Member commented:

..that will serve (singular)

run the software, but it is a configuration file that specifies the search/query
we are asking the software to do: to retrieve the available metadata we are interested in,
and then the specification of the specific of fields we want to download in a database/table.

Member commented:

I suggest you somewhere here show a screenshot of what this file should look like, not only link to it.

@novica changed the base branch from main to dev on July 2, 2024 09:33

@novica (Member) commented Sep 19, 2024

Are people still working on this?

@evezeyl (Contributor, Author) commented Sep 19, 2024 via email
