Convert notebooks to Markdown when specified in @file #1033

Open
dlqqq opened this issue Oct 14, 2024 · 2 comments
Labels
enhancement (New feature or request)

Comments

@dlqqq
Member

dlqqq commented Oct 14, 2024

Problem

Jupyter Notebooks (.ipynb) are serialized to disk as a JSON object. This format is not the natural-language text an LLM handles best, and it includes metadata that should not be sent to the LLM. Passing a notebook to an LLM as raw JSON may lower the quality of responses to questions about the notebook, because the cell contents are nested within the JSON object and surrounded by irrelevant metadata.

Proposed Solution

When a notebook is specified in @file, the notebook should be converted to a Markdown string before being injected into the prompt sent to the LLM.

dlqqq added the enhancement label Oct 14, 2024
@JasonWeill
Collaborator

Might be a good use for jupytext?

@michaelchia
Collaborator

.ipynb files are not passed as raw JSON; they are already handled as a special case, as per:

    return FILE_CONTEXT_TEMPLATE.format(
        filepath=filepath,
        content=self._process_file(content, filepath),
    )

    def _process_file(self, content: str, filepath: str):
        # Notebooks: keep only the cell sources, dropping the JSON structure and metadata
        if filepath.endswith(".ipynb"):
            nb = nbformat.reads(content, as_version=4)
            return "\n\n".join([cell.source for cell in nb.cells])
        return content

However, it is still a good idea to convert the notebook to Markdown, or at least to handle and indicate the different cell types correctly.
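
For illustration, here is a minimal sketch of what cell-type-aware conversion could look like. It is not the project's implementation: `notebook_to_markdown` and `default_lang` are hypothetical names, and it reads the notebook with the standard-library `json` module (assuming the nbformat v4 on-disk layout) so that it runs without extra dependencies. In the actual code path, `nbformat.reads` would likely stay in place.

```python
import json


def notebook_to_markdown(content: str, default_lang: str = "python") -> str:
    """Render notebook JSON (assumed nbformat v4 layout) as a Markdown string.

    Hypothetical helper for illustration only.
    """
    nb = json.loads(content)
    # Use the kernel language for code fences when the metadata records it.
    lang = nb.get("metadata", {}).get("language_info", {}).get("name", default_lang)
    fence = "`" * 3
    parts = []
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # On disk, "source" may be a list of lines or a single string.
        if isinstance(source, list):
            source = "".join(source)
        source = source.rstrip("\n")
        if not source.strip():
            continue  # skip empty cells
        cell_type = cell.get("cell_type")
        if cell_type == "code":
            # Code cells become fenced blocks tagged with the language.
            parts.append(f"{fence}{lang}\n{source}\n{fence}")
        elif cell_type == "raw":
            # Raw cells are fenced without a language tag.
            parts.append(f"{fence}\n{source}\n{fence}")
        else:
            # Markdown cells are passed through unchanged.
            parts.append(source)
    return "\n\n".join(parts)


sample = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {"language_info": {"name": "python"}},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title\n", "Some text"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 1,
         "outputs": [], "source": "print(1)"},
    ],
})
print(notebook_to_markdown(sample))
```

Cell outputs are deliberately left out here; whether to include them in the prompt is a separate design question.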
