Open Sourced NoteBookLM

Overview

The Podcast Creator script (Open Sourced NoteBookLM) automates the creation of a podcast from a PDF document. It extracts text from the PDF, generates a detailed podcast script using OpenAI's GPT-4 model, converts the script to audio, and combines the audio with images of the PDF pages to create a video. The final output includes both an audio file and a video file with synchronized audio.

Examples

Mistral 7B

Llama 2

Attention Is All You Need

Open Sourced NoteBookLM Features

PDF Text Extraction: Extracts text content from a PDF document.
Script Generation: Uses OpenAI's GPT-4 model to generate a detailed podcast script based on the extracted text.
Text-to-Speech Conversion: Converts the generated script into audio using OpenAI's text-to-speech capabilities.
Audio Processing: Processes the audio to ensure it meets the desired specifications (e.g., stereo, sample rate).
Video Creation: Converts PDF pages to images and combines them with the audio to create a video.
Environment Configuration: Loads environment variables from a .env file for secure API key management.
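
As a rough illustration of the Video Creation feature, the sketch below builds an ffmpeg command that shows each page image for an equal share of the audio. The README does not say how the script times the pages or how it invokes the encoder, so the even split, the `page_%03d.png` naming pattern, the choice of ffmpeg, and the function names `build_video_command` and `render_video` are all assumptions.

```python
import subprocess


def build_video_command(image_pattern, audio_path, output_path, n_pages, audio_seconds):
    """Return an ffmpeg argument list that pairs page images with one audio track."""
    if n_pages < 1 or audio_seconds < 1:
        raise ValueError("need at least one page and one second of audio")
    # Input frame rate as a rational: n_pages images spread evenly over the audio.
    framerate = f"{n_pages}/{int(audio_seconds)}"
    return [
        "ffmpeg", "-y",
        "-framerate", framerate, "-i", image_pattern,  # e.g. page_%03d.png (assumed naming)
        "-i", audio_path,
        "-c:v", "libx264", "-r", "30", "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-shortest",
        output_path,
    ]


def render_video(command):
    """Run the command; requires the ffmpeg binary to be available on PATH."""
    subprocess.run(command, check=True)
```

Keeping the command builder separate from `render_video` makes it possible to inspect or log the arguments without running the encoder.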

Workflow

Extract Text from PDF: The script starts by extracting text from the provided PDF file.
Generate Podcast Script: The extracted text is used to generate a podcast script featuring two hosts, Alice and John, who engage in a detailed conversation about the content.
Convert Script to Audio: The script is converted to audio, with different voices assigned to Alice and John.
Process Audio: The audio is processed to ensure it is in the correct format and quality.
Create Video: Images of the PDF pages are created and combined with the audio to produce a video.
Save Outputs: The final audio and video files are saved to the specified output paths.
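
The step that assigns different voices to Alice and John could be sketched as follows. This assumes the generated script uses one `Name: text` line per turn, and it maps the hosts to the OpenAI text-to-speech voices `nova` and `onyx`. Both the line format and the voice choices are assumptions, as are the names `VOICES` and `split_turns`; the README does not state what the script actually uses.

```python
import re

# Assumed mapping from host name to an OpenAI text-to-speech voice name.
VOICES = {"Alice": "nova", "John": "onyx"}

# Matches lines such as "Alice: Welcome to the show."
TURN = re.compile(r"^(Alice|John):\s*(.+)$")


def split_turns(script: str):
    """Split a 'Name: text' script into (speaker, voice, text) tuples."""
    turns = []
    for line in script.splitlines():
        stripped = line.strip()
        match = TURN.match(stripped)
        if match:
            speaker, text = match.groups()
            turns.append((speaker, VOICES[speaker], text))
        elif turns and stripped:
            # Continuation line: append it to the previous speaker's turn.
            speaker, voice, text = turns[-1]
            turns[-1] = (speaker, voice, f"{text} {stripped}")
    return turns
```

Each tuple can then be sent to the text-to-speech step separately, so that every turn is rendered with its host's voice before the clips are joined.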

How to Use Open Sourced NoteBookLM

To run the project:

   pip install poetry

   poetry install

Fill in the .env file with your OpenAI API key:

  OPENAI_API_KEY=""
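
For reference, a minimal sketch of how a script might read this key at startup. It uses `load_dotenv` from the python-dotenv package when that package is installed; the helper name `load_api_key` and the error message are hypothetical and not taken from the project.

```python
import os


def load_api_key(env=None) -> str:
    """Return OPENAI_API_KEY, loading a .env file first when python-dotenv is installed."""
    if env is None:
        try:
            from dotenv import load_dotenv  # provided by the python-dotenv package
            load_dotenv()  # copies key=value pairs from .env into os.environ
        except ImportError:
            pass  # fall back to variables already set in the shell
        env = os.environ
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is empty; set it in the .env file")
    return key
```

Failing early with a clear message avoids a less obvious authentication error later, after the PDF has already been processed.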

   cd podcast_creator

To use the script, simply provide the path to the PDF file and run the script. The script will handle the rest, generating the podcast script, converting it to audio, processing the audio, and creating the video.

   if __name__ == "__main__":
       # Replace with the path to your own PDF document.
       pdf_path = "/path/to/your/pdf/document.pdf"
       create_podcast_from_pdf(pdf_path)

   poetry run python podcast_creator/main.py
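
If editing the source file for every document is inconvenient, the PDF path could instead be read from the command line. The following is only a sketch of that idea; the helper `parse_pdf_path` is hypothetical and is not part of the project.

```python
import argparse
from pathlib import Path


def parse_pdf_path(argv=None) -> Path:
    """Read the PDF path from the command line instead of hard-coding it."""
    parser = argparse.ArgumentParser(description="Create a podcast from a PDF.")
    parser.add_argument("pdf_path", type=Path, help="path to the input PDF")
    args = parser.parse_args(argv)
    if args.pdf_path.suffix.lower() != ".pdf":
        # parser.error prints a usage message and exits with status 2.
        parser.error(f"expected a .pdf file, got: {args.pdf_path}")
    return args.pdf_path


# Hypothetical wiring in the entry point:
# if __name__ == "__main__":
#     create_podcast_from_pdf(str(parse_pdf_path()))
```

With this in place, the run command would take the document as an argument, for example `poetry run python podcast_creator/main.py /path/to/your/pdf/document.pdf`.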

Dependencies

langchain
pydantic
openai
pydub
fitz (PyMuPDF)
numpy
subprocess (Python standard library, no installation needed)
tqdm
PIL (Pillow)
textwrap (Python standard library, no installation needed)
dotenv (python-dotenv)

Ensure all dependencies are installed before running the script.
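
One way to check this before a run is to look up each import name without importing it. The sketch below uses only the standard library; the list reflects the import names given in this section, and the function name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec

# Import names for the third-party packages in this section. subprocess and
# textwrap are left out because they ship with Python.
REQUIRED_MODULES = [
    "langchain", "pydantic", "openai", "pydub", "fitz",
    "numpy", "tqdm", "PIL", "dotenv",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("All dependencies found." if not missing else f"Missing: {', '.join(missing)}")
```

Running it inside the Poetry environment (`poetry run python`) checks the same interpreter that the script itself will use.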

Conclusion

The Podcast Creator (Open Sourced NoteBookLM) script converts PDF documents into podcast episodes with both audio and video outputs. It automates the entire workflow, from text extraction and script generation through text-to-speech, audio processing, and video creation, so that a podcast can be produced from a textual document in a single run.

License

Let's Have a Chat ;)

mehdihosseinimoghadam/open-sourced-nootbookLM
