Youtube Transcript Summarizer using NLP #940

sindhuja184 · 2024-10-22T13:38:24Z

Deep Learning Simplified Repository (Proposing new issue)

🔴 Project Title : Youtube Transcript Summarizer

🔴 Aim : The aim of the YouTube Transcript Summarizer is to provide concise, meaningful summaries by reducing transcript length by 80%, allowing users to quickly grasp the key points of a video.

🔴 Dataset : The dataset used would typically be the transcripts of YouTube videos.

🔴 Approach : The YouTube Transcript Summarizer employs Natural Language Processing (NLP) techniques to provide concise summaries of video transcripts. The process begins with extracting the transcript, followed by preprocessing to clean and tokenize the text. The chosen algorithm then analyzes the content to generate a summary, significantly reducing the original length while retaining essential points. This approach enables users to quickly grasp the core message of a video without sifting through lengthy transcripts. (Transcripts are taken with the help of youtube transcript summariser.)
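As a rough illustration of the preprocessing step and the 80% reduction target, here is a minimal sketch using only the Python standard library. The function names, the rule for stripping bracketed cues such as `[Music]`, and the punctuation-based sentence split are all assumptions made for illustration; the issue does not specify how cleaning or tokenization would actually be done.

```python
import re
from typing import List


def clean_transcript(raw: str) -> str:
    """Remove bracketed cues (e.g. [Music]) and collapse whitespace.

    Assumption: cues appear in square brackets; real transcripts may differ.
    """
    no_cues = re.sub(r"\[[^\]]*\]", " ", raw)
    return re.sub(r"\s+", " ", no_cues).strip()


def tokenize_sentences(text: str) -> List[str]:
    """Naive sentence split on '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def reduction_ratio(original: str, summary: str) -> float:
    """Fraction of words removed by the summary (0.8 means an 80% reduction)."""
    original_words = len(original.split())
    if original_words == 0:
        return 0.0
    return 1 - len(summary.split()) / original_words


if __name__ == "__main__":
    raw = "[Music]  Hello   world.\nThis is [Applause] a test!"
    cleaned = clean_transcript(raw)
    print(cleaned)
    print(tokenize_sentences(cleaned))
    print(reduction_ratio("a b c d e f g h i j", "a b"))
```

The ratio here is measured in whitespace-separated words, which is only one possible way to read "length"; characters or model tokens would give different numbers.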

📍 Follow the Guidelines to Contribute in the Project :

You need to create a separate folder named as the Project Title.
Inside that folder, there will be four main components.
- Images - To store the required images.
- Dataset - To store the dataset or, information/source about the dataset.
- Model - To store the machine learning model you've created using the dataset.
- requirements.txt - This file will contain the required packages/libraries to run the project on other machines.
Inside the Model folder, the README.md file must be filled up properly, with proper visualizations and conclusions.

🔴🟡 Points to Note :

The issues will be assigned on a first come, first served basis, 1 Issue == 1 PR.
"Issue Title" and "PR Title" should be the same. Include the issue number along with it.
Follow the Contributing Guidelines & Code of Conduct before you start contributing.

✅ To be Mentioned while taking the issue :

Full name : Sindhuja Didugu
GitHub Profile Link : https://github.com/sindhuja184
Email ID :[email protected]
Participant ID (if applicable):
Approach for this Project : The YouTube Transcript Summarizer employs Natural Language Processing (NLP) techniques to provide concise summaries of video transcripts. The process begins with extracting the transcript, followed by preprocessing to clean and tokenize the text. The chosen algorithm then analyzes the content to generate a summary, significantly reducing the original length while retaining essential points. This approach enables users to quickly grasp the core message of a video without sifting through lengthy transcripts. (Transcripts are taken with the help of youtube transcript summariser.)
What is your participant role? (Mention the Open Source program) GSSOC ext- Participant

Happy Contributing 🚀

All the best. Enjoy your open source journey ahead. 😎

github-actions · 2024-10-22T13:38:35Z

Thank you for creating this issue! We'll look into it as soon as possible. Your contributions are highly appreciated! 😊

Abhiiesante · 2024-10-23T04:28:21Z

Can you please assign this issue to me under GSSoC '24 Extended, Hacktoberfest-accepted?

abhisheks008 · 2024-10-24T10:53:13Z

> Can you please assign this issue to me under GSSoC '24 Extended, Hacktoberfest-accepted

As this issue is raised by @sindhuja184, this issue can't be assigned to you.

abhisheks008 · 2024-10-24T10:53:40Z

@sindhuja184 can you please elaborate on the approach you are planning for this problem statement?

sindhuja184 · 2024-10-24T11:26:40Z

The aim of the project is to summarize the transcripts of YouTube videos.

Initially, I would extract the transcript of the YouTube video with the help of the YouTube Transcript API. (Here I would need the video ID of the YouTube video.)
Then split the text into chunks of a fixed number of tokens each. (Summarization models have a token limit, so splitting is mandatory here.)
Then, using Hugging Face transformers, I would summarize each chunk. (I would like to select the facebook/bart-large-cnn model.)
Then, combine the summaries.

This is the approach I am planning to follow @abhisheks008
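The chunk, summarize, and combine steps could be sketched roughly as below. This is an illustration under stated assumptions, not the planned implementation: the function names are hypothetical, chunk size is counted in whitespace-separated words as a crude stand-in for model tokens, and the 400-word default is an arbitrary conservative choice. Fetching the transcript is left out because it needs network access. The `transformers` import is kept inside `make_bart_summarizer` so the chunking logic can be tried without installing the library or downloading the model.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_words: int = 400) -> List[str]:
    """Split text into consecutive chunks of at most max_words words.

    Assumption: words approximate tokens; a real implementation would
    count tokens with the model's own tokenizer.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def summarize_transcript(text: str,
                         summarize_chunk: Callable[[str], str],
                         max_words: int = 400) -> str:
    """Summarize each chunk independently and join the partial summaries."""
    chunks = split_into_chunks(text, max_words)
    return " ".join(summarize_chunk(chunk) for chunk in chunks)


def make_bart_summarizer(model_name: str = "facebook/bart-large-cnn") -> Callable[[str], str]:
    """Build a chunk summarizer backed by a Hugging Face summarization pipeline.

    Requires the transformers package; the length limits below are
    illustrative values, not tuned settings.
    """
    from transformers import pipeline  # imported lazily on purpose

    summarizer = pipeline("summarization", model=model_name)

    def summarize_chunk(chunk: str) -> str:
        result = summarizer(chunk, max_length=130, min_length=30, do_sample=False)
        return result[0]["summary_text"]

    return summarize_chunk


if __name__ == "__main__":
    # Stand-in summarizer so the flow can be checked without a model:
    # it simply keeps the first five words of each chunk.
    demo_text = " ".join(f"word{i}" for i in range(1000))
    print(summarize_transcript(demo_text, lambda c: " ".join(c.split()[:5])))
```

Because each chunk is summarized on its own, context that spans a chunk boundary can be lost; overlapping chunks or a second summarization pass over the combined output are possible refinements.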

abhisheks008 · 2024-10-24T15:30:18Z

Apart from Hugging Face, are there any other algorithms you are comfortable with? The project repository requires at least 3 model implementations for each problem statement.
