Video imagery to text (Closed Captioning) #11770
Comments
🤖 Thank you for your interest in contributing to LangChain! Your proposed feature to generate .srt files with both subtitles and visual scene descriptions sounds like a valuable addition to the framework. We appreciate your motivation to enhance video accessibility and provide a more inclusive viewing experience. It's great to see your commitment to addressing the needs of the hearing-impaired community. Please go ahead and work on implementing this feature. We look forward to reviewing your pull request in November. If you have any questions or need any guidance during the process, feel free to ask. Good luck with your school project!

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
@dosu-bot What would the next steps look like? We should probably start by creating an outline for the implementation of this feature. Since LangChain already has an image analyzer and a speech-to-text feature, the main goal would be to effectively break down a video into frames and then make sure the output of each frame is combined into something that is accurate and makes sense. Also, can I add my group members (GitHub accounts) to this issue, just so we are all on the same page?
🤖 Hello,

It's great to hear that you're interested in contributing to LangChain! Your feature idea sounds very promising and would be a valuable addition to the project.

To start implementing this feature, you can follow the steps outlined in the CONTRIBUTING.md file. This includes forking the repository, creating a new branch, implementing the feature, committing and pushing your changes, and finally opening a pull request.

As you've mentioned, LangChain already has an image analyzer and a speech-to-text feature. Your main task would be to break down a video into frames and then combine the output of each frame into something that makes sense. This will likely involve building on those existing image-captioning and speech-to-text components.

For the .srt file format: each subtitle entry consists of a sequence number, the start and end times of the subtitle, the subtitle text itself, and a blank line indicating the end of the subtitle entry.

To add your group members to this issue, you can navigate to the issue in the LangChain repository and click on the 'assign yourself' link on the right side of the issue page. If you do not have the necessary permissions to do this, you can comment on the issue asking to be assigned.

Please remember that all code contributions must pass the formatting, linting, and testing checks before they can be merged. You can run these checks locally with `make format`, `make lint`, and `make test`.

I hope this helps! If you have any further questions, feel free to ask. Best of luck with your project!
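For illustration, a minimal .srt file following that structure might look like this (the timestamps and text below are invented for the example):

```
1
00:00:01,000 --> 00:00:04,000
[Scene: a busy street at night]
Hello, and welcome to the video.

2
00:00:04,500 --> 00:00:07,250
Today we'll talk about closed captioning.
```

Each entry is separated from the next by a single blank line, and the timestamp format is `HH:MM:SS,mmm` with a comma (not a period) before the milliseconds.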
@dosu-bot Hello, I am also part of the group contributing to this issue. Can you please assign me to this issue? |
@dosu-bot Hello! When trying to use the ImageCaptionLoader, we were following its documentation. We get the error `NoneType object is not callable` at the line `processor = BlipProcessor.from_pretrained(self.blip_processor)`.
🤖 The error message `NoneType object is not callable` typically indicates that you're trying to call a `None` object as if it were a function or method. In your case, the error is happening at the line `processor = BlipProcessor.from_pretrained(self.blip_processor)`, which suggests that `BlipProcessor` itself is `None` at that point. Here are a few potential reasons for this issue:

1. The `transformers` package may not be installed in the environment you're running, or the installed version may be too old to provide `BlipProcessor`.
2. The value you pass as `blip_processor` (the checkpoint name) may be `None` or invalid.
3. There may be a version mismatch between `transformers` and the BLIP checkpoint you are loading.

If all the above checks pass, then the issue might be with how the ImageCaptionLoader initializes the processor internally. Unfortunately, I couldn't find specific details about how the loader handles a missing or failed import, so inspecting its source code directly would be the next step.
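As a quick way to narrow this down outside of LangChain, you could try loading the processor and model directly from `transformers`. A minimal sketch, assuming the commonly used Salesforce BLIP checkpoint (substitute whatever checkpoint name you pass as `blip_processor`):

```python
# Sanity check: if these imports or calls fail, the problem is in the
# transformers installation or the checkpoint name, not in ImageCaptionLoader.
from transformers import BlipProcessor, BlipForConditionalGeneration

checkpoint = "Salesforce/blip-image-captioning-base"  # assumed checkpoint name
processor = BlipProcessor.from_pretrained(checkpoint)
model = BlipForConditionalGeneration.from_pretrained(checkpoint)
print(type(processor), type(model))
```

If this snippet works but the loader still fails, the problem is likely in how the checkpoint name reaches the loader rather than in `transformers` itself.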
@dosu-bot We created a new chain using some new dependencies that the other existing chains don't have. Thus, when running the tests, they fail because they cannot find the specified module. For example, we are using OpenCV to split videos into frames. How should we add the opencv dependency to LangChain so that it works alongside the other chains? We keep getting `ModuleNotFoundError: No module named 'cv2'` for every existing chain when trying to run the tests.
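For reference, LangChain generally treats heavy packages like OpenCV as optional dependencies: the import happens inside the function or method that needs it, with a helpful error if the package is missing, so that other chains and their tests are unaffected. A minimal sketch of that pattern applied to frame extraction (the function name and sampling rate are illustrative, not part of the PR):

```python
def extract_frames(video_path: str, frames_per_second: float = 1.0):
    """Yield (timestamp_seconds, frame) pairs sampled from a video file."""
    try:
        import cv2  # optional dependency: imported lazily, LangChain-style
    except ImportError:
        raise ImportError(
            "opencv-python is required for frame extraction. "
            "Install it with `pip install opencv-python`."
        )

    capture = cv2.VideoCapture(video_path)
    fps = capture.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if FPS is unknown
    step = max(int(fps / frames_per_second), 1)  # keep every Nth frame

    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % step == 0:
            yield index / fps, frame
        index += 1
    capture.release()
```

In the LangChain repository itself, such packages are typically added to the optional test extras in `pyproject.toml` (e.g. the `extended_testing` group) rather than to the core dependencies, so the base install stays lightweight.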
Hi, @A2113S, I'm helping the LangChain team manage their backlog and am marking this issue as stale. The issue was raised by you to request the implementation of a feature using Langchain's image_captions.py and audio_speech_to_text.py to generate .srt files for closed captioning, aiming to enhance accessibility for the hearing-impaired community. The issue gained interest from dosubot, LunarECL, nicoledroi1, and ifritzler, who provided guidance and assistance on the implementation steps, dependencies, and error troubleshooting. The students from the University of Toronto are aiming to contribute to Langchain for a school project and have received support and instructions from the maintainers of the repository. Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days. Thank you!
…14059) Description: Video imagery to text (Closed Captioning) This pull request introduces the VideoCaptioningChain, a tool for automated video captioning. It processes audio and video to generate subtitles and closed captions, merging them into a single SRT output. Issue: #11770 Dependencies: opencv-python, ffmpeg-python, assemblyai, transformers, pillow, torch, openai Tag maintainer: @baskaryan @hwchase17 Hello! We are a group of students from the University of Toronto (@LunarECL, @TomSadan, @nicoledroi1, @A2113S) who want to make a contribution to the LangChain community! We have run make format, make lint and make test locally before submitting the PR. To our knowledge, our changes do not introduce any new errors. Thank you for taking the time to review our PR! --------- Co-authored-by: Bagatur <[email protected]>
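For readers who want to try the merged chain, usage might look roughly like the sketch below. The import path, constructor, and run parameters are assumptions drawn from the PR description and its listed dependencies, not verified API:

```python
# Hypothetical usage sketch; parameter names and the import path are assumed.
from langchain.chat_models import ChatOpenAI
from langchain_experimental.video_captioning import VideoCaptioningChain  # assumed path

chain = VideoCaptioningChain(llm=ChatOpenAI())
srt_content = chain.run(
    video_file_path="lecture.mp4",         # assumed input key
    assemblyai_key="YOUR_ASSEMBLYAI_KEY",  # AssemblyAI is a listed dependency
)
with open("lecture.srt", "w") as f:
    f.write(srt_content)
```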
Feature request
Implement a feature using Langchain's image_captions.py and audio_speech_to_text.py to produce .srt files. This system will provide both subtitles and visual scene descriptions, essentially creating closed captioning.
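To make the intended output concrete, here is a minimal sketch of the merging step, in which timed speech transcripts and timed scene descriptions are interleaved into a single SRT document. The `Segment` structure and function names are illustrative, not LangChain API:

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    text: str     # transcript line or scene description

def to_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"

def merge_to_srt(speech: list[Segment], scenes: list[Segment]) -> str:
    """Interleave speech and scene segments by start time into one SRT string."""
    entries = sorted(speech + scenes, key=lambda s: s.start)
    blocks = []
    for i, seg in enumerate(entries, start=1):
        blocks.append(
            f"{i}\n{to_timestamp(seg.start)} --> {to_timestamp(seg.end)}\n{seg.text}\n"
        )
    return "\n".join(blocks)

# Example: one transcribed line plus one scene description.
print(merge_to_srt(
    speech=[Segment(1.0, 4.0, "Hello, and welcome to the video.")],
    scenes=[Segment(0.0, 4.0, "[Scene: a busy street at night]")],
))
```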
Motivation
Current closed captioning primarily captures spoken content, omitting crucial visual elements of videos. This lack of comprehensive description hinders a full understanding, especially for the hearing-impaired community. Our goal is to elevate video accessibility by integrating both visual and auditory components in closed captioning, ensuring a more inclusive viewing experience.
Your contribution
We are students at the University of Toronto, interested in contributing to Langchain for a school project. We would like to submit a PR that implements this feature sometime in November.