xml.etree.ElementTree.ParseError: no element found: line 1, column 0 #320

Open
KarenPHS opened this issue Aug 22, 2024 · 7 comments

@KarenPHS

DO NOT DELETE THIS! Please take the time to fill this out properly. I am not able to help you if I do not know what you are executing and what error messages you are getting. If you are having problems with a specific video make sure to include the video id.

To Reproduce

Steps to reproduce the behavior:

What code / cli command are you executing?

For example: I am running

from youtube_transcript_api import YouTubeTranscriptApi
source = YouTubeTranscriptApi.list_transcripts("SeXZt5hqe6I")
en_caption = source.find_transcript(['en']).fetch()

Which Python version are you using?

Python 3.6.4

Which version of youtube-transcript-api are you using?

youtube-transcript-api 0.6.2

Expected behavior

Describe what you expected to happen.

For example: I expected to receive the english transcript

Actual behaviour

Describe what is happening instead of the Expected behavior. Add error messages if there are any.

For example: Instead I received the following error message:

  File "E:\Python Project\yt-concate-test\yt-concate\venv\lib\site-packages\youtube_transcript_api\_transcripts.py", line 293, in fetch
    _raise_http_errors(response, self.video_id).text,
  File "E:\Python Project\yt-concate-test\yt-concate\venv\lib\site-packages\youtube_transcript_api\_transcripts.py", line 358, in parse
    for xml_element in ElementTree.fromstring(plain_data)
  File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\xml\etree\ElementTree.py", line 1315, in XML
    return parser.close()
xml.etree.ElementTree.ParseError: no element found: line 1, column 0
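
For what it's worth, this is the message `ElementTree.fromstring` raises when it is handed an empty string. That suggests the transcript request may have come back with an empty body, though this is an assumption and not something confirmed in this thread. A minimal reproduction that does not involve the library at all (`parse_error_message` is just a helper name made up for this sketch):

```python
from xml.etree import ElementTree


def parse_error_message(data):
    """Return the ParseError message for `data`, or None if it parses."""
    try:
        ElementTree.fromstring(data)
    except ElementTree.ParseError as err:
        return str(err)
    return None


# An empty payload fails the same way as in the traceback above.
print(parse_error_message(""))
# Well-formed XML parses without error, so this prints None.
print(parse_error_message("<transcript><text>hi</text></transcript>"))
```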
@jdepoix
Owner

jdepoix commented Aug 23, 2024

Hi @KarenPHS, I cannot replicate that error. Does that happen for every video or only SeXZt5hqe6I?

@KarenPHS
Author

KarenPHS commented Aug 24, 2024

No, not for every video; I tried. But it happened at least once while I was downloading captions from videos.

import urllib.request
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api._errors import NoTranscriptFound, TranscriptsDisabled
from xml.etree.ElementTree import ParseError
import json

base_video_url = 'https://www.youtube.com/watch?v='
base_search_url = 'https://www.googleapis.com/youtube/v3/search?'

API_KEY=''
channel_id = 'UCKSVUHI9rbbkXhvAXK-2uxA'
first_url = base_search_url + 'key={}&channelId={}&part=snippet,id&order=date&maxResults=25'.format(API_KEY, channel_id)

video_links = []
url = first_url

# download all video links from a channel
while True:
    inp = urllib.request.urlopen(url)
    resp = json.load(inp)

    for i in resp['items']:
        if i['id']['kind'] == "youtube#video":
            video_links.append(base_video_url + i['id']['videoId'])

    try:
        next_page_token = resp['nextPageToken']
        url = first_url + '&pageToken={}'.format(next_page_token)
    except KeyError:
        break

# download all captions from all videos
for url in video_links:
    url_id = url.split('watch?v=')[-1]
    while True:
        try:
            source = YouTubeTranscriptApi.list_transcripts(url_id)
            en_caption = source.find_transcript(['en']).fetch()  
            break
        except (KeyError, NoTranscriptFound, TranscriptsDisabled):
            print('No captions there', url_id)
            break
        except ParseError:
            print('ParseError. there is a caption in', url, ', so, try again')
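
Note that the retry loop above will spin forever, with no pause, if a video keeps returning an unparsable response. A bounded retry with a short backoff may be safer. The sketch below is a hypothetical helper (`fetch_with_retry`, `max_attempts`, and `base_delay` are names introduced here, not part of youtube-transcript-api). It takes any zero-argument callable, so it can wrap the transcript call without depending on the library.

```python
import time
from xml.etree.ElementTree import ParseError


def fetch_with_retry(fetch, max_attempts=3, base_delay=1.0):
    """Call `fetch()` until it succeeds or `max_attempts` is reached.

    Only ParseError is retried; any other exception propagates immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except ParseError:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
```

In the script above it could be called as `fetch_with_retry(lambda: YouTubeTranscriptApi.list_transcripts(url_id).find_transcript(['en']).fetch())`, still inside the existing `try` that handles `NoTranscriptFound` and `TranscriptsDisabled`.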

@jdepoix
Owner

jdepoix commented Aug 26, 2024

So it doesn't happen consistently for SeXZt5hqe6I, but just randomly happened once?

@Araule

Araule commented Aug 30, 2024

Hello, I have the same problem. Across roughly 200 videos, I hit this error about 3-5 times per run, and never for the same IDs.

I use Python 3.10.14 and youtube-transcript-api 0.6.2 (downloaded with pip).

@KarenPHS
Author

KarenPHS commented Sep 3, 2024

"So it doesn't happen consistently for SeXZt5hqe6I, but just randomly happened once?"

Yes, it randomly happened, but more than once.

@dgarridoa

I got the same issue with Python 3.11.3 and youtube-transcript-api 0.6.2. It also happens randomly: when I retried, it ended up working.

ERROR:root:no element found: line 1, column 0, 3q67v12M31M
ERROR:root:no element found: line 1, column 0, McRUxBHgFIo
ERROR:root:no element found: line 1, column 0, mokGJiXVw_4
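
Since a retry usually succeeds, one option for batch jobs is to collect the IDs that fail and run a second pass over just those. A rough sketch, where `fetch_transcript` stands in for whatever per-video call you make and `fetch_all` and `passes` are names made up for illustration:

```python
import logging
from xml.etree.ElementTree import ParseError


def fetch_all(video_ids, fetch_transcript, passes=2):
    """Fetch transcripts for `video_ids`, re-running failed IDs up to `passes` times.

    Returns (results, failed): a dict of id -> transcript, and the list of IDs
    that still raised ParseError after the final pass.
    """
    results = {}
    pending = list(video_ids)
    for _ in range(passes):
        failed = []
        for video_id in pending:
            try:
                results[video_id] = fetch_transcript(video_id)
            except ParseError as err:
                # Logs the error message followed by the video id.
                logging.error("%s, %s", err, video_id)
                failed.append(video_id)
        pending = failed
        if not pending:
            break
    return results, pending
```

Anything left in the second return value failed on every pass and is probably worth inspecting by hand.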

@aketchum15

I am also seeing the same issue intermittently with video "sHnqCqG54Sw" and with others.
Is there any update on this? If there is any information that is not documented here, please let me know. I would be interested in fixing this myself if possible.
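
One possible direction for a fix, assuming the root cause is an empty response body (unconfirmed in this thread), would be to check the payload before handing it to `ElementTree` and raise a more descriptive error that callers can catch and retry on. The names below (`EmptyTranscriptResponse`, `parse_transcript_xml`) are hypothetical and do not exist in youtube-transcript-api:

```python
from xml.etree import ElementTree


class EmptyTranscriptResponse(Exception):
    """Hypothetical error: the transcript request returned no body."""


def parse_transcript_xml(plain_data):
    """Parse transcript XML, failing with a clear error on an empty payload."""
    if not plain_data or not plain_data.strip():
        raise EmptyTranscriptResponse("transcript response body was empty")
    return ElementTree.fromstring(plain_data)
```

A caller could then treat `EmptyTranscriptResponse` as retryable while still letting a `ParseError` on genuinely malformed XML surface as a real bug.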
