It is stopped working #30

Open
bholagourav opened this issue Aug 12, 2024 · 19 comments

Comments

@bholagourav

bholagourav commented Aug 12, 2024

For a video that has captions, it throws the error "could not find captions for the video". In my local environment it works fine.

@dfdeagle47
Contributor

dfdeagle47 commented Aug 21, 2024

I'm also encountering the issue (probably since August 7th, 2024). From what I could understand so far, it's because the response from YouTube when fetching the page is missing the captionTracks property, hence the error in the code triggered here.

I see that the YouTube response seems to differ based on the environment. If I run this command:

wget -qO- 'https://www.youtube.com/watch?v=qhszd_wqAgQ' | grep 'captionTracks'

locally, I will get a match. However, when running it on the server, there are no matches.

Maybe YouTube is A/B testing something and serving different HTML content based on location. Or maybe they decided to remove this property when the request comes from a server in a data center.

Technically, this is an unofficial way to retrieve the captions. AFAIK, the official way only allows retrieving the captions for your own videos, so they might not like other platforms scraping the captions...

@NikeshCohen

Mmmm, that is quite unfortunate. I've been trying to retrieve captions from videos as well. An alternative I've found is this link, which is present in each YouTube video page:

https://www.youtube.com/api/timedtext?v=dq3j-NTqJX4&ei=3gjLZrnMCsGdp-oPnOLjuQI&caps=asr&opi=112496729&exp=xbt&xoaf=4&hl=en-GB&ip=0.0.0.0&ipbits=0&expire=1724607310&sparams=ip,ipbits,expire,v,ei,caps,opi,exp,xoaf&signature=66D15C767604C769FCB036A11473B586E39C505B.5B635F4CF960F33FA808ABA17577D5E941468A75&key=yt8&kind=asr&lang=en

But I see it's also left out when fetching the page from a server. There are free sites like https://downsub.com/ that extract captions from videos. Since it's a free tool, I doubt they are using an extreme way to fetch the captions, as that would cost money. Any ideas for ways around this?

@dfdeagle47
Contributor

dfdeagle47 commented Aug 26, 2024

I was looking at other libs like @os-team/youtube-captions to see how they extract captions.

That lib uses yt-dlp to extract the caption, so I tried running this tool on the server to see what would happen (it works locally):

$ yt-dlp --list-subs 'https://www.youtube.com/watch?v=qhszd_wqAgQ'

[youtube] Extracting URL: https://www.youtube.com/watch?v=qhszd_wqAgQ
[youtube] qhszd_wqAgQ: Downloading webpage
[youtube] qhszd_wqAgQ: Downloading ios player API JSON
[youtube] qhszd_wqAgQ: Downloading web creator player API JSON
ERROR: [youtube] qhszd_wqAgQ: Sign in to confirm you’re not a bot. This helps protect our community. Learn more

So it seems that YouTube is detecting that the request might be coming from a bot (based on the IP I suppose).

Maybe it's possible to make it work by signing in and passing the cookie to yt-dlp. However, I expect that this will be rate-limited if you're making too many requests.

Also links that might be relevant:

@NikeshCohen

Thanks for that info! Honestly quite crappy that YouTube is blocking requests from servers (although I understand why they would want to).

Another alternative I stumbled upon is directly accessing captions via the YouTube API. However, it costs 220 tokens per request, so it's not a very scalable option.

@dfdeagle47
Contributor

I stumbled upon this code:
https://stackoverflow.com/a/70013529

It uses a different API (YouTube's internal API) to retrieve the captions.

I tested the Python code on my server and it seemed to work in principle. I didn't check whether the response contains everything, but I'll investigate some more. I want to translate the code to Node.js and see how it behaves.

Although, I don't know how scalable it is. Your server IP might be banned if you make too many calls perhaps.

@dfdeagle47
Contributor

When you say the YouTube API, are you talking about the official YouTube Data API v3? The limitation I ran into is that it only allows you to retrieve captions for videos you own.

@NikeshCohen

NikeshCohen commented Aug 27, 2024

Correct, yes. Ah, I see. That's not very helpful then. I haven't implemented that yet; it was just info I found. So that option is off the table.

@NikeshCohen

Ah interesting, it's a similar method to what I pivoted to (the Stack Overflow link). Most likely you aren't getting any data, although I stand to be corrected: YouTube sends back an OK response, but it contains nothing useful at all.

For reference, this is the current logic I'm using. It doesn't rely on any YouTube-specific lib, just the raw data from the response when sending a GET to the video page:

// parseStringPromise comes from the xml2js package
import { parseStringPromise } from "xml2js";

export const fetchVideoData = async (videoId: string) => {
  try {
    let transcription = "";
    let videoTitle;

    const res = await fetch(`https://www.youtube.com/watch?v=${videoId}`);
    const html = await res.text();

    const titleMatch = html.match(/<title>(.*?)<\/title>/i);
    if (titleMatch) {
      videoTitle = titleMatch[1];
    } else {
      videoTitle = "No title found, ignore this text";
    }

    const captionUrlMatch = html.match(/"captionTracks":.*?"baseUrl":"(.*?)"/);
    if (!captionUrlMatch) {
      throw new Error("Unable to fetch transcription from YouTube");
    }

    const captionUrl = captionUrlMatch[1].replace(/\\u0026/g, "&");
    const captionRes = await fetch(captionUrl);
    const captionXML = await captionRes.text();

    const parsedResult = await parseStringPromise(captionXML);

    if (
      parsedResult &&
      parsedResult.transcript &&
      parsedResult.transcript.text
    ) {
      parsedResult.transcript.text.forEach((textElement: any) => {
        if (textElement._) {
          transcription += textElement._ + " ";
        }
      });
    }

    return {
      transcription: transcription.trim(),
      videoTitle,
    };
  } catch (error) {
    throw new Error("Unable to fetch information from YouTube");
  }
};
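
The captionTracks lookup in the snippet above can be pulled out into a small pure function, which makes it easier to tell "YouTube served a page without captions" apart from other failures. This is only a sketch: extractCaptionUrl is a hypothetical helper name, and the sample HTML strings are made up for illustration.

```javascript
// Hypothetical helper: pulls the first caption track URL out of the watch-page HTML.
// Returns null when the page has no "captionTracks" entry (the server-side case
// described in this thread).
function extractCaptionUrl(html) {
  const match = html.match(/"captionTracks":.*?"baseUrl":"(.*?)"/);
  if (!match) {
    return null;
  }
  // The URL sits inside a JSON string, so "&" arrives as the escape \u0026.
  return match[1].replace(/\\u0026/g, '&');
}

// Made-up sample pages, for illustration only.
const withCaptions =
  '{"captionTracks":[{"baseUrl":"https://example.com/api/timedtext?v=abc\\u0026lang=en"}]}';
const withoutCaptions = '<html><title>Some video</title></html>';

console.log(extractCaptionUrl(withCaptions));
console.log(extractCaptionUrl(withoutCaptions));
```

Returning null instead of throwing lets the caller decide whether to fall back to another approach.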

@dfdeagle47
Contributor

OK, I went down a bit of a rabbit hole to find an alternative.

InnerTube

First of all, I learned about InnerTube. The gist of it is that the YouTube website uses a different (private) API to interact with the browser.

There is a well-maintained JS lib YouTube.js. Unfortunately, it suffers from the same limitation. For instance, the code below:

import { Innertube } from 'youtubei.js';

async function main() {
  const youtube = await Innertube.create();

  const videoInfo = await youtube.getBasicInfo('pyX8kQ-JzHI');

  console.log(videoInfo);
}
main();

works fine locally because you can retrieve the captions under videoInfo.captions.caption_tracks (beware: it uses snake_case in the response compared to the lib).

However, it returns the same error as always on the server:

VideoInfo {
  basic_info: {
    embed: null,
    channel: null,
    is_unlisted: undefined,
    is_family_safe: undefined,
    category: null,
    has_ypc_metadata: null,
    start_timestamp: null,
    end_timestamp: null,
    view_count: undefined,
    url_canonical: null,
    tags: null,
    like_count: undefined,
    is_liked: undefined,
    is_disliked: undefined
  },
  annotations: undefined,
  storyboards: undefined,
  endscreen: undefined,
  captions: undefined,
  cards: undefined,
  streaming_data: undefined,
  playability_status: {
    status: 'LOGIN_REQUIRED',
    reason: 'Sign in to confirm you’re not a bot',
    embeddable: false,
    audio_only_playablility: null,
    error_screen: PlayerErrorMessage {
      type: 'PlayerErrorMessage',
      subreason: [Text],
      reason: [Text],
      proceed_button: [Button],
      thumbnails: [Array],
      icon_type: 'ERROR_OUTLINE'
    }
  },
  player_config: undefined
}
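
A caller can tell this bot-check response apart from a video that simply has no captions by looking at playability_status. A minimal sketch, assuming the object shape printed above; describeCaptionAvailability is a hypothetical helper and not part of YouTube.js, and the two sample objects are hand-written stand-ins (the language_code field and the 'OK' status are assumptions).

```javascript
// Hypothetical helper: classifies a VideoInfo-like object using only
// fields that appear in the output above.
function describeCaptionAvailability(videoInfo) {
  const status = videoInfo.playability_status?.status;
  if (status === 'LOGIN_REQUIRED') {
    // YouTube asked for a sign-in, so no caption data came back at all.
    return 'blocked';
  }
  const tracks = videoInfo.captions?.caption_tracks;
  return Array.isArray(tracks) && tracks.length > 0 ? 'available' : 'no-captions';
}

// Hand-written stand-ins for the server and local situations.
const serverResponse = {
  captions: undefined,
  playability_status: { status: 'LOGIN_REQUIRED' },
};
const localResponse = {
  captions: { caption_tracks: [{ language_code: 'en' }] },
  playability_status: { status: 'OK' },
};

console.log(describeCaptionAvailability(serverResponse));
console.log(describeCaptionAvailability(localResponse));
```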

get_transcript PoC

Here's some code I got working on the server:

const axios = require('axios');
const protobuf = require('protobufjs');
const Buffer = require('buffer').Buffer;

const VIDEO_ID = 'pyX8kQ-JzHI';

function getBase64Protobuf(message) {
  const root = protobuf.Root.fromJSON({
    nested: {
      Message: {
        fields: {
          param1: { id: 1, type: 'string' },
          param2: { id: 2, type: 'string' },
        },
      },
    },
  });
  const MessageType = root.lookupType('Message');

  const buffer = MessageType.encode(message).finish();

  return Buffer.from(buffer).toString('base64');
}

async function main() {
  try {
    const message1 = {
      param1: 'asr',
      param2: 'en',
    };

    const protobufMessage1 = getBase64Protobuf(message1);

    const message2 = {
      param1: VIDEO_ID,
      param2: protobufMessage1,
    };

    const params = getBase64Protobuf(message2);

    const url = 'https://www.youtube.com/youtubei/v1/get_transcript';
    const headers = { 'Content-Type': 'application/json' };
    const data = {
      context: {
        client: {
          clientName: 'WEB',
          // clientVersion: '2.20240826',
          clientVersion: '2.20240826.01.00',
        },
      },
      params,
    };

    const response = await axios.post(url, data, { headers });

    let output =
      response.data.actions[0].updateEngagementPanelAction.content.transcriptRenderer.content.transcriptSearchPanelRenderer.body.transcriptSegmentListRenderer.initialSegments.map(
        (segment) => {
          const { endMs, startMs, snippet } = segment.transcriptSegmentRenderer;

          const text = snippet.runs.map((run) => run.text).join('');

          return {
            start: parseInt(startMs, 10) / 1000,
            dur: (parseInt(endMs, 10) - parseInt(startMs, 10)) / 1000,
            text,
          };
        },
      );

    console.log(output);
  } catch (err) {
    console.error('Error:', err);
  }
}

main();

The idea was to have output use the same interface more or less (although I noticed start and dur are never cast to Number here so they're actually String and this is wrong).

It calls the youtubei/v1/get_transcript endpoint with the proper protobuf message. It assumes you know the language you want to retrieve, though.
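
To make the params value less opaque: the message has only two string fields, so the same bytes can be produced by hand from the protobuf wire format (a tag byte, a varint length, then the UTF-8 bytes). The sketch below is an illustration, not what the PoC does (the PoC uses protobufjs). encodeVarint, encodeStringField and encodeParams are hypothetical names, and 'VIDEO_ID' is a placeholder.

```javascript
// Encode an unsigned integer as a protobuf varint
// (7 bits per byte, least-significant group first).
function encodeVarint(value) {
  const bytes = [];
  while (value > 0x7f) {
    bytes.push((value & 0x7f) | 0x80);
    value >>>= 7;
  }
  bytes.push(value);
  return Buffer.from(bytes);
}

// Encode one length-delimited (wire type 2) string field.
function encodeStringField(fieldNumber, text) {
  const payload = Buffer.from(text, 'utf8');
  const tag = encodeVarint((fieldNumber << 3) | 2);
  return Buffer.concat([tag, encodeVarint(payload.length), payload]);
}

// Two-field message as in the PoC; fields that are null or undefined are skipped.
function encodeParams({ param1, param2 }) {
  const parts = [];
  if (param1 != null) parts.push(encodeStringField(1, param1));
  if (param2 != null) parts.push(encodeStringField(2, param2));
  return Buffer.concat(parts).toString('base64');
}

const inner = encodeParams({ param1: 'asr', param2: 'en' });
console.log(inner); // CgNhc3ISAmVu

// The outer message wraps the video ID and the inner base64 string.
const outer = encodeParams({ param1: 'VIDEO_ID', param2: inner });
console.log(outer);
```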

Regarding message1

This is the version you want to use for automatically-generated captions:

const message1 = {
  param1: 'asr',
  param2: 'en',
};

And this is the version you want to use if you want captions that were uploaded by the creator:

const message1 = {
  param2: 'en',
};

(obviously, the code currently crashes if you don't use the right message and there are no captions for given message1 params)
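
One way to avoid that crash is to walk the response with optional chaining and return an empty list when any level is missing. A sketch, assuming the response shape used in the PoC above; getInitialSegments is a hypothetical helper name and the sample objects are hand-built.

```javascript
// Hypothetical helper: returns the transcript segments, or [] when the
// expected path is absent (for example, no captions for the requested track).
function getInitialSegments(responseData) {
  return (
    responseData?.actions?.[0]?.updateEngagementPanelAction?.content
      ?.transcriptRenderer?.content?.transcriptSearchPanelRenderer?.body
      ?.transcriptSegmentListRenderer?.initialSegments ?? []
  );
}

// Hand-built objects standing in for real responses.
const emptyResponse = { actions: [] };
const okResponse = {
  actions: [
    {
      updateEngagementPanelAction: {
        content: {
          transcriptRenderer: {
            content: {
              transcriptSearchPanelRenderer: {
                body: {
                  transcriptSegmentListRenderer: {
                    initialSegments: [
                      {
                        transcriptSegmentRenderer: {
                          startMs: '0',
                          endMs: '1500',
                          snippet: { runs: [{ text: 'hello' }] },
                        },
                      },
                    ],
                  },
                },
              },
            },
          },
        },
      },
    },
  ],
};

console.log(getInitialSegments(emptyResponse).length); // 0
console.log(getInitialSegments(okResponse).length); // 1
```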

Invidious approach

I found some interesting info on the invidious repo:

I'm not familiar with the Crystal programming language, so it's a bit harder to navigate. I wanted to check how they retrieve the video info and the list of available captions in order to pick the "default" caption, but I might try another approach.

@dfdeagle47
Contributor

Here's my last attempt of the day.

I've implemented a getDefaultSubtitleLanguage function which attempts to retrieve the default language that should be used for the subtitles. It relies on two YouTube Data API v3 endpoints, videos.list and captions.list.

This is optional of course, and you could just attempt the different variations if you know the languages in advance.

Although it uses your API quota, those two endpoints work with the API key, unlike https://developers.google.com/youtube/v3/docs/captions/download which only works with Google OAuth 2.0 and for your videos.

I used Invidious' code for inspiration about the parsing of the YouTube response when fetching the transcript.

Ignoring the quota issue, I don't know what would be the rate-limit for the /youtubei/v1/get_transcript endpoint though...
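
If the languages are known in advance, trying the variations without the Data API could look like the sketch below: build the (trackKind, language) candidates in order of preference and return the first one that yields a transcript. fetchTranscript is an injected, hypothetical function (for instance a wrapper around the get_transcript call), and the fake fetcher exists only to keep the example self-contained.

```javascript
// Build candidates: for each language, try uploaded captions first, then ASR.
function buildCandidates(languages) {
  return languages.flatMap((language) => [
    { trackKind: 'standard', language },
    { trackKind: 'asr', language },
  ]);
}

// Try each candidate in turn; resolve with the first successful result, or null.
async function firstAvailableTranscript(videoId, languages, fetchTranscript) {
  for (const candidate of buildCandidates(languages)) {
    try {
      const segments = await fetchTranscript({ videoId, ...candidate });
      if (segments.length > 0) {
        return { ...candidate, segments };
      }
    } catch (err) {
      // Treat a failure as "this variation does not exist" and move on.
    }
  }
  return null;
}

// Fake fetcher for illustration: only an English ASR track "exists".
async function fakeFetchTranscript({ trackKind, language }) {
  if (trackKind === 'asr' && language === 'en') {
    return [{ start: 0, dur: 1.5, text: 'hello' }];
  }
  throw new Error('Requested transcript does not exist');
}

firstAvailableTranscript('VIDEO_ID', ['fr', 'en'], fakeFetchTranscript).then(
  (result) => {
    console.log(result.trackKind, result.language);
  },
);
```

Each candidate costs one request, so a long language list works against the rate-limit concern raised above.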

Code

import { youtube_v3 } from '@googleapis/youtube';
import axios from 'axios';
import { Buffer } from 'buffer';
import protobuf from 'protobufjs';

const youtubeClient = new youtube_v3.Youtube({
  auth: '<YOUR-YOUTUBE-API-KEY>',
});

/**
 * Helper function to encode a message into a base64-encoded protobuf
 * to be used with the YouTube InnerTube API.
 * @param {Object} message - The message to encode
 * @returns {String} - The base64-encoded protobuf message
 */
function getBase64Protobuf(message) {
  const root = protobuf.Root.fromJSON({
    nested: {
      Message: {
        fields: {
          param1: { id: 1, type: 'string' },
          param2: { id: 2, type: 'string' },
        },
      },
    },
  });
  const MessageType = root.lookupType('Message');

  const buffer = MessageType.encode(message).finish();

  return Buffer.from(buffer).toString('base64');
}

/**
 * Returns the default subtitle language of a video on YouTube.
 * @param {String} videoId
 * @returns {Promise<{ trackKind: String, language: String }>} - The default subtitle language and the track kind (e.g., 'asr' or 'standard').
 */
async function getDefaultSubtitleLanguage(videoId) {
  // Get video default language
  const videos = await youtubeClient.videos.list({
    part: ['snippet'],
    id: [videoId],
  });

  if (videos.data.items.length !== 1) {
    throw new Error(`Expected exactly one video for ID: ${videoId}`);
  }

  const preferredLanguage =
    videos.data.items[0].snippet.defaultLanguage ||
    videos.data.items[0].snippet.defaultAudioLanguage;

  // Get available subtitles
  const subtitles = await youtubeClient.captions.list({
    part: ['snippet'],
    videoId: videoId,
  });

  if (subtitles.data.items.length < 1) {
    throw new Error(`No subtitles found for video: ${videoId}`);
  }

  const { trackKind, language } = (
    subtitles.data.items.find(
      (sub) => sub.snippet.language === preferredLanguage,
    ) || subtitles.data.items[0]
  ).snippet;

  return { trackKind, language };
}

/**
 * Helper function to extract text from certain elements.
 * Inspired by Invidious' extractors_utils.cr
 * https://github.com/iv-org/invidious/blob/384a8e200c953ed5be3ba6a01762e933fd566e45/src/invidious/yt_backend/extractors_utils.cr#L1-L30
 * @param {Object} item - The item to extract text from.
 * @returns {string} The extracted text.
 */
function extractText(item) {
  return item.simpleText || item.runs?.map((run) => run.text).join('');
}

/**
 * Function to retrieve subtitles for a given YouTube video.
 * @param {Object} options - The options for retrieving subtitles
 * @param {String} options.videoId - The ID of the video
 * @param {String} options.trackKind - The track kind of the subtitles (e.g., 'asr' or 'standard')
 * @param {String} options.language - The language of the subtitles
 * @returns {Promise<Array<{ start: Number, dur: Number, text: String }>>} - The subtitles of the video
 */
async function getSubtitles({ videoId, trackKind, language }) {
  const message = {
    param1: videoId,
    param2: getBase64Protobuf({
      // Only include `trackKind` for automatically-generated subtitles
      param1: trackKind === 'asr' ? trackKind : null,
      param2: language,
    }),
  };

  const params = getBase64Protobuf(message);

  const url = 'https://www.youtube.com/youtubei/v1/get_transcript';
  const headers = { 'Content-Type': 'application/json' };
  const data = {
    context: {
      client: {
        clientName: 'WEB',
        clientVersion: '2.20240826.01.00',
      },
    },
    params,
  };

  const response = await axios.post(url, data, { headers });

  // Mapping inspired by Invidious' transcript.cr
  // https://github.com/iv-org/invidious/blob/432c25ad8626fee401b1f349b463515d21718ac8/src/invidious/videos/transcript.cr#L51-L101
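  // Optional chaining so a missing level reaches the check below instead of throwing a TypeError
  const initialSegments =
    response.data.actions?.[0]?.updateEngagementPanelAction?.content
      ?.transcriptRenderer?.content?.transcriptSearchPanelRenderer?.body
      ?.transcriptSegmentListRenderer?.initialSegments;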

  if (!initialSegments) {
    throw new Error(
      `Requested transcript does not exist for video: ${videoId}`,
    );
  }

  const output = initialSegments.map((segment) => {
    const line =
      segment.transcriptSectionHeaderRenderer ||
      segment.transcriptSegmentRenderer;

    const { endMs, startMs, snippet } = line;

    const text = extractText(snippet);

    return {
      start: parseInt(startMs, 10) / 1000,
      dur: (parseInt(endMs, 10) - parseInt(startMs, 10)) / 1000,
      text,
    };
  });

  return output;
}

//////////////
//////////////

async function main({ videoId }) {
  try {
    // getDefaultSubtitleLanguage expects the ID string, not an options object
    const { language, trackKind } = await getDefaultSubtitleLanguage(videoId);

    const subtitles = await getSubtitles({
      language,
      trackKind,
      videoId,
    });

    console.log(subtitles);
  } catch (err) {
    console.error('Error:', err);
  }
}

// Video with ASR captions
main({ videoId: 'pyX8kQ-JzHI' });
// Video with uploaded captions
main({ videoId: '-16RFXr44fY' });
// Video with multiple caption tracks (`defaultAudioLanguage: 'ru'`)
main({ videoId: 'qwQwSTWHTAY' });

@NikeshCohen

Bro is absolutely cooking 👨‍🍳 Will have a crack at these ideas and see if they work. The main issue is that YouTube seems to be cracking down on scrapers like crazy, so even if we end up finding a solid solution, it's only a matter of time before they block that as well.

The reason for my thought process behind my statement: https://youtube.com/shorts/xiJMjTnlxg4?si=TXnwg3NnbBK2UPG1

@NikeshCohen

BRO YOU ARE A LEGEND, got it working💪. Would be sick to connect with you and pick your brain a bit in regards to your thought process, my social links are in my bio, if not that's totally cool. Thank you!🐐

@pushkarsingh32

Is it currently working for you?

It's not working on my side.

It says a rating or some other thing is needed.

@dfdeagle47
Contributor

@pushkarsingh32

We use code similar to this in production, and it seems to work for now on our end.

Although, I can imagine it's not foolproof and YouTube could block this (or might already block this) in some cases.

@marcomoauro

Do you know the whole mapping of languages to pass in the param2 parameter?

@dfdeagle47
Contributor

I don't know the whole mapping, but it seems to be using the ISO 639-1 standard for the language codes.
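
Assuming that observation holds, the inner message for a given language can be built with a loose format check in front of it. This is a sketch only: looksLikeLanguageCode and buildMessage1 are hypothetical names, and the pattern accepts a two-letter code with optional subtags (such as en-GB) without verifying that the code is actually assigned or that YouTube has captions for it.

```javascript
// Loose check: a two-letter lowercase code, optionally followed by subtags
// such as a region ("en-GB"). It does not verify that the code is assigned.
function looksLikeLanguageCode(code) {
  return /^[a-z]{2}(-[A-Za-z0-9]{2,8})*$/.test(code);
}

// Build the inner message from the PoC for a given language.
function buildMessage1(language, { autoGenerated = false } = {}) {
  if (!looksLikeLanguageCode(language)) {
    throw new Error(`Unexpected language code: ${language}`);
  }
  return autoGenerated
    ? { param1: 'asr', param2: language }
    : { param2: language };
}

console.log(buildMessage1('fr'));
console.log(buildMessage1('en', { autoGenerated: true }));
```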

@jigneshk5

I tried it but I'm getting this error: "No filter selected. Expected one of: myRating, id, chart". Is there any way to fix the issue?
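
That message comes back from videos.list when no usable id filter reaches the API. A likely cause, offered as a guess rather than a confirmed diagnosis, is passing an object such as { videoId } to getDefaultSubtitleLanguage, which is declared to take the ID string itself. A small guard can catch this early; normalizeVideoId is a hypothetical helper, not part of the code posted in this thread.

```javascript
// Hypothetical guard: accept either 'abc123' or { videoId: 'abc123' } and
// always hand a plain string to the API client.
function normalizeVideoId(input) {
  const id = typeof input === 'string' ? input : input?.videoId;
  if (typeof id !== 'string' || id.length === 0) {
    throw new TypeError('videoId must be a non-empty string');
  }
  return id;
}

console.log(normalizeVideoId('pyX8kQ-JzHI'));
console.log(normalizeVideoId({ videoId: 'pyX8kQ-JzHI' }));
```

Calling it at the top of getDefaultSubtitleLanguage would mean the id and videoId parameters always receive a string.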

@zaarheed

zaarheed commented Nov 23, 2024

@dfdeagle47 Legend 👑

This resolved the issue for me at transvribe, where Vercel's servers were being blocked by YouTube. I just exported your main() function and dropped it in as a replacement for my previous scraper implementation.

@debsouryadatta

// @ts-nocheck

import { youtube_v3 } from '@googleapis/youtube';
import axios from 'axios';
import { Buffer } from 'buffer';
import protobuf from 'protobufjs';
import dotenv from 'dotenv';

dotenv.config();

const youtubeClient = new youtube_v3.Youtube({
  auth: process.env.YOUTUBE_API_KEY,
});

/**
 * Helper function to encode a message into a base64-encoded protobuf
 * to be used with the YouTube InnerTube API.
 * @param {Object} message - The message to encode
 * @returns {String} - The base64-encoded protobuf message
 */
function getBase64Protobuf(message) {
  const root = protobuf.Root.fromJSON({
    nested: {
      Message: {
        fields: {
          param1: { id: 1, type: 'string' },
          param2: { id: 2, type: 'string' },
        },
      },
    },
  });
  const MessageType = root.lookupType('Message');

  const buffer = MessageType.encode(message).finish();

  return Buffer.from(buffer).toString('base64');
}

/**
 * Returns the default subtitle language of a video on YouTube.
 * @param {String} videoId
 * @returns {Promise<{ trackKind: String, language: String }>} - The default subtitle language and the track kind (e.g., 'asr' or 'standard').
 */
async function getDefaultSubtitleLanguage(videoId) {
  // Get video default language
  const videos = await youtubeClient.videos.list({
    part: ['snippet'],
    id: videoId,
    // chart: 'mostPopular',
    // maxResults: 1,
  });

  if (videos.data.items.length !== 1) {
    throw new Error(
      `Expected exactly one video for ID ${videoId}, found ${videos.data.items.length}`,
    );
  }

  const preferredLanguage =
    videos.data.items[0].snippet.defaultLanguage ||
    videos.data.items[0].snippet.defaultAudioLanguage;

  // Get available subtitles
  const subtitles = await youtubeClient.captions.list({
    part: ['snippet'],
    videoId: videoId,
  });

  if (subtitles.data.items.length < 1) {
    throw new Error(`No subtitles found for video: ${videoId}`);
  }

  const { trackKind, language } = (
    subtitles.data.items.find(
      (sub) => sub.snippet.language === preferredLanguage,
    ) || subtitles.data.items[0]
  ).snippet;

  return { trackKind, language };
}

/**
 * Helper function to extract text from certain elements.
 * Inspired by Invidious' extractors_utils.cr
 * https://github.com/iv-org/invidious/blob/384a8e200c953ed5be3ba6a01762e933fd566e45/src/invidious/yt_backend/extractors_utils.cr#L1-L30
 * @param {Object} item - The item to extract text from.
 * @returns {string} The extracted text.
 */
function extractText(item) {
  return item.simpleText || item.runs?.map((run) => run.text).join('');
}

/**
 * Function to retrieve subtitles for a given YouTube video.
 * @param {Object} options - The options for retrieving subtitles
 * @param {String} options.videoId - The ID of the video
 * @param {String} options.trackKind - The track kind of the subtitles (e.g., 'asr' or 'standard')
 * @param {String} options.language - The language of the subtitles
 * @returns {Promise<String>} - The subtitles of the video, concatenated into a single string
 */
async function getSubtitles({ videoId, trackKind, language }) {
  const message = {
    param1: videoId,
    param2: getBase64Protobuf({
      // Only include `trackKind` for automatically-generated subtitles
      param1: trackKind === 'asr' ? trackKind : null,
      param2: language,
    }),
  };

  const params = getBase64Protobuf(message);

  const url = 'https://www.youtube.com/youtubei/v1/get_transcript';
  const headers = { 'Content-Type': 'application/json' };
  const data = {
    context: {
      client: {
        clientName: 'WEB',
        clientVersion: '2.20240826.01.00',
      },
    },
    params,
  };

  const response = await axios.post(url, data, { headers });

  // Mapping inspired by Invidious' transcript.cr
  // https://github.com/iv-org/invidious/blob/432c25ad8626fee401b1f349b463515d21718ac8/src/invidious/videos/transcript.cr#L51-L101
  const initialSegments =
    response.data.actions[0].updateEngagementPanelAction.content
      .transcriptRenderer.content.transcriptSearchPanelRenderer.body
      .transcriptSegmentListRenderer.initialSegments;

  if (!initialSegments) {
    throw new Error(
      `Requested transcript does not exist for video: ${videoId}`,
    );
  }

  // Concatenate the text of every segment, each followed by ". "
  const subtitles = initialSegments
    .map((segment) => {
      const line =
        segment.transcriptSectionHeaderRenderer ||
        segment.transcriptSegmentRenderer;

      return `${extractText(line.snippet)}. `;
    })
    .join('');

  return subtitles;
}

//////////////
//////////////

export async function main(videoId) {
  try {
    const { language, trackKind } = await getDefaultSubtitleLanguage(
      videoId,
    );

    const subtitles = await getSubtitles({
      language,
      trackKind,
      videoId,
    });

    // console.log(subtitles);
    return subtitles;
  } catch (err) {
    console.error('Error:', err);
  }
}

// Video with ASR captions
// main('pyX8kQ-JzHI');
// Video with uploaded captions
// main('-16RFXr44fY');
// Video with multiple caption tracks (`defaultAudioLanguage: 'ru'`)
// main('qwQwSTWHTAY');

@pushkarsingh32 use this; it is working fine for me.

And yes, all thanks to @dfdeagle47. After trying so many approaches, it's finally working with this one 🫡
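
For anyone wondering what `getBase64Protobuf` actually produces: the `params` value is a small protobuf message with two string fields, base64-encoded, and the inner message is nested inside the outer one as a string. Below is a rough hand-rolled sketch of that encoding without `protobufjs`, assuming only string fields with IDs 1 and 2 as in the snippet above. The function names are hypothetical, and this is not a general protobuf encoder.

```javascript
// Encode a non-negative integer as a protobuf varint
// (7 bits per byte, least-significant group first).
function encodeVarint(value) {
  const bytes = [];
  let v = value;
  while (v > 0x7f) {
    bytes.push((v & 0x7f) | 0x80);
    v = Math.floor(v / 128);
  }
  bytes.push(v);
  return bytes;
}

// Encode one length-delimited (wire type 2) string field:
// tag, then payload length, then the UTF-8 bytes.
function encodeStringField(fieldNumber, text) {
  const payload = Buffer.from(text, 'utf8');
  const tag = (fieldNumber << 3) | 2;
  return [...encodeVarint(tag), ...encodeVarint(payload.length), ...payload];
}

// Hypothetical stand-in for getBase64Protobuf: fields that are
// null or undefined are omitted from the output.
function encodeMessageBase64({ param1, param2 }) {
  const bytes = [];
  if (param1 != null) bytes.push(...encodeStringField(1, param1));
  if (param2 != null) bytes.push(...encodeStringField(2, param2));
  return Buffer.from(bytes).toString('base64');
}

// Nested use, mirroring how getSubtitles builds `params`
// for an uploaded (non-ASR) English track.
const innerParams = encodeMessageBase64({ param1: null, param2: 'en' });
const outerParams = encodeMessageBase64({
  param1: 'pyX8kQ-JzHI',
  param2: innerParams,
});
console.log(innerParams); // "EgJlbg=="
console.log(outerParams);
```

This makes it easier to see why `trackKind` is only set for `'asr'`: leaving `param1` as `null` drops field 1 from the inner message entirely.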
