
Remove keys.php to use a tooltip? #18

Open
Benjamin-Loison opened this issue Sep 2, 2022 · 29 comments
Labels
enhancement (new feature or request) · incident · medium priority (a high-priority issue noticeable by the user, but they can still work around it) · medium (a task that should take less than a day to complete) · official instance

Comments

Benjamin-Loison commented Sep 2, 2022

As a form may be added in the future to let people share their YouTube Data API v3 developer keys, this webpage could be used for that purpose, even if a short advertisement for it could be added to index.php. #17 should be addressed before this issue, as adding keys may not be necessary with the current quota usage.

@Benjamin-Loison Benjamin-Loison added enhancement New feature or request good first issue Good for newcomers and removed good first issue Good for newcomers labels Sep 2, 2022
@Benjamin-Loison Benjamin-Loison added the high priority Issue disabling the user to use correctly the main features. label Oct 15, 2022
Benjamin-Loison commented Oct 15, 2022

As for the last two days the no-key service has been using more than the quota of all keys combined, this issue is prioritized.

Providing tools and tips for finding keys on the web may be useful.
All YouTube Data API v3 keys start with AIzaSyA, AIzaSyB, AIzaSyC or AIzaSyD; more details here.
Searched for instance "AIzaSyA" YouTube on Google (did "AIzaSy" YouTube, "AIzaSyA" YouTube, "AIzaSyB" YouTube, "AIzaSyC" YouTube and "AIzaSyD" YouTube).
On GitHub, searched AIzaSy and AIzaSyB (haven't done Code for both) and still have to do Issues for AIzaSyD (searching AIzaSyA gives other results).
Could also use other search engines, Stack Overflow (the keys I encountered on it were also treated; I also treated AIzaSyA, AIzaSyB, AIzaSyC and AIzaSyD, and I used an algorithm, but I don't know whether it handles edits on posts, as most keys are removed through edits), GitLab (doesn't seem to find anything with AIzaSy*)...
https://archive.ph/www.googleapis.com was also exploited. Specifying /youtube/v3/ seems not to work.
What about the Web Archive? https://web.archive.org/web/*/https://www.googleapis.com/youtube/v3/* https://archive.org/developers/
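Candidate strings harvested from such searches can be pre-filtered syntactically before spending an API request on them; a minimal Python sketch of that filter (the pattern matches the key format described above):

```python
import re

# "AIzaSy" + one letter in A-D + 32 base64url-style characters = 39 in total.
KEY_RE = re.compile(r'AIzaSy[A-D][a-zA-Z0-9-_]{32}')

def looks_like_key(candidate):
    # Cheap syntactic check before spending an API request on a candidate.
    return KEY_RE.fullmatch(candidate) is not None

print(looks_like_key('AIzaSyA' + 'x' * 32))  # True
print(looks_like_key('AIzaSyE' + 'x' * 32))  # False: letter after the prefix not in A-D
```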

Could show how to contribute with a short video with a Google account.

YouTube_Data_API_v3_key_web_scraper

Benjamin-Loison commented Oct 17, 2022

Such a tool would also be useful for the instance host, as it would allow them to cleanly add a YouTube Data API v3 key; currently this has to be done by hand, taking care not to corrupt keys.txt.

Benjamin-Loison commented Oct 19, 2022

Once done, change this Stack Overflow answer to propose my no-key service; but as it is currently running out of quota, I am not advertising it. Done.

@Benjamin-Loison Benjamin-Loison pinned this issue Oct 19, 2022
@Benjamin-Loison Benjamin-Loison added the medium A task that should take less than a day to complete. label Oct 19, 2022
Benjamin-Loison commented Oct 20, 2022

Note that a fresh instance will display for the no-key service: Currently this service is powered by 1 keys. and potentially a PHP warning #23 (screenshot). Could display a custom error message if someone tries to use the no-key service while it isn't powered by any YouTube Data API v3 key.

Related to #19.

Benjamin-Loison commented Oct 20, 2022

Could make metrics, such as checkQuotaLogs.txt and checkUnusualLogs.txt, public. Adding a metric for how much quota we consume per day would be interesting too.

For the moment I have added https://yt.lemnoslife.com/metrics/
Note that as the no-key service requires collecting multiple YouTube Data API v3 keys, I assume that sharing some WIP details on it isn't high priority.

#19 is somewhat blocking this.

Benjamin-Loison commented Oct 20, 2022

Can test many keys with this Python script:

`test_youtube_data_api_v3_keys.py`:
import requests
from tqdm import tqdm

# Assume keys to be unique.
with open('keys.txt') as f:
    keys = f.read().splitlines()

URL = 'https://www.googleapis.com/youtube/v3/channels'
params = {
    'forHandle': '@MrBeast',
    'fields': 'items/id',
}

workingKeys = []
for key in tqdm(keys):
    params['key'] = key
    data = requests.get(URL, params).json()
    try:
        # A key is considered working if it returns MrBeast's channel id.
        if data['items'][0]['id'] == 'UCX6OQ3DkcsbYNE6H8uQQuVA':
            workingKeys += [key]
    except KeyError:
        pass
    '''
    # Alternative check also listing keys whose quota is merely exceeded:
    quotaExceeded = 'quota' in str(data)
    if not 'error' in data or quotaExceeded:
        print(key, quotaExceeded)
    '''

print(workingKeys)
The equivalent parallel algorithm, to be faster:
import requests
from tqdm import tqdm
from multiprocessing import Pool

# Assume keys to be unique.
with open('keys.txt') as f:
    keys = f.read().splitlines()

URL = 'https://www.googleapis.com/youtube/v3/channels'

def testYouTubeDataApiV3Key(youTubeDataApiV3Key):
    params = {
        'forHandle': '@MrBeast',
        'fields': 'items/id',
        'key': youTubeDataApiV3Key,
    }
    data = requests.get(URL, params).json()
    try:
        if data['items'][0]['id'] == 'UCX6OQ3DkcsbYNE6H8uQQuVA':
            return youTubeDataApiV3Key
    except KeyError:
        return

with Pool(10) as p:
    workingKeys = set(tqdm(p.imap(testYouTubeDataApiV3Key, keys), total = len(keys)))
# discard rather than remove: remove raises KeyError when every key works.
workingKeys.discard(None)

print(workingKeys)

Benjamin-Loison commented Oct 20, 2022

I set up a test at 9:01 AM UTC+2 (as at 9:00 AM we aren't running out of quota anymore) to test all YouTube Data API v3 keys that have currently exceeded their quota. If all keys pass this test, then maybe keys with exceeded quota could be allowed to be added. However, someone could fill keys.txt with keys manually set to a 0 quota limit...

The test at 9:01 AM UTC+2 only returned exceeded quota. Will give it a try at 10:01 AM UTC+2; otherwise I should try every minute, and if a key never passes the test once, then it is definitely useless. Started the every-minute test for all keys at Sat Oct 22 17:34:23 CEST 2022. None of the keys were usable for a single request during 24 hours.
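The every-minute re-test can be sketched as follows (a hedged sketch; it reuses the same channels endpoint and parameters as the key-testing scripts in this thread, and any response containing items counts as a pass):

```python
import time

import requests

URL = 'https://www.googleapis.com/youtube/v3/channels'

def is_pass(data):
    # A response containing 'items' rather than an 'error' object means the
    # key answered one real request.
    return 'items' in data

def key_works(key):
    params = {'forHandle': '@MrBeast', 'fields': 'items/id', 'key': key}
    return is_pass(requests.get(URL, params).json())

def retest_every_minute(keys, hours=24):
    # Keys that never pass a single request within `hours` hours are
    # considered definitely useless.
    passed = set()
    for _ in range(hours * 60):
        for key in keys:
            if key not in passed and key_works(key):
                passed.add(key)
        time.sleep(60)
    return passed
```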

@Benjamin-Loison

Could advertise the possibility of sharing a YouTube Data API v3 key when the no-key service is running out of quota. This should be done at this line of code.

Benjamin-Loison commented Oct 24, 2022

Setting up a notification system for myself in case one or multiple check failures happen may make sense. Also adding to metrics the delta of logs since the last retrieval (requiring authentication). Could specify the error on False in order not to be notified every time it happens. Or couldn't we just download the last part of the file?

For the moment I added a notification system for each failure. However, if for some reason, such as insufficient disk space, the system becomes unable to write any more logs, my check doesn't take such an absence of additional logs into account.
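The blind spot mentioned above (no new log lines at all, e.g. because the disk is full) could be covered by also alerting when the log file stops growing; a minimal sketch, with an illustrative one-hour threshold:

```python
import os
import time

def log_has_grown(path, within_seconds=3600):
    # Returns False when the file is missing or has not been modified within
    # the window - e.g. because a full disk prevents writing new log lines.
    try:
        return time.time() - os.path.getmtime(path) < within_seconds
    except OSError:
        return False
```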

Benjamin-Loison commented Oct 30, 2022

Check the Apache 2 logs to see if some people shared their API keys by mistake.
Note that gunzip doesn't output anything to stdout; instead it decompresses and deletes the .gz compressed file. If you want the output on stdout without decompressing, use -c.

find -name 'yt.lemnoslife.com-ssl--access.log*'
(gunzip -c yt.lemnoslife.com-ssl--access.log.*.gz && cat yt.lemnoslife.com-ssl--access.log{,.1}) | grep AIzaSy | grep -v addKey

Note that the glob must not be quoted, otherwise the shell doesn't expand it. It is safe to pass non-existing files to the command above, as the warning goes to stderr and so isn't grepped; we just get cat: FILE: No such file or directory. As I execute the above command every time I archive the logs, at least filtering out the keys already used by the no-key service would make this process faster. This is the aim of the following algorithm:

searchKeysInLogs.py:
#!/usr/bin/python3

import os
import subprocess
import re

def execute(cmd):
    return subprocess.check_output(cmd, shell=True).decode('utf-8')

# Keys already powering the no-key service, to be filtered out below.
with open('/var/www/ytPrivate/keys.txt') as f:
    keys = set(f.read().splitlines())

path = '/var/log/apache2/'

os.chdir(path)

PREFIX = 'yt.lemnoslife.com-ssl--access.log'

cmd = f'(cat {PREFIX}.*.gz | gunzip && cat {PREFIX}.1 && cat {PREFIX}) | grep AIzaSy | grep -v addKey'
result = execute(cmd)
matches = re.findall(r'AIzaSy[A-D][a-zA-Z0-9-_]{32}', result)
uniqueMatches = set(matches)
keysToAdd = uniqueMatches - keys
print(keysToAdd)

This way I found 22 keys with quota (no others) by checking the latest website logs, and checked the same way my old VAIO laptop, my ASUS, my computer (including my 2, 3 and 6 TB hard disks), OC3K and the VPS itself. Maybe I haven't checked yt.lemnoslife.com-ssl--access.log everywhere, but hey, I searched enough.

Benjamin-Loison commented Nov 1, 2022

When adding a new key, make sure to make a backup, as if there isn't any space left on the device, we lose them all. It just happened... Adding a tool to monitor disk space usage would make sense.
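A sketch of a safer way to append a key, assuming the keys.txt path used elsewhere in this thread: keep a backup copy and write through a temporary file, so that a full disk aborts the write without touching the existing keys.txt.

```python
import os
import shutil

def add_key_safely(key, path='/var/www/ytPrivate/keys.txt'):
    backup = path + '.bak'
    tmp = path + '.tmp'
    shutil.copy2(path, backup)         # keep the previous version around
    with open(path) as f:
        content = f.read()
    with open(tmp, 'w') as f:
        f.write(content + key + '\n')  # a full disk makes this write fail...
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)              # ...leaving keys.txt untouched (atomic on POSIX)
```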

https://yt.lemnoslife.com/noKey/videos?part=snippet&id=B-gHb2gPGIs returns for instance:

The request is missing a valid API key.:
{
  "error": {
    "code": 403,
    "message": "The request is missing a valid API key.",
    "errors": [
      {
        "message": "The request is missing a valid API key.",
        "domain": "global",
        "reason": "forbidden"
      }
    ],
    "status": "PERMISSION_DENIED"
  }
}

Benjamin-Loison commented Nov 1, 2022

Incident temporarily resolved, as I brought back a set of keys, but haven't restored all keys yet.
As I found on my 6 TB hard disk my IP making 60 calls to addKey.php between 20/Oct/2022:23:52:51 +0200 and 21/Oct/2022:00:09:34 +0200, I guess I found the set of keys that was deleted, as I claimed on Discord to have added 29 keys on 21 Oct at 00:50 AM. Note that the last time I modified this post to add information about progress was on Oct 21, 2022, 12:49 AM GMT+2. In addition, after running the following algorithm for these calls to addKey.php, I added 21 keys (+ 3 manually added due to quota consumption).

import requests

def getURLContent(url):
    return requests.get(url).text

# keys: the keys recovered from the logs above.
keys = [...]

for key in keys:
    print(key)
    url = f'https://yt.lemnoslife.com/addKey.php?key={key}'
    result = getURLContent(url)
    print(result)

@Benjamin-Loison Benjamin-Loison pinned this issue Nov 2, 2022
@Benjamin-Loison

Isn't there a way in PHP to keep a variable alive across user HTTPS requests? That way we wouldn't read and write a file every time we switch from one key to another, and we wouldn't have faced this problem.

@Benjamin-Loison Benjamin-Loison removed the high priority Issue disabling the user to use correctly the main features. label Nov 5, 2022
@Benjamin-Loison Benjamin-Loison added the medium priority A high-priority issue noticeable by the user but he can still work around it. label Nov 5, 2022
@Benjamin-Loison Benjamin-Loison unpinned this issue Nov 5, 2022
Benjamin-Loison commented Nov 10, 2022

Note that the disk space seems mostly used by errors in yt.lemnoslife.com-ssl--error.log, which weighs more than 8 times as much as yt.lemnoslife.com-ssl--access.log, related to #23.

Example of filled logs (file size decreasing order):

File                                     Size (MB)  Lines
yt.lemnoslife.com-ssl--error.log.1       1,500      8,319,186
yt.lemnoslife.com-ssl--access.log.1      131.8      509,956
yt.lemnoslife.com-ssl--error.log.2.gz    86.9       6,513,427
yt.lemnoslife.com-ssl--access.log.2.gz   12.1       398,512

yt.lemnoslife.com-ssl--*.1 were filled from 09/Nov/2022:00:01:05 +0100 to 10/Nov/2022:00:44:31 +0100 (~24 hours).
yt.lemnoslife.com-ssl--*.log.2.gz were filled from 08/Nov/2022:00:38:57 +0100 to 09/Nov/2022:00:01:02 +0100 (~24 hours).

Moved from LogLevel debug to LogLevel info ssl:warn in /etc/apache2/sites-available/ssl.yt.lemnoslife.com.conf. See the LogLevel documentation.
After a service apache2 restart, it seems that nothing is written to yt.lemnoslife.com-ssl--error.log anymore. I guess that means there isn't any error with the many requests that I still see in yt.lemnoslife.com-ssl--access.log.
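For reference, the directive as it could look in the vhost configuration (a sketch; the rest of the file is unchanged):

```
# /etc/apache2/sites-available/ssl.yt.lemnoslife.com.conf
# was: LogLevel debug
LogLevel info ssl:warn
```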

Have to wait for the logs to be rotated, then download and use the fresh empty files, to see whether my modification was a good change.

@Benjamin-Loison

From Google account credentials, could a YouTube Data API v3 key be generated from a random project just by using curl? I think that due to 2FA (enabled by default with Google) etc. it isn't worth it.

@Benjamin-Loison

May think about re-implementing some of the YouTube Data API v3 features by reverse-engineering the YouTube UI, if we aren't able to cope with the many quota-consuming requests to the no-key service.

@Benjamin-Loison

Could add an email address linked to the added key, in case the key holder needs to be contacted about a future policy change.

@Benjamin-Loison

Could use a supervariable persisting from one HTTPS request to the next, or something like that, to avoid reading a file on each request (to count the no-key service keys or get the git commit version used, for instance), or could at least simplify the file content down to what we really need, like:

$keysCountFile = '/var/www/ytPrivate/keysCount.txt';
$keysCount = file_get_contents($keysCountFile);

@Benjamin-Loison

As described in #48, proceeded at 11:40 PM UTC+1 to logrotate --force /etc/logrotate.d/apache2.

@Benjamin-Loison

Next time we really run out of quota, advertise with an @everyone on both Matrix and Discord to empower the no-key service.

@Benjamin-Loison

Should add a mechanism to addKey.php to add the keys on all controlled instances.
Maybe just having the instance the end-user interacts with call addKey.php on the other controlled instances would do the job.
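A hedged sketch of that forwarding, reusing the addKey.php endpoint already used in this thread (the instance list is illustrative):

```python
import requests

# Hypothetical list of the other controlled instances.
OTHER_INSTANCES = ['https://yt.lemnoslife.com']

def add_key_url(instance, key):
    # addKey.php is the endpoint already used earlier in this thread.
    return f'{instance}/addKey.php?key={key}'

def propagate_key(key):
    # Forward the submitted key to every other controlled instance.
    for instance in OTHER_INSTANCES:
        try:
            response = requests.get(add_key_url(instance, key), timeout=10)
            print(instance, response.status_code)
        except requests.RequestException as exception:
            print(instance, 'failed:', exception)
```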

@Benjamin-Loison

At 20:43 I got:

The YouTube operational API no-key service is detected as not working!

Just following this event, I tested the no-key endpoint on the three instances and everything was working fine. Logging what exactly went wrong could be interesting in case it happens again.
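A hedged sketch of such a check that keeps the evidence, reusing the no-key endpoint tested above (the log path is illustrative):

```python
import datetime

import requests

# The no-key endpoint already used earlier in this thread for manual testing.
ENDPOINT = 'https://yt.lemnoslife.com/noKey/videos?part=snippet&id=B-gHb2gPGIs'

def is_ok(status_code, data):
    # The verdict previously reduced to a boolean, kept separate so the raw
    # response can be logged alongside it.
    return status_code == 200 and 'error' not in data

def check_no_key_service(log_path='noKeyCheck.log'):
    try:
        response = requests.get(ENDPOINT, timeout=30)
        ok = is_ok(response.status_code, response.json())
        body = response.text
    except (requests.RequestException, ValueError) as exception:
        ok, body = False, str(exception)
    if not ok:
        # Keep the evidence instead of only the boolean verdict.
        with open(log_path, 'a') as f:
            f.write(f'{datetime.datetime.now().isoformat()} {body}\n')
    return ok
```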

@Benjamin-Loison

Once I have access to the moderator tools privilege on Stack Overflow, I could run the above algorithms again to search for additional leaked YouTube Data API v3 keys.

@Benjamin-Loison

Could also make the web-server-log search for YouTube Data API v3 keys run on private instances, as not all of their users seem to be comfortable with this subject.

Benjamin-Loison commented Mar 2, 2024

Should clean up inter-instance key synchronization and the synchronization of the other instances; otherwise, disabling the ability for anyone to provide a key seems to make sense.

Benjamin-Loison commented Aug 22, 2024

Projects that enable the YouTube Data API have a default quota allocation of 1 million units per day

Note that projects that had enabled the YouTube Data API before April 20, 2016, have a different default quota for that API.

https://web.archive.org/web/20160828004328/https://developers.google.com/youtube/v3/getting-started

https://web.archive.org/web/20160404033352/https://developers.google.com/youtube/v3/getting-started is the most recent snapshot prior to April 20, 2016, but it does not mention how much quota is provided by default.

@Benjamin-Loison

Does the API explorer provide unlimited quota?

curl -s "https://content-youtube.googleapis.com/youtube/v3/search?part=snippet&q=test&key=AIzaSyBXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
Output:
{
  "error": {
    "code": 403,
    "message": "Requests from referer \u003cempty\u003e are blocked.",
    "errors": [
      {
        "message": "Requests from referer \u003cempty\u003e are blocked.",
        "domain": "global",
        "reason": "forbidden"
      }
    ],
    "status": "PERMISSION_DENIED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.ErrorInfo",
        "reason": "API_KEY_HTTP_REFERRER_BLOCKED",
        "domain": "googleapis.com",
        "metadata": {
          "consumer": "projects/292824132082",
          "service": "youtube.googleapis.com"
        }
      }
    ]
  }
}
minimizeCURL curl.sh 'youtube#searchResult'
Output:
Initial command length: 1,158.
Removing headers
Command with length 1,069 is still fine.
Command with length 1,052 is still fine.
Command with length 1,015 is still fine.
Command with length 969 is still fine.
Command with length 795 is still fine.
Command with length 757 is still fine.
Command with length 681 is still fine.
Command with length 632 is still fine.
Command with length 582 is still fine.
Command with length 570 is still fine.
Command with length 554 is still fine.
Command with length 526 is still fine.
Command with length 292 is still fine.
Command with length 265 is still fine.
Command with length 239 is still fine.
Command with length 206 is still fine.
Command with length 188 is still fine.
Removing URL parameters
Command with length 175 is still fine.
Command with length 168 is still fine.
Removing cookies
Removing raw data
curl 'https://content-youtube.googleapis.com/youtube/v3/search?key=AIzaSyBXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' -H 'X-Origin: https://explorer.apis.google.com'

https://console.cloud.google.com/apis/api/youtube.googleapis.com/quotas?project=my-project-XXXXXXXXXXXXX is not up-to-date in real time, so let us make as many requests as possible and count them.

@Benjamin-Loison

Maybe it expires quickly, but thanks to web scraping one can easily recreate one.

Benjamin-Loison commented Oct 23, 2024

counter=0
while true
do
    echo "counter: $counter"
    curl -s "https://content-youtube.googleapis.com/youtube/v3/search?part=snippet&q=$counter&key=AIzaSyBXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" -H 'X-Origin: https://explorer.apis.google.com' | jq '.items | length'
    ((counter++))
    #break
done

leads to counter reaching many hundreds while the returned length is still the default 5.

Same with https://www.googleapis.com/youtube/v3/search.
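For scale, a hedged sanity check, assuming the currently documented defaults of 10,000 quota units per day per project and 100 units per search.list call:

```python
# Assumed current documented defaults:
DEFAULT_DAILY_QUOTA = 10_000  # units per day per project
SEARCH_COST = 100             # units per search.list call

max_searches_per_day = DEFAULT_DAILY_QUOTA // SEARCH_COST
print(max_searches_per_day)  # 100
```

So many hundreds of successful search calls would already exceed a default-quota project, suggesting the API explorer key is not bound by that default.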

If necessary, could also investigate OAuth, and maybe use one account for each of these 4 cases (OAuth or key, times the two URLs) because of the quota display delay.
