Most web endpoints have switched to wbi and now require w_rid #631

Closed
o0HalfLife0o opened this issue Mar 3, 2023 · 43 comments
Labels
Add (new or modified content), Discussions (discussion of related topics), Misc (API: other main-site categories)

Comments

@o0HalfLife0o

The verifyString is a bit complicated:
https://github.com/DIYgod/RSSHub/blob/2b233e878c11cc4660aa2066d4a91aa9501ea97e/lib/v2/bilibili/utils.js#LL8-L13C3

const addVerifyInfo = (params, verifyString) => {
    const md5 = crypto.createHash('md5');
    const wts = Math.round(Date.now() / 1000);
    const w_rid = md5.update(`${params}&wts=${wts}${verifyString}`).digest('hex');
    return `${params}&w_rid=${w_rid}&wts=${wts}`;
};

https://github.com/DIYgod/RSSHub/blob/2b233e878c11cc4660aa2066d4a91aa9501ea97e/lib/v2/bilibili/cache.js#LL11-L43

    getVerifyString: (ctx) => {
        const key = 'bili-verify-string';
        return ctx.cache.tryGet(key, async () => {
            const cookie = await module.exports.getCookie(ctx);
            const { data: navResponse } = await got('https://api.bilibili.com/x/web-interface/nav', {
                headers: {
                    Referer: 'https://www.bilibili.com/',
                    Cookie: cookie,
                },
            });
            const imgUrl = navResponse.data.wbi_img.img_url;
            const subUrl = navResponse.data.wbi_img.sub_url;
            const r = imgUrl.substring(imgUrl.lastIndexOf('/') + 1, imgUrl.length).split('.')[0] + subUrl.substring(subUrl.lastIndexOf('/') + 1, subUrl.length).split('.')[0];
            const { body: spaceResponse } = await got('https://space.bilibili.com/1', {
                headers: {
                    Referer: 'https://www.bilibili.com/',
                    Cookie: cookie,
                },
            });
            const jsUrl = 'https:' + spaceResponse.match(/[^"]*9.space[^"]*/);
            const { body: jsResponse } = await got(jsUrl, {
                headers: {
                    Referer: 'https://space.bilibili.com/1',
                },
            });
            const array = JSON.parse(jsResponse.match(/\[(?:\d+,){63}\d+\]/));
            const o = [];
            array.forEach((t) => {
                r.charAt(t) && o.push(r.charAt(t));
            });
            return o.join('').slice(0, 32);
        });
    },
@12345-mcpython

12345-mcpython commented Mar 22, 2023

@SocialSisterYi
Owner

Does "wbi" have any particular meaning?

@z0z0r4
Collaborator

z0z0r4 commented May 19, 2023

So it can be generated, then?

@z0z0r4
Collaborator

z0z0r4 commented May 19, 2023

Is login required for this wbi?

@o0HalfLife0o
Author

o0HalfLife0o commented May 19, 2023

Is login required for this wbi?

No. From what I observed, the two values returned by the user info API are the same whether or not you are logged in.

@Drelf2018

https://s1.hdslb.com/bfs/static/laputa-home/client/assets/vendor.7679ec63.js

image

image

image

I can't make sense of this. Where does the data in fe and he come from? It looks like video covers? And what is ids in ge, and is it required?

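@SocialSisterYi
Owner

Investigation shows that the w_rid parameter is derived from the $.data.wbi_img.img_url and $.data.wbi_img.sub_url fields returned by the API https://api.bilibili.com/x/web-interface/nav, and that these two values do not depend on external factors such as IP or cookies.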

image

@z0z0r4
Collaborator

z0z0r4 commented May 20, 2023

The data returned by https://api.bilibili.com/x/web-interface/nav is a fixed value at any given time,

and it still has not changed after three hours.

 ⚡ root@VM-4-4-debian  ~  curl https://api.bilibili.com/x/web-interface/nav
{"code":-101,"message":"账号未登录","ttl":1,"data":{"isLogin":false,"wbi_img":{"img_url":"https://i0.hdslb.com/bfs/wbi/653657f524a547ac981ded72ea172057.png","sub_url":"https://i0.hdslb.com/bfs/wbi/6e4909c702f846728e64f6007736a338.png"}}}
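The two key strings used later in the thread are simply the file names in img_url and sub_url with the extension dropped. A minimal sketch of that extraction, applied to a trimmed copy of the response above; the helper names key_from_url and wbi_keys are made up for illustration and do not come from any library:

```python
from pathlib import PurePosixPath
from typing import Tuple
from urllib.parse import urlparse


def key_from_url(url: str) -> str:
    """Return the file name of the URL path without its extension."""
    return PurePosixPath(urlparse(url).path).stem


def wbi_keys(nav_json: dict) -> Tuple[str, str]:
    """Extract (img_key, sub_key) from a parsed /x/web-interface/nav response."""
    wbi_img = nav_json["data"]["wbi_img"]
    return key_from_url(wbi_img["img_url"]), key_from_url(wbi_img["sub_url"])


# Trimmed copy of the anonymous response shown above (code -101, not logged in).
sample = {
    "code": -101,
    "data": {
        "isLogin": False,
        "wbi_img": {
            "img_url": "https://i0.hdslb.com/bfs/wbi/653657f524a547ac981ded72ea172057.png",
            "sub_url": "https://i0.hdslb.com/bfs/wbi/6e4909c702f846728e64f6007736a338.png",
        },
    },
}

img_key, sub_key = wbi_keys(sample)
print(img_key, sub_key)
```

Note that wbi_img is present even though the request was anonymous, which matches the observation that login is not required to obtain the keys.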

@z0z0r4
Collaborator

z0z0r4 commented May 20, 2023

Python demo By Drelf2018 Nemo2011/bilibili-api#290 (comment) and z0z0r4

import hashlib
import time
from functools import reduce

import httpx

HEADERS = {"User-Agent": "Mozilla/5.0", "Referer": "https://www.bilibili.com"}


def getMixinKey(ae):
    oe = [46, 47, 18, 2, 53, 8, 23, 32, 15, 50, 10, 31, 58, 3, 45, 35, 27, 43, 5, 49, 33, 9, 42, 19, 29, 28, 14, 39, 12, 38, 41,
          13, 37, 48, 7, 16, 24, 55, 40, 61, 26, 17, 0, 1, 60, 51, 30, 4, 22, 25, 54, 21, 56, 59, 6, 63, 57, 62, 11, 36, 20, 34, 44, 52]
    le = reduce(lambda s, i: s + ae[i], oe, "")
    return le[:32]

def encWbi(params: dict):
    resp = httpx.get("https://api.bilibili.com/x/web-interface/nav")
    wbi_img: dict = resp.json()["data"]["wbi_img"]
    img_url: str = wbi_img.get("img_url")
    sub_url: str = wbi_img.get("sub_url")
    img_value = img_url.split("/")[-1].split(".")[0]
    sub_value = sub_url.split("/")[-1].split(".")[0]
    me = getMixinKey(img_value + sub_value)
    wts = int(time.time())
    params["wts"] = wts
    Ae = "&".join([f'{key}={value}' for key, value in params.items()])
    w_rid = hashlib.md5((Ae + me).encode(encoding='utf-8')).hexdigest()
    return w_rid, wts

if __name__ == "__main__":
    w_rid, wts = encWbi({"mid": 558830935})

@12345-mcpython

demo: https://github.com/12345-mcpython/bilibili-console/tree/main/bilibili/utils.py

# ps=5 is the parameter you want to pass to the API
"https://api.bilibili.com/x/web-interface/wbi/index/top/feed/rcmd?" + encrypt_wbi("ps=5")

@z0z0r4
Collaborator

z0z0r4 commented May 20, 2023

https://api.bilibili.com/x/web-interface/nav

It still has not changed so far. Maybe the period is one day?

@SocialSisterYi
Owner

https://api.bilibili.com/x/web-interface/nav

It still has not changed so far. Maybe the period is one day?

It has changed now.

@z0z0r4
Collaborator

z0z0r4 commented May 20, 2023

The data returned by https://api.bilibili.com/x/web-interface/nav is a fixed value at any given time,

and it still has not changed after three hours.

 ⚡ root@VM-4-4-debian  ~  curl https://api.bilibili.com/x/web-interface/nav
{"code":-101,"message":"账号未登录","ttl":1,"data":{"isLogin":false,"wbi_img":{"img_url":"https://i0.hdslb.com/bfs/wbi/653657f524a547ac981ded72ea172057.png","sub_url":"https://i0.hdslb.com/bfs/wbi/6e4909c702f846728e64f6007736a338.png"}}}

Hold on, it has not changed at all, has it? Could you post the actual values?
image

@z0z0r4
Collaborator

z0z0r4 commented May 21, 2023

Still the same value

image

@z0z0r4
Collaborator

z0z0r4 commented May 21, 2023

Never mind, I'm heading back to school in two hours anyway :(

@PaulDuanGitHub

Python demo By Drelf2018 Nemo2011/bilibili-api#290 (comment) and z0z0r4


Do the parameters need to be sorted before Ae is computed? I just tested an endpoint with multiple parameters and got a different w_rid.
image

@SocialSisterYi
Owner

Python demo By Drelf2018 Nemo2011/bilibili-api#290 (comment) and z0z0r4


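Do the parameters need to be sorted before Ae is computed? I just tested an endpoint with multiple parameters and got a different w_rid. image

Yes, they need to be sorted, similar to the APP sign algorithm.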
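To illustrate why sorting matters, here is a minimal sketch of a signing function that sorts the parameters by key before hashing, so the same parameters give the same w_rid regardless of the order in which they were supplied. The function names mixin_key and sign are illustrative, the timestamp is passed in explicitly to keep the example deterministic, and the keys are the sample values posted earlier in this thread. The sketch omits the filtering of the characters "!'()*" from values that appears in later demos.

```python
import hashlib
from urllib.parse import urlencode

# Index table taken from the Python demo earlier in this thread.
MIXIN_KEY_ENC_TAB = [
    46, 47, 18, 2, 53, 8, 23, 32, 15, 50, 10, 31, 58, 3, 45, 35, 27, 43, 5, 49,
    33, 9, 42, 19, 29, 28, 14, 39, 12, 38, 41, 13, 37, 48, 7, 16, 24, 55, 40,
    61, 26, 17, 0, 1, 60, 51, 30, 4, 22, 25, 54, 21, 56, 59, 6, 63, 57, 62, 11,
    36, 20, 34, 44, 52,
]


def mixin_key(img_key: str, sub_key: str) -> str:
    """Shuffle img_key + sub_key with the index table and keep 32 characters."""
    raw = img_key + sub_key
    return "".join(raw[i] for i in MIXIN_KEY_ENC_TAB)[:32]


def sign(params: dict, img_key: str, sub_key: str, wts: int) -> dict:
    """Return the parameters sorted by key, with wts and w_rid added."""
    merged = dict(params, wts=wts)          # wts takes part in the sort
    ordered = dict(sorted(merged.items()))  # sort by key
    query = urlencode(ordered)
    digest = hashlib.md5((query + mixin_key(img_key, sub_key)).encode()).hexdigest()
    ordered["w_rid"] = digest
    return ordered


# Sample keys from the nav response posted earlier in the thread.
IMG_KEY = "653657f524a547ac981ded72ea172057"
SUB_KEY = "6e4909c702f846728e64f6007736a338"

a = sign({"mid": 558830935, "ps": 30, "pn": 1}, IMG_KEY, SUB_KEY, wts=1684746387)
b = sign({"pn": 1, "ps": 30, "mid": 558830935}, IMG_KEY, SUB_KEY, wts=1684746387)
print(a["w_rid"] == b["w_rid"])  # True: insertion order no longer matters
```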

@SocialSisterYi
Owner

SocialSisterYi commented May 22, 2023

I implemented Wbi Sign generation in Python (without fetching the img_url and sub_url fields)

image

@My-Responsitories

My-Responsitories commented May 22, 2023

You can check the web archive at https://web.archive.org/web/%2A/https://api.bilibili.com/x/web-interface/nav to see whether img_url and sub_url have changed. From a rough look at the May snapshots, they change about every 3 days.
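Because the keys rotate only occasionally (the estimates in this thread range from daily to every few days), a client does not need to call the nav endpoint before every request. A minimal caching sketch, assuming a one-hour time-to-live chosen arbitrarily; WbiKeyCache and its parameters are hypothetical names, and the fetch function is injected so the example runs without network access:

```python
import time
from typing import Callable, List, Optional, Tuple

Keys = Tuple[str, str]


class WbiKeyCache:
    """Cache (img_key, sub_key) and refetch once the TTL has elapsed."""

    def __init__(
        self,
        fetch: Callable[[], Keys],
        ttl_seconds: float = 3600.0,  # arbitrary choice, not from the thread
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._keys: Optional[Keys] = None
        self._fetched_at = 0.0

    def get(self) -> Keys:
        now = self._clock()
        if self._keys is None or now - self._fetched_at >= self._ttl:
            self._keys = self._fetch()
            self._fetched_at = now
        return self._keys

    def invalidate(self) -> None:
        """Force a refetch on the next get(), e.g. after a signature is rejected."""
        self._keys = None


# Demonstration with a fake fetcher and a fake clock.
calls: List[int] = []
now = [0.0]


def fake_fetch() -> Keys:
    calls.append(1)
    return ("img%d" % len(calls), "sub%d" % len(calls))


cache = WbiKeyCache(fake_fetch, ttl_seconds=10.0, clock=lambda: now[0])
first = cache.get()   # fetches
second = cache.get()  # served from the cache
now[0] = 11.0
third = cache.get()   # TTL elapsed, fetches again
print(first, second, third, len(calls))
```

In real use the fetch argument would be a function that calls the nav endpoint and returns the two keys.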

@o0HalfLife0o
Author

I implemented Wbi Sign generation in Python (without fetching the img_url and sub_url fields)

image

Please post the code so I can learn from it

@SocialSisterYi
Owner

I implemented Wbi Sign generation in Python (without fetching the img_url and sub_url fields)

image

Please post the code so I can learn from it

For the full documentation and sample code, please wait for the next commit

@SocialSisterYi SocialSisterYi added the Discussions (discussion of related topics) label May 23, 2023
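@AronTK

AronTK commented May 23, 2023

At line 32, where the encWbi function is called
image
Thanks a lot! The scrape works now!

My code follows the demo, so why does the scrape return "账号未登录" (account not logged in)? The code is as follows: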
import hashlib
import time
from functools import reduce

import httpx

HEADERS = {"User-Agent": "Mozilla/5.0", "Referer": "https://www.bilibili.com/"}

def getMixinKey(ae):
    oe = [46, 47, 18, 2, 53, 8, 23, 32, 15, 50, 10, 31, 58, 3, 45, 35, 27, 43, 5, 49, 33, 9, 42, 19, 29, 28, 14, 39, 12, 38, 41,
          13, 37, 48, 7, 16, 24, 55, 40, 61, 26, 17, 0, 1, 60, 51, 30, 4, 22, 25, 54, 21, 56, 59, 6, 63, 57, 62, 11, 36, 20, 34, 44, 52]
    le = reduce(lambda s, i: s + ae[i], oe, "")
    return le[:32]

def encWbi(params: dict):
    resp = httpx.get("https://api.bilibili.com/x/web-interface/nav", headers=HEADERS)
    wbi_img: dict = resp.json()["data"]["wbi_img"]
    img_url: str = wbi_img.get("img_url")
    sub_url: str = wbi_img.get("sub_url")
    img_value = img_url.split("/")[-1].split(".")[0]
    sub_value = sub_url.split("/")[-1].split(".")[0]
    me = getMixinKey(img_value + sub_value)
    wts = int(time.time())
    params["wts"] = wts
    params = dict(sorted(params.items()))
    Ae = "&".join([f'{key}={value}' for key, value in params.items()])
    w_rid = hashlib.md5((Ae + me).encode(encoding='utf-8')).hexdigest()
    return w_rid, wts

if __name__ == "__main__":
    mid = 452606628
    url_params = {
        "mid": mid,
        "platform": "web",
        "web_location": "1550101"
    }
    # Set the token field to a valid login token or remove it to use cookies for authentication
    url_params["token"] = "your_login_token_here"

    w_rid, wts = encWbi(url_params)

    params = {
        "w_rid": w_rid,
        "wts": wts
    }

    params.update(url_params)
    print(w_rid, wts)
    resp = httpx.get(
        url="https://api.bilibili.com/x/web-interface/nav",
        params=params,
        headers=HEADERS
    )

    print(resp.url, resp.json())

@smellsee

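@AronTK Your request headers do not include a cookie, so you are effectively scraping while logged out, and of course the response is "账号未登录" (account not logged in). Look at a mature crawler implementation for reference and adapt your code.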
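A minimal sketch of the point being made here: the login state comes from the Cookie header, not from the signature. It assumes the session is carried in a SESSDATA cookie, as a later comment in this thread suggests; build_headers and is_logged_in are illustrative names, and the sample payload is a trimmed copy of the anonymous nav response posted earlier.

```python
from typing import Dict, Optional


def build_headers(sessdata: Optional[str] = None) -> Dict[str, str]:
    """Build request headers; attach the session cookie only when one is given."""
    headers = {
        "User-Agent": "Mozilla/5.0",
        "Referer": "https://www.bilibili.com/",
    }
    if sessdata:
        headers["Cookie"] = "SESSDATA=" + sessdata
    return headers


def is_logged_in(nav_json: dict) -> bool:
    """Read the isLogin flag from a parsed /x/web-interface/nav response."""
    return bool(nav_json.get("data", {}).get("isLogin", False))


# Trimmed copy of the anonymous response shown earlier in the thread.
anonymous_nav = {"code": -101, "message": "账号未登录", "data": {"isLogin": False}}

print("Cookie" in build_headers())               # False: anonymous request
print("Cookie" in build_headers("placeholder"))  # True: cookie attached
print(is_logged_in(anonymous_nav))               # False
```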

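@babyfengfjx

babyfengfjx commented May 23, 2023

Solved a big problem

The crawler I wrote had been running fine for more than half a year, and then recently the data suddenly started coming back wrong. It took a while to find this project, and it was a real lifesaver: everything runs again. I mainly followed the Wbi signing (wbi签名) document.

Reference code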

from functools import reduce
from hashlib import md5
import urllib.parse
import time
import requests

mixinKeyEncTab = [
    46, 47, 18, 2, 53, 8, 23, 32, 15, 50, 10, 31, 58, 3, 45, 35, 27, 43, 5, 49,
    33, 9, 42, 19, 29, 28, 14, 39, 12, 38, 41, 13, 37, 48, 7, 16, 24, 55, 40,
    61, 26, 17, 0, 1, 60, 51, 30, 4, 22, 25, 54, 21, 56, 59, 6, 63, 57, 62, 11,
    36, 20, 34, 44, 52
]

def getMixinKey(orig: str):
    'Shuffle the characters of imgKey + subKey according to mixinKeyEncTab'
    return reduce(lambda s, i: s + orig[i], mixinKeyEncTab, '')[:32]

def encWbi(params: dict, img_key: str, sub_key: str):
    'Sign the request parameters with wbi'
    mixin_key = getMixinKey(img_key + sub_key)
    curr_time = round(time.time())
    params['wts'] = curr_time                                   # add the wts field
    params = dict(sorted(params.items()))                       # sort the parameters by key
    # strip the characters "!'()*" from each value
    params = {
        k : ''.join(filter(lambda chr: chr not in "!'()*", str(v)))
        for k, v
        in params.items()
    }
    query = urllib.parse.urlencode(params)                      # serialize the parameters
    wbi_sign = md5((query + mixin_key).encode()).hexdigest()    # compute w_rid
    params['w_rid'] = wbi_sign
    return params

def getWbiKeys():
    'Fetch the latest img_key and sub_key'
    resp = requests.get('https://api.bilibili.com/x/web-interface/nav')
    resp.raise_for_status()
    json_content = resp.json()
    img_url: str = json_content['data']['wbi_img']['img_url']
    sub_url: str = json_content['data']['wbi_img']['sub_url']
    img_key = img_url.rsplit('/', 1)[1].split('.')[0]
    sub_key = sub_url.rsplit('/', 1)[1].split('.')[0]
    return img_key, sub_key


def get_query(**parameters):
    """
    Return the signed query string
    """
    img_key, sub_key = getWbiKeys()
    signed_params = encWbi(
        params=parameters,
        img_key=img_key,
        sub_key=sub_key
    )
    query = urllib.parse.urlencode(signed_params)
    return query

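This code returns the signed query string. Following the project's other existing endpoints, pass the parameters you previously sent to an endpoint into get_query to obtain the signed query string, then append it directly to the original endpoint URL.

Usage demo:

    def getpageinfo(page):
        query = get_query(mid=137324885, ps=30, pn=page)
        url_getvideo = f'https://api.bilibili.com/x/space/wbi/arc/search?{query}'
        print(url_getvideo)
        try:
            # headers is assumed to be defined elsewhere (not shown in the original post)
            videoinfo = requests.get(url_getvideo, headers=headers).json()  # fetch the video info for the current page
            print(videoinfo)
            videoinfo = videoinfo['data']['list']['vlist']
        except Exception as e:
            videoinfo = None
        return videoinfo

This is mainly an example of using the endpoint that lists all videos of a Bilibili uploader. With the approach above, everything works happily again.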

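@aaa1115910
Contributor

The related documentation has been committed. It describes the complete generation of wts and w_rid and comes with demos in Python and JavaScript.

In addition, further investigation found that img_url and sub_url are refreshed daily, not every three days.

05ac3d5

There is an error in the Wbi signing (Wbi签名) document: the algorithm description says to sort the parameters first and then add wts, but wts should actually be sorted together with the other parameters. The sample code below the description gets this right.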
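The difference between the two orderings only shows up when some parameter name sorts after "wts". A small sketch that builds both query strings; the parameter y_num is hypothetical and chosen only because it sorts after "wts", and both function names are illustrative:

```python
from urllib.parse import urlencode


def query_with_wts_sorted(params: dict, wts: int) -> str:
    """Add wts first, then sort everything by key (the order the thread calls correct)."""
    merged = dict(params, wts=wts)
    return urlencode(sorted(merged.items()))


def query_with_wts_appended(params: dict, wts: int) -> str:
    """Sort first, then append wts at the end (the order the document described)."""
    return urlencode(sorted(params.items()) + [("wts", wts)])


params = {"y_num": 2, "mid": 1}  # y_num is a made-up parameter name
good = query_with_wts_sorted(params, 1684746387)
bad = query_with_wts_appended(params, 1684746387)

print(good)  # mid=1&wts=1684746387&y_num=2
print(bad)   # mid=1&y_num=2&wts=1684746387
```

Since the MD5 is taken over the query string, the two orderings produce different w_rid values whenever the strings differ. For parameter sets in which every key sorts before "wts", both approaches happen to give the same string.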

@z0z0r4
Collaborator

z0z0r4 commented May 24, 2023

Question: are parameters such as token and web_locate required for wbi?

Nemo2011/bilibili-api#301 (comment)

Is it necessary to pass every parameter the web client sends, without omitting any?

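@RayWangQvQ
Contributor

Thanks for the reverse engineering and analysis. I wrote an implementation following the documentation and it passes verification.

C# version for reference: https://github.com/RayWangQvQ/BiliBiliToolPro/blob/main/src/Ray.BiliBiliTool.DomainService/WbiDomainService.cs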

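@Radekyspec

Question: are parameters such as token and web_locate required for wbi?

Nemo2011/bilibili-api#301 (comment)

Is it necessary to pass every parameter the web client sends, without omitting any?

In my tests, parameters such as token and web_locate are not needed; passing only mid works too,

as long as the corresponding w_rid is computed correctly.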

@SocialSisterYi
Owner

I recently collected some endpoints that use Wbi verification

image

@mokurin000

Rust version, 0% test coverage:
https://github.com/000ylop/bili-wbi-sign-rs/

@SocialSisterYi
Owner

@poly000 I'll need to revise this code; it has some problems

@mokurin000

@poly000 I'll need to revise this code; it has some problems

Uh, okay...

@11003

11003 commented Jul 9, 2023


Contributing a PHP version:

//--------------- fetch user info: begin ---------------
// Shuffle the characters of imgKey + subKey according to mixinKeyEncTab
function getMixinKey($orig) {
    $mixinKeyEncTab = array(46, 47, 18, 2, 53, 8, 23, 32, 15, 50, 10, 31, 58, 3, 45, 35, 27, 43, 5, 49,
        33, 9, 42, 19, 29, 28, 14, 39, 12, 38, 41, 13, 37, 48, 7, 16, 24, 55, 40,
        61, 26, 17, 0, 1, 60, 51, 30, 4, 22, 25, 54, 21, 56, 59, 6, 63, 57, 62, 11,
        36, 20, 34, 44, 52); // the complete mixinKeyEncTab, with every element filled in
    $temp = '';
    foreach ($mixinKeyEncTab as $n) {
        $temp .= $orig[$n];
    }
    return substr($temp, 0, 32);
}

// Sign the request parameters with wbi
function encWbi($params, $img_key, $sub_key) {
    $mixin_key = getMixinKey($img_key . $sub_key);
    $curr_time = time();
    $chr_filter = '/[!\'\(\)*]/';
    $query = [];
    $params['wts'] = $curr_time; // add the wts field
    // sort the parameters by key
    ksort($params);
    foreach ($params as $key => $value) {
        // strip the characters !'()* from each value
        $filtered_value = preg_replace($chr_filter, '', (string)$value);
        $query[] = urlencode($key) . '=' . urlencode($filtered_value);
    }
    $query_string = implode('&', $query);
    $wbi_sign = md5($query_string . $mixin_key);
    return $query_string . '&w_rid=' . $wbi_sign;
}

// Fetch the latest img_key and sub_key
function getWbiKeys() {
    $url = 'https://api.bilibili.com/x/web-interface/nav';
    $json = get_Url($url);
    $data = json_decode($json, true);
    $img_url = $data['data']['wbi_img']['img_url'];
    $sub_url = $data['data']['wbi_img']['sub_url'];
    // take the file name and drop the last 4 characters (assumes an extension such as .png)
    $img_key = substr(strrchr($img_url, '/'), 1, -4);
    $sub_key = substr(strrchr($sub_url, '/'), 1, -4);
    return ['img_key' => $img_key, 'sub_key' => $sub_key];
}

function get_bili_query($parameters)
{
    $img_key_sub_key = getWbiKeys();
    $params = [];
    foreach ($parameters as $key => $value) {
        $params[$key] = $value;
    }
    return encWbi(
        $params,
        $img_key_sub_key['img_key'],
        $img_key_sub_key['sub_key']
    );
}
//--------------- fetch user info: end ---------------

Usage:

    public function getBiliData() {
        $parameters = [
            'mid' => '1',
            'ps' => '25'
        ];
        $query = get_bili_query($parameters);
        $url = "https://api.bilibili.com/x/space/wbi/arc/search?$query";
        $json = get_Url($url);
        $jsonObj = json_decode($json,true);
        $vlist = $jsonObj['data']['list']['vlist'];
        return array_slice($vlist, 0, 3); // take the first 3 entries
    }

@11003

11003 commented Jul 9, 2023

/**
 * Simulate a browser request
 * @param $url
 * @return bool|string
 */
function get_Url($url) {
    $ifpost = 0;
    $datafields = '';
    $cookiefile = '';
    // fall back to a generic value when no User-Agent is available (e.g. when run from the CLI)
    $user_agent = $_SERVER['HTTP_USER_AGENT'] ?? 'Mozilla/5.0';
    $v = false;
    // build a random IP
    $ip_long = array(
        array('607649792', '608174079'), //36.56.0.0-36.63.255.255
        array('1038614528', '1039007743'), //61.232.0.0-61.237.255.255
        array('1783627776', '1784676351'), //106.80.0.0-106.95.255.255
        array('2035023872', '2035154943'), //121.76.0.0-121.77.255.255
        array('2078801920', '2079064063'), //123.232.0.0-123.235.255.255
        array('-1950089216', '-1948778497'), //139.196.0.0-139.215.255.255
        array('-1425539072', '-1425014785'), //171.8.0.0-171.15.255.255
        array('-1236271104', '-1235419137'), //182.80.0.0-182.92.255.255
        array('-770113536', '-768606209'), //210.25.0.0-210.47.255.255
        array('-569376768', '-564133889'), //222.16.0.0-222.95.255.255
    );
    $rand_key = mt_rand(0, 9);
    $ip= long2ip(mt_rand($ip_long[$rand_key][0], $ip_long[$rand_key][1]));
    // simulate HTTP request headers (the stray "." before $user_agent in the original string is removed)
    $header = array("Connection: Keep-Alive","Accept: text/html, application/xhtml+xml, */*", "Pragma: no-cache", "Accept-Language: zh-Hans-CN,zh-Hans;q=0.8,en-US;q=0.5,en;q=0.3","User-Agent: $user_agent",'CLIENT-IP:'.$ip,'X-FORWARDED-FOR:'.$ip);
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, $v);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
    $ifpost && curl_setopt($ch, CURLOPT_POST, $ifpost);
    $ifpost && curl_setopt($ch, CURLOPT_POSTFIELDS, $datafields);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $cookiefile && curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiefile);
    $cookiefile && curl_setopt($ch, CURLOPT_COOKIEJAR, $cookiefile);
    curl_setopt($ch,CURLOPT_TIMEOUT,60); // maximum number of seconds the request may run
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
    $ok = curl_exec($ch);
    curl_close($ch);
    unset($ch);
    return $ok;
}

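@maoqxxmm

maoqxxmm commented Aug 29, 2023

--- Update ---

It looks like signing is still required after all; otherwise you get blocked once you crawl too fast.


image

Today I suddenly found that wbi signing is no longer needed and you can apparently crawl freely. Not sure whether Bilibili is just glitching.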

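@z0z0r4
Collaborator

z0z0r4 commented Aug 29, 2023

--- Update ---

It looks like signing is still required after all; otherwise you get blocked once you crawl too fast.

image Today I suddenly found that wbi signing is no longer needed and you can apparently crawl freely. Not sure whether Bilibili is just glitching.

Bilibili's front-end team probably just forgot about it themselves, lol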

@fanza1

fanza1 commented May 7, 2024

If you run into a "request was banned" error when accessing https://api.bilibili.com/x/space/wbi/arc/search:

{
    "code": -412,
    "message": "request was banned",
    "ttl": 1
}

adding a referer field to the headers resolves it. Use the format below (SESSDATA can be obtained from a browser's incognito mode):

headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:32.0) Gecko/20100101 Firefox/32.0',
    'Cookie': 'SESSDATA=' + SESSDATA,
    'referer': 'https://message.bilibili.com/',
}
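A client can surface this kind of failure explicitly instead of failing later on a missing data field. The sketch below assumes that a code of 0 means success, which is not stated in this thread; BiliApiError and check_payload are illustrative names, not part of any library.

```python
class BiliApiError(Exception):
    """Raised when a response payload carries a non-zero code."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(f"bilibili API error {code}: {message}")
        self.code = code
        self.message = message


def check_payload(payload: dict) -> dict:
    """Return the payload unchanged, or raise if its code is non-zero.

    Assumption: code 0 indicates success.
    """
    code = payload.get("code", 0)
    if code != 0:
        raise BiliApiError(code, payload.get("message", ""))
    return payload


# The error body quoted above.
banned = {"code": -412, "message": "request was banned", "ttl": 1}

try:
    check_payload(banned)
except BiliApiError as exc:
    print(exc.code, exc.message)  # -412 request was banned
```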
