[request] Boosty.to #2387
Would love to see something for that platform too. Thank you! |
Is there a way I can help with implementing an extractor for this site? I have a subscription, so I could test it out. However, I'm not sure how exactly extractors work and what they need in order to work. |
You have to know how to imitate requests from a web browser so that the site doesn't stonewall you, and also how to parse HTML (and possibly JSON) content to grab the important bits from the returned pages. That's the gist of it. |
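As a rough illustration of the two skills described above, here is a minimal sketch using only the Python standard library. It is not gallery-dl's extractor API; `BROWSER_HEADERS`, `build_request`, and `parse_json_body` are hypothetical names, and a real site may demand more (cookies, tokens, other headers) than what is shown.

```python
import json
import urllib.request

# Hypothetical header set that makes a request look like it came from a
# browser. Which headers a given site actually checks is an assumption.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:125.0) "
                  "Gecko/20100101 Firefox/125.0",
    "Accept": "application/json, text/html;q=0.9, */*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
}


def build_request(url, cookie=None):
    """Return a urllib Request that carries browser-like headers."""
    headers = dict(BROWSER_HEADERS)
    if cookie:
        # Session cookies copied from a logged-in browser, if needed.
        headers["Cookie"] = cookie
    return urllib.request.Request(url, headers=headers)


def parse_json_body(body):
    """Decode a JSON response body (bytes) into Python objects."""
    return json.loads(body.decode("utf-8"))


request = build_request("https://example.org/page", cookie="session=abc")
data = parse_json_body(b'{"title": "Title"}')
print(data["title"])
```

Passing `request` to `urllib.request.urlopen` would perform the actual fetch; that step is left out so the sketch runs without network access.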
Since the boostydownloader project has been confirmed as abandoned, I feel like Boosty should get another look, a year after the last comment on this issue. I attempted to start my own Boosty extractor code, which was then cleaned up and fixed by mikf. I believe Boosty requires a Cloudflare login for verification, so I'm not sure how to get OAuth working other than through the provided cookie, but suggestions are more than welcome. |
Interestingly, boostydownloader has removed its MIT license. In the issue, they say they have made it permissible, but without a license, this is not possible. However, since no code has actually changed, the code up to that point remains the same. I am finding issues with boostydownloader that I wasn't aware of before, namely that it does not download attachments, which in my eyes is critical. I will see if I can continue work on a Boosty fork of gallery-dl for a potential PR, but I can't pretend I understand the order of operations in which gallery-dl consumes pages. |
Continuing the boostydownloader drama, the GitHub user has nuked their account and all the repos they had. It's a shame they had to take such drastic action against their code, but this means there is no viable Boosty downloader at this point. I will see if I can accelerate my PR. |
I would say that making a Boosty downloader is relatively easy; the only real problem is writing the gallery-dl extractor. The Boosty API looks like this:
Getting a user:
The response looks like this: {
"blogUrl": "URL",
"coverUrl": "https://images.boosty.to/blog/00000000/cover?change_time=1234",
"isSubscribed": false,
"flags": {
"allowGoogleIndex": false,
"acceptDonationMessages": true,
"showPostDonations": true,
"hasTargets": true,
"isRssFeedEnabled": false,
"isPaymentAcceptBlocked": false,
"allowIndex": true,
"isVerifyPayoutBlocked": false,
"hasSubscriptionLevels": true,
"isPayoutBlocked": false
},
"subscriptionKind": "none",
"isReadOnly": false,
"accessRights": {
"canView": false,
"canCreateComments": false,
"canDeleteComments": false,
"canEdit": false,
"canCreate": false,
"canSetPayout": false
},
"owner": {
"hasAvatar": true,
"name": "Name",
"id": 00000000,
"avatarUrl": "https://images.boosty.to/user/00000000/avatar?change_time=1234"
},
"isOwner": false,
"subscription": null,
"title": "Title",
"hasAdultContent": true,
"isBlackListed": false,
"isBlackListedByUser": false,
"signedQuery": "",
"count": {
"posts": 123,
"subscribers": 123
},
"description": [{
"type": "text",
"modificator": "",
"content": "[\"Hello! Welcome to my Boosty! \",\"unstyled\",[]]"
}, {
"content": "",
"modificator": "BLOCK_END",
"type": "text"
}],
"publicWebSocketChannel": "blogger:00000000",
"allowedPromoTypes": [
"discount",
"trial",
"trial_link"
],
"isTotalBaned": false,
"socialLinks": [{
"type": "website",
"url": "https://..."
}]
}
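The `description` array in the response above stores its text as JSON-encoded strings inside `content`. A minimal sketch of turning such blocks into plain text follows; it assumes, based only on the sample shown here, that the first element of the decoded list is the visible string and that a `BLOCK_END` modificator ends a paragraph. `blocks_to_text` is a hypothetical helper name.

```python
import json


def blocks_to_text(blocks):
    """Join Boosty-style "text" blocks into plain text.

    Assumption taken from the sample response: "content" holds a
    JSON-encoded list whose first element is the visible string, and a
    block with modificator "BLOCK_END" marks the end of a paragraph.
    """
    parts = []
    for block in blocks:
        if block.get("type") != "text":
            continue  # images, files, etc. are not text
        if block.get("modificator") == "BLOCK_END":
            parts.append("\n")
        elif block.get("content"):
            parts.append(json.loads(block["content"])[0])
    return "".join(parts).strip()


description = [
    {"type": "text", "modificator": "",
     "content": "[\"Hello! Welcome to my Boosty! \",\"unstyled\",[]]"},
    {"type": "text", "modificator": "BLOCK_END", "content": ""},
]
print(blocks_to_text(description))
```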
Getting posts
User feed:
General feed (all users):
Post example: {
"extra": {
"isLast": false,
"offset": "1700536019:4692238"
},
"data": [{
"tags": [{
"id": 1234,
"title": "Tag name"
}],
"isWaitingVideo": false,
"updatedAt": 1714404788,
"data": [{
"type": "image",
"rendition": "",
"height": 602,
"width": 956,
"url": "https://images.boosty.to/image/00000000-0000-0000-0000-000000000000?change_time=1234",
"id": "00000000-0000-0000-0000-000000000000"
},
{
"isMigrated": false,
"type": "file",
"id": "00000000-0000-0000-0000-000000000000",
"url": "https://cdn.boosty.to/file/00000000-0000-0000-0000-000000000000",
"size": 124157865,
"complete": true,
"title": "1.zip"
}],
"donations": 0,
"createdAt": 1714404779,
"teaser": [{
"width": 575,
"rendition": "teaser_auto_background",
"url": "https://images.boosty.to/image/00000000-0000-0000-0000-000000000000",
"height": 840,
"id": "00000000-0000-0000-0000-000000000000",
"type": "image"
}],
"advertiserInfo": null,
"showViewsCounter": false,
"id": "00000000-0000-0000-0000-000000000001",
"subscriptionLevel": {
"ownerId": 00000000,
"isArchived": false,
"currencyPrices": {
"USD": 4.6,
"RUB": 400
},
"deleted": false,
"createdAt": 1688285158,
"name": "Tier1",
"data": [{
"rendition": "",
"width": 1536,
"url": "https://images.boosty.to/image/00000000-0000-0000-0000-000000000002?change_time=1234",
"height": 1024,
"type": "image",
"id": "00000000-0000-0000-0000-000000000002"
}, {
"type": "text",
"modificator": "",
"content": "[\"• All posts works.\\n\",\"unstyled\",[[0,2,17]]]"
}, {
"content": "",
"modificator": "BLOCK_END",
"type": "text"
}],
"price": 400,
"id": 1234567,
"promos": []
},
"count": {
"likes": 1,
"reactions": {
"laught": 0,
"heart": 1,
"angry": 0,
"wonder": 0,
"sad": 0,
"fire": 0,
"dislike": 0,
"like": 0
},
"comments": 0
},
"hasAccess": true,
"comments": {
"extra": {
"isLast": true,
"isFirst": true
},
"data": []
},
"isPublished": true,
"currencyPrices": {
"USD": 0,
"RUB": 0
},
"isRecord": false,
"price": 0,
"isLiked": false,
"donators": {
"data": [],
"extra": {
"isLast": true
}
},
"int_id": 5817430,
"isDeleted": false,
"signedQuery": "",
"isCommentsDenied": false,
"publishTime": 1714404779,
"user": {
"hasAvatar": true,
"avatarUrl": "https://images.boosty.to/user/00000000/avatar?change_time=1234",
"blogUrl": "URL",
"id": 00000000,
"flags": {
"showPostDonations": true
},
"name": "Name"
},
"title": "Some post title"
},
...
]
}
If in So making it is pretty easy with something like I also have active subscriptions here, but I never managed to write a proper plugin, so I have just been using self-made solutions where I supply the cookies and the user URL. |
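The `extra` object in the post example (`isLast`, `offset`) suggests cursor-style paging. The sketch below walks such a feed under that assumption. `fetch_page` is a hypothetical callable standing in for the real HTTP request, since the endpoint and its parameter names are not shown in this thread.

```python
def iter_posts(fetch_page):
    """Yield every post from a cursor-paginated feed.

    fetch_page(offset) is a hypothetical callable that returns one decoded
    response shaped like the example: {"extra": {...}, "data": [...]}.
    The first call receives None as the offset.
    """
    offset = None
    while True:
        page = fetch_page(offset)
        yield from page["data"]
        extra = page["extra"]
        next_offset = extra.get("offset")
        # Stop on the last page, or if the cursor is missing or not moving.
        if extra.get("isLast") or not next_offset or next_offset == offset:
            break
        offset = next_offset


# Two fake pages keyed by the offset that requests them.
pages = {
    None: {"extra": {"isLast": False, "offset": "1700536019:4692238"},
           "data": [{"id": "a"}, {"id": "b"}]},
    "1700536019:4692238": {"extra": {"isLast": True},
                           "data": [{"id": "c"}]},
}
ids = [post["id"] for post in iter_posts(pages.get)]
print(ids)
```

Keeping the network call behind a callable makes the paging logic testable without credentials.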
@biggestsonicfan |
@mikf if you need someone to test, I have active subscriptions on that platform, can help this way. |
Actually, I somehow didn't think about it before, but do you need the downloader script that I'm using? While it doesn't rely on |
Sure, I guess it would be quite helpful and allow me to provide feature parity. |
@XCanG One of the issues boostydownloader was having before the repo was scrubbed off the face of the earth was that the API has a hard limit of 300 posts. It will not go any further, regardless of how many more posts exist, which raises an interesting question: should we be writing this around the API, or should we write this as a scrape? |
I posted my script here: https://github.com/JumpJets/boosty_archiver
@biggestsonicfan I don't know what you mean. First of all, Boosty's pages aren't PHP. They do have an initial script with some data, but when more posts are loaded you are actually hitting their API; there are no PHP pages, only JSON API responses. If regular scrolling through the posts couldn't show those posts, then nobody would be able to see them. Their posts aren't even paginated; they use a cursor. Second of all, I currently have a subscription to a user with more than 300 posts.
CSS for Stylish
@-moz-document domain("boosty.to") {
[class^="Layout_content"] {
width: unset !important;
display: grid;
justify-content: center;
& [class^="Layout_threeColsCenter"] {
width: unset !important;
& [class^="Feed_feed"] {
counter-reset: posts;
> [class^="Feed_itemWrap"] {
counter-increment: posts;
&::before {
content: "#" counter(posts);
position: absolute;
right: 100%;
margin-right: 5px;
font-size: calc(2vw + 1vh);
font-weight: 100;
font-family: sans;
pointer-events: none;
}
> [class^="Post_root"] {
width: unset !important;
> [class^="Post_contentWrapper"] {
&:has(> [class^="Post_readMore"]) > [class^="Post_content"] {
max-height: unset !important;
& > [class^="Post_shading"] {
display: none !important;
}
}
/* still needed for showing hidden stuff by JS
> [class^="Post_readMore"] {
display: none !important;
}*/
}
& .ce-block__content {
max-width: unset !important;
}
}
}
}
}
& [class^="DialogueChat_dialog"] {
width: 40vw;
}
> [class^="Settings_wrapper"] [class^="SettingsSubscriptions_cardsContainer"] {
display: grid;
grid-template-columns: repeat(auto-fill, minmax(240px, 1fr));
width: 70vw;
gap: 20px;
> [class^="SettingsSubscriptions_card"] {
height: unset !important;
width: unset !important;
margin: 0 !important;
display: grid;
&:has([class^="WithdrawalInfo_root"]) {
order: -1;
}
> div {
height: unset !important;
width: unset !important;
margin: 0 !important;
}
}
}
}
[class^="Layout_layout"] [class^="Post_root"] > [class^="Post_contentWrapper"] {
&:has(> [class^="Post_readMore"]) > [class^="Post_content"] {
max-height: unset !important;
& > [class^="Post_shading"] {
display: none !important;
}
}
/* same
> [class^="Post_readMore"] {
display: none !important;
}*/
}
[class^="Post_container"] {
width: 100% !important;
}
}
Maybe there is something I don't know about, but I haven't seen this issue. You could try my script to see whether it fails in your case. |
This counter is a post counter, not an attachment counter. The user I mentioned above has only 303 media items, while the number of posts is larger than that. Apply my CSS (from the spoiler) and scroll through the posts until you reach the last one, then tell me what post number it is. If you are going to paste it directly on the page, then remove the outer brackets with |
If you want to add a media counter, it is possible to create a separate progress indicator just for that. Actually, now that I think of it, I don't have proper coverage for videos; all the creators I subscribe to, as far as I know, post only images, archives, and external links to Google Drive, Mega, etc. Could you dump an example request that has a video in the post data? In my case it would probably go here: https://github.com/JumpJets/boosty_archiver/blob/main/boosty_archiver.py#L503 adding checks for |
I definitely can do that. I'll also try to work out how to get this other error to you, but I'll switch this over to that repo. |
I made an attempt: 1ad58ca
I'm not at all satisfied with how
Also, the code is currently ignoring
edit: Can anyone provide an example of a free audio post? Audio files aren't handled yet, but everything I found needed a subscription to access. |
I don't have anyone who posts audio, and the same goes for video. Most of the subscriptions I know of have images in their posts; a smaller share of them have links to Google Drive, Mega, etc.
It is almost simple, yes. The URL has the full path to the file, although I was hoping that you would use the user's posts and parse the user's text as well. You can see in my example how I parsed it; it's tricky. Still, many creators leave external links in their posts, and it would be useful to parse them so that they can be exported. |
I am thinking of a solution for video and audio. I may start a Boosty creator account for testing purposes and create various types of content in order to see how the API returns them. |
There are |
Yes, but the creators I follow post passwords near the links as well. At least one creator requires getting the password along with the link.
|
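To illustrate the point about external links and the passwords posted next to them, here is a minimal sketch that scans already-decoded post text for URLs and keeps the surrounding line as context. It is not how gallery-dl or boosty_archiver do it; `find_links` and the sample text are hypothetical, and the regular expression is deliberately simple.

```python
import re

# Deliberately simple pattern: anything that starts with http(s):// and
# runs until whitespace or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>]+")


def find_links(text):
    """Return (url, line) pairs found in plain post text.

    The whole line is kept as context so that notes written next to a
    link, such as an archive password, are not lost.
    """
    results = []
    for line in text.splitlines():
        for match in URL_PATTERN.finditer(line):
            url = match.group(0).rstrip(".,;)")  # drop trailing punctuation
            results.append((url, line.strip()))
    return results


sample = "Full set: https://example.org/archive.zip, password: 1234\nThanks!"
for url, context in find_links(sample):
    print(url, "|", context)
```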
Subscribed user lists (#2387 (comment)) and homepage feed are now supported (274d99e)
Sorry, I made a typo that made it ignore all external links (ee8c4e2). |
OK, so I set up a creator account and added various types of content to it for testing the platform; I can add anything else if needed. For testing I also added a link to a free subscription, so you can try it out on posts behind a tier. If more is needed, I can post that as well. |
By the way, looking at your code, I don't quite see how you detect the image extension. Boosty doesn't expose it in either the API or the URL, so what I did was feed the first chunk of the stream to magic to detect the proper extension. In my code it is these lines. |
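The idea of sniffing the extension from the first chunk can be sketched without any third-party dependency. Note that this swaps the `magic` library mentioned above for a small hand-written signature table covering only a few well-known image formats; `guess_extension` is a hypothetical name and the table is intentionally incomplete.

```python
# Well-known leading byte signatures for a few image formats.
SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
)


def guess_extension(first_chunk, default="bin"):
    """Guess a file extension from the first bytes of a download."""
    for signature, extension in SIGNATURES:
        if first_chunk.startswith(signature):
            return extension
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if first_chunk[:4] == b"RIFF" and first_chunk[8:12] == b"WEBP":
        return "webp"
    return default


print(guess_extension(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
print(guess_extension(b"RIFF\x24\x00\x00\x00WEBPVP8 "))
```

Only the first few bytes are needed, so the check can run on the first chunk of a streamed response before the output filename is chosen.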
Thank you very much! I took a look at the audio post to try to figure out how to handle these, but no luck. The API provides a URL with no query parameters:
while the actual URL used by the website has plenty of them:
Most are trivial to add, but I have no idea how to generate a
Still requires a credit card, and I do not own one.
This gets handled by the downloader: gallery-dl/gallery_dl/downloader/http.py, lines 246 to 248 and lines 369 to 377 in 5b968a0
|
This is actually what signedQuery is for; you have to get it from a post. Look at my file-downloading code for an example of how I get this parameter. I'm currently at work and can't quote properly from my phone. |
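A minimal sketch of attaching a post's `signedQuery` to a media URL follows. The exact format of `signedQuery` is not shown in this thread (the samples only contain an empty string), so the helper assumes it is a ready-made query string that may or may not start with `?`. `with_signed_query` and the `key=value` placeholder are hypothetical.

```python
def with_signed_query(url, signed_query):
    """Append a post's signedQuery to a media URL.

    Assumption: signed_query is a complete query string, possibly with a
    leading "?" or "&". An empty value leaves the URL untouched.
    """
    if not signed_query:
        return url
    query = signed_query.lstrip("?&")
    separator = "&" if "?" in url else "?"
    return url + separator + query


file_url = "https://cdn.boosty.to/file/00000000-0000-0000-0000-000000000000"
print(with_signed_query(file_url, "?key=value"))
print(with_signed_query(file_url, ""))
```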
I've added support for video and audio files in my version; if you still have questions regarding signedQuery, you can check it. What is not perfect yet is the quality selector and support for HLS streams, or streams in general. For example, the video file that I uploaded has this metadata:
but the best quality I can find is a worse version of this file, and its dimensions are also scaled down:
So there are still some open questions about their API for videos. With audio it is exactly the same file, so no questions there. |
Oh, and I found an error in your latest commit |
There is actually a type of post missing or at least the extractor doesn't seem to download this type of media. Sometimes media is served via {
"title": "Dronification ",
"isRecord": false,
"isDeleted": false,
"id": "45d634d3-b8e1-46f0-af24-3d6a009dfac7",
...
"data": [
{
"modificator": "",
"type": "text",
"content": "[\"\\nCнова показываю вкусное для латексных фетишистов :3\",\"unstyled\",[]]"
},
{
"type": "text",
"modificator": "BLOCK_END",
"content": ""
},
{
"title": "24.12.12-109 копия.png",
"url": "https://cdn.boosty.to/file/fdc3c74c-d3ed-4520-9bd5-03154259e1be",
"type": "file",
"complete": true,
"id": "fdc3c74c-d3ed-4520-9bd5-03154259e1be",
"size": 81111934,
"isMigrated": true
},
{
"title": "24.12.12-108 копия.png",
"url": "https://cdn.boosty.to/file/5f7dcc07-1832-4d13-ab57-82ffe1651109",
"type": "file",
"complete": true,
"id": "5f7dcc07-1832-4d13-ab57-82ffe1651109",
"isMigrated": true,
"size": 82137340
},
{
"title": "24.12.12-117 копия.png",
"type": "file",
"url": "https://cdn.boosty.to/file/ecaa6e99-c00f-434e-aceb-2eafafeec3f5",
"id": "ecaa6e99-c00f-434e-aceb-2eafafeec3f5",
"complete": true,
"isMigrated": true,
"size": 76335739
},
...
]
} |
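For reference, picking the `file` entries out of a post shaped like the one above could look like the following minimal sketch. It only relies on the fields visible in that sample (`type`, `title`, `url`, `size`, `complete`); `file_attachments` is a hypothetical helper, not part of gallery-dl.

```python
def file_attachments(post):
    """Return (title, url, size) for every completed "file" entry in a post."""
    return [
        (item["title"], item["url"], item.get("size"))
        for item in post.get("data", [])
        # Skip text/image blocks and uploads that are not marked complete.
        if item.get("type") == "file" and item.get("complete", True)
    ]


post = {
    "title": "Dronification ",
    "data": [
        {"type": "text", "modificator": "BLOCK_END", "content": ""},
        {
            "title": "24.12.12-109 копия.png",
            "url": "https://cdn.boosty.to/file/fdc3c74c-d3ed-4520-9bd5-03154259e1be",
            "type": "file",
            "complete": True,
            "id": "fdc3c74c-d3ed-4520-9bd5-03154259e1be",
            "size": 81111934,
            "isMigrated": True,
        },
    ],
}
for title, url, size in file_attachments(post):
    print(title, url, size)
```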
https://boosty.to/app/settings/subscriptions
this lists all authors one follows on that platform.