Please add site https://old.ranobelib.me/ #1580

DEATHAN-SAA · 2024-11-25T21:16:29Z

Hello.
Apologies if my English is not perfect.
I'm working with this ↓

Hostname: https://old.ranobelib.me/
CSS Selector for the content: .reader-container
CSS Selector for Title of Chapter: div.reader-header-actions:nth-child(3) > div:nth-child(2) > div:nth-child(2)

Provide URL for web page that contains Table of Contents (list of chapters) of a typical story on the site
https://old.ranobelib.me/old/manga/80001--this-marriage-is-bound-to-fail-anyway?ui=6032&section=content

I cannot understand why it does not find the first chapter.
Not all chapters are picked up at once: the maximum is 72 chapters, and only if you scroll down the list of chapters beforehand.

In this case, I scrolled down a little and it found chapters only up to chapter 43.
When a story has only 60 chapters this is not a problem, but for longer stories you have to scroll through the table of contents each time. If WebToEpub stopped at chapter 43, you need to scroll down by about 24 more chapters and click WebToEpub again, then copy the new links and paste them into the previous WebToEpub window to get all the chapters of the story. And if you scroll further, for example to chapter 78, WebToEpub will start from chapter 50, so you have to go back to the table of contents and scroll the mouse wheel again, clicking WebToEpub once more to get chapters 44 to 49.

If a story has over 300 chapters, it becomes very tiring... And WebToEpub can easily skip a chapter somewhere in the list when scrolling. Therefore, you have to check the number of chapters using Excel every time.
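
The manual check described above (merging several batches of links and looking for skipped chapters) could be scripted instead of done in Excel. The following is only a rough sketch: `findMissingChapters` is a hypothetical helper, not part of WebToEpub, and it assumes chapter URLs end in `/c<number>` with whole-number chapters, as in `.../read/v3/c478`.

```javascript
// Hypothetical helper, not part of WebToEpub.
// Assumption: every chapter URL path ends in "/c<whole number>".
function findMissingChapters(batches) {
    // Merge all batches and drop duplicate URLs, keeping first-seen order.
    const urls = [...new Set(batches.flat())];

    // Pull the chapter number out of each URL; ignore URLs that don't match.
    const numbers = urls
        .map(u => /\/c(\d+)$/.exec(new URL(u).pathname))
        .filter(m => m !== null)
        .map(m => Number(m[1]));

    // Report every number between 1 and the highest chapter that was not seen.
    const seen = new Set(numbers);
    const highest = Math.max(...numbers);
    const missing = [];
    for (let n = 1; n <= highest; ++n) {
        if (!seen.has(n)) {
            missing.push(n);
        }
    }
    return { urls, missing };
}
```

Each element of `batches` is one list of links copied from a WebToEpub window; the returned `missing` array lists the chapter numbers that still need to be fetched.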

Unfortunately, I still haven't figured out how to create a new parser. :(

Thank you very much for WebToEpub!
I will be grateful to you for the answer.

dteviot · 2024-11-26T06:58:15Z

Looking at the HTML, the list of chapters is almost certainly in the <script> element which starts with

I think the URL for each chapter is built from the "volume" and "number" members of its entry.
e.g. for
https://old.ranobelib.me/old/manga/80001--this-marriage-is-bound-to-fail-anyway?section=content

A recent chapter is

https://old.ranobelib.me/old/80001--this-marriage-is-bound-to-fail-anyway/read/v3/c478

This one has a number of 478, a volume of 3 and a name of "Экстра 6", although I'm not quite sure how to create the title in the list. I think it's something like Volume X, Chapter Y, Title Z, in Russian.

Can probably use something like this to extract the JSON

        // Find the <script> element that assigns the chapter data
        let startString = "window.__CONTENT__ = ";
        let scriptElement = [...dom.querySelectorAll("script")]
            .filter(s => s.textContent.includes(startString))[0];
        // Parse the JSON that follows the assignment
        return util.locateAndExtractJson(scriptElement.textContent, startString);

And then construct the table of content entries.
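
A minimal sketch of that last step, under several assumptions: the member names `volume`, `number` and `name` come from the description above, but where the chapter array sits inside the extracted JSON is not known here, so the function takes the array directly. `buildChapterList` is a hypothetical name, the `{sourceUrl, title}` shape is believed to be what WebToEpub parsers return from `getChapterUrls`, and the Russian title format is a guess.

```javascript
// Sketch only, not tested against the live site.
// "chapters" is assumed to be an array of objects with "volume", "number"
// and "name" members; its location inside the page's JSON is not known here.
function buildChapterList(chapters, tocUrl) {
    // TOC URL:     https://old.ranobelib.me/old/manga/<slug>?section=content
    // Chapter URL: https://old.ranobelib.me/old/<slug>/read/v<volume>/c<number>
    const toc = new URL(tocUrl);
    const slug = toc.pathname.split("/").filter(p => p !== "").pop();
    return chapters.map(c => ({
        sourceUrl: `${toc.origin}/old/${slug}/read/v${c.volume}/c${c.number}`,
        // Guessed title format: "Volume X Chapter Y - name", in Russian.
        title: `Том ${c.volume} Глава ${c.number}` + (c.name ? ` - ${c.name}` : "")
    }));
}
```

If the site lists the newest chapter first, the resulting array would probably need a `.reverse()` before being returned from `getChapterUrls`.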

Notes
Time taken: 43 minutes

gamebeaker · 2024-11-26T07:58:16Z

@dteviot you could use "[placeholder]"

DEATHAN-SAA · 2024-11-27T19:17:04Z

Hello. Thank you very much for the response. I apologize for taking up your time again, but programming and I are not always compatible(((

Install from Source, using the instructions here
Copy the file “Template.js” in the folder plugin/js/parsers.
Rename the copied file, based on the site you want to parse.
Add link to the new file to popup.html.
Text replace “Template” in the file with the new Parser name.
Uncomment the functions of the template you need, modifying the sample implementations as required. Refer to Customizing the Template Parser for a new Web Site for a worked example

I cannot work out where I made a mistake...
Or whether I wrote the parser correctly at all.
I am a noob.

`"use strict";

//dead url/ parser
parserFactory.register("old.ranobelib.me", () => new oldranobelibParser());

class oldranobelibParser extends Parser {
constructor() {
super();
}

async getChapterUrls(dom, chapterUrlsUI) {
    let startString = "window.__CONTENT__ = ";
    let scriptElement = [...dom.querySelectorAll("script")]
        .filter(s => s.textContent.includes(startString))[0];
    return util.locateAndExtractJson(scriptElement.textContent, startString)
}

findContent(dom) {
    return dom.querySelector(".reader-container");
}

extractTitleImpl(dom) {
    let title = dom.querySelector("div.reader-header-actions:nth-child(3) > div:nth-child(2) > div:nth-child(2)");
    util.removeChildElementsMatchingCss(title, "span.subtitle, span[hidden]");
    return title;
}

extractAuthor(dom) {
    let authorLabel = dom.querySelector(".media-sidebar__info > div:nth-child(6) > div:nth-child(2)");
    return authorLabel?.textContent ?? super.extractAuthor(dom);
}

findChapterTitle(dom) {
    let title = dom.querySelector("div.reader-header-actions:nth-child(3) > div:nth-child(2) > div:nth-child(2)");
    util.removeChildElementsMatchingCss(title, "span, div");
    return title.textContent;
}

findCoverImageUrl(dom) {
    return util.getFirstImgSrc(dom, "div.media-cover");
}

`

gamebeaker · 2024-11-27T19:50:16Z

@DEATHAN-SAA I would guess that you forgot a closing bracket "}" at the end.
Here is your code without the error (it doesn't work yet, but there are no more errors).
(Hint: you can share your code over multiple lines if you enclose the block with 3 ` at the beginning and 3 ` at the end.)

"use strict";

//dead url/ parser
parserFactory.register("old.ranobelib.me", () => new oldranobelibParser());

class oldranobelibParser extends Parser {
    constructor() {
        super();
    }
    async getChapterUrls(dom, chapterUrlsUI) {
        let startString = "window.__CONTENT__ = ";
        let scriptElement = [...dom.querySelectorAll("script")]
            .filter(s => s.textContent.includes(startString))[0];
        return util.locateAndExtractJson(scriptElement.textContent, startString)
    }

    findContent(dom) {
        return dom.querySelector(".reader-container");
    }

    extractTitleImpl(dom) {
        let title = dom.querySelector("div.reader-header-actions:nth-child(3) > div:nth-child(2) > div:nth-child(2)");
        util.removeChildElementsMatchingCss(title, "span.subtitle, span[hidden]");
        return title;
    }

    extractAuthor(dom) {
        let authorLabel = dom.querySelector(".media-sidebar__info > div:nth-child(6) > div:nth-child(2)");
        return authorLabel?.textContent ?? super.extractAuthor(dom);
    }

    findChapterTitle(dom) {
        let title = dom.querySelector("div.reader-header-actions:nth-child(3) > div:nth-child(2) > div:nth-child(2)");
        util.removeChildElementsMatchingCss(title, "span, div");
        return title.textContent;
    }

    findCoverImageUrl(dom) {
        return util.getFirstImgSrc(dom, "div.media-cover");
    }
}
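
One probable reason the parser above still does not work is that `getChapterUrls` returns the raw parsed JSON rather than a list of chapter entries. To inspect what that JSON contains, the extraction step can be tried on its own, for example in the browser console on the table of contents page. The sketch below is a hypothetical stand-in written for such experiments; it is not the implementation of `util.locateAndExtractJson`, and it assumes the value after the start string is a JSON object or array.

```javascript
// Hypothetical stand-in for experiments; NOT WebToEpub's util.locateAndExtractJson.
// Finds startString in text, then parses the balanced {...} or [...] that follows.
function extractJsonAfter(text, startString) {
    const start = text.indexOf(startString);
    if (start < 0) {
        return null; // start string not present
    }
    const begin = start + startString.length;
    let depth = 0;
    let inString = false;
    let escaped = false;
    for (let i = begin; i < text.length; ++i) {
        const ch = text[i];
        if (inString) {
            // Brackets inside string literals must not change the depth.
            if (escaped) {
                escaped = false;
            } else if (ch === "\\") {
                escaped = true;
            } else if (ch === '"') {
                inString = false;
            }
        } else if (ch === '"') {
            inString = true;
        } else if (ch === "{" || ch === "[") {
            ++depth;
        } else if (ch === "}" || ch === "]") {
            if (--depth === 0) {
                return JSON.parse(text.slice(begin, i + 1));
            }
        }
    }
    return null; // no balanced JSON value found
}
```

In the console, something like `extractJsonAfter(scriptElement.textContent, "window.__CONTENT__ = ")` would then show the object whose members need to be turned into chapter entries.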
