[Doc] Use axios (StarRocks#53623)
DanRoscigno authored Dec 5, 2024
1 parent 81424c0 commit 4e94675
Showing 4 changed files with 48 additions and 23 deletions.
2 changes: 1 addition & 1 deletion docs/docusaurus/Dockerfile
@@ -1,7 +1,7 @@
FROM node:21

WORKDIR /app/docusaurus
ENV NODE_OPTIONS="--max-old-space-size=8192"
ENV NODE_OPTIONS="--max-old-space-size=8192 --no-warnings=ExperimentalWarning"

RUN apt update && apt install -y neovim python3.11-venv ghostscript

6 changes: 4 additions & 2 deletions docs/docusaurus/PDF/README.md
@@ -21,9 +21,11 @@ git switch branch-3.3

The conversion process uses Docker Compose. Launch the environment by running the following command from the `starrocks/docs/docusaurus/PDF/` directory.

The `--wait-timeout 400` option gives the services 400 seconds to reach a healthy state, so that both Docusaurus and Gotenberg have time to become ready to handle requests. On my machine it takes about 200 seconds for Docusaurus to build the docs and start serving them.

```bash
cd starrocks/docs/docusaurus/PDF
docker compose up --detach --wait --wait-timeout 120 --build
docker compose up --detach --wait --wait-timeout 400 --build
```

> Tip
@@ -113,7 +115,7 @@ node generatePdf.js http://0.0.0.0:3000/zh/docs/introduction/StarRocks_intro/

> Note:
>
> There are 900+ PDF files and more than 4,000 pages in total. Combining takes five hours on my laptop; just let it run. I am looking for a faster method to combine the files.
> There are 900+ PDF files and more than 4,000 pages in total. Combining takes three hours on my laptop; just let it run. I am looking for a faster method to combine the files.
```bash
source .venv/bin/activate
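The note above says a faster way to combine the 900+ PDFs is still being sought. One possible direction, sketched here with the `pdf-lib` npm package, merges the files in a single Node process instead of shelling out; the package choice, the command-line interface, and the output name are assumptions for illustration, not part of this commit.

```js
// Sketch: merge PDFs with pdf-lib (assumes `npm install pdf-lib`).
// Usage (hypothetical): node combinePdfs.js chapter1.pdf chapter2.pdf -o StarRocks.pdf
const fs = require("fs");
const { PDFDocument } = require("pdf-lib");

async function combine(inputPaths, outputPath) {
  const merged = await PDFDocument.create();
  for (const inputPath of inputPaths) {
    // Load each source PDF and copy all of its pages into the merged document
    const src = await PDFDocument.load(fs.readFileSync(inputPath));
    const pages = await merged.copyPages(src, src.getPageIndices());
    pages.forEach((page) => merged.addPage(page));
  }
  fs.writeFileSync(outputPath, await merged.save());
}

const args = process.argv.slice(2);
const outIndex = args.indexOf("-o");
const outputPath = outIndex === -1 ? "combined.pdf" : args[outIndex + 1];
const inputs =
  outIndex === -1 ? args : args.filter((_, i) => i !== outIndex && i !== outIndex + 1);

combine(inputs, outputPath).catch((err) => console.error(err));
```

Whether this actually beats the pdfcombine/Ghostscript step for 4,000+ pages would have to be measured; it avoids per-file process startup but holds the merged document in memory.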
8 changes: 7 additions & 1 deletion docs/docusaurus/PDF/docker-compose.yaml
@@ -8,7 +8,6 @@ services:
docusaurus:
build: ../
environment:
- NODE_OPTIONS="--max-old-space-size=8192"
- DISABLE_VERSIONING='true'
- PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
- PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium-browser
@@ -19,13 +18,20 @@
- ../../en:/app/docusaurus/docs
- ../../zh:/app/docusaurus/i18n/zh/docusaurus-plugin-content-docs/current
working_dir: /app/docusaurus
healthcheck:
test: curl --fail http://docusaurus:3000 || exit 1
interval: 10s
retries: 20
start_period: 140s
timeout: 5s
entrypoint: >
/bin/bash -c "
cd PDF && yarn install &&
python3 -m venv .venv &&
source .venv/bin/activate &&
pip3 install pdfcombine &&
cd /app/docusaurus &&
npm install -g [email protected] &&
yarn install &&
yarn build &&
yarn serve -p 3000 -h 0.0.0.0 &&
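The healthcheck added above is what lets `docker compose up --wait --wait-timeout 400` block until the site actually responds: the check curls `http://docusaurus:3000` every 10 seconds, up to 20 times, with a 140-second start period and a 5-second timeout. The same readiness probe can be expressed with axios; a minimal sketch, assuming it runs inside the compose network where the `docusaurus` hostname resolves (from the host you would substitute the published address):

```js
// Sketch: poll Docusaurus until it answers, mirroring the compose healthcheck.
const axios = require("axios");

async function waitForDocusaurus(url, retries = 20, intervalMs = 10000) {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      // Same check the healthcheck performs: a plain GET against the site root
      const response = await axios.get(url, { timeout: 5000 });
      if (response.status === 200) {
        console.log(`Docusaurus is serving (attempt ${attempt})`);
        return;
      }
    } catch (err) {
      console.log(`Attempt ${attempt}: not ready yet (${err.message})`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Docusaurus did not become healthy in time");
}

waitForDocusaurus("http://docusaurus:3000").catch((err) => {
  console.error(err.message);
  process.exit(1);
});
```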
55 changes: 36 additions & 19 deletions docs/docusaurus/PDF/generatePdf.js
@@ -38,24 +38,40 @@ function getUrls(url) {
}
};

async function callGotenberg(docusaurusUrl, fileName) {
//var util = require('util');
var execSync = require('child_process').execSync;

var command = `curl --request POST http://gotenberg:3000/forms/chromium/convert/url --form url=${docusaurusUrl} -o ${fileName}`
async function callGotenberg(docusaurusUrl, fileName) {

child = execSync(command, function(error, stdout, stderr){
const path = require("path");
const FormData = require("form-data");

try {
// Convert URL content to PDF using Gotenberg
const form = new FormData();
form.append('url', `${docusaurusUrl}`)

const response = await axios.post(
"http://gotenberg:3000/forms/chromium/convert/url",
form,
{
headers: form.getHeaders(),
responseType: "arraybuffer",
}
);

if (response.status !== 200) {
throw new Error(`Failed to convert file: ${response.statusText}`);
}

//console.log('stdout: ' + stdout);
console.log('stderr: ' + stderr);
const buffer = await response.data;

if(error !== null)
{
console.log('exec error: ' + error);
// Save the converted file
fs.writeFileSync(fileName, buffer);
//console.log('wrote URL content from %s to PDF file %s', docusaurusUrl, fileName);

} catch (err) {
console.error(err.message || err);
}

});
}
};

async function processLineByLine() {
const fileStream = fs.createReadStream('URLs.txt');
@@ -64,7 +80,7 @@ async function processLineByLine() {
input: fileStream,
crlfDelay: Infinity
});

console.log("Generating PDFs");
for await (const line of rl) {
// Each line in input.txt will be successively available here as `line`.
//console.log(`URL: ${line}`);
@@ -74,6 +90,7 @@ async function processLineByLine() {
console.log(err);
});
}
console.log(" done");
}

async function requestPage(url) {
@@ -95,21 +112,21 @@ async function requestPage(url) {
}
});

await callGotenberg(url, fileName);
i++;
await callGotenberg(url, fileName);
process.stdout.write(".");
i++;

}




function main(ms) {
function main() {
// startingUrl is the URL for the first page of the docs
// Get all of the URLs and write to URLs.txt
console.log("Crawling from %s", startingUrl);
getUrls(startingUrl);

console.log(startingUrl);

const yamlHeader = 'files:\n';

fs.writeFile('./combine.yaml', yamlHeader, err => {
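Because the diff interleaves the removed curl-based implementation with its axios replacement, the new `callGotenberg` is easier to read reassembled. The sketch below is the added code lightly tidied: the requires are gathered at the top (in the actual script `axios` and `fs` are loaded outside the hunks shown), the unused `path` require is dropped, and the redundant `await` on `response.data` is removed.

```js
const fs = require("fs");
const axios = require("axios");
const FormData = require("form-data");

async function callGotenberg(docusaurusUrl, fileName) {
  try {
    // Convert URL content to PDF using Gotenberg
    const form = new FormData();
    form.append("url", `${docusaurusUrl}`);

    const response = await axios.post(
      "http://gotenberg:3000/forms/chromium/convert/url",
      form,
      {
        headers: form.getHeaders(), // supplies the multipart boundary
        responseType: "arraybuffer", // keep the PDF bytes as a Buffer
      }
    );

    if (response.status !== 200) {
      throw new Error(`Failed to convert file: ${response.statusText}`);
    }

    // Save the converted file
    fs.writeFileSync(fileName, response.data);
  } catch (err) {
    console.error(err.message || err);
  }
}
```

Compared with the old `execSync` curl call, errors now surface through the try/catch rather than a child-process exit code, and the PDF bytes are written to disk from memory instead of being streamed by curl's `-o` flag.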
