perf(core): Batch items sent in runonceforeachitem mode (no-changelog) #11870

Merged

tomi (Collaborator) commented Nov 25, 2024

Summary

To reduce the memory footprint, execute runOnceForEachItem in chunks of 1000
items at a time. This makes execution a bit slower but reduces peak memory
usage. In the future we could consider, e.g., sampling the input data and
determining the chunk size based on that.
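
A minimal sketch of the chunking idea described above, not the code changed in this PR. The names `chunked` and `runOnceForEachItemChunked` are hypothetical, and the per-item callback stands in for whatever the runner actually executes. The point is only that each step works on a slice of at most `chunkSize` items while item indices stay global across chunks.

```javascript
// Hypothetical helper: yields consecutive slices of `items`, each at most
// `chunkSize` long, together with the global index of the slice's first item.
function* chunked(items, chunkSize = 1000) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  for (let start = 0; start < items.length; start += chunkSize) {
    yield { startIndex: start, chunk: items.slice(start, start + chunkSize) };
  }
}

// Hypothetical driver: applies `perItem` to every item, one chunk at a time.
// Only the current chunk is handed to the processing step; results are
// collected in input order.
function runOnceForEachItemChunked(items, perItem, chunkSize = 1000) {
  const output = [];
  for (const { startIndex, chunk } of chunked(items, chunkSize)) {
    chunk.forEach((item, offset) => {
      // Pass the global index so the callback sees the same index it would
      // see without chunking.
      output.push(perItem(item, startIndex + offset));
    });
  }
  return output;
}
```

With 2,500 input items and the default chunk size, `chunked` yields slices of 1000, 1000, and 500 items. The trade-off matches the one stated in the summary: more round trips in exchange for a lower peak in the amount of data held at once.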

Also remove N8N_RUNNERS_ASSERT_DEDUPLICATION_OUTPUT as it can't be checked in all cases anymore and we haven't had any assertion issues with it.

Tested using this workflow:
{
  "meta": {
    "instanceId": "5b46fac5e9673392b3401e98bcc7f9ea17bef74f40d13b962cd1ae3eda2b46e0"
  },
  "nodes": [
    {
      "parameters": {},
      "id": "d4ef24a2-e448-479d-ae04-2ffa58988c8b",
      "name": "When clicking ‘Test workflow’",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        460,
        460
      ],
      "typeVersion": 1
    },
    {
      "parameters": {
        "assignments": {
          "assignments": [
            {
              "id": "be9f61bf-17d0-448a-b5c9-dc9a0823a9f5",
              "name": "fullName",
              "value": "={{ $json.firstName }} {{ $json.lastName }}",
              "type": "string"
            },
            {
              "id": "506ecd03-53be-4a7e-b1ce-dd892260931b",
              "name": "updatedAt",
              "value": "={{ $now.toISO() }}",
              "type": "string"
            }
          ]
        },
        "includeOtherFields": true,
        "options": {}
      },
      "id": "49124629-9c05-44c4-8952-1335ddd10427",
      "name": "Edit Fields",
      "type": "n8n-nodes-base.set",
      "typeVersion": 3.4,
      "position": [
        920,
        460
      ]
    },
    {
      "parameters": {
        "jsCode": "function getRandomUser(idx) {\n  const firstNames = [\"Alice\", \"Bob\", \"Charlie\", \"David\", \"Eve\"];\n  const lastNames = [\"Smith\", \"Johnson\", \"Brown\", \"Williams\", \"Jones\"];\n  const domains = [\"example.com\", \"test.com\", \"domain.com\"];\n\n  const getRandomElement = (arr) => arr[Math.floor(Math.random() * arr.length)];\n  const getRandomInt = (min, max) => Math.floor(Math.random() * (max - min + 1)) + min;\n\n  const firstName = getRandomElement(firstNames);\n  const lastName = getRandomElement(lastNames);\n  const age = getRandomInt(18, 70);\n  const email = `${firstName.toLowerCase()}.${lastName.toLowerCase()}@${getRandomElement(domains)}`;\n  const userId = getRandomInt(1000, 9999);\n\n  return {\n    idx,\n    userId,\n    firstName,\n    lastName,\n    age,\n    email,\n    isActive: Math.random() > 0.5,\n    createdAt: new Date(Date.now() - getRandomInt(0, 1000 * 60 * 60 * 24 * 365)).toISOString(),\n  };\n}\n\nreturn Array.from({ length: 10_000 }).map((_, idx) => getRandomUser(idx))"
      },
      "id": "baaa6158-553b-41df-bd4a-8b4b5458c07b",
      "name": "CreateData",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        680,
        460
      ]
    },
    {
      "parameters": {
        "mode": "runOnceForEachItem",
        "jsCode": "\nreturn {\n  ...$json,\n  domain: $json.email.split(\"@\")[1]\n}"
      },
      "id": "502761a1-d54f-40ef-a0c7-3d00e8378ac3",
      "name": "AddField",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1140,
        460
      ]
    },
    {
      "parameters": {
        "jsCode": "let maxId = 0\nlet sumAge = 0\n\nfor (let it of $input.all()) {\n  maxId = Math.max(maxId, it.json.userId)\n  sumAge += it.json.age\n}\n\nreturn {\n  json: {\n    maxId,\n    avgAge: sumAge / $input.all().length\n  }\n}"
      },
      "id": "e4a21d7d-6ac6-4d40-b9aa-63076a4823c4",
      "name": "Aggregate",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1400,
        460
      ]
    },
    {
      "parameters": {
        "jsCode": "const items = $('CreateData').all()\nconst stats = $input.first().json\n\nreturn items.map(i => ({\n  ...i,\n  stats\n}))"
      },
      "id": "9f1f3762-5870-4905-a67e-4a8a92746a2f",
      "name": "AccessPastNode",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1620,
        460
      ]
    }
  ],
  "connections": {
    "When clicking ‘Test workflow’": {
      "main": [
        [
          {
            "node": "CreateData",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Edit Fields": {
      "main": [
        [
          {
            "node": "AddField",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "CreateData": {
      "main": [
        [
          {
            "node": "Edit Fields",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "AddField": {
      "main": [
        [
          {
            "node": "Aggregate",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Aggregate": {
      "main": [
        [
          {
            "node": "AccessPastNode",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "pinData": {}
}
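
To sanity-check what the workflow above should produce, its nodes can be approximated in plain JavaScript outside n8n. This is a sketch under assumptions: n8n's `$json`, `$input`, and `$('CreateData')` are replaced by ordinary arrays and arguments, the random values in CreateData are replaced by deterministic ones so runs are repeatable, and the chunking of the AddField step is a stand-in loop, not the runner's implementation. All function names are hypothetical.

```javascript
// CreateData stand-in: same fields as the workflow's generator, but
// deterministic (derived from idx) instead of Math.random().
function createData(count) {
  const firstNames = ["Alice", "Bob", "Charlie", "David", "Eve"];
  const lastNames = ["Smith", "Johnson", "Brown", "Williams", "Jones"];
  const domains = ["example.com", "test.com", "domain.com"];
  return Array.from({ length: count }, (_, idx) => {
    const firstName = firstNames[idx % firstNames.length];
    const lastName = lastNames[idx % lastNames.length];
    const domain = domains[idx % domains.length];
    return {
      json: {
        idx,
        userId: 1000 + (idx % 9000),
        firstName,
        lastName,
        age: 18 + (idx % 53),
        email: `${firstName.toLowerCase()}.${lastName.toLowerCase()}@${domain}`,
      },
    };
  });
}

// Edit Fields: adds fullName and updatedAt, keeps the other fields.
const editFields = (item) => ({
  json: {
    ...item.json,
    fullName: `${item.json.firstName} ${item.json.lastName}`,
    updatedAt: new Date().toISOString(),
  },
});

// AddField (the runOnceForEachItem node): derives the email domain.
const addField = (item) => ({
  json: { ...item.json, domain: item.json.email.split("@")[1] },
});

// Aggregate: max userId and average age over all items.
function aggregate(items) {
  let maxId = 0;
  let sumAge = 0;
  for (const it of items) {
    maxId = Math.max(maxId, it.json.userId);
    sumAge += it.json.age;
  }
  return { json: { maxId, avgAge: sumAge / items.length } };
}

// AccessPastNode: re-reads CreateData's output and attaches the stats.
const accessPastNode = (createdItems, stats) =>
  createdItems.map((i) => ({ ...i, stats }));

function runWorkflow(count = 10_000, chunkSize = 1000) {
  const created = createData(count);
  const edited = created.map(editFields);
  const withDomain = [];
  // AddField is applied chunk by chunk; the output order is unchanged.
  for (let start = 0; start < edited.length; start += chunkSize) {
    for (const item of edited.slice(start, start + chunkSize)) {
      withDomain.push(addField(item));
    }
  }
  const stats = aggregate(withDomain).json;
  return { withDomain, stats, final: accessPastNode(created, stats) };
}
```

Running `runWorkflow()` with different `chunkSize` values should give identical `stats`, which is the property the test workflow exercises: chunking changes how items are fed to the per-item step, not what comes out.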

Related Linear tickets, GitHub issues, and Community forum posts

https://linear.app/n8n/issue/CAT-300/[memory]-batch-items-sent-in-runonceforeachitem-mode

Review / Merge checklist

  • PR title and summary are descriptive. (conventions)
  • Docs updated or follow-up ticket created.
  • Tests included.
  • PR Labeled with release/backport (if the PR is an urgent fix that needs to be backported)

@tomi tomi force-pushed the cat-300-memory-batch-items-sent-in-runonceforeachitem-mode branch from 02776b7 to 160e6e6 Compare November 25, 2024 11:34
@tomi tomi force-pushed the cat-300-memory-batch-items-sent-in-runonceforeachitem-mode branch from 160e6e6 to c28fe39 Compare November 25, 2024 11:55
@n8n-assistant n8n-assistant bot added core Enhancement outside /nodes-base and /editor-ui n8n team Authored by the n8n team node/improvement New feature or request labels Nov 25, 2024
codecov bot commented Nov 25, 2024

Codecov Report

Attention: Patch coverage is 85.29412% with 5 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...kages/nodes-base/nodes/Code/JsTaskRunnerSandbox.ts 84.21% 3 Missing ⚠️
...rs/task-managers/data-request-response-stripper.ts 91.66% 1 Missing ⚠️
packages/nodes-base/nodes/Code/Code.node.ts 0.00% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

ivov (Contributor) left a comment

Really nice work!! Small comments only :)

When running the workflow it ended with an error, but I take it it's an unrelated syntax issue - all 10k items are in there

[Screenshot: Capture 2024-11-25 at 17 13 09@2x]

tomi (Collaborator, Author) commented Nov 26, 2024

> When running the workflow it ended with an error, but I take it it's an unrelated syntax issue - all 10k items are in there

@ivov Interesting.. I was not able to reproduce this. Does it happen consistently?

tomi (Collaborator, Author) commented Nov 26, 2024

@ivov thank you for the review 🙌 Addressed all the comments. Please have another look 🙇

@tomi tomi requested a review from ivov November 26, 2024 08:34
cypress bot commented Nov 26, 2024

n8n, Run #8083

Run status: Passed
Run duration: 04m 32s
Commit: dd5fb936a8
Committer: Tomi Turtiainen
Branch: cat-300-memory-batch-items-sent-in-runonceforeachitem-mode
Environment: browsers:node18.12.0-chrome107, e2e/*

Test results
Failures: 0
Flaky: 1
Pending (skipped via .skip): 0
Skipped (failure in a mocha hook): 0
Passing: 478

✅ All Cypress E2E specs passed

ivov (Contributor) commented Nov 26, 2024

> > When running the workflow it ended with an error, but I take it it's an unrelated syntax issue - all 10k items are in there
>
> @ivov Interesting.. I was not able to reproduce this. Does it happen consistently?

This happens consistently. We should look into this separately.

@tomi tomi merged commit e22d0f3 into master Nov 26, 2024
35 checks passed
@tomi tomi deleted the cat-300-memory-batch-items-sent-in-runonceforeachitem-mode branch November 26, 2024 10:21
janober (Member) commented Nov 27, 2024

Got released with [email protected]

riascho pushed a commit that referenced this pull request Jan 14, 2025