Commit

51 24 attachment in body and bigger files (#60)
if0s authored Jun 9, 2023
1 parent 5dc2f98 commit e6ae872
Showing 17 changed files with 6,844 additions and 1,787 deletions.
2 changes: 2 additions & 0 deletions .circleci/config.yml
@@ -118,6 +118,8 @@ workflows:
jobs:
- build:
name: "Build and publish docker image"
context:
- componentspusher
filters:
branches:
ignore: /.*/
1 change: 1 addition & 0 deletions .gitignore
@@ -1,3 +1,4 @@
node_modules
coverage
.idea
.vscode
10 changes: 10 additions & 0 deletions .grype-ignore.yaml
@@ -0,0 +1,10 @@
ignore:
- vulnerability: CVE-2023-2650
package:
name: libssl3
version: 3.1.0-r4

- vulnerability: CVE-2023-2650
package:
name: libcrypto3
version: 3.1.0-r4
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,8 @@
## 1.4.0 (June 09, 2023)
* Implemented support for `attachments` inside the message body for the `XML Attachment to JSON` action
* Updated Sailor version to 2.7.1
* Removed old dependencies

## 1.3.7 (September 12, 2022)

* Deleted buildType from component.json to fix component build
52 changes: 43 additions & 9 deletions README.md
@@ -38,8 +38,12 @@ is equivalent to
```

#### Environment variables
* `MAX_FILE_SIZE`: *optional* - Controls the maximum size of an attachment to be written in MB.
Defaults to 10 MB where 1 MB = 1024 * 1024 bytes.
* `MAX_FILE_SIZE`: *optional* - Controls the maximum size of an attachment to be read or written in MB.

Defaults to 10 MB where 1 MB = 1024 * 1024 bytes.
* `EIO_REQUIRED_RAM_MB`: *optional* - You can increase the memory usage limit for the component if you are going to work with big files

Defaults to 256 MB where 1 MB = 1024 * 1024 bytes.

## Actions

@@ -74,9 +78,41 @@ will be converted into:
```

### XML Attachment to JSON
Looks at the JSON array of attachments passed in to component and converts all XML that it finds to generic JSON objects
and produces one outbound message per matching attachment. As input, the user can enter a pattern for filtering
files by name or leave this field empty for processing all incoming *.xml files.
#### Configuration Fields

* **Pattern** - (string, optional): RegEx for filtering files by name provided via old attachment mechanism (outside message body)
* **Upload single file** - (checkbox, optional): Use this option if you want to upload a single file

#### Input Metadata
If `Upload single file` is checked, there will be 2 fields:
* **URL** - (string, required): link to a file on the Internet or on the platform

If `Upload single file` is unchecked:
* **Attachments** - (array, required): Collection of files to upload; each record contains an object with two keys:
* **URL** - (string, required): link to a file on the Internet or on the platform

If you are going to use this option with static data, you need to switch to Developer mode
<details><summary>Sample</summary>
<p>

```json
{
"attachments": [
{
"url": "https://example.com/files/file1.xml"
},
{
"url": "https://example.com/files/file2.xml"
}
]
}
```
</p>
</details>
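
The order in which the action picks its input files, as changed in `lib/actions/attachmentToJson.js` in this commit, can be sketched roughly as follows. This is a minimal illustration, not the component's code: `collectFiles` is a hypothetical helper name, and the plain `RegExp.test` filter is a simplifying assumption standing in for the component's own `checkFileName` function.

```javascript
// Hypothetical helper (not part of the component) mirroring the branch order
// used by the action: single file first, then attachments in the message body,
// then the legacy attachments outside the body.
function collectFiles(msg, cfg) {
  const body = msg.body || {};
  const attachments = msg.attachments || {};
  const { pattern = '(.xml)', uploadSingleFile } = cfg || {};

  // 1. "Upload single file" checked: the body itself is the file record ({ url }).
  if (uploadSingleFile) return [body];

  // 2. Attachments passed inside the message body.
  if (Array.isArray(body.attachments) && body.attachments.length > 0) {
    return [...body.attachments];
  }

  // 3. Legacy attachments keyed by file name, filtered by the pattern.
  // Assumption: a plain regex test replaces the real checkFileName logic.
  const regex = new RegExp(pattern);
  return Object.keys(attachments)
    .filter((fileName) => regex.test(fileName))
    .map((fileName) => ({ fileName, ...attachments[fileName] }));
}
```

Under these assumptions, a message carrying both `body.attachments` and legacy attachments yields only the body entries, and the `Pattern` field only affects the legacy branch.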

#### Output Metadata

Resulting JSON object

### JSON to XML
Provides an input where a user provides a JSONata expression that should evaluate to an object to convert to XML.
@@ -104,12 +140,10 @@ The incoming message should have a single field `input`. When using integrator m
```

## Known limitations
- The maximum size of incoming file for processing is 5 MiB. If the size of incoming file will be more than 5 MiB,
action will throw error `Attachment *.xml is to large to be processed by XML component. File limit is: 5242880 byte,
file given was: * byte.`.
- All actions involving attachments are not supported on local agents due to current platform limitations.
- When creating XML files with invalid XML tags, the name of the potentially invalid tag will not be reported.

- When you try to retrieve a sample in the `XML Attachment to JSON` action and its size is more than 500 KB, a new, smaller sample with the same structure as the original will be generated

## Additional Info
Icon made by Freepik from www.flaticon.com

16 changes: 11 additions & 5 deletions component.json
@@ -1,6 +1,6 @@
{
"title": "XML",
"version": "1.3.7",
"version": "1.4.0",
"description": "Component to convert between XML and JSON data",
"actions": {
"xmlToJson": {
@@ -97,12 +97,18 @@
"required": false,
"viewClass": "TextFieldView",
"placeholder": "Pattern"
},
"uploadSingleFile": {
"label": "Upload single file",
"viewClass": "CheckBoxView",
"required": false,
"order": 70,
"help": {
"description": "Use this option if you want to upload a single file"
}
}
},
"metadata": {
"in": {},
"out": "./lib/schemas/attachmentToJson.out.json"
}
"dynamicMetadata": true
}
}
}
124 changes: 69 additions & 55 deletions lib/actions/attachmentToJson.js
@@ -1,12 +1,21 @@
/* eslint-disable no-await-in-loop */
const sizeof = require('object-sizeof');
/* eslint-disable no-await-in-loop, no-restricted-syntax, max-len, no-unused-vars, no-loop-func */
const { AttachmentProcessor } = require('@elastic.io/component-commons-library');
const { messages } = require('elasticio-node');
const { getUserAgent } = require('../utils');

const { createSchema } = require('genson-js');
const jsf = require('json-schema-faker');
const sizeof = require('object-sizeof');
const { newMessageWithBody } = require('elasticio-node/lib/messages');
const { readFile, writeFile, stat } = require('fs/promises');
const {
getUserAgent,
MAX_FILE_SIZE,
MAX_FILE_SIZE_FOR_SAMPLE,
memUsage,
} = require('../utils');
const attachmentToJsonIn = require('../schemas/attachmentToJson.in.json');
const xml2Json = require('../xml2Json');

const MAX_FILE_SIZE = 5242880; // 5 MiB
const isDebugFlow = process.env.ELASTICIO_FLOW_TYPE === 'debug';
const tempFile = '/tmp/data.json';

function checkFileName(self, fileName, pattern) {
if (fileName === undefined) {
@@ -26,61 +35,66 @@ function checkFileName(self, fileName, pattern) {
return true;
}

const tooLargeErrMsg = (fileName, fileSize) => `Attachment ${fileName} is too large to be processed by XML component. `
+ `File limit is: ${MAX_FILE_SIZE} byte, file given was: ${fileSize} byte.`;

module.exports.process = async function processAction(msg, cfg) {
const self = this;
const { attachments } = msg;
const pattern = new RegExp(cfg !== undefined ? cfg.pattern || '(.xml)' : '(.xml)');
let foundXML = false;
const { attachments, body = {} } = msg;
const { pattern = '(.xml)', uploadSingleFile } = cfg || {};
const files = [];
if (uploadSingleFile) {
files.push(msg.body);
} else if (body.attachments && body.attachments.length > 0) {
files.push(...(body.attachments || []));
} else if (Object.keys(attachments || {}).length > 0) {
const filteredFiles = Object.keys(attachments)
.map((key) => ({ fileName: key, ...attachments[key] }))
.filter((file) => checkFileName(self, file.fileName, new RegExp(pattern)));
const tooLarge = filteredFiles.find((file) => file.size && file.size > MAX_FILE_SIZE);
if (tooLarge) throw new Error(tooLargeErrMsg(tooLarge.fileName, tooLarge.size));
files.push(...filteredFiles);
}

self.logger.info('Attachment to XML started');
self.logger.info('Found %s attachments', Object.keys(attachments || {}).length);
self.logger.info(`Attachment to XML started\nFound ${files.length} attachments`);

const attachmentProcessor = new AttachmentProcessor(getUserAgent(), msg.id);
// eslint-disable-next-line no-restricted-syntax
for (const key of Object.keys(attachments)) {
const attachment = attachments[key];
const fileName = key;
// get file size based attachment object may not be define or be accurate
let fileSize = attachment.size;
for (const file of files) {
self.logger.info('Processing attachment');

if (checkFileName(self, fileName, pattern)) {
if (fileSize === undefined || fileSize < MAX_FILE_SIZE) {
// eslint-disable-next-line no-await-in-loop
const response = await attachmentProcessor.getAttachment(attachment.url, 'arraybuffer');

this.logger.debug(`For provided filename response status: ${response.status}`);

if (response.status >= 400) {
throw new Error(`Error in making request to ${attachment.url}
Status code: ${response.status},
Body: ${Buffer.from(response.data, 'binary').toString('base64')}`);
}

const responseBodyString = Buffer.from(response.data, 'binary').toString('utf-8');

if (!responseBodyString) {
throw new Error(`Empty attachment received for file ${fileName}`);
}

fileSize = sizeof(responseBodyString);

if (fileSize < MAX_FILE_SIZE) {
const returnMsg = await xml2Json.process(this, responseBodyString);
this.logger.debug('Attachment to XML finished');
foundXML = true;
await self.emit('data', messages.newMessageWithBody(returnMsg.body));
} else {
throw new Error(`Attachment ${key} is too large to be processed my XML component.`
+ ` File limit is: ${MAX_FILE_SIZE} byte, file given was: ${fileSize} byte.`);
}
} else {
throw new Error(`Attachment ${key} is too large to be processed my XML component.`
+ ` File limit is: ${MAX_FILE_SIZE} byte, file given was: ${fileSize} byte.`);
}
let response = await attachmentProcessor.getAttachment(file.url, 'text');
this.logger.debug(`For provided filename response status: ${response.status}`);
let responseBodyString = response.data;
if (response.status >= 400) {
throw new Error(`Error in making request to ${file.url} Status code: ${response.status}, Body: ${responseBodyString}`);
}
}
if (!foundXML) {
self.logger.info('No XML files that match the pattern found within attachments');
if (!responseBodyString) {
throw new Error(`Empty attachment received for file ${file.fileName || ''}`);
}
const fileSize = response.headers['content-length'];
response = null;
if (Number(fileSize) > MAX_FILE_SIZE) throw new Error(tooLargeErrMsg(file.fileName || '', fileSize));
let { body: json } = await xml2Json.process(this, responseBodyString);
responseBodyString = null;
await writeFile(tempFile, JSON.stringify(json));
if ((await stat(tempFile)).size > MAX_FILE_SIZE_FOR_SAMPLE && isDebugFlow) {
this.logger.warn('The message size exceeded the sample size limit. To match the limitation we will generate a smaller sample using the structure/schema from the original file.');
const schema = createSchema(json);
jsf.option({
alwaysFakeOptionals: true,
fillProperties: false,
});
json = jsf.generate(schema);
}
this.logger.debug(`Attachment to XML finished, emitting message. ${memUsage()}`);
await self.emit('data', newMessageWithBody(json));
}
};

async function getMetaModel(cfg) {
return {
in: cfg.uploadSingleFile ? attachmentToJsonIn.properties.attachments.items : attachmentToJsonIn,
out: {},
};
}

module.exports.getMetaModel = getMetaModel;
17 changes: 9 additions & 8 deletions lib/actions/jsonToXml.js
@@ -1,14 +1,14 @@
/* eslint-disable no-param-reassign */
const { AttachmentProcessor } = require('@elastic.io/component-commons-library');
const { messages } = require('elasticio-node');
const xml2js = require('xml2js');
const _ = require('lodash');
const { getUserAgent } = require('../utils');

const MB_TO_BYTES = 1024 * 1024;
const MAX_FILE_SIZE = process.env.MAX_FILE_SIZE * MB_TO_BYTES || 10 * MB_TO_BYTES;
const { Readable } = require('stream');
const { getUserAgent, MAX_FILE_SIZE } = require('../utils');

module.exports.process = async function process(msg, cfg) {
const { input } = msg.body;
const msgid = msg.id;
const { uploadToAttachment, excludeXmlHeader, headerStandalone } = cfg;

this.logger.info('Message received.');
@@ -34,8 +34,8 @@ module.exports.process = async function process(msg, cfg) {
throw new Error('Input must be an object with exactly one key.');
}

const xml2String = () => builder.buildObject(input);
const xmlString = xml2String();
const xmlString = builder.buildObject(input);
msg = null;

if (!uploadToAttachment) {
this.logger.info('Sending XML data in message.');
Expand All @@ -50,9 +50,10 @@ module.exports.process = async function process(msg, cfg) {
throw new Error(`XML data is ${attachmentSize} bytes, and is too large to upload as an attachment. Max attachment size is ${MAX_FILE_SIZE} bytes`);
}
this.logger.info(`Will create XML attachment of size ${attachmentSize} byte(s)`);
const getAttachment = async () => Readable.from([xmlString]);

const attachmentProcessor = new AttachmentProcessor(getUserAgent(), msg.id);
const createdAttachmentId = await attachmentProcessor.uploadAttachment(xml2String);
const attachmentProcessor = new AttachmentProcessor(getUserAgent(), msgid);
const createdAttachmentId = await attachmentProcessor.uploadAttachment(getAttachment);
const attachmentUrl = attachmentProcessor.getMaesterAttachmentUrlById(createdAttachmentId);
this.logger.info('Attachment created successfully');

67 changes: 0 additions & 67 deletions lib/actions/parse.js

This file was deleted.
