Make stream() sample size configurable
Improve document stream() detection limitation.

Related: #426, #452
Borewit committed Jul 22, 2021
1 parent 8fa426b commit f68ae1b
Showing 3 changed files with 57 additions and 26 deletions.
40 changes: 22 additions & 18 deletions core.d.ts
@@ -356,42 +356,46 @@ declare namespace core {
const mimeTypes: Set<core.MimeType>;

/**
* Stream options.
* Option object, used by `stream()`.
*/
interface IStreamOptions {
interface StreamOptions {
/**
* Sample size in bytes.
*/
readonly sampleSize?: number
}

/**
Detect the file type of a readable stream.
Returns a `Promise` which resolves to the original readable stream argument, but with an added `fileType` property, which is an object like the one returned from `FileType.fromFile()`.
This method can be handy to put in between a stream, but it comes with a price.
Internally `stream()` builds up a buffer of `sampleSize` bytes, used as a sample, to determine the file type.
The sample size impacts the file detection resolution.
A smaller sample size will result in lower probability of the best file type detection.
*Note:* This method is only available using Node.js.
@param readableStream - A [readable stream](https://nodejs.org/api/stream.html#stream_class_stream_readable) containing a file to examine.
@param options - Options
@param options - Option object
@returns A `Promise` which resolves to the original readable stream argument, but with an added `fileType` property, which is an object like the one returned from `FileType.fromFile()`.
@example
```
import * as fs from 'fs';
import * as crypto from 'crypto';
import fileType = require('file-type');
```js
const got = require('got');
const FileType = require('file-type');
(async () => {
const read = fs.createReadStream('encrypted.enc');
const decipher = crypto.createDecipheriv(alg, key, iv);
const stream = await fileType.stream(read.pipe(decipher), {sampleSize: 1024});
console.log(stream.fileType);
//=> {ext: 'mov', mime: 'video/quicktime'}
const url = 'https://upload.wikimedia.org/wikipedia/en/a/a9/Example.jpg';
const write = fs.createWriteStream(`decrypted.${stream.fileType.ext}`);
stream.pipe(write);
(async () => {
const stream1 = got.stream(url);
const stream2 = await FileType.stream(stream1, {sampleSize: 1024});
if (stream2.fileType && stream2.fileType.mime === 'image/jpeg') {
// stream2 can be used to stream the JPEG image (from the very beginning of the stream)
}
})();
```
*/
function stream(readableStream: ReadableStream, options?: IStreamOptions): Promise<core.ReadableStreamWithFileType>
function stream(readableStream: ReadableStream, options?: StreamOptions): Promise<core.ReadableStreamWithFileType>
}

export = core;
39 changes: 33 additions & 6 deletions readme.md
@@ -278,20 +278,47 @@ Type: [`ITokenizer`](https://github.com/Borewit/strtok3#tokenizer)

A file source implementing the [tokenizer interface](https://github.com/Borewit/strtok3#tokenizer).

### FileType.stream(readableStream, sampleSize)

Detect the file type of a readable stream.

If `sampleSize` is not provided, a backward compatible sample size of 4100 bytes is used.
### FileType.stream(readableStream, options?)

Returns a `Promise` which resolves to the original readable stream argument, but with an added `fileType` property, which is an object like the one returned from `FileType.fromFile()`.

This method can be handy to put in between a stream, but it comes with a price.
Internally `stream()` builds up a buffer of `sampleSize` bytes, used as a sample, to determine the file type.
The sample size impacts the file detection resolution. A smaller sample size will result in lower probability of the best file type detection.
The sample size impacts the file detection resolution.
A smaller sample size will result in lower probability of the best file type detection.

*Note:* This method is only available using Node.js.
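
To see why a smaller sample lowers the chance of a successful detection, note that some formats carry their signature far from the start of the file. The following is a minimal sketch, not file-type's detection logic: `SIGNATURES` and `detectFromSample` are hypothetical names introduced here, and the table holds only two well-known signatures (JPEG at offset 0 and the tar `ustar` magic at offset 257).

```js
// Illustrative only: a tiny signature table, not file-type's real detection logic.
// Each entry states which bytes must appear at which offset.
const SIGNATURES = [
	{ext: 'jpg', mime: 'image/jpeg', offset: 0, magic: Buffer.from([0xFF, 0xD8, 0xFF])},
	{ext: 'tar', mime: 'application/x-tar', offset: 257, magic: Buffer.from('ustar')}
];

// Hypothetical helper: detect a type using only the first `sampleSize` bytes of `data`.
function detectFromSample(data, sampleSize) {
	const sample = data.subarray(0, sampleSize);
	for (const {ext, mime, offset, magic} of SIGNATURES) {
		const end = offset + magic.length;
		// A signature that lies (even partly) beyond the sample cannot be matched.
		if (end <= sample.length && sample.subarray(offset, end).equals(magic)) {
			return {ext, mime};
		}
	}

	return undefined;
}

// Build a fake tar header: 512 zero bytes with "ustar" written at offset 257.
const fakeTar = Buffer.alloc(512);
fakeTar.write('ustar', 257);

console.log(detectFromSample(fakeTar, 100));
//=> undefined
console.log(detectFromSample(fakeTar, 4100));
//=> { ext: 'tar', mime: 'application/x-tar' }
```

With a 100-byte sample the `ustar` marker is never seen, so the result is `undefined`; a sample that reaches offset 262 or beyond finds it.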

#### readableStream
Type: [`Readable`](https://nodejs.org/api/stream.html#stream_class_stream_readable)

#### options
Type: `Object`, for example:
```js
{ sampleSize: 400 }
```

##### sampleSize
Type: `number`, change default sample size of 4100 bytes.

#### Example

```js
const got = require('got');
const FileType = require('file-type');

const url = 'https://upload.wikimedia.org/wikipedia/en/a/a9/Example.jpg';

(async () => {
const stream1 = got.stream(url);
const stream2 = await FileType.stream(stream1, {sampleSize: 1024});
if (stream2.fileType && stream2.fileType.mime === 'image/jpeg') {
// stream2 can be used to stream the JPEG image (from the very beginning of the stream)
}
})();
```
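
The comment in the example says the returned stream can still be read from its very beginning even though a sample was taken from it. One general way to get that behaviour in Node.js is to read the sample and push it back with `readable.unshift()`. The sketch below shows only that pattern. `peekSample` is a hypothetical helper, and this commit does not show whether `FileType.stream()` works this way internally.

```js
const {Readable} = require('stream');

// Hypothetical helper: read up to `sampleSize` bytes, then put them back so
// the caller can still consume the stream from its first byte.
function peekSample(readable, sampleSize) {
	return new Promise((resolve, reject) => {
		readable.once('error', reject);
		readable.once('readable', () => {
			// read(n) returns null when fewer than n bytes are buffered and the
			// stream has not ended; in that case take whatever is available.
			// (Simplification: the sample may then be shorter than sampleSize.)
			const sample = readable.read(sampleSize) || readable.read() || Buffer.alloc(0);
			if (sample.length > 0) {
				// Push the sample back to the front of the internal buffer.
				readable.unshift(sample);
			}

			readable.removeListener('error', reject);
			resolve(sample);
		});
	});
}

(async () => {
	// objectMode: false makes Readable.from() produce a byte stream.
	const source = Readable.from([Buffer.from('hello world')], {objectMode: false});

	const sample = await peekSample(source, 5);
	console.log(sample.toString());
	//=> hello

	// The stream still yields everything, including the sampled bytes.
	const chunks = [];
	for await (const chunk of source) {
		chunks.push(chunk);
	}

	console.log(Buffer.concat(chunks).toString());
	//=> hello world
})();
```

The sample is inspected without being consumed, so downstream consumers do not need to know that a sample was taken.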


#### readableStream

Type: [`stream.Readable`](https://nodejs.org/api/stream.html#stream_class_stream_readable)
4 changes: 2 additions & 2 deletions test.js
@@ -337,7 +337,7 @@ test('.stream() method - short stream', async t => {
t.deepEqual(bufferA, bufferB);
});

test('.stream() method - no End-Of-Stream errors', async t => {
test('.stream() method - no end-of-stream errors', async t => {
const file = path.join(__dirname, 'fixture', 'fixture.ogm');
const stream = await FileType.stream(fs.createReadStream(file), {sampleSize: 30});
t.is(stream.fileType, undefined);
@@ -357,7 +357,7 @@ test('.stream() method - error event', async t => {
await t.throwsAsync(FileType.stream(readableStream), errorMessage);
});

test('.stream() method - sampleSize', async t => {
test('.stream() method - sampleSize option', async t => {
const file = path.join(__dirname, 'fixture', 'fixture.ogm');
let stream = await FileType.stream(fs.createReadStream(file), {sampleSize: 30});
t.is(typeof (stream.fileType), 'undefined', 'file-type cannot be determined with a sampleSize of 30');