Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch insert #5086

Closed
kevinburke opened this issue Oct 19, 2015 · 7 comments
Closed

Batch insert #5086

kevinburke opened this issue Oct 19, 2015 · 7 comments

Comments

@kevinburke
Copy link
Contributor

Currently Model.create([record1, record2, ..., recordN]) opens N connections and issues N inserts of one record each.

Postgres supports inserting multiple records at once, for example:

INSERT INTO films (code, title, did, date_prod, kind) VALUES
    ('B6717', 'Tampopo', 110, '1985-02-10', 'Comedy'),
    ('HG120', 'The Dinner Game', 140, DEFAULT, 'Comedy');

I believe this would be a good change, since it opens 1 connection instead of N connections, and it maps more intuitively to what you think would happen if you called Model.create([]).

However, in the current state, if a record fails validation in the database (a uniqueness constraint, foreign key reference, not null constraint, or check constraint, for example), the current master will insert every other record. Multi-row INSERT INTO ... VALUES () will not insert any rows if one row fails constraint checks. This is a breaking change and would probably require a major version bump. (That said I'm not sure whether sails-postgresql guarantees all N-1 rows get written if one row fails? does async.each early exit if one function throws?)

Another option might be to issue the queries sequentially on one connection:

INSERT INTO films (code, title, did, date_prod, kind) VALUES
    ('B6717', 'Tampopo', 110, '1985-02-10', 'Comedy') RETURNING *;
INSERT INTO films (code, title, did, date_prod, kind) VALUES
    ('HG120', 'The Dinner Game', 140, DEFAULT, 'Comedy') RETURNING *;

I'm not sure how the client would handle multiple RETURNING * lines, maybe it would and maybe it wouldn't. This would not write any rows after a failing row, but it would write any rows beforehand.

I believe createEach would have to be modified to avoid the async.each call, and I don't believe waterline-sequel supports batch insert so we'd need to build our own query there.

@tjwebb
Copy link
Contributor

tjwebb commented Oct 19, 2015

@kevinburke I know I've mentioned this before, but this issue is also resolved in the es6/knex re-write:

Knex supports batch insert ootb, and I bypass waterline-sequel altogether. I understand if you guys aren't ready to jump to a new adapter, but I'm interested in any feedback you might be able to offer.

@tjwebb
Copy link
Contributor

tjwebb commented Oct 19, 2015

if a record fails validation in the database (a uniqueness constraint, foreign key reference, not null constraint, or check constraint, for example), the current master will insert every other record.

This is probably because, without an explicit createEach implementation, each create is treated by Waterline as a separate operation. I think also the new adapter would solve this problem by implementing the createEach method.

It does constitute an unfortunate semantic difference between the reference implementation and the adapter-specific implementation. In this case I think it's correct to fail the entire insert if any one of the records fails; this would require a fix in Waterline core.

@particlebanana
Copy link
Contributor

Part of the issue with the non-optimized createEach is because we stupidly added nested creates which looking back was a horrible idea.

https://github.com/balderdashy/waterline/issues/1007#issuecomment-145177976

@sailsbot
Copy link

Thanks for posting, @kevinburke. I'm a repo bot-- nice to meet you!

It has been 30 days since there have been any updates or new comments on this page. If this issue has been resolved, feel free to disregard the rest of this message. On the other hand, if you are still waiting on a patch, please:

  • review our contribution guide to make sure this submission meets our criteria (only verified bugs with documented features, please; no questions, commentary, or bug reports about undocumented features or unofficial plugins)
  • create a new issue with the latest information, including updated version details with error messages, failing tests, and a link back to the original issue. This allows GitHub to automatically create a back-reference for future visitors arriving from search engines.

Thanks so much for your help!

@devinivy
Copy link

Hey @kevinburke @kevinburkeshyp,
I saw three great feature requests of yours were closed over night by the bot. Do you mind making a ROADMAP.md file in this repo with #198, #195, and this issue added? For an example, see the waterline roadmap: balderdashy/waterline@e40dd02

If you don't want to do this or don't have time, let me know here and I'll reopen those issues until they are placed in a roadmap file.

particlebanana referenced this issue in balderdashy/sails-postgresql Nov 23, 2015
@particlebanana
Copy link
Contributor

@mikermcneil
Copy link
Member

For posterity: you can check out more info on where this stands in the conversation here: https://gitter.im/balderdashy/sails?at=5812a124806316005dc9e6fb

@balderdashy balderdashy locked and limited conversation to collaborators Oct 28, 2016
@johnabrams7 johnabrams7 pinned this issue Apr 30, 2019
@johnabrams7 johnabrams7 unpinned this issue Apr 30, 2019
@balderdashy balderdashy unlocked this conversation Apr 30, 2019
@johnabrams7 johnabrams7 transferred this issue from balderdashy/sails-postgresql Apr 30, 2019
@balderdashy balderdashy locked as resolved and limited conversation to collaborators Apr 30, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

No branches or pull requests

6 participants