Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generating sql boolean field #659

Open
tsirkin opened this issue May 11, 2022 · 4 comments
Open

Generating sql boolean field #659

tsirkin opened this issue May 11, 2022 · 4 comments

Comments

@tsirkin
Copy link

tsirkin commented May 11, 2022

Hi guys,
Great project! Really.

I am trying to generate sql table with a boolean column

The recipe file

- var: snowfakery_locale
  value: en_US

- object: Person
  fields:
    derived: ${{fake.Boolean(chance_of_getting_true=50)}}

And here is the output of snowfakery test.yml --output-format=sql .

BEGIN TRANSACTION;
CREATE TABLE "Person" (
	id INTEGER NOT NULL, 
	derived VARCHAR(255), 
	PRIMARY KEY (id)
);
INSERT INTO "Person" VALUES(1,'False');
COMMIT;

How do I generate proper boolean column please?

@prescod
Copy link
Contributor

prescod commented May 11, 2022

Salesforce's bulk data APIs are untyped so we have not had any requests so far for strongly typed output.

Also: Snowfakery does not enforce that a column is homogenous in its type.

I'd be curious about your use-case.

The overall solution is to write your schema by hand and then pull in the untyped data into it.

For example, you could write a schema.sql and then a shell script which creates a new DB, evaluates schema.sql and then imports the CSV.

https://www.sqlitetutorial.net/sqlite-import-csv/

Something roughly like this should work:

sqlite3 yourdb.db -cmd ".read schema.sql" -cmd ".import Account.csv Account" -cmd ".import Contact.csv Contact" -cmd ".dump"

Or you could use Python:

https://mungingdata.com/sqlite/create-database-load-csv-python/

(the use of Pandas is kind of gratuitous, the Python CSV library is fine for this use-case)

@tsirkin
Copy link
Author

tsirkin commented May 12, 2022

The truth is, I used snowfakery to generate data for for a Snowflake database, not a Salesforce, that is why I need the column to be a real boolean.
(I just liked the ability to have relations and a plethora of fakers in the tool).
I do have a schema file written separately, so I just generate the sql with Snowfakery and filter out the CREATE TABLE statement.
The problem is, that the booleans are written as 'False' and 'True' into the faked data while I need them as False and True (no quotes). So I am curious if this is possible.

Thank you!

@prescod
Copy link
Contributor

prescod commented May 12, 2022

I am interested in making Snowfakery useful for Snowflake, but are booleans really the only issue? What about date/times, references and (if it's relevant) decimal numbers?

You can learn about how to support new Output Formats here:

https://github.com/SFDO-Tooling/Snowfakery/blob/main/examples/YamlOutputStream.py

https://github.com/SFDO-Tooling/Snowfakery/blob/main/snowfakery/output_streams.py

Perhaps we can directly write the format that Snowflake expects. It sounds like it is very close to SQL. But how do object references (for example) work.

If booleans are really the only problem then perhaps we need a way for you to specify a schema to Snowfakery and it would try to coerce values to fit your schema. If the Schema is in SQL format, it shouldn't be a huge difficulty to do that.

Or perhaps we could make it possible for you to populate a pre-existing Snowflake database and let the adapter coerce values to match your DB.

@tsirkin
Copy link
Author

tsirkin commented May 15, 2022

Looking at the SqlTextOutputStream, it seems to me that by just using the right data type instead of Unicode(255) for the fields, will do most of the trick.
I mean here :

columns = [Column(field_name, Unicode(255)) for field_name in table.fields]

If the generated insert statements will be of the right data types, I will be able to successfully use it for db population.
WDYT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants