-
Notifications
You must be signed in to change notification settings - Fork 14.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SIP-33] Proposal for Removing SQLite Support for Metadata Databases #8874
Comments
Issue-Label Bot is automatically applying the label Links: app homepage, dashboard and code for this bot. |
This makes sense to me. If not supporting sqlite makes support and development easier, then I'm all for it. The only comment I might have is ensuring that the Docker workflow is mature enough to move forward with the deprecation. I haven't used it yet personally, maybe @craig-rueda can comment on the current state of using Docker for dev work and how easy/hard it is? |
I think Docker is close, having a few more tweaks I'd like to get in before calling it 100%. People have been asking about things like bringing your own requirements, etc. I would also caution that the tests themselves need to be updated a bit in order to lessen, or remove completely, their reliance on the Superset examples. The reason why I bring this up here is that speed and ease of running tests is paramount, and forcing a "real" DB into the mix will definitely affect performance over the current SQLite setup. As we all have skin in this game, our end goal is to make things as streamlined and approachable as possible. Anything that gets in the way or adds too much complexity should be carefully considered. Having said that, I think this will end up being a collaborative effort and will likely take some time to get "right". |
Thanks both for the comments. I think we could move ahead with deprecating SQLite, but gate the removal on the Docker workflow being mature enough for our needs. It would also be reasonable to open a PR moving all the tests over to Postgres to get an understanding of how that will affect CI run time. I'm happy to do that in the near future and report back here. |
I ran a quick test (draft PR here: #8889) and the timings for the tests which were leveraging SQLite seem nearly identical on Postgres: |
This is an unactionable statement.People who have hundreds of users relying on the current metastore have a single vague sentence to migrate their work? No link to a 3rd party tool? No information on what to do with it? the sentence does not paint a complete pictureThere will have to be changes made in Superset as well, presumably to the I cannot upvote this in good conscienceUntil the vague, imprecise, unactionable statement This can be accomplished like so has like so replaced with a link to a easy-to-follow step-by-step set of instructions about how to make the transition to Postgres, I don't feel comfortable with this transition. Why is said transition to be accomplished by third party tools?If every user in superset is going to have to do this, then why not have a console script that ships with Superset and has been thoroughly tested? |
I assume the assumption behind this SIP is that anyone running Superset in production is already using Postgres or MySQL as a backend database. If this is not the case, i.e. there is in fact widespread sqlite production usage with Superset, and they need help moving the data over, I can probably put together a script for performing the migration. FWIW I think encouraging people to use a proper OLTP backend is the right direction, especially as Superset is still in v. 0.x, and therefore I vote +1 for the deprecation of sqlite. |
Hi @metaperl, thanks for the feedback. A couple of points: @villebro is correct, SQLite is not a production-ready datastore, so this should only affect development setups for the vast majority of organizations. Migrating a development setup to our dockerized system should not require bringing development data along for the ride. I don't foresee many (if any) having to transition metadata databases as a result of this change. You'll notice that there is a deprecation period for the SQLite support - it will not be yanked immediately. If we hear feedback during that time that organizations will need support migrating the data to a different datastore it will be a good idea to provide a migration path to the community. I appreciate your enthusiasm. One minor process point is important here though - only Apache Software Foundation committers are able to vote on SIPs. I myself am not yet a committer, so I have no ability to vote yea or nea on this proposal. Does your organization use SQLite in production? If so, I would love to learn more about your deployment so that we can support organizations like yours better! |
+1 this change may cause some pain for some, however, as @villebro mentions, I think it's important to encourage people to use a proper OLTP db in production settings and doing so is worth the pain. I've noticed plenty of instances (in code) that try to work around sqlite's shortcomings and it seems to me the code is suffering in order to support a [slightly] less painful development workflow. I'm in favor of deprecating, especially as we gear up for 1.0 release |
Hi @willbarrett , I am pretty new to superset so my comment may not be accurate, but regarding the batch migrations:
This should not have an actual impact on the migration process. The method As for SQLite for production, I am neither support nor against it, as it does give you an easy deployment with small users' size. However, as we are living in a world of containerization, deployment is not an issue either. |
Thanks for the comments @xinbinhuang - I wasn't aware of that behavior on I don't see it mentioned here in the history, so I should mention that this SIP has gone through the Apache voting process and has been approved. I'll be working on adding deprecation notices in the next week or so. |
The above thread says:
Current Apache superset documentation still has SQLite as the default metadata database. But in reality, Apache superset already doesn't work with SQLite. We can't get started with Apache Superset + SQLite as metadata store today (2021-11-25). Using Apache superset version 1.3.2. I have raised a new issue in this regard. Please comment there if you are also facing this issue. (superset-experiment) ➜ superset-experiment git:(main) ✗ PYTHONPATH=.:$PYTHONPATH superset db upgrade
Loaded your LOCAL configuration at [/Users/sairam/superset-experiment/superset_config.py]
logging was configured successfully
2021-11-25 07:28:07,681:INFO:superset.utils.logging_configurator:logging was configured successfully
2021-11-25 07:28:07,695:INFO:root:Configured event logger of type <class 'superset.utils.log.DBEventLogger'>
/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/flask_caching/__init__.py:201: UserWarning: Flask-Caching: CACHE_TYPE is set to null, caching is effectively disabled.
warnings.warn(
No PIL installation found
2021-11-25 07:28:07,915:INFO:superset.utils.screenshots:No PIL installation found
WARNI [alembic.env] SQLite Database support for metadata databases will be removed in a future version of Superset.
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
INFO [alembic.runtime.migration] Will assume transactional DDL.
INFO [alembic.runtime.migration] Running upgrade 12d55656cbca -> 2591d77e9831, user_id
Traceback (most recent call last):
File "/Users/sairam/superset-experiment/.venv/bin/superset", line 8, in <module>
sys.exit(superset())
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/flask/cli.py", line 586, in main
return super(FlaskGroup, self).main(*args, **kwargs)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/click/decorators.py", line 21, in new_func
return f(get_current_context(), *args, **kwargs)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/flask/cli.py", line 426, in decorator
return __ctx.invoke(f, *args, **kwargs)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/flask_migrate/cli.py", line 149, in upgrade
_upgrade(directory, revision, sql, tag, x_arg)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/flask_migrate/__init__.py", line 98, in wrapped
f(*args, **kwargs)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/flask_migrate/__init__.py", line 185, in upgrade
command.upgrade(config, revision, sql=sql, tag=tag)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/alembic/command.py", line 320, in upgrade
script.run_env()
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/alembic/script/base.py", line 563, in run_env
util.load_python_file(self.dir, "env.py")
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/alembic/util/pyfiles.py", line 92, in load_python_file
module = load_module_py(module_id, path)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/alembic/util/pyfiles.py", line 108, in load_module_py
spec.loader.exec_module(module) # type: ignore
File "<frozen importlib._bootstrap_external>", line 783, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/superset/migrations/env.py", line 124, in <module>
run_migrations_online()
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/superset/migrations/env.py", line 116, in run_migrations_online
context.run_migrations()
File "<string>", line 8, in run_migrations
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/alembic/runtime/environment.py", line 851, in run_migrations
self.get_context().run_migrations(**kw)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/alembic/runtime/migration.py", line 620, in run_migrations
step.migration_fn(**kw)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/superset/migrations/versions/2591d77e9831_user_id.py", line 36, in upgrade
batch_op.create_foreign_key("user_id", "ab_user", ["user_id"], ["id"])
File "/Users/sairam/.pyenv/versions/3.8.1/lib/python3.8/contextlib.py", line 120, in __exit__
next(self.gen)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/alembic/operations/base.py", line 374, in batch_alter_table
impl.flush()
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/alembic/operations/batch.py", line 118, in flush
existing_table = Table(
File "<string>", line 2, in __new__
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/util/deprecations.py", line 139, in warned
return fn(*args, **kwargs)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 563, in __new__
metadata._remove_table(name, schema)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
compat.raise_(
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 558, in __new__
table._init(name, metadata, *args, **kw)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 647, in _init
self._autoload(
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 670, in _autoload
autoload_with.run_callable(
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1653, in run_callable
return callable_(self, *args, **kwargs)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 484, in reflecttable
return insp.reflecttable(
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/engine/reflection.py", line 684, in reflecttable
self._reflect_fk(
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/engine/reflection.py", line 868, in _reflect_fk
sa_schema.Table(
File "<string>", line 2, in __new__
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/util/deprecations.py", line 139, in warned
return fn(*args, **kwargs)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 563, in __new__
metadata._remove_table(name, schema)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
compat.raise_(
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 558, in __new__
table._init(name, metadata, *args, **kw)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 647, in _init
self._autoload(
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 670, in _autoload
autoload_with.run_callable(
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1653, in run_callable
return callable_(self, *args, **kwargs)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 484, in reflecttable
return insp.reflecttable(
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/engine/reflection.py", line 684, in reflecttable
self._reflect_fk(
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/engine/reflection.py", line 868, in _reflect_fk
sa_schema.Table(
File "<string>", line 2, in __new__
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/util/deprecations.py", line 139, in warned
return fn(*args, **kwargs)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 563, in __new__
metadata._remove_table(name, schema)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
compat.raise_(
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 558, in __new__
table._init(name, metadata, *args, **kw)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 647, in _init
self._autoload(
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 670, in _autoload
autoload_with.run_callable(
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1653, in run_callable
return callable_(self, *args, **kwargs)
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 484, in reflecttable
return insp.reflecttable(
File "/Users/sairam/superset-experiment/.venv/lib/python3.8/site-packages/sqlalchemy/engine/reflection.py", line 678, in reflecttable
raise exc.NoSuchTableError(table.name)
sqlalchemy.exc.NoSuchTableError: ab_user Most amazing feature about Apache Superset is that it was so good to get started quickly. SQLite as metadata store was helping on this. In production, if someone uses SQLite, I believe that is their responsibility. For small, simple application - SQLite was super useful as a Superset Metadata store. Many applications start small and grow big. I use Superset with SQLite as metadata database in all my data science personal projects, experimental applications etc till the application proves it's existence. As @xinbinhuang pointed out, Alembic supports SQLite & non SQLite flows. There is a small additional step that we need to do while writing database migration scripts. I tried finding alternative to SQLite. In Python + SQLAlchemy world, there is no good alternative to SQLite, as far as I know. H2 & HyperSQL are java based and provide only jdbc drivers. SQLAlchemy python+jdbc driver dialects are not mature enough even in November 2021. |
@betodealmeida, @dpgaspar, @michael-s-molina, @villebro et al. I see there's logic to also prevent SQLite being used as an analytics (as opposed to metadata) database and was wondering whether there was merit in extending this SIP (or authoring another SIP) to remove SQLite globally—including as a Celery backend. |
I honestly don't understand what we gain from removing support for SQLite. |
I'm going to go back to some extent on what I've said previously, but I have to agree with @betodealmeida here. While I agree that vanilla SQLite is not a good solution for a production Superset deployment, things have changed somewhat since the original SIP discussion. There are nowadays many highly available SQLite variants out there, and being able to run Superset on SQLite makes it much more approachable for quick testing than if we were to require using Postgres or MySQL. Also, developing on docker compose isn't always as convenient as running a full Flask backend locally. I, for one, still prefer running my devenv without docker compose, and I believe so do many other committers. Also, the fact that our ephemeral env config s using SQLite says a good deal about the convenience of using SQLite over Postgres/MySQL. I think having SQLite backend support behind a config flag that's disabled by default is IMO a good solution, as it will serve as a good warning for anyone considering using it for production use. So I vote for keeping things as they are. FYI @john-bodley |
@villebro @john-bodley If we are going to preserve SQLite support then I think we should resolve the following error that frequently happens when testing RCs: |
We should probably also submit a lazy consensus email to discard this SIP if the general opinion changed. @rusackas for awareness |
@michael-s-molina @rusackas maybe we could discuss this in the town hall tomorrow? |
@michael-s-molina the "error" you reported is related to using SQLite as an analytical database as opposed to Superset's metadata database. |
There's a 4.0 proposal related to this SIP that will be submitted for lazy consensus. If we think that we shouldn't drop support for SQLite, let's just downvote the proposal during that phase and we can officially discard this SIP. |
This SIP was considered obsolete by the 4.0 lazy consensus email and it's now officially discarded. |
[SIP] Proposal for Removing SQLite Support for Metadata Databases
Motivation
Historically, SQLite has been used as a starting point for development for new contributors to Superset. While the need for an easy onramp still exists, I believe that it is fully covered by the new Docker compose file contributed by Craig Rueda. Deprecating SQLite will encourage
docker-compose
as a primary starting point for new contributors to the project. SQLite support causes multiple pain points for developers, including bad migration practices with unnecessary batch of migrations and database schema drift due to SQLite’s inability to alter constraints on existing tables. Issues contributed to Github referenced below which indicate that SQLite is in some instances being used in a production-like environment. This is a strong anti-pattern that we would like to avoid that makes supporting Superset in the community more difficult.Proposed Change
Immediate:
In 2 minor versions:
New or Changed Public Interfaces
Support for SQLite as a metadata database will be officially removed. SQLite may still be usable, but it will not be supported in future Alembic migrations.
Migration Plan and Compatibility
Users who leverage SQLite in production will be required to migrate their data to a different supported metadata database, either MySQL or Postgres. This can be accomplished via third-party tools.
Rejected Alternatives
The primary other option is not deprecating SQLite and not removing it from the build matrix. This is really a binary decision.
The text was updated successfully, but these errors were encountered: