Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(templating): Safer Jinja template processing #11704

Merged
merged 14 commits into from
Nov 17, 2020
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 14 additions & 5 deletions superset/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -301,7 +301,7 @@ def _try_json_readsha( # pylint: disable=unused-argument
# Experimental feature introducing a client (browser) cache
"CLIENT_CACHE": False,
"ENABLE_EXPLORE_JSON_CSRF_PROTECTION": False,
"ENABLE_TEMPLATE_PROCESSING": False,
"ENABLE_TEMPLATE_PROCESSING": True,
"KV_STORE": False,
"PRESTO_EXPAND_DATA": False,
# Exposes API endpoint to compute thumbnails
Expand Down Expand Up @@ -667,16 +667,25 @@ class CeleryConfig: # pylint: disable=too-few-public-methods
# A dictionary of items that gets merged into the Jinja context for
# SQL Lab. The existing context gets updated with this dictionary,
# meaning values for existing keys get overwritten by the content of this
# dictionary.
# dictionary. Exposing functionality through JINJA_CONTEXT_ADDONS has security
# implications as it opens a window for a user to execute untrusted code.
# It's important to make sure that the objects exposed (as well as objects attached
# to those objets) are harmless. We recommend only exposing simple/pure functions that
# return native types.
mistercrunch marked this conversation as resolved.
Show resolved Hide resolved
JINJA_CONTEXT_ADDONS: Dict[str, Callable[..., Any]] = {}

# A dictionary of macro template processors that gets merged into global
# A dictionary of macro template processors (by engine) that gets merged into global
# template processors. The existing template processors get updated with this
# dictionary, which means the existing keys get overwritten by the content of this
# dictionary. The customized addons don't necessarily need to use jinjia templating
# language. This allows you to define custom logic to process macro template.
# dictionary. The customized addons don't necessarily need to use Jinja templating
# language. This allows you to define custom logic to process templates on a per-engine
# basis. Example value = `{"presto": CustomPrestoTemplateProcessor}`
CUSTOM_TEMPLATE_PROCESSORS: Dict[str, Type[BaseTemplateProcessor]] = {}

# Prevent access to classes/objects and proxy methods in the default Jinja context,
# unless explicitly overridden by JINJA_CONTEXT_ADDONS or CUSTOM_TEMPLATE_PROCESSORS.
Copy link
Member

@mistercrunch mistercrunch Nov 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may not want to make JINJA_CONTEXT_ADDONS mutually exclusive with SAFE_JINJA_PROCESSING. Someone may want to add a safe function to their environment without having to fully pivot into the legacy/more risky approach. It should be easy to support this, but we should highlight the caveats.

"""
Exposing functionality through JINJA_CONTEXT_ADDONS has security implications as it opens a window for a user to execute untrusted code. It's important to make sure that you make sure that the objects exposed (as well as objects attached to those objets) are harmless. We recommend only exposing simple/pure functions that return native types.
"""

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was unintentional. Fixing to allow JINJA_CONTEXT_ADDONS to coexist with SAFE_JINJA_PROCESSING

SAFE_JINJA_PROCESSING: bool = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to introduce a TEMPLATE_PROCESSOR parameter that accepts TemplateProcessorEnum values, something like

TemplateProcessorEnum(enum.Enum):
    SafeJinja = 1
    LegacyJinja = 2
    Chevron = 3
    Custom = 4

In this approach, LegacyJinja would include the old datetime, random etc base context, and SafeJinja would have a more limited set.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we want to support so many different modes. To me it's more important to find a "paved path" of safe and flexible templating solution that makes the most sense. Every feature flag we added here is more like a temporary solution for compatibility rather than something we want to support in the long-term.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ktmud I agree, I think we should push safety and (potentially unsafe) customizability as a path forward.


# Roles that are controlled by the API / Superset and should not be changes
# by humans.
ROBOT_PERMISSION_ROLES = ["Public", "Gamma", "Alpha", "Admin", "sql_lab"]
Expand Down
24 changes: 13 additions & 11 deletions superset/extensions.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,20 +42,22 @@

class JinjaContextManager:
def __init__(self) -> None:
self._base_context = {
"datetime": datetime,
"random": random,
"relativedelta": relativedelta,
"time": time,
"timedelta": timedelta,
"uuid1": uuid.uuid1,
"uuid3": uuid.uuid3,
"uuid4": uuid.uuid4,
"uuid5": uuid.uuid5,
}
self._base_context: Dict[str, Any] = {}
self._template_processors: Dict[str, Type["BaseTemplateProcessor"]] = {}

def init_app(self, app: Flask) -> None:
if not app.config["SAFE_JINJA_PROCESSING"]:
self._base_context = {
"datetime": datetime,
"random": random,
"relativedelta": relativedelta,
"time": time,
"timedelta": timedelta,
"uuid1": uuid.uuid1,
"uuid3": uuid.uuid3,
"uuid4": uuid.uuid4,
"uuid5": uuid.uuid5,
}
self._base_context.update(app.config["JINJA_CONTEXT_ADDONS"])
self._template_processors.update(app.config["CUSTOM_TEMPLATE_PROCESSORS"])

Expand Down
103 changes: 79 additions & 24 deletions superset/jinja_context.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,12 +17,15 @@
"""Defines the templating context for SQL Lab"""
import inspect
import re
from typing import Any, cast, List, Optional, Tuple, TYPE_CHECKING
from functools import partial
from typing import Any, Callable, cast, Dict, List, Optional, Tuple, TYPE_CHECKING

from flask import g, request
from jinja2.sandbox import SandboxedEnvironment
from flask import current_app, g, request
from flask_babel import gettext as _
from jinja2.sandbox import ImmutableSandboxedEnvironment, SandboxedEnvironment

from superset import jinja_base_context
from superset.exceptions import SupersetTemplateException
from superset.extensions import feature_flag_manager, jinja_context_manager
from superset.utils.core import convert_legacy_filters_into_adhoc, merge_extra_filters

Expand All @@ -31,6 +34,20 @@
from superset.models.core import Database
from superset.models.sql_lab import Query

NONE_TYPE = type(None).__name__
ALLOWED_RETURN_TYPES = [
NONE_TYPE,
"bool",
"str",
"unicode",
"int",
"long",
"float",
"list",
"dict",
"tuple",
]


def filter_values(column: str, default: Optional[str] = None) -> List[str]:
""" Gets a values for a particular filter as a list
Expand Down Expand Up @@ -151,7 +168,7 @@ def cache_key_wrapper(self, key: Any) -> Any:

def url_param(
self, param: str, default: Optional[str] = None, add_to_cache_keys: bool = True
) -> Optional[Any]:
) -> Optional[str]:
"""
Read a url or post parameter and use it in your SQL Lab query.

Expand Down Expand Up @@ -186,6 +203,24 @@ def url_param(
return result


def safe_proxy(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
mistercrunch marked this conversation as resolved.
Show resolved Hide resolved
return_value = func(*args, **kwargs)
if not current_app.config["SAFE_JINJA_PROCESSING"]:
return return_value

value_type = type(return_value).__name__
if value_type not in ALLOWED_RETURN_TYPES:
raise SupersetTemplateException(
_(
"Unsafe return type for function %(func)s: %(value_type)s",
func=func.__name__,
value_type=value_type,
)
)

return return_value


class BaseTemplateProcessor: # pylint: disable=too-few-public-methods
"""Base class for database-specific jinja context

Expand Down Expand Up @@ -218,22 +253,18 @@ def __init__(
self._schema = query.schema
elif table:
self._schema = table.schema
self._extra_cache_keys = extra_cache_keys
self._context: Dict[str, Any] = {}
self._env = (
ImmutableSandboxedEnvironment()
if current_app.config["SAFE_JINJA_PROCESSING"]
else SandboxedEnvironment()
)
self.set_context(**kwargs)

extra_cache = ExtraCache(extra_cache_keys)

self._context = {
"url_param": extra_cache.url_param,
"current_user_id": extra_cache.current_user_id,
"current_username": extra_cache.current_username,
"cache_key_wrapper": extra_cache.cache_key_wrapper,
"filter_values": filter_values,
"form_data": {},
}
def set_context(self, **kwargs: Any) -> None:
self._context.update(kwargs)
self._context.update(jinja_base_context)
if self.engine:
self._context[self.engine] = self
self._env = SandboxedEnvironment()

def process_template(self, sql: str, **kwargs: Any) -> str:
"""Processes a sql template
Expand All @@ -247,6 +278,21 @@ def process_template(self, sql: str, **kwargs: Any) -> str:
return template.render(kwargs)


class JinjaTemplateProcessor(BaseTemplateProcessor):
def set_context(self, **kwargs: Any) -> None:
super().set_context(**kwargs)
extra_cache = ExtraCache(self._extra_cache_keys)
self._context.update(
{
"url_param": partial(safe_proxy, extra_cache.url_param),
"current_user_id": partial(safe_proxy, extra_cache.current_user_id),
"current_username": partial(safe_proxy, extra_cache.current_username),
"cache_key_wrapper": partial(safe_proxy, extra_cache.cache_key_wrapper),
"filter_values": partial(safe_proxy, filter_values),
}
)


class NoOpTemplateProcessor(
BaseTemplateProcessor
): # pylint: disable=too-few-public-methods
Expand All @@ -257,7 +303,7 @@ def process_template(self, sql: str, **kwargs: Any) -> str:
return sql


class PrestoTemplateProcessor(BaseTemplateProcessor):
class PrestoTemplateProcessor(JinjaTemplateProcessor):
"""Presto Jinja context

The methods described here are namespaced under ``presto`` in the
Expand All @@ -266,6 +312,15 @@ class PrestoTemplateProcessor(BaseTemplateProcessor):

engine = "presto"

def set_context(self, **kwargs: Any) -> None:
super().set_context(**kwargs)
self._context[self.engine] = {
mistercrunch marked this conversation as resolved.
Show resolved Hide resolved
"first_latest_partition": self.first_latest_partition,
"latest_partitions": self.latest_partitions,
"latest_sub_partition": self.latest_sub_partition,
"latest_partition": self.latest_partition,
}

@staticmethod
def _schema_table(
table_name: str, schema: Optional[str]
Expand Down Expand Up @@ -321,11 +376,11 @@ class HiveTemplateProcessor(PrestoTemplateProcessor):

# The global template processors from Jinja context manager.
template_processors = jinja_context_manager.template_processors
keys = tuple(globals().keys())
for k in keys:
o = globals()[k]
if o and inspect.isclass(o) and issubclass(o, BaseTemplateProcessor):
template_processors[o.engine] = o
default_processors = {"presto": PrestoTemplateProcessor, "hive": HiveTemplateProcessor}
for engine in default_processors:
# do not overwrite engine-specific CUSTOM_TEMPLATE_PROCESSORS
if not engine in template_processors:
template_processors[engine] = default_processors[engine]


def get_template_processor(
Expand All @@ -336,7 +391,7 @@ def get_template_processor(
) -> BaseTemplateProcessor:
if feature_flag_manager.is_feature_enabled("ENABLE_TEMPLATE_PROCESSING"):
template_processor = template_processors.get(
database.backend, BaseTemplateProcessor
database.backend, JinjaTemplateProcessor
)
else:
template_processor = NoOpTemplateProcessor
Expand Down