feat(templating): Safer Jinja template processing #11704
Codecov Report
@@ Coverage Diff @@
## master #11704 +/- ##
==========================================
- Coverage 63.04% 56.00% -7.05%
==========================================
Files 895 408 -487
Lines 43315 14437 -28878
Branches 4015 3716 -299
==========================================
- Hits 27308 8085 -19223
+ Misses 15829 6352 -9477
+ Partials 178 0 -178
superset/config.py
Outdated
@@ -677,6 +677,10 @@ class CeleryConfig: # pylint: disable=too-few-public-methods | |||
# language. This allows you to define custom logic to process macro template. | |||
CUSTOM_TEMPLATE_PROCESSORS: Dict[str, Type[BaseTemplateProcessor]] = {} | |||
|
|||
# Prevent access to classes/objects and proxy methods in the default Jinja context, | |||
# unless explicitly overridden by JINJA_CONTEXT_ADDONS or CUSTOM_TEMPLATE_PROCESSORS. |
We may not want to make `JINJA_CONTEXT_ADDONS` mutually exclusive with `SAFE_JINJA_PROCESSING`. Someone may want to add a safe function to their environment without having to fully pivot into the legacy/more risky approach. It should be easy to support this, but we should highlight the caveats.
"""
Exposing functionality through `JINJA_CONTEXT_ADDONS` has security implications, as it opens a window for a user to execute untrusted code. It's important to make sure that the objects exposed (as well as objects attached to those objects) are harmless. We recommend only exposing simple/pure functions that return native types.
"""
This was unintentional. Fixing to allow `JINJA_CONTEXT_ADDONS` to coexist with `SAFE_JINJA_PROCESSING`.
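To make the caveat above concrete, here is a minimal sketch of what a "simple/pure function that returns native types" might look like in a `superset_config.py`. The helper `quarter_of` is hypothetical and invented for illustration; only the `JINJA_CONTEXT_ADDONS` setting name comes from this thread.

```python
from typing import Any, Callable, Dict


def quarter_of(month: int) -> int:
    """Pure helper (hypothetical): map a month number (1-12) to its quarter (1-4)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    return (month - 1) // 3 + 1


# Expose only the function itself, not a module or an object whose attributes
# a template author could traverse.
JINJA_CONTEXT_ADDONS: Dict[str, Callable[..., Any]] = {
    "quarter_of": quarter_of,
}
```

A function like this takes and returns plain integers, so there is nothing attached to the result for a template to reach into.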
superset/jinja_context.py
Outdated
def set_context(self, **kwargs: Any) -> None: | ||
extra_cache = ExtraCache(self._extra_cache_keys) | ||
self._context = { | ||
"url_param": partial(safe_proxy, extra_cache.url_param), |
Wasn't familiar with `partial`, but I'm guessing that the intent is to clean up other methods/context that could be attached to the callable (?)
The intent with `partial` is to wrap the callable with a method to enforce a safe return value.
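For readers who, like the reviewer, have not used `functools.partial`, the pattern can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `TemplateError` stands in for `SupersetTemplateException`, and the allow-list, `url_param` and `leaky` are hypothetical.

```python
from functools import partial
from typing import Any, Callable

# Hypothetical allow-list of type names; the PR keeps its own list.
ALLOWED_TYPES = ("NoneType", "bool", "str", "int", "float")


class TemplateError(Exception):
    """Stand-in for SupersetTemplateException."""


def safe_proxy(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Call func and only hand back the result if it is a plain value."""
    return_value = func(*args, **kwargs)
    value_type = type(return_value).__name__
    if value_type not in ALLOWED_TYPES:
        raise TemplateError("Unsafe template value")
    return return_value


def url_param(name: str) -> str:
    # Hypothetical macro that pretends to read a URL parameter.
    return f"value-of-{name}"


def leaky() -> Any:
    # Returns a module object, which a template could traverse.
    import os

    return os


# partial() pre-binds the first argument, so the template still calls
# url_param("foo") but the call is routed through safe_proxy.
safe_url_param = partial(safe_proxy, url_param)
safe_leaky = partial(safe_proxy, leaky)
```

With this wiring, `safe_url_param("foo")` returns the string unchanged, while `safe_leaky()` raises `TemplateError` because a module is not on the allow-list.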
A good step forward for securing Jinja. Would be curious to hear other community thoughts on this, especially if someone is extensively using `datetime`, `random`, `uuid` etc. from the current base context. Perhaps kick off a DISCUSS on the mailing list?
superset/config.py
Outdated
@@ -677,6 +677,10 @@ class CeleryConfig: # pylint: disable=too-few-public-methods | |||
# language. This allows you to define custom logic to process macro template. | |||
CUSTOM_TEMPLATE_PROCESSORS: Dict[str, Type[BaseTemplateProcessor]] = {} | |||
|
|||
# Prevent access to classes/objects and proxy methods in the default Jinja context, | |||
# unless explicitly overridden by JINJA_CONTEXT_ADDONS or CUSTOM_TEMPLATE_PROCESSORS. | |||
SAFE_JINJA_PROCESSING: bool = True |
Would it make sense to introduce a `TEMPLATE_PROCESSOR` parameter that accepts `TemplateProcessorEnum` values, something like
class TemplateProcessorEnum(enum.Enum):
    SafeJinja = 1
    LegacyJinja = 2
    Chevron = 3
    Custom = 4
In this approach, `LegacyJinja` would include the old `datetime`, `random` etc. base context, and `SafeJinja` would have a more limited set.
I'm not sure we want to support so many different modes. To me it's more important to find a "paved path": a safe and flexible templating solution that makes the most sense. Every feature flag we add here is more of a temporary solution for compatibility than something we want to support in the long term.
@ktmud I agree, I think we should push safety and (potentially unsafe) customizability as a path forward.
superset/jinja_context.py
Outdated
none_type, | ||
"bool", | ||
"str", | ||
"unicode", |
nit: I'm pretty sure `unicode` has been replaced by `str` in Python 3.
superset/jinja_context.py
Outdated
return_value = func(*args, **kwargs) | ||
value_type = type(return_value).__name__ | ||
if value_type not in allowed_types: | ||
raise SupersetTemplateException("Unsafe template value") |
Would be nice to get a more verbose error here, something like
raise SupersetTemplateException(__("Unsafe return type for function %(func)s: %(value_type)s", func=func.__name__, value_type=value_type))
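A dependency-free sketch of the suggested message, using plain `%` formatting instead of the `__()` translation helper so it runs on its own. `TemplateError`, `checked_call`, `returns_object` and the allow-list are hypothetical names used only for illustration.

```python
from typing import Any, Callable

# Hypothetical allow-list of type names.
ALLOWED_TYPES = ("NoneType", "bool", "str", "int", "float")


class TemplateError(Exception):
    """Stand-in for SupersetTemplateException."""


def checked_call(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Call func and name both the function and the offending type on failure."""
    return_value = func(*args, **kwargs)
    value_type = type(return_value).__name__
    if value_type not in ALLOWED_TYPES:
        raise TemplateError(
            "Unsafe return type for function %(func)s: %(value_type)s"
            % {"func": func.__name__, "value_type": value_type}
        )
    return return_value


def returns_object() -> object:
    # Deliberately returns something that is not a plain value.
    return object()
```

Naming the function and the type makes a failing template easier to debug than the generic "Unsafe template value". One thing to keep in mind is that `func.__name__` only exists on the underlying function, not on a `functools.partial` wrapper around it.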
Taking @ktmud 's feedback, making this less configurable and moving towards sane defaults which can be overridden via existing |
UPDATING.md
Outdated
@@ -36,7 +38,7 @@ assists people when migrating to a new version. | |||
and requires more work. You can easily turn on the languages you want | |||
to expose in your environment in superset_config.py | |||
|
|||
* [11172](https://github.com/apache/incubator-superset/pull/11172): Breaking change: SQL templating is turned off be default. To turn it on set `ENABLE_TEMPLATE_PROCESSING` to True on `DEFAULT_FEATURE_FLAGS` | |||
* [11172](https://github.com/apache/incubator-superset/pull/11172): Breaking change: SQL templating is turned off be default. To turn it on set `ENABLE_TEMPLATE_PROCESSING` to True in `FEATURE_FLAGS` |
Maybe we should prevent `DEFAULT_FEATURE_FLAGS` from being overridden by custom config files/modules.
Looks great, but some tests are still failing. I left a couple of non-blocking comments and a question regarding the Presto context.
UPDATING.md
Outdated
- [11172](https://github.com/apache/incubator-superset/pull/11172): Breaking change: SQL templating is turned off be default. To turn it on set `ENABLE_TEMPLATE_PROCESSING` to True on `DEFAULT_FEATURE_FLAGS` | ||
>>>>>>> master |
This seems like a merge conflict. While you're at it, can you fix `turned off be default` to `turned off by default`?
self._context[self.engine] = { | ||
"first_latest_partition": partial(safe_proxy, self.first_latest_partition), | ||
"latest_partitions": partial(safe_proxy, self.latest_partitions), | ||
"latest_sub_partition": partial(safe_proxy, self.latest_sub_partition), |
nit: Not totally related to this PR, but it's strange that `def latest_sub_partition(self, table_name: str, **kwargs: Any) -> Any:` returns `Any`; it should probably be `Optional[str]`.
LGTM, non-blocking comment on naming of function.
LGTM
LGTM with one small question
COLLECTION_TYPES = ("list", "dict", "tuple", "set") | ||
|
||
|
||
@memoized |
What's the benefit of memoization here? Is the `current_app` proxy very expensive?
The value doesn't change over time, so just trying to save some processing cycles. Probably not much gain here, but attempting not to repeat initialization logic.
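The trade-off discussed here can be illustrated with the standard library's `functools.lru_cache`, swapped in for Superset's own `@memoized` decorator so the sketch has no dependencies. `load_addons` and `context_addons` are hypothetical names; the counter exists only to show that the initialization logic runs once.

```python
from functools import lru_cache
from typing import Any, Dict

# Counts how often the (pretend) initialization logic actually runs.
CALLS = {"count": 0}


def load_addons() -> Dict[str, Any]:
    # Hypothetical stand-in for reading addons from the app config.
    CALLS["count"] += 1
    return {"double": lambda x: x * 2}


@lru_cache(maxsize=None)
def context_addons() -> Dict[str, Any]:
    """Build the addon mapping once and reuse it on every later call."""
    return load_addons()
```

Because the cached function takes no arguments, every call after the first returns the same dictionary object. That is only appropriate if, as stated above, the value does not change over the life of the process.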
184c6b4 to f7fbd59
* Enable safer Jinja template processing
* Allow JINJA_CONTEXT_ADDONS with SAFE_JINJA_PROCESSING
* Make template processor initialization less magical, refactor classes
* Consolidate Jinja logic, remove config flag in favor of sane defaults
* Restore previous ENABLE_TEMPLATE_PROCESSING default
* Add recursive type checking, update tests
* remove erroneous config file
* Remove TableColumn models from template context
* pylint refactoring
* Add entry to UPDATING.md
* Resolve botched merge conflict
* Update docs on running single python test
* Refactor template context checking to support engine-specific methods
SUMMARY
Prevent passing potentially unsafe modules/methods to the Jinja template context:
* `datetime` and `timedelta` are removed from the default context
* `current_user_id` and `url_param` are proxied through a method that enforces static return values

There are no changes to Jinja templating syntax with this update, other than that the default modules (such as `datetime`) are no longer available by default.

Adding custom Jinja context items is still possible via `JINJA_CONTEXT_ADDONS`, as is overriding the template processor on a per-database-engine basis via `CUSTOM_TEMPLATE_PROCESSORS`.

Additional context/discussion in #11617
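The commit list mentions recursive type checking. One way such a check on "static return values" might look is sketched below. This is an assumption-laden illustration rather than the merged implementation: `is_safe` and the `ALLOWED_TYPES` tuple are hypothetical, and only the `COLLECTION_TYPES` tuple mirrors a line visible in the diff.

```python
from typing import Any

# Hypothetical allow-list of scalar type names.
ALLOWED_TYPES = ("NoneType", "bool", "str", "int", "float")
# Mirrors the tuple shown in the diff.
COLLECTION_TYPES = ("list", "dict", "tuple", "set")


def is_safe(value: Any) -> bool:
    """Return True if value is a plain scalar or a collection of safe values."""
    type_name = type(value).__name__
    if type_name in ALLOWED_TYPES:
        return True
    if type_name == "dict":
        # Check keys and values, descending into nested collections.
        return all(is_safe(k) and is_safe(v) for k, v in value.items())
    if type_name in COLLECTION_TYPES:
        return all(is_safe(item) for item in value)
    return False
```

Matching on the exact type name means subclasses of `dict` or `list` are rejected, which errs on the side of caution.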
TODO
TEST PLAN
ADDITIONAL INFORMATION
@mistercrunch @ktmud @dpgaspar @villebro