Fix escaping filter_key in prometheus output #260
* Fixes vozlt#142
* Allows 2- to 4-byte characters to be escaped
I'll trust you and leave it to you. The ngx_http_vhost_traffic_status_escape_prometheus() function has been merged from this PR.
@jongiddy Hi, I heard you implemented this function. Could you please review this patch? 🙏
The Prometheus exposition format is UTF-8 encoded, so valid UTF-8 characters do not need to be encoded further.
Percent-encoding valid UTF-8 characters would be surprising. Why did you choose this encoding? The format does not mention this form of escape.
There are 2 problems here:
- `\xff` is an invalid UTF-8 character, but `ngx_utf8_decode` does not detect this.
- Even if it did detect that the character was invalid, and it tried to use the `\xff` encoding that it uses for `dec > 0x10ffff`, that encoding is not supported.
I would suggest:
- A fix to `ngx_utf8_decode` to detect more invalid UTF-8 characters, including `\xff`. This allows valid UTF-8 to continue to be sent literally.
- Changing the handling of invalid UTF-8 to do something other than `\x..`. Maybe percent-encoding is the best we can do.
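To make the two suggestions concrete, here is a minimal sketch of one way they could fit together: valid UTF-8 is copied through literally and only bytes that do not begin a well-formed sequence are percent-encoded. The names `utf8_seq_len` and `escape_invalid_utf8` are hypothetical and are not part of nginx or this module. The sketch also deliberately leaves out the Prometheus label escapes (backslash, double quote, newline) and does not escape a literal `%`, so it is an illustration of the idea rather than a drop-in replacement.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not nginx's ngx_utf8_decode()): return the length
 * (1-4) of the well-formed UTF-8 sequence starting at p, or 0 if the bytes
 * at p do not begin one.
 */
static size_t
utf8_seq_len(const unsigned char *p, size_t n)
{
    size_t         len, i;
    unsigned long  cp, min;

    if (n == 0) {
        return 0;
    }

    if (p[0] < 0x80) {
        return 1;                                   /* ASCII */
    } else if ((p[0] & 0xe0) == 0xc0) {
        len = 2; cp = p[0] & 0x1f; min = 0x80;
    } else if ((p[0] & 0xf0) == 0xe0) {
        len = 3; cp = p[0] & 0x0f; min = 0x800;
    } else if ((p[0] & 0xf8) == 0xf0) {
        len = 4; cp = p[0] & 0x07; min = 0x10000;
    } else {
        return 0;       /* stray continuation byte, or 0xf8-0xff */
    }

    if (n < len) {
        return 0;                                   /* truncated sequence */
    }

    for (i = 1; i < len; i++) {
        if ((p[i] & 0xc0) != 0x80) {
            return 0;                               /* not a continuation */
        }
        cp = (cp << 6) | (p[i] & 0x3f);
    }

    /* reject overlong forms, values above U+10FFFF, and surrogates */
    if (cp < min || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff)) {
        return 0;
    }

    return len;
}

/*
 * Copy src to dst, passing valid UTF-8 through unchanged and writing each
 * invalid byte as %XX. dst must have room for 3 * n + 1 bytes.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
static size_t
escape_invalid_utf8(char *dst, const unsigned char *src, size_t n)
{
    static const char  hex[] = "0123456789ABCDEF";
    size_t             i = 0, o = 0, len, k;

    while (i < n) {
        len = utf8_seq_len(src + i, n - i);

        if (len == 0) {
            dst[o++] = '%';
            dst[o++] = hex[src[i] >> 4];
            dst[o++] = hex[src[i] & 0x0f];
            i++;
            continue;
        }

        for (k = 0; k < len; k++) {
            dst[o++] = (char) src[i++];
        }
    }

    dst[o] = '\0';
    return o;
}
```

With this sketch, the three input bytes `a`, `\xff`, `b` come out as `a%FFb`, while a valid two-byte sequence such as `\xc3\xa9` is copied unchanged.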
I agree with your suggestion. I will send the patch to the nginx-devel team right away. After the patch is merged, we will need to prepare branches for the different versions sometime soon to maintain backward compatibility.
Great. Thanks for creating the nginx patch. Also, I retract problem 2 in my comment above. My initial reading was that it would replace an invalid character with Taking a second look, it sends two backslashes ( Maybe
* This is necessary until the patch below is officially released.
* nginx/nginx@2c5fccd
@@ -187,7 +187,7 @@ ngx_http_vhost_traffic_status_escape_prometheus(ngx_pool_t *pool, ngx_str_t *buf
 }
 } else {
 char_end = pa;
- if (ngx_utf8_decode(&char_end, last - pa) > 0x10ffff) {
+ if (*char_end > 0xf8 || ngx_utf8_decode(&char_end, last - pa) > 0x10ffff) {
- if (*char_end > 0xf8 || ngx_utf8_decode(&char_end, last - pa) > 0x10ffff) {
+ if (*pa >= 0xf8 || ngx_utf8_decode(&char_end, last - pa) > 0x10ffff) {
This is a good approach, but `0xf8` is invalid too, so you need to use `>=`.
I also suggest using `*pa`, as this is the pointer to the current character. `char_end` happens to have the same value, but it has a different meaning associated with its use in the `ngx_utf8_decode` function.
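The boundary matters because every byte of the form `11111xxx` (`0xf8` through `0xff`) is outside the lead-byte patterns used for 1- to 4-byte UTF-8 sequences. A small sketch of that classification, using a hypothetical helper `utf8_lead_len` that is not part of nginx, and assuming the byte is read through an unsigned type so that the comparison against `0xf8` behaves as intended:

```c
/*
 * Hypothetical helper: map a lead byte to the sequence length its bit
 * pattern announces, or 0 if the byte cannot start a UTF-8 sequence.
 * This checks the bit pattern only; overlong forms and code points above
 * U+10FFFF still need a check on the decoded value.
 */
static int
utf8_lead_len(unsigned char c)
{
    if (c < 0x80) {
        return 1;       /* 0xxxxxxx: ASCII */
    }

    if (c < 0xc0) {
        return 0;       /* 10xxxxxx: continuation byte, not a lead */
    }

    if (c < 0xe0) {
        return 2;       /* 110xxxxx */
    }

    if (c < 0xf0) {
        return 3;       /* 1110xxxx */
    }

    if (c < 0xf8) {
        return 4;       /* 11110xxx */
    }

    return 0;           /* 11111xxx: 0xf8-0xff, never a valid lead */
}
```

A test written as `c > 0xf8` would let the byte `0xf8` itself through to the decoder, which is the off-by-one that `>=` avoids.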
@jongiddy Thanks! I fixed it.
@@ -237,7 +237,7 @@ ngx_http_vhost_traffic_status_escape_prometheus(ngx_pool_t *pool, ngx_str_t *buf
 }
 } else {
 char_end = pa;
- if (ngx_utf8_decode(&char_end, last - pa) > 0x10ffff) {
+ if (*char_end > 0xf8 || ngx_utf8_decode(&char_end, last - pa) > 0x10ffff) {
Same as above.
@jongiddy Also fixed
For details, please read the comment below:
#142 (comment)