Document u8s that represent ASCII characters in config fields as characters in the docs
#18669
Comments
Is there a good way of representing |
@jszwedko any thoughts on how to best solve this? I am by no means a Rust or Vector expert, but I am happy to put in the work given some thought-through guidance :) |
I discovered that the delimiter is actually already marked as an "ASCII character" in the code: vector/lib/codecs/src/encoding/format/csv.rs Lines 77 to 83 in 056c2df
However this is not correctly translated in the cue definition: vector/website/cue/reference/components/sinks/base/socket.cue Lines 153 to 156 in 056c2df
It should be using the vector/website/cue/reference.cue Line 500 in 056c2df
So I think the issue might lie in the translation from the configurable definition to the cue files (and potentially, past that, to the website rendering of the config options). The translation happens in a couple of places:
This is a nuanced issue 😅 If you are interested in digging into it, those would be some places to start though. Let me know if you have additional questions! |
@jszwedko thanks a lot for those hints. What I found out so far (at the end I will ask for your intuition) is that the generated JSON schema file, which is then fed into the Ruby script, already contains the undesired type. So the faulty behavior happens before that. Regarding the generated JSON schema, I have not yet identified the code doing the conversion, but I found this list of valid types for the JSON schema (I could imagine based on that assumptions are made about vector/lib/vector-config-common/src/schema/json_schema.rs Lines 537 to 550 in c2765f4
Looking through the quite well-written documentation, I found this piece of insight, which basically hints at the type conversion being just inherently complicated...: vector/lib/vector-config-macros/src/ast/field.rs Lines 63 to 95 in c2765f4
Going one step further, I found this hint which explicitly states the vector/lib/vector-config-macros/src/ast/util.rs Lines 243 to 281 in c2765f4
I would be willing to look into this more deeply, but first I would like to check whether I am on the right track. |
@scMarkus 👋🏻 I'm the engineer who originally wrote all of the Overall, while delegated types are sort of related to the problem, there's more of a fundamental underlying issue here: both Essentially, there's no type in the standard library that represents an ASCII character. UTF-8 characters? Sure, The simpler approach is just annotating the This adds metadata that tells the documentation generation script to ignore determining the field type itself, and to instead just use the type we've given it. That gets us generating these fields as being The second part we have to solve, likely, is actually having the documentation render that default value -- e.g. That's something you'll likely need to test, but we can also work through that as part of any PR you open. |
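To make the mismatch concrete, here is a minimal, std-only sketch of the two directions involved: the Rust field stores a u8, while the configuration and the docs would ideally show a single character. Both function names are hypothetical and chosen for illustration; this is not the actual implementation behind vector_core::serde::ascii_char.

```rust
/// Hypothetical helper: render a byte as a one-character string,
/// but only if it falls in the ASCII range (0..=127).
fn ascii_byte_to_string(byte: u8) -> Option<String> {
    if byte.is_ascii() {
        Some(char::from(byte).to_string())
    } else {
        None
    }
}

/// Hypothetical helper: parse a config value such as "," into the
/// byte that the Rust field would store.
fn parse_ascii_char(value: &str) -> Result<u8, String> {
    let mut chars = value.chars();
    match (chars.next(), chars.next()) {
        // Exactly one character, and it is ASCII: safe to narrow to u8.
        (Some(c), None) if c.is_ascii() => Ok(c as u8),
        // Exactly one character, but outside the ASCII range.
        (Some(_), None) => Err(format!("{:?} is not an ASCII character", value)),
        // Empty string or more than one character.
        _ => Err(format!("expected exactly one character, got {:?}", value)),
    }
}
```

Under these assumptions, a default of 44 would be shown as "," and a config value of "," would be stored as 44, which is the representation the issue asks the docs to use.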
Hi @tobz
/// The field delimiter to use when writing CSV.
#[configurable(metadata(docs::type_override = "ascii_char"))]
#[serde(
default = "default_delimiter",
with = "vector_core::serde::ascii_char",
skip_serializing_if = "vector_core::serde::is_default"
)]
pub delimiter: u8,
and looked at the intermediate JSON output, which changes from
"delimiter": {
"description": "The field delimiter to use when writing CSV.",
"default": 44,
"type": "integer",
"maximum": 255.0,
"minimum": 0.0,
"_metadata": {
"docs::numeric_type": "uint",
"docs::human_name": "Delimiter"
}
},
to
"delimiter": {
"description": "The field delimiter to use when writing CSV.",
"default": 44,
"type": "integer",
"maximum": 255.0,
"minimum": 0.0,
"_metadata": {
"docs::numeric_type": "uint",
"docs::type_override": "ascii_char",
"docs::human_name": "Delimiter"
}
},
which makes the sinks cue file change but not the website/cue/reference/remap/functions/parse_csv.cue file
diff --git a/website/cue/reference/components/sinks/base/aws_s3.cue b/website/cue/reference/components/sinks/base/aws_s3.cue
index b8cb9be0d..769c2d4d7 100644
--- a/website/cue/reference/components/sinks/base/aws_s3.cue
+++ b/website/cue/reference/components/sinks/base/aws_s3.cue
@@ -421,7 +421,7 @@ base: components: sinks: aws_s3: configuration: {
delimiter: {
description: "The field delimiter to use when writing CSV."
required: false
- type: uint: default: 44
+ type: ascii_char: {}
}
double_quote: {
description: """
I do agree to opening and working on a PR. EDIT: |
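The change in the generated cue file can be read as a lookup that prefers the override when it is present. The sketch below is only an illustration of that precedence, written in Rust for consistency with the rest of the thread; the real generator is a separate script, and the function name resolve_docs_type and the fallback order are assumptions inferred from the before/after output shown above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the docs generator's type lookup.
/// Assumed precedence: an explicit `docs::type_override` wins, then
/// `docs::numeric_type`, then the raw JSON schema `type`.
fn resolve_docs_type<'a>(schema_type: &'a str, metadata: &HashMap<&str, &'a str>) -> &'a str {
    metadata
        .get("docs::type_override")
        .copied()
        .or_else(|| metadata.get("docs::numeric_type").copied())
        .unwrap_or(schema_type)
}
```

With only docs::numeric_type set to uint, the lookup yields uint, matching the old cue output; adding docs::type_override set to ascii_char yields ascii_char, matching the new one.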
Ah, yeah. The documentation for remap functions is not generated automatically through all of the |
@scMarkus thanks for noticing! I'll close this. |
For example, #18320 introduced a few new fields that take u8s that are supposed to be characters. These end up being documented as unsigned integers like so:
Even though in configuration they would be specified as character literals (i.e. "," rather than 44).