Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Annotation Format #12226

Closed
alexhersh opened this issue Aug 4, 2015 · 41 comments
Closed

Improve Annotation Format #12226

alexhersh opened this issue Aug 4, 2015 · 41 comments
Labels
area/api Indicates an issue on api area. area/extensibility priority/backlog Higher priority than priority/awaiting-more-evidence.

Comments

@alexhersh
Copy link

I am working on implementing network policy rules in Kubernetes annotations and am hitting a limitation in how Kubernetes parses dictionary values in the pod spec.

I would like to put nested lists within annotation values, but annotations are expected to be single "key": "value" string pairs. To get around this I am tokenizing my data as a single string and then marshaling it separately in my code, but the result is unreadable and not intuitive.

For example, I would like to format my pod config like:

"annotations": {
    "rule1": [
        {
            "labels": {
                "name": "pod1"
            }
        },
        {
            "protocol": "tcp"
        }
    ]
}

but my workaround looks like:

"annotations": {
    "rule1": "[{\"labels\": {\"name\": \"pod1\"}},{\"protocol\": \"tcp\"}]"
}

One line solutions are cool and all, but this design is not very easy to program or debug, nor is it suitable for larger annotations. I would much prefer to have more robust annotation support natively in the Kubernetes pod spec.

If you agree that this feature is worth implementing, I can go ahead and try to build it in myself.

@nikhiljindal
Copy link
Contributor

cc @bgrant0607

@bgrant0607 bgrant0607 added the area/api Indicates an issue on api area. label Aug 5, 2015
@bgrant0607
Copy link
Member

This was on my list of features to consider for "v2" APIs: #8190.

If changed in the v1 API we'd need to retain backward compatibility. Do you think that's feasible?

We've discussing "subobject plugins" a number of times. Some standardized schema metadata (apiVersion, kind) would be necessary for interpreting them, much as our top-level API resources.

cc @smarterclayton @thockin

@bgrant0607 bgrant0607 added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Aug 5, 2015
@bgrant0607
Copy link
Member

cc @brendandburns

@bgrant0607
Copy link
Member

cc @lavalamp

@bgrant0607
Copy link
Member

@alexhersh I do think this is worthwhile, if you'd like to give it a try. The main requirement that needs to be satisfied is backward compatibility -- strings need to continue to work.

@bgrant0607 bgrant0607 added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed priority/backlog Higher priority than priority/awaiting-more-evidence. labels Aug 7, 2015
@bgrant0607 bgrant0607 added this to the v1.1 milestone Aug 7, 2015
@bgrant0607 bgrant0607 changed the title Improve Annotation Format on Pod Spec Improve Annotation Format Aug 7, 2015
@bgrant0607
Copy link
Member

This would also make it easier to use annotations to represent experimental fields exposed through new API versions when we serialize the resources using v1 in etcd.

I think we should be able to retain backward compatibility, but might need our own polymorphic type, similar to IntOrString.

cc @caesarxuchao

@thockin
Copy link
Member

thockin commented Aug 7, 2015

This would be a cute trick, but it might wreak havoc on client libraries. I know we can trick Go into doing things, I don't know about other languages...

@bgrant0607
Copy link
Member

cc @abonas @jimmidyson for input on clients in at least one other language

An alternative would be a new extensions field, perhaps parallel to metadata rather than a child of it. That was proposed last December IIRC.

@thockin
Copy link
Member

thockin commented Aug 8, 2015

Whether we use the same field or another, I'd like some input on how
hostile (or not) this is to clients. Trying to imagine doing it in C++,
for example...

On Fri, Aug 7, 2015 at 3:57 PM, Brian Grant notifications@github.com
wrote:

cc @abonas https://github.com/abonas @jimmidyson
https://github.com/jimmidyson for input on clients in at least one
other language

An alternative would be a new extensions field, perhaps parallel to
metadata rather than a child of it. That was proposed last December IIRC.


Reply to this email directly or view it on GitHub
#12226 (comment)
.

@bgrant0607
Copy link
Member

I don't see a problem. It's generic json: maps, lists, strings, etc. A schema isn't necessary to parse it.

@thockin
Copy link
Member

thockin commented Aug 8, 2015

no, no schema required, just non-obvious marshalling/unmarshalling. Or
maybe this is de rigeur and I'm just a whiner.

On Fri, Aug 7, 2015 at 8:58 PM, Brian Grant notifications@github.com
wrote:

I don't see a problem. It's generic json: maps, lists, strings, etc. A
schema isn't necessary to parse it.


Reply to this email directly or view it on GitHub
#12226 (comment)
.

@jimmidyson
Copy link
Member

No big deal for the Java client. We would probably hold it as a string & allow registration of custom (un)marshalers.

Is there any concern on starting to use annotations for larger objects like this? Potentially hitting the resource limit? I could see this functionality having a lot of uses but what kind of impact could this have on storage, etc?

@abonas
Copy link
Contributor

abonas commented Aug 8, 2015

cc @simon3z

@abonas
Copy link
Contributor

abonas commented Aug 8, 2015

@bgrant0607 @thockin thanks for doing this discussion.
It shouldn't be a problem for the Ruby kubeclient client library itself, it's pretty flexible. (I'll reverify this in tests and report back if I find any issue)
However such a change will present certain problems to the projects/products which consume the ruby client - they might have made assumptions regarding the structure of annotations in the relevant database schemas and/or the UI. One place that has those assumptions is https://github.com/manageiq/manageiq
There are several other consumers of the kubeclient I know of, and since most of them are not open source, I'm not able to say how badly it will break their code.
So given that annotations is a released api and as it's mentioned above it has to keep backwards compatibility, the default "out of the box" annotations should, IMO, come with the format as released in 1.0.
Shall someone want to work with something more advanced, it should be either a flag or a plugin that can be added to a specific system, or communicated as an api break in a future release.

@smarterclayton
Copy link
Contributor

We have folks using about 20k per resource for builds to store metadata.
It makes lists slower (enough that we are probably going to look at
selective field retrieval) but we haven't hit storage issues yet (etcd
limit is 1M but big objects will have a cost as they move through the raft
log). Having the schema be native json for a few of them instead of
strings would reduce some of the overhead (5-10%) in those calls.

On Aug 8, 2015, at 4:12 AM, Jimmi Dyson notifications@github.com wrote:

No big deal for the Java client. We would probably hold it as a string &
allow registration of custom (un)marshalers.

Is there any concern on starting to use annotations for larger objects like
this? Potentially hitting the resource limit? I could see this
functionality having a lot of uses but what kind of impact could this have
on storage, etc?


Reply to this email directly or view it on GitHub
#12226 (comment)
.

@bgrant0607
Copy link
Member

We will need size limits at some point, not just for annotations, but also for command lines, env, and anywhere else people could cram extra data. Some sort of resource-size LimitRange and/or ResourceQuota would make sense.

I agree with the backward-compatibility concerns. If we were to do this, it would need to be a new field.

I propose the field be called extensions. I'd make it a sibling of metadata, rather than nested under metadata. I think there are going to be other common top-level fields we will want across all objects, such as dependency-related fields. We could create ObjectCommon as an inline struct, which would include extensions. Conceptually it could include ObjectMeta, also, but that's probably too much churn for too little gain.

@lavalamp
Copy link
Member

Implementation note: the easiest way to do this in go is to just stick variables of type json.RawJSON in the structure; that way, you don't even unmarshal them if you don't need to.

It would be a breaking change to start doing this for annotations, however. Do we want to encourage people to store this much data in their objects?

@bgrant0607 bgrant0607 added this to the v1.2 milestone Aug 29, 2015
@davidopp
Copy link
Member

davidopp commented Jan 8, 2016

Related to #18670

@smarterclayton
Copy link
Contributor

We discussed this today some and while there are some short term advantages, the addition of additional unstructured structure does not dramatically improve the ability of the annotation to fulfill its core purpose - allowing third parties to carry opaque data with arbitrary objects. The json structure of the child object is only truly beneficial in a json serialization - and we already anticipate allowing a formal protobuf representation of all objects. The protobuf representation of these annotations would not be any easier to work with than a map, and would require more complex client logic for handling edge cases (whether numbers are allowed, is their format the same as json or different, etc).

We feel that map[string]string is the simplest construct that accomplishes the goal, and additional structure is only situationally relevant. We should however better document best practices and patterns for working with annotations, planning for versioning, and storing complex data.

@thockin
Copy link
Member

thockin commented Jan 8, 2016

+1 to @smarterclayton - I move close this

@ravigadde
Copy link
Contributor

@smarterclayton
If someone is using a client to create a pod, the limitation with annotations is not a big deal as the client can do the encoding. The problem exists for kubectl users with hand written pod specs, cant expect users to encode by hand. Would prefer to use kubectl for creating pods and not invent a layer above it just for this purpose (and then be burdened with keeping up with kubectl or resorting to ugly hacks).

@smarterclayton
Copy link
Contributor

Is

kubectl annotate 'key=[{rule1:value2},{rule2:value1}]'

"ugly"? How about:

cat file.json | kubectl annotate -f - --dry-run -o json ... | kubectl create -f - 

Kubectl already does JSON encoding for you in this case.

@ravigadde
Copy link
Contributor

Nice trick, better than not having anything. Its still hacky imo to ask the end user to do this on every pod.

  1. Write a pod spec in a json file
  2. Write the additions to pod spec in a separate json file
  3. Use the dry run to generate a json file or to create a pod

PS: Is dry-run available on master?

@ravigadde
Copy link
Contributor

Not sure how the second version works.

cat file.json | kubectl annotate -f - --dry-run -o json ... | kubectl create -f -

-f looks for a file with pod name. It cant have just the annotations. If it has the pod, where do you put the unencoded annotations? If it needs encoded annotations, its back to square one.

It means that long structures still go on the command line which is a big pain.

@ravigadde
Copy link
Contributor

If kubectl create and annotate can take an extensions json file as another argument, json encode and update annotations that might work.

@smarterclayton
Copy link
Contributor

The ... is the example is the encoded json string (well, bash). When
you pipeline you can chain mutate the object.

Ultimately, if you want to apply an extension to everything, and you
want it to be simple, you need to have a more sophisticated scheme and
maybe an admission controller.

Ie support "v1.myextensions/does" with value "somethingsimple". Then
have an admission controller convert "somethingsimple" to something
more structured (as another annotation). If someone needs complex
encoded values... those are already hard in a cli. I don't think it's
unreasonable to want to make the cli experience more streamlined, but
it's probably something that is iterative. The intent is to make
that easy in the cli - but I think the experience around how to build
easy extensions through annotations needs more of an example. We
could support better extension of kubectl to allow transformations
to be implemented in shell or Python that can do transformations, for
anyone in an environment that can deliver files alongside kubectl.

@ravigadde
Copy link
Contributor

I don't think a very sophisticated scheme is needed for what is being asked. There are a lot of plugins/extensions which need some configuration. It would be nice to carry that in the pod spec without making them first class objects.

The alternative is to submit a proposal, rally the community and finally get it in a release. Even for good use cases, this process takes several months. If the use case is not very common, then its not going to be accepted in to k8s. Having a mechanism to carry this configuration will allow the plugins to move at their own pace.

@smarterclayton
Copy link
Contributor

I understand. The recommendation at the current time is to focus on
annotations as they are, identify places where we can enhance the
experience around them, and identify from concrete examples where
future releases of Kube (1.4+) can deliver better options. In the
short term, extensions on certain elements in existing objects will be
considered, as enhancements to annotations do not improve locality
of data to specific elements.

@ravigadde
Copy link
Contributor

An alternate is to define a new field "extensions" as proposed earlier in the thread. As long as the size of this can be limited, I dont see any drawbacks. It will allow standard interfaces/tooling to be used for k8s.

@bgrant0607
Copy link
Member

I think the main issues for annotations as field extensions currently are:

  1. No conversion from yaml to json
  2. Escaping quotes in the nested json

Escaping seems to not be a problem for yaml. Use single quotes:

  annotations:
    scheduler.alpha.kubernetes.io/tolerations: '[{ "key": "dedicated", "operator": "Exists", "effect": "NoScheduleNoAdmitNoExecute" }]'
kubectl convert -f nginx-pod.yaml -o json
        "annotations": {
            "scheduler.alpha.kubernetes.io/tolerations": "[{ \"key\": \"dedicated\", \"operator\": \"Exists\", \"effect\": \"NoScheduleNoAdmitNoExecute\" }]"
        }

@bgrant0607
Copy link
Member

Literal and folded styles also work:

  annotations:
    scheduler.alpha.kubernetes.io/tolerations: >
       [
        { "key": "dedicated",
          "operator": "Exists",
          "effect": "NoScheduleNoAdmitNoExecute"
        }
       ]
        "annotations": {
            "scheduler.alpha.kubernetes.io/tolerations": "[\n { \"key\": \"dedicated\",\n   \"operator\": \"Exists\",\n   \"effect\": \"NoScheduleNoAdmitNoExecute\"\n }\n]\n"
        }

That doesn't seem bad.

@davidopp
Copy link
Member

cc/ @kevin-wangzefeng

@davidopp
Copy link
Member

@bgrant0607 Did the second example (this one #12226 (comment) ) render correctly on Github? That is, it's just regular YAML but with a "greater than" sign after the colon on the line with the key name?

Also, if we do this would that mean people who want to use this annotation would have to write their whole config in YAML? I assume you can't just toss in some random YAML (for the annotation) in the midst of a JSON pod description? It seems like that might be a deal-breaker.

@bgrant0607
Copy link
Member

@davidopp Either block scalar style should work:
http://yaml.org/spec/current.html#id2539942

Yes, my example has a greater-than sign at the end of line. A pipe at the end of line (literal style) also works.

JSON is legal YAML, so they can mix JSON and YAML as they wish.

@davidopp
Copy link
Member

What did you mean "no conversion from yaml to json"? Clearly your "kubectl convert -f nginx-pod.yaml -o json" converts yaml to json. Did you mean inside the API server?

@bgrant0607
Copy link
Member

@davidopp The data inside the annotation value is not converted because it is opaque to kubectl and to the apiserver. We could do the conversion server-side, but that would have to be on a per-annotation basis.

@smarterclayton
Copy link
Contributor

smarterclayton commented Jan 22, 2016 via email

@bgrant0607
Copy link
Member

cc @mikedanese

@bgrant0607
Copy link
Member

I think we're never going to do this, so closing.

@davidopp
Copy link
Member

For posterity: you can find an example of embedding JSON in the value of an annotation in #19758. It works, complete with validation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api Indicates an issue on api area. area/extensibility priority/backlog Higher priority than priority/awaiting-more-evidence.
Projects
None yet
Development

No branches or pull requests