-
Notifications
You must be signed in to change notification settings - Fork 842
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consolidate unmarshalling and parsing in a single pass #5643
base: master
Are you sure you want to change the base?
Conversation
b266dd7
to
7c8ea44
Compare
7fd9d12
to
99923f8
Compare
@@ -3065,5 +3113,10 @@ public AllTypesRequest build() { | |||
public List<SdkField<?>> sdkFields() { | |||
return SDK_FIELDS; | |||
} | |||
|
|||
@Override | |||
public Map<String, SdkField<?>> sdkFieldNameToField() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it need to return a collection? Doesn't seem like we use any of the collection functionality anywhere. Can we just have a getByName
type method, that returns the SdkField
(or null)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this method has to return a map as defined in the SdkPojo
interface. We use it while parsing when we find a name to figure out to which field that name maps. Both the data class and the builder implement SdkPojo
.
unmarshallFromJson(sdkPojo, response.content().get()); | ||
return unmarshallResponse(sdkPojo, response); | ||
} | ||
return unmarshallFromJson(sdkPojo, response.content().get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these get()
s guaranteed to be safe here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point yes, otherwise hasJsonPayload
above returns false and the first branch will be used.
@@ -224,23 +217,111 @@ public T unmarshall(JsonUnmarshallerContext context, | |||
|
|||
public <TypeT extends SdkPojo> TypeT unmarshall(SdkPojo sdkPojo, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dumb question: What is the reason for the branching here? i.e. why isn't this just a call to unmarshallResponse
? I assume unmarshallFromJson
is the fast-path here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's the idea. Most of the complications here are to support restJson, RPC protocols are a lot simpler and do not need any of that, so we attempt to separate one from the other to avoid paying the cost for all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Can we add more docummentation to this class? Lot going on
- Can we add additional tests for some uncovered branches? I know the protocol tests should cover the majority but looks like some lines are missed https://sonarcloud.io/component_measures?metric=new_line_coverage&selected=aws_aws-sdk-java-v2%3Acore%2Fprotocols%2Faws-json-protocol%2Fsrc%2Fmain%2Fjava%2Fsoftware%2Famazon%2Fawssdk%2Fprotocols%2Fjson%2Finternal%2Funmarshall%2FJsonUnmarshallingParser.java&pullRequest=5643&id=aws_aws-sdk-java-v2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add more documentation to this class? Lot going on
Sure, I will look into what can be improved.
Can we add additional tests for some uncovered branches?
I will remove some of the removing location that's not used. If you look closely most of the remaining ones are about error recover from invalid JSON, e.g., that we find a field name or a closing bracket after an opening one, if that's not the case the JSON is not valid, which might not be, we don't know until we finish consuming it. I will look into covering more that we do now.
if (token == null) { | ||
return (SdkPojo) ((Buildable) pojo).build(); | ||
} | ||
if (token == JsonToken.VALUE_NULL) { | ||
return null; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm intuitively, seems like these cases should return the same value. When would token
be null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference is between an input of length 0 and an input with a literal value null
.
LGTM, can we run this in the canaries? |
Quality Gate failedFailed conditions See analysis details on SonarCloud Catch issues before they fail your Quality Gate with our IDE extension SonarLint |
Motivation and Context
Change the way to unmarshall JSON payloads. Before this change the process was done in two steps, first we parse the JSON input into a
JsonNode
structure than then is traversed to unmarshall it into aSdkPojo
. After this change, there's a single pass that builds theSdkJson
as we parse the input.Changes
SdkPojo
The
SdkPojo
interface was changed to add a new method,sdkFieldNameToField()
that returns aMap<String, SdkField<?>>
which allow us to lookups the field metadata for a given field name. We also changed the code generation to create this map on top of the list of fields that we were creating before.JsonUnmarshallingParser
A new internal class
JsonUnmarshallingParser
was added that is used byJsonProtocolUnmarshaller
to unmarshall the input. This class builds a JSON parer and builds the expected java types as the stream of tokens are read. Unmarshalling of simple types is still done using the registered unmarshallers.Benchmarks results
This change improves the performance of unmarshalling in general. Shapes with labeled simple types (as in members of structures) benefit the most from this change in plain JSON, whereas collections of datapoints (e.g., list of doubles) do not seem to benefit much, mainly because of how the parsing from strings dominates the time.
V2DynamoDbAttributeValue
Below is the before result of running the
V2DynamoDbAttributeValue.getItem
benchmarkAnd the after results,
And the after V2 results, updated after enabling and using Jackson's fast float parsing, notice how this change improves the performance as there's no longer need to create intermediate objects.
The improvements comparing before and after is
Edit, the improvements comparing before and after V2 is
JsonMarshallerBenchmark
This change also shows a ~2x performance improvements of the
JsonMarshallerBenchmark.unmarshall
for RPCv2, but, not so much for AWS JSON. Below is the before result of the benchmarkAnd the after results,
And the after V2 results, updated after enabling and using Jackson's fast float parsing, notice how RPCv2 also improves as there's no longer need to create intermediate objects.
The improvements comparing before and after is
Edit, the improvements comparing before and after V2 is
Modifications
Testing
Screenshots (if appropriate)
Types of changes
Checklist
mvn install
succeedsscripts/new-change
script and following the instructions. Commit the new file created by the script in.changes/next-release
with your changes.License