Since JSON does not specify what happens with duplicate keys, it is "unspecified behavior" in language-spec speak.
Software that produces JSON with duplicate keys causes unspecified behavior in downstream processors, and therefore its output can be regarded as broken.
If you put out JSON with duplicate keys, you are responsible for whatever consequently blows up in your face or the faces of downstream users.
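To make the "unspecified" part concrete, here is what the same payload can do to different parsers (Ruby shown; the key name is made up):

    require 'json'

    # The JSON grammar accepts this, but says nothing about which value a
    # consumer should end up with.
    payload = '{"retries": 3, "retries": 10}'

    JSON.parse(payload)
    # => {"retries"=>10} with parsers that keep the last pair; other parsers
    #    keep the first, or reject the document outright.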
The first thing we could do here is rename the JSON.parse :symbolize_names keyword to :symbolize_keys -- it always trips me up for some reason.
Why? JSON is name-value pairs https://www.json.org/json-en.html
Sure, but JSON.parse returns a Hash, which is key-value pairs. The method argument is about the return value, not the input value, so it is more like "Parse this JSON, and symbolize the keys in the resulting Hash before returning it to me". Plus, as mentioned, symbolize_keys is more conventional.
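For reference, the option in question names the keys of the Hash you get back, not anything in the JSON input:

    require 'json'

    JSON.parse('{"name": "Ada"}')
    # => {"name"=>"Ada"}

    JSON.parse('{"name": "Ada"}', symbolize_names: true)
    # => {:name=>"Ada"}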
I guess I'm more thinking about Ruby/Rails conventions, e.g. methods like Hash#symbolize_keys.
Mixing the two has always been a smell to me, but maybe you're right.
Maybe it could just be an alias. People do also refer to them as key-value pairs, so that feels reasonable.
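Until the gem itself grows such an alias, a thin wrapper gets you there. A rough sketch with made-up names, not something the json gem provides:

    require 'json'

    # Hypothetical wrapper: treat :symbolize_keys as an alias for the
    # existing :symbolize_names option.
    module JSONCompat
      def self.parse(source, opts = {})
        opts = opts.dup
        opts[:symbolize_names] = opts.delete(:symbolize_keys) if opts.key?(:symbolize_keys)
        JSON.parse(source, opts)
      end
    end

    JSONCompat.parse('{"a": 1}', symbolize_keys: true)  # => {:a=>1}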
Changing the default behavior for duplicate keys is unnecessary... and therefore should not be done.
IMO, it's nuts to purposely introduce random bugs into the apps of everyone who uses your dependency.
There was a post here recently about how this sort of behaviour on duplicate keys has actually led to security issues (in the context of Go) [0]
The issue was triggered by using multiple parsers with different behaviors.
The change of behavior here would prevent that specific problem but make the general problem worse.
This change creates a new code path that did not exist when all the app code was developed and tested, so there's a decent chance something bad will happen that the code isn't prepared for.
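The Go issue is a classic "parser differential": two layers parse the same bytes and disagree about which duplicate key wins. A toy Ruby illustration (the role/admin payload is invented, and the first-wins side is simulated by hand):

    require 'json'

    payload = '{"role": "user", "role": "admin"}'

    # Layer A (say, a validating proxy) keeps the FIRST pair. Simulated here
    # from the pairs as they appear in `payload`.
    pairs = [['role', 'user'], ['role', 'admin']]
    first_wins = pairs.reverse.to_h
    # => {"role"=>"user"}

    # Layer B (Ruby's JSON.parse, with its longstanding default) keeps the LAST pair.
    last_wins = JSON.parse(payload)
    # => {"role"=>"admin"}

    # The proxy authorizes the request as "user"; the backend acts on "admin".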
The thing that I find surprising about this is that it seems so innocuous. Duplicate keys in an object? Meh, just take the first or the last or something, no big deal; I'm sure whatever library I'm using will handle it.
Just to point out, the problem in that issue isn’t really about JSON parsing. It’s about the layer whose job it is to verify/sanitize the input passing the unverified/unsanitized input through to the next layer.
The idea that your JSON parser can catch this kind of error in the general case just doesn’t make sense.
Changing the default behavior of the parser would catch this specific problem, by coincidence, but it will also create new problems. They are as yet unknown, but we will find out (unless someone has the sense to back off this bad idea before it makes it into production).
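In other words, the robust fix lives in the verifying layer, not the parser: forward the structure you actually checked, never the raw bytes. A sketch with invented method names, not code from the linked issue:

    require 'json'

    def forward_to_backend(body)   # hypothetical downstream call
      puts "sending: #{body}"
    end

    def authorize_and_forward(raw_body)
      request = JSON.parse(raw_body)
      raise 'forbidden' unless request['role'] == 'user'

      # Re-serialize what was validated; duplicate keys in raw_body cannot
      # survive the round trip, so validator and backend agree by construction.
      forward_to_backend(JSON.generate(request))
    end

    authorize_and_forward('{"role": "user", "action": "read"}')
    # prints: sending: {"role":"user","action":"read"}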
RFC 8259 describes these possibilities, accurately quoted in the article: "Many implementations report the last name/value pair only. Other implementations report an error or fail to parse the object, and some implementations report all of the name/value pairs, including duplicates."
Keeping the first key/value would be a poor idea since it is a behavior not mentioned in the RFC.
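For what it's worth, Ruby's json gem has historically been in the first of those camps, reporting the last name/value pair only:

    require 'json'

    JSON.parse('{"a": 1, "a": 2}')
    # => {"a"=>2}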
Not sure what they mean by "json gem", as it comes with Ruby and isn't installed via the gem install command.
I've always used
    require 'json/ext'
This uses the C extension version and is much faster.
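For completeness: on installs where the C extension is available, a plain require 'json' normally ends up using it anyway, so the explicit require mostly matters when you want to be certain which backend you got:

    # Explicitly load the C-extension backend (skipping any pure-Ruby fallback).
    require 'json/ext'

    JSON.parse('{"a": 1}')        # handled by JSON::Ext::Parser
    JSON.generate({ 'a' => 1 })   # handled by JSON::Ext::Generator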