Hacker News

> You can't express number or array with XML

Numbers can be expressed, they just have to be parsed on the client. For array, child elements are ordered and form an array naturally.



> Numbers can be expressed, they just have to be parsed on the client.

But client has to know that they're numbers. So you can't just parse XML into JavaScript object without any knowledge about its structure.

> For array, child elements are ordered and form an array naturally.

But you can't express a zero-length array this way without explicit knowledge about structure. And you can't distinguish a one-element array from an ordinary value without explicit knowledge about structure.
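
To make the ambiguity concrete, here is a rough sketch of a schema-less XML-to-object converter in Python. The `xml_to_obj` helper and the `<r>`/`<n>` element names are made up for illustration, and this is only one naive mapping among many possible ones:

```python
import json
import xml.etree.ElementTree as ET


def xml_to_obj(elem):
    """Naive schema-less conversion: leaf -> text, otherwise dict of children."""
    children = list(elem)
    if not children:
        return elem.text  # always a string (or None); "123" is not a number
    result = {}
    for child in children:
        value = xml_to_obj(child)
        if child.tag in result:
            # A repeated tag is the only hint that this is an array.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


two = xml_to_obj(ET.fromstring("<r><n>1</n><n>2</n></r>"))
one = xml_to_obj(ET.fromstring("<r><n>1</n></r>"))
zero = xml_to_obj(ET.fromstring("<r></r>"))

print(two)   # {'n': ['1', '2']}
print(one)   # {'n': '1'}
print(zero)  # None
print(json.loads('{"n": [1]}'), json.loads('{"n": []}'))
```

With two children the converter can guess "array", with one child it sees a plain value, and with none it sees nothing at all. The JSON inputs on the last line keep the list in every case, and the numbers stay numbers.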


> without explicit knowledge about structure

That's why XML Schema exists. So, everything you listed is possible in XML if you don't put aside some specifications on purpose.

And JSON is so self-describing that someone came up with... JSON Schema.


Yes. And that's my point: you don't need schemas to work with JSON. JSON Schema exists, of course, it would be strange if someone did not invent it, as it's an obvious idea. But I've yet to see anyone using it.


> But client has to know that they're numbers. So you can't just parse XML into JavaScript object without any knowledge about its structure.

<Number value="123" type="i32" />


This is a tag named Number with two attributes. Are you trying to invent an XML sublanguage? It's possible. But with JSON it's already invented and there are multiple libraries for every programming language. What should I use to parse your format?
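
For a sense of what a client for such an ad-hoc format would have to carry, here is a minimal Python sketch. Only `type="i32"` appears in the example above; the other type tags, the `INT_RANGES` table and the `parse_number` helper are assumptions made up for illustration, not an existing format or library:

```python
import xml.etree.ElementTree as ET

# Hypothetical type tags and their inclusive ranges; only "i32" is from the thread.
INT_RANGES = {
    "u8": (0, 2**8 - 1),
    "i16": (-2**15, 2**15 - 1),
    "i32": (-2**31, 2**31 - 1),
    "i64": (-2**63, 2**63 - 1),
}


def parse_number(xml_text):
    """Parse <Number value="..." type="..."/> and range-check the value."""
    elem = ET.fromstring(xml_text)
    if elem.tag != "Number":
        raise ValueError(f"expected <Number>, got <{elem.tag}>")
    type_name = elem.get("type")
    if type_name not in INT_RANGES:
        raise ValueError(f"unknown type {type_name!r}")
    raw = elem.get("value")
    if raw is None:
        raise ValueError("missing value attribute")
    value = int(raw)
    low, high = INT_RANGES[type_name]
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {type_name}")
    return value, type_name


print(parse_number('<Number value="123" type="i32" />'))  # (123, 'i32')
```

The generic XML parser does the tokenizing, but the meaning of `type` lives entirely in this hand-written layer, which every consumer of the format would need to reimplement.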


> But with JSON it's already invented

It's invented and optimized for certain language types, specifically those that are loose with their numeric types. This includes most scripting languages.

In languages that require specific intrinsic types to be defined for a number, there are usually a lot to choose from and using the wrong one can be a real problem when converting from XML or JSON to a native format.

On a very simple level, what container do we use to encode the following data structures:

  [100,4,10,156]

  <array>
    <num>100</num>
    <num>4</num>
    <num>10</num>
    <num>156</num>
  </array>

In most dynamic languages, the number type used is just the built-in numeric container, which generally involves some sort of complex decision between floating point and bignums. For something like C, Java or Rust, we would generally want to choose an appropriate type. In this case, it looks like an unsigned 8-bit integer will suffice for most, and a 16-bit signed value for Java (which doesn't support unsigned values).

But what if the next number is much larger? Should we really need to parse all the values to determine the correct data type to use? That seems very inefficient, and we can't even be sure that we'll encounter values that accurately illustrate the range of values in one parsing. What if the next message or file we parse has large values?
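
A small Python sketch of the problem, assuming a made-up list of candidate fixed-width types (`INT_TYPES`) and a hypothetical `narrowest_type` helper. The second message is invented to show how the answer changes:

```python
# Candidate fixed-width types, narrowest first (names are illustrative).
INT_TYPES = [
    ("u8", 0, 2**8 - 1),
    ("i16", -2**15, 2**15 - 1),
    ("i32", -2**31, 2**31 - 1),
    ("i64", -2**63, 2**63 - 1),
]


def narrowest_type(values):
    """Return the first type whose range holds every value; needs a full pass."""
    low, high = min(values), max(values)
    for name, type_min, type_max in INT_TYPES:
        if type_min <= low and high <= type_max:
            return name
    raise OverflowError("no fixed-width type fits; a bignum is required")


first_message = [100, 4, 10, 156]
second_message = [100, 4, 10, 70000]  # invented follow-up message

print(narrowest_type(first_message))   # u8
print(narrowest_type(second_message))  # i32
```

The type inferred from the first message is already wrong for the second, so the width has to come from somewhere outside the data itself.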

For these languages the inherent data type definition of JSON is a poor match, since its looseness does not transfer easily to a language which does not inherently support it. If your target language supports dynamically resizing untyped arrays, untyped key-value maps, and generic number types that automatically handle both very large and floating point values, then JSON is an almost perfect representation format for you. If your target language works best when those items are broken into smaller, more explicit components, there's a lot of extra work in parsing JSON, and I can see how that makes XML not look much worse in comparison. Your parsed data structure will also likely be leaner, because there isn't the overhead inherent in those convenient magical types; references are rarely as efficient as pointers.


Odd that you would pick numbers. Those are not exactly bug free in JSON. And dates are as likely to actually be a problem.


> And dates are as likely to actually be a problem.

I'm currently working against a REST API whose JSON format has three different date formats as sub-elements of the same root object...


As opposed to json: `{"number": 12345678901234567890}`

What does that look like when you parse it?


Mapping from string to number. Probably BigInteger for Java, whatever other standard integer for other languages.
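
As a rough illustration in Python, using only the standard `json` module: the default parse is exact because Python integers are arbitrary precision, while passing `parse_int=float` imitates a parser that stores every number as a double. That second parse is an assumption about how such a parser behaves, not a claim about any specific library:

```python
import json

text = '{"number": 12345678901234567890}'

# Python's int is arbitrary precision, so the default parse is exact.
exact = json.loads(text)["number"]

# Forcing integers through float mimics a parser that stores every number as
# an IEEE 754 double; the value is rounded to the nearest representable double.
as_double = json.loads(text, parse_int=float)["number"]

print(exact)
print(exact > 2**63 - 1)        # True: does not fit in a signed 64-bit integer
print(exact > 2**64 - 1)        # False: still fits in an unsigned 64-bit integer
print(int(as_double) == exact)  # False: the double cannot hold every digit
```

So the same document gives an exact integer, an overflow, or a silently rounded value depending on which numeric type the parser picks.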



