
Whatever you may think of `null` (ahem https://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retra...), it is not a string type. It is horrifying to me that all of the problems described in the article are cases of allowing a user to execute arbitrary code by string input, and the article fails to point this out.


It is also security policies put together by overly cautious people. They say "I can't trust our developers to write properly secure code, and it won't hurt anyway so I'll just institute a blanket policy to block everything that might appear in an SQL injection attack. Let's see... we'll block quotes, ";", the word "null"... maybe a few more things."


This is also horrifying, as it indicates that they're taking wildly destructive precautions but still putting user input into SQL queries.


I don't think the suggestion was that the policy is created because the data is going into SQL queries. The policies are often in place because the people in charge of the policy flat out don't trust the developers. Say the policy comes out of the CTO's office; they have two options: 1) a liberal password policy, and risk a crappy developer screwing up, or 2) a conservative policy, and not having to worry at night.


Slight change of topic here. Your comment made me realize that once a system gets too complicated the only change that can be made is to add more complexity.

In this case, it would be better to ensure all SQL is properly escaped, but because that isn't a trivial task, you instead end up adding another layer of complexity.
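To make the trade-off concrete, here's a minimal sketch (hypothetical table and column names) of why hand-escaping is the fragile layer. Naive concatenation lets the input rewrite the query, while a parameterized API sidesteps escaping entirely:

```javascript
// Naive concatenation: the input becomes part of the SQL text.
function naiveQuery(surname) {
  return "SELECT * FROM users WHERE surname = '" + surname + "'";
}

// A crafted "surname" breaks out of the string literal:
console.log(naiveQuery("x' OR '1'='1"));
// SELECT * FROM users WHERE surname = 'x' OR '1'='1'

// With a parameterized API (node-postgres style, shown for illustration),
// the value travels separately from the SQL text, so nothing needs escaping:
//   client.query("SELECT * FROM users WHERE surname = $1", [surname]);
```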


Right, and in larger organizations, the "they" are two completely different groups of people. My current situation: Developers in City A are expected to use secure coding standards. But the IT group in City B, halfway across the country, installs an intruder-alert proxy which blocks URLs that might contain HTML or SQL. The CTO is probably too busy playing golf with Oracle salesfolks to care; everyone is just looking after their own end of the business.


I think you're misreading what the parent said. It sounded like even with prepared queries (like PDO), some managers don't trust the developers to Do The Right Thing(tm).


If they're using prepared statements, they shouldn't need to do anything else about SQL injection attacks -- either the prepared statements work or they don't.


Exactly -- "if they are". Are they? How can you, as a manager, know? How can you make 100% sure that every developer to ever work on the code always does the right thing?

Then consider the downside and cost of filtering the input, vs. the downside and cost of injection attack times a probability of 0.01% of a developer messing up -- and a policy, however idiotic it may seem to the individual developer, may make sense on a business level.
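A back-of-envelope version of that business argument, with entirely made-up numbers, just to show the shape of the trade-off:

```javascript
const breachCost = 1000000;  // hypothetical cost of one injection incident
const oneInN = 10000;        // the 0.01% chance: one developer slip in 10,000
const filterCost = 50;       // hypothetical yearly annoyance cost of the filter

const expectedLoss = breachCost / oneInN; // 100
console.log(expectedLoss > filterCost);   // true: the policy "pays off"
```

However idiotic the filter looks to an individual developer, the policy makes sense whenever the expected loss exceeds the filter's cost.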


A manager can just as easily create a policy about prepared statements as create a policy about mangling text to "protect" user input that is being put into SQL queries. And neither is easier to monitor. Adding the second as an additional policy because of uncertainty about compliance with the first is silly -- it buys you nothing where the first is complied with, and gets you another monitoring problem. Better to expend the same additional effort you'd spend monitoring the second policy on improving monitoring of the first.


It seems like you are ready to declare "second line of defense" a silly thing in principle?

Sometimes adding a second line of defense at 90% efficiency is cheaper than increasing the first line from 90% to 99%.


I was responding to the premise, not declaring second lines of defense silly in principle. Shrug.


You can use 100% prepared statements, and still be vulnerable to sql injection. Imagine SQL dynamically building SQL. It's real and I've seen it.
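A sketch of the point above (hypothetical names throughout): every user-facing query uses prepared statements, but a later step -- a report builder, or a stored procedure generating dynamic SQL -- concatenates an already-stored value back into SQL text:

```javascript
function buildReportSql(storedName) {
  // storedName came out of the database, so it "must be safe"...
  return "SELECT * FROM orders WHERE customer = '" + storedName + "'";
}

// ...except it still contains whatever the user originally typed,
// which the prepared INSERT dutifully preserved:
console.log(buildReportSql("x'; DROP TABLE orders; --"));
```

This is usually called second-order injection: the safe write preserves the payload, and the unsafe read detonates it.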


We had to integrate a webshop with the customer's existing SAP installation. The consultants in charge of SAP gave us a SOAP webservice that, among other things, had a command called ExecuteSQL - we could pass a raw SQL string and the SAP server would execute it. They "protected" it by blacklisting common naughty words like DROP, ALTER, etc., as well as semicolons and CAST (I don't really know why). When we needed to actually include a semicolon in the query as part of a string literal, I had to work around their filter by replacing the string literal with CONVERT(VARCHAR(MAX), 0x.......).
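A sketch of that workaround: hex-encode the string literal so the blacklist never sees a semicolon. In T-SQL, CONVERT(VARCHAR(MAX), 0x...) turns the binary back into the original characters (the helper name here is made up for illustration):

```javascript
function toHexLiteral(s) {
  // Encode each byte of the string as two hex digits.
  const hex = Buffer.from(s, "utf8").toString("hex");
  return "CONVERT(VARCHAR(MAX), 0x" + hex + ")";
}

console.log(toHexLiteral("a;b"));
// CONVERT(VARCHAR(MAX), 0x613b62)
```

Which is exactly why keyword blacklists are security theater: the payload just changes representation.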


Seems far more likely to be type coercion or overenthusiastic validation than arbitrary code execution.


Type coercion does not necessarily imply the contents were executed as code.


I'm not aware of any language that coerces the string `'null'` to the value `null`. Are you aware of any?


ColdFusion does. There was a famous Stack Overflow thread about that: "We have an employee whose last name is Null. Our employee lookup application is killed when that last name is used as the search term (which happens to be quite often now)."

http://stackoverflow.com/questions/4456438/how-to-pass-null-...


A badly coded website could go in the other direction. E.g. in JS, someone writes a poor string equality check:

    function normalise(s) {
        return (s + "").toLowerCase();
    }

    function isEqual(a, b) {
        return normalise(a) == normalise(b);
    }
And then someone else uses it in some form validation logic:

    if (isEqual(surname, null)) {
        return error;
    }


Then as the next step, some programmer somewhere notices that all the NULLs in the database have been converted into strings somewhere, and puts UPDATE table SET column = NULL WHERE column = 'NULL' somewhere, maybe even in a trigger, and that's that.

The problem with data destruction is that only one thing in the entire data pipeline has to spuriously map two distinct values to the same output value, and the data is destroyed and can not be recovered. Very few programmers think carefully about whether a serialization or data transfer format maps every single possible input to a different output. It may only take one oversight to ruin your year.
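A minimal lossy round trip, as a sketch: a serializer that writes null as the bare text "null" maps two distinct inputs to one output, and no reader can undo that:

```javascript
function writeField(v) { return v === null ? "null" : String(v); }
function readField(s)  { return s === "null" ? null : s; }

console.log(readField(writeField(null)));   // null, as intended
console.log(readField(writeField("null"))); // null -- Mr. Null's data is gone
```

The serialization is not injective, so the round trip cannot be, either. One such step anywhere in the pipeline is enough.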


It tends not to be programming languages but textual data formats (including many common implementations of comma-separated files), and even databases that by nature can only hold string values but also want a way to represent null (and fail to fix this by escaping the hell out of everything with quotation marks). And yet it isn't that "code" is executing.


Rails, at least, used to do this when reading back from a database (e.g. if you had a nullable string field, it would come back as nil if it happened to be the literal "nil").

Edit: IIRC there was a directive that would prevent the behavior though?


Java and JavaScript do the opposite: null values are represented by the string "null" ( http://stackoverflow.com/questions/24591212/why-the-coercion... ) - it's conceivable many programs processing strings would assume "null" in a string comes from a null value originally.


Your inaccurate description gave me a small heart attack. We all know JavaScript is horrible, but that is next-level horrible.

null is coerced to "null" when it's needed as a string. This behaviour is similar to printf in C. It's not the case that "null" === null like True == 1 in Python. (This would be my interpretation of "x is represented by y".)
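The direction of the coercion, spelled out:

```javascript
console.log(null + "");        // "null" -- null stringified in a string context
console.log(String(null));     // "null" -- same conversion, explicit
console.log("null" === null);  // false  -- the string never equals the value
```

So the hazard is purely at serialization boundaries, not inside the language's equality semantics.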


If you want 'next-level horrible', open your browser's console and take a look at what you get back from "typeof null".


JavaScriptCore:

    >>> typeof null
    object
    >>> typeof "null"
    string
    >>> typeof (null + "")
    string


Something always bugs me about this criticism:

#1 arguably makes sense. Unless "null" were a "type" (which it isn't), the other possible options for this ("undefined", or "string", "number", "array", etc.) all make less sense.

#2 is expected/correct.

#3 makes sense if you accept that the "+" operator aggressively casts its arguments to strings when they are anything other than two numbers. Arguably that's a bad design, but it's not unexpected or unpredictable.

The one completely unexplainable "+" behavior I know of in JS (i.e., a recent node/v8) is the second of these:

  > [] + {}
  '[object Object]' // makes sense because the string representation of [] is ''
  > {} + []
  0 // ???
  > var x = {} + []
  undefined
  > x
  '[object Object]' // but assigning to a variable first makes sense again


> #1 arguably makes sense. Unless "null" were a "type" (which it isn't), the other possible options for this ("undefined", or "string", "number", "array", etc.) all make less sense.

`typeof` gets the primitive type of a value, and, exactly like undefined, null is a primitive value with its own primitive type. Just as `typeof undefined` returns `"undefined"`, `typeof null` was supposed to return `"null"`. It doesn't, due to a bug in the original implementation, which was then spec-enshrined because fixing it would break too much stuff: http://www.2ality.com/2013/10/typeof-null.html


This is because:

> [] + {}

Becomes:

> [].toString() + {}.toString()

Empty array to a string is "" (empty string). Object to string is "[object Object]".

> {} + []

Becomes:

> +[]

The prefix turns the array into a number, same as Number([]). This happens because {}, appearing as the first thing in the statement, is parsed as an empty code block rather than an object literal, leaving just the unary-plus expression +[].

Put it in parentheses and it'll do toString like before.
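The two parses can be shown side by side (a sketch; exact REPL echoes vary by engine):

```javascript
console.log(+[]);        // 0 -- unary plus: Number([]) -> Number("") -> 0
console.log(({} + []));  // "[object Object]" -- parens make {} an object literal
```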


And now the obligatory WAT video: https://www.destroyallsoftware.com/talks/wat


The first of which is just my point. Null cannot be treated safely as an object in any way. To attempt such elicits an exception. Yet its type is 'object' nonetheless, needlessly complicating every type check that has to do with objects. I've been working with Javascript, professionally and as a hobbyist, since about 1995, and have yet to encounter any satisfactory reason why this should be the case.

Edit: And a comment nearby explains that it is the way it is for the same reason Make barfs on leading spaces. Well, at least that's a reason! A terrible, terrible reason.
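The complication in practice looks like this (a sketch; the helper name is made up): every "is this an object?" check needs an explicit null guard, because typeof lies about null.

```javascript
function isRealObject(v) {
  // typeof alone is not enough: typeof null === "object",
  // so the null case must be excluded by hand.
  return typeof v === "object" && v !== null;
}

console.log(isRealObject({}));    // true
console.log(isRealObject(null));  // false, despite typeof null === "object"
```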


No, it might be a conversion from a database to CSV that's doing it or something similar.


"Types", sounds like someone doesn't understand dynamic languages, which are some of the most popular languages today.


To be more clear than my previous response, I am well aware that dynamically typed languages do all sorts of nonsense, but nearly none of them consider this `null` to equal `"null"`, certainly not the widely popular ones.

Side note: please refrain from "sounds like someone doesn't understand". It is extremely condescending, dissuades discussion by most people whether they actually know the topic or not, and can be especially deflating to people who are less educated but currently learning.

If there is something that you feel I misunderstood about the way popular languages treat the string `"null"`, feel free to explain it (there are many other response comments I've gotten that are good examples). But if the purpose of your comment is to make me or anyone else feel ignorant, please just keep it to yourself.


I strongly believe parent was being sarcastic.


Ugh, on re-reading, very probably. I'm super sensitive to this stuff with all the "inclusion is a joke" nonsense going around today, and probably overreacted to jest. Apologies if so.


Sadly, all of my professional work to date has been in dynamic languages. They still have types, and `null` is still not `"null"`.

Edit: whoops, forgot the one time I wrote a one-off Mac app in Swift.


I was kinda sorta being sarcastic, though that kind of thing is a big problem in dynamic languages. Javascript has the concept of "truthy" values of ... whatever rather than a simple boolean type. Realistically this particular problem is not due to dynamic languages per se. Most likely it's due to some stage of a process where data was serialized/deserialized without sufficient care. In general it's more of an "in band signaling" problem.



