This article gives a good example of why you SHOULD consider using an existing framework to create a REST API.
I see no concern about authentication, authorisation, scalability, protection from SQL injection attacks, nor making the output easily parseable by third-party applications.
None of these are issues you can simply say, "I'll deal with that later when it becomes a problem." They are reasons why an existing framework is helpful.
Any example of a web application using a database should NOT be using string concatenation for adding field values to an SQL statement. Ever. Really.
Frameworks don't magically solve these concerns. I can create my API with Django, and there's no guarantee that I will restrict my endpoints to authenticated parties or guard article modifications so that only their authors can make them, for example.
Django doesn't magically scale; you still need to a) learn how its ORM works and where it can be greedy, or b) just use a "saner" ORM (e.g. SQLAlchemy) or otherwise write your own, without the thousand features you don't need.
SQL... set variables with `postgres_escape` and not `set`, always.
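For context, a minimal sketch of what that looks like (directive names per the ngx_postgres module; `$escaped_id` is a variable name chosen here for illustration, not from the article):

```nginx
# Sketch: escape user input with postgres_escape before it reaches the query.
location ~ /articles/(?<id>\d+) {
    postgres_pass     database;
    rds_json          on;
    postgres_escape   $escaped_id $id;
    postgres_query    HEAD GET "SELECT * FROM articles WHERE id = $escaped_id";
    postgres_rewrite  HEAD GET no_rows 410;
}
```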
Output... he returns appropriate responses, but among all the "RESTful" frameworks I've seen, they ALL have different opinions on what should be returned, what HTTP codes to use, etc. I'm not sure a framework helps, other than to inform you or get you stuck with their rigid paradigms.
Please make a little effort and expand on why this article is interesting instead.
For instance, people are using micro frameworks for a reason. They want less clutter and a more direct grasp of what the machine is doing. With the proposed theoretical solution we get to remove one big complexity element from the setup, which is very interesting.
Unfortunately when defaults are inherently insecure, they lead to people building insecure systems. If every single parameterized SQL command needs to include (possibly multiple) escapes then it'll be missed in some places.
It's unnecessary anyway. It would be way better if the parameters were bound as named parameters. Ex:
location ~ /articles/(?<id>\d+) {
    postgres_pass database;
    rds_json on;
    postgres_query HEAD GET "SELECT * FROM articles WHERE id = $id";
    postgres_rewrite HEAD GET no_rows 410;
}
See how $id has no quotes around it? The DB driver should parse the parameter and bind it as a string in that position. If you need to use it as a different data type (ex: integer) then you can do an explicit type conversion. That's how you prevent SQL injection.
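The same idea, sketched with Python's sqlite3 just to illustrate binding (any driver with placeholder support behaves similarly; the table and values are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO articles VALUES (1, 'hello')")

# The hostile input never touches the SQL text: it is sent as a bound
# parameter, so it can only ever match (or fail to match) as a value.
hostile = "1; DROP TABLE articles; --"
rows = conn.execute("SELECT * FROM articles WHERE id = ?", (hostile,)).fetchall()
print(rows)  # [] -- no match, and certainly no dropped table

rows = conn.execute("SELECT * FROM articles WHERE id = ?", (1,)).fetchall()
print(rows)  # [(1, 'hello')]
```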
"The DB driver should parse the parameter and bind it as a string in that position. If you need to use it as a different data type (ex: integer) then you can do an explicit type conversion."
Not sure exactly what you mean here. The protocol and libpq support sending literal values entirely outside of the query itself. There is no reason for the DB driver to do any parsing.
Well, it would appear to me this is one of the few posts on OpenResty. I recently discovered it myself as one of the faster options on the TechEmpower Benchmarks. I know that this will lead to a benchmark flamewar, but that is where I first heard of it and started researching it.
However, you might also notice that this project uses Lua (specifically LuaJIT) to run embedded in the server. I would think this avoids the problems partially. There is even a new MVC project budding out of this for Lua/Moonscript called Lapis, if that really assuages your concerns. Seems pretty micro, so I am not sure it is meant to really handle your security concerns. It was used on a demo for a web game shop site that made it here a few months ago. Check it out; it is pretty cool.
As pointed out before, few frameworks I have researched care about security. Ironically, I think web2py is the only one I remember having dedicated page space, from a long time back, on running an OWASP evaluation against their code base: http://www.web2py.com/book/default/chapter/01#Security. Others rarely mention it, and are total crap security-wise. As others pointed out, frameworks are little guarantee of security unless you, as with the language and dev experience itself, are prepared to know what the hell you are doing.
Final point: a much cooler complement might be the "Web Application Firewall" projects built into the web server itself to block typical injection attacks. NAXSI, specifically for Nginx, is a cool project I will look into soon. Maybe this will interest you too.
It was a bad move not to include protection against SQL injection to begin with. I was trying to keep it as simple as possible and just show a simplified example, but I think that was a bad thing to do, so I have updated the example and it now uses postgres_escape.
And yes, this approach should only be used for very simple APIs. If you're building something bigger, use a framework.
CouchDB's support for "views" and using comet/websockets (have they implemented websockets yet?) to communicate with a JS app served from the DB itself...
Riak does too - any DB that implements a RESTful interface to the database instead of a binary protocol.
CouchDB is particularly well-suited for this use-case though (I would never use Riak in that fashion).
Exactly. CouchDB is preferable to this for so many reasons. The author is effectively using the Nginx configuration language for server-side scripting. The result ends up looking a lot like a bad PHP CRUD app. With CouchDB, the REST API is already built in.
Interesting. I'd like to see how you implement role based access control and sql injection safeguards. How would you define complex validation schemes? In stored procedures?
I want to chime in that I have been using a similar setup: nginx, ngx-postgres, postgresql 9.2.4 in production for a while. It has been very snappy and rock solid till this point.
I used android async-http to talk to the server from my app to have everything online. This is an intranet app so latency is not really an issue.
To clarify, it looks like "postgres_escape" is the way to do escaping.
Unfortunately, it seems a bit awkward to differentiate between empty and NULL strings. That's something to be careful of.
Also, I really think this should be included in the blog post, even if it's simple. Protecting against SQL injection is not optional, so leaving it out only muddies the comparison with more traditional frameworks. And there's always going to be someone who copies and pastes it without thinking much about it.
That being said, this is very cool and I hope to see more development and exploration in this area.
From the docs re postgres_escape: "Because nginx cannot tell the difference between empty and non-existing strings, all empty strings are by default escaped to NULL value."
This behavior actually is what anyone who has used Oracle is accustomed to (empty string is NULL). I don't recall the default behavior in Postgres but Postgres is "Oracle-ish" in a lot of ways so I would be surprised if this is not the default there as well.
The "empty string is NULL" behavior you describe is a standards violation. Postgres does not follow Oracle off of that particular cliff; postgres does differentiate between the two.
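A quick illustration of the difference (Postgres syntax; the results reflect documented behavior, not output from the article):

```sql
-- Postgres keeps the two distinct:
SELECT '' IS NULL;     -- false: an empty string is a value
SELECT '' = '';        -- true
SELECT NULL IS NULL;   -- true
-- Oracle, by contrast, stores '' in a VARCHAR2 column as NULL,
-- so the first comparison there would come back true instead.
```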
... And Oracle is going to change that any day now for varchar, which is why they've been hounding us to use "varchar2" from time immemorial. (They wouldn't have any other reason to encourage varchar2, would they?) :)
Yeah but by the time we've taken care of injection issues (and handled all the other gotchas that haven't been articulated yet), we've written about as much boilerplate as we would for just about any other scripting language / framework, wouldn't we?
Don't get me wrong -- it's still a pretty neat idea. It's just there's that "other 90%" (i.e. the necessary work to create anything seriously production grade) that always comes to mind when evaluating a new interface.
I would implore blog writers to not do this. I realize it makes the code easier to read, but even if you loudly disclaim "THIS CODE IS INSECURE DO NOT USE IT AS-IS" there are plenty of copy/paste coders in the world who will do exactly that, perhaps even intending to come back later and add in the protections, but never actually get around to it.
Wouldn't you be throwing away your ability to add a caching layer? I guess there are ngx modules for Memcache/Redis, but coding even a simple API in nginx.conf can get difficult.
I still like the idea of nginx being able to communicate with different datastores. For example, let's say you want to serve a file but it requires authentication, you can pass some signed request via query string and cross check it in session in redis/postgres wherever.
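The signed-request idea can be sketched in a few lines (a hand-rolled scheme purely for illustration; the secret, parameter names, and expiry handling are all made up):

```python
import hashlib
import hmac
import time

SECRET = b"shared-secret"  # known to whatever issues links and to the checker

def sign(path, expires):
    """Compute the signature the client must attach to its request."""
    msg = f"{path}|{expires}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()

def check(path, expires, sig):
    """Verify the query-string signature before serving the file."""
    if int(expires) < time.time():
        return False  # link has expired
    return hmac.compare_digest(sign(path, expires), sig)

expires = int(time.time()) + 300  # valid for five minutes
sig = sign("/protected/report.pdf", expires)
assert check("/protected/report.pdf", expires, sig)
assert not check("/protected/other.pdf", expires, sig)  # wrong path fails
```

The cross-check against redis/postgres would then only need to look up the session or secret, not re-derive anything per file.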
Every time I see something like this I think it is an opportunity to finally do something with http://www.eclipse.org/Xtext/. How cool would it be to come up with a simple (external!) DSL to mount SQL queries over HTTP routes, and transform the whole thing to a nginx configuration file?
Hopefully the former. It would be difficult to build anything non-trivial with it. However, I suppose you could build a basic counter or key-value system over HTTP with it.
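A counter along those lines might look something like this (a sketch reusing the ngx_postgres directives from elsewhere in the thread; the `counters` table is hypothetical):

```nginx
location ~ /counter/(?<name>\w+) {
    postgres_pass     database;
    rds_json          on;
    postgres_escape   $escaped_name $name;
    postgres_query    GET "UPDATE counters SET n = n + 1 WHERE name = $escaped_name RETURNING n";
    postgres_rewrite  GET no_rows 404;
}
```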
Can you recommend a public repo which demonstrates use of the HttpLuaModule? [Edited to add:] Another comment mentions the Lapis framework which gives [me] a good starting point.
This setup is powerful in its simplicity. Only two layers. And stored procedures can help you do validation, so one could imagine a full Twitter-like service implemented this way (less the UI, obviously).
Postgresql doesn't offer proper timezone storage, or in fact any timezone storage. Timestamptz is just a UTC timestamp that is converted to/from your connection-specified timezone. You completely lose the "what timezone was this originally specified in" data.
The point is that the timestamp type is meant more for calendaring, unless you're manually normalizing the dates you're passing into postgres to a certain timezone like UTC. Timestamptz is what you actually want 99% of the time.
Yes, timestamptz is the one you want 100% of the time. But there are a ton of people who don't realize it doesn't work the way they think it does. Lots of people think it stores the timezone, and it is really important to understand that is not the case when designing your app. Postgresql is actually completely lacking in the ability to store and validate timezones, as well as converting values to other timezones in useful ways. It is quite possibly the biggest weak spot postgresql has.
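The point about losing the original zone can be shown with plain datetimes (Python here just to illustrate; timestamptz behaves analogously, storing only the UTC instant):

```python
from datetime import datetime, timezone, timedelta

# A timestamp entered in a +05:30 zone (hypothetical input).
ist = timezone(timedelta(hours=5, minutes=30))
original = datetime(2013, 6, 1, 12, 0, tzinfo=ist)

# What timestamptz effectively keeps: the UTC instant only.
stored = original.astimezone(timezone.utc)

# The instant survives the round trip...
assert stored == original
# ...but the original offset does not; if "entered in +05:30" matters,
# it has to be stored in a separate column.
assert stored.utcoffset() != original.utcoffset()
```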
Well, this is for a very simple REST API and the post acknowledges that much. I mean if this does become a "project", I'm sure it's time to move it to a proper framework, but as you say for a quick hack, it's a pretty nice way to do it (with proper sanitization, of course).
I'd also venture that this is an order of magnitude faster and more responsive than a framework so there are definitely some plusses and minuses to consider.
Yep, agreed. If you're comfortable with nginx and just want to test something quickly, this is many times better than downloading a framework, setting it up, routing, etc.
Auth Basic can be done directly against postgres, plus a few more modules (headers-more and eval was how I did it, IIRC). Could probably do it with just openresty too.
After authentication, you have the credentials needed to check authorization; it'd probably just involve a few changes to the SQL in the article.
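For reference, the decoding step is trivial; this sketch (function and variable names invented) shows the (user, password) pair you would then check against Postgres:

```python
import base64

def parse_basic_auth(header):
    """Extract (user, password) from an Authorization header, or None."""
    scheme, _, payload = header.partition(" ")
    if scheme.lower() != "basic" or not payload:
        return None
    try:
        decoded = base64.b64decode(payload).decode("utf-8")
    except Exception:
        return None  # not valid base64 / not valid UTF-8
    user, _, password = decoded.partition(":")
    return user, password

creds = parse_basic_auth("Basic " + base64.b64encode(b"alice:s3cret").decode())
# creds is now ("alice", "s3cret"); that pair would be checked against the
# users table (comparing against a password hash, never plain text).
```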