This article gives a good example of why you SHOULD consider using an existing framework to create a REST API.
I see no concern about authentication, authorisation, scalability, protection from SQL injection attacks, nor making the output easily parseable by third-party applications.
None of these are issues you can simply say, "I'll deal with that later when it becomes a problem." They are reasons why an existing framework is helpful.
Any example of a web application using a database should NOT be using string concatenation for adding field values to an SQL statement. Ever. Really.
Frameworks don't magically solve these concerns. I can create my API with Django, and there's no guarantee that I will restrict my endpoints to authenticated parties or guard article modifications so that only their authors can make them, for example.
Django doesn't magically scale; you still need to a) learn how its ORM works and where it can be greedy, or b) just use a "saner" ORM (e.g. SQLAlchemy) or otherwise write your own, without the thousand features you don't need.
SQL... set variables with `postgres_escape` and not `set`, always.
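For context, a minimal sketch of what that looks like (directive names per the ngx_postgres module; `$escaped_id` is a variable name chosen here for illustration, not from the article):

```nginx
# Sketch: escape user input with postgres_escape before it reaches the query.
location ~ /articles/(?<id>\d+) {
    postgres_pass     database;
    rds_json          on;
    postgres_escape   $escaped_id $id;
    postgres_query    HEAD GET "SELECT * FROM articles WHERE id = $escaped_id";
    postgres_rewrite  HEAD GET no_rows 410;
}
```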
Output... he returns appropriate responses, but among all the "RESTful" frameworks I've seen, they ALL have different opinions on what should be returned, what HTTP codes to use, etc. I'm not sure a framework helps, other than to inform you or get you stuck with their rigid paradigms.
Please make a little effort and expand on why this article is interesting instead.
For instance, people are using micro frameworks for a reason. They want less clutter and a more direct grasp of what the machine is doing. With the proposed theoretical solution we get to remove one big complexity element from the setup, which is very interesting.
Unfortunately when defaults are inherently insecure, they lead to people building insecure systems. If every single parameterized SQL command needs to include (possibly multiple) escapes then it'll be missed in some places.
It's unnecessary anyway. It would be way better if the parameters were bound as named parameters. Ex:
location ~ /articles/(?<id>\d+) {
    postgres_pass database;
    rds_json on;
    postgres_query HEAD GET "SELECT * FROM articles WHERE id = $id";
    postgres_rewrite HEAD GET no_rows 410;
}
See how $id has no quotes around it? The DB driver should parse the parameter and bind it as a string in that position. If you need to use it as a different data type (ex: integer) then you can do an explicit type conversion. That's how you prevent SQL injection.
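The same idea, sketched with Python's sqlite3 just to illustrate binding (any driver with placeholder support behaves similarly; the table and values are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO articles VALUES (1, 'hello')")

# The hostile input never touches the SQL text: it is sent as a bound
# parameter, so it can only ever match (or fail to match) as a value.
hostile = "1; DROP TABLE articles; --"
rows = conn.execute("SELECT * FROM articles WHERE id = ?", (hostile,)).fetchall()
print(rows)  # [] -- no match, and certainly no dropped table

rows = conn.execute("SELECT * FROM articles WHERE id = ?", (1,)).fetchall()
print(rows)  # [(1, 'hello')]
```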
"The DB driver should parse the parameter and bind it as a string in that position. If you need to use it as a different data type (ex: integer) then you can do an explicit type conversion."
Not sure exactly what you mean here. The protocol and libpq support sending literal values entirely outside of the query itself. There is no reason for the DB driver to do any parsing.
Well, it would appear to me this is one of the few posts on OpenResty. I recently discovered it myself as one of the faster options on the TechEmpower Benchmarks. I know that this will lead to a benchmark flamewar, but that is where I first heard of it and started researching it.
However, you might also notice that this project uses Lua (specifically LuaJIT) to run embedded in the server. I would think this avoids the problems partially. There is even a new MVC project budding out of this for Lua/Moonscript called Lapis, if that really assuages your concerns. Seems pretty micro, so I am not sure it is meant to really handle your security concerns. It was used on a demo for a web game shop site that made it here a few months ago. Check it out; it is pretty cool.
As pointed out before, few frameworks I have researched care about security. Ironically, I think web2py is the only one I remember having dedicated page space, from a long time back, on running an OWASP evaluation against their code base: http://www.web2py.com/book/default/chapter/01#Security. Others rarely mention it, and are total crap security-wise. As others pointed out, frameworks are little guarantee of security unless you, as with the language and dev experience itself, are prepared to know what the hell you are doing.
Final point: a much cooler complement might be the "Web Application Firewall" projects built into the web server itself to block typical injection attacks. NAXSI, specifically for Nginx, is a cool project I will look into soon. Maybe this will interest you too.
It was a bad move not to include protection against SQL injection to begin with. I was trying to keep it as simple as possible and just show a simplified example, but I think that was a bad thing to do, so I have updated the example and it now uses postgres_escape.
And yes, this approach should only be used for very simple APIs. If you're building something bigger, use a framework.
CouchDB's support for "views" and using comet/websockets (have they implemented websockets yet?) to communicate with a JS app served from the DB itself...
Riak does too - any DB that implements a RESTful interface to the database instead of a binary protocol.
CouchDB is particularly well-suited for this use-case though (I would never use Riak in that fashion).
Exactly. CouchDB is preferable to this for so many reasons. The author is effectively using the Nginx configuration language for server-side scripting. The result ends up looking a lot like a bad PHP CRUD app. With CouchDB, the REST API is already built in.
Interesting. I'd like to see how you implement role based access control and sql injection safeguards. How would you define complex validation schemes? In stored procedures?
I want to chime in that I have been using a similar setup: nginx, ngx-postgres, postgresql 9.2.4 in production for a while. It has been very snappy and rock solid till this point.
I used android async-http to talk to the server from my app to have everything online. This is an intranet app so latency is not really an issue.
To clarify, it looks like "postgres_escape" is the way to do escaping.
Unfortunately, it seems a bit awkward to differentiate between empty and NULL strings. That's something to be careful of.
Also, I really think this should be included in the blog post, even if it's simple. Protecting against SQL injection is not optional, so leaving it out only muddies the comparison with more traditional frameworks. And there's always going to be someone who copies and pastes it without thinking much about it.
That being said, this is very cool and I hope to see more development and exploration in this area.
From the docs re postgres_escape: "Because nginx cannot tell the difference between empty and non-existing strings, all empty strings are by default escaped to NULL value."
This behavior actually is what anyone who has used Oracle is accustomed to (empty string is NULL). I don't recall the default behavior in Postgres but Postgres is "Oracle-ish" in a lot of ways so I would be surprised if this is not the default there as well.
The "empty string is NULL" behavior you describe is a standards violation. Postgres does not follow Oracle off of that particular cliff; postgres does differentiate between the two.
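A quick illustration of the difference (Postgres syntax; the results reflect documented behavior, not output from the article):

```sql
-- Postgres keeps the two distinct:
SELECT '' IS NULL;     -- false: an empty string is a value
SELECT '' = '';        -- true
SELECT NULL IS NULL;   -- true
-- Oracle, by contrast, stores '' in a VARCHAR2 column as NULL,
-- so the first comparison there would come back true instead.
```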
... And Oracle is going to change that any day now for varchar, which is why they've been hounding us to use "varchar2" from time immemorial. (They wouldn't have any other reason to encourage varchar2, would they?) :)
Yeah but by the time we've taken care of injection issues (and handled all the other gotchas that haven't been articulated yet), we've written about as much boilerplate as we would for just about any other scripting language / framework, wouldn't we?
Don't get me wrong -- it's still a pretty neat idea. It's just there's that "other 90%" (i.e. the necessary work to create anything seriously production grade) that always comes to mind when evaluating a new interface.
I would implore blog writers to not do this. I realize it makes the code easier to read, but even if you loudly disclaim "THIS CODE IS INSECURE DO NOT USE IT AS-IS" there are plenty of copy/paste coders in the world who will do exactly that, perhaps even intending to come back later and add in the protections, but never actually get around to it.
Wouldn't you be throwing away your ability to add a caching layer? I guess there are ngx modules for Memcache/Redis, but coding even a simple API in nginx.conf can get difficult.
I still like the idea of nginx being able to communicate with different datastores. For example, let's say you want to serve a file but it requires authentication, you can pass some signed request via query string and cross check it in session in redis/postgres wherever.
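The signed-request idea can be sketched in a few lines (a hand-rolled scheme purely for illustration; the secret, parameter names, and expiry handling are all made up):

```python
import hashlib
import hmac
import time

SECRET = b"shared-secret"  # known to whatever issues links and to the checker

def sign(path, expires):
    """Compute the signature the client must attach to its request."""
    msg = f"{path}|{expires}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()

def check(path, expires, sig):
    """Verify the query-string signature before serving the file."""
    if int(expires) < time.time():
        return False  # link has expired
    return hmac.compare_digest(sign(path, expires), sig)

expires = int(time.time()) + 300  # valid for five minutes
sig = sign("/protected/report.pdf", expires)
assert check("/protected/report.pdf", expires, sig)
assert not check("/protected/other.pdf", expires, sig)  # wrong path fails
```

The cross-check against redis/postgres would then only need to look up the session or secret, not re-derive anything per file.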
Every time I see something like this I think it is an opportunity to finally do something with http://www.eclipse.org/Xtext/. How cool would it be to come up with a simple (external!) DSL to mount SQL queries over HTTP routes, and transform the whole thing to a nginx configuration file?
Hopefully the former. It would be difficult to build anything non-trivial with it. However, I suppose you could build a basic counter or key-value system over HTTP with it.
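A counter along those lines might look something like this (a sketch reusing the ngx_postgres directives from elsewhere in the thread; the `counters` table is hypothetical):

```nginx
location ~ /counter/(?<name>\w+) {
    postgres_pass     database;
    rds_json          on;
    postgres_escape   $escaped_name $name;
    postgres_query    GET "UPDATE counters SET n = n + 1 WHERE name = $escaped_name RETURNING n";
    postgres_rewrite  GET no_rows 404;
}
```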
Can you recommend a public repo which demonstrates use of the HttpLuaModule? [Edited to add:] Another comment mentions the Lapis framework which gives [me] a good starting point.
This setup is powerful in its simplicity. Only two layers. And stored procedures can help you do validation, so one could imagine a full Twitter-like service implemented this way (less the UI, obviously).
Postgresql doesn't offer proper timezone storage, or in fact any timezone storage. Timestamptz is just a UTC timestamp that is converted to/from your connection-specified timezone. You completely lose the "what timezone was this originally specified in" data.
The point is that the timestamp type is meant more for calendaring, unless you're manually normalizing the dates you're passing into postgres to a certain timezone like UTC. Timestamptz is what you actually want 99% of the time.
Yes, timestamptz is the one you want 100% of the time. But there are a ton of people who don't realize it doesn't work the way they think it does. Lots of people think it stores the timezone, and it is really important to understand that is not the case when designing your app. Postgresql is actually completely lacking in the ability to store and validate timezones, as well as converting values to other timezones in useful ways. It is quite possibly the biggest weak spot postgresql has.
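The point about losing the original zone can be shown with plain datetimes (Python here just to illustrate; timestamptz behaves analogously, storing only the UTC instant):

```python
from datetime import datetime, timezone, timedelta

# A timestamp entered in a +05:30 zone (hypothetical input).
ist = timezone(timedelta(hours=5, minutes=30))
original = datetime(2013, 6, 1, 12, 0, tzinfo=ist)

# What timestamptz effectively keeps: the UTC instant only.
stored = original.astimezone(timezone.utc)

# The instant survives the round trip...
assert stored == original
# ...but the original offset does not; if "entered in +05:30" matters,
# it has to be stored in a separate column.
assert stored.utcoffset() != original.utcoffset()
```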
Well, this is for a very simple REST API and the post acknowledges that much. I mean if this does become a "project", I'm sure it's time to move it to a proper framework, but as you say for a quick hack, it's a pretty nice way to do it (with proper sanitization, of course).
I'd also venture that this is an order of magnitude faster and more responsive than a framework so there are definitely some plusses and minuses to consider.
Yep, agreed. If you're comfortable with nginx and just want to test something quickly, this is many times better than downloading a framework, setting it up, routing, etc.
Auth Basic can be done directly against postgres, plus a few more modules (headers-more and eval was how I did it, IIRC). Could probably do it with just openresty too.
After authentication, you have the credentials needed to check authorization; it'd probably just involve a few changes to the SQL in the article.
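For reference, the decoding step is trivial; this sketch (function and variable names invented) shows the (user, password) pair you would then check against Postgres:

```python
import base64

def parse_basic_auth(header):
    """Extract (user, password) from an Authorization header, or None."""
    scheme, _, payload = header.partition(" ")
    if scheme.lower() != "basic" or not payload:
        return None
    try:
        decoded = base64.b64decode(payload).decode("utf-8")
    except Exception:
        return None  # not valid base64 / not valid UTF-8
    user, _, password = decoded.partition(":")
    return user, password

creds = parse_basic_auth("Basic " + base64.b64encode(b"alice:s3cret").decode())
# creds is now ("alice", "s3cret"); that pair would be checked against the
# users table (comparing against a password hash, never plain text).
```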