Do you have specific criticisms of my work and/or attention to detail re: the TL...

jerf · on July 26, 2015

I'll add another slight spin, which is that I'd never run this in production, ever, even if I actually personally trusted you a great deal, because I can't possibly audit this sort of code base. By extension, for the same reasons I can't audit the source very well, neither can anyone else. (i.e., no, I do not personally audit everything I run but I reasonably expect that because it is possible others have. "Many eyes make bugs shallow" may be oversold but it is not simply false.) The only practical way I know to audit this sort of code base at this time is naked-eye inspection, and I don't trust that.

I say "I know" because there may be something out there. I know there exist tools for source code analysis that deal directly with assembler, because I see theses around writing them. I don't know where to get one, though, or how to use it, or how much to trust it. There's a lot more that deal with "C" or "Java" rather than raw assembler, so I can fire several tools, both commercial and open source, at the problem.

And all that said, I'm still extremely strongly on the "STOP WRITING C CODE AND PUTTING IT ON THE INTERNET DAMMIT" side, even with that support. Without even that support, I frankly don't care if it's 100 times faster than nginx. nginx is already maxing out my risk tolerance as it is, and I've begun a long, slow program to get it out of my stack too.

I want to emphasize how this is explicitly (but also unapologetically) non-specific, none of this is personal, none of this is directly critical of your code (because if you've gotten the impression I haven't even glanced at it, that is correct), and in particular, please by all means do whatever you like with your spare time. The "problem" here isn't that you have somehow failed to leap my bar, the problem is that the bar is impractically high for code written in raw assembler. I suppose you could provide a math proof but I'd almost argue in that case the server becomes implemented in said proof language rather than assembler anymore.

technion · on July 27, 2015

The related element is this: if the team behind nginx somehow all get hit by the same bus, I am confident someone appropriate will pick up and maintain the product.

In the case of an ASM project, I would be very surprised if anyone came along with the appropriate knowledge to ever want to touch the codebase. LibreSSL is currently pulling bits of ASM out of the codebase just to remove that factor.

Like Jerf said, I want to be clear that I'm incredibly impressed you got this project over the line, and I can't make any complaint about how you've done things.

arthurcolle · on July 26, 2015

How would you replace nginx??

jerf · on July 29, 2015

Go's http server. The inside of nginx is an incredible mess of C. I begrudgingly trust it since it is being actively attacked and maintained. The inside of Go's http server is incredibly clearly written. I am confident that it too will be maintained. And it's roughly half the speed of nginx, which is plenty fast. (Few web application servers in the world are sitting there with nginx using all the CPU.)

It's a long slow process partially precisely because I intend to do this carefully, and discarding nginx is not something to be taken lightly, but long term, as I said, I want the C out of my stack.

wtarreau · on Aug 6, 2015

That's interesting because I'm seeing people do the exact opposite: since almost all security holes in web environments these days come not from the applications but from the myriad of frameworks and unauditable layers making fun use of objects all over the place, now using C or even ASM is the only way to limit the moving parts and to ensure that your code base doesn't change between two audits.

ddevault · on July 25, 2015

I'd be happy to give some more detailed thoughts. I write a lot of assembly myself [1], so hopefully that lends me some credibility.

I haven't read much of your TLS implementation, but I'm not a security researcher and I don't think I'm qualified to give an opinion on your particular implementation. However, there are some points to be made here. First of all, almost no one ships their own crypto, for a good reason. To trust that something is secure, you need to have lots of people working on it and lots of projects invested in it (something that OpenSSL and such have). "Many eyes makes bugs shallow" is the common phrase, but it holds truer when large companies with highly skilled engineers are putting their secrets and their customers' secrets on the line. Your implementation has the unfortunate problem of being written in assembly. While I don't think there's anything inherently wrong with assembly, many people don't share that opinion. People will be reluctant to contribute to it (how do people contribute, anyway?). Also, it is easier to make mistakes in assembly, and I would be surprised if there weren't several mistakes in your TLS implementation and in the rest of the server. This doesn't reflect poorly on you as a developer, but is instead a consequence of choosing assembly.

Assembly also inherits a lot of the common security problems C has (like buffer overflows), but makes them harder to identify. I would feel very uncomfortable exposing anything written in assembly to the public net, and doubly so if it used an unproven TLS stack written in the same. Other projects avoid the problem of untested crypto by using tested crypto from an external module like OpenSSL.

[1] https://github.com/KnightOS/kernel

2ton_jeff · on July 25, 2015

Agreed regarding general trust in any crypto stack. I've been doing commercial software development for 28 years now, and my company's products all reflect this. Whether I expect high-value security sites to use my software in production or not, well I certainly do not. Hardened stacks are few and far between, and OpenSSL can by no measure be deemed hardened (though certainly getting better of late thanks to all of the bug releases). Do I expect that my entire stack is 100% bug-free? No, but one of the niceties IMO of doing assembly language programming is that it is far less error tolerant in the ways you describe. I read up on all of the nasties re: security-related code, and the commonly accepted mitigation strategies were applied throughout.

Re: how do people contribute, it is on my list of things to do for github's linguist x86_64 support (which is why I didn't put it all on github to begin with).

At the end of the day, trust is a function of time and perceived scrutiny of the stacks at hand. We are getting there slowly but surely :-) Cheers!

e12e · on July 26, 2015

I'm sorry, you've held off on publishing on github due to missing source code highlighting? Or am I completely misunderstanding what you're saying (I think I am...)?

2ton_jeff · on July 26, 2015

Admittedly I haven't checked recently, but before I released 2 Ton Digital I did a few test github projects and they all looked horrific so yeah I left it out on purpose. It's been on my "someday when I am bored" list since then (to fix up linguist so it all looks half-decent), and also why all the "library as HTML" on 2ton.com.au is self-highlighted.

userbinator · on July 26, 2015

> Assembly also inherits a lot of the common security problems C has (like buffer overflows), but makes them harder to identify.

Actually, I'd say that it has advantages because the mindset is very different when writing Asm - it naturally forces you to think about things in a low-level and precise fashion, which keeps considerations such as buffer lengths more in the mind than higher-level languages that attempt to abstract it away.

Programming at the instruction level also allows much more fine tuning of instruction ordering and such to resist timing attacks, without any compiler optimisations getting in the way.
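For readers who haven't met the class of bug being discussed, here is a minimal sketch in Python. It only illustrates what a timing leak looks like; Python gives none of the instruction-level control described above. The names `leaky_equal` and `steady_equal` are made up for illustration, while `hmac.compare_digest` is the standard library's comparison intended to resist timing analysis.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Return at the first mismatch, so run time depends on how much matched."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # early exit: this is the data-dependent step
    return True


def steady_equal(a: bytes, b: bytes) -> bool:
    """Delegate to the standard library's timing-resistant comparison."""
    return hmac.compare_digest(a, b)
```

Both functions return the same answers; the difference is only in how much an observer could learn from how long the first one takes.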

anotherangrydev · on July 26, 2015

I don't know this firsthand, but I've heard from many people that had delved into OpenSSL code that it is an example of code where "many things could be made better", to put it lightly.

ddevault · on July 26, 2015

OpenSSL could be way better. That's why LibreSSL exists. But it is something that lots of people are looking at.

buster · on July 26, 2015

I think that although a lot of people used OpenSSL, only a few looked into its source code. Those who did might have been horrified, but since there was no real alternative they continued using it.

Only after massive security vulnerabilities and a lot of media attention did more people look at OpenSSL in detail and eventually decide to do something about it. Which mostly was "let's write a new library or fork it". Thus, LibreSSL, sodium, nacl and such.

What got us in the mess with OpenSSL was leaving a key component of many software projects to a struggling, small team. It's amazing how much open source relies on a few ancient programs written and maintained by a few people with little to no financial support (e.g. NTP, GPG).

ddevault · on July 26, 2015

> Only after massive security vulnerabilities and a lot of media attention more people looked at OpenSSL in detail and eventually decided to do something about it. Which mostly was "let's write a new library or fork it". Thus, LibreSSL, sodium, nacl and such.

Right, but those vulnerabilities _were_ found. I worry that they wouldn't be found in this. The only people who'd go looking are the people who see that a specific website is using it and want to exploit it.

buster · on July 26, 2015

That would only be true if those vulnerabilities occurred because someone found a bug in the source code. Given the horrible mess the openssl code is said to be, I'd argue that most vulnerabilities were found without source.

deoxxa · on July 26, 2015

Wasn't it that assumption ("lots of people are looking at [it]") that led to the current state of OpenSSL?

e12e · on July 26, 2015

From the documentation:

> BREACH/TIME/etc
>
> Both the BREACH and TIME attacks rely on measuring the size of compressed response bodies. Since rwasa supports dynamic content compression by default, the HeavyThing library's default setting for webserver_breach_mitigation is enabled and set to 48 bytes. For each rwasa response when TLS and gzip is active, this setting adds an X-NB header that contains a random 0..48 bytes that is hex-encoded to each response header. While this doesn't render response sizing attacks completely useless, it makes a would-be attacker's job much more difficult due to the highly variable response lengths.

It's my understanding that random padding doesn't in fact make the attacker's job "much more" difficult. Only a little more, or not at all?
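To make the quoted mechanism concrete, here is a minimal sketch of what such a header could look like. It is not rwasa's code. The function name is hypothetical, and reading "0..48" as inclusive at both ends is an assumption; only the header name `X-NB`, the 48-byte default and the hex encoding come from the documentation quoted above.

```python
import secrets

MAX_PAD_BYTES = 48  # default quoted above for webserver_breach_mitigation


def breach_padding_header(max_pad_bytes: int = MAX_PAD_BYTES) -> str:
    """Build an X-NB header line carrying 0..max_pad_bytes random bytes, hex-encoded."""
    # Assumption: both ends of the range are inclusive.
    pad_len = secrets.randbelow(max_pad_bytes + 1)
    return "X-NB: " + secrets.token_bytes(pad_len).hex() + "\r\n"
```

Because of the hex encoding, each random byte adds two characters, so the header value varies between 0 and 96 characters.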

Could you comment on how integrated the TLS stack is with the webserver? Normally I'd think that using some kind of dedicated SSL terminating proxy, either a new version of HAproxy -- or stunnel/stud or similar -- would make more sense than deploying a new TLS stack that hasn't been through any outside review?

That said, as mentioned by others here - openssl is clearly not a great example of a secure/good TLS implementation. I'm not sure there are any (yet). Hopefully libressl will become one. Personally I'd like to see a minimal library that combined a couple of AES/ECC primitives and implemented TLS 1.2+ only (No SSL), with a sane and clean API on top.

Something along the lines of NaCl but with a goal to support a subset of standard TLS with forward secrecy (and explicitly throw old clients under the bus, Android 2x be damned).

2ton_jeff · on July 26, 2015

> It's my understanding that random padding doesn't in fact make the attacker's job "much more" difficult. Only a little more, or not at all?

The BREACH attack verbiage at http://breachattack.com spells it out fairly clearly: by adding random bytes to all of the HTTP responses, it makes small compressed HTTP payloads impossible to determine whether guessed bytes were correct or not (well, depending of course on the size variable of the random bytes added).

> Could you comment on how integrated the TLS stack is with the webserver?

The TLS layer is entirely separate from the webserver layer. I built the epoll, TLS, SSH, webserver and client as "IO layers", such that they can be stacked together arbitrarily (imagine epoll/IPv4 listener -> TLS -> SSH -> TLS -> Webserver, perfectly doable, albeit a little nutty).
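A rough sketch of the stacking idea, for readers who think better in code. This is not how HeavyThing is implemented; every class and method name here is hypothetical, and the two toy layers (hex encoding and length-prefix framing) stand in for real ones such as TLS or SSH.

```python
from typing import Optional


class Layer:
    """One link in a chain; `lower` is the layer closer to the socket."""

    def __init__(self, lower: Optional["Layer"] = None) -> None:
        self.lower = lower

    def encode(self, data: bytes) -> bytes:
        return data

    def decode(self, data: bytes) -> bytes:
        return data

    def send(self, data: bytes) -> bytes:
        # Transform here, then hand the result down the stack.
        out = self.encode(data)
        return self.lower.send(out) if self.lower else out

    def receive(self, data: bytes) -> bytes:
        # Lower layers undo their transform first, then this one does.
        inner = self.lower.receive(data) if self.lower else data
        return self.decode(inner)


class HexLayer(Layer):
    """Toy transforming layer (hypothetical stand-in for something like TLS)."""

    def encode(self, data: bytes) -> bytes:
        return data.hex().encode("ascii")

    def decode(self, data: bytes) -> bytes:
        return bytes.fromhex(data.decode("ascii"))


class LengthPrefixLayer(Layer):
    """Toy framing layer: 4-byte big-endian length, then the payload."""

    def encode(self, data: bytes) -> bytes:
        return len(data).to_bytes(4, "big") + data

    def decode(self, data: bytes) -> bytes:
        size = int.from_bytes(data[:4], "big")
        return data[4:4 + size]


# The same layer type can appear more than once, in any order.
stack = HexLayer(lower=LengthPrefixLayer(lower=HexLayer()))
wire = stack.send(b"GET / HTTP/1.1")
assert stack.receive(wire) == b"GET / HTTP/1.1"
```

The point of the design is that no layer knows what sits above or below it, which is what makes an arrangement like TLS inside SSH inside TLS possible at all.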

e12e · on July 26, 2015

> it makes small compressed HTTP payloads impossible to determine whether guessed bytes were correct or not (well, depending of course on the size variable of the random bytes added).

Hm, ok. At least you didn't "just add some random padding" :-)

Thanks for the comment on structure. Might be nice to try and make an SSL/TLS terminating proxy as a separate binary, I guess.

As for the code, for someone new to fasm it wasn't immediately obvious that to build one had to assemble then link (fasm -m $((bignumber))[1] project.asm project.o && ld -o project project.o # optionally strip project). Might want to put that in a Readme/makefile/build.sh. I found the general recipe in the hello-example - but a short readme in the various project folders and/or at the top level wouldn't hurt.

[1] ed: from https://2ton.com.au/HeavyThing/#echoserver

fasm -m 262144 echo.asm && ld -o echo echo.o
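The two-step recipe above could be wrapped in a small build helper along these lines. This is only a sketch: the function names are hypothetical, the commands simply mirror the echo example, and it assumes that fasm writes `<name>.o` next to `<name>.asm`, as that example implies.

```python
import subprocess
from typing import List


def build_commands(name: str, memory_limit: int = 262144) -> List[List[str]]:
    """Return the assemble and link commands for <name>.asm, as in the echo example."""
    return [
        ["fasm", "-m", str(memory_limit), f"{name}.asm"],
        # Assumption: the assembler step produced <name>.o.
        ["ld", "-o", name, f"{name}.o"],
    ]


def build(name: str) -> None:
    """Run both steps, stopping at the first failure (the && in the recipe)."""
    for command in build_commands(name):
        subprocess.run(command, check=True)
```

Calling `build("echo")` would run the same two commands as the line above, provided fasm and ld are on the PATH.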

discardorama · on July 26, 2015

> I built the epoll, TLS, SSH, webserver and client as "IO layers"

I think you should just finish the job and implement the entire OS in assembly. ;-)

I'm kidding. To me, assembly programming has always seemed like a true art form. You're forced to think about everything, and if you can successfully fit all the pieces together properly, it's beautiful. Also, not many can hack it through assembly, so there's a huge selection bias too.

TheLoneWolfling · on July 27, 2015

Why can you not simply repeat the request a bunch of times and take the minimum length / 10th percentile / maximum / something along those lines?

It increases the number of requests required, yes, but I don't see why it makes it impossible.
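A small simulation of that idea, as a sketch under simplifying assumptions: the pad is uniform over 0..48 bytes and hex-encoded (so it adds an even number of characters), the attacker sees exact response lengths, and nothing else about the response varies between requests. The function names are made up for illustration.

```python
import random


def observed_length(true_length: int, rng: random.Random, max_pad_bytes: int = 48) -> int:
    """Length an eavesdropper sees: the real size plus a hex-encoded random pad."""
    return true_length + 2 * rng.randint(0, max_pad_bytes)


def estimate_true_length(true_length: int, requests: int, rng: random.Random) -> int:
    """Repeat the request and keep the smallest length observed."""
    return min(observed_length(true_length, rng) for _ in range(requests))


rng = random.Random(2015)
# Two guesses whose unpadded responses differ by a single byte.
for unpadded in (1000, 1001):
    print(unpadded, estimate_true_length(unpadded, 2000, rng))
```

Under these assumptions the minimum settles on the unpadded length once a zero-length pad has been drawn, so the padding multiplies the number of requests needed rather than hiding the one-byte difference.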

inglor · on July 25, 2015

Probably the number of eyes that read it. This stuff is insanely easy to get wrong in very subtle ways and reimplementing the stack always carries a risk.

This is why people usually prefer to stick to the "regular" libraries even if they're known to be slower.

lucio · on July 26, 2015

All "regular" libraries were, at some point, "new" libraries.