... Which begs the question: what good is an ORM if it does not prevent by design such issues? Here, we are essentially saying users of ORM must also have in their mind the SQL version. Or call an expert after the mess is done :/...
I find this question to be very weird. I have never seen ORMs as tools for completely abstracting away the database. I have always seen them as convenience APIs for using the database. They exist to make your code shorter, less repetitive, more readable, easier to reason about, and more maintainable; not to make you forget about the database altogether.
I also don't see ORMs as exclusive. It's fine to use ORMs for 95% of your use cases, but drop down to raw SQL for the remaining 5% where it's not a good match. That's still a win for the goals I mentioned above.
> I have never seen ORMs as tools for completely abstracting away the database.
Some are advertised that way. Entity Framework Code First, Migrations etc.
At our place, our DB dev team is larger than our api team, which in turn collectively dwarfs our front end teams. ORMs in this sort of environment have never really been given a chance, but I feel like it would be easy to justify them in many situations...
As an example of this, the influential DHH of Basecamp blogged saying just that: https://m.signalvnoise.com/conceptual-compression-means-begi... "Basecamp 3 has about 42,000 lines of code, and not a single fully formed SQL statement as part of application logic!"
Given that these are the people who wrote the Rails ORM, you'd expect that they know how to use the ORM to generate high quality SQL. Which is actually quite doable - in Rails / ActiveRecord you're much better served by knowing what happens for every ORM call, and the default development log prints every generated SQL query as well. Think it now also provides alerts when the queries are slow.
True, but my point was that that blog post supported the point that ORMs are sometimes promoted as a way to avoid needing to know SQL. The intended audience of that blog post was not people who write ORMs, it was to persuade people who are writing applications that they don’t need to learn how to use a database, that they only need to learn how to use an ORM.
Yeah, that's isn't going to work. Would want to use an ORM the way you'd use a bicycle - it won't necessarily let you do something you couldn't before, but it makes it easier. Using an ORM without knowing the SQL it generates is like learning to bicycle without learning to walk. You'll fall down at some point and then you're well and truly screwed.
They also let you compose SQL fragments in ways that would be a lot more difficult or less clear otherwise. In the case of Diesel, they let you typecheck the composition of those fragments. ORM hate is based on misunderstanding of the benefits those who do use them get from them.
I'm not very sure that Jooq and SQLAlchemy and other 'not quite ORMs' really help that much either.
I think, as the article suggests, that 'opaque' is as much the problem as the classic Object-Relational impedance mismatch, but I would posit that the opaqueness isn't just that the programmer can't _see_ what happens, but that they also don't _care_.
So I'd put my faith in tools instead. If programmers don't want to think about this, let the tools try instead.
The static analyser was interesting. Although static analysers have shallow comprehension, they were able to identify anti-patterns that were common enough to make it a useful linter. And of course their classic-mistakes approach can be applied to other languages too, even if each language and ORM needs its own corpus.
I sincerely hope that JetBrains and other IDEs bring this kind of analysis into their IDEs. I write a lot of non-Ruby code, and the JetBrain IDEs do keep suggesting nice 'code simplification' and 'generate boilerplate' checkers as I type.
But while they check my regex for well-formed-ness and colour my SQL but don't have any kind of meta analysis and comprehension. Missing opportunity.
You could take this further for normal non-DB code too - they could warn me when I have inefficient O(n) search in a loop and so on.
> You could take this further for normal non-DB code too - they could warn me when I have inefficient O(n) search in a loop and so on.
For phpstorm you have the EA extended plugins which gives lot of hints like that. I'm sure you could find the same kind of plugins or write one backed by a static analysis tool for other languages.
ORMs are good for 90% of your use cases, the rest I'm happy to use SQL for.
I could attempt to deconstruct the ORM problem, but Martin Fowler had a nice article [1] on replying to "ORM hate" a few years back which is worth the read for folks that question the usefulness of ORMs (and I believe questioning things is a healthy thing to do!)
As much as I like Fowler, he's missing an important option in his description: don't try to do object relational mapping generically. Consider that you have business model objects. In the UI, you create a "view" of that data. Of course the UI view is not related in structure to the business model object. That would be absurd. We need to use collections of model objects and present the data in a flexible. Just as absurd would be to build a framework for automatically mapping the data in the model layer to the UI view.
Now consider that your DB is just like your UI. It's a view of your model objects. Just in the same way, there is no reason why your model objects should be related structurally to your DB representation of the data. And in exactly the same way there is no particular reason why you should have a way to automatically map to and from your DB layer to your model objects.
In many cases, it will be less complex to build a bespoke OR mapping rather than try to find a system that will do it generically.
Computers are fast. Correctness is more important than performance. Most businesses fail.
With an ORM you write the first working version of your product quickly, validate product-market fit, and maybe spend a small amount of time profiling and optimising eventually when you need to scale. That's a good tradeoff.
> Which begs the question: what good is an ORM if it does not prevent by design such issues?
This isn't as terrible as you make it out to be. For example, general purpose programing languages don't prevent the programmer writing a O(2^n) algorithm; yet they are useful.
Right up until the point where you have to hack around the ORM to get performance where it needs to be. Then you suddenly have more complex and harder to maintain "magic" code than if you'd just written some simple-but-boring boilerplate and SQL from the start.
I'm wading through the tedious and boring process of writing a data access layer for an application at the moment. It's repetitive, there's lots of error-checking, and testing it is a pain in the arse. My big consolation, though, is that in a year's time when I need to optimise the queries because we're getting load on it, it'll be easy to understand and simple to change.
ORMs don't make code easier to understand, they make it less boring to write. Those are not the same things.
I guess this depends on the framework in use, but for example, in Hibernate you add a single "include" clause to avoid n+1 queries. It is only "magic" if you don't understand how it works, but then your SQL would be just as inefficient.
Any ORM I know allows you to easily drop down to raw SQL when you need it - but in the majority of cases you never need to. Writing everything manually in SQL with tedious boilerplate wrappers because you might need to manually optimize some queries at some point in the future seems like a massive violation of the YAGNI principle.
> Then you suddenly have more complex and harder to maintain "magic" code than if you'd just written some simple-but-boring boilerplate and SQL from the start.
Not my experience at all. Rather you have your straightforward ORM code with maybe a couple of "magic" hint annotations here and there where you need them. Your non-performance-critical queries are plain and simple, your performance-critical queries are less plain but remain readable. Whereas if you write boilerplate SQL the whole time all your queries are "readable" in theory, but there's just so much of them that you can never actually understand more than a fraction and have no way of knowing which differences are important and which are accidents.
(Unless, of course, you throw away the whole ORM infrastructure at the first sign of a performance problem and insist that you absolutely have to run custom SQL directly, disable entity caches, and so on. But don't do that.)