On one hand this is great news, and as a Fortran user I'm pretty happy about it.
On the other, I'm a bit worried that once the LLVM backend takes off for real, so much focus will concentrate on it that development of other compilers (PGI, gfortran, Intel Fortran) will stall or even be dropped, and we will wake up in a future where the LLVM behemoth has swallowed all the competition.
To be clear, this concern isn't limited to Fortran; it applies to other languages too. Of course I'm very grateful for the LLVM project; it's absolutely great. I think the idea of focusing the talent pool from many languages is brilliant and beneficial for everyone. The only problem is when it becomes so dominant that it's the only option. I'm very happy with the variety of compilers and backends we have now.
I browsed their github and their linked docs, and it wasn't obvious to me what benefit cranelift has over LLVM. There was one section on implementation differences, but I'm not familiar/smart enough to infer from that what use cases cranelift enables.
I believe compilation speed is much better, for one immediate practical benefit; I think there are others, but I'm not sure. Many, many people have tried to use LLVM as a JIT and either given up or had to add significant infrastructure, such as additional IRs, on top to make it work.
So are those the motivations/design-goals or just nice properties? In particular, is JIT/fast-compilation speed incidental to the relative youth of the project? I.e., presumably fewer optimizations vs LLVM allows the compiler to be faster?
It's design goals and tradeoffs. Cranelift was a lightweight JIT compiler first and has been designed that way, while LLVM is a heavyweight AOT compiler and was designed that way.
I really doubt Intel Fortran will stall. ifort is pretty big business for Intel and the HPC space. It's probably one area where LLVM's advantages don't really matter.
I've been working on LoopVectorization in Julia, and benchmarking against a few compilers.
Intel's compilers are far ahead of GCC and LLVM in vectorizing loops.
LLVM (even with Polly) fares worst in my benchmarks, so it does not look like a future where Intel is obsolete is on the horizon yet.
However, I am excited for Flang and its FIR MLIR-dialect. I haven't benchmarked MLIR-optimized code at all. I'm sure that will change things, but until I test I have no idea by how much.
Note that a lot of Intel’s advantage in SIMD, historically, came from optimization modes that simply (unsafely) assume that there are no dependencies that would block vectorization (instead of doing the conservative analysis like GCC and Clang). It makes for impressive benchmarks, but is borderline-unusable in much real code.
Intel’s compiler team has actually suggested some patches adding the same mode to LLVM, though I’m not sure what the current status is, since the initial reaction was not overwhelmingly positive.
Interesting.
Would this be safer in a language like Fortran, where (without aliasing between separate arrays), loop dependencies should be more obvious?
I think it'd be nice to be able to activate this mode through pragmas.
Does "#pragma omp simd" result in more aggressive use of blocking?
I don't know how omp simd relates to "blocking", but it can slow code by a factor of two compared with GCC optimizing the loop normally, because you get AVX2/AVX-512 but not FMA. (Observed with a generic C GEMM, which got about 60% of the performance of the micro-optimized one after removing such pragmas and just letting gcc do its -Ofast thing.)
My experience optimising C++ and Fortran with Intel compiler 16 is that it is safe regarding dependencies. Often #pragma ivdep is required to vectorise loops it otherwise refuses to.
This "leagues ahead" is simply not true experimentally. I ran the Polyhedron benchmarks on SKX with profile-guided optimization. Pre-release gfortran 10 was (insignificantly) faster on the bottom line than beta ifort (from oneapi). That was reversed for gfortran 8 v. ifort 18.
It's also not true that HPC performance is generally dominated by code generation rather than libraries and communication costs, but obviously mileage varies.
There are no great secrets to the HW or vectorization. It’s mostly a question of willingness to devote resources to the problem and hire folks (e.g. Aart Bik, who literally wrote the book on vectorization at Intel and is now working on the TF compiler team at Google and contributing to LLVM).
There will be cases for which it's not so, but every time recently that I can remember someone saying ifort/icc vectorizes some numerical code profitably, and gfortran doesn't, and I had the code, I got GCC vectorizing it with equivalent flags. ifort defaults to incorrect behaviour for floating point (something like gcc -ffast-math). [Edit: As I now see it says in a sibling.]
Let’s say your hardware cost $1 million. Then, a compiler that brings you a 1% speed increase compared to a free one should be break-even at a license price of $10,000, possibly more if you factor in the cost to run that machine. $1 million computers are rare, but not that rare (The fastest in the top-500 cost over $100 million), so, _if_ you can build a compiler that consistently does that, there should be a profitable business there.
I would think margins on CPU sales and on consulting would dwarf that 1% margin, though. Because of that, I think it's more "winning the benchmarks game, because that's what sells hardware there" than revenue from compiler sales that keeps Intel's compiler alive.
Both. They are able to optimize things on Intel processors that no one else can, since they are the ones who make the processors and have more proprietary information. It's closed-source freeware, but they make money on support.
The great thing is anyone could take the output of the Fortran LLVM front-end, and then compile that however they wanted with their own compiler technology. LLVM doesn't need to mean consolidation of backends. It can just mean you get a front-end for free, if that's all you want.
It's stated that LLVM IR is not stable (https://llvm.org/docs/DeveloperPolicy.html#ir-backwards-comp...), and in my relatively limited experience with it, there have been enough differences between every minor release that decoupling even official frontend and backend versions is not safe or reliable (much less bolting on and maintaining an unofficial backend implementation). If bundling/pinning the flang/clang version to the LLVM IR version supported by the alternate backend is feasible (which seems sane), then I definitely agree this is an awesome capability (cf. clang-on-Sulong), but given how unapologetically volatile LLVM IR is, I wouldn't exactly call it "free" :p (though still a lot better than writing a C++ frontend).
Not quite. Every compiler typically comes with its own runtime library (RTL), so the output of one frontend is likely to contain RTL calls specific to that compiler. For example, gfortran makes calls to its own RTL, which is not the same RTL the Intel compiler uses.
Ah, ok, yes. But in my OP I didn't really mean technical problems, rather organisational/social ones.
I mean that if all the top-class talent in compiler technology focuses on LLVM, there probably won't be many people both willing and able to write alternative backends.
There will always be somebody interested in doing something else. Somebody decides that something can't be represented nicely in LLVM IR and creates their own thing. Somebody can't work with the LLVM folks for social reasons and creates their own thing. There are many such reasons, and so at some point something successful will challenge them.
For the C++ world the competition by LLVM/clang was fruitful and triggered lots of improvements in gcc/g++. Produced code got faster, diagnostics better etc.
Sure, Fortran is a different area, with less commercial interest and other challenges. (LLVM is pushed by Apple and Google for non-Fortran needs; it is conceivable that they push decisions which hinder Fortran, whereas GCC has different goals and might in the long term be more receptive to Fortran's needs?)
> Sure, Fortran is a different area, with less commercial interest and other challenges.
Actually, it's quite the opposite. You're never going to make money selling a C/C++ compiler, but you can make loads of cash selling a Fortran compiler. It's just that the Fortran compiler is likely to come with your supercomputer.
C and C++ compiler vendors targeting embedded or security-critical domains, or Intel, PGI, Embarcadero, Microsoft, Oracle, HP, IBM, Unisys, and CodePlay, might disagree.
There is plenty of money to be made with commercial C and C++ compilers; it just depends on the customer base, and on features missing from clang/gcc in the overall tooling experience.
I've been thinking a lot about vectorizing loops recently, especially on AVX512 systems.
I've mostly been doing microbenchmarks, and I realize that microbenchmarks might not give a realistic full-program view.
The thing that struck me is that there is a roughly 9 microsecond period during which AVX instructions (AVX2 and AVX-512) operate at a small fraction of normal throughput before the CPU decides to transition to a lower-frequency state.
If most of your code is running in the L0 (max clock speed) license, then any vectorized code you hit will run at 1/4 speed for about 9 microseconds. If it keeps running past that point, the CPU transitions to the new license with an ~11 microsecond halt. The loop has to keep running for a long time to amortize this penalty.
Then, once the function returns to the rest of your scalar code, it'll eventually have to speed up. Basically, large programs are probably fastest if they stay in relatively the same state.
By having a large scalar window, like LLVM does, it's less likely to change. Most loops are probably fairly short, and most code is also scalar, therefore you'll want the CPU to generally stick to scalar mode. Only if loops are very long and likely to take milliseconds would you want them to be vectorized.
Or if they're surrounded by other SIMD code, but that's a sort of global/whole-program state you cannot infer while optimizing a single function.
It is likely best to go lean very heavily to one side in your preference of scalar vs vector, but which side is better varies by program. LLVM is essentially leaning heavily toward scalar in their loop behavior, which is probably best for most C and C++ programs. Many Fortran programs might prefer vector.
My own (Julia) code does. But I have a hard time talking about Julia or Fortran programs in the abstract. I tend to ensure vectorization. Most programmers don't, so even in these languages, they're likely to prefer and benefit from different defaults than I do.
>On the other, I'm a bit worried that once the LLVM backend takes off for real, so much focus will concentrate on it that development of other compilers (PGI, gfortran, Intel Fortran) will stall or even be dropped, and we will wake up in a future where the LLVM behemoth has swallowed all the competition.
It will likely take years, if not decades, for the LLVM-based Fortran compiler to match the optimization ability of the compilers on HPC systems. I'll also clarify that the previous statement was an understatement: gfortran is one thing, but the PGI and Intel compilers won't stall or fall out of use.
Don’t worry; people said this about GCC (especially once Cygnus got going) and now llvm is challenging GCC. The existence of llvm has caused gcc to improve as well.
Gcc (well, g++) is still my primary compiler, though I do run my code through clang as well, since gcc and llvm catch different bugs. I can imagine llvm becoming my primary compiler at some point, but it will still be a few years.
As mostly a user of commercial software, it hardly affects me, but it is going to be interesting to watch if all copyleft software gets pushed into a niche and we get back to what was basically shareware and PD on the home-computing scene.
I am predicting that non-copyleft licenses will replace all uses of copyleft licenses, and everyone will get basic functionality in open source with the juicy bits only available in the commercial versions, just like in the old days before the rise of the GPL.
The evolution of BSD's adoption by commercial entities shows the way.
I hope you're wrong. The strength of Linux, Apache, and the AGPL points the other way. It's almost impossible to get a chip from the likes of Rockchip or Mediatek without a Linux and/or Android distro attached. GPL-style licenses on IoT systems may be an important part of the only way to avoid an apocalypse of compromised crapware IoT devices.
The more serious problem is the weaponization of open source by the big actors, using it to simultaneously generate a scorched-earth moat around their respective castles while hiding proprietary extensions behind a network connection. I do not consider this to have been good for the software world at all.
As I'm sure you know your prediction has been made for 30 years. Doesn't mean it can't come true but unlike you I don't see the tide moving that way.
The Affero license is useful for things accessed over the web, which of course became quite important, and thankfully the GPL is catching up.
I wrote the library license back in the early 90s because of a similar shift (Unix and Windows were late adopters of the philosophy of libraries, not just programs, for non-system code, but once they finally started to get on board the GPL had to catch up).
I haven't seen that talk, but from a previous one, Flang is closely related to the PGI compiler, and F18 is to escape that.
I wouldn't miss the grief associated with ifort, which doesn't live up to the mythology. The salient feature of the PGI compiler is probably the offloading, and I assume that's Nvidia-specific, but I don't know how it compares with current, and upcoming, GCC support. The research computing world would be a better place if the money spent on proprietary compilers sponsored GCC improvements instead.
From what I've seen, PGI's offloading (with OpenACC) still delivers better performance than GCC, although GCC is improving. Also PGI usually give better error messages or comments about what it's doing when offloading.
But yes offloading with PGI is for Nvidia cards only, and GCC contributors are working on adding support for offloading to AMD GPUs.
LLVM's biggest weak spot as a compiler is its lack of good loop nest optimizations. And that hurts Fortran code a lot more than C/C++ code, because your 6-level loop nest code is more likely to be written in Fortran instead of C/C++.
A further point to make is that LLVM IR is strongly biased towards C, and this is especially true when it comes to the memory model. All memory has to be lowered to access via (essentially typeless) pointers, with optional aliasing qualifiers provided via a noalias parameter attribute (which breaks a lot, because restrict isn't all that common in C), and TBAA. And all higher-order information has to be reverse engineered from this starting point.
As far as I know, the main speed advantage is that Fortran supports arrays, which are known to not alias.
C/C++ does not have alias-annotations built-in. Although with template constructs like done in e.g. the Eigen library, alias-optimal code can be generated.
My main hope for flang is that it motivates LLVM to fix the Rust-relevant non-aliasing bugs. However, is it actually known that flang/f18 emits IR with the same patterns that exposed the Rust-relevant LLVM bugs?
Other posters mentioned "restrict". Rust is able to basically use "restrict" on steroids because of how much more info has, so much so that it's brought up bugs in LLVM that are not fixed yet.
I just thought of a stupid question. If gcc and LLVM both support Fortran, how hard would it really be to add real arrays to C? At least as an option, i.e. a flag like --fortran_arrays.
Fortran defaults to no-alias, C and C++ default to may-alias. The annotations exist, but you have to apply them all over the place to get the benefit. It’s doable, but the Fortran model is less error-prone for many users.
Actually, Fortran defines "storage association" rules, so code that doesn't obey them isn't conforming Fortran. I'm not sure about the error-prone, though, from long experience and users arguing with a compiler maintainer. One problem is lack of static checking.
Numeric code rarely (almost never, actually) requires aliasing, so defaulting to no-alias is less error prone for those users. If you’re writing an OS or runtime library, you want aliasing support and static checking, but that’s mostly out-of-scope for Fortran today.
The Fortran standard just doesn't talk about "aliasing" (in that context), and "defaulting" suggests you can turn off the storage association rules somehow.
I maintain from long research computing experience (at least back to the days of Alliant) that the rules are highly error prone in practice for users, who frequently deny they even exist, and blame the compiler bugs. (I'm surprised if that's not the case more generally.) I'm not saying they shouldn't exist, or that code needs to contravene them.
Yeah, I'm using the C nomenclature ("aliasing" vs "association") because HN commenters are broadly unfamiliar with Fortran.
What I mean by "default" is that if I declare a function with two array arguments in Fortran, with the simplest possible syntax, the compiler assumes that they do not alias (are not associated). By contrast, if I declare a function with two pointer arguments in C, with the simplest possible syntax, the compiler assumes that they may alias (are associated in some unspecified manner).
The C semantics are certainly safer, but they lead to lots of "Fortran is faster than C" blog posts by people who either don't know about or simply don't want to use the annotations.
Flang's code is somewhat slower than gfortran's, comparing the geometric-mean bottom line of the Polyhedron benchmarks on SKX with profile-directed optimization. I probably won't argue about whether that's a reasonable comparison.
What exactly is the benefit of having multiple compilers? I'm not well versed in the area, but it seems like having one compiler is better from the perspectives of standardization, having a single way of handling undefined behaviors, etc?
Competition is a strong motivator to improve areas your implementation is bad at. E.g. with gcc/clang an often-mentioned example is error messages: They're not a benchmark you can optimize for, but turns out users really like helpful error messages and will start using a different compiler for them.
Actual standardization (as in, a document describing what's supposed to happen) is helped by multiple implementations, because conflicts between them help uncover unclear parts. This in turn helps new implementations get off the ground when there's a need for them. Some standards groups require multiple independent implementations of a feature to exist before it is allowed to be released as part of a standard.
Different implementations might be different enough that some changes are easier to make in one than the other. This makes it easier to test these changes, the other implementation(s) can then decide if the gains are worth their effort.
It provides an out if one project resists changes for human/political reasons.
I wouldn't be surprised if there's a few unusual architectures around that use Fortran but aren't handled by LLVM.
The ability to make it proprietary is in fact the reason why Apple didn't adopt gcc and became the main supporter of clang/llvm instead. So the issue is already here. Fortunately Apple's ambitions only center around keeping full control of the Apple walled garden (the apple main building is a wonderful symbol of it). And should a strong proprietary competitor outside of Apple's ecosystem emerge, people can always fork clang and put it under GPLv3 or similar.
If that really were the case, why did Chris Lattner try to give clang/llvm to the FSF and merge it into gcc? That apparently failed only because Stallman didn't notice the emails. After that, Apple started pushing clang/llvm, as in a way they had been shunned by gcc.
The rest of your comment is... way too political towards gpl.
Chris Lattner suggested it back when gcc was first working on LTO, and clang didn't exist. 2004 or 2005-ish, IIRC. Back then, he was still working at UIUC.
Apple pushed clang as a competitor to gcc much later, closer to 2010-ish.
IIRC the point of fortress was to win DARPA's high productivity computing challenge, having failed at that it just remained a random research language.
Which is a pity, because it had some very interesting ideas, imvho.
Looks like there is a Fortran compiler that already emits MLIR, with support for a few OpenMP constructs and 2 SPEC CPU 2017 benchmarks running: https://github.com/compiler-tree-technologies/fc
I'm excited to see more front ends embracing MLIR.
A lot of interesting new languages (Julia, Rust, etc.) were empowered by being able to take advantage of LLVM as a backend, so I can only imagine what new languages might spring up around the heterogeneous capabilities of MLIR, not to mention existing languages targeting it as well.
Fortran has language-level support for things important for numerical scientific computing (complex numbers, multidimensional arrays, etc.), and it has had them since the beginning in the 1950-60s.
The convenience is not really matched by C or C++, where similar features were added much later via language extensions or third-party libraries, resulting in more complicated usage, fragmentation, and interoperability problems. Newer languages have similar issues too, so for the user base that uses Fortran there's a lack of viable competitors.
Also numerical codes can be trickier to write than you think. Rounding errors and other treacheries of the floating point approximation to the 'real' numbers can eat you alive.
If you want to do common numeric operations, there is often a FORTRAN code that is battle-tested and performance-tuned, and it is usually not hard to call from C, Java, Python, or some other language.
Fortran’s numerics model is actually somewhat weaker than C’s; if precise rounding behavior is your main concern C (while still not ideal) is a better option.
I write mostly C++ and Scheme and last year I wrote my first piece of code in modern Fortran (I had some experience porting/integrating old F77 libs). It's trivial to interface with C and lets you do numeric stuff immediately and without libraries using a plain syntax. Compared to C++ with template array libraries it compiles much faster and doesn't get in your way nearly as much. It has a module system... Compared to numpy/matlab/etc it takes some adjusting b/c it's still a lowish level language (errors lead to segfaults, etc.) but a lot less than going to C/C++.
It's still around in the high-performance scientific computing world. Feature-wise, C99 basically looks like playing catch-up with Fortran to better compete in that space (eg restrict, complex numbers, type-generic math functions, stack-allocated variable length arrays).
It's used in science (physics, chemistry, astronomy, etc.). Just as C++ is used not because it's a nice language but because of inertia and path dependency, the same goes for Fortran.