Flang: The Fortran frontend of LLVM [video] (fosdem.org)
135 points by pjmlp on Feb 9, 2020 | hide | past | favorite | 96 comments


On one hand this is great news, and as a Fortran user I'm pretty happy about it.

On the other, I'm a bit worried that once the LLVM backend takes off for real, so much focus will concentrate on it that development of the other compilers (PGI, gfortran, Intel Fortran) will eventually stall or even be dropped, and we will wake up in a future where the LLVM behemoth has swallowed all the competition.

This concern is not limited to Fortran; it applies to other languages too, to be clear. Of course I'm very grateful for the LLVM project, it's absolutely great. I think the idea of pooling talent from many languages is brilliant and beneficial for everyone. The only problem is when it becomes so dominant that it's the only option. I'm very happy with the variety of compilers and backends we have now.


I'm hoping for Cranelift (https://github.com/bytecodealliance/cranelift) to deliver a more JIT-friendly alternative to LLVM (Though it can do AOT compilation too).

Another very interesting project, even if it is slightly intimidating in scope, is GraalVM (https://www.graalvm.org).


I browsed their github and their linked docs, and it wasn't obvious to me what benefit cranelift has over LLVM. There was one section on implementation differences, but I'm not familiar/smart enough to infer from that what use cases cranelift enables.


I believe compilation speed is much better, for one immediate practical benefit. I think there are others but not sure. Many many people have tried to use LLVM as a JIT and either given up, or had to add significant infrastructure, such as additional IRs, on top to make it work.


So are those the motivations/design-goals or just nice properties? In particular, is JIT/fast-compilation speed incidental to the relative youth of the project? I.e., presumably fewer optimizations vs LLVM allows the compiler to be faster?


It's design goals and tradeoffs. It was a light JIT compiler first, and has been designed that way, while LLVM is a heavy AOT compiler and was designed that way.


I really doubt Intel Fortran will stall. ifort is pretty big business for Intel and the HPC space. It's probably one area where LLVM's advantages don't really matter.


I've been working on LoopVectorization in Julia, and benchmarking against a few compilers. Intel's compilers are far ahead of GCC and LLVM at vectorizing loops. LLVM (even with Polly) fares worst in my benchmarks, so a future where Intel is obsolete does not look to be on the horizon yet.

However, I am excited for Flang and its FIR MLIR-dialect. I haven't benchmarked MLIR-optimized code at all. I'm sure that will change things, but until I test I have no idea by how much.


Note that a lot of Intel’s advantage in SIMD, historically, came from optimization modes that simply (unsafely) assume that there are no dependencies that would block vectorization (instead of doing the conservative analysis like GCC and Clang). It makes for impressive benchmarks, but is borderline-unusable in much real code.

Intel’s compiler team has actually suggested some patches adding the same mode to LLVM, though I’m not sure what the current status is, since the initial reaction was not overwhelmingly positive.


Interesting. Would this be safer in a language like Fortran, where (without aliasing between separate arrays), loop dependencies should be more obvious?

I think it'd be nice to be able to activate this mode through pragmas.

Does "#pragma omp simd" result in more aggressive use of blocking?


I don't know how omp simd relates to "blocking", but it can slow code by a factor of two compared with GCC optimizing the loop normally, because you get AVX-512 but not FMA. (Observed with a generic C GEMM, which got about 60% of the micro-optimized one after removing such pragmas and just letting gcc do its -Ofast thing.)


My experience optimising C++ and Fortran with Intel compiler 16 is that it is safe regarding dependencies. Often #pragma ivdep is required to vectorise loops it won't otherwise accept.


I'd say it's more than merely far ahead. When it's your hardware and you've been doing HPC optimization for decades, you are leagues ahead.


This "leagues ahead" is simply not true experimentally. I ran the Polyhedron benchmarks on SKX with profile-guided optimization. Pre-release gfortran 10 was (insignificantly) faster on the bottom line than beta ifort (from oneapi). That was reversed for gfortran 8 v. ifort 18.

It's also not true that HPC performance is generally dominated by code generation rather than libraries and communication costs, but obviously mileage varies.


There are no great secrets to the HW or vectorization. It’s mostly a question of willingness to devote resources to the problem and hire folks (e.g. Aart Bik, who literally wrote the book on vectorization at Intel and is now working on the TF compiler team at Google and contributing to LLVM).


There will be cases for which it's not so, but every time recently that I can remember someone saying ifort/icc vectorizes some numerical code profitably, and gfortran doesn't, and I had the code, I got GCC vectorizing it with equivalent flags. ifort defaults to incorrect behaviour for floating point (something like gcc -ffast-math). [Edit: As I now see it says in a sibling.]


Is ifort "big business" in terms of direct revenues from sales of the compiler or because it makes Intel hardware a faster platform for HPC?


Let’s say your hardware cost $1 million. Then, a compiler that brings you a 1% speed increase compared to a free one should be break-even at a license price of $10,000, possibly more if you factor in the cost to run that machine. $1 million computers are rare, but not that rare (The fastest in the top-500 cost over $100 million), so, _if_ you can build a compiler that consistently does that, there should be a profitable business there.

I would think margins on CPU sales and on consulting would dwarf that 1% margin, though. Because of that, I think it's more "winning the benchmarks game, because that's what sells hardware there" than revenues from compiler sales that keeps Intel's compiler alive.


Both. They are able to optimize things on Intel processors that no one else can, since they are the ones that make the processors and have more proprietary information and such. It's closed-source freeware, but they make money on support.


The great thing is anyone could take the output of the Fortran LLVM front-end, and then compile that however they wanted with their own compiler technology. LLVM doesn't need to mean consolidation of backends. It can just mean you get a front-end for free, if that's all you want.


It's stated that LLVM IR is not stable (https://llvm.org/docs/DeveloperPolicy.html#ir-backwards-comp...), and in my relatively limited experience with it there have been enough differences between every little release to make decoupling even official frontend and backend versions unsafe and unreliable (much less bolting on and maintaining an unofficial backend implementation). If bundling/pinning the flang/clang version to the LLVM IR version supported by the alternate backend is feasible (which seems sane), then I definitely agree this is an awesome capability (clang-on-Sulong). But given how unapologetically volatile LLVM IR is, I wouldn't exactly call it 'free' :p (though still a lot better than writing a C++ frontend).


Is it really that simple (I don't know how decoupled it is.) ? There have been multiple Fortran front ends over the years, albeit not for F2018.


Not quite. Every compiler typically comes with its own runtime library (RTL). Thus, an output of one frontend is likely to have RTL calls specific to that compiler. For example, gfortran makes calls to its RTL, which is not the same RTL used by Intel compiler.


I'm well aware of Fortran runtimes. The question was getting at whether the front end is completely independent of the middle(?) end.


I think so - I’ve worked on an LLVM backend completely independent of LLVM.


Oh maybe I missed it.

Is there a way to compile LLVM IR with something other than LLVM? Is this what you mean by output of LLVM front-end output?


I do believe this is what OP meant. The LLVM IR is very well documented, and can in theory be plugged into a back-end that supports it.


Ah, ok yes. But in my OP I didn’t really mean technical problems, rather organisational/social.

I mean that if all the top-class talent in compiler technology focuses on llvm, there probably wouldn't be many people both willing and able to write alternative backends.


There will always be somebody interested in doing something else. Somebody decides that something can't be represented nicely in LLVM IR and creates their own thing. Somebody can't work with LLVM folks for social reasons and creates their own thing. There are many reasons, and thus at some point something successful will challenge them.

For the C++ world the competition by LLVM/clang was fruitful and triggered lots of improvements in gcc/g++. Produced code got faster, diagnostics better etc.

Sure, Fortran is a different area, with less commercial interest and other challenges. (LLVM is pushed by Apple and Google for non-Fortran needs - it is thinkable that they push decisions which hinder Fortran, whereas gcc has a different goal and might long-term be more receptive to Fortran needs?)


> Sure, Fortran is a different area, with less commercial interest and other challenges.

Actually, it's quite the opposite. You're never going to make money selling a C/C++ compiler, but you can make loads of cash selling a Fortran compiler. It's just that the Fortran compiler is likely to come with your supercomputer.


C and C++ compiler vendors targeting embedded or security-critical domains - Intel, PGI, Embarcadero, Microsoft, Oracle, HP, IBM, Unisys, CodePlay - might disagree.

There is plenty of money to be made in commercial C and C++ compilers; it just depends on the customer base, and on features missing from clang/gcc in the overall tooling experience.


I think that is possible.

I've been thinking a lot about vectorizing loops recently, especially on AVX512 systems. I've mostly been doing microbenchmarks, and I realize that microbenchmarks might not give a realistic full-program view.

LLVM (through Julia and Clang) has a striking performance pattern as sizes vary: https://discourse.julialang.org/t/ann-loopvectorization/3284...

It is fast at multiples of 32, but performance degrades in between. This is because it vectorizes loops by creating two loops:

  1. 4x unrolled and vectorized (with double precision and AVX512, that translates to 4 * 8 = 32 loop iterations)
  2. Scalar loop.
By avoiding that pattern, it was easy to get much better performance at most sizes in a lot of simple cases, like dot products.

In wondering about why LLVM's decisions made sense for them, I'm currently leaning towards AVX transition penalties being a big factor. Recently shared on HN: https://travisdowns.github.io/blog/2020/01/17/avxfreq1.html

The thing that struck me is that there is a roughly 9 microsecond period where AVX instructions (AVX2 and AVX-512) operate at a small fraction of normal speed before the CPU decides to transition to a slower state.

If most of your code is running in the L0 (max clock speed) license, then any vectorized code you run into will execute at 1/4 speed for about 9 microseconds. If it keeps running past that, it'll transition after an 11 microsecond halt. It'll have to keep running for a long time to amortize this penalty.

Then, once the function returns to the rest of your scalar code, it'll eventually have to speed up. Basically, large programs are probably fastest if they stay in relatively the same state.

By having a large scalar remainder window, like LLVM does, the state is less likely to change. Most loops are probably fairly short, and most code is also scalar, so you'll want the CPU to generally stick to scalar mode. Only if loops are very long and likely to take milliseconds would you want them vectorized.

Or if they're surrounded by other SIMD code, but that's a sort of global/whole-program state you cannot infer while optimizing a single function.

It is likely best to lean very heavily to one side in your preference of scalar vs vector, but which side is better varies by program. LLVM is essentially leaning heavily toward scalar in its loop behavior, which is probably best for most C and C++ programs. Many Fortran programs might prefer vector. My own (Julia) code does. But I have a hard time talking about Julia or Fortran programs in the abstract. I tend to ensure vectorization. Most programmers don't, so even in these languages, they're likely to prefer and benefit from different defaults than I do.


You should try ISPC if you are this concerned about vectorization. You can try it out on godbolt.org; it makes controlling vectorization much easier.


I wonder if this means LLVM should behave differently on AMD's Zen 2, which has full speed AVX2 (but no AVX512).


> On the other, I'm a bit worried that once the LLVM backend takes off for real, so much focus will concentrate on it that development of the other compilers (PGI, gfortran, Intel Fortran) will eventually stall or even be dropped, and we will wake up in a future where the LLVM behemoth has swallowed all the competition.

It will likely take years, if not decades, for the LLVM-based Fortran compiler to match the optimization ability of the compilers on HPC systems. I'll also clarify that the previous statement was an understatement: gfortran is one thing, but the PGI and Intel compilers won't stall or fall out of use.


I second that.


Don’t worry; people said this about GCC (especially once Cygnus got going) and now llvm is challenging GCC. The existence of llvm has caused gcc to improve as well.

Gcc (well, g++) is still my primary compiler, though I do run my code through clang/llvm as well to catch different bugs. I can imagine llvm becoming my primary compiler at some point, but it will still be a few years.


As mostly a user of commercial software, it hardly affects me, but it is going to be interesting to watch copyleft software get pushed into a niche while we get back what was basically the shareware and PD model of the home computing scene.


I'm not sure I understand. Are you predicting that open source is a passing fad, or that is will paint itself into a corner?


I am predicting that non-copyleft licenses will replace all uses of copyleft licenses, and everyone will get basic functionality in open source with the juicy bits only available in the commercial versions, just like in the old days before the rise of the GPL.

The evolution of BSD's adoption by commercial entities shows the way.


I hope you're wrong. The strength of Linux, Apache, and AGPL points the other way. It's almost impossible to get a chip from the likes of Rockchip or Mediatek without a Linux and/or Android distro attached. GPL-style licenses on IoT systems may be an important part of the only way to avoid an apocalypse of compromised crapware IoT devices.

The more serious problem is the weaponization of open source by the big actors, using it to simultaneously generate a scorched-earth moat around their respective castles while hiding proprietary extensions behind a network connection. I do not consider this to have been good for the software world at all.

As I'm sure you know, your prediction has been made for 30 years. That doesn't mean it can't come true, but unlike you, I don't see the tide moving that way.


Android proves exactly my point of view, especially AOSP versus what is running on this phone.

The Linux kernel is the only surviving piece of GPL code on modern Android.

Just wait when Fuchsia becomes mature enough.


Is AGPL strong? I can't come up with many examples off the top of my head as much as I can think of, say, MIT/GPL/LGPL.

I mean, it has been around much less time but does not seem to be an especially common choice.


The Affero license is useful for things accessed over the web, which of course became quite important; thankfully the GPL is catching up.

I wrote the library license back in the early 90s because of a similar shift (Unix and Windows were late adopters of the philosophy of libraries, not just programs, for non-system code, but once they finally started to get on board, the GPL had to catch up).


I haven't seen that talk, but from a previous one, Flang is closely related to the PGI compiler, and F18 is to escape that.

I wouldn't miss the grief associated with ifort, which doesn't live up to the mythology. The salient feature of the PGI compiler is probably the offloading, and I assume that's Nvidia-specific, but I don't know how it compares with current, and upcoming, GCC support. The research computing world would be a better place if the money spent on proprietary compilers sponsored GCC improvements instead.


From what I've seen, PGI's offloading (with OpenACC) still delivers better performance than GCC, although GCC is improving. Also PGI usually give better error messages or comments about what it's doing when offloading. But yes offloading with PGI is for Nvidia cards only, and GCC contributors are working on adding support for offloading to AMD GPUs.


I am not a compiler expert but I wonder if FORTRAN will lose its speed advantages it still supposedly has if it goes through LLVM.


LLVM's biggest weak spot as a compiler is its lack of good loop nest optimizations. And that hurts Fortran code a lot more than C/C++ code, because your 6-level loop nest code is more likely to be written in Fortran instead of C/C++.

A further point to make is that LLVM IR is strongly biased towards C, and this is especially true when it comes to the memory model. All memory has to be lowered to access via (essentially typeless) pointers, with optional aliasing qualifiers provided via a noalias parameter attribute (which breaks a lot, because restrict isn't all that common in C), and TBAA. And all higher-order information has to be reverse engineered from this starting point.


As far as I know, the main speed advantage is that Fortran supports arrays, which are known to not alias.

C/C++ does not have alias annotations built in, although with template constructs like those used in e.g. the Eigen library, alias-optimal code can be generated.


There are bugs in llvm related to this, mostly found by Rust. This will be another source of pressure to fix those bugs, which is good.


My main hope for flang is that it motivates LLVM to fix the Rust-relevant non-aliasing bugs. However, is it actually known that flang/f18 emits IR with the same patterns that exposed the Rust-relevant LLVM bugs?


C has the "restrict" keyword to indicate that no aliasing should occur.


But it is mostly unused and mostly irrelevant.


Other posters mentioned "restrict". Rust is able to basically use "restrict" on steroids because of how much more info it has, so much so that it's turned up bugs in LLVM that are not fixed yet.

https://github.com/rust-lang/rust/issues/54878


I just thought of a stupid question. If gcc and LLVM both support Fortran, how hard would it actually be to add real arrays to C? At least as an option? I.e., add an option --fortran_arrays.


It has been done multiple times in safe C dialects; the problem is the community, not technical.


C/C++ does not have alias-annotations built-in

Restrict?


Fortran defaults to no-alias, C and C++ default to may-alias. The annotations exist, but you have to apply them all over the place to get the benefit. It’s doable, but the Fortran model is less error-prone for many users.


Actually, Fortran defines "storage association" rules, so code that doesn't obey them isn't conforming Fortran. I'm not sure it's less error-prone, though, from long experience and users arguing with a compiler maintainer. One problem is the lack of static checking.


Numeric code rarely (almost never, actually) requires aliasing, so defaulting to no-alias is less error prone for those users. If you’re writing an OS or runtime library, you want aliasing support and static checking, but that’s mostly out-of-scope for Fortran today.


The Fortran standard just doesn't talk about "aliasing" (in that context), and "defaulting" suggests you can turn off the storage association rules somehow.

I maintain from long research computing experience (at least back to the days of Alliant) that the rules are highly error-prone in practice for users, who frequently deny they even exist and blame compiler bugs. (I'd be surprised if that's not the case more generally.) I'm not saying they shouldn't exist, or that code needs to contravene them.


Yeah, I'm using the C nomenclature ("aliasing" vs "association") because HN commenters are broadly unfamiliar with Fortran.

What I mean by "default" is that if I declare a function with two array arguments in Fortran, with the simplest possible syntax, the compiler assumes that they do not alias (are not associated). By contrast, if I declare a function with two pointer arguments in C, with the simplest possible syntax, the compiler assumes that they may alias (are associated in some unspecified manner).

The C semantics are certainly safer, but they lead to lots of "Fortran is faster than C" blog posts by people who either don't know about or simply don't want to use the annotations.


A bit of a nitpick maybe, but it's been Fortran and not FORTRAN for quite a while (since Fortran 90 in the 90's).


“since Fortran 90 in the 90's”

That’s when I used it last time :)


still being used in our research group on a daily basis.


Flang's code is somewhat slower than gfortran's, comparing the geometric-mean bottom line from the Polyhedron benchmarks on SKX with profile-directed optimization. I probably won't argue about whether that's a reasonable comparison.


What exactly is the benefit of having multiple compilers? I'm not well versed in the area, but it seems like having one compiler is better from the perspectives of standardization, having a single way of handling undefined behaviors, etc?


Competition is a strong motivator to improve areas your implementation is bad at. E.g. with gcc/clang an often-mentioned example is error messages: they're not a benchmark you can optimize for, but it turns out users really like helpful error messages and will start using a different compiler to get them.

Actual standardization (as in, there is a document describing what's supposed to happen) is helped by multiple implementations, because their conflicts help discover unclear parts. This in turn helps new implementations get off the ground if there's a need to produce them. Some standards groups require multiple independent implementations of something to exist before it is allowed to be released as a standard.

Different implementations might be different enough that some changes are easier to make in one than the other. This makes it easier to test these changes, the other implementation(s) can then decide if the gains are worth their effort.

It provides an out if one project resists changes for human/political reasons.

I wouldn't be surprised if there's a few unusual architectures around that use Fortran but aren't handled by LLVM.


If you rely on whatever the one true compiler does to undefined behaviour, you are essentially making the compiler the spec.

That's workable, eg for Python (and in practice, Haskell). But is seen as less than ideal.


With a permissive license, it's vulnerable to embrace, extend, extinguish.


The ability to make it proprietary is in fact the reason why Apple didn't adopt gcc and became the main supporter of clang/llvm instead. So the issue is already here. Fortunately Apple's ambitions only center around keeping full control of the Apple walled garden (the apple main building is a wonderful symbol of it). And should a strong proprietary competitor outside of Apple's ecosystem emerge, people can always fork clang and put it under GPLv3 or similar.


If that really were the case, why did Chris Lattner try to give clang/llvm over to the FSF and merge it into gcc? Which apparently failed only because Stallman didn't notice the emails. After which Apple started pushing clang/llvm, as in a way they had been shunned by gcc.

The rest of your comment is... way too political towards gpl.


Chris Lattner suggested it back when gcc was first working on LTO, and clang didn't exist. 2004 or 2005-ish, IIRC. Back then, he was still working at UIUC.

Apple pushed clang as a competitor to gcc much later, closer to 2010-ish.


I'm curious then how the email from Chris in 2005 comes from his apple.com email address?

https://gcc.gnu.org/ml/gcc/2005-11/msg00888.html

Doesn't seem to add up with your proposed timeline. Care to explain?


This reminded me of the Fortress language after a long time.

It was a MODERN Fortran like language that Guy Steele and all were developing at Sun https://en.wikipedia.org/wiki/Fortress_(programming_language...

It did not stand much of a chance once Oracle took over.


Fortress never got any real traction and was effectively dead well before Oracle purchased Sun.


IIRC the point of Fortress was to win DARPA's high-productivity computing challenge; having failed at that, it just remained a random research language.

Which is a pity, because it had some very interesting ideas, imvho.


Mods, please add a [video] warning to the title.


Why should this be downvoted; it’s a standard thing to do on HN.

(Complaining about downvoting is not typically an acceptable thing on HN but I think this case is meta-respectable, though I appreciate the irony).


I had to laugh a little at F18, given my first job in the early 80's was in F78 I think, doing text processing no less.


Aside: this is one thing I found interesting about go. They maintain one compiler and two front ends:

llvm: https://go.googlesource.com/gollvm/

gcc: https://github.com/golang/gofrontend

Go: https://github.com/golang/go/tree/master/src/cmd/compile

And I believe all three are maintained by the go team. It should be noted that the LLVM implementation is used by tinygo: https://tinygo.org/


Looks like there is a Fortran compiler which already emits MLIR, with support for a few OpenMP constructs and 2 SPEC CPU 2017 benchmarks running: https://github.com/compiler-tree-technologies/fc

Phoronix article: https://www.phoronix.com/scan.php?page=news_item&px=FC-LLVM-...


So when coarrays land in Flang in late 2021, will it be missing anything in gfortran? What will the X-vs-Y look like at that point, I wonder?

I'm new to the Fortran scene, btw. I probably can't do more than fix typos, but the Good First Issues on the github are:

https://github.com/flang-compiler/f18/labels/good%20first%20...


I'm excited to see more front ends embracing MLIR.

A lot of new interesting languages (Julia, Rust, etc.) were empowered by being able to take advantage of LLVM as a backend, so I can't imagine what new languages might spring up around the heterogeneous capabilities of MLIR, not to mention existing languages targeting it as well.


I'm thrilled about a Fortran front-end so I can run Fortran on the JVM via Sulong.


I've never seen a Fortran codebase in my life. Is it still alive? What advantages does it offer over C/C++?


Fortran has language-level support for things important for numerical scientific computing (complex numbers, multidimensional arrays, etc.), and it has had them since the beginning in the 1950-60s.

The convenience is not really matched by C or C++, where similar features were added much later via language extensions or 3rd-party libraries, resulting in more complicated usage, fragmentation, and interoperability problems. Newer languages also have similar issues, so for the user base that uses Fortran, there's a lack of viable competitors.


Also numerical codes can be trickier to write than you think. Rounding errors and other treacheries of the floating point approximation to the 'real' numbers can eat you alive.

If you want to do common numeric operations, there might be a Fortran code that is battle-tested and performance-tuned, and it's usually not hard to call from C, Java, Python, or some other language.


Fortran’s numerics model is actually somewhat weaker than C’s; if precise rounding behavior is your main concern C (while still not ideal) is a better option.


For those of us not expert on the standards, could you explain in what way Fortran is weaker, and its rounding control deficient?


I write mostly C++ and Scheme and last year I wrote my first piece of code in modern Fortran (I had some experience porting/integrating old F77 libs). It's trivial to interface with C and lets you do numeric stuff immediately and without libraries using a plain syntax. Compared to C++ with template array libraries it compiles much faster and doesn't get in your way nearly as much. It has a module system... Compared to numpy/matlab/etc it takes some adjusting b/c it's still a lowish level language (errors lead to segfaults, etc.) but a lot less than going to C/C++.


It's still around in the high-performance scientific computing world. Feature-wise, C99 basically looks like playing catch-up with Fortran to better compete in that space (eg restrict, complex numbers, type-generic math functions, stack-allocated variable length arrays).


Every time Fortran is mentioned, this seems to come up. In one thread I referred to the usage figures for the UK "tier 1" system, for instance.


It's used in science (physics, chemistry, astronomy, etc.). Just as C++ is used not because it's a nice language but because of inertia and path dependency, the same goes for Fortran.


But it is a nice enough language, at least in its sphere: "ALGOL 60 is alive and well and living in FORTRAN [sic] 90." -- Tony Hoare


I would call it "Jortran". It's much less generic, and with a little luck it even conveys what kind of language it is.



