
A pattern in these is code that compiles until you change a small thing. A closure that works until you capture a variable, or code that works in the main thread but not in a separate thread. Or works until you move the code into an if/else block.

My experience with Rust is like this: I make a seemingly small change, and it balloons into a compile error that requires large refactoring to appease the borrow checker and type system. I suppose if you repeat this enough you learn how to write code that Rust is happy with the first time. I think my brain just doesn't like Rust.



Your supposition about Rust is correct.

I’ll add that—having paid that upfront cost—I am happily reaping the rewards even when I write code in other languages. It turns out the way that Rust “wants” to be written is overall a pretty good way for you to organize the relationships between parts of a program. And even though the borrow checker isn’t there looking out for you in other languages, you can code as if it is!


I had a similar experience with Erlang/Elixir. The primary codebase I work with in $DAYJOB is C++ but structured very OTP-like, with message passing and threads that can crash (via exceptions; we're not catching segfaults, for example) and restart themselves.

Because of the way we've set up the message passing, and by ensuring that we don't share other memory between threads, we've virtually eliminated most classes of concurrency bugs.


That's the main use-case of Erlang/Elixir anyway: eliminating concurrency/parallelism bugs by copying data in the dangerous places, and by using message passing instead of locks. These two alone have eliminated most of the bugs I'd written in other languages.

So, same experience. And it taught me to be a better Golang and Rust programmer, too.
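
To make that concrete, a minimal sketch of the same "share by communicating" style in Rust, using std's mpsc channels (the toy worker is my own invention):

    use std::sync::mpsc;
    use std::thread;

    fn main() {
        let (tx, rx) = mpsc::channel();
        let worker = thread::spawn(move || {
            // The worker owns its own state; only messages cross the thread
            // boundary, so there is no shared memory to corrupt.
            for i in 0..3 {
                tx.send(i * 2).unwrap();
            }
            // tx drops here, which ends the receive loop below.
        });
        for msg in rx {
            println!("got {msg}");
        }
        worker.join().unwrap();
    }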


This is why I think cross-training is so important and I should do more of it. Even something relatively minor, like doing the Advent of Code in Elixir (Nim last year), has made my Python significantly easier to reason about. If you quit mutating state so much, you eliminate whole classes of bugs.


That's also how I tend to program after having read Joe Armstrong's dissertation "Making reliable distributed systems in the presence of software errors" and having programmed in Go for a few years.


I'm fortunate enough not to have to often write code in other languages anymore, but my experience is that writing code in ways that satisfy the compiler actually ends up being code I prefer anyhow. I was somewhat surprised by the first example because I haven't run into something like that, but it's also not really the style in which I would write that function personally (I'm not a big fan of repetitions like having `Some(x)` appear both as a capture pattern and as a return value). So on a whim I tried what would have been my way of writing that function, and it doesn't trigger the same error:

    fn double_lookup_mut(map: &mut HashMap<String, String>, mut k: String) -> Option<&mut String> {
        map.get_mut(&k)?;
        k.push_str("-default");
        map.get_mut(&k)
    }
I wouldn't have guessed that this happened to be a way around a compiler error that people might run into with other ways of writing it; it just genuinely feels like a cleaner way for me to implement a function like that.


Isn't that the opposite of the intended implementation? I don't write Rust, but I think your implementation will always return either `None` or the "fallback" value with the `"-default"` key. In the article, the crucial part is that if the first `map.get_mut()` succeeds, that is what is returned.


Whoops, you're definitely right. This is why I shouldn't try to be productive in the morning.
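
For posterity, a sketch that keeps the early-return shape but matches the intended semantics, paying for the double lookup the function's name suggests:

    use std::collections::HashMap;

    fn double_lookup_mut(map: &mut HashMap<String, String>, mut k: String) -> Option<&mut String> {
        if map.contains_key(&k) {
            // The shared borrow from contains_key has already ended here.
            return map.get_mut(&k);
        }
        k.push_str("-default");
        map.get_mut(&k)
    }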


A great example of how "if it compiles, it runs correctly" is bullshit.


You're reaching pretty hard there. Your assertion is a massive strawman; the implication seems to be that "every problem in your logic won't exist if it compiles", but no one thinks you can't write bad logic in any language.

Rather it's about a robust type system that gives you tooling to cover many cases at compile time.

Even if we ignore logic, Rust has plenty of runtime tooling, and runtime issues can still happen as a result. A complaint I often have about Bevy (despite loving it!) is that it has a lot of runtime-based plugins/etc. which I'd prefer to be compile time. Axum, for example, has a really good UX while still being heavily compile time (in my experience at least).

"If it compiles it works" is still true despite my complaints. Because i don't believe the statement even remotely implies you can't write bad logic or bad runtime code in Rust.


This particular example explicitly dodges compile time checking for some ad-hoc (but likely safe) runtime behavior. It's not a strawman at all. It's a classic example of how sometimes the compiler can't help you, and even worse, how programmers can defeat its ability to help you.


Right, but their statement (as I parsed it) was that the "if it compiles it works" phrase is bullshit. Since there's some cases where it obviously won't be true.

At best it's ignorant of what that phrase means, in my view.


> Since there's some cases where it obviously won't be true.

That's not how it's used in any instance that I have seen it used.

> At best it's ignorant of what that phrase means, in my view.

I think your opinion on what the phrase means is a minority opinion.


If it were as broad as the OP said, then it would mean errors, panics, and especially Unsafe wouldn't exist. Even if we ignore unclean sources (say, network errors/etc.), this isn't a proof-based language where programs cannot possibly fail.

Besides, it likely is very possible to write programs that cannot fail in Rust. This usually means encoding state into the type system (enums included), but few go through that work. They know what corners they're cutting, which further proves that they know their program can fail.

Hell, Rust itself can fail. I struggle to imagine how this is perceived as some long-con from the Rust PR team to convince people Rust programs cannot be written with incorrect logic.


I think they were highlighting that that phrase is bullshit. It’s trivial to escape many compile time checks.


Yea, but that's my argument - that they're being dense (I imagine on purpose?). The phrase doesn't mean that nothing can fail at runtime. Of course it doesn't.

Rather, it means we have many tools to write a program with many compile time checks. For example, many representations of state can be described in compile time checks via enums, type transitions, etc.
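
To make "type transitions" concrete, a minimal typestate sketch (a toy of my own, not from the article): an operation that is only valid in one state simply doesn't exist in the other states, so the invalid call is a compile error rather than a runtime check.

    use std::marker::PhantomData;

    struct Closed;
    struct Open;

    struct Conn<State> {
        _state: PhantomData<State>,
    }

    impl Conn<Closed> {
        fn new() -> Self {
            Conn { _state: PhantomData }
        }
        // Consumes the closed connection and yields an open one.
        fn open(self) -> Conn<Open> {
            Conn { _state: PhantomData }
        }
    }

    impl Conn<Open> {
        fn send(&self, msg: &str) {
            println!("sending: {msg}");
        }
    }

    fn main() {
        let conn = Conn::new().open();
        conn.send("hello");
        // Conn::new().send("hello"); // compile error: no `send` on Conn<Closed>
    }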


> The phrase doesn't mean that nothing can fail at runtime.

That's exactly what that phrase means; you're twisting the actual words in the phrase to arrive at something more acceptable in your own mind. All you're saying is "oh, you shouldn't take it literally". I can guarantee most people take it literally, and it's bullshit.


So you think everyone using that phrase thinks it's not possible to fail at all at runtime, in any manner?

Unsafe, Panic and Error would like a word.


Honestly, I think the majority of the times I've said that sentence have been after running code that has an obvious mistake (like the code I posted above)!


As someone who is interested in getting more serious with Rust, could you explain the essence of how you should approach organizing code in Rust so as to minimize refactors as the code grows?


In my experience there are two versions of "fighting the borrow checker". The first is where the language has tools it needs you to use that you might not've seen before, like enums, Option::take, Arc/Mutex, channels, etc. The second is where you need to stop using references/lifetimes and start using indexes: https://jacko.io/object_soup.html
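
A minimal sketch of the second point, with toy types of my own in the spirit of that post: entities refer to each other by index into one central Vec instead of by long-lived references.

    // "Object soup": cross-references are indexes, not borrows.
    struct Monster {
        hp: i32,
        target: Option<usize>, // index into the same Vec, not &mut Monster
    }

    fn main() {
        let mut monsters = vec![
            Monster { hp: 10, target: Some(1) },
            Monster { hp: 20, target: None },
        ];

        // Mutating one entry based on another: the copied-out index avoids
        // holding two long-lived references into the Vec at once.
        if let Some(t) = monsters[0].target {
            monsters[t].hp -= 5;
        }
        assert_eq!(monsters[1].hp, 15);
    }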


> and start using indexes

So basically raw pointers with extra hoops to jump through.


Yep. The array index pattern is unsafe code without the unsafe keyword. Amazing how much trouble Rust people go through to make code "safe" only to undermine this safety by emulating unsafe code with safe code.


It's not the same. The term "safe" has a specific meaning in Rust: memory safety. As in:

- no buffer overflows

- no use after free

- no data races

These problems lead to security vulnerabilities whose scope extends beyond your application. Buffer overflows have historically been the primary mechanism for taking over entire machines. If you emulate pointers with Rust indices and don’t use “unsafe”, those types of attacks are impossible.

What you’re referring to here is correctness. Safe Rust still allows you to write programs which can be placed in an invalid state, and that may have security implications for your application.

It would be great if the compiler could guarantee that invalid states are unreachable. But those types of guarantees exist on a continuum and no language can do all the work for you.


"Safe" as a colloquial meaning: free from danger. The whole reason we care about memory safety is that memory errors become security issues. Rust does nothing to prevent memory leaks and deadlocks, but it does prevent memory errors becoming arbitrary code execution.

Rust programs may contain memory errors (e.g. improper use of interior mutability and out of bounds array access), but the runtime guarantees that these errors don't become security issues.

This is good.
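
Two concrete instances of that, as a sketch (both are the checks working as designed):

    use std::cell::RefCell;

    fn main() {
        // An aliasing mistake with interior mutability becomes a deterministic
        // error (or a panic with borrow_mut), never silent corruption:
        let cell = RefCell::new(5);
        let held = cell.borrow_mut();
        assert!(cell.try_borrow_mut().is_err());
        drop(held);

        // Out-of-bounds access panics instead of reading arbitrary memory:
        let v = vec![1, 2, 3];
        assert!(v.get(3).is_none()); // v[3] would panic, not read garbage
    }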

When you start using array indices to manage objects, you give up some of the protections built into the Rust type system. Yes, you're still safe from some classes of vulnerability, but other kinds of vulnerabilities, ones you thought you abolished because "Rust provides memory safety!!!", reappear.

Rust is a last resort. Just write managed code. And if you insist on Rust, reach for Arc before using the array index hack.


I tend to agree w.r.t. managed languages.

Still, being free from GC is important in some domains. Beyond letting you attach types to scopes via lifetimes, Rust also provides runtime array bounds checks, reference-counting shared pointers, tagged unions, etc. These are the techniques used by managed languages to achieve memory-safety and correctness!

For me, Rust occupies an in-between space. It gives you more memory-safe tools to describe your problem domain than C. But it is less colloquially "safe" than managed languages because ownership is hard.

Your larger point with indices is true: using them throws away some benefits of lifetimes. The issue is granularity. The allocation assigned to the collection as a whole is governed by Rust ownership. The structures you choose to put inside that allocation are not. In your user ID example, the programmer of that system should have used a generational arena such as:

https://github.com/fitzgen/generational-arena

It solves exactly this problem. When you `free` any index, it bumps a counter which is paired with the next allocated index/slot pair. If you want to avoid having to "free" it manually, you'll have to devise a system using `Drop` and a combination of command queues, reference-counted cells, locks, whatever makes sense. Without a GC you need to address the issue of allocating/freeing slots for objects within an allocation in some way.

Much of the Rust ecosystem is libraries written by people who work hard to think through just these types of problems. They ask: "ok, we've solved memory-safety, now how can we help make code dealing with this other thing more ergonomic and correct by default?".


Absolutely. If I had to use an index model in Rust, I'd use that kind of generational approach. I just worry that people aren't going to be diligent enough to take precautions like this.


'generational-arena' is unmaintained and archived now.


Even when you use array indices, I don't think you give those protections up. Maybe a few, sure, but the situation is still overall improved.

Many of the rules references have to live by are also applied to arrays:

- Two parties cannot simultaneously hold mutable references into the array (unless the regions they borrow do not overlap)

- The array itself keeps the Sync/Send traits, providing thread safety

- The compiler cannot do provenance-based optimizations, and thus cannot introduce undefined behavior; most other kinds of undefined behavior are still prevented

- Null dereferences still do not exist and other classes of errors related to pointers still do not exist

Logic errors and security issues will still exist of course, but Rust never claimed guarantees against them; only guarantees against undefined behavior.

I'm not going to argue against managed code. If you can afford a GC, you should absolutely use it. But, compared to C++, if you have to make that choice, safety-wise Rust is overall an improvement.


You can still have use-after-free errors when you use array indices. This can happen if you implement a way to "free" elements stored in the vector ("free" should be interpreted in a broad sense). There's no way for Rust to prevent you from marking an array index as free and later using it.


> There's no way for Rust to prevent you from marking an array index as free and later using it.

I 2/3rds disagree with this. There are three different cases:

- Plain Vec<T>. In this case you just can't remove elements. (At least not without screwing up the indexes of other elements, so not in the cases we're talking about here.)

- Vec<Option<T>>. In this case you can make index reuse mistakes. However, this is less efficient and less convenient than...

- SlotMap<T> or similar. This uses generational indexes to solve the reuse problem, and it provides other nice conveniences. The only real downside is that you need to know about it and take a dependency.


The consequences of use-after-free are different for the two.

In Rust it is a logic error, which leads to data corruption or program panics within your application. In C it leads to data corruption and is an attack vector for the entire machine.

And yes, while Rust itself doesn’t help you with this type of error, there are plenty of Rust libraries which do.


The difference is that the semantics of your program are still well-defined, even with bugs in index-based arenas.


The semantics of a POSIX program are well-defined under arbitrary memory corruption too --- just at a low level. Even with a busted heap, execution is deterministic and every interaction with the kernel has defined behavior --- even if that behavior is SIGSEGV.

Likewise, safe but buggy Rust might be well-defined at one level of abstraction but not another.

Imagine an array index scheme for logged-in-user objects. Suppose we grab an index to an unprivileged user and stuff it in some data structure, letting it dangle. The user logs out. The index is still around. Now a privileged user logs in and reuses the same slot. We do an access check against the old index stored in the data structure. Boom! Security problems of EXACTLY the sort we have in C.

It doesn't matter that the behavior is well-defined at the Rust level: the application still has an escalation of privilege vulnerability arising from a use-after-free even if no part of the program has the word u-n-s-a-f-e.
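
A contrived sketch of that scenario, with hypothetical names and no unsafe anywhere:

    struct Session {
        user: String,
        is_admin: bool,
    }

    fn main() {
        let mut sessions: Vec<Option<Session>> = Vec::new();

        // An unprivileged user logs in; some subsystem stashes the raw index.
        sessions.push(Some(Session { user: "mallory".into(), is_admin: false }));
        let stale = 0; // the "dangling" index

        // Logout frees the slot; an admin login reuses it.
        sessions[0] = None;
        sessions[0] = Some(Session { user: "root".into(), is_admin: true });

        // The access check against the stale index is well-defined Rust,
        // but it is an escalation-of-privilege bug all the same.
        let passes = sessions[stale].as_ref().map_or(false, |s| s.is_admin);
        assert!(passes); // mallory's old handle now passes the admin check
    }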


Undefined behavior in C/C++ has a different meaning than you're using. If a compiler encounters a piece of code that does something whose behavior is undefined in the spec, it can theoretically emit code that does anything and still be compliant with the standards. This could include things like setting the device on fire and launching missiles, but more typically is something seemingly innocuous like ignoring that part of the code entirely.

An example I've seen in actual code: You checked for null before dereferencing a variable, but there is one code path that bypasses the null check. The compiler knows that dereferencing a null pointer is undefined so it concludes that the pointer can never be null and removes the null checks from all of the code paths as an "optimization".

That's the C/C++ foot-gun of undefined behavior. It's very different from the memory safety and correctness issues that you're conflating it with.


From the kernel's POV, there's no undefined behavior in user code. (If the kernel knew a program had violated C's memory rules, it could kill it and we wouldn't have endemic security vulnerabilities.) Likewise, in safe Rust, the access to that array might be well defined with respect to Rust's view of the world (just like even UB in C programs is well defined from the kernel POV), but it can still cause havoc at a higher level of abstraction --- your application. And it's hard to predict what kind of breakage at the application layer might result.


Sort of. But you still get guaranteed-unaliased references when you need them. And generational indexes (SlotMap etc) let you ask "has this pointer been freed" instead of just hoping you never get it wrong.


And you now have unchecked use-after-decommissioning-the-index and double-decommission-the-index errors, which could be security regressions.


That's true only if you use Vec<T> instead of a specialized arena (either append-only, maybe growable, or generational), where invalidation is tracked for you on access.


Yeah if you go with Vec, you have to accept that you can't delete anything until you're done with the whole collection. A lot of programs (including basically anything that isn't long running) can accept that. The rest need to use SlotMap or similar, which is an easy transition that you can make as needed.


> So basically raw pointers with extra hoops to jump through.

That's one way to look at it.

The other way is: raw pointers, but with mechanical sympathy. Array based data structures crush pointer based data structures in performance.


> Array based data structures crush pointer based data structures in performance

array[5] and *(array + 5) generate the same code... Heap-based non-contiguous data structures definitely are slower than stack-based contiguous data structures.

How you index into them is unrelated to performance.

Effectively, pointers are just indexes into the big array which is system memory... I agree with the parent: these are effectively pointers, without any of the checks references would give you.


> pointers are just indexes into the big array which is system memory...

I’m sure you are aware but for anyone else reading who might not be, pointers actually index into your very own private array.

On most architectures, the MMU is responsible for mapping pages in your private array to pages in system memory or pages on disk (a page is a subarray of fixed size, usually 4 KiB).

Usually you only get a crash if you access a page that is not currently allocated to your process. Otherwise you get the much more insidious behaviour of silent corruption.


> How you index into them is unrelated to performance.

Not true. If you store u32 indices, that can impose less memory/cache pressure than 64-bit pointers.

Also indices are trivially serializable, which cannot be said for pointers.
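
To put a number on the first point, a sketch (sizes assume a 64-bit target):

    use std::mem::size_of;

    // A node that links by pointer vs. one that links by u32 index.
    struct PtrNode {
        next: *const PtrNode, // 8 bytes on a 64-bit target
    }
    struct IdxNode {
        next: u32, // 4 bytes, using u32::MAX as a "null" sentinel, say
    }

    fn main() {
        assert_eq!(size_of::<PtrNode>(), 8);
        assert_eq!(size_of::<IdxNode>(), 4);
    }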


I'll happily look at a benchmark which shows that the size of the index has any significant performance implications vs the work done with the data stored at said index, never mind the data actually stored there.

I haven't looked closely at the decompiled code but I wouldn't be surprised if iterating through a contiguous data structure has no cache pressure but is rather just incrementing a register without a load at all other than the first one.

And if you aren't iterating sequentially you are likely blowing the cache regardless purely based on jumping around in memory.

This is an optimisation that may be premature.

EDIT:

> Also indices are trivially serializable, which cannot be said for pointers

Pointers are literally 64bit ints... And converting them to an index is extremely quick if you want to store an offset instead when serialising.

I'm not sure if we are missing each other here. If you want an index then use indices. There is no performance difference when iterating through a data structure, there may be some for other operations but that has nothing to do with the fact they are pointers.

Back to the original parent that spurred this discussion... Replacing a reference (which is basically a pointer with some added sugar) with an index into an array is effectively just using raw pointers to get around the borrow checker.


> Pointers are literally 64bit ints... And converting them to an index is extremely quick if you want to store an offset instead when serialising.

I'm not them, but they're saying pointer-based structures are just less trivial to serialize. For example, to serialize a linked list, you basically need to copy it into an array of nodes, replacing each pointer to a node with a local offset into this array. You can't convert the pointers into indices just with pointer arithmetic, because each allocation was made individually. Pointer arithmetic assumes that the nodes already exist in some array, which would make the use of pointers instead of indices inefficient and redundant.
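
A sketch of that flattening step, with toy types of my own:

    struct Node {
        val: i32,
        next: Option<Box<Node>>,
    }

    // Trivially serializable: a local offset instead of a heap address.
    struct FlatNode {
        val: i32,
        next: Option<usize>,
    }

    fn flatten(mut head: Option<Box<Node>>) -> Vec<FlatNode> {
        let mut flat = Vec::new();
        while let Some(boxed) = head {
            let Node { val, next } = *boxed;
            head = next;
            // Nodes land at consecutive indices, so "next" is just index + 1.
            let next = head.as_ref().map(|_| flat.len() + 1);
            flat.push(FlatNode { val, next });
        }
        flat
    }

    fn main() {
        let list = Some(Box::new(Node {
            val: 1,
            next: Some(Box::new(Node { val: 2, next: None })),
        }));
        let flat = flatten(list);
        assert_eq!(flat[0].next, Some(1));
        assert_eq!(flat[1].next, None);
    }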


I understand that entirely; a linked list is a non-contiguous, heap-based data structure.

What I am saying is that whether you store a reference to an item in a Vec or an index to an item in a Vec is an implementation detail, and looking up the reference or the index generates effectively the same machine code.

Specifically in the case that I'm guessing they are referring to which is the optimisation used in patterns like ECS. The optimisation there is the fact that it is stored contiguously in memory and therefore it is trivial to use SIMD or a GPU to do operations on the data.

In that case, whether you are storing a u32 or a size_t doesn't exactly matter, and on a 32-bit arch they are literally equivalent. It's going to be dwarfed by loading the data into cache if you are randomly accessing the items, or by the actual operations done to the data, or both.

As I said, sure, use an index, but that wasn't the initial discussion. The discussion was doing it to get around the borrow checker, which is effectively just removing the borrow checker from the equation entirely, at which point you may as well have used a different language.


The main benefit from contiguous storage is it can be a better match to the cache. Modern CPUs read an entire cache line in a burst. So if you're iterating through a contiguous array of items then chances are the data is already in the cache. Also the processor tends to prefetch cache lines when it recognizes a linear access pattern, so it can be fetching the next element in the array while it's working on the one before it.


> Pointers are literally 64bit ints... And converting them to an index is extremely quick if you want to store an offset instead when serialising.

This implies serialisation/deserialisation passes, so you can't really let bigger-than-RAM data live on your disk.


> stop using references/lifetimes and start using indexes

Aren't arenas a nicer suggestion? https://docs.rs/bumpalo/latest/bumpalo/ https://docs.rs/typed-arena/latest/typed_arena/

Depending on the use case, another pattern that plays very nicely with Rust is the EC part of ECS: https://github.com/Ralith/hecs
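
For reference, a minimal sketch assuming bumpalo's documented API: everything allocated in the arena shares one lifetime and is freed in one go when the arena drops.

    use bumpalo::Bump;

    fn main() {
        let arena = Bump::new();

        // Allocations hand back borrows tied to the arena's lifetime.
        let x: &mut i32 = arena.alloc(41);
        *x += 1;
        assert_eq!(*x, 42);

        // No per-object free: the whole arena is reclaimed at once on drop.
    }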


Yes, Slab and SlotMap are the next stop on this train, and ECS is the last stop. But a simple Vec can get you surprisingly far. Most small programs never really need to delete anything.


> It turns out the way that Rust “wants” to be written is overall a pretty good way for you to organize the relationships between parts of a program

That's what it promised not to do, though! Zero cost abstractions aren't zero cost when they force you into a particular design. Several of the cases in the linked article involve actual runtime and code size overhead vs. the obvious legacy/unchecked idioms.


> vs. the obvious legacy/unchecked idioms

You can go crazy with legacy/unchecked/unsafe stuff if you want to in Rust. It's less convenient and more difficult than C in some ways, but 1) it's also safer and more convenient in other ways, and 2) "this will be safe and convenient" isn't exactly the reason we dive into legacy/unchecked/unsafe stuff.

And of course the greatest strength of the whole Rust language is that folks who want to do crazy unsafe stuff can package it up in a safe interface for the rest of us to use.


> crazy unsafe stuff

The first example in the linked article is checking if a value is stored in a container and doing something different if it's not than if it is. Hardly "crazy unsafe stuff".


I think there's an important distinction here. A systems programming language needs to have all the max speed / minimum overhead / UB-prone stuff available, but that stuff doesn't need to be the default / most convenient way of doing things. Rust heavily (both syntactically and culturally) encourages safe patterns that sometimes involve runtime overhead, like checked indexing, but this isn't the same as "forcing" you into these patterns.
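
Concretely, with slice indexing (a sketch): the checked form is the default, the fallible form is explicit, and the unchecked form exists but is opt-in.

    fn main() {
        let v = vec![10, 20, 30];

        // Default: bounds-checked; v[3] would panic instead of misbehaving.
        assert_eq!(v[0], 10);

        // Explicitly fallible access, no panic:
        assert_eq!(v.get(3), None);

        // The max-speed unchecked access exists, but you have to ask for it:
        let first = unsafe { *v.get_unchecked(0) };
        assert_eq!(first, 10);
    }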


And I can only repeat verbatim: The first example in the linked article is checking if a value is stored in a container and doing something different if it's not than if it is.

Hardly "max speed / minimum overhead / UB-prone stuff"



It's sad to see your comment this far down in the thread, after it had to be coaxed out of you, and after all the usual defensive arguments from the Rust crowd. It's a real issue that it's so hard to make the Rust community admit that obvious flaws actually exist.

As someone who spent a significant time working on static analysers and provers, it annoys me to no end how most of the Rust community will happily take what the borrow checker imposes on them as some form of gospel and never questions the tool. It’s bordering on Stockholm syndrome sometimes.


I think a lot of pushback is because people are talking past each other.

The reality:

- the borrow checker has limitations: it doesn't accept some constructs that could be proved safe, given Rust's own rules

- the borrow checker is a net positive, as it does push you towards better constructs, and on the (rare) occasions it forbids you from doing something that could be safe, you have safe (sometimes not zero-cost) and unsafe escape hatches

But these are then understood by the intransigent as:

- the borrow checker is always detrimental

- the borrow checker can do nothing wrong

At that point, no one can understand why the other person is being obtuse, and you end up with... well, the comment section under every Rust article.


Rust absolutely forces you into a particular design - that's what the borrow checker is about.

It's definitely possible to write correct code (in unsafe Rust or other languages) that wouldn't satisfy the borrow checker.

Rust restricts you, and in turn gives you guarantees. Zero cost only means that you don't pay extra (CPU/memory) at runtime, whereas in interpreted/GC languages you do.


> Zero cost only means that you don't pay extra (CPU/memory) at runtime, whereas in interpreted/GC languages you do.

But... again, you do. The examples in the linked article have runtime overhead. Likewise, every time you wrap something in Box or Vec or Arc to massage the lifetime analysis, you're incurring heap overhead that is actually going to be worse than what a GC would see (because GCs don't need the extra layer of indirection or to deal with reference counts).

It's fine to explain this as acceptable or good design or better than the alternatives. It's not fine to call it "Zero Cost" and then engage in apologia and semantic arguments when challenged.
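
For instance, a sketch of the Arc case: every handed-out handle costs an atomic refcount update on top of the pointer indirection.

    use std::sync::Arc;

    fn main() {
        let shared = Arc::new(vec![1, 2, 3]); // heap allocation + refcount
        let handle = Arc::clone(&shared);     // atomic increment
        assert_eq!(Arc::strong_count(&shared), 2);
        drop(handle);                         // atomic decrement
        assert_eq!(Arc::strong_count(&shared), 1);
    }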


We should distinguish code that's correct using the borrowing metaphor but won't pass borrowck in current Rust (such code will inevitably exist thanks to Rice's Theorem) from code that's not correct under this metaphor but would actually work, under some other model for reference or pointer types.

Because Rust is intended for systems programming it is comfortable (in unsafe) expressing ideas which cannot be modelled at all. Miri has no idea how the MMIO registers for the GPIO controller work, so, too bad, Rust can't help you achieve assurance that your GPIO twiddling code is correct in your 1200 byte firmware. But, it can help you when you're talking about things it does have a model for, such as its own data structures, which obey Rust's rules (in this case the strict provenance rule) not "memory" that's actually a physical device register.


It sounds like the ideal, then, would be to detect the problematic patterns earlier so people wouldn't need to bang their heads against it.


Why would you cling to some cockamamie memory management model, where it is not required or enforced?

That's like Stockholm Syndrome.


Maybe I'm just brainwashed, but most of the time for me, these "forced refactors" are actually a good thing in the long run.

The thing is, you can almost always weasel your way around the borrow checker with some unsafe blocks and pointers. I tend to do so pretty regularly when prototyping. And then I'll often keep the weasel code around for longer than I should (as you do), and almost every time it causes a very subtle, hard-to-figure-out bug.


I think the problem isn't that the forced changes are bad, it's that they're lumpy. If you're doing incremental development, you want to be able to quickly make a long sequence of small changes. If some of those changes randomly require you to turn your program inside-out, then incremental development becomes painful.

Some people say that after a while, they learn how to structure their program from the start so that these changes do not become necessary. But that is also partly giving up incremental development.


My concern is slightly different; it's the ease of debugging. And I don't mean debugging the code that I (or somebody else) wrote, but the ability to freely modify the code to kick some ideas around, see what sticks, etc., which I frequently need to do, given my field.

As an example, consider a pointer to a const object as a function param in C++: I can cast it away in a second and modify it as I go on my experiments.

Any thoughts on this? How much of an extra friction would you say is introduced in Rust?


I would say it's pretty easy to do similar stuff in Rust to skirt the borrow checker, e.g. you can cast a mut ref to a mut ptr, then back to a mut ref, and then you're allowed to have multiple of them.

The problem is Rust (and its community) does a very good job of discouraging things like that, and there are no guides on how to do so (you might get lambasted for writing one; maybe I should try).
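
To be explicit about the trick I mean, a sketch (note that actually using two of the resulting mut refs at overlapping times is undefined behavior, which is exactly why it's discouraged):

    fn main() {
        let mut x = 0u32;
        let p: *mut u32 = &mut x; // a &mut coerces to a raw pointer

        // The borrow checker no longer tracks accesses through `p`:
        let a: &mut u32 = unsafe { &mut *p };
        *a += 1;
        let b: &mut u32 = unsafe { &mut *p }; // a second &mut, no complaints
        *b += 1;

        // Kept disjoint like this, it happens to be fine; interleaving uses
        // of `a` and `b` would still compile, but would be undefined behavior.
        assert_eq!(x, 2);
    }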


I don’t really think it gives up incremental development. I’ve done large and small refactors in multiple Rust code bases, and I’ve never run into one where a tiny change suddenly ballooned into a huge refactor.


Rust definitely forces you to make more deliberate changes in your design. It took me about 6 months to get past hitting that regularly. Once you do get past it, though, Rust is awesome.


I suppose you haven't had to refactor a large code base yet just because a lifetime has to change?


Nope.

I have worked professionally for several years on what would now be considered a legacy Rust code base. Probably hundreds of thousands of lines, across multiple mission-critical applications. Few applications need to juggle lifetimes in a way that is that limiting; maybe a module would need some buffing, but not a major code base change.

Most first-pass and even refined "in production" code bases I work on do not have deeply intertwined lifetimes that require immense refactoring to cater to changes. Before someone goes "oh, your team writes bad code!", I would say that we had no noteworthy problems with lifetimes, and our implementations far surpassed the performance of other GC languages in the areas that mattered. The company is successful, built by a skeleton crew, and the success is owed to an incredibly stable product that scales out really well.

I question how many applications truly "need" that much reference juggling in their designs. A couple of allocations or reference-counted pointers go a really long way toward reducing cognitive complexity. We use arenas and whatever else when we need them, but no, I've never dealt with this in a way that was an actual terrible issue.


Actually, the newer versions of Rust do need these refactors way less often, since more lifetimes can be elided, and when using generic code or impl traits you can basically scrap a ton of it. I still sometimes stumble upon the first example, though; most often it happens because I want to encapsulate everything inside the function instead of doing some work outside of it.


> I suppose if you repeat this enough you learn how to write code that Rust is happy with first-time.

But this assumes that your specifications do not change.

Which we know couldn't be further from the truth in the real world.

Perhaps it's just me, but a language where you can never change your mind about something is __not__ a fun language.

Also, my manager won't accept it if I tell him that he can't change the specs.

Maybe Rust is not for me ...


I genuinely don’t know where you’ve gotten the idea that you can “never change your mind” about anything.

I have changed my mind plenty of times about my Rust programs, both in design and implementation. And the language does a damn good job of holding my hand through the process. I have chosen to go through both huge API redesigns and large refactors of internals and had everything “just work”. It’s really nice.

If Rust were like you think it is, you’re right, it wouldn’t be enjoyable to use. Thankfully it is nothing like that.


"Malum est consilium, quod mutari non potest" you might say.


My recommendation is that you do whatever you feel like with ownership when you first write the code, but then if something forces you to come back and change how ownership works, seriously consider switching to https://jacko.io/object_soup.html.


Isn't that just reinventing the heap, but with indexes in a vector instead of with addresses in memory?


You could look at it that way. But C++ programs often use similar strategies, even though they don't have to. Array/Vec based layouts like this give you the option of doing some very fancy high-performance stuff, and they also happen to play nicely with the borrow checker.


It's very basic and not a general solution because the lifetimes of objects are now set equal. And there's no compaction, so from a space perspective it is worse than a heap where the space of deleted objects can be filled up by new objects. It is nice though that you can delete an entire class of objects in one operation. I have used this type of memory management in the context of web requests, where the space could be freed when the request was done.


> make a seemingly small change, it balloons into a compile error that requires large refactoring to appease the borrow checker and type system

Same experience, but this is actually why I like Rust. In other languages, the same seemingly small change could result in runtime bugs or undefined behavior. After a little thought, it's always obvious that the Rust compiler is 100% correct - it's not a small change after all! And Rust helpfully guides me through its logic and won't let my mistake slide. Thanks!


Yo, everyone's interpreting the parent's comment in the worst way possible: assuming they're trying to do unsound refactorings. There are plenty of places where a refactoring is fine, but the rust analyzer simply can't verify the change (async `FnOnce`, for instance), gives up, and forces the user to work around it.

I love Rust (comparatively) but yes, this is a thing, and it's bad.


Yeah, Rust-analyzer's palette of refactorings is woefully underpowered in comparison to other languages / tooling I've used (e.g. Resharper, IntelliJ). There's a pretty high complexity bar to implementing these too unfortunately. I say this as someone that has contributed to RA and who will contribute more in the future.


I don't know anyone who has gotten Rust the first time around. It's a new paradigm of thinking, so take your time, experiment, and keep at it. Eventually it will just click, and you'll be back to having syntax typos overtake borrow checker issues.


This is because programming is not work in a continuous solution space. Think of it this way: you're almost guaranteed to introduce obvious bugs by randomly changing just a single bit/token. Assemblers, compilers, stronger type systems, etc. all try to limit this by bringing a different view that is more coherent to human reasoning. But computation has an inherently emergent property which is hard to predict/prove at compile time (see Rice's theorem), so if you want a safety guarantee by construction then this discreteness has to be much more visible.


> A pattern in these is code that compiles until you change a small thing.

I think that's a downstream result of the bigger problem with the borrow checker: nothing is actually specified. In most of the issues here, the changed "small thing" is a change in control flow that is (1) obviously correct to a human reader (or author) but (2) undetectable by the checker because of some quirk of its implementation.

Rust set out too lofty a goal: the borrow checker is supposed to be able to prove correct code correct, despite that being a mathematically undecidable problem. So it fails, inevitably. And worse, the community (this article too) regards those little glitches as "just bugs". So we're treated to an endless parade of updates and enhancements and new syntax trying to push the walls of the language out further into the infinite undecidable wilderness.

I've mostly given up on Rust at this point. I was always a skeptic, but... it's gone too far at this point, and the culture of "Just One More Syntax Rule" is too entrenched.


If you are trying overly hard to abstract things or working against the language, then yes, things can be difficult to refactor. Here are a few things I've found problematic:

- Generics

- Too much Send + Sync

- Trying too hard to avoid cloning

- Writing code in an object oriented way instead of a data oriented way

Most of these have to do with optimizing too early. It's better to leave the more complex stuff to library authors or wait until your data model has settled.


"Trying too hard to avoid cloning"

This is the issue I see a certain type of new Rustacean struggle with. People get so used to being able to chuck references around without thinking about what might actually be happening at run-time. They don't realize that they can clone, even more than what might "look good"; it is super reasonable to intentionally make a clone and still get incredibly acceptable performance.

"Writing code in an object oriented way instead of a data oriented way" The enterprise OOP style code habits also seem to be a struggle for some but usually ends up really liberating people to think about what their application is actually doing instead of focusing on "what is the language to describe what we want it to do".


Yeah I think this becomes more true the closer your type system gets to "formal verification" type systems. It's essentially trying to prove some fact, and a single mistake anywhere means it will say no. The error messages also get worse the further along that scale you go (Prolog is infamous).

Not really unique to Rust though; I imagine you would have the same experience with e.g. Lean. I have a similar experience with a niche language I use that has dependent types. Kind of a puzzle almost.

It is more work, but you get lots of rewards in return (including less work overall in the long term). Ask me how much time I've spent debugging segfaults in C++ and Rust...


That's not what the OP is discussing. The OP is discussing corner cases in Rust's type system that would be sound if the type system were more sophisticated, but that are rejected because Rust's type analysis is insufficiently specific: it rejects blanket classes of problems that have possible valid solutions but would need deeper flow analysis, etc.


Yes, I know. You get the same effect with type systems that are closer to formal verification: something you know is actually fine, but the prover isn't quite smart enough to realise it until you shift the puzzle pieces around so they are just so.


Ahh, I see what you mean


Lean is far more punishing even for simple imperative code. The following is rejected:

  /- Return the array of forward differences between consecutive
     elements of the input. Return the empty array if the input
     is empty or a singleton.
  -/

  def diffs (numbers : Array Int) : Array Int := Id.run do
    if size_ok : numbers.size > 1 then
      let mut diffs := Array.mkEmpty (numbers.size - 1)
      for index_range : i in [0:numbers.size - 2] do
        diffs := diffs.push (numbers[i+1] - numbers[i])
      return diffs
    else
      return #[]


When this happens to me, it's mostly because my code is written with too coarse a separation of concerns, or I am just mixing layers.


And in C++, those changes would likely shoot you in the foot without warning. The borrow checker isn't some new weird thing; it's a reification of the rules you need to follow to not end up with obnoxious, hard-to-debug memory/threading issues.

But yeah, as awesome as Rust is in many ways, it's not so much a "default application programming language" as it is a systems language, or a language for thorny things that need to work, as opposed to "work most of the time".


C++ allows more programs, both incorrect and correct. That's what can be a little frustrating about the BC: there are correct programs which the BC will block, and that can feel somewhat limiting.


While this is obviously and uncontroversially true in an absolute sense (the borrowck isn't perfect), I think in the overwhelming majority of real-world cases its concerns are either actual problems with your design or simple and well-known limitations of the checker that have pretty straightforward and idiomatic workarounds.

I haven't seen a lot of program designs in practice that are sound but fundamentally incompatible with the borrow checker. Every time I've thought this, I've come to realize there was something subtly (or not so subtly) wrong with the design.

I have seen some contrived cases where this is true but they’re inevitably approaches nobody sane would actually want to use anyway.


Here's something not contrived that does come up pretty frequently [1].

The gist of it is that even though tasks can be somewhat guaranteed to live only for the scope of the current method, you can't express that guarantee with the Rust type system. The end result is needing to copy immutable data or use Arc needlessly.

It's annoying enough that the language added scoped threads, which work great but can't be translated to Rust async tasks.

[1] https://without.boats/blog/the-scoped-task-trilemma/
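
For reference, the scoped-threads side of that trilemma, as a sketch (std::thread::scope, stable since Rust 1.63): borrows of locals can cross into threads precisely because the scope joins them before returning, a guarantee async tasks can't currently make.

    fn main() {
        let mut data = vec![1, 2, 3];

        let total: i32 = std::thread::scope(|s| {
            // Borrowing `data` is fine: the scope joins the thread first.
            let handle = s.spawn(|| data.iter().sum());
            handle.join().unwrap()
        });

        assert_eq!(total, 6);
        data.push(4); // the borrow has ended; `data` is usable again
    }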


In most cases, those "correct" C++ programs are also usually buggy in situations the programmer simply hasn't considered. That's why the C++ Core Guidelines ban those patterns and recommend programs track ownership with smart pointers that obey essentially the same rules as Rust. The main difference is that C++ smart pointers have more overhead and a bunch of implicit rules you have to read the docs to know. Rust tells you in (occasionally obscure but) largely helpful compiler errors at the point where you've violated them, rather than via undefined behavior or a runtime sanitizer.



