These models are getting crazy good at examining things like core dumps and disassembly. I've been using an agent to write compiler logic, and it's amazing the kind of results you can get by having the agent examine the raw binary outputs. I would not be surprised to see agents excel at identifying and labeling patterns for decompilation.

Really? Could you share your techniques that get you there?

Inspired by https://github.com/scosman/cursed_browser, I have a little art project going where the CPU of a virtual machine is entirely LLM-powered. But even though the ISA is well known and clearly in the LLM's training data (it answers questions about it mostly fine), I can rarely get it to even decode a handful of instructions in a row correctly. It'll e.g. do 10 instructions right (even execute them right!), then just lose the ability to do bit manipulation all of a sudden and fail miserably at even decoding the 11th. If I try to help it along it'll apologize profusely and do it wrong in five novel ways, before it gaslights me saying I'm in fact mistaken.


One of the classic examples is highway traffic. You want to prevent traffic jams, so you increase the number of lanes. However, now that there are more lanes, people see less “cost” in driving, leading to even more people driving (e.g. to go on more day trips or as alternative to public transport). This can cause the traffic jams to become even worse.

So, increased efficiency does not always lead to reduced latency, which goes against our natural thinking.


Recompressing should be guaranteed deterministic. It’s the packing/unpacking of tar archives to/from directories on disk that leads to the non-determinism (such as timestamps and ownership metadata). If the tar is left intact, both zstd and gzip should produce byte for byte identical outputs given the same compression parameters.
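
A minimal sketch of that distinction in Python, using only the standard library. The helper names `make_tar` and `compress` are made up for illustration, and the "same bytes" result only holds within one library version; it says nothing about matching the output of a different tool or release.

```python
import gzip
import io
import tarfile


def make_tar(payload: bytes, mtime: int) -> bytes:
    """Pack one in-memory file into a tar archive (hypothetical helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(payload)
        info.mtime = mtime  # metadata like this is what makes repacking vary
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def compress(data: bytes, level: int = 6) -> bytes:
    """Gzip with a fixed header timestamp so only data and level matter."""
    return gzip.compress(data, compresslevel=level, mtime=0)


if __name__ == "__main__":
    payload = b"same file contents\n" * 1000
    # Same contents, different timestamp: the archives differ.
    print(make_tar(payload, 0) == make_tar(payload, 1))
    archive = make_tar(payload, 0)
    # Same intact tar, same parameters, same library: identical bytes.
    print(compress(archive) == compress(archive))
    # Same intact tar, different level: different bytes.
    print(compress(archive, 1) == compress(archive, 9))
```

The last comparison is the catch raised in the reply: if you don't know the parameters the publisher used, you can't reproduce their compressed bytes.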


You are correct; I confused archiving with compression. However, even considering only the compression process, the same compression parameters cannot be guaranteed, as it is unknown which compression parameters the image publisher used.


That's true. And regardless of compressed vs regular tar, I think the OCI format working with opaque archives is extremely limiting. I hope the industry will eventually redesign to use content-addressable storage per file and have metadata to describe the layer/disk layout instead. That would allow per-file deduplication, and we can use tar for just bulk transfer over the wire, rather than using tar for the data at rest.
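
To make the idea concrete, here is a rough sketch of per-file content addressing in Python. `ContentStore` and `build_layer` are hypothetical names, the store is just an in-memory dict, and nothing here reflects how OCI or any existing runtime actually lays out data.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: blobs are keyed by their SHA-256 digest."""

    def __init__(self):
        self.blobs = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical contents map to the same key, so they are stored once.
        self.blobs.setdefault(digest, data)
        return digest


def build_layer(store: ContentStore, files: dict) -> dict:
    """Return layer metadata: a mapping of path -> digest of the file contents."""
    return {path: store.put(data) for path, data in sorted(files.items())}


if __name__ == "__main__":
    store = ContentStore()
    layer_a = build_layer(store, {"/bin/tool": b"binary-v1", "/etc/conf": b"a=1"})
    layer_b = build_layer(store, {"/bin/tool": b"binary-v1", "/etc/conf": b"a=2"})
    # Four files across two layers, but the shared binary is stored only once.
    print(len(store.blobs))
    print(layer_a["/bin/tool"] == layer_b["/bin/tool"])
```

The layer itself becomes a small piece of metadata, and a tar stream would only be needed to ship missing blobs over the wire.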


containerd 2.3 has support for erofs which does a direct import of the layer. It can even convert the tar based layers to erofs, faster than extracting the tar normally.

Also looking at block-based content store so that blocks can be deduped across images.


That is not correct. You would have to use the same compression tool (and likely version) for this to match.

Old docker discarded the compressed bits but kept some metadata about the tar so it could at least recreate the tar.

It also recreated the manifest on push.


Thanks for the correction. I did mean given the same tooling version/parameters, but (as you and others pointed out) preserving and recreating that state is not at all straightforward.


When a child process finishes without being actively waited on, it is left in a "defunct" or "zombie" state and sticks around in the process table until the parent process waits on it to fetch its exit code. When you kill a parent process with active children, those children become orphaned and are re-parented to PID 1 (or to another "subreaper" process, depending on your setup).

The OS will typically not kill orphaned/re-parented processes for you. It will simply wait/reap them so they are not left as zombies once they complete. If your parent process spawns something like a daemon server that needs an explicit signal to be stopped (e.g. SIGINT/SIGTERM), these processes will continue to run in the background until they are manually killed or they crash.
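
A small POSIX-only sketch of the zombie half of this, in Python. `pid_exists` and `zombie_demo` are made-up helper names, and the short sleep is just a crude way to let the child exit first. Signal 0 is used purely as an existence probe; it does not deliver anything to the child.

```python
import os
import time


def pid_exists(pid: int) -> bool:
    """True if the pid still has a process-table entry (signal 0 only probes)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    return True


def zombie_demo() -> tuple:
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child: exit right away with a recognizable code

    time.sleep(0.2)  # the child has exited by now, but nobody has waited on it
    listed_before_wait = pid_exists(pid)  # entry is still there (defunct)

    _, status = os.waitpid(pid, 0)  # reap: collect the status, freeing the entry
    exit_code = os.WEXITSTATUS(status)
    listed_after_wait = pid_exists(pid)
    return listed_before_wait, exit_code, listed_after_wait


if __name__ == "__main__":
    print(zombie_demo())
```

The entry only disappears once someone calls wait on it, which is the job PID 1 (or a subreaper) takes over for orphans. It does not cover the other half: a still-running orphan is left alone until something signals it.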


I see, so I might still need to hunt down non-daemon but hung processes even after I kill tmux server in which I ran them. Might explain a couple of odd occurrences in the past…


> "this test failure is preexisting so I'm going to ignore it"

Critical finding! You spotted the smoking gun!


I would say that this is limited to a targeted Go version and architecture. For example, the filter checks for the goroutine pointer on `r28` which is correct for arm64 but not universally true.

Any changes to the struct, stack, or heap layouts would also cause failures in these lookups. E.g., since Go 1.17, many functions pass arguments directly in registers rather than always placing them on the stack.

You would need to thoroughly vet compatibility with each new Go version before using something like this in production.


You’ll probably want to look to the PEPs. Haven't dug into this topic myself, but this looks related: https://peps.python.org/pep-0744/


I think CPython already had tier2 and some tracing infrastructure when the copy-and-patch JIT backend was added; it's the "JIT frontend" that's more obscure to me.


I'd say a lot of users are going to borrow patterns from Go, where you'd typically check the error first.

    resource, err := newResource()
    if err != nil {
        return err
    }
    defer resource.Close()

IMO this pattern makes more sense, as calling exit behavior in most cases won't make sense unless you have acquired the resource in the first place.

free may accept a NULL pointer, but it doesn't need to be called with one either.


This example is exactly why RAII is the solution to this problem and not defer.


I love RAII. C++ and Rust are my favourite languages for a lot of things thanks to RAII.

RAII is not the right solution for C. I wouldn't want C to grow constructors and destructors. So far, C only runs the code you ask it to; turning variable declaration into a hidden magic constructor call would, IMO, fly in the face of why people may choose C in the first place.


defer is literally just explicit RAII in this example. That is, it's just unnecessary boilerplate to wrap the newResource handle into a struct in this context.

In addition, RAII has its own complexities that need to be dealt with now, i.e. move semantics, which obviously C does not have nor will it likely ever.


> RAII has its own complexities that need to be dealt with now, i.e. move semantics, which obviously C does not have nor will it likely ever.

In the example above, the question of "do I put defer before or after the `if err != nil` check" is deferred to the programmer. RAII forces you to handle the complexity, defer lets you shoot yourself in the foot.


That's interesting. If you’re on Chrome I’d try out Firefox just to see. I haven’t had any issues for a long time.


I find it better to bubblewrap against a full sandbox directory. Using docker, you can export an image to a single tarball archive, flattening all layers. I use a compatible base image for my kernel/distro, and unpack the image archive into a directory.

With the unpack directory, you can now limit the host paths you expose, avoiding leaking in details from your host machine into the sandbox.

    bwrap --ro-bind image/ / --bind src/ /src ...

Any tools you need in the container are installed in the image you unpack.

Some more tips: Use --unshare-all if you can. Make sure to add --proc and --dev options for a functional container. If you just need network, use both --unshare-all and --share-net together, keeping everything else separate. Make sure to drop any privileges with --cap-drop ALL
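
Putting those tips together, here is a rough Python sketch that only assembles the argument list. `bwrap_argv` is a made-up helper, the `/proc` and `/dev` mount points and the flag ordering are my assumptions, and it never actually runs bwrap, so treat it as a starting point rather than a vetted sandbox config.

```python
import shlex


def bwrap_argv(image_dir: str, src_dir: str, command: list, share_net: bool = False) -> list:
    """Build a bwrap command line from the options discussed above (hypothetical helper)."""
    argv = ["bwrap", "--unshare-all"]
    if share_net:
        # Re-share only the network namespace, keeping everything else separate.
        argv.append("--share-net")
    argv += [
        "--ro-bind", image_dir, "/",  # unpacked image becomes a read-only root
        "--proc", "/proc",            # assumed mount point for procfs
        "--dev", "/dev",              # assumed mount point for a minimal /dev
        "--bind", src_dir, "/src",    # the only writable host path exposed
        "--cap-drop", "ALL",          # drop any privileges
    ]
    return argv + list(command)


if __name__ == "__main__":
    print(shlex.join(bwrap_argv("image/", "src/", ["sh"], share_net=True)))
```

Keeping this in a function makes it easy to see exactly which host paths are exposed before anything gets launched.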

