
GNU file doesn't just check magic numbers against libmagic, it also defines a scripting language and a series of tests/printers written in it.

That's what allows it to do complex things, e.g. identify all the flavors of ELF objects even though they share a magic number, or determine whether something is JSON or CSV without one.
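
To make the ELF case concrete, here is a rough Python sketch of the idea. It is not how file's own rules are written, and real rules check far more fields. Every ELF object starts with the same four magic bytes, so the flavor has to come from a later header field (`e_type`, at offset 16). The function name and the label strings are made up for illustration.

```python
import struct
from typing import Optional

# e_type values from the ELF specification; the labels are illustrative.
ELF_TYPES = {1: "relocatable", 2: "executable", 3: "shared object", 4: "core file"}


def elf_flavor(header: bytes) -> Optional[str]:
    """Return a coarse ELF flavor, or None if this is not an ELF header."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        return None
    # The byte at index 5 (EI_DATA) gives the byte order: 1 = little, 2 = big.
    byte_order = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field right after the 16-byte identification block.
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return ELF_TYPES.get(e_type, "unknown")
```

Calling it on the first 18 bytes of a file, e.g. `elf_flavor(open(path, "rb").read(18))`, is enough to tell the flavors apart even though the magic number is identical.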



The point is that, fundamentally, the concept of a file type is undecidable or not well defined.

Think about it: A JSON file can also be considered a text file. It could also be some higher level type of file, depending on whether it conforms to some application-specific JSON schema. Thus the kind of file it is has more to do with what you want to do with it; it's not some intrinsic property of the file.
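
A small Python sketch of that point, assuming a hypothetical `is_app_config` callback standing in for whatever application-specific schema check you care about: the same bytes legitimately collect several labels, and which one is "the" type depends on the caller.

```python
import json


def plausible_types(data, is_app_config=None):
    """Return every label that fits; is_app_config is a hypothetical schema check."""
    labels = ["binary data"]  # any byte sequence qualifies
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return labels
    labels.append("UTF-8 text")
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return labels
    labels.append("JSON")
    # The highest-level label only exists relative to some application.
    if is_app_config is not None and is_app_config(value):
        labels.append("application config")
    return labels
```

None of the labels is wrong; the function just has no basis for picking one unless the caller says what it wants to do with the file.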


Ok? I was replying to a comment asking how such systems can work. GNU file is an example of a program that makes a best effort to classify file types in a useful way.


A JSON file is also valid YAML.


So many of our mathematical theories are based on the idea that we have objects which are certainly of one type or another.

What would happen to type theory if, say, the type of an object had a probability attached to it?


> A JSON file can also be considered

A Curly braces Separated file filled with Values


I'm pretty sure the file command is not GNU btw, and I can't find anything about a GNU version. Do you have a reference?

https://en.wikipedia.org/wiki/File_(command)


https://www.darwinsys.com/file/

This is the author's website. Apparently yeah, it's not part of the GNU utils. I had no idea; I knew it came with most Linux systems, so I looked for the Debian package and found the site linked above.

https://packages.debian.org/bookworm/file


My bad, I thought it was a part of coreutils and didn't check. I've only ever dug into the Linux utility and assumed.

Too late to edit!


Not sure why it would think a JavaScript module file is Java. Maybe it's just not updated very often.

    % file crawl.mjs 
    crawl.mjs: Java source, ASCII text


probably looks for "import" at the start

Edit: yep, the test for Java source is just /^import.*;$/
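
Quick way to see why, using the pattern quoted above in Python. The contents of crawl.mjs aren't shown in the thread, so the JavaScript sample here is made up, and using `re.MULTILINE` to get per-line matching is my assumption about how the rule is applied.

```python
import re

# Pattern as quoted in the comment; re.MULTILINE makes ^ and $ match per line.
JAVA_IMPORT = re.compile(r"^import.*;$", re.MULTILINE)

# Hypothetical ES module source and a minimal Java source for comparison.
js_module = 'import { crawl } from "./lib.mjs";\nawait crawl();\n'
java_source = "import java.util.List;\n\nclass Main {}\n"

print(bool(JAVA_IMPORT.search(js_module)))
print(bool(JAVA_IMPORT.search(java_source)))
```

An ES module import that ends in a semicolon is indistinguishable from a Java import as far as this pattern is concerned, so both samples match.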


I just think of every novice who says Java when referring to JavaScript. Even software does it! Heh



