kbumsik · 2026-05-26T14:30:14 1779805814

> performance often degrades under different chat templates, long-context inputs, or out-of-distribution system prompts.

I heard that speculative decoding doesn't affect performance (I meant accuracy). Am I wrong about it?

ketchup32613 · 2026-05-26T14:41:48 1779806508

You're not wrong about that. Speculative decoding does not affect the quality of tokens generated, as each token has to be verified by the parent model before it is output.

Each token generated by the draft model has to be verified by the parent/original model, and if the acceptance rate falls, the speedup from speculative decoding shrinks or disappears. This acceptance rate, and more directly the speedup from draft models, is what "performance" refers to in the article.
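To make the "verified by the parent model" point concrete, here is a minimal sketch of greedy speculative decoding. Everything in it is a toy assumption for illustration: `target_model` and `make_draft` are hypothetical stand-ins for real models, and a real system verifies all drafted positions in one batched forward pass rather than in a Python loop. With sampling instead of greedy decoding, real implementations use a rejection-sampling rule to keep the output distribution unchanged; that part is not shown here.

```python
from typing import Callable, List, Tuple

Model = Callable[[List[int]], int]  # context -> next token (greedy choice)


def target_model(ctx: List[int]) -> int:
    """Toy stand-in for the large model (hypothetical, deterministic)."""
    return (sum(ctx) * 31 + len(ctx)) % 10


def make_draft(error_every: int) -> Model:
    """Toy draft model: copies the target, but is wrong whenever the
    context length is a multiple of `error_every` (0 means never wrong)."""
    def draft(ctx: List[int]) -> int:
        tok = target_model(ctx)
        if error_every and len(ctx) % error_every == 0:
            return (tok + 1) % 10
        return tok
    return draft


def plain_decode(prompt: List[int], n: int) -> List[int]:
    """Ordinary greedy decoding: one target pass per token."""
    out = list(prompt)
    for _ in range(n):
        out.append(target_model(out))
    return out[len(prompt):]


def speculative_decode(prompt: List[int], n: int, draft: Model,
                       k: int = 4) -> Tuple[List[int], int]:
    """Greedy speculative decoding. Returns (tokens, target passes used)."""
    out = list(prompt)
    target_passes = 0
    while len(out) - len(prompt) < n:
        # 1) The draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        ctx = list(out)
        for _ in range(k):
            tok = draft(ctx)
            proposal.append(tok)
            ctx.append(tok)
        # 2) One target pass checks every drafted position.
        target_passes += 1
        for tok in proposal:
            expected = target_model(out)
            out.append(expected)       # the emitted token is always the target's
            if tok != expected:
                break                  # first mismatch: discard the rest of the draft
        else:
            out.append(target_model(out))  # all accepted: one bonus token
    return out[len(prompt):len(prompt) + n], target_passes


if __name__ == "__main__":
    prompt = [1, 2, 3]
    for err in (0, 3, 1):  # perfect, sometimes wrong, always wrong
        tokens, passes = speculative_decode(prompt, 20, make_draft(err))
        print(err, tokens == plain_decode(prompt, 20), passes)
```

In this toy setup the output matches plain decoding no matter how bad the draft is; only the number of target passes changes, which is the "performance" being discussed.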

kbumsik · 2026-05-26T14:51:17 1779807077

So the draft model's performance is directly linked to the overall speed. Thank you for the explanation!

By the way, can it be slower than without speculative decoding in worst case then?

daemonologist · 2026-05-26T15:48:50 1779810530

> can it be slower than without speculative decoding in worst case then?

Yes - running the draft model costs compute and memory bandwidth, and running the drafted futures through the main model costs compute. If the draft model were really inaccurate or you're already compute-limited (usually: running large batches) you would expect some slowdown.
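A rough back-of-the-envelope model of that trade-off, as a sketch only. It assumes each drafted token is accepted independently with probability `alpha` (the usual simplification in speculative decoding analyses), and the cost parameters `draft_cost` and `verify_cost` are hypothetical knobs, not measurements from any real system.

```python
def expected_tokens_per_pass(alpha: float, k: int) -> float:
    """Expected tokens emitted per target pass when k tokens are drafted and
    each is accepted independently with probability alpha (assumption)."""
    if alpha >= 1.0:
        return float(k + 1)
    return (1.0 - alpha ** (k + 1)) / (1.0 - alpha)


def estimated_speedup(alpha: float, k: int, draft_cost: float,
                      verify_cost: float = 1.0) -> float:
    """Estimated speedup over plain decoding.

    draft_cost:  cost of one draft step relative to one ordinary target step.
    verify_cost: cost of verifying the drafted positions in one target pass,
                 relative to one ordinary target step. Roughly 1 when
                 memory-bandwidth-bound, larger when already compute-bound
                 (e.g. large batches). Both are illustrative assumptions.
    """
    cost_per_pass = k * draft_cost + verify_cost
    return expected_tokens_per_pass(alpha, k) / cost_per_pass


if __name__ == "__main__":
    # Accurate, cheap draft: a clear win.
    print(estimated_speedup(alpha=0.8, k=4, draft_cost=0.05))
    # Inaccurate, relatively expensive draft: below 1.0, i.e. a slowdown.
    print(estimated_speedup(alpha=0.2, k=4, draft_cost=0.2))
    # Good draft, but verification is expensive because compute is saturated.
    print(estimated_speedup(alpha=0.8, k=4, draft_cost=0.05, verify_cost=3.0))
```

Values below 1.0 correspond to the slowdown case: either the acceptance rate is too low to pay for the drafting, or verification itself is no longer close to free.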

In practice, for single-user (non-batched) inference with a working configuration, you pretty much always get some speedup. For non-coding tasks I've seen it be nearly a wash for some people, in which case you might want to avoid it due to the extra memory usage (you'd rather use that memory to run a bigger quant/model, even at a slightly lower speed).

kbumsik · 2026-03-23T22:45:54 1774305954

Right? Even enterprise routers, e.g. Cisco, are not produced in the USA.

kbumsik · 2025-11-29T06:32:05 1764397925

We switched from Celery to Temporal. Temporal is such a great piece of distributed-systems software.

kbumsik · 2025-10-20T08:51:39 1760950299

> My impression is that OCR is basically solved at this point.

Not really, in practice, in my experience. In particular, they still struggle with table format detection.

coulix · 2025-10-20T08:54:55 1760950495

This.

Any complex table with parent cells spanning multiple rows or columns still has low accuracy.

Try the reverse: take a picture of a complex table and ask ChatGPT 5, Claude Opus 3.1, or Gemini Pro 2.5 to produce an HTML table.

They will fail.

bobsmooth · 2025-10-20T08:59:00 1760950740

Maybe I misunderstood the assignment but it seems to work for me.

https://chatgpt.com/share/68f5f9ba-d448-8005-86d2-c3fbae028b...

Edit: Just caught a mistake; it transcribed one of the prices incorrectly.

kbumsik · 2025-10-20T09:13:37 1760951617

Right, I wouldn't hand full table detection to a VLM, because they tend to make mistakes with the numbers in tables...

pietz · 2025-10-20T09:40:12 1760953212

Maybe my imagination is limited or our documents aren't complex enough, but are we talking about realistic written documents? I'm sure you can take a screenshot of a very complex spreadsheet and it fails, but in that case you already have the data in structured form anyway, no?

kbumsik · 2025-10-20T10:39:02 1760956742

> realistic written documents?

Just get a DEF 14A (Annual meeting) filing of a company from SEC EDGAR.

I have seen so many mistakes when looking at the result closely.

Here is a DEF 14A filing from Salesforce. You can print it to a PDF and then try converting it.

https://www.sec.gov/Archives/edgar/data/1108524/000110852425...

grosswait · 2025-10-20T12:34:18 1760963658

Historical filings are still a problem, but hasn’t the SEC required filing in an XML format since the end of 2024?

richardlblair · 2025-10-20T13:28:58 1760966938

It's not really about SEC filings, though. While we folks on HN would never think of hard copies of invoices, much of the world still operates this way.

As mentioned above, I have about 200 construction invoices. They are all formatted in a way that doesn't make sense. Most fail both OCR and OpenAI.

KoolKat23 · 2025-10-20T16:13:42 1760976822

OpenAI has unusably low image DPI. Try Gemini.

daemonologist · 2025-10-20T14:17:08 1760969828

Now if someone mails or faxes you that spreadsheet? You're screwed.

Spreadsheets are not the biggest problem though, as they have a reliable 2-dimensional grid - at worst some cells will be combined. The form layouts and n-dimensional table structures you can find on medical and insurance documents are truly unhinged. I've seen documents that I struggled to interpret.

KoolKat23 · 2025-10-20T16:03:42 1760976222

To be fair, this is problematic for humans too. My old insurer outright rejected things like that stating it's not legible.

(I imagine it also had the benefit of reducing fraud/errors).

In this day and age, it's probably easier/better to change the process around that as there's little excuse for such shit quality input. I understand this isn't always possible though.

richardlblair · 2025-10-20T13:16:49 1760966209

I had mentioned this when the new QWEN model dropped - I have a stack of construction invoices that fail through both OCR and OpenAI.

It's a hard (and very interesting) problem space.

kbumsik · 2025-09-27T00:08:43 1758931723

In case you don't know, Auth.js is not a frontend-only framework. It uses a backend server to make it secure.

So it basically has no difference from the alternatives you mentioned.

pmdr · 2025-09-27T07:25:52 1758957952

> make it secure

It's convenient, I'll give them that. Secure? https://projectdiscovery.io/blog/nextjs-middleware-authoriza...

kbumsik · 2025-09-27T09:06:38 1758963998

Well, that has nothing to do with JS itself though.

kbumsik · 2025-09-19T01:44:10 1758246250

It is really easy to tell the difference btw. You will always see "4" or "7" in the middle.

kbumsik · 2025-07-11T20:36:05 1752266165

How does it compare to frp, one of the most popular open-source Cloudflare Tunnel alternatives?

https://github.com/fatedier/frp

oschwartz10612 · 2025-07-11T23:52:02 1752277922

I think with the web UI it is a little more user-friendly, but I'm not super familiar with frp. I think we might have a little more authentication control on top of the tunnel for web traffic as well.

kbumsik · on May 28, 2025

Artificial Analysis is the only stable source. Don't look at others like HF Leaderboard.

https://artificialanalysis.ai/

kbumsik · on April 16, 2025

Because gov infra also relies on CVE?

kbumsik · on April 16, 2025

> it probably is an anomaly that this was government funded

Companies can definitely fund it. But to be fair the gov, including NIST, also relies on CVE.