How can developers reliably generate, control, and inspect large volumes of realistic dialogue data without building a custom simulation stack every time? Meet SDialog, an open-source Python toolkit ...
Tencent Hunyuan has released HunyuanOCR, a 1B-parameter vision-language model specialized for OCR and document understanding. The model is built on ...
Optical character recognition has moved from plain text extraction to document intelligence. Modern systems must read scanned and digital PDFs in one pass, preserve layout, detect tables, extract key ...
How can a small model learn to solve tasks it currently fails at, without rote imitation or relying on a correct rollout? A team of researchers from Google Cloud AI Research and UCLA has released a ...
How do you design a single model that can listen, see, read, and respond in real time across text, image, video, and audio without losing efficiency? Meituan’s LongCat team has released LongCat ...
Most agent frameworks still run a predefined Reason, Act, Observe loop, so the agent can only use the tools that are injected in the prompt. This works for small tasks, but it fails when the toolset ...