vG49v7
No.3262
>30bn parameters, 32k context length
>105bn parameters, 128k context length
>Great performance on multiple benchmarks
>Can browse the web, multimodal
>Bunch of small models - which can run locally, btfo'ing bigger models - in speech to text, ASR, OCR, etc.
>The 30bn and 105bn models are going to be open weights.
>The platform is going to be live tomorrow or soon after
All these models perform even better in native tongues. I kneeell....their voices are among the most natural I have ever heard.


yaKeP0
No.3264
>>3262(OP)
sarvam is the shit. fucking brilliant and not too costly either. love it! sarvam khalu idam brahma! (all this is indeed Brahman)


vG49v7
No.3265
>>3264
It's really a complete W in all terms. I didn't expect it, wasn't even hoping, but they really did it.

+++1QA
No.3267
>>3265
i have used several models like ai4bharat and openai whisper for my personal transcription project and all i got from them was garbage devanagari text. i thought i was doing the chunking wrong. but sarvam gave me like 60% accurate results on the first try. wth
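For context, a minimal sketch of the kind of chunk-and-transcribe loop anon describes, assuming the openai-whisper package plus pydub for splitting (both need ffmpeg installed); the file names, chunk length, and language setting are illustrative guesses, not anon's actual setup:

```
# Chunk an audio file and transcribe each piece with openai-whisper.
# pip install openai-whisper pydub
import whisper
from pydub import AudioSegment

model = whisper.load_model("small")          # downloads weights on first run
audio = AudioSegment.from_file("speech.mp3") # placeholder input file

chunk_ms = 30_000                            # 30 s chunks, Whisper's native window
texts = []
for i, start in enumerate(range(0, len(audio), chunk_ms)):
    piece = audio[start:start + chunk_ms]
    piece.export(f"chunk_{i}.wav", format="wav")
    # language="hi" forces Hindi decoding so the output stays in Devanagari
    result = model.transcribe(f"chunk_{i}.wav", language="hi")
    texts.append(result["text"].strip())

print(" ".join(texts))
```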


vG49v7
No.3268
>>3267
Thing is, the previous models were nothing; going by the models themselves, this is bigger than the DeepSeek R1 moment. I guess it will take at least 2 days for the news to hit the mainstream.
I can run the 30bn model locally. It's funny how it is beating the likes of 4o in some benchmarks. All of Scam Altman's gloating feels funny now.
We will see more attempts by foreign labs to poach these guys and cause issues; hopefully we persist.
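A hypothetical sketch of what running the 30bn model locally could look like via Hugging Face transformers once the weights drop; the repo id "sarvamai/sarvam-30b" is a guess, not a confirmed name:

```
# pip install transformers accelerate (accelerate enables device_map="auto")
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "sarvamai/sarvam-30b"   # placeholder repo id, not confirmed
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    device_map="auto",         # spread layers across available GPUs/CPU
    torch_dtype="auto",        # use the checkpoint's native precision
)

inputs = tok("नमस्ते, आप कैसे हैं?", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=100)
print(tok.decode(out[0], skip_special_tokens=True))
```

Since only ~1B parameters are active per token in an MoE of this shape, inference should be much lighter than a dense 30B, though the full weights still have to fit in memory.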

1JKj36
No.3270
>>3268
oh yeah. just like they poached clawdbot guy. i can see that happening


rRfQ/N
No.3291
>>3262(OP)
>All these models perform even better in native tongues. I kneeell....their voices are among the most natural I have ever heard.
Hmmm, do we finally get Jeets who will do the bare minimum of excelling in native script and what not?


vG49v7
No.3292
>>3291
>Hmmm, do we finally get Jeets who will do the bare minimum of excelling in native script and what not?
sorry didn't get you
pwivAn
No.3303
>>3262(OP)
54 million as the total investment. The main advantage is cleaner input data as opposed to compute maxxing.
Best part is it's more useful to Indians than your usual LLM.
nN81mJ
No.3304
>>3262(OP)
Just another psyop to save face after the Galgotia disaster

vG49v7
No.3307
>Yesterday, we released Sarvam 30B and Sarvam 105B. Built from scratch, both models leverage a Mixture of Experts (MoE) architecture, delivering stronger performance at scale while using compute more efficiently.
>Sarvam 30B activates just 1B non-embedding parameters per token, so it runs far more efficiently while maintaining strong capability.
>The model was pretrained on 16 trillion tokens spanning code, web, multilingual, and mathematical data, and supports a 32K context window that enables long-running agentic interactions.
>It is ideal for real-time applications like conversational AI and high-throughput workflows where latency matters.
>The Sarvam 105B model follows the same MoE design, activating 9B parameters per token to combine large-scale capability with efficient execution.
>With a 128K context window, it is built for more demanding tasks including complex reasoning, agentic task completion, tool use, coding, mathematics, and science.
>This makes it well suited for enterprise and population-scale deployments that require deeper reasoning and structured problem solving.
>Both models will be released as open weights on Hugging Face soon. API access and dashboard support to follow!
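To make the "activates just 1B of 30B parameters per token" claim concrete, here is a toy top-k routing sketch in numpy; it illustrates the general MoE idea only, not Sarvam's actual architecture, and all sizes are made up:

```
# Toy top-k MoE layer: each token runs through just k of the E experts
# picked by a learned router, so only a fraction of parameters are active.
import numpy as np

rng = np.random.default_rng(0)
d, E, k = 64, 8, 2                       # hidden dim, experts, experts per token

W_router = rng.standard_normal((d, E))
experts = [rng.standard_normal((d, d)) for _ in range(E)]  # one matrix each

def moe_forward(x):
    logits = x @ W_router                # route: score every expert
    top = np.argsort(logits)[-k:]        # keep only the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()             # softmax over the chosen k
    # only k/E of the expert parameters ever touch this token
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

y = moe_forward(rng.standard_normal(d))
print(y.shape)                           # (64,) - cost of k dense layers, not E
```

Per-token compute scales with k rather than E, which is how a 30B-parameter model can run at roughly the inference cost of a ~1B dense model.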