Rio de Janeiro's city government model Rio3.5 beats Qwen3.7 in recent benchmarks
Posted by lucasfcosta 3 days ago
Comments
Comment by VoidWhisperer 3 days ago
It seems they didn't train a novel model: they mixed two existing models and then instructed the result to say it was 'Rio, trained by Rio AI Labs'.
Comment by w4yai 3 days ago
https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B/comm...
Comment by daquisu 3 days ago
Comment by giancarlostoro 2 days ago
Comment by danieldrehmer 2 days ago
Comment by scotty79 2 days ago
Comment by giancarlostoro 2 days ago
Comment by urbnspacecowboy 2 days ago
Comment by pixel_popping 1 day ago
Comment by mettamage 3 days ago
Correct me if I'm wrong, but reading through the comments in the thread, this seems to be post-training/fine-tuning.
Comment by oceansky 3 days ago
Comment by hedgehog 3 days ago
Comment by rafaquintanilha 3 days ago
Comment by hedgehog 3 days ago
Comment by Kelteseth 3 days ago
Comment by drnick1 3 days ago
Comment by adrian_b 3 days ago
Model Card:
Comment by arjie 3 days ago
Comment by kruxigt 3 days ago
Comment by Aurornis 3 days ago
As for the benchmarks: if you spend any time playing with fine-tunes of published models, you know that benchmarks are gamed so much that they're a useless indicator of performance for models from small teams. It's too easy to fine-tune a model to perform well on the benchmarks, release it, put a line on your resume saying you released a model that beat the major labs on benchmarks, and then try to use that to jump into a new job. The temptation is high.
There are a lot of fringe models and fine-tunes that claim to have better performance on some benchmark. Then you try to use them and find they're often worse at general tasks than the base model.
I would wait and see if these results hold across other benchmarks. It's cool that the city is doing something with AI, but this is a case where extraordinary claims require extraordinary evidence. I doubt a small, previously unknown team has unlocked something secret that the team who made Qwen couldn't figure out. It's more likely it was fine-tuned for a specific outcome (possibly these benchmarks) and performance in other areas was reduced as a consequence.
Comment by marcosdumay 3 days ago
Looks like it's a government-owned IT services company.
Most likely, they saw a business opportunity in selling it to other cities.
Comment by embedding-shape 3 days ago
Comment by HeliumHydride 3 days ago
Comment by dizhn 2 days ago
Comment by betimsl 2 days ago
Rio3.5 with Qwen-compatible tool calling, we need that :)
Comment by pelasaco 2 days ago
Comment by mrandish 3 days ago
Because... the lack of a good open-weight LLM is a pressing need high on the municipal priorities list for Rio de Janeiro citizens?
Comment by true_religion 3 days ago
Or is it that it’s a city doing this?
Now, Brazil does know how to boondoggle its finances for a prestigious cause with little return (e.g. the Olympic Games), but this is a far smaller cost, more akin to a city setting up a tech accelerator or running a media campaign about how important STEM is.
Comment by senorrib 2 days ago
Comment by xbar 3 days ago
Comment by cuzezzzbbfofai 3 days ago
Comment by atoav 3 days ago
But a specific type of person appears to labour under the illusion that somehow we can get by without all of us collectively steering our direction and choosing people who do what needs to be done without commercial interest. Their idea is that instead of choosing people to do it, we just make them compete over who can squeeze the most profit out of dealing with a problem, and "somehow" that leads to a better result. When you press them for details on that part of the mechanism, you will usually get crickets.
Comment by cassianoleal 3 days ago
Interestingly, the people who try to separate themselves from "the government" also seem to be the kind of people who want to "spread our model of democracy to the rest of the world".
How they can even reconcile being such a great democracy that the world needs to ~copy~ be force-fed it with having an adversarial government, I don't know. The cognitive dissonance is so great that it's hard to fathom.
Comment by hgoel 3 days ago
Comment by latency-guy2 3 days ago
You do not agree with me. You can't claim to represent my interests or my will if you are against them.
Comment by atoav 2 days ago
Explain how it is wrong and why it would be. If it is always wrong, it follows that it has to be wrong here too. The answer is that the meaning of "we all" is context-dependent, and that friend of yours who argues that "we all" somehow includes everyone in the whole city is an oddball who doesn't pick up on the context in which the words were said.
We can all go around making each other's day worse with deliberate pedantry by ignoring the context of words, but that is basically just a waste of human energy. If you disagree with the fundamental point I made, argue against it on the merits of the idea instead of arguing semantics.
Comment by naasking 3 days ago
No, we should substitute "unaccountable bureaucrats". The people who enter and leave power through elections are not the source of the daily frustrations people have with government; it's the rest.
Comment by atoav 3 days ago
Comment by naasking 2 days ago
Comment by airstrike 3 days ago
Comment by naasking 2 days ago
Comment by blahblaher 3 days ago
Comment by hmokiguess 3 days ago
Comment by ramon156 3 days ago
Information is power, dick measurements are not.
Comment by itsthecourier 3 days ago
Comment by reed1234 3 days ago