Rio de Janeiro's "homegrown" LLM appears to be a merge of an existing model
Posted by unrvl22 2 days ago
Comments
Comment by rafaquintanilha 2 days ago
1. They claim the official model is based on Qwen 397B. It's likely they didn't disclose Nex Pro at all because Nex itself is based on the same base model (not saying they shouldn't).
2. The improvement would come from merging the weights PLUS on-policy distillation. The confusion is that the uploaded model didn't have the distillation at all.
3. It's important to note they didn't advertise the model beyond posting it on Reddit 2 days ago. It went viral organically, over the weekend, and during Brazil's World Cup debut (Brazilians will understand). Of course the mayor of Rio took the opportunity to capitalize on the free coverage, but that wasn't done in conjunction with the researchers.
4. I don't see why they would disclose Qwen 397B as base and mention the SwiReasoning paper but not mention Nex if all they did was to merge both models.
5. In any case, what they are claiming is easily verifiable once (if) they upload the right model.
Comment by throwa356262 2 days ago
Comment by xiphias2 2 days ago
Comment by jwitthuhn 2 days ago
Comment by xiphias2 2 days ago
Anyway, the SwiReasoning paper looks interesting, and doing post-training to optimize for it looks interesting as well.
Comment by 00index 1 day ago
Comment by matheusmoreira 2 days ago
Comment by airstrike 2 days ago
Rio has a strong engineering talent pool, along with many other major capitals in Brazil
Comment by matheusmoreira 2 days ago
What Brazil doesn't have is a history of properly rewarding talent, which often causes it to migrate elsewhere. So it's definitely surprising when any sort of technological development happens in Brazil: it implies someone who stayed managed to get something done, most likely for much less than what that something is actually worth, while also being crushed by extremely high taxes that essentially double the cost of computer hardware.
Comment by red-iron-pine 1 day ago
I think people are missing the last few words -- cost of computing hardware
When I used to do ISP work I did a lot for LATAM. The joke was that you'd get better bandwidth for Brazil by routing out of the country and through Miami than by going across the country. The reason? Crazy high tariffs on hardware.
No reason to base anything locally, and if you're not basing it locally then there isn't really much reason to stick around, either. Go to other hot markets like Zona America, Austin, CDMX, Miami, Los Angeles, etc. and make the big $$$.
I worked with 2 Brazilian engineers who were in country (and currently work with a 3rd, based in Montreal) and they were very good, but all said they had to get out of the country to lock in the serious engineering roles.
Comment by rbanffy 2 days ago
I always find this funny. Brazilian taxes are nowhere near what I would call “high”. I pay about twice as much out of my compensation as I would pay in Brazil, and that would be as if I did zero tax optimisation back then.
Comment by fabioz 2 days ago
Compared to many countries Brazil doesn't have such high taxes (I'd say that if you work remotely for a company outside of Brazil, you'll probably have much lower taxes compared to almost any other country -- working locally the difference isn't as big, but you have higher taxes in many other places).
What it really lacks is access to capital (which is the real "mojo" of the US compared to the rest of the world).
Comment by iterateoften 1 day ago
Incorporating and getting a functional business entity in Brazil is harder. In the USA I can literally do it in 5 minutes online, bank account included. In Brazil they take out microscopes to verify that your signature on the paperwork matches.
And in the USA, if you have one bad employee, you can just fire them at any time. In Brazil, for better or for worse, it is nowhere near as easy. That is obviously better for employees, but businesses don’t like it because you can get stuck with an employee dragging everyone down unless you pay them a year’s salary, etc.
Comment by rbanffy 16 hours ago
It’s still something you can do in less than a month. That’s nothing compared to the life of a functional enterprise.
For you to have to pay “years of salary” the employee must have been with you for a very long time - as in decades. Considering most companies don’t live to 2 years, the odds of it happening to you are very slim. Labor laws exist to protect the weak part against the possible abuses of the company. You must agree the harm a company can do to its employees is far greater than what an individual employee can do to a company. And this is why unions exist: because collectively is the only way the employees have teeth.
Comment by persedes 1 day ago
Comment by dlisboa 1 day ago
Comment by persedes 1 day ago
Comment by matheusmoreira 1 day ago
Comment by rglullis 2 days ago
As a business owner: not so bad if you are freelancing or are just a few business partners providing some type of service, but terrible the moment you start considering employing other people.
Comment by rbanffy 2 days ago
Have you seen the public services of countries with lower taxes? Their public hospitals?
> but terrible the moment you start considering employing other people.
Employing people isn't cheap anywhere (except, perhaps, in the US, where labour rights are kind of nonexistent)
Comment by rglullis 2 days ago
A quick visit to the dermatologist to check some tiny bumps that showed up on my forehead: 60€, out of pocket, because the insurer doesn't cover it.
Comment by rbanffy 1 day ago
Comment by rglullis 1 day ago
All in all, my point was only that the amount of taxes that people pay and quality of services are not necessarily related. Germany has high taxes and expensive-but-adequate healthcare. Greece has high taxes and expensive-and-inadequate healthcare. Switzerland has low taxes and universal/cheap healthcare (max. $5000/year deductible, max charge per hospitalization of $700).
Comment by rbanffy 1 day ago
That's how public health works. It's the same as mortgage insurance in Brazil (where I come from), which is mandatory, and, since it's mandatory, it doesn't consider actuarial risk.
Comment by rglullis 1 day ago
Comment by rbanffy 16 hours ago
BTW, “right” and “wrong” are human constructs, so you are totally free to organise a society around the survival of the richest. It’s just not a society you’ll like living in.
Comment by rglullis 4 hours ago
There are other ways to properly fund universal healthcare that do not involve creating artificial taxes:
- Increase the actual income taxes.
- Policies that mandate the allocation of revenue from vice and environmental taxes (alcohol, smoking, gambling, fossil fuels) to the health system.
- Social security reform. Pensions should not be an investment, only insurance. Turn pensions into a UBI program that has a hard cap.
- Supplemental and voluntary insurance plans that work on top of the existing public network. This might be risky as it can introduce a dual-class system, but (a) it's already the reality in most countries anyway and (b) it could be avoided if the system is built around a "caffè sospeso" [0] mechanism (priority is given to you when you use the service, but redirected to non-paying members in the months when you don't).
Comment by throw-the-towel 1 day ago
Comment by rbanffy 16 hours ago
If you go public, it’s free, but you’ll be in line based on the risk to your life, so it might be a long while. You really don’t want to check in to a public hospital and get an MRI right away because that means whatever you have is really, really bad.
Comment by matheusmoreira 1 day ago
The result is a nearly 100% tax on computers and consumer electronics.
One for you, one for the government.
And it's getting worse. Tariffs on computer hardware were raised only a few months ago.
Comment by drdexebtjl 1 day ago
The tariffs for commercial importations are much lower and depend on the part. For SSDs, for example, II is around 10%. With other fees and ICMS, you're looking at around +60% total. Still high, but not nearly as high.
But large businesses would really prefer that you continue to believe they pay +88% just like you. That way they get to point at the government while keeping their fat margins.
Comment by matheusmoreira 10 hours ago
Nope. ICMS is calculated "por dentro". The total is base / (1 - rate), not base * (1 + rate).
Product = 100
II = 60
= 160
ICMS total = 160 / (1 - 0.18)
= 195.12
ICMS = 195.12 - 160
= 35.12
That's +95.12% tax. Also known as nearly 100% tax. One for me, one for the government.
> The tariffs for commercial importations are much lower and depend on the part.
Fair point, but they're sure as hell not "much" lower. As you yourself noted, they are still high, and obviously those companies are going to want to add their own margins on top.
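For anyone who wants to check the "por dentro" arithmetic in this comment, here is a minimal sketch. It uses only the figures quoted above (product price 100, import tax of 60, ICMS rate of 18%); the function name is made up for illustration, and real import bills include other fees that are not modeled here.

```python
def icms_por_dentro(base: float, rate: float) -> float:
    """ICMS charged "por dentro": the tax is included in its own base.

    The taxed total is base / (1 - rate), so the tax itself is
    that total minus the base.
    """
    gross = base / (1 - rate)
    return gross - base


product = 100.0      # product price from the comment
import_tax = 60.0    # II of 60% on the product, as in the comment
icms_rate = 0.18     # ICMS rate used in the comment

subtotal = product + import_tax              # 160
icms = icms_por_dentro(subtotal, icms_rate)  # tax computed on the inside
total = subtotal + icms
markup = (total - product) / product         # total tax relative to product price

# For contrast: the naive "por fora" calculation (base * rate)
icms_por_fora = subtotal * icms_rate

print(f"ICMS por dentro: {icms:.2f}")
print(f"ICMS por fora:   {icms_por_fora:.2f}")
print(f"Total paid:      {total:.2f}")
print(f"Tax over price:  {markup:.2%}")
```

The gap between the two ICMS figures is the point of the comment: an 18% rate applied "por dentro" behaves like roughly a 22% rate applied on top of the pre-tax amount.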
Comment by rbanffy 1 day ago
And, in the meantime, they help push for more "grift-friendly" politicians. For them, it's a win-win situation.
Comment by rbanffy 1 day ago
Apart from that, this is something that affects the HN crowd and almost nobody else.
Comment by matheusmoreira 12 hours ago
Comment by troauei 7 hours ago
Brazil relies heavily on indirect taxation, not income tax.
The average Brazilian effectively pays about *41.1%* of their gross income in taxes when all major taxes are included. (https://ibpt.org.br/brasileiro-trabalha-150-dias-por-ano-ape...)
They break it down as: roughly 15.2% from income-related taxes, 3.1% from property taxes, and *22.9% from consumption taxes*.
---
And anyway, the real problem is that the burden of tax is way way higher for a brazilian than someone in the global north.
As a general rule, in Brazil only people who do qualified work can reach the (converted) 1000 USD threshold. People who spent years in university or trade school, and are either very good at their jobs or graduated in prestigious professions such as engineering, law or medicine.
Even if they were to pay no taxes, most 40-hour week professional programmers here would still earn, at the end of the day, less than a high-school diploma 20-hour worker in US. Let that sink in.
Now take into account that on average, out of those $1000, $400 becomes taxes: In practice, there are lots of qualified workers here who don't even make $8k/year after taxes...
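The figures in this comment can be checked with a few lines. This is only a sketch of the comment's own arithmetic: it assumes 12 monthly paychecks and takes the rounded "$400 out of $1000" at face value.

```python
# Shares of gross income quoted from the IBPT breakdown above (percent)
income_taxes = 15.2
property_taxes = 3.1
consumption_taxes = 22.9
component_sum = income_taxes + property_taxes + consumption_taxes

# The comment's rounded example: $1000 gross per month, $400 of it in taxes
monthly_gross_usd = 1000
monthly_taxes_usd = 400
monthly_net_usd = monthly_gross_usd - monthly_taxes_usd

# Assumption: 12 paychecks per year
annual_net_usd = monthly_net_usd * 12

print(f"Sum of quoted components: {component_sum:.1f}%")
print(f"Net per month: ${monthly_net_usd}")
print(f"Net per year:  ${annual_net_usd}")
```

The components add up to slightly more than the 41.1% headline figure, presumably because each share was rounded separately. The yearly net lands under the $8k mentioned in the comment.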
And no, the lower living cost does NOT offset it. Imported/technology goods are disproportionately expensive relative to income since we receive in BRL but still pay the dollar price, if not higher, due to said consumption taxes.
It's nigh-impossible to have true disposable income that your average Joe or plain Jane can dedicate to homebrewing their personal projects in Brazil. And when it happens, it tends to be in software form, since software is still relatively cheap to make.
People whose income is in the top 25% bracket here make less in dollars than the US's bottom 25% bracket, simple as that.
This is, of course also true of many many countries, which is why you usually don't hear about the new cool tech from Nigeria, Brazil, Bangladesh, Congo, etc. The people who are qualified enough tend to leave the country for better conditions. Of course, I'm not saying it's impossible, but people get surprised, just as you can see in this thread.
Comment by jdahlin 2 days ago
Comment by matheusmoreira 1 day ago
The result is a nearly 100% tax on computers and consumer electronics. One for you, one for the government.
That 6% figure is just the Simples Nacional rate for micro-businesses making less than 35kUSD/year. The actual income tax tops out at 27.5% at middle class thresholds. On top of that Brazil stacks social security tax, payroll taxes and yet more taxes embedded in every single purchase. If you calculate all of this you can figure out that something like up to 70% of a brazilian's income can flow to the government.
You say swedish companies pay 70% taxes. Well, swedish citizens get excellent services and a generally functioning country in return. Brazilian citizens pay 70% taxes and they get... Brazil.
Comment by drdexebtjl 1 day ago
I'm not doing anything creative accounting-wise, I just max out my contributions to retirement accounts (PGBL) and get the correct tax deductions for all medical and education expenses.
We do have high import tariffs for individuals, and especially for consumer goods, as it's been pointed out in a different comment.
This does make it a very expensive country indeed if you want to live your life worshiping consumerism. But if you don't, you'll find that individuals don't really pay that much compared to other countries.
Comment by matheusmoreira 1 day ago
It's your comment that's misleading. I was trying to account for the numberless taxes that exist and get applied to every single transaction. You zeroed in on income taxes then stacked some deductions on top.
> tax deductions
Discounting deductions from the nominal tax rate doesn't change the fact those taxes are high, nor does it change the fact you max out your tax bracket at middle class incomes.
Deductions are actually the bare minimum. If you're using them, it means the state failed to provide you with proper education and health services, forcing you to spend money on things that are theoretically your constitutional rights. Not deducting these expenses would be robbery. The fact most brazilians have plenty of deductions at their disposal is only evidence of how absurdly tax inefficient this country is.
These deductions aren't automatic either, you have to spend time and effort accounting for all of this so that you can make the government give back some of the money it took from you. Time is money, so this is just yet another stealthy tax.
Finally, other countries no doubt have deductions too. I know for a fact that the US does, and european countries almost certainly do too. Accounting for these will probably only make Brazil look even worse by comparison.
> This does make it a very expensive country indeed if you want to live your life worshiping consumerism.
What a dismissive comment.
US government just banned Fable for foreign peasants like us. If you want a computer that can properly run LLMs locally, you're going to be forced to shell out money in the 40-100kBRL range. Computers are in the same price range as cars now.
If you think having some degree of sovereignty over our computing is "worshipping consumerism", then I don't know what to say to you.
Europe is currently fighting tooth and nail to develop some technological independence. China is creating Manhattan projects to catch up to the west in semiconductor manufacturing and kick them out of their supply chains. If we keep up these nonsense taxes, AI will be just yet another area where Brazil is half a century behind.
Brazil taxes foreign products in order to "protect local industry", then it taxes the local industry as well, which means pretty much nothing higher up in the value chain gets made here. Brazilian efforts at creating national computer technology date back to the military dictatorship, to the import substitution policies. The same time period that birthed Lua, in fact. What have we been doing since then? Nothing. Don't have our own industries, and we can't really buy the products produced by other nations either. This is why people leave: Brazil combines the worst of both worlds.
Comment by drdexebtjl 1 day ago
You're the one that brought up a comically inflated 70% number as if it were realistic. You can't act as if the nominal rate is the effective rate, then complain when I bring up numbers based on the effective rate.
> If you're using them, it means the state failed to provide you with proper education and health services, forcing you to spend money on things that are theoretically your constitutional rights.
No, it means I'm picky about my doctors. You seem to have ignored the tax-advantaged retirement accounts, though.
> These deductions aren't automatic either, you have to spend time and effort accounting for all of this so that you can make the government give back some of the money it took from you. Time is money, so this is just yet another stealthy tax.
You just need to ask for receipts and put them in a (digital) folder. Then you spend 5 minutes tops _per *year*_ reporting their sums on your tax forms. If that's not enough, most of the numbers are pre-filled for you, you just have to review it. And you can download past receipts from the federal government's website.
> I know for a fact that the US does, and european countries almost certainly do too. Accounting for these will probably only make Brazil look even worse by comparison.
Then do it. Tax legislation is very different across countries and even municipalities. Comparing nominal tax rates is completely meaningless. You need to compare the effective tax rate.
> If you want a computer that can properly run LLMs locally, you're going to be forced to shell out money in the 40-100kBRL range. Computers are in the same price range as cars now.
What part of that is due to an increase in taxes? Hardware prices have skyrocketed around the world due to limited supply. In fact, there's a record high number of computer hardware parts in the most recent list of products exempt of import taxes.
> If we keep up these nonsense taxes, AI will be just yet another area where Brazil is half a century behind.
Our government is doing exactly that. The latest project in discussion in the Senate will give import tax exemptions and export tax exemptions to data center projects that reserve 10% capacity to the national market, invest 2% locally in R&D, and use clean energy. I think these numbers are ridiculously small.
If we had lower import taxes on data center hardware, how else would the government negotiate with data center companies to reserve capacity for our national interests?
Finally, I think it's a bit silly to think that _you and me_ running agentic coding LLMs at home furthers national interests. It does not. It furthers our hobbies. It's not even the kind of hobby that gives you relevant career experience which then goes on to strengthen our industry.
> The same time period that birthed Lua, in fact.
Lua was created in 1993 in a lab doing research for Petrobrás. I happened to graduate from PUC-Rio, so I know this personally: the Computer Science labs are receiving much more funding nowadays than they did in 1993. They're still cranking out excellent research, and, if I may say so myself, excellent alumni as well.
> What have we been doing since then? Nothing.
- Our electronic voting system;
- Pix, the largest and most popular payment network in the world;
- Elixir, LangFlow, Neovim, just to name a few that you probably know about.
Comment by matheusmoreira 10 hours ago
Nothing "comically inflated" about it. That's pretty much the upper bound. It's not just income taxes, we've got property taxes, vehicle taxes, financial taxes, not to mention taxes on consumption, especially fuel, electronics, telecommunications, food, clothing, medicine, you name it. Employment taxes also reduce potential salaries. The brazilian doesn't have any savings. Sum all of this up, it can definitely reach 60% to 70% of income.
> You can't act as if the nominal rate is the effective rate, then complain when I bring up numbers based on the effective rate.
You dismissed entire categories of taxes in one fell swoop then started hedging with some deductions. Come on now.
> No, it means I'm picky about my doctors.
What, SUS not good enough? Of course not.
> You seem to have ignored the tax-advantaged retirement accounts, though.
Because it's tax deferral, not tax deduction. You're still gonna get taxed. You're basically claiming you're saving money by using credit cards. Brazil has no equivalent to the american Roth IRA either.
> You just need to ask for receipts and put them in a (digital) folder.
So... Time and effort. Multiplied by every deductible transaction. Plus the cost of actually learning how to deal with all this nonsense in the first place.
I know how much my time is worth. Even if it were "five minutes tops", it would definitely be a tax. And it absolutely isn't "five minutes tops".
> Then do it.
OK. Data from OECD's Taxing Wages 2026 provides total tax as % of labor cost figures.
Germany 49.3%
France 47.2%
Sweden 40.6%
OECD average 35.1%
Brazil is not in this dataset. Add INSS + employer contributions and Brazil can definitely reach the ~34-39% range, which by itself is already comparable to or exceeds the OECD average. USA is at ~30%.
However, Brazil actually shifts most of its taxes onto consumption. It leads the world in that particular tax burden: ~12.5% of GDP, roughly double the OECD average, ~56% of total revenue. Add that and we're in the ~49-55% range.
Meanwhile, Brazil ranks literally dead last in the world's most-taxed countries on welfare return to citizens. Literally 30th out of 30. It collects taxes at European levels and delivers developing country services. Europe has world class schools, healthcare, infrastructure. Brazil has drug gangs that dominate vast swaths of our territory and which perpetrate more homicides than active war zones. Quite the quality of life.
> Lua was created in 1993 in a lab doing research for Petrobrás.
And Reserva de Mercado was in force from 1977 to 1992. Petrobras had to follow strict rules under these restrictions, which included software acquisition. This directly drove the creation of Lua and Lua's predecessors DEL and SOL.
Lua is a direct child of the policies of the brazilian military dictatorship. Born from the constraints they imposed.
> What part of that is due to an increase in taxes?
The part where half of the price we pay is taxes. On top of the global shortage price increases.
> In fact, there's a record high number of computer hardware parts in the most recent list of products exempt of import taxes.
No. GECEX 852 (2026-02-04) actually raised taxes on 1252 IT/capital goods. What you're citing are the follow up tax exemption requests. Which only exist because of the tax hike. Which only apply to companies with actual industry projects. You and me won't see a dime out of this.
> just to name a few that you probably know about
Brazil does not have a single competitive semiconductor fab. No actual computer technology to speak of. Half a century behind China, to say nothing of the US. Electronic voting machines are commodity hardware, just assemblies of somebody else's computers with Linux and some custom software on top. Pix is the only brazilian achievement on your list that's impressive, and even that is still completely dependent on foreign technology.
> Elixir, LangFlow, Neovim
First you say that individuals doing cool things is irrelevant to the national interest. You just dismiss it all as hobby tier. Then you cite open source projects by individuals as evidence of Brazil's strength? Which is it?
Not the kind of hobby that gives you career experience? Ridiculous.
> I'm doing a (free) operating system (just a hobby, won't be big and professional like gnu)
Linus Torvalds, 1991. Need I say more?
I'm going to say more.
The demoscene influenced the games industry. Minecraft was made by a single guy just messing around. Bought for 2.5 billion. Tim Sweeney built ZZT, which became Epic Games and led to Unreal Engine, and which also drew heavily from the home computer scene, as anyone who ever played Unreal Tournament knows. Wozniak built the Apple I to show off at the Homebrew Computer Club, a hobbyist club. Trillion dollar company. Facebook was once a literal campus dorm room project. Linus wrote git in 10 days to scratch an itch. Fabrice Bellard built ffmpeg. Python was Guido's Christmas break project.
Individual computing capability is the seedbed of national computing capability. Everything is a throwaway toy joke project, until it goes into production and starts making money.
The brazilian government is not furthering the national interest. It's holding it back.
Comment by mathattack 2 days ago
Comment by cscheid 2 days ago
Comment by Aurornis 2 days ago
They merged the base model with another lab’s fine tuned model. The improvements could have come from getting some of the fine tuned weights from the other model.
If they really had a better performing model that they “accidentally” forgot to upload, they could have uploaded the correct file by now.
Comment by croes 2 days ago
Comment by ipieter 2 days ago
I am willing to give them the benefit of the doubt, but we've seen this before: a model gets released that is supposedly state-of-the-art, yet seems to be another repackaged model without any training. Reflection 70B was the most similar example; all they need now is an API that rewrites "Claude" to "Rio".
Comment by s1artibartfast 2 days ago
Comment by motbus3 1 day ago
Comment by matheusmoreira 1 day ago
That's what makes this hilariously sad. Brazil could have done some good work here, but it just didn't. Brazil merged two models on a workstation.
Comment by hintymad 2 days ago
I find it amazing how robust the current deep learning models are. A simple linear combination of every weight did not degrade the performance of the model, but enhanced it.
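For readers unfamiliar with what "a simple linear combination of every weight" means in practice, here is a toy sketch. It is not the procedure used for the Rio model, whose details are not given in this thread; the tensor names, values, and the `alpha` mixing parameter are all made up for illustration, and real merges operate on framework tensors rather than Python lists.

```python
from typing import Dict, List

# A toy "checkpoint": tensor name -> flattened list of weights
Weights = Dict[str, List[float]]


def merge_linear(a: Weights, b: Weights, alpha: float = 0.5) -> Weights:
    """Interpolate two checkpoints element by element.

    alpha = 0 returns model a, alpha = 1 returns model b.
    """
    if a.keys() != b.keys():
        raise ValueError("models must have identical tensor names")
    merged: Weights = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"size mismatch in tensor {name!r}")
        merged[name] = [
            (1 - alpha) * x + alpha * y for x, y in zip(a[name], b[name])
        ]
    return merged


# Hypothetical tiny models that share the same architecture
base = {"layer0.weight": [0.0, 1.0, 2.0], "layer0.bias": [0.5]}
tuned = {"layer0.weight": [1.0, 1.0, 4.0], "layer0.bias": [1.5]}

merged = merge_linear(base, tuned, alpha=0.5)
print(merged)
```

With `alpha=0.5` every merged weight is the plain average of the two inputs, which is why a merge of a base model and its fine-tune tends to land between the two on benchmarks.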
Comment by Aurornis 2 days ago
Enhanced it on a couple benchmarks, supposedly.
The game is to turn knobs until you get a benchmark run that shows an improvement, then ship it. There are a lot of fine tunes and chimera models on HuggingFace that are supposedly better at some specific test, but when you use them for anything else they're usually worse.
This happens with a lot of the models that are modified to remove censorship. They succeed in getting the model to emit previously censored outputs, but the overall output quality decreases.
Comment by andai 2 days ago
https://web.archive.org/web/20260614082641/https://huggingfa...
And the Nex benchmarks for comparison
https://huggingface.co/nex-agi/Nex-N2-Pro
Rio seems to be about halfway between Qwen 3.5 and Nex, as you'd expect?
Comment by monster_truck 2 days ago
Comment by Aurornis 2 days ago
Many of the “uncensored” model providers also do some fine tuning on the models. Some of them target better benchmarks or other measures, but outside of the benchmarks and metrics they’re fine tuned for they are generally noticeably worse than the original model.
Comment by yowlingcat 2 days ago
Comment by weitendorf 2 days ago
I guess I’m looking for a kind of bulk/sticky dropout (which was in fashion way back when I studied DNN in school).
Comment by avadodin 2 days ago
Abliteration, whilst a neologism, implies a surgical ablation of refusal.
Earlier approaches post–trained the model to refuse less and, much like other kinds of fine–tuning, it degraded performance. They were "uncensored".
Abliteration has seen some improvement to this day but it always was close to equivalent performance to the original when compared to those earlier techniques.
Comment by ls612 2 days ago
Comment by tredre3 2 days ago
They're more prone to getting stuck in loops, becoming unresponsive, and hallucinating more (presumably because of the reduced desire to not answer).
I've tried all the popular heretic peddlers, but if you have one that you can vouch for maybe I've simply missed it.
Comment by antonvs 2 days ago
Comment by manquer 2 days ago
i.e., reinforcement learning against a weak reward function: the benchmark is insufficiently complex and not sufficiently representative of the real world.
The "game", i.e. the decision tree, can be modeled as a multi-armed bandit problem: deploying finite resources (compute) toward exploitation/exploration.
The main issue is each training / fine-tune is very expensive so number of chances at the slot so to speak is pretty limited today.
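To make the bandit framing concrete, here is a minimal epsilon-greedy sketch. It is one common bandit strategy chosen for illustration, not something the comment specifies. Each "arm" stands for a hypothetical fine-tuning recipe, a "pull" is one expensive training run scored on a noisy benchmark, and the arm means and budget are invented numbers.

```python
import random
from typing import List, Tuple


def epsilon_greedy(
    true_means: List[float], budget: int, epsilon: float = 0.1, seed: int = 0
) -> Tuple[List[int], List[float]]:
    """Spend `budget` pulls across arms, balancing exploration and exploitation."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how many times each arm was tried
    estimates = [0.0] * n_arms   # running mean reward per arm

    for _ in range(budget):
        # Explore at random while any arm is untried, or with probability epsilon
        if 0 in counts or rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far
            arm = max(range(n_arms), key=lambda i: estimates[i])

        # Noisy benchmark score for this run (assumed Gaussian noise)
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return counts, estimates


# Three hypothetical recipes and only 20 affordable training runs
counts, estimates = epsilon_greedy([0.50, 0.55, 0.70], budget=20)
print(counts, estimates)
```

With so few pulls and this much noise the estimates stay unreliable, which is the comment's point about the limited number of chances at the slot machine.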
Comment by x312 2 days ago
I don't believe this would work on two LLMs that have different pretraining. Even if it did, you would need two LLMs that have the exact same internal activation shapes, dimensions, expert counts, and token vocabulary. Realistically, that would never happen outside of finetunes or academic experiments.
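The compatibility requirement described here can be sketched as a pre-merge check. The config field names below are hypothetical stand-ins for whatever a given model format actually records, and the numbers are invented; passing this check only shows the shapes line up, not that the merged weights will be meaningful.

```python
from typing import Dict, List

# Hypothetical architecture fields that must agree before weights can be averaged
REQUIRED_FIELDS = ["hidden_size", "num_layers", "num_experts", "vocab_size"]


def merge_blockers(cfg_a: Dict[str, int], cfg_b: Dict[str, int]) -> List[str]:
    """Return a description of every field on which the two configs disagree."""
    problems = []
    for field in REQUIRED_FIELDS:
        a, b = cfg_a.get(field), cfg_b.get(field)
        if a != b:
            problems.append(f"{field}: {a} vs {b}")
    return problems


# Invented example configs: a base model, its fine-tune, and an unrelated model
base_cfg = {"hidden_size": 4096, "num_layers": 60, "num_experts": 128, "vocab_size": 150000}
finetune_cfg = dict(base_cfg)  # a fine-tune keeps the architecture unchanged
other_cfg = {"hidden_size": 5120, "num_layers": 64, "num_experts": 64, "vocab_size": 128000}

print(merge_blockers(base_cfg, finetune_cfg))  # no blockers expected
print(merge_blockers(base_cfg, other_cfg))     # every field differs
```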
Comment by oofbey 2 days ago
Comment by hashmap 2 days ago
Comment by woadwarrior01 2 days ago
Comment by kolanos 2 days ago
Comment by itkovian_ 2 days ago
It is not understood why it works so well.
Comment by teravor 2 days ago
Comment by tarruda 2 days ago
Comment by themafia 2 days ago
Which could be a signal that your "performance" was so abysmal in the first place that even randomly applied training methods can't make it _worse_.
Comment by kristjansson 2 days ago
Comment by moritzwarhier 2 days ago
Comment by Davidzheng 2 days ago
Comment by Davidzheng 2 days ago
Comment by randall 2 days ago
Comment by meindnoch 2 days ago
Comment by kristjansson 2 days ago
Comment by antonvs 2 days ago
Comment by unrvl22 2 days ago
Comment by DonsDiscountGas 2 days ago
Comment by bwhitty 2 days ago
Comment by nightpool 2 days ago
Comment by hypercube33 2 days ago
Comment by baobabKoodaa 1 day ago
Comment by Lucasoato 2 days ago
Comment by Aurornis 2 days ago
Then researchers looked at the weights and there is no post training at all.
They are now attributing both models they merged, but their excuse for the lack of post training is to claim they accidentally uploaded the wrong files.
Comment by serial_dev 2 days ago
Comment by evilduck 2 days ago
Look up "Reflection 70B" drama.
Comment by clear-octopus 2 days ago
Comment by vasco 2 days ago
Comment by vitorgrs 2 days ago
Comment by zinodaur 2 days ago
Comment by Aurornis 2 days ago
The dispute is that they released it with claims about having done some post training that improved the outputs. It was discovered that the model was not post trained like they claimed.
The HF page now says it’s a merge of models, which wasn’t there before. They’re trying to claim they accidentally uploaded the wrong model to HF and that they’ll upload the real one soon.
Basically, they thought they could splice two open weights models together and claim their team had accomplished some amazing post training, but they weren’t smart enough to realize that other researchers would discover that there wasn’t any post training.
Comment by moritzwarhier 2 days ago
But it's impossible to form a nuanced opinion when political association has a higher priority than the facts; which, again, don't look flattering for the implementers.
Comment by iknowstuff 2 days ago
Comment by Aurornis 2 days ago
In the early days of Llama there were a lot of experiments like this. There were even some interesting combinations of models where they stacked layers of different models together or even added more layers with interesting results.
But announcing that you spliced two models together isn't very impressive in 2026, so they announced that they had done their own post training and outdid the big labs. They thought nobody would look close enough to notice.
Comment by ninja3925 2 days ago
Comment by Aurornis 2 days ago
Scroll past the first issue to find it. It’s further down.
Comment by jdiff 2 days ago
Comment by valleyer 2 days ago
Comment by jdiff 2 days ago
Comment by s1artibartfast 2 days ago
Comment by carlosjobim 2 days ago
Comment by hootz 2 days ago
Comment by jdiff 2 days ago
> An open AI model trained in Rio with public funding over the last year by @Prefeitura_Rio surpassing all other models.
Comment by jrm4 2 days ago
Comment by philipallstar 2 days ago
Comment by jrm4 2 days ago
You'll have to let me know when that finally happens, because that ain't now.
Comment by philipallstar 2 days ago
Your second one - that's how everything public is paid for. Private individuals pay tax, either through their corporations paying corporation tax or the tax bill on top of their wage bills, which a) drives up prices of the goods and services they offer, or depresses wages, and b) funds all the public sector employees and orgs that don't pay tax (orgs) or don't pay net tax (employees).
Comment by jrm4 1 day ago
The point of my first sentence is; private individuals and small businesses generally pay their fair share. Larger corporations emphatically do not.
Comment by philipallstar 1 day ago
Larger corporations pay loads of tax. Shed loads. They pay all the employee and income tax, as well as corporation tax and their sales generate VAT. Small businesses are the ones most likely to have softer tax burdens due to progressive taxation.
Comment by jrm4 22 hours ago
Comment by philipallstar 22 hours ago
Comment by jrm4 13 hours ago
Comment by carlosjobim 2 days ago
A child caught doing something bad will cry "but my friends also did it!", is that the level of reasoning hackers want to be at?
Comment by blanched 2 days ago
Comment by carlosjobim 2 days ago
Comment by blanched 2 days ago
Comment by carlosjobim 2 days ago
Comment by blanched 2 days ago
Comment by sdevonoes 2 days ago
Comment by dmix 2 days ago
Comment by jrm4 2 days ago
They can both be bad.
Comment by lostlogin 2 days ago
I might be missing something, but I don’t see anyone defending the scams.
Comment by internet2000 2 days ago
Comment by Planktonne 2 days ago
Comment by dofm 2 days ago
(It's not news to anyone who has worked in sales-led businesses that salespeople are prone to believing the claims of other salespeople, I guess).
Comment by selcuka 2 days ago
Lying about your lab's capabilities != Lying about model capability
Exaggerating the capabilities of a new model that you've actually trained in press bulletins can be called marketing. Merging two models and claiming that you trained a new model is plain lazy.
Comment by low_tech_love 2 days ago
Comment by vips7L 2 days ago
Comment by themafia 2 days ago
Comment by functionmouse 2 days ago
Comment by outside2344 2 days ago
Comment by adrian_b 2 days ago
The model card says:
> Post-trained from Qwen 3.5 397B
The model card also says that they use an inference framework based on "SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs" by Shi et al.:
https://arxiv.org/abs/2510.05069
So the sources seem properly attributed.
They only claim that what they did to "Qwen 3.5 397B" has improved the LLM, including, as expected, with "strong performance in Portuguese".
Comment by petu 2 days ago
There (is/was) no attribution to the Nex team (they've released a model based on Qwen 3.5 397B as well).
As per the OP link, Nex claims that what the Rio team released (so far) is just a linear interpolation of weights between the Nex model and the original Qwen model, with no attribution to Nex and zero signs of Rio doing any training of their own.
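To make the interpolation claim concrete, here is a minimal sketch of how such a check could work in principle. This is not Nex's actual analysis, which is not described in this thread. It assumes each model's weights have been flattened into a plain list of floats; `fit_interpolation` is a hypothetical helper name. If C really is p*A + (1-p)*B, a least-squares fit recovers p and leaves a residual near zero.

```python
import math

def fit_interpolation(a, b, c):
    """Least-squares fit of p in c ~= p*a + (1-p)*b.

    Returns (p, relative_residual). Hypothetical helper for illustration;
    a, b, c are equal-length lists of floats (flattened weights).
    """
    d = [x - y for x, y in zip(a, b)]  # A - B
    e = [z - y for z, y in zip(c, b)]  # C - B
    denom = sum(v * v for v in d)
    if denom == 0:
        raise ValueError("a and b are identical; p is undefined")
    # C - B = p * (A - B), so project (C - B) onto (A - B).
    p = sum(u * v for u, v in zip(e, d)) / denom
    resid = math.sqrt(sum((u - p * v) ** 2 for u, v in zip(e, d)))
    return p, resid / math.sqrt(denom)

# Toy data: c_merged is an exact 50/50 blend, c_other is unrelated.
a = [1.0, 2.0, 4.0]
b = [3.0, 6.0, 0.0]
c_merged = [2.0, 4.0, 2.0]
c_other = [10.0, 0.0, 0.0]

p_merged, r_merged = fit_interpolation(a, b, c_merged)
p_other, r_other = fit_interpolation(a, b, c_other)
```

A residual near zero across every tensor would be consistent with a pure merge; any real training on top of the merge would be expected to leave a nonzero residual.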
Comment by 00index 2 days ago
Comment by clear-octopus 2 days ago
Comment by bachmeier 2 days ago
I'd say it's more like someone forking a Linux distro, adding a few themes and fonts, and then complaining when someone else forks their distro and adds another theme.
Comment by dghlsakjg 2 days ago
Comment by bachmeier 2 days ago
I understand how the internet works and how people respond to others in this type of setting, but the comment I replied to did not in any way make the point I was making about the disproportionate nature of relative contributions.
Comment by vasco 2 days ago
You should frame this as a reminder to be more charitable in your positions because sometimes you can be wrong. This subthread ended up being one of the funniest I've read recently.
Comment by idiotsecant 2 days ago
Comment by dghlsakjg 2 days ago
It is.
> I understand how the internet works and how people respond to others in this type of setting, but the comment I replied to did not in any way make the point I was making about the disproportionate nature of relative contributions.
Do you understand?
Jokes aren’t that funny when you have to dig into an explanation on the nuance of why the hidden meaning doesn’t match the surface meaning in exact degree and proportions. That turns a joke into a pedantic comment. And paradoxically muddies the point by explaining it.
We aren’t morons. We understand that Picasso is doing something on a different level than someone feeding bulk scraped JPGs of paintings into a python script. You really don’t have to explain.
Comment by bachmeier 2 days ago
Comment by bwilliams18 2 days ago
Comment by JoshStrobl 2 days ago
Comment by harikb 2 days ago
Comment by idiotsecant 2 days ago
Comment by AlienRobot 2 days ago
>The model is built via a merge of https://huggingface.co/nex-agi/Nex-N2-Pro and https://huggingface.co/Qwen/Qwen3.5-397B-A17B, proceeded by On-Policy Distillation from a stronger model. We detected an incorrect upload in the previous version, where the base merged version was upload instead of the final distilled model. We are sorry for the confusion and apologize profusely.
Incidentally, are people using GitHub issues as blogs now?
Comment by jonchurch_ 2 days ago
It wasn’t framed as an issue, which is the norm breakage I think you’re reacting to (as in, they didn’t ask that the README be updated, etc.), but it is common now for folks to use a project’s issue tracker to name and shame them in a place they can’t easily ignore.
Whether that’s right, prosocial, or professional is up for debate (as well as if any single definition of etiquette can be expected in 2026 on an issue tracker).
But surely you can see the optics reason why someone would take their complaint to the repo directly? It pressures the maintainers to respond, it allows for a pile on from the internet, and makes any decision to lock down a hostile thread into its own kind of statement.
The maintainers should absolutely post an official response and lock the thread though, it will likely get ugly in there.
Comment by ChoosesBarbecue 2 days ago
i.e. this is the maintainer posting on their own GitHub Issues.
Comment by jrm4 2 days ago
-- Bill Gates
Comment by ckcheng 2 days ago
> Bill Gates had somehow manifested, alone, surrounded by ten Apple employees. … Steve started yelling at Bill, asking him why he violated their agreement.
And what’s more interesting is the conclusion:
> Apple filed a monumental copyright lawsuit against Microsoft in 1988, but they eventually lost on a technicality (the judge ruled that Apple inadvertently gave Microsoft a perpetual license to the Mac user interface in November 1985).
Microsoft didn’t steal Apple’s GUI … Apple gave it to them.
Comment by alexgoodhart 2 days ago
Microsoft claimed that its software’s use of various visualizations related to window state was covered by the 1985 agreement, and Apple claimed that this was not true; those window states were produced by Macintosh while Microsoft’s software was being rendered in the Mac environment.
> In his March 20, 1989 Order, Judge Schwarzer declined to consider whether the visual displays in issue were generated by the Microsoft application programs or by the Macintosh system software. The point arose in connection with Microsoft's argument that the 1985 Agreement licensed to Microsoft all visual displays that could possibly be called up by running the five Microsoft application programs on the Macintosh system software then or in the future. 709 F. Supp. at 929. Judge Schwarzer concluded that Microsoft's contention would "defy common sense." Id.
Comment by themafia 2 days ago
That this moment is held up as some great exchange in business is annoying. That our regulatory agencies are perennially asleep at the switch and allow this nonsense to keep happening is extremely frustrating.
Comment by ChrisClark 2 days ago
Comment by Scroll_Swe 2 days ago
Comment by themafia 2 days ago
Comment by Scroll_Swe 20 hours ago
I live in Sweden but I worry about my country due to online freaks like you. Fair?
Comment by wunderlotus 2 days ago
Comment by ckcheng 2 days ago
Comment by jordz 2 days ago
Comment by calebkaiser 2 days ago
But yes, in general, merging refers to techniques that directly blend the weights of different models mathematically. It had a big moment of popularity ~2 years ago, with many so-called "Frankenmodels" popping up on leaderboards.
I tend to think of merging as belonging to the same general umbrella as things like "abliteration", or other techniques that surgically modify the weights of a model without a traditional training/tuning loop. Maxime Labonne is a great person to follow if you're interested in this general area.
Comment by jxmorris12 2 days ago
Model A: A_1, …, A_n
Model B: B_1, …, B_n
C_i = A_i * p + B_i * (1 - p)
In other words, it’s just a linear combination of the other models’ weights, per position.
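As a rough illustration of the formula above, here is a minimal sketch in plain Python. It assumes each model is a dict mapping tensor names to flat lists of floats; `merge_linear` and the toy tensor names are made up for this example and are not any real library's API.

```python
def merge_linear(weights_a, weights_b, p):
    """Blend two models position by position: C_i = p * A_i + (1 - p) * B_i.

    Illustrative only: weights are dicts of name -> flat list of floats.
    """
    if weights_a.keys() != weights_b.keys():
        raise ValueError("models must have the same tensor names")
    merged = {}
    for name, a in weights_a.items():
        b = weights_b[name]
        if len(a) != len(b):
            raise ValueError(f"size mismatch in {name}")
        merged[name] = [p * x + (1 - p) * y for x, y in zip(a, b)]
    return merged

# Toy "models" with identical architecture (same names, same sizes).
model_a = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.0]}
model_b = {"layer0.weight": [3.0, 6.0], "layer0.bias": [4.0]}

# p = 0.25 means 25% of model A and 75% of model B at every position.
model_c = merge_linear(model_a, model_b, p=0.25)
```

The name and size checks are why this kind of merge only makes sense between models that share an architecture, such as two fine-tunes of the same base.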
Comment by joe_the_user 2 days ago
Comment by fkozlowski 2 days ago
Comment by Havoc 2 days ago
Comment by axus 2 days ago
Comment by dormento 2 days ago
Source: am Huelander.
Comment by seba_dos1 2 days ago
Comment by mgambati 2 days ago
Comment by fkozlowski 2 days ago
Comment by matheusmoreira 2 days ago
Still, I'm actually impressed that this even happened at all. "Rio de Janeiro's homegrown LLM" is the last headline I expected to read on HN.
Comment by aaronbrethorst 2 days ago
Comment by ekjhgkejhgk 2 days ago
Comment by root-parent 2 days ago
Comment by vvpan 2 days ago
Comment by root-parent 2 days ago
Comment by carlosjobim 2 days ago
Comment by thimabi 2 days ago
Comment by antonvs 2 days ago
Comment by arcticfox 2 days ago
Comment by reese_john 2 days ago
Comment by thimabi 2 days ago
In an ideal world, Brazil would have a thriving private sector, capable of competing even in the AI sector. Unfortunately, that’s not the case, and I believe that without government action such endeavors won’t really succeed.
Comment by jkwang 2 days ago
Comment by thelonelyborg 2 days ago
Comment by FooBarWidget 2 days ago
Comment by antonvs 2 days ago
I'm not an expert in this area, but it's not too hard to see how a merge like that could turn out ok.
Comment by RandyOrion 2 days ago
Check how the "authors" of "this model" react to this problem [1]. See how they deal with it: first changing their affiliation from https://iplanrio.rio.rj.gov.br to https://iplanrio.prefeitura.rio [2], then saying that they are sorry for being caught [3], then removing all their affiliations once and for all [4].
I think the "authors" of "this model" [5] should be held accountable until they upload new checkpoints, and the performance of the new model is verified by third-parties.
P.S. To people who downvoted me, show me why you're doing this.
[1] https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B/comm...
[2] https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B/comm...
[3] https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B/comm...
[4] https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B/comm...
Comment by delusional 2 days ago
I would like to downvote this please.
Comment by vor_ 2 days ago
Comment by blitzar 2 days ago
Comment by rgbrth 2 days ago
Could be from Rio, could be from any municipality anywhere in the world. The fact that the account is actually from the town hall rather than a personal account also makes it funnier.
Comment by rsynnott 1 day ago
Ah, yes, the Nobel Prize for Fraud.
(I'm seriously kind of amazed they're still publishing those.)
Comment by nicman23 2 days ago
Comment by AnotherGoodName 2 days ago
Comment by wds 2 days ago
Comment by booleandilemma 2 days ago
Comment by _3u10 2 days ago
Comment by nylonstrung 2 days ago
Everything uses Stable Diffusion as the underlying model, and most of the usage is merges of checkpoints
Comment by avereveard 2 days ago
also only works on matching architectures (i.e. finetunes/loras of the same model)
Comment by vor_ 2 days ago
Comment by dindunuf 2 days ago
Comment by yieldcrv 2 days ago
It's a fine-tune of Qwen
Not a conspiracy
Comment by daemonologist 2 days ago
Comment by yieldcrv 2 days ago
Not to me, what would people like to happen? Who are those people? And why do they care?
Comment by antonvs 2 days ago
> why do they care?
Why does anyone ever care about having their time wasted by fraudulent claims?
Comment by yieldcrv 2 days ago
Comment by PixComicOS 2 days ago
Comment by Aurornis 2 days ago
Comment by hottrends 2 days ago
Comment by flowbarai 2 days ago
Comment by jing09928 2 days ago
Comment by antii 2 days ago
Comment by elzbardico 2 days ago
Comment by guiraldelli 2 days ago
I have been involved in academia, including in Brazil, and I don't find academia there any more copycat than any other institution, including top tier ones.
Comment by boca_honey 2 days ago
[1] https://www.sciencedirect.com/science/article/abs/pii/S17511...
[2] https://www.scielo.br/j/aac/a/xNytDrrrHdyK4XPcHBRJZmd/?lang=...
Comment by avdelazeri 2 days ago
Comment by dghlsakjg 2 days ago
What does it have to do with Brazilian academia?
Comment by _3u10 2 days ago
Comment by matheusmoreira 2 days ago
Comment by knuppar 2 days ago
Comment by dghlsakjg 2 days ago
That’s a pretty impressive accomplishment.
If true.
Comment by cassiogo 2 days ago
Comment by stymaar 2 days ago
Comment by knuppar 2 days ago
Comment by diego_moita 2 days ago
Oh, I am so SHOCKED, so SHOCKED! /s
Explaining the joke: in Brazil, Rio de Janeiro is known as "Terra de bandido" (Gangster's Land).
Kinda like Chicago in the 20's or Naples and Palermo in the 90s.
Comment by Scroll_Swe 2 days ago
Comment by vvpan 2 days ago
Comment by Scroll_Swe 1 day ago
Comment by antonvs 2 days ago
Comment by Scroll_Swe 1 day ago
Comment by antonvs 14 hours ago
Comment by MadrasTh0rn 2 days ago
Comment by nom 2 days ago
Comment by diego_moita 2 days ago
The majority of their politicians have ties to organized crime. There is a virtual revolving door between police and crime, where people migrate from one to the other.
It is like Chicago in the 20s, Naples and Medellín in the 80s or Moscow and Culiacán (Sinaloa, Mexico) today.
Comment by dormento 2 days ago
BTW wasn't it a few months ago that the current governor wanted to leave to be able to run as a candidate, so he asked a supreme justice to step in as governor, since there wasn't anyone else that technically could?
Comment by brunoarueira 2 days ago
Comment by alexgoodhart 2 days ago
Comment by sebastianconcpt 2 days ago
Comment by afh1 2 days ago
Comment by pelasaco 2 days ago
Comment by Havoc 2 days ago
Comment by alfiedotwtf 2 days ago
Comment by intoXbox 2 days ago