Google de-indexed Bear Blog and I don't know why
Posted by nafnlj 11 hours ago
Comments
Comment by firefoxd 5 hours ago
1. AI Overview: my page impressions were high, my ranking was high, but click-through took a dive. People read the generated text and move along without ever clicking.
2. You are now a spammer. Around August, traffic took a second plunge. In my logs, I noticed these weird queries in my search page. Basically people were searching for crypto and scammy websites on my blog. Odd, but not like they were finding anything. Turns out, their search query was displayed as an h1 on the page and crawled by google. I was basically displaying spam.
I don't have much control over AI Overview because disabling it means I don't appear in search at all. But for the spam, I could do something. I added a robots noindex on the search page. A week later, both impressions and clicks recovered.
Edit: Adding a write-up I did a couple of weeks ago https://idiallo.com/blog/how-i-became-a-spammer
Comment by dazc 4 hours ago
You can avoid this by not caching search pages and applying noindex via the X-Robots-Tag header https://developers.google.com/search/docs/crawling-indexing/...
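A minimal sketch of what that could look like on the server side, assuming you can set response headers per route. The helper name `search_response_headers` is made up for illustration; `X-Robots-Tag` and `Cache-Control` are standard HTTP headers.

```python
def search_response_headers(is_search_page: bool) -> dict[str, str]:
    """Return HTTP response headers for a page (hypothetical helper)."""
    headers = {"Content-Type": "text/html; charset=utf-8"}
    if is_search_page:
        # X-Robots-Tag is the HTTP-header equivalent of <meta name="robots">.
        headers["X-Robots-Tag"] = "noindex"
        # Ask caches not to store per-query result pages.
        headers["Cache-Control"] = "no-store"
    return headers


print(search_response_headers(True))
```

Note that the crawler has to be able to fetch the page to see the header, so the search URL should not also be blocked in robots.txt.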
Comment by motbus3 3 hours ago
They say the data before and after is not comparable anymore, as they are no longer counting certain events below a threshold. You might need to have your own analytics to understand your traffic from now on.
Comment by jgalt212 20 minutes ago
This has been our experience with our content-driven marketing pages in 2025. SERP results constant, but clicks down 90%.
This is not good for our marketing efforts, and terrible for ad-supported public websites, but I also don't understand how Google is not terribly impacted by the zero-click Internet. If content clicks are down 90%, aren't ad clicks down by a similar number?
Comment by bootsmann 4 hours ago
Comment by firefoxd 4 hours ago
example.com/search?q=text+scam.com+text
On my website, I'll display "text scam.com text - search result". Now Google will see that link in my h1 tag and page title and say I am probably promoting scams. Also, the reason this appeared suddenly is that I added support for Unicode in search. Before that, the page would fail if you added Unicode. So the moment I fixed it, I allowed spammers to have their links displayed on my page.
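One way to avoid echoing spam links, sketched under assumptions: drop anything that looks like a domain or URL from the query before it goes into the h1 or title, and HTML-escape the rest. The function name `search_heading` and the regex heuristic are illustrative only, not what the site actually does, and the heuristic is not exhaustive.

```python
import html
import re

# Assumed heuristic for "looks like a domain or URL"; it will miss some cases.
DOMAIN_LIKE = re.compile(
    r"(?:https?://)?(?:[\w-]+\.)+[a-z]{2,}(?:[/?#]\S*)?", re.IGNORECASE
)


def search_heading(query: str) -> str:
    """Build the heading text for a search results page (hypothetical helper)."""
    # Keep only whitespace-separated tokens that do not look like a domain or URL.
    kept = [tok for tok in query.split() if not DOMAIN_LIKE.fullmatch(tok)]
    cleaned = " ".join(kept)
    if not cleaned:
        return "Search results"
    # Escape so the query can never inject markup into the page.
    return html.escape(f"{cleaned} - search result")


print(search_heading("text scam.com text"))
```

Combined with a noindex on the search page, this leaves spammers nothing to gain from linking to crafted search URLs.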
Comment by Calavar 4 hours ago
[1] https://cyberinsider.com/threat-actors-inject-fake-support-n...
Comment by francisofascii 23 minutes ago
Comment by Neil44 3 hours ago
Since these are very low quality results surely one of Google's 10000 engineers can tweak this away.
Comment by input_sh 2 hours ago
That's trivially easy. Imagine a spammer creating some random page which links to your website with that made up query parameter. Once Google indexes their page and sees the link to your page, Google's search console complains to you as the victim that this page doesn't exist. You as in the victim have no insight into where Google even found that non-existent path.
> Since these are very low quality results surely one of Google's 10000 engineers can tweak this away.
You're assuming there's still people at Google who are tasked with improving actual search results and not just the AI overview at the top. I have my doubts Google still has such people.
Comment by layer8 1 hour ago
Comment by jdiff 58 minutes ago
Comment by indymike 1 hour ago
Comment by chii 4 hours ago
Google got smart and found out such exploits, and penalized sites that do this.
Comment by donatj 1 hour ago
I used to work for an SEO firm, I have a decent idea of best practices for this sort of thing.
BAM, I went from thousands of indexed pages to about 100
See screenshot:
https://x.com/donatj/status/1937600287826460852
It's been six months and never recovered. If I were a business I would be absolutely furious. As it stands this is a tool I largely built for myself so I'm not too bothered but I don't know what's going on with Google being so fickle.
Updated screenshots;
Comment by motbus3 12 minutes ago
Comment by dmboyd 47 minutes ago
Comment by AznHisoka 44 minutes ago
Comment by FuturisticLover 6 hours ago
Just the open is similar, but the intent is totally different, and so is the focus keyword.
Not facing this issue in Bing and other search engines.
Comment by daemonologist 6 hours ago
Some popular models on Hugging Face never appear in the results, but the sub-pages (discussion, files, quants, etc.) do.
Some Reddit pages show up only in their auto-translated form, and in a language Google has no reason to think I speak. (Maybe there's some deduplication to keep machine translations out of the results, but it's misfiring and discarding the original instead?)
Comment by kace91 43 minutes ago
It’s also clearly confusing users, as you get replies in a random language, obviously made by people who read an auto translation and thought they were continuing the conversation in their native language.
Comment by sischoel 2 hours ago
I think at least for Google there are some browser extensions that can remove these results.
Comment by black_puppydog 2 hours ago
Comment by Aldipower 2 hours ago
Comment by adaptbrian 2 hours ago
This is what has caused the degradation of search quality since then.
Comment by Iulioh 35 minutes ago
Comment by dev_l1x_be 4 hours ago
Comment by bjt12345 7 hours ago
However, if they do it for the statutory term, they can then successfully apply for existing-use rights.
Yet I've seen expert witnesses bring up Google pins on Maps during tribunal over planning permits and the tribunal sort of acts as if it's all legit.
I've even seen the tribunal's report publish screenshots from Google Maps as part of their judgement.
Comment by deltoidmaximus 35 minutes ago
Comment by rcxdude 3 hours ago
Comment by oakwhiz 5 hours ago
Comment by actionfromafar 3 hours ago
Comment by 01HNNWZ0MV43FF 3 hours ago
Comment by hyruo 5 hours ago
Comment by scosman 2 hours ago
I have a page that ranks well worldwide, but is completely missing in Canada. Not just poorly ranked, gone. It shows up #1 for keyword in the US, but won't show up with precise unique quotes in Canada.
Comment by graeme 7 hours ago
The amount of spam has increased enormously and I have no doubt there are a number of such anti-spam flags and a number of false positive casualties along the way.
Comment by Eisenstein 5 hours ago
Comment by quietfox 4 hours ago
Comment by xeonmc 4 hours ago
Comment by Bengalilol 4 hours ago
Comment by binarymax 2 hours ago
Comment by saint_yossarian 39 minutes ago
Comment by p0w3n3d 7 hours ago
Comment by cosmicgadget 7 hours ago
https://gehrcke.de/2023/09/google-changes-recently-i-see-mor...
The wrong RSS thing may have just tipped the scales over to Google not caring.
Comment by cyberrock 6 hours ago
That's not to say I don't have gripes with how Google Maps works, but I just don't know why the other factors were not considered.
Comment by leoedin 5 hours ago
I just checked a few local restaurants to me in London that opened in the last few years, and the ratio of reviews is about 16:1 for google maps. It looks like stuff that’s been around longer has a much better ratio towards trip advisor though.
Almost certainly Instagram/tiktok are though. I know a few places which have been ruined by becoming TikTok tourist hotspots.
Comment by dazc 4 hours ago
Counterpoint: I have met people in the UK whose lives revolve around doing nothing but.
Comment by paganel 4 hours ago
Comment by Aldipower 2 hours ago
Comment by nottorp 6 hours ago
Comment by DeathArrow 3 hours ago
Comment by dazc 4 hours ago
Comment by p410n3 1 hour ago
Comment by cabirum 4 hours ago
Comment by mariusor 4 hours ago
Comment by motbus3 3 hours ago
We have a consultant for the topic, but I am not sure how much of that conversation I can share publicly, so I will refrain from doing so.
But I think I can say that it is not only about data structure or quality. The changes in methodology applied by Google in September might be playing a stronger role than people initially thought.
Comment by p410n3 1 hour ago
Comment by huksley 6 hours ago
The primary domain cannot be found via search. Bing knows about the brand, LinkedIn, and YouTube channel, but refuses to show search results for the primary domain.
Bing's search console does not give any clue, and forcing a reindex does not help. Google search works fine.
Comment by econ 2 hours ago
https://search.yahoo.com/search?p=blog.james-zhan.com&fr=yfp...
Comment by guerrilla 2 hours ago
Comment by watwut 1 hour ago
Even when I knew the exact name of the article I was looking for, Google was unable to find it. And yes, it still existed.
Comment by sethops1 27 minutes ago
Comment by Havoc 4 hours ago
Comment by p410n3 4 hours ago
But basically what happened: in August 2025 we finished the first working version of our shop. I wanted to accelerate indexing after some weeks because only ~50 of our pages were indexed, so I submitted the sitemap, and everything got de-indexed within days. For the longest time I thought it was content quality, because we sell niche trading cards and the descriptions are all one-liners I made in Excel ("This is $cardname from $set for your collection or deck!"). And because they're single trading cards, we have 7000+ products that are very similar. (We did do all the product images ourselves. I thought Google would like this, but alas.)
But later we added binders and whole sets, and took a lot of care with their product data. The front page also got a massive overhaul. No shot. Not one page in the index. We still get traffic from marketplaces and our older non-shop site. The shop itself lives on a subdomain (shop.myoldsite.com). The normal site also has a sitemap, but that one was submitted in 2022. I later rewrote how my sitemaps were generated and deleted the old ones in Search Console, hoping this would help. It did not. (The old sitemap was generated by the shop system and was very large. Some forums mentioned that it's better to create a chunked sitemap, so I made a script that creates lists with 1000 products at a time as well as an index for them.)
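For readers who have not seen a chunked sitemap, here is a rough sketch of the idea, not the actual script: split the product URLs into files of 1000 and list those files in a sitemap index. The function name, file names and `shop.example.com` are placeholders; the XML element names and namespace come from the sitemaps.org protocol.

```python
from xml.sax.saxutils import escape

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_DECL = '<?xml version="1.0" encoding="UTF-8"?>'


def build_sitemaps(urls: list[str], base_url: str, chunk_size: int = 1000) -> dict[str, str]:
    """Return {filename: xml} for the chunked sitemaps plus one index file."""
    files: dict[str, str] = {}
    for start in range(0, len(urls), chunk_size):
        name = f"sitemap-{start // chunk_size + 1}.xml"
        entries = "".join(
            f"<url><loc>{escape(u)}</loc></url>"
            for u in urls[start:start + chunk_size]
        )
        files[name] = f'{XML_DECL}<urlset xmlns="{NS}">{entries}</urlset>'
    # The index lists every chunk file by its absolute URL.
    root = base_url.rstrip("/")
    index_entries = "".join(
        f"<sitemap><loc>{escape(root + '/' + name)}</loc></sitemap>"
        for name in files
    )
    files["sitemap-index.xml"] = (
        f'{XML_DECL}<sitemapindex xmlns="{NS}">{index_entries}</sitemapindex>'
    )
    return files


# Example with placeholder URLs: 2500 products -> 3 chunk files + 1 index.
demo = build_sitemaps(
    [f"https://shop.example.com/p/{i}" for i in range(2500)],
    "https://shop.example.com",
)
print(sorted(demo))
```

The protocol allows up to 50,000 URLs per file, so chunks of 1000 are a choice rather than a requirement.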
Later observations are:
- Both sitemaps I deleted in GSC are still getting crawled and are STILL THERE. You can't see them in the overview, but if you have the old links they still appear as normal.
- We eventually started submitting product data to Google Merchant Center as well. It works 100% fine and our products are getting found and bought. The clicks even still show up in Search Console!!!! So I have a shop with 0 indexed pages in GSC that gets clicks every day. WTHeck?
So like... I don't even know anymore. Maybe we also have to restart like the person in the blog did, move the shop to a new domain, and NEVER give Google a sitemap. If I really go that route I will probably delete the cronjob that creates the sitemap, in case Google finds it by itself. But also, like, what the heck? I worked in a web agency for 5 years and created a new webpage about every 2-8 weeks, so I launched roughly 50-70 webpages and shops, and I NEVER saw that happen. Is it an AI hallucinating? Is it anti-spam gone too far? Is it a straight-up bug that they don't see? Who knows. I don't.
(Good article though and I hope maybe some other people chime in and googlers browsing HN see this stuff).
Comment by xnx 2 hours ago
Comment by g947o 2 hours ago
Author's fault, Google's fault, someone else's fault.
From the post, while it is hard to completely rule out the possibility that the author did something wrong, they likely did everything they could to remove the suspicion. I assume they consulted all documentation or other resources.
Someone else's fault? It is unlikely, since there isn't (obviously) another party involved here.
Which leaves us with Google's fault.
Also, I mean, if a user can't figure out what's wrong, the blame should just go to the vendor by default for poor user experience and documentation.
Comment by nmeofthestate 4 hours ago
Comment by DeathArrow 3 hours ago
Comment by inglor_cz 3 hours ago
Together with deleting my Facebook and Twitter accounts, this removed a lot of pressure to conform to their unclear policies. Especially around 2019-21, it was completely unclear how to escape their digital guillotine which seemed to hit various people randomly.
The deliverability problem still stands, though. You cannot be completely independent nowadays. Fortunately my domain is 9 years old.
Comment by digitalgravix 6 hours ago
Comment by throwaway984393 7 hours ago
Comment by echelon 7 hours ago
No more Google. No more websites. A distributed swarm of ephemeral signed posts. Shared, rebroadcasted.
When you find someone like James and you like them, you follow them. Your local algorithm then prioritizes finding new content from them. You bookmark their author signature.
Like RSS but better. Fully distributed.
Your own local interest graph, but also the power of your peers' interest graphs.
Content is ephemeral but can also live forever if any nodes keep rebroadcasting it. Every post has a unique ID, so you can search for it later in the swarm or some persistent index utility.
The Internet should have become fully p2p. That would have been magical. But platforms stole the limelight just as the majority of the rest of the world got online.
If we nerds had but a few more years...
Comment by nottorp 6 hours ago
Isn't what you're describing something like mastodon or usenet?
Comment by p0w3n3d 6 hours ago
On the other side of the same coin, there are already governments that will make you legally responsible for what your page's visitors write in comments. This renders any p2p internet legally unbearable (i.e. someone goes to your page, posts some bad word and you get jailed). So far they say "it's only for big companies" but it's a lie, just boiling frogs.
Comment by vladms 5 hours ago
"cannot do anything" is relative. Google did something about it (at least for the first 10-15 years), but I am sure that was not their primary intention, nor were they sure it would work. So "we have no clue what will work to reduce it" is more appropriate.
Now I think everybody has tools to build stuff easier (you could not make a television or a newspaper 50 years ago). That is just an observation of possibility, not a guarantee of success.
Comment by baq 6 hours ago
Comment by doganugurlu 4 hours ago
Most efficient = cheaper. A lot of times cheaper sacrifices quality, and sometimes safety.
Comment by AnthonyMouse 3 hours ago
How do you think Google or Cloudflare actually work? One big server in San Francisco that runs the whole world, or lots of servers distributed all over?
Comment by baq 2 hours ago
Why do you think they're a monopoly in the first place? Obviously because they were more efficient than the competition and network effects took care of the rest. Having to make choices is a cost for the consumer - IOW consumers are lazy - so winners have staying power, too. It's a perfect storm for a winner-takes-all centralization since a good centralized service is the most efficient utility-wise ('I know I'm getting what I need') and decision-cost-wise ('I don't need to search for alternatives') for consumers until it switches to rent seeking, which is where the anti-monopoly laws should kick in.
Comment by AnthonyMouse 2 hours ago
In other words, open source decentralized systems are the most efficient because you don't have to reduplicate a competitor's effort when you can just use the same code.
> Obviously because they were more efficient than the competition and network effects took care of the rest.
In most cases it's just the network effect, and whether it was a proprietary or open system in any given case is no more than the historical accident of which one happened to gain traction first.
> Having to make choices is a cost for the consumer
If you want an email address you can choose between a few huge providers and a thousand smaller ones, but that doesn't seem to prevent anyone from using it.
> until it switches to rent seeking
If it wasn't an open system from the beginning then that was always the end state and there is no point in waiting for someone to lock the door before trying to remove yourself from the cage.
Comment by baq 2 hours ago
This is the great lie. Approximately zero end consumers care about code, the product they consume is the service, and if the marginal cost of switching the service provider is zero, it's enough to be 1% better to take 99% of the market.
Comment by 0xbadcafebee 7 hours ago
You know what else we need? We need food to be free. We need medicine to be free, especially medicines which end epidemics and transmissible disease. We need education to be free. We need to end homelessness. We need to end pollution. We need to end nationalism, racism, xenophobia, sexism. We need freedom of speech, religion, print, association. We need to end war.
There are a lot of things we as a society need. But we can't even make "p2p internet" work, and we already have it. (And please just forget the word 'distributed', because it's misleading you into thinking it's a transformative idea, when it's not)
Comment by vladms 4 hours ago
I would settle for simpler, attainable things. Equal opportunity for next generation. Quality education for everybody. Focus on merit not other characteristics. Personal freedom if it does not infringe on the freedom of people around you (ex: there can't be such thing as a "freedom to pollute").
In my view Internet as p2p worked pretty well to improve the previous status quo in many areas (not all). But there will never be a "stable solution", life and humans are dynamic. We do have some good and free stuff on the Internet today because of the groundwork laid out 30 years ago by the open source movement. Any plan started today will have noticeable effect in many years. So "we can't even make" sounds more of an excuse to not start, rather than an honest take.
Comment by azangru 1 minute ago
What does this mean? I suppose it can't literally mean equal opportunity, because people aren't equal, and their circumstances aren't equal; but then, what does this mean?
Comment by sam_goody 3 hours ago
Every family should be provided with a UBI that covers food and rent (not in the city). That is a more attainable goal and would solve the same problems (better, in fact).
(Not saying that UBI is a panacea, but I've lived in countries that have experimented with such and it seems the best of the alternatives)
Comment by bsder 6 hours ago
YouTube should get split out and then broken up. Google Search should get split out and broken up. etc.
This is not a problem you solve with code. This is a problem you solve with law.
Comment by AnthonyMouse 2 hours ago
When the DMCA was a bill, people were saying that the anti-circumvention provision was going to be used to monopolize playback devices. They were ignored, it was passed, and now it's being used to monopolize not just playback devices but also phones.
Here's the test for "can you rely on the government here": Have they repealed it yet? The answer is still no, so how can you expect them to do something about it when they're still actively making it worse?
Now try to imagine the world where the Free Software Foundation never existed, Berkeley never released the source code to BSD and Netscape was bought by Oracle instead of being forked into Firefox. As if the code doesn't matter.
Comment by emsign 5 hours ago
Comment by Terr_ 2 hours ago
Comment by ErroneousBosh 3 hours ago
From what you've described, you've just re-invented webrings.
Comment by qwertox 2 hours ago
Request URL: https://journal.james-zhan.com/google-de-indexed-my-entire-b...
Request Method: GET
Status Code: 304 Not Modified
So maybe it's the status code? Shouldn't that page return a 200 ok?
When I go to blog.james..., I first get a 301 Moved Permanently, and then journal.james... loads, but it returns a 304 Not Modified, even if I then reload the page.
Only when I fully submit the URL again in the URL bar does it respond with a 200.
Maybe crawling also returns a 304, and Google won't index that?
Maybe prompt: "why would a 301 redirect lead to a 304 not modified instead of a 200 ok?", "would this 'break' Google's crawler?"
> When Google's crawler follows the 301 to the new URL and receives a 304, it gets no content body. The 304 response basically says "use what you cached"—but the crawler's cache might be empty or stale for that specific URL location, leaving Google with nothing to index.
Comment by jorams 2 hours ago
Your LLM prompt and response are worthless.
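A simplified model of standard HTTP conditional requests shows when a 304 can occur at all. The function name `respond` and the ETag values are made up; real servers also handle `If-Modified-Since` and other details that this sketch ignores.

```python
def respond(request_headers: dict[str, str], current_etag: str) -> int:
    """Status a server would send for a GET, in a simplified model."""
    # A 304 only answers a conditional request whose validator still matches.
    if request_headers.get("If-None-Match") == current_etag:
        return 304
    # No validator, or a stale one: send the full body.
    return 200


# A browser that already has the page cached revalidates and gets a 304.
print(respond({"If-None-Match": '"v1"'}, '"v1"'))
# A client with nothing cached sends no validator and gets a 200.
print(respond({}, '"v1"'))
```

In this model, a client that has never fetched the page cannot receive a 304, because it has no validator to send.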
Comment by qwertox 1 hour ago
Request URL: https://news.ycombinator.com/item?id=46196076
Request Method: GET
Status Code: 200 OK (from disk cache)
I just thought that it would be worthwhile investigating in that direction.
Comment by jorams 1 hour ago