Temperamental site access.

WelshGas · Wednesday at 2:47 PM

Over the past 7-10 days, accessing the site from my iMac, iPad or Android phone has become increasingly temperamental. I frequently get an error message, “Site cannot be reached - Reload”, or a screen as below.
I’ve done all the usual things (clearing the cache etc., trying different browsers) without change.
Other sites such as BBC, T6 Forum and the T4 Forum do not have this problem.

Has anyone else noticed anything similar?

Dave bee · Wednesday at 2:54 PM

I’ve had exactly the same over the last week or so, most recently yesterday. It didn’t do it today, as I left it on the message when I tried to get in and it was still showing when I opened the page today. Definitely something wrong, I’m sure we won’t be alone!

WelshGas · Wednesday at 2:56 PM

Dave bee said:
I’ve had exactly the same over the last week or so, most recently yesterday. It didn’t do it today, as I left it on the message when I tried to get in and it was still showing when I opened the page today. Definitely something wrong, I’m sure we won’t be alone!

Thank you. I’m glad it’s not just me.

calikev · Wednesday at 3:43 PM

Just spoke to our web guy. His reply:

Yeah, I've been monitoring it mate. More bot attacks, which saturate the resources for a minute before the firewalls catch up. It's not the typical type of attack though, so it's harder to stop. Most of the time it falls over when an attack happens at the same time as the backups run, as there is a lot of processing power being used.
Yeah, it's just a matter of being there when it's at its worst to monitor it, as obviously 99% of the time it's fast as lightning, which doesn't tell us much when it's like that (like now).

All over, which is what makes it harder, as we can't simply block one country or IP range. There is a server update later, so hopefully this will help.

kpttnuts-beach · Wednesday at 3:45 PM

I've had the same thing over the last week or so... I ended up using old bookmarked links to get it to load normally. That was on an iPhone; it's been fine on Mac for me, both using Chrome as the web browser.

Chriskend · Wednesday at 4:14 PM

Me too, have to screen refresh or reload on the iPad, not tried it on the laptop tho.

sidepod · Wednesday at 4:55 PM

Plus one for me. I assumed it was just my outdated iPhone starting to play silly buggers.

Betsycalifornia · Wednesday at 5:43 PM

calikev said:
Just spoke to our web guy. His reply:

Yeah, I've been monitoring it mate. More bot attacks, which saturate the resources for a minute before the firewalls catch up. It's not the typical type of attack though, so it's harder to stop. Most of the time it falls over when an attack happens at the same time as the backups run, as there is a lot of processing power being used.
Yeah, it's just a matter of being there when it's at its worst to monitor it, as obviously 99% of the time it's fast as lightning, which doesn't tell us much when it's like that (like now).

All over, which is what makes it harder, as we can't simply block one country or IP range. There is a server update later, so hopefully this will help.

Just my two penn’orth in case it hasn’t been considered: can the web guy check the most common referrer header string used by the bots and, if it’s unusual or (as is often the case) blank, filter traffic on that?
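To make the suggestion concrete, here is a rough Python sketch of tallying referrer strings from access-log lines. It assumes the server writes the common "combined" log format; the function name, the sample lines and the addresses in them are all made up for illustration, and nothing here reflects how the site is actually monitored.

```python
import re
from collections import Counter

# Assumed layout (combined log format): ... "request" status bytes "referer" "user-agent"
# Adjust the pattern if the real server logs a different format.
LOG_PATTERN = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"$')


def tally_referers(lines):
    """Count how often each Referer value appears; blank and '-' are grouped together."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        referer = match.group("referer")
        counts["(blank)" if referer in ("", "-") else referer] += 1
    return counts


# Hypothetical sample lines using documentation-only IP ranges.
sample = [
    '203.0.113.5 - - [01/Jan/2025:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "bot/1.0"',
    '203.0.113.9 - - [01/Jan/2025:10:00:01 +0000] "GET / HTTP/1.1" 200 512 "" "bot/1.0"',
    '198.51.100.7 - - [01/Jan/2025:10:00:02 +0000] "GET /forum HTTP/1.1" 200 2048 "https://example.org/" "Mozilla/5.0"',
]

for referer, count in tally_referers(sample).most_common():
    print(count, referer)
```

A blank referrer is also sent by plenty of legitimate visitors (bookmarks, some privacy settings), so a tally like this would only be a starting point for a filter, not a rule on its own.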

Chriskend · Wednesday at 8:46 PM

It’s running on Cloudflare... easily sorted, if that is the issue.

Chriskend · Wednesday at 8:46 PM

Seems to be getting worse

cpaharley2008 · Wednesday at 8:47 PM

Yeah I've been getting the same issues on my Android, thought it was just my rubbish internet connection at first! Good to know it's being looked into.

calikev · Wednesday at 9:42 PM

Yeah, it’s more about monitoring all traffic during the 2 or 3 min bursts. I’ve filtered out a lot of ranges and ASNs tonight, as the origin pretends to be an Indian hit, but we know it’s spread across multiple ASNs, as the country code blocks alone don’t treat it as a country-based IP.

Referrer header strings and blank strings are basic detections compared to what’s being monitored.

If it was a consistent hit, it would be easy, but 2 min bursts that saturate the upstream are harder to log when each burst has a different footprint.

Nothing we haven’t seen before though, it just takes a bit of firewall tweaking and blocking.

Ultimately the firewall is already doing its job, we just want to match the pattern instantly rather than the firewall playing catch-up after a few mins.

VW Cali Forum · Thursday at 10:57 PM

Sorry guys, I didn't realise Kev was relaying my quick responses via WhatsApp to the forum while I was trying to monitor the spikes. If I had known, I would have been a bit more informative. Since a few of you may or may not understand what most of it actually means, I'll try to give an overview of the findings.

So, although the site is behind Cloudflare, that only offers slight protection when the attack is being forced via the proxy and the origin server is unknown. When you are faced with relatively small front-facing attacks, UDP floods and port flooding at origin, Cloudflare doesn't help a great deal. Given the scale they were, we knew there was something else at play, as they shouldn't have been making a dent, but we couldn't pinpoint it.

It was so hit and miss, so to speak, perfectly fine one moment and not the next, with load averages sitting in the normal 1-2 range most of the time, that it's not as simple as looking for traffic, blocking, filtering etc. in Cloudflare. We eventually found the compounding factor during one of the spikes.

Bash:

%Cpu(s): 20.2 us, 12.5 sy, 3.0 ni, 19.9 id, 0.0 wa, 0.0 hi, 7.4 si, 45.1 st
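For anyone wanting to read that line programmatically, a minimal Python sketch, assuming the standard `top` summary layout, could pull each figure out by its two-letter label. The `st` field is steal time: time the hypervisor spent running something else while this VM had work waiting. The function name is made up for illustration.

```python
import re


def parse_top_cpu_line(line):
    """Map each two-letter label in top's CPU summary line (us, sy, ..., st) to its percentage."""
    pairs = re.findall(r"([\d.]+)\s+([a-z]{2})", line)
    return {label: float(value) for value, label in pairs}


line = "%Cpu(s): 20.2 us, 12.5 sy, 3.0 ni, 19.9 id, 0.0 wa, 0.0 hi, 7.4 si, 45.1 st"
fields = parse_top_cpu_line(line)
print("steal:", fields["st"], "idle:", fields["id"])
```

On the line above this reports a steal figure of 45.1, far larger than the 20.2 the VM spent on its own user-space work.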

This is what we call noisy neighbours: other VPSs on the node stealing resources and limiting ours, which slows down our processes and soon compounds considerably when you have as many users as we do, plus small attacks and normal resource usage.

The spikes are caused by external hypervisor contention (steal time), amplified by occasional CPU-heavy PHP requests on our side, resulting in load amplification that's disproportionate to actual work being done.
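As a rough illustration of how steal time can be measured on a Linux guest, the sketch below takes two samples of the aggregate `cpu` line from `/proc/stat` and works out what share of the elapsed CPU time was steal. The sample numbers are invented for the example and the function name is hypothetical; this is not the monitoring actually used on the server.

```python
def steal_percent(before, after):
    """Steal share (%) of total CPU time between two aggregate 'cpu' lines from /proc/stat."""

    def ticks(line):
        # Fields after the 'cpu' label: user nice system idle iowait irq softirq steal ...
        values = [int(v) for v in line.split()[1:]]
        # Keep the first eight; guest time is already counted inside user/nice.
        return values[:8]

    deltas = [b - a for a, b in zip(ticks(before), ticks(after))]
    total = sum(deltas)
    return 100.0 * deltas[7] / total if total else 0.0


# Invented samples, as if read from the first line of /proc/stat a few seconds apart.
sample_before = "cpu  1000 50 500 8000 20 0 30 400 0 0"
sample_after = "cpu  1200 60 620 8200 20 0 100 850 0 0"

print(round(steal_percent(sample_before, sample_after), 1))
```

On a real machine the two lines would come from reading `/proc/stat` twice with a short pause in between, since the counters are cumulative since boot.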

This is further confirmed by our monitoring graphs. During the spike at 20:13 today (and multiple times since), CPU usage on the VM dropped to 15% while the load average climbed to 45. Memory usage stayed flat. Network TX briefly dropped to zero.

This pattern is consistent with CPU starvation by the hypervisor: the workload on the server wanted to run but couldn't get scheduled. If it was a localised load spike, CPU usage would most likely increase, possibly to 100%+, not fall.
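That reasoning can be written down as a simple rule of thumb. The Python sketch below is only illustrative: the thresholds, the core count used in the example and the function name are all assumptions, not values from the server.

```python
def classify_spike(cpu_percent, load_average, cpu_cores, busy_cpu=80.0, load_factor=2.0):
    """Very rough triage of a spike from CPU usage and load average.

    busy_cpu and load_factor are arbitrary illustrative thresholds.
    """
    overloaded = load_average > load_factor * cpu_cores
    if not overloaded:
        return "normal"
    if cpu_percent >= busy_cpu:
        # Lots of runnable work and the CPU is actually doing it.
        return "local load spike"
    # Lots of queued work but the CPU looks idle from inside the VM.
    return "possible starvation or blocked tasks"


# Figures from the 20:13 spike, with a core count of 4 assumed for illustration.
print(classify_spike(cpu_percent=15, load_average=45, cpu_cores=4))
```

Linux load average also counts tasks stuck waiting on disk, so low CPU with high load does not prove hypervisor starvation by itself; it is the high steal figure alongside it that points at the node rather than at slow storage.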

I've raised this with the datacenter and am awaiting their response. Typically this will mean the DC offers to move the VM to a new node with less contention. So for the time being we may see the same issue for a day or so, and then we'll monitor further once the VM has been moved to make sure it was actually that.

calikev · Friday at 6:45 AM

That’s exactly what I was going to say :thumb

VW Cali Forum · Friday at 9:54 AM

Just to update, the VPS has now been moved to a new node, so in theory we should see the improvement now. I will continue to monitor this instance through the day and weekend, just in case.

The datacenter are investigating the cause more closely. Typically a noisy neighbour is easy to spot, but in this case it's causing sporadic spikes on the node rather than a constant (and easy to spot) load.

Either way, fingers crossed it should be resolved.

WelshGas · 2026-05-17T22:34:26+0100

100% improvement. Thank you.
