Lower-cased URLs in Apache Access Log
For example, a file named /Pages/Home.php will be requested as /pages/home.php.
It doesn't happen a lot, but enough to make me wonder…
Has anyone seen this before? Know what causes it?
6 Replies
The log files should contain the user agent and the referer (if any). These would help you identify the offending browser and/or link.
````
xxx.xxx.xxx.xxx - - [05/Dec/2011:08:34:25 -0800] "GET /xxx/login-MySQL.php HTTP/1.1" 404 1210 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/534.52.7 (KHTML, like Gecko) Version/5.1.2 Safari/534.52.7"
xxx.xxx.xxx.xxx - - [05/Dec/2011:10:05:26 -0800] "GET /xxx/teleprompter-richtext2.php HTTP/1.1" 404 2958 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"
xxx.xxx.xxx.xxx - - [05/Dec/2011:10:05:47 -0800] "GET /xxx/teleprompter-richtext2.php HTTP/1.1" 404 2958 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; Trident/4.0; GTB6.3; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729; InfoPath.2; OfficeLiveConnector.1.5;
````
Here are three from today. The user agent strings are all different. In fact, the only similarity I see in the entries is that none of them provide a referrer.
I'm fairly confident it's not my links. I searched my entire code base for "teleprompter-richtext2.php", and there weren't any occurrences of it.
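For anyone who wants to pull these entries out of a larger log rather than eyeball them, here is a rough Python sketch. It assumes the log is in Apache's "combined" format (which the entries above appear to be) and that you can supply a list of the real, correctly-cased paths on the site. The names `COMBINED`, `case_mismatches`, and `known_paths` are made up for this example, not anything Apache provides.

```python
import re

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def case_mismatches(lines, known_paths):
    """Yield (requested, actual, referer, agent) for 404 requests whose
    path differs from a known path only by letter case."""
    # Hypothetical input: known_paths is a list of real paths on the site.
    by_lower = {p.lower(): p for p in known_paths}
    for line in lines:
        m = COMBINED.match(line)
        if not m or m.group("status") != "404":
            continue  # skip unparseable (e.g. truncated) lines and non-404s
        path = m.group("path").split("?", 1)[0]  # ignore any query string
        actual = by_lower.get(path.lower())
        if actual is not None and actual != path:
            yield path, actual, m.group("referer"), m.group("agent")
```

Calling `case_mismatches(open("access.log"), ["/Pages/Home.php"])` (the file name is a placeholder) would then list each wrongly-cased request alongside its referer and user agent, which makes it easier to see whether one browser or one link is responsible.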
@jzimmerlin:
In fact, the only similarity I see in the entries is that none of them provide a referrer.
Sometimes people type URLs into the address bar :)
Otherwise, maybe it's some sort of broken user script?
@jzimmerlin:
Yeah, I don't know what to make of it. The URLs aren't really public, in the sense that they're not posted on the home page or something. They're only for clients. So I don't know how a bot would discover them.
Because your clients are stupid, typical end users, and their computers are massively infected with viruses or other mal/spyware, and every link they visit gets sent somewhere else.
Hit arin.net and find out who owns the IPs of the computers visiting your site. That, the timestamp, and the page they are trying to hit are nearly the only parts of those entries that are not spoofable.
For the most part, don't bother. Those log entries you listed are all 404 errors, so they are not even pages that exist on your site. It's just people who have malware installed on their computers trying to find software that's easily compromised.
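If you do want to check the IP owners without pasting each address into the ARIN site, the lookup can be scripted. This is only a sketch: it assumes ARIN's WHOIS server is reachable at `whois.arin.net` on the standard WHOIS port 43, and that the response contains `NetRange:` and `OrgName:` lines. The function names `arin_whois` and `whois_fields` are invented for this example.

```python
import socket

def arin_whois(ip, server="whois.arin.net", port=43, timeout=10):
    """Send a plain WHOIS query and return the raw text response."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(f"{ip}\r\n".encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

def whois_fields(text, wanted=("NetRange", "OrgName")):
    """Pick out the first 'Key: value' line for each wanted key."""
    # Assumption: the field names match what the WHOIS server returns.
    found = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in wanted:
            found.setdefault(key.strip(), value.strip())
    return found
```

Something like `whois_fields(arin_whois(ip))` for each distinct IP in the log would give the owning organization and address range. Addresses outside ARIN's region will mostly just point you to another registry.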