no clues in access log / grab images

Hi All,

I suspect that images from my web pages are being grabbed in some irregular way for post-processing and use on other pages.

In the access log, I see regular requests for those images, but I cannot see the method (log below). I wonder what system they may be using.

I suspect a former client, so I will just invoice them, and if that does not work I will ban their IP. Any legal advice in case they refuse payment?

89.31.97.111 - - [17/Apr/2010:10:17:01 +0000] "GET /snow/figures/snowmap72h.png HTTP/1.0" 200 492177 "-" "-"

89.31.97.111 - - [17/Apr/2010:10:18:02 +0000] "HEAD /snow/figures/snowmap48h.png HTTP/1.1" 200 - "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"

89.31.97.111 - - [17/Apr/2010:10:18:02 +0000] "GET /snow/figures/snowmap48h.png HTTP/1.0" 200 491843 "-" "-"

89.31.97.111 - - [18/Apr/2010:06:10:02 +0000] "HEAD /snow/figures/snowmap72h.png HTTP/1.1" 200 - "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"

89.31.97.111 - - [18/Apr/2010:06:10:03 +0000] "GET /snow/figures/snowmap72h.png HTTP/1.0" 200 496274 "-" "-"

89.31.97.111 - - [18/Apr/2010:06:16:02 +0000] "HEAD /snow/figures/snowmap48h.png HTTP/1.1" 200 - "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"

89.31.97.111 - - [18/Apr/2010:06:16:02 +0000] "GET /snow/figures/snowmap48h.png HTTP/1.0" 200 484551 "-" "-"

89.31.97.111 - - [18/Apr/2010:06:19:02 +0000] "HEAD /snow/figures/snowmap24h.png HTTP/1.1" 200 - "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"

89.31.97.111 - - [18/Apr/2010:06:19:02 +0000] "GET /snow/figures/snowmap24h.png HTTP/1.0" 200 491152 "-" "-"

89.31.97.111 - - [18/Apr/2010:10:11:02 +0000] "HEAD /snow/figures/snowmap24h.png HTTP/1.1" 200 - "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"

89.31.97.111 - - [18/Apr/2010:10:11:02 +0000] "GET /snow/figures/snowmap24h.png HTTP/1.0" 200 491152 "-" "-"

89.31.97.111 - - [18/Apr/2010:10:14:02 +0000] "HEAD /snow/figures/snowmap48h.png HTTP/1.1" 200 - "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"

89.31.97.111 - - [18/Apr/2010:10:14:07 +0000] "GET /snow/figures/snowmap48h.png HTTP/1.0" 200 484551 "-" "-"

89.31.97.111 - - [18/Apr/2010:10:17:01 +0000] "HEAD /snow/figures/snowmap72h.png HTTP/1.1" 200 - "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"

89.31.97.111 - - [18/Apr/2010:10:17:01 +0000] "GET /snow/figures/snowmap72h.png HTTP/1.0" 200 496274 "-" "-"

89.31.97.111 - - [18/Apr/2010:10:18:01 +0000] "HEAD /snow/figures/snowmap48h.png HTTP/1.1" 200 - "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"

89.31.97.111 - - [18/Apr/2010:10:18:01 +0000] "GET /snow/figures/snowmap48h.png HTTP/1.0" 200 484551 "-" "-"

The images are daily snow maps that I am producing from meteoexploration for the skiing tourism industry.

Thanks a lot for any clues.

9 Replies

Soliciting legal advice from a VPS Community forum - yeah, that's a great idea.

You have NO evidence yet you're willing to invoice a client and then take them to court if they don't pay.

I'm guessing you used to work for the RIAA or MPAA - right?

You're going to have good fun trying to convince your client to pay and you'd need considerably more information than that to have a chance in court.

Just set up rewrite rules to block remote image access, also known as hot-linking.

vonskippy, obs

Thank you for your effort in replying to my query.

The question about legal advice is just a little addition at the end, if that hurt your sensibilities I am quite happy to remove it. That is the reason I didn't elaborate on the additional evidence I have, which is quite substantial.

My main concern is to know which system they are using to grab the images so that I can ban their IPs. Since the access log shows a "GET" for the correct images but no method, I was wondering how they grab them.

Cheers

P.S.
@obs:

> Just set up rewrite rules to block remote image access, also known as hot-linking.

It is not hot-linking: they manipulate the image and mask the copyright notice with their own logo. What I was charging them was only a fraction of the cost of producing the maps, including server costs, which makes the whole thing rather miserable on their part.

@patagon:

> My main concern is to know which system are they using to grab the images so that I can ban their IPs. Since the access log shows a "GET" to the correct images but no method I was wondering how they grab it.

"GET" is the method, unless you mean that term in some sense other than the HTTP one.

If you are referring to the fact that you don't see any agent information or other data such as the referrer, that just means the client issuing the request didn't bother to include those headers. That's pretty trivial to do with most tools (wget, curl, etc.) or libraries, and such headers aren't officially required.

– David
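To make that concrete, here is a minimal Python sketch that picks out the requests logged with "-" for both the referrer and the agent and counts them per client IP. It assumes the log is in Apache's "combined" format, as the lines quoted above appear to be; the names `parse_line` and `headerless_requests` are made up for this illustration.

```python
import re
from collections import Counter

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None

def headerless_requests(lines):
    """Count, per client IP, requests logged without Referer and User-Agent."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["referer"] in ("", "-") and entry["agent"] in ("", "-"):
            counts[entry["ip"]] += 1
    return counts

# Two lines taken from the log in the original post.
sample = [
    '89.31.97.111 - - [18/Apr/2010:06:10:02 +0000] "HEAD /snow/figures/snowmap72h.png HTTP/1.1" 200 - "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"',
    '89.31.97.111 - - [18/Apr/2010:06:10:03 +0000] "GET /snow/figures/snowmap72h.png HTTP/1.0" 200 496274 "-" "-"',
]
print(headerless_requests(sample))  # Counter({'89.31.97.111': 1})
```

A missing agent is only a hint, not proof of anything: ordinary visitors behind privacy tools or proxies can produce the same log entries.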

89.31.97.111

That's the IP it's coming from; it's in the Netherlands.

Put this in as a rewrite rule:

RewriteEngine On

RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain\.com/.*$ [NC]

RewriteRule \.(gif|jpg|png)$ - [F]

It'll block every request for a gif, jpg or png file whose referrer is not your site, including direct requests that send no referrer at all.
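As a rough way to see which requests such a rule would refuse, the following Python sketch mimics its logic outside Apache. It is only an approximation of mod_rewrite's behaviour, written with the dots in the patterns escaped; `yourdomain.com` is the placeholder from the rule, and `is_forbidden` and `someblog.example` are names invented for this example.

```python
import re

# Referer pattern from the RewriteCond; re.IGNORECASE stands in for [NC].
REFERER_OK = re.compile(r"^http://(www\.)?yourdomain\.com/.*$", re.IGNORECASE)
# Path pattern from the RewriteRule (case-sensitive, as the rule has no [NC]).
IMAGE_PATH = re.compile(r"\.(gif|jpg|png)$")

def is_forbidden(path, referer=""):
    """Approximate the rule: refuse image paths unless the Referer is our own site."""
    if not IMAGE_PATH.search(path):
        return False  # the RewriteRule pattern does not match, so nothing happens
    return REFERER_OK.search(referer) is None  # the negated RewriteCond

png = "/snow/figures/snowmap72h.png"
print(is_forbidden(png, "http://www.yourdomain.com/snow/"))  # False: own page
print(is_forbidden(png, "http://someblog.example/post"))     # True: third-party page
print(is_forbidden(png))                                     # True: no Referer sent
print(is_forbidden("/snow/index.html"))                      # False: not an image
```

The second and third cases are the ones to watch: pages on other sites that embed the image and scripted downloads that send no referrer are refused alike.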

Thanks dbl3

Silly me, I was assuming that curl and wget would leave a bigger "fingerprint".

Things are clearer now.

@obs:

> RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain\.com/.*$ [NC]
>
> RewriteRule \.(gif|jpg|png)$ - [F]

Thanks obs. I am happy to allow people to use up to three of my forecasts on their web pages, blogs, etc. Many are doing so, and that rule would block all of them. The IP and the corresponding whois info are another bit of evidence (they are not very clever).

@patagon:

> Silly me, I was assuming that curl and wget would leave a bigger "fingerprint"

They typically will by default (e.g., wget's default agent string is Wget/ followed by its version), but changing the agent is just a command-line option away. E.g., using -U "" with wget will stop the inclusion of the agent header.

– David
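The same holds for libraries. As a minimal sketch, assuming nothing about what the client in the log actually runs, the Python below starts a throwaway local server, fetches a path from it with the standard `http.client` module, and reports which headers arrived. `RecordingHandler` and `fetch_without_agent` are names invented for this demo, and the path is borrowed from the log purely for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_headers = []  # header sets recorded by the throwaway local server

class RecordingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Remember what the client sent, then answer with a tiny body.
        seen_headers.append({name.lower(): value for name, value in self.headers.items()})
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

def fetch_without_agent():
    """Issue a GET that carries neither a User-Agent nor a Referer header."""
    server = HTTPServer(("127.0.0.1", 0), RecordingHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        # http.client only adds Host and Accept-Encoding on its own.
        conn.request("GET", "/snow/figures/snowmap72h.png")
        response = conn.getresponse()
        response.read()
        conn.close()
        return response.status, seen_headers[-1]
    finally:
        server.shutdown()
        server.server_close()

status, headers = fetch_without_agent()
print(status, "user-agent" in headers, "referer" in headers)  # 200 False False
```

A request like this would show up in a combined-format access log with "-" in both the referrer and agent fields.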

Or you could just put a digital watermark on your photos.

Put a small "warning caption" on each photo (e.g. "This photo contains a digital watermark with my copyright info") that links to a page detailing what a digital watermark is, what copyright is, and what you intend to do to people who steal your work.
