Following my own advice in yesterday’s post, I was checking my backlinks this morning and found multiple links from a domain I didn’t recognize. Ahrefs.com let me know the exact link to my site and anchor text used, and the backlink turned out to be an image (Elder Scrolls Online logo). Now in this case, they were linking to my hosted image but the image actually displayed was a copy uploaded to their servers—hat tip to them.
Not everyone is that nice. Forums and other blogs especially will often display the image straight from your post. This practice is sometimes called “leeching” or “piggy-backing”, but it’s most often known as “hotlinking”, and it’s stealing, even if unintentional. When you deploy a site to the Internet, you’re not just paying for the server space but also for the traffic to that space. Linking content straight from someone’s site to display on another site means you’re making person #1 pay for that traffic even though they never get any credit, exposure, ad revenue, etc. All of that goes to site #2. It’s like copying an individual’s video from YouTube and uploading it as your own content. It’s a dick move, but you’re not powerless against this.
Every time a browser fetches something from a website, it sends the server an HTTP request. For example, when you load Aggro-Range and your computer asks for the logo of my site, it sends a request to my server saying “This is what browser I’m using, this is today’s date, here is the URL of the page where I want to show the image…” along with a whole slew of other information describing the request for the image I’m hosting.
Most personal sites have little need to work with all that information, but in this case I’m going to take advantage of the “Referer” field of the request and tell my server, “If the request doesn’t come from aggro-range.com, I don’t want you to send the image.” Furthermore, I’m going to rewrite the request so an entirely different image is sent, because if someone doesn’t know why they’re being blocked when hotlinking, they’ll never learn how to make better choices.
I use an open-source web server called Nginx (pronounced “engine-ex”) for Aggro-Range. To change what gets sent when sites other than my own request my images, I dive into my server to find the file where I control my virtual hosts, which is kind of like a roadmap that directs different requests and responses. I’m adding a block to the server section that says “Hey, if a file of this type is requested, make sure the request is coming from my domain or a subdomain of my site. If the request doesn’t match, give an error that breaks the link.”
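As a rough sketch of that kind of block (not necessarily the exact config running on Aggro-Range), Nginx’s referer module can do the check. The list of image extensions and the `none` and `blocked` entries are assumptions here: they also let through requests with no Referer header or one stripped by a proxy, which is a common choice but an optional one.

```nginx
# Goes inside the server { } section of the virtual host file.
# The extension list is an assumption; adjust it to the file types you host.
location ~* \.(gif|jpe?g|png)$ {
    # Referers that are allowed to load images:
    #   none    - the request has no Referer header at all (assumed OK)
    #   blocked - the Referer was stripped by a proxy or firewall (assumed OK)
    #   then the site's own domain and any subdomain of it
    valid_referers none blocked aggro-range.com *.aggro-range.com;

    # $invalid_referer is "1" when the Referer matches nothing above.
    if ($invalid_referer) {
        return 403;  # refuse the image, which breaks the hotlink
    }
}
```

After editing, `nginx -t` checks the config for syntax errors before you reload the server.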
Hopping over to a site that lets me check hotlinks, I see a broken image icon instead of my site logo. Almost perfect!
For my last step, I want to add an image that will be sent instead of the broken image error.
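Here is a minimal sketch of how that swap could look, assuming an internal rewrite and a made-up path, `/images/hotlink.png`, for the replacement image. As before, the extension list and the `none` and `blocked` entries are assumptions, and this sketch does not line up with the line numbers in my own file.

```nginx
# Exempt the replacement image itself, so a request for it is never
# checked against the referer rule below.
# /images/hotlink.png is a hypothetical path used for illustration.
location = /images/hotlink.png {
    # No referer check here: any site may load this image.
}

location ~* \.(gif|jpe?g|png)$ {
    valid_referers none blocked aggro-range.com *.aggro-range.com;

    if ($invalid_referer) {
        # Swap the requested image for the replacement instead of
        # returning a 403.
        rewrite ^ /images/hotlink.png last;
    }
}
```

The exact-match `location =` block wins over the regex location, so the rewritten request lands there and is served normally instead of being rewritten again.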
Lines 68-70 are very important: they allow linking to the hotlink image I created. Otherwise, requests for that image get caught in an endless redirect loop. Most browsers know to break out of one, BUT endless requests hitting my web server from multiple sources are exactly what I’m trying to avoid here. Anyway, instead of returning the 403 error, I tell the server to change the request to my hotlink image, and that replacement image is what shows up when testing.
I’m anticipating some bugs. For example, Firefox likes to add custom request headers at times, which can make the web server think the request is not legitimate. And like any digital work, there’s pretty much guaranteed to be a workaround if someone puts their mind to it. Still, a lot of web development is anticipating issues to the best of my ability, testing against those issues to the best of my ability, and responding to issues as needed… to the best of my ability.
If you’re using IIS or Apache for your web server, or want to see an example of pretty nefarious content stealing (especially when the author is a Creative Commons and open-source advocate, wtf), try Scott Hanselman’s post on the subject.
(Hat tip to MegaMortFan on Funnyjunk.com for header image)