Good! Wordprexy has removed the ads and readers in Turkey can see WordPress blogs…

More power to them for breaching the “Great Firewall of Turkey” and, now, in a way that does not violate bloggers’ or readers’ rights or sensibilities.

For the background on the banning of WordPress in Turkey, see “Why We’re Blocked in Turkey: Adnan Oktar” by Matt on WordPress.com.


I’m all for open access to the Internet, so I was prepared to be somewhat lenient in my views towards wordprexy.com in their stated efforts to get around Turkey’s ban on WordPress. What I wasn’t prepared for was to go onto my blog on wordprexy’s mirror and see porn ads. Enough already.

Freedom of speech and open access or no, and I do make my allusions from time to time, bethonged buttocks and promises of three more inches in three weeks are not what I think my readers expect to see here. I’ve sent in the email requesting to be removed. After today’s spammer/scraper trench warfare and hours of research on copyright issues, plus several buggy things on my Drupal dev project, I’m done in for today. I hope that they really do remove my blog, but I’m tired and cranky enough at the moment to be more than a little cynical.

Even if wordprexy’s actions are for the stated purpose, I just don’t want my work being used in that way. Did they think it through first or try to come up with a better solution? The vitriol in the comments on their blog makes me wonder at the whole premise. Get a sponsor or a recognized NGO to help ahead of time…something…find a way to do it right, folks.

While it was a great diversion from the other frustrations in my life, I have to admit that I behaved rather badly in this matter of dealing with content scrapers. After reading “What to Do When Someone Steals Your Content,” by Lorelle on WordPress.com, I’m ashamed of myself.

I read most of the article before I realized that I simply don’t have time to do those things right now. (And it will take several readings and some note-taking to get all the action steps lined out.) Until I get settled in Phoenix, go through all the information and do everything in a professional manner – which is what I have thus far failed miserably at – I am going to have to accept that I can only do the bare minimum to deal with the content thieves right now and simply endure my indignation and wounded pride.

That’s just the way it is. I can sally forth in defense of intellectual property rights in a month or two, and do so in a more honorable manner, instead of being such a barbarian about it. Mea culpa. Sometimes I still act like there’s a fire on the other side of the door and I’ve got a halligan in my hands.

I don’t know why I get such a kick out of stuff like this, but I do. A couple of days ago I signed up for a free account on Clustrmaps.com and they just drew up my first map. (See lower portion of the sidebar to the right.)

They do a very good job of explaining how it all works on their site. They are evidently experiencing major growth – and the associated growing pains – so updates are a little slow and I only got a partial day’s worth of data on the first one, but I’m a happy little camper.

Ari, signing off for TCU, September 21, 17:21, UTC.

Now I’m really getting cynical. I did my Alexa search and after half-a-dozen pages of seeing my scraped writings (and with more to go), I got angry all over again. I clicked on one of the sites for “free guitar lessons” and saw pretty much what I expected with my content. Gee, I didn’t know I wrote for “Guitar News.” (This is not the site’s actual name, just what they are using in their scraped pages. I don’t want to malign any legitimate site that may use that moniker.) My evil twin must be ghost-writing out in cyberspace.

I went to the home page of the site and saw they’d gone one better on their AdSense links. They’d used CSS to turn a row of AdSense titles into navigation links! Gotta give them one for deviousness. I was impressed enough to go in and look at their page source. What an education! They’d stolen and/or repurposed numerous scripts and plug-ins, and had some fairly sophisticated javascripting going on, as well.

Well, well. What do you do when you are sailing in pirate-infested waters? There’s no turning back. I’m already adrift on the bright blue sea and I’ve gotten plundered more than once. This deserves some real thought and considered decisions. It will definitely affect how I promote and monetize http://www.flamencophile.com and other sites I develop.

I keep going back to the original premise of the scraper sandwich I devised – humans can scan and discern genuinely valuable content when they want to. I can only conclude that the people who click on the scraper sites’ ads are being lazy and clueless, and that the scrapers do not have as good a click-through rate as a truly worthwhile site. It is the “volume” paradigm at work as opposed to the “value” paradigm.

Parasites. Hey! “Para” – “sites,” that’s for sure. “Para-” meaning alongside of, beside, similar to or resembling. Definitely parasites – the tapeworms, liver flukes, and cattle bots of the web. Just have to be a hardy enough host to withstand a few, I guess.

That’s all for now. This is Ariel Laurel Strong signing off for the Cloud of Unknowing on September 21, 2007, at 17:20 UTC.

OMG. The scraper site that I originally made the “spam sandwich” for scarfed up both of my “vigilante blog posts,” one of which even has the unchanged link in it to the original stolen work…on their own site.

And it looks like another blogger is using a similar tactic. There’s an obviously scraped post with links back to the originating site in the first and last paragraphs with a content summary sandwiched in between. Way to go, Binary Moon.
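For the curious, here is a rough sketch of how a post with that shape could be put together: a link back to the original in the first and last paragraphs, with a summary in the middle. It is only an illustration. The function name `build_sandwich`, the wording of the notices, and the example URL are all made up for this sketch, and it is not the code behind either of the posts mentioned here.

```python
from html import escape


def build_sandwich(summary: str, original_url: str, author: str) -> str:
    """Wrap a summary between two paragraphs that link back to the original.

    Hypothetical helper: the notice text and the three-paragraph layout
    are assumptions made for this illustration.
    """
    # Escape everything so stray quotes or angle brackets cannot break the markup.
    link = (
        f'<a href="{escape(original_url, quote=True)}">'
        f"{escape(original_url)}</a>"
    )
    # First slice of bread: attribution plus a link to the source.
    top = f"<p>This post was originally published by {escape(author)} at {link}.</p>"
    # The filling: the summary meant for human readers.
    middle = f"<p>{escape(summary)}</p>"
    # Second slice of bread: a notice that survives if the post is copied whole.
    bottom = (
        "<p>If you are reading this anywhere else, it was copied without "
        f"permission. The original is at {link}.</p>"
    )
    return "\n".join([top, middle, bottom])


if __name__ == "__main__":
    print(build_sandwich(
        "A short summary of the post.",
        "https://example.com/original-post",  # placeholder URL
        "Example Author",                      # placeholder name
    ))
```

A scraper that copies the post whole carries both links along with it, so a human reader on the scraper site can still find the source.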

There is a lot of opinion out there regarding how Google penalizes people in their search rankings for duplicate content. I wonder who will listen if I submit this site showing the links and dates…the search engineers and the AdSense team would seem to be at cross purposes with one another.

This whole experience is making me rethink my strategy on my upcoming websites like http://www.flamencophile.com. I had been just assuming that I’d have a bar of ads down one side. Though I hate subjecting people to the advertising, I need to make something for my time and effort and it’s been sticking in my craw how others are making money from illegally reusing my content. I’ve been reading too many webmaster discussions lately about whether to even syndicate your content, the hours of time it takes to go after the scrapers, and the generally poor results of doing so.

I remember back in 1996 having discussions with colleagues about how the imminent monetization of the web would affect its usefulness and integrity. Back then, we never dreamed of the vast junkyards that would come to lie scattered across the virtual landscape. (Erluvi.com, this means you…)

What would an internet renewal look like? Green zones…places where quality content and ad-free sites could flourish. Pardon my dreaming, or reminiscing, as the case may be. As in all things, the best and the worst of humanity seem to flow towards the newest frontier, and left in that wake, the little settlements pop up and are subject to the predations of bandits and marauders. The good, the bad, and the ugly…

This is Ariel Laurel Strong signing off for the Cloud of Unknowing on September 21, 2007, at 15:38 UTC.

It all started out innocently enough. I just got mad about my content being scraped over and over. I’m all for humans using technology to eliminate drudgery and expand opportunity. What I don’t like is humans using technology to use other humans.

After railing about the injustice and applying a few ineffective remedies, I decided to try something else. What follows won’t work against every scraper site, but it does alert users that they are reading stolen content and asserts your claim on your intellectual property. It’s kind of funny, too. Note: This is mainly for a hosted blog. If you have access to the server where your blog resides, there are better remedies.

I used two simple principles to design a “spam sandwich” to bait a scraper’s spider:

1) A human can quickly and easily scan to see if content is relevant and interesting. A human can also skim over the irrelevant parts and extract what was meant for human consumption only.

2) An automated scraper bot cannot. Do use care, however, in designing your “fly in the ointment.” Legitimate search engines can flag “over-optimized” content which is designed to alter search engine ranking, and could mistake your “spam sandwich” for an attempt to crank up your ratings, but with a little writing skill you can avoid getting penalized by Yahoo! or Google and still target your intended quarry – the dreaded Spiderbotus scraperus stinkerii.

Here’s the bot bait I designed using the above two principles: “Three Great Ways to Increase Your Site Traffic”

Here’s the result of the experiment: “Open Season on Scraper Bots”

There are lots of ways these ideas could be improved and refined. I’d like to hear the results of any similar experiment you conduct. The possibilities are endless. And of course, if you choose to indulge in this sort of behavior remember these words of wisdom by John Steinbeck:

“It has always been my private conviction that any man who puts his intelligence up against a fish and loses had it coming.”

In the meantime, if you like the idea, by all means use it. Just give me a link back, okay? (https://dangerousangel.wordpress.com/2007/09/19/vigilante-blog-justicehow-to-out-a-scraper-bot/)

Happy hunting. :-)
Ariel Laurel Strong on dangerousangel.wordpress.com

