Monthly Archives: October 2010

Building a Data Feed Driven Website (with @WebgainsUK Feeds)

Part 1: Webgains Data feed Introduction

Where I work (@WebgainsUK), we often have affiliates contact us wanting help with data feeds. Whilst a lot of these queries are technical questions about an issue with a feed, or reports of a problem with a merchant’s feed, some are as involved as asking how to build a site using the data feeds, how to import feeds into a database, or how to build a price comparison site. Sadly, as much as Webgains tries to help with these sorts of questions wherever possible, it simply can’t offer the in-depth, time-consuming answers that tasks of this size require.

Thankfully, there are tools out there such as Easy Content Units, which allows affiliates to insert content units into their site without requiring any knowledge of data feeds. A plus point of being a Webgains affiliate is that Webgains is partnered with Easy Content Units, so affiliates get the service for free. However, having 3×3, 4×4 or other size grids of products powered by someone else’s website isn’t always what an affiliate wants; some folks want to do their own thing, and know how it works. Also, having products powered by Easy Content Units doesn’t do much for your SEO: you can’t index each individual product, you can’t have an in-built search, you can’t categorise your products, and you’re also relying on their server to be up 100% of the time. Throw in a few other excuses for not wanting to use it, and you have a compelling reason why affiliates are asking how to build their own data-driven website.

So with that in mind, I figured I’d have a go at providing a solution in the format of a guide, “Building a Data Feed Driven Website”. This guide will come in several parts; I have no idea how many parts there will be, and I have no idea if I will ever finish it. First things first though, let’s set out the objective.


To build a website that:
> is able to download data feeds from Webgains on a regular basis, so that the prices are always correct.
> can display 1 product per page for SEO reasons.
> has SEO-friendly URLs.
> can categorise products.
> has a neat search facility.
> has a voucher facility to accompany the products.
> can list the latest offers and promotions from merchants.

For now, that can be the ‘simplistic’ objective – we can always add to it later if we desire. Notice how the requirement mentions ‘data feeds from Webgains’; the reason for this is that as a Webgains employee, I feel I have a duty to not help you promote other networks. If by reading my guide, you feel technically inclined to mash the code to allow you to download feeds from other networks then feel free, but don’t ask me how 🙂

Tools/Resources/Knowledge Required

A basic understanding of affiliate marketing is a given; you should know what products you want to promote, you should know what a data feed is, you should know why you are reading this, and you should know what it is you want to achieve. It’s probably worth noting that you won’t get very far without a Webgains affiliate account. Sign up to Webgains here.

Ultimately, you are going to need access to a server capable of running PHP5 & MySQL5. Getting a server up and running with PHP5 and MySQL5 is, I’m afraid, beyond the realm of this guide. However, if you have a spare computer lying around, may I suggest that you install Ubuntu. If you need help with that, Google this: “ubuntu LAMP server”. Failing that, you could always try a webhost like Fasthosts.

In addition, you will need a decent text editor (Notepad++, or EditPlus, which is the one I use) and a willingness to learn. As this guide is meant to be a resource for those who already have a basic understanding of PHP & MySQL, you will have to ‘catch up’ where necessary. If possible, I’ll try to give ‘beginner’s guide’ tips, but the ultimate goal is for you to reach the objective, so maybe a Twitter account will be required so you can ask me questions: @bobbyjason.


Due to the complexity of this project (it’s not such a small project after all), I’m going to code the site in sections, each showing how to do one bit. I can’t promise that the final project will involve all the bits being put together, as each section is meant to be a mini-guide in its own right. Again, throw any questions my way: @bobbyjason.

Getting Started

The most obvious place to start is at the heart of the problem: data feeds. If you knew how to work with data feeds, and you knew how to load them into a database, you probably wouldn’t be reading this guide, so it is for that reason I have decided to start here. For the sake of brevity, I’m not going to explain in detail the difference between downloading a data feed directly and downloading a feed via a URL. In short, for our objective we want to download the feed via a URL, as this will allow our website to be automated.

So, head over to the “Data feed url generator” and select the “xml” option, rather than “csv”. XML is a much more structured approach, and it makes life much easier when trying to find a problem when things go wrong; plus, I prefer working with XML and it’s MY guide! Select the “.gz” option; this is the only option that allows you to download a compressed feed immediately. “.zip” and “.tar.gz” require you to hang around a little, which is not so good. Select ONE of the programs: not all, just ONE. You can then select ‘All’ at the categories option, and select the ‘extended’ fields option. Finally, enter your username and password, and grab the URL; you should have something that looks similar to this:

If you have a URL like the one above (with the correct username and password), you should be able to paste it into your browser and download a .gz compressed feed of the program you selected. Notice how you can replace the “programs” parameter with any other program ID to download a feed for a different program. Straight away you should be able to see the logic required here: we need to create a PHP script that can loop through each of the programs whose feeds we wish to download, and enter the correct program ID. In pseudo form, here it is:

foreach program
….download feed for this program
end foreach
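To make the “swap the programs parameter” idea a bit more concrete, here is a minimal sketch. The base URL is a made-up placeholder on example.com, not the real Webgains feed address, and the helper name feed_url_for_program is hypothetical; use the URL that the generator gave you instead.

```php
<?php
// Hypothetical helper: returns $base_url with its "programs" query
// parameter set to $program_id. All other parameters are left untouched.
function feed_url_for_program($base_url, $program_id)
{
    $parts = parse_url($base_url);

    // Turn the query string into an array so one value can be replaced.
    $query = array();
    if (isset($parts['query'])) {
        parse_str($parts['query'], $query);
    }
    $query['programs'] = $program_id;

    $path = isset($parts['path']) ? $parts['path'] : '';

    return $parts['scheme'].'://'.$parts['host'].$path.'?'.http_build_query($query);
}

// Placeholder URL for illustration only; paste in your own generated URL.
$base_url = 'http://example.com/feed.php?format=xml&programs=4084&username=USERNAME';

foreach (array(4084, 116) as $program_id) {
    print feed_url_for_program($base_url, $program_id)."\n";
}
```

This does the same job as filling in a placeholder with sprintf; it just starts from a complete URL rather than a template, which may be handy if you copy the URL straight from the generator.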

Of course, in reality we don’t want to just download the feed; we also want to take the data out of it and populate our database, but I figured baby steps would be best as not everyone will understand that just yet. So let’s get sizzling. Glancing at the clock I can see that it is 18/10/2010 22:17, which means that I am missing the last episode of The Inbetweeners, which is rather upsetting. Thankfully we have 4OD in this modern world, so I’ll create this script and then head off to bed!

<?php /* lesson1.php */

ini_set('display_errors', true);

// Create an array of the programs that we want to download
$programs = array(4084, 116);

print "Starting...\n";

// PHP foreach loop.
foreach ($programs as $program_id) {
    // Use the "sprintf" function to pass the correct values to the string.
    // Also, splitting up the string to avoid long lines of code.
    $feed_url = sprintf(''.
    12345, 'USERNAME', 'PASSWORD',$program_id);

    // Set the path of where you would like to save the file.
    $file = '/home/chops/xml/guide/feeds/'.$program_id.'.xml';
    $compressed = $file.'.gz';

    // Download the file (requires the function 'curl_get_file_contents', defined below).
    $data = curl_get_file_contents($feed_url);

    // Open the file for writing, write, and close the file.
    // Closing it makes sure everything is on disk before gunzip reads it.
    $fp = fopen($compressed, 'w+');
    fwrite($fp, $data);
    fclose($fp);

    // Call a UNIX command to unzip the file into our desired location.
    // escapeshellarg() quotes the paths so the shell treats them as plain text.
    shell_exec('gunzip -c '.escapeshellarg($compressed).' > '.escapeshellarg($file));
}

print "Done.\n";

// Function to download a file.
function curl_get_file_contents($url)
{
    // Output something so we know it's working.
    print "Downloading '".$url."'\n";

    $c = curl_init();
    // Return the response as a string instead of printing it.
    curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($c, CURLOPT_URL, $url);
    // Both timeouts are given in seconds.
    curl_setopt($c, CURLOPT_CONNECTTIMEOUT, 5000);
    curl_setopt($c, CURLOPT_TIMEOUT, 10000);
    $contents = curl_exec($c);
    curl_close($c);

    return $contents;
}
You can copy that code into a text document and save it as ‘lesson1.php’ (be sure to replace the username, password and campaign ID with your own values!). You can call the script directly in a browser if you wish; personally, I shall use the terminal. The files will be downloaded to the specified location. Note: if there are any PHP gurus out there, you will notice that this code is not the most sophisticated, but I’m trying to keep it simple for those who may not be so familiar with PHP.
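One thing the script does not do is check what came back. curl_exec returns false when the request fails, and I am assuming (not verified) that a bad username or password gets you an error page rather than a compressed feed. A rough sanity check is to look for the two-byte gzip signature at the start of the data before saving it. The helper name looks_like_gzip is hypothetical, and the data below is made up rather than a real download.

```php
<?php
// Hypothetical helper: true if $data starts with the two-byte gzip
// signature (0x1f 0x8b), false for anything else (including false or '').
function looks_like_gzip($data)
{
    return is_string($data)
        && strlen($data) >= 2
        && substr($data, 0, 2) === "\x1f\x8b";
}

// Made-up examples standing in for what a download might return.
$good = gzencode('<products></products>');   // needs PHP's zlib extension
$bad  = '<html>Login failed</html>';

var_dump(looks_like_gzip($good));
var_dump(looks_like_gzip($bad));
var_dump(looks_like_gzip(false));
```

In the loop you could call this on $data and skip the program, with a message, when it returns false.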

All the script does is:
> loops through our selected programs.
> downloads the compressed feed for each one.
> extracts the compressed file to {program_id}.xml.

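The extract step relies on the gunzip command, which won’t be available on every host (Windows, or shared hosting with shell_exec disabled). As an alternative, here is a minimal sketch that does the same thing with PHP’s own zlib functions, assuming the zlib extension is enabled. The helper name gunzip_file and the demo file names are hypothetical.

```php
<?php
// Hypothetical helper: decompress $compressed (a .gz file) into $file
// using PHP's zlib functions instead of the gunzip command.
function gunzip_file($compressed, $file)
{
    $in = gzopen($compressed, 'rb');
    if ($in === false) {
        return false;
    }

    $out = fopen($file, 'wb');
    if ($out === false) {
        gzclose($in);
        return false;
    }

    // Copy in small chunks so a large feed is never held in memory at once.
    while (!gzeof($in)) {
        fwrite($out, gzread($in, 8192));
    }

    gzclose($in);
    fclose($out);

    return true;
}

// Try it on a throwaway file in the system temp directory.
$file       = sys_get_temp_dir().'/lesson1_demo.xml';
$compressed = $file.'.gz';
file_put_contents($compressed, gzencode('<products></products>'));

if (gunzip_file($compressed, $file)) {
    print file_get_contents($file)."\n";
}
```

In lesson1.php this would replace the shell_exec line with gunzip_file($compressed, $file).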
For now that will have to do, because I want to publish this blog post and go to bed. I hope it’s useful to somebody!

Coming up: How to parse the XML files and insert the products into a database.