thedoedoeblog

Musings of a small game development team

Scraping iTunes: Returning HTML

Written by Bill Soistmann on March 20, 2009 at 11:50 am

Today I want to look at how to modify our Perl script so that it returns the HTML we need instead of a number. The current version of our script can be found here. Before I get ahead of myself, I should mention that we will continue to ignore the country code for now. Adding an option for grabbing the data from another iTunes store takes quite a bit of code, so I will cover that in my last post.

Today's goal is to add an option for returning the data as HTML instead of a number. It makes sense to keep the number as an option in case someone still wants it. At this point, we know we can add two arguments to the URL:

  1. country=US - which the script currently ignores
  2. id=304570595 - the application id

For our new addition we will use an argument named html and pass in 'true' for true and anything else (including nothing) for false. The script will assume false if no argument is passed in.

While we are at it, let’s go ahead and add an option for half stars.

We start by adding these lines

my $htmlformat;
my $halves;
			

with the rest of the variable declarations. Then we add another branch to our splitVars function. This should do it:

elsif(index($item, "html", 0) >= 0){
	$htmlformat = substr($item, index($item, "html", 0) + 5, length($item));
}
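
To make the + 5 offset concrete, here is a small worked example of the same index/substr calls on a single pair. This is just a sketch; the sample value "html=true" is my own illustration.

```perl
use strict;
use warnings;

# Sample pair as it might appear in the query string (illustrative value).
my $item = "html=true";

# index returns 0 here because "html" starts at the first character.
my $pos = index($item, "html", 0);

# Skip the 4 characters of "html" plus the "=" sign, hence the + 5.
# Asking for more characters than remain simply returns the rest.
my $value = substr($item, $pos + 5, length($item));

print "$value\n";   # prints "true"
```

Note that index only tests for a substring, so a pair such as xhtml=1 would match this branch as well.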
			

Now is a good time to improve on the code that grabs the arguments. The current version is fine for a couple of args but we can see that it will quickly become unwieldy. Let’s clean it up a bit and rewrite the splitVars function as follows:

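sub splitVars{
	# Nothing to parse if the web server passed no query string.
	my $buffer = $ENV{'QUERY_STRING'};
	return unless defined $buffer && length $buffer;
	foreach my $pair (split(/&/, $buffer)){
		# Limit the split to 2 fields so a value may itself contain "=".
		my ($name, $value) = split(/=/, $pair, 2);
		next unless defined $name;
		$value = '' unless defined $value;
		# Decode URL escapes: "+" is a space, %XX is a byte in hex.
		$value =~ tr/+/ /;
		$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
		$data{$name} = $value;
	}
	$currentSoftware = $data{'id'};
	$htmlformat = $data{'html'};
	$coCode = $data{'country'};
	$halves = $data{'halves'};
}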
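
If you want to try the parsing loop outside of a CGI environment, something like the following should work. It is a sketch, not part of our script: parseQuery is a hypothetical stand-alone variant that takes the query string as an argument and returns a hash instead of setting our globals.

```perl
use strict;
use warnings;

# Hypothetical stand-alone variant of the parsing loop (not in the script):
# takes the query string as an argument and returns a hash of name => value.
sub parseQuery {
	my ($query) = @_;
	my %data;
	foreach my $pair (split(/&/, $query)) {
		my ($name, $value) = split(/=/, $pair, 2);
		next unless defined $name && length $name;   # skip empty pairs
		$value = '' unless defined $value;           # "halves" with no "="
		$value =~ tr/+/ /;                           # "+" encodes a space
		$value =~ s/%([a-fA-F0-9]{2})/pack("C", hex($1))/eg;   # %XX escapes
		$data{$name} = $value;
	}
	return %data;
}

my %data = parseQuery('id=304570595&country=US&html=true');
print "$_ = $data{$_}\n" for sort keys %data;
# prints:
# country = US
# html = true
# id = 304570595
```

Arguments that are not passed simply never appear in the hash, so looking them up gives undef.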
		

Now we can change the main logic a bit so we have this:

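print "Content-type: text/html\n\n";
splitVars();
$store = 143441;	# US store; the country argument is still ignored
getReviews();
# An argument that was not passed is undef, which counts as false.
if(defined $htmlformat && $htmlformat eq 'true') {
	printHTML();
} else {
	if (defined $halves && $halves eq 'true') {
		print $result * 2;	# rating expressed in half stars
	} else {
		print $result;
	}
}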
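
The order of the checks matters: html wins over halves. Here is a minimal sketch of that precedence as a pure function. The name outputMode and the mode labels it returns are hypothetical, used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the script): reports which branch of the main
# logic would run. Only the exact string 'true' switches an option on.
sub outputMode {
	my ($htmlformat, $halves) = @_;
	return 'html'   if defined $htmlformat && $htmlformat eq 'true';
	return 'halves' if defined $halves && $halves eq 'true';
	return 'number';
}

print outputMode('true', 'true'), "\n";   # prints "html"
print outputMode(undef,  'true'), "\n";   # prints "halves"
print outputMode(undef,  undef),  "\n";   # prints "number"
```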
			

Now we can clean up some variables we no longer need, but we also call a function we haven’t written yet: printHTML.

Let’s get the function done and then I will show you our new version. This function will do exactly what we did with the PHP - with differences in syntax, of course. We end up with something like this.

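sub printHTML{
	# Work in whole half stars so the loop only compares integers.
	my $halfStars = int($result * 2 + 0.5);
	for (my $i = 0; $i < 5; $i++) {
		my ($t, $thisimg);
		if ($halfStars >= 2) {
			# At least one whole star left to draw.
			$t = 'Full';
			$thisimg = 'star_on';
		} elsif ($halfStars == 1) {
			# The single leftover half comes after the full stars.
			$t = 'Half';
			$thisimg = 'star_half';
		} else {
			$t = 'No';
			$thisimg = 'star';
		}
		print "<img src=\"images/$thisimg.gif\" alt=\"$t Star\" />\n";
		$halfStars -= 2;
	}
}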
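
To check the intended order of the images (full stars first, then at most one half star, then empty ones), it helps to pull the selection logic into a function that returns the image names instead of printing. This is a sketch under that assumption; starImages is a hypothetical name, while the image names are the ones used above.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the script): returns the five image names
# for a rating between 0 and 5, rounded to the nearest half star.
sub starImages {
	my ($rating) = @_;
	my $halfStars = int($rating * 2 + 0.5);   # rating in whole half stars
	my @images;
	for my $i (1 .. 5) {
		if    ($halfStars >= 2) { push @images, 'star_on'; }
		elsif ($halfStars == 1) { push @images, 'star_half'; }
		else                    { push @images, 'star'; }
		$halfStars -= 2;   # each position consumes one whole star
	}
	return @images;
}

print join(' ', starImages(3.5)), "\n";
# prints: star_on star_on star_on star_half star
```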
			

What we have now is a version with several options.

  1. id - REQUIRED - pass in app id
  2. country - OPTIONAL - pass in country code - default is US (still ignored for now; the script always uses the US store)
  3. halves - OPTIONAL - pass in true for number of half stars - default is false, has no effect if next argument is true
  4. html - OPTIONAL - pass in true for html instead of a number - default is false

Grab the whole thing over here if you’d like.

This is part of a series of posts that starts here. The next post is here.
