These forums are archived

See this post for further info

get_iplayer forums

Forum archived. Posting disabled.

Webscraping TV/Radio Cache

Pages: 1 2

user-637

Hello again,

I do not know if this is needed by others now that the new version of get_iPlayer has been released, but anyway here is the Perl code that refreshes both the "tv.cache" and "radio.cache" files to keep everything working almost like before the BBC made changes to the way it announced the BBC iPlayer programs.

There are still some limitations: at most 20 episodes of each TV show and at most 10 episodes of each radio show are cached, so sorry again to all Bargain Hunt fans ;). Also, after recording with the PVR manager the file name is almost what it should be, but it always begins with “BBC_iPlayer_Feeds”, which is something I cannot change! At least the icon image of the recorded file is correct.


INSTRUCTIONS:

I guess that this only works on a Windows PC, although I have read that it can also work on other operating systems. I have two files that I copy into the get_iplayer program files directory, which is “C:\Program Files (x86)\get_iplayer” on this machine. I then remove the “.txt” from the end of both files so that they are simply named “Refresh_Cache_V0_5.pl” and “Refresh_Cache_V0_5.pl.cmd”. I will try to upload the two files with this message. I have renamed them in this way so that they can easily be downloaded without any anti-virus software getting over-paranoid!

I then double-click the “Refresh_Cache_V0_5.pl.cmd” file to open a command window, which just starts running the Perl code (“Refresh_Cache_V0_5.pl”). You can then enter 't' or 'T' to refresh the TV cache, or 'r' or 'R' to refresh the radio cache. I then press ENTER and my computer sits there for a very long time (about 20 minutes for the TV cache and a few hours for the radio cache), scraping all the data from the BBC iPlayer website and storing the results on the hard disk. This refreshes the cache that get_iplayer uses to store all the data from the BBC iPlayer system. You can watch the progress on the screen. After this, I can start the “PVR Manager” and use get_iplayer exactly as before.
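
For anyone curious how a 't'/'r' choice like the one above might be handled, here is a minimal Perl sketch. It is not taken from the attached script: the sub name cache_for_choice is made up for illustration, and the only things borrowed from the post are the two keys and the two cache file names.

```perl
use strict;
use warnings;

# Hypothetical helper: map the key the user typed to a cache file name.
# Returns undef for anything other than t/T or r/R.
sub cache_for_choice {
    my ($input) = @_;
    return unless defined $input;
    $input =~ s/^\s+|\s+$//g;    # drop the trailing newline and stray spaces
    my %cache_for = (t => 'tv.cache', r => 'radio.cache');
    return $cache_for{ lc $input };
}

# Demonstrate with a few sample keystrokes instead of reading the keyboard.
for my $key ('t', 'R', 'x') {
    my $cache = cache_for_choice($key);
    print "$key -> ", (defined $cache ? $cache : 'unrecognised'), "\n";
}
```

In an interactive script the sample list would presumably be replaced by a line read from STDIN, with the prompt repeated until a recognised key is entered.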

***********

Thank you for the feedback, keep it coming.

Big thanks to all those who helped to make get_iPlayer!!!!! :)

Tiiveni.
Refresh_Cache_V0_5.pl_.txt
Refresh_Cache_V0_5.pl_.cmd_.txt

user-2

Thanks for sharing your code. We may be reduced to scraping the iPlayer site at some point, so it's good to have someone breaking trail.

user-645

Radio as well as TV!

Cue much rejoicing in the Iain-F household, especially as the other half is about to fly to the US for a few weeks and will miss her progs.

Well done to all.

user-550

Sadly not here.

After leaving it to run the radio feeds overnight, I found a two-line 1kb radio.cache file.

Neil

user-637

Oh dear wingtech, I am sorry.

It ran OK here a few times, but did take a couple of hours each time. I can upload the results here (radio.cache), but I have seen that the new version of get_iPlayer (v2.90) works perfectly and downloads the radio.cache in about 1 minute, so I recommend getting the latest version.

Good luck!

Tiiveni.

user-550

Spotted that since I posted. I'll install it when I have a moment.

Thanks for your efforts.

Neil

user-647

It's possible to use the API that the iPlayer app for Android uses...
/topic...e-bbc-api/

my, er, friend, has a web-controlled scheduler and a back-end job. Just needs some help with the API

user-585

@Tiiveni Did you do any more development on this?

user-637

Hi tvfan,

Sorry, I have left this and moved on to other things (like the HLS downloader written in PHP). If things ever go down completely then I think we can pick this up again, if needed. I would like to see a program running on a single central server that collects all of the radio and TV programmes and builds both the "tv.cache" and "radio.cache" files for downloading. This would allow all of the programmes to be listed and not just the last 7 days. Collecting the data takes some hours, so it would not make sense for every user to do this; a single central server should do it instead.

But I think that get_iplayer works really great just now! And I am very grateful! :o)

Thanks, Tiiveni

user-832

Some of you may find this useful: it is a development of tiiveni's webscraper script. It will only work with GiP 2.95 at the moment because of the changed cache format. It requires the Perl DateTime module, which is not in GiP's built-in Perl, so I have installed a separate copy of Perl for this script to use.

I haven't updated the radio side of the program so I would not recommend trying it.

I would strongly advise you back up your cache files before using this script.
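
A backup along those lines could look something like the Perl sketch below. It is only an illustration, not part of the attached script: the sub name backup_caches is made up, and the cache location (a ".get_iplayer" folder under the user profile) is an assumption that you should check against your own installation.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Spec;
use POSIX qw(strftime);

# Hypothetical helper: copy each named cache file that exists in $dir to
# "<name>.<timestamp>.bak". Returns the list of backup paths written.
sub backup_caches {
    my ($dir, @names) = @_;
    my $stamp = strftime('%Y%m%d_%H%M%S', localtime);
    my @written;
    for my $name (@names) {
        my $src = File::Spec->catfile($dir, $name);
        next unless -f $src;    # nothing to back up for this name
        my $dst = "$src.$stamp.bak";
        copy($src, $dst) or die "Cannot copy $src to $dst: $!\n";
        push @written, $dst;
    }
    return @written;
}

# ASSUMPTION: the caches live in ".get_iplayer" under the user profile.
my $home      = $ENV{'USERPROFILE'} || $ENV{'HOME'} || '.';
my $cache_dir = File::Spec->catdir($home, '.get_iplayer');
print "Backed up: $_\n" for backup_caches($cache_dir, 'tv.cache', 'radio.cache');
```

Files that do not exist are skipped silently, so the same call is harmless if only tv.cache is present.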

Why use this script? Well, it will allow you to update your TV cache should Auntie kill the feeds again, but it will also add the vast majority of archive and other web-only content to the cache, allowing easy downloading with the PVR (I was able to download all of this year's Glastonbury with a single wildcard search from the web PVR manager), which I'm sure some of you Olympic fans will find handy!

Unlike earlier versions of this script, 0.6 does not simply overwrite the existing TV cache file: it reads the file in first and preserves the earlier cache entries. However, it does not yet rename the existing cache file as GiP now does. My TV cache file now contains 5322 programmes dating back 37 days.

However there are some online programmes GiP doesn't get on with even though they now appear to have valid entries in the cache file, for example episodes of 'Top Gear: Extra Gear' and 'Witless', which won't display in search results.

I'm a complete Perl novice but this has been great fun to work on.

Enjoy but don't expect much support!
Refresh_Cache_V0_6.pl.txt
Refresh_Cache_V0_6.pl_.cmd.txt

user-1398

Thanks Sawtooth (and the original author). I just ran this on my Raspberry Pi running OSMC and it works well for TV programmes. I only had to change the following, and otherwise it worked as is:

$UserProfile = $ENV{'USERPROFILE'};

changed to:

$UserProfile = $ENV{'HOME'};

It should be easy to check whether the 'USERPROFILE' environment variable exists and, if not, try 'HOME'. That would make the script a bit less tied to one OS. Or you could check whether $UserProfile is empty and try setting it to HOME if it is.
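
A minimal sketch of that fallback in Perl, covering both the "not set" and the "set but empty" cases. The helper name first_nonempty_env is made up for illustration and is not in the posted script; only $UserProfile, USERPROFILE and HOME come from the lines above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the value of the first variable in @names
# that is set to a non-empty string in the given hash, or undef if none is.
sub first_nonempty_env {
    my ($env, @names) = @_;
    for my $name (@names) {
        my $value = $env->{$name};
        return $value if defined $value && length $value;
    }
    return;
}

# USERPROFILE is normally set on Windows; HOME on Linux and similar systems.
my $UserProfile = first_nonempty_env(\%ENV, 'USERPROFILE', 'HOME');

if (defined $UserProfile) {
    print "Using profile directory: $UserProfile\n";
} else {
    warn "Neither USERPROFILE nor HOME is set\n";
}
```

Passing the environment in as a hash reference keeps the lookup easy to try out with made-up values before relying on it in the script itself.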