RaiPlayDL: a Python script to automatically download and merge RaiPlay Radio podcasts

RaiPlay is the online platform of RAI, the Italian national broadcasting company, and it offers a large amount of interesting content for free.

A special section is dedicated to radio channels, with a lot of good (Italian) audio content: documentaries, radio dramas, talk shows, and audiobooks.

All content can be freely downloaded and is often organized in convenient collections (like this one: https://www.raiplayradio.it/playlist/2018/09/Scienza-70e2f134-ef46-46b5-b4bb-7164d3829195.html), but unfortunately the download process requires a lot of user interaction, and there is no way to download a whole collection at once.

So, using Python, I’ve developed a simple script that downloads an entire collection and merges all the files into a single MP3 with the correct ID3 tags and an album cover.

The resulting file can be played in an audiobook player or on a simple MP3 player (for example, a car media center).


How does it work?

Every collection page contains a list of entries, each with the same pattern of attributes:

<li role="playlist-item" class=" " data-mediapolis="http://mediapolisvod.rai.it/relinker/relinkerServlet.htm?cont=ID92O7v74HceeqqEEqual" data-download="true" data-uniquename="ContentItem-e0b4380c-1f91-4552-9b8f-d9b6976801d1" data-title="WIKIRADIO - Hanna e Barbera " data-image="/cropgd/50x50/dl/components/img/radio/player/placeholder_img.png" data-href="/audio/2018/06/WIKIRADIO---Hanna--Barbera-e0b4380c-1f91-4552-9b8f-d9b6976801d1.html" rairadio-tooltip="">

Using BeautifulSoup, the script searches for all tags that have both the data-mediapolis and data-title attributes and uses the data extracted from these tags to compile a list of downloads.
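The search described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name extract_tracks and the SAMPLE_HTML snippet (a shortened version of the entry shown earlier) are assumptions made for this example.

```python
from bs4 import BeautifulSoup

# Shortened, illustrative copy of a playlist entry (assumption: real pages
# carry more attributes, but only these two matter for the search).
SAMPLE_HTML = """
<ul>
  <li role="playlist-item"
      data-mediapolis="http://mediapolisvod.rai.it/relinker/relinkerServlet.htm?cont=ID92O7v74HceeqqEEqual"
      data-title="WIKIRADIO - Hanna e Barbera "></li>
  <li class="other">not a track</li>
</ul>
"""


def extract_tracks(html):
    """Return (title, url) pairs for every tag that has both attributes."""
    soup = BeautifulSoup(html, "html.parser")
    tracks = []
    # attrs={...: True} matches any tag where the attribute is present.
    for tag in soup.find_all(attrs={"data-mediapolis": True, "data-title": True}):
        title = tag["data-title"].strip()  # titles may carry trailing spaces
        tracks.append((title, tag["data-mediapolis"]))
    return tracks


if __name__ == "__main__":
    for title, url in extract_tracks(SAMPLE_HTML):
        print(f'Found "{title}" ({url})')
```

Matching on the presence of the attributes rather than on the tag name or class keeps the search independent of the surrounding markup, at least as long as the two data attributes stay in place.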

Finally, using pydub, all the MP3 files are merged into a single large file, with ID3 tags and the album cover extracted from the page.
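A merge step of this kind might look like the sketch below. It is only an illustration under stated assumptions: the helper names build_tags and merge_mp3, the choice of tag fields, and the file names in the usage comment are hypothetical and not taken from the script. pydub also needs ffmpeg available on the system to read and write MP3 files.

```python
def build_tags(collection_title, artist="RaiPlay Radio"):
    """Build the ID3 tag dictionary for the merged file.

    The field selection and the default artist are assumptions of this sketch.
    """
    return {"title": collection_title, "album": collection_title, "artist": artist}


def merge_mp3(paths, output_path, tags, cover_path=None):
    """Concatenate the MP3 files in order and export a single tagged MP3."""
    # Imported here so the helpers above can be used without pydub/ffmpeg.
    from pydub import AudioSegment

    merged = AudioSegment.empty()
    for path in paths:
        merged += AudioSegment.from_mp3(str(path))  # append each track
    # export() writes the ID3 tags and, if given, embeds the cover image.
    merged.export(str(output_path), format="mp3", tags=tags, cover=cover_path)
    return output_path


# Example usage (hypothetical file names):
# merge_mp3(["001.mp3", "002.mp3"], "collection.mp3",
#           build_tags("Romeo e Giulietta"), cover_path="cover.jpg")
```

Note that this approach decodes every track into memory and re-encodes the result, so long collections can take a while and use a fair amount of RAM.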

MP3s on iOS app “BookPlayer”

The final file can easily be played on a simple feature phone, an MP3 player, or a car stereo:

MP3 played on my VW

Install and usage

First, clone the git repository:

$ git clone https://github.com/andreafortuna/RaiPlayDL.git

I suggest creating and activating a venv to install the dependencies without making any changes to the system:

$ python3 -m venv env 
$ source env/bin/activate

Then, install the dependencies:

$ env/bin/pip3 install -r requirements.txt 

Finally, start the tool, passing the URL of a RaiPlay Radio collection:

$ ./RaiPlayDL.py "https://www.raiplayradio.it/playlist/2018/01/William-Shakespeare-Romeo-e-Giulietta-61324db1-8ff4-4aea-8792-35aad7dfff99.html"

Starting download for "Romeo e Giulietta - Il Teatro di Radio3 - Rai Radio 3 - RaiPlay Radio"
Download collection image…
Download single MP3s…
Download "Romeo e Giulietta - Atto 1 - prima parte" (http://mediapolisvod.rai.it/relinker/relinkerServlet.htm?cont=GR6THsIe7wgeeqqEEqual)
Download "Romeo e Giulietta - Atto 1 - seconda parte" (http://mediapolisvod.rai.it/relinker/relinkerServlet.htm?cont=d1ss35ohEToeeqqEEqual)
Download "Romeo e Giulietta - Atto 2" (http://mediapolisvod.rai.it/relinker/relinkerServlet.htm?cont=vefl3ApPpPlussEJ1MeeqqEEqual)
Download "Romeo e Giulietta - Atto 3 - prima parte" (http://mediapolisvod.rai.it/relinker/relinkerServlet.htm?cont=pPpPlusshWTpnKCVz8eeqqEEqual)
Download "Romeo e Giulietta - Atto 3 - seconda parte" (http://mediapolisvod.rai.it/relinker/relinkerServlet.htm?cont=kCi3VFi7rHYeeqqEEqual)
Download "Romeo e Giulietta - Atto 4" (http://mediapolisvod.rai.it/relinker/relinkerServlet.htm?cont=4znY91sz7sAeeqqEEqual)
Download "Romeo e Giulietta - Atto 5" (http://mediapolisvod.rai.it/relinker/relinkerServlet.htm?cont=Gbl3X4a2x4keeqqEEqual)
Merge all MP3 files…
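
The download loop behind the output above could be sketched like this. It is an assumption-laden illustration rather than the script's real code: safe_filename, download_tracks, and the numbered file naming scheme are hypothetical, and the sketch relies on urllib following the redirect issued by the relinker URL.

```python
import re
import urllib.request
from pathlib import Path


def safe_filename(title, index):
    """Build a sortable, filesystem-friendly MP3 name from a track title."""
    slug = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_")
    return f"{index:03d}_{slug}.mp3"


def download_tracks(tracks, dest_dir, fetch=urllib.request.urlretrieve):
    """Download each (title, url) pair into dest_dir and return the paths.

    `fetch` is injectable so the loop can be tried without network access.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, (title, url) in enumerate(tracks, start=1):
        target = dest / safe_filename(title, index)
        print(f'Download "{title}" ({url})')
        fetch(url, str(target))  # urlretrieve follows HTTP redirects
        paths.append(target)
    return paths
```

Prefixing each file with its position keeps the tracks in playlist order, which matters when they are merged afterwards.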

Disclaimer

All podcast data and download URLs are extracted by simple ‘scraping’ of the RaiPlay Radio website: any change to the page structure may break the script, and I don’t know whether I’ll have any free time for this project in the future.
However, the source code is freely available: if you want to contribute, just send a pull request!


Download and references
