Podcasts are huge at the moment, and they are pretty much all I listen to these days when I am out walking or in the car. I listen via the (mainly) excellent Overcast app. I say mainly because it has a UI that I struggle to find things in. Anywho, the one thing that it lacks (other than an intuitive UI) is decent stats, and I really wanted those for my annual Review of the Year posts over on my personal blog.
After a quick DuckDuckGo I found a post from James Hodgkinson with a Python script that spat out some simple stats. It was a good starting point, so I converted it to PHP and extended it to give me more of what I needed. The original script just gave totals across all podcasts and I wanted a breakdown by podcast.
What I ended up with was the following. If you want to use it, remember to change the filename location and the date range.
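<?php
/**
 * Parses the "All data" OPML file from https://overcast.fm/account
 * and shows some simple stats.
 */
$filename = '<location to your>/overcast.opml';
// Both dates resolve to midnight at the start of the day,
// so plays later on the end date itself are not counted.
$startDate = new DateTime("2023-01-01");
$endDate = new DateTime("2023-12-21");

$XMLDATA = simplexml_load_file($filename);
if ($XMLDATA === false) {
    // Bail out early rather than failing on the loops below
    fwrite(STDERR, "Could not read or parse $filename" . PHP_EOL);
    exit(1);
}

$played_episodes = 0; // in-range episodes marked as played
$episodes = 0;        // in-range episodes not marked as played
$podcasts = 0;        // number of podcast feeds seen

// Collect the top-level outline(s) that hold the podcast feeds
$feeds = [];
foreach ($XMLDATA->body->children() as $child) {
    if ((string)$child['text'] == 'feeds') {
        $feeds[] = $child;
    }
}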
$results = [];
$i = 0;
foreach ($feeds as $obj) {
    // Each child of the feeds outline is one podcast
    foreach ($obj->children() as $playlist) {
        $podcasts++;
        $attributes = $playlist->attributes();
        $results[$i]['title'] = (string) $attributes['title'];
        $results[$i]['url'] = (string) $attributes['htmlUrl'];
        $results[$i]['count'] = 0;
        foreach ($playlist->children() as $episode) {
            // check date is in the range we want to check
            $attributes = $episode->attributes();
            $dateString = (string) $attributes['userUpdatedDate'];
            $dateTime = new DateTime($dateString);
            if ($dateTime >= $startDate && $dateTime <= $endDate) {
                if ((string)$episode['played'] != "1") {
                    $episodes++;
                } else {
                    $results[$i]['count']++;
                    $played_episodes++;
                }
            }
        }
        $i++;
    }
}
// sort the results, most played first
usort($results, function ($a, $b) {
    return $b['count'] <=> $a['count'];
});
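// Output results as an HTML table
echo '<table border="1">';
echo '<tr><th>Title</th><th>Count</th></tr>';
foreach ($results as $result) {
    // Only podcasts with more than one played episode make the table
    if ($result['count'] > 1) {
        // Escape titles and URLs so characters like & and < don't break the HTML
        $title = htmlspecialchars($result['title']);
        $url = htmlspecialchars($result['url']);
        echo '<tr>';
        if ($url != '') {
            echo '<td><a href="' . $url . '">' . $title . '</a></td>';
        } else {
            echo '<td>' . $title . '</td>';
        }
        echo '<td align="right">' . $result['count'] . '</td>';
        echo '</tr>';
    }
}
echo '</table>';
// Stop here: everything after this line never runs
die;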
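// Alternative plain-text output. Unreachable while the die above is in place;
// remove the die (and the table code) to use it instead.
$i = 0;
// output results
while ($i < count($results)) {
    if ($results[$i]['count'] > 0) {
        echo $results[$i]['title'] . ' ' . $results[$i]['count'] . PHP_EOL;
    }
    $i++;
}
?>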
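If you want to see what shape of file the script expects before pointing it at your own export, here is a minimal sketch of the same counting logic run against a tiny inline sample. The sample OPML is only my approximation, built from the attributes the script reads (`text="feeds"`, `title`, `htmlUrl`, `played` and `userUpdatedDate`); the show names and URLs are made up, and a real Overcast export will contain more than this. `countPlayed()` is a hypothetical helper name, not something from the original script.

```php
<?php
// Hypothetical, cut-down approximation of the Overcast "All data" export.
// Only the attributes the script reads are included.
$sample = <<<XML
<opml version="1.0">
  <body>
    <outline text="playlists"/>
    <outline text="feeds">
      <outline title="Example Show A" htmlUrl="https://example.com/a">
        <outline played="1" userUpdatedDate="2023-03-01T08:00:00+00:00"/>
        <outline played="1" userUpdatedDate="2023-06-15T18:30:00+00:00"/>
        <outline played="0" userUpdatedDate="2023-07-01T09:00:00+00:00"/>
      </outline>
      <outline title="Example Show B" htmlUrl="https://example.com/b">
        <outline played="1" userUpdatedDate="2023-02-10T07:45:00+00:00"/>
        <outline played="1" userUpdatedDate="2022-11-05T12:00:00+00:00"/>
      </outline>
    </outline>
  </body>
</opml>
XML;

// Count played episodes per podcast title within the given date range.
function countPlayed(SimpleXMLElement $opml, DateTime $start, DateTime $end): array
{
    $counts = [];
    foreach ($opml->body->children() as $group) {
        // Skip everything except the outline that holds the feeds
        if ((string) $group['text'] !== 'feeds') {
            continue;
        }
        foreach ($group->children() as $podcast) {
            $title = (string) $podcast['title'];
            $counts[$title] = 0;
            foreach ($podcast->children() as $episode) {
                $updated = new DateTime((string) $episode['userUpdatedDate']);
                $inRange = $updated >= $start && $updated <= $end;
                if ($inRange && (string) $episode['played'] === '1') {
                    $counts[$title]++;
                }
            }
        }
    }
    arsort($counts); // most played first, keeping titles as keys
    return $counts;
}

$counts = countPlayed(
    simplexml_load_string($sample),
    new DateTime('2023-01-01'),
    new DateTime('2023-12-21')
);
print_r($counts);
```

With this sample, Example Show A ends up with 2 and Example Show B with 1: the unplayed episode and the one from 2022 are both ignored.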
The script spits out an HTML table, so to run it you’ll need to do something like the following, specifying where the output file should be placed.
php overcast.php > ~/Downloads/overcast.html
You’ll get an HTML file that, when opened, gives you stats that look a little like the following:
As you can see I like a good Goalhanger podcast! Anyway, I hope you find it useful and if you extend it at all let me know.
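As a starting point for that kind of extension, here is a rough sketch of reading the file location and date range from the command line instead of editing the script each year. It is only a sketch under a few assumptions of mine: `parseArgs()` is a hypothetical helper, the argument order (file, start date, end date) is arbitrary, and the defaults (a file called `overcast.opml` in the current directory and the whole of the current year) are just guesses at something sensible.

```php
<?php
// Hypothetical helper: take the file path and date range from the command
// line, falling back to assumed defaults when an argument is missing.
function parseArgs(array $argv): array
{
    $year = date('Y');
    $filename = $argv[1] ?? 'overcast.opml';
    $start = new DateTime($argv[2] ?? "$year-01-01");
    $end = new DateTime($argv[3] ?? "$year-12-31");

    if ($start > $end) {
        throw new InvalidArgumentException('Start date must not be after end date.');
    }

    return [$filename, $start, $end];
}

// $argv[0] is the script name, so the real arguments start at index 1
[$filename, $startDate, $endDate] = parseArgs($argv ?? []);
```

Dropped in place of the three hard-coded assignments at the top of the script, that would let you run something like `php overcast.php ~/Downloads/overcast.opml 2023-01-01 2023-12-21 > ~/Downloads/overcast.html`.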