You’ve probably heard about XKCD, a popular web comic on physics, math, computer science and science in general, and lots of other subjects. Over the years, author Randall Munroe released some pretty awesome pieces such as this, this or this. Wouldn’t it be great if new comics from XKCD automatically appeared in our inbox?
Breaking It Down
We want to send ourselves every new XKCD comic as an email. We want to do this with a Bash script triggered by cron.
For this, we need to get the latest comic data. Also we need to keep track of what we’ve sent, so we don’t send the same thing again and again. After deciding there’s a new comic that we haven’t sent to our email address, we’ll send it.
After a little research on the website, I realized that XKCD serves a JSON endpoint at https://xkcd.com/info.0.json, which describes the latest comic: its number, the image URL, the alt text; basically everything we need.
{
"month": "10",
"num": 1749,
"link": "",
"year": "2016",
"news": "",
"safe_title": "Mushrooms",
"transcript": "",
"alt": "Evolutionarily speaking, mushrooms are technically a type of ghost.",
"img": "http:\/\/imgs.xkcd.com\/comics\/mushrooms.png",
"title": "Mushrooms",
"day": "21"
}
We can use the num field, which works like an ID, to store what we sent last time and to detect when a new comic has been released.
Broken down into independent tasks, these are the seven things we need to be able to do in Bash to finish this project:
- Check if a file exists.
- Read from a file and put it into a variable.
- Download a file from a remote source and read it into a variable.
- Parse JSON data and extract a specific value by its key.
- Compare two values.
- Create a string from multiple variables (string concatenation).
- Send an email with HTML content.
I’ll assume you’re familiar with Unix concepts like piping, redirecting and appending.
Check If a File Exists
In Bash, this is a basic if structure:
if [ $someVar = $anotherVar ]; then
# do something
else
# do something else
fi
We could also use two equals signs (==), like in ‘regular’ languages. For assignment, = must be used without spaces around it, like this: pi=3. Spaces are extra important in Bash.
We’ll use the -f expression, one of the many expressions that test accepts, to see if our log file exists. The square-bracket syntax ([ -f fileName ]) is equivalent to test -f fileName because [ is basically a synonym for test that takes ] as its last parameter. To see all available expressions, run man test.
lastSent=0
if [ -f ./xkcd-last-sent.log ]; then
:  # placeholder so the empty block is valid; we read the file into lastSent next
fi
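To see the equivalence of the two spellings in practice, here is a small sketch that runs both against a throwaway file. The variable names and the use of mktemp are only for illustration and are not part of the final script.

```shell
demoFile=$(mktemp)   # a temporary file that certainly exists

# Both spellings run the same check.
if test -f "$demoFile"; then viaTest=yes; else viaTest=no; fi
if [ -f "$demoFile" ]; then viaBracket=yes; else viaBracket=no; fi

# Once the file is gone, the check fails.
rm -f "$demoFile"
if [ -f "$demoFile" ]; then afterDelete=yes; else afterDelete=no; fi

echo "test: $viaTest, brackets: $viaBracket, after delete: $afterDelete"
```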
Reading From a Text File
Since basically everything is a file in Unix, there are lots of tools to work with files. Our log file will only contain one line and no spaces, so to read this file content into a variable, we can just do this:
varName=`cat fileName`
lastSent=`cat ./xkcd-last-sent.log`
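Combining the existence check with the read, one possible shape is a small helper that falls back to 0 when the log file is missing, for example on the very first run. The function name read_last_sent and the temporary file are assumptions made for this sketch.

```shell
# Hypothetical helper: print the number stored in the given file,
# or 0 if the file does not exist yet.
read_last_sent() {
    if [ -f "$1" ]; then
        cat "$1"
    else
        echo 0
    fi
}

# Try it on a throwaway file instead of the real log.
demoLog=$(mktemp)
echo 1749 > "$demoLog"
lastSent=$(read_last_sent "$demoLog")
echo "with a log file: $lastSent"

rm -f "$demoLog"
firstRun=$(read_last_sent "$demoLog")
echo "without a log file: $firstRun"
```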
Getting JSON From a Remote Host
In Unix-like systems, the most common way to download a file from a remote host is to use wget or curl. While curl is a very sophisticated network program, wget is basically a downloader. We can use either, but let’s go with wget:
jsonStr=$(wget -qO- https://xkcd.com/info.0.json)
The -q option tells wget to be quiet, meaning no progress or status output will be printed, and -O- (a capital letter O followed by a dash) tells it to write the downloaded content to STDOUT, where the $( ) command substitution captures it.
At this point, we have our JSON content as a string in jsonStr. If the download failed, it should be empty. Let’s check that before going further:
if [ "$jsonStr" == "" ]
then
echo 'Failed pulling JSON data.'
exit 1
fi
Why are we putting $jsonStr inside quotes? Because variables are expanded before the test runs; if it’s empty, Bash will try to execute [ == "" ], which is clearly a syntax error. By wrapping our variable in quotes, we make sure the executed expression is syntactically correct even when the string is empty. This also protects us from errors caused by spaces and other special characters.
An alternative is to use double square brackets. Inside double square brackets, empty strings or strings with spaces won’t cause a syntax error:
if [[ $jsonStr == "" ]]
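The effect of the quotes is easy to try out on its own. The sketch below compares an unquoted and a quoted test on an empty variable, and also shows -z, which is another way to ask whether a string is empty. The result variables are made up for the demonstration, and the error message of the unquoted test is silenced so that only the outcome is shown.

```shell
jsonStr=""

# Unquoted: the empty variable vanishes and the test becomes [ = "" ],
# which Bash rejects, so the "else" branch runs.
if [ $jsonStr = "" ] 2>/dev/null; then unquoted="looks empty"; else unquoted="test failed"; fi

# Quoted: the test receives an empty string as its first argument.
if [ "$jsonStr" = "" ]; then quoted="looks empty"; else quoted="test failed"; fi

# -z is true when the (quoted) string has zero length.
if [ -z "$jsonStr" ]; then viaZ="looks empty"; else viaZ="not empty"; fi

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
echo "-z:       $viaZ"
```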
Parsing JSON
Now we need to parse the string and get values. The following snippet from this answer on StackOverflow looks quite compact and easy to use:
function getJsonVal() {
python -c "import json,sys;sys.stdout.write(json.dumps(json.load(sys.stdin)$1))";
}
This little function lets us get the value of a key with this syntax:
value=`echo $jsonStr | getJsonVal "['keyName']"`
We’ll modify this function a little bit. In this form, you need to specify keys with square brackets, like this: "['key']". The author probably did it this way because, when you need a value at the second level or deeper, you have to reach it like this: [firstLevel][secondLevel]. So it is wise to leave it that way to keep the code simple; otherwise you’d need more than one line of code to handle depth. But we don’t need to go deeper than one level, as our JSON data is flat.
function getJsonVal() {
python -c "import json,sys;sys.stdout.write(json.dumps(json.load(sys.stdin)['$1']))";
}
Now we can do this:
echo $jsonStr | getJsonVal keyName
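On systems where only python3 is installed, a variant along the following lines might be more convenient. This is a sketch, not the function from the StackOverflow answer: the name get_json_raw is made up, it assumes python3 is on the PATH, it passes the key as an argument instead of splicing it into the Python source, and it prints strings without the surrounding quotes that json.dumps adds.

```shell
# Hypothetical variant of getJsonVal. $1 is the key; JSON is read from stdin.
get_json_raw() {
    python3 -c '
import json, sys
value = json.load(sys.stdin)[sys.argv[1]]
# Strings are written unchanged; other types (numbers etc.) go through str().
sys.stdout.write(value if isinstance(value, str) else str(value))
' "$1"
}

# A shortened copy of the sample response shown earlier.
sample='{"num": 1749, "title": "Mushrooms", "img": "http:\/\/imgs.xkcd.com\/comics\/mushrooms.png"}'

num=$(printf '%s' "$sample" | get_json_raw num)
title=$(printf '%s' "$sample" | get_json_raw title)
img=$(printf '%s' "$sample" | get_json_raw img)

echo "$num / $title / $img"
```

Because the JSON parser decodes the escaped slashes, img comes out as a plain URL.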
So we’re not doing this with standard Unix tools, and just calling Python instead? Well, apparently there is no simple way to parse JSON using standard Unix tools. A lot of people recommend jq, and you should definitely use it if you’re frequently working with JSON at the command line. Other solutions exist that use text-processing tools like awk, grep or perl, but I found them a bit confusing for parsing JSON. One suggested solution of this kind, built on grep and perl, which I find complicated, is below:
echo $jsonStr | grep -Po '"keyName":.*?[^\\]",' | perl -pe 's/"keyName"://; s/^"//; s/",$//'
On the other hand, Python will most probably be available in every Unix-friendly environment you’ll use, so I think it’s better to use it than to add a dependency, and better than the text-processing approach because a real JSON parser is more robust than a hand-written pattern.
We can use the same syntax to extract the four values we need from the JSON string, whose key names are num, title, img and alt.
currentNum=`echo $jsonStr | getJsonVal num`
Comparing Values
At this point, we need to check if the current num is greater than the value we keep in our little log file. If not, it means we’ve already sent the latest comic to ourselves and there’s no need to send it again.
We’ve already done a comparison above by checking if our JSON string is empty. This time we will compare two variables, and use -gt (greater than) with similar syntax.
if [[ $currentNum -gt $lastSent ]]
then
title=`echo $jsonStr | getJsonVal title`
img=`echo $jsonStr | getJsonVal img`
alt=`echo $jsonStr | getJsonVal alt`
# create html string by concatenating the values above
# send mail
fi
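Since -gt only works on integers, a corrupted log file or an unexpected response would make the comparison itself fail with an error. One way to guard against that, sketched here with a made-up helper name is_newer, is to check both values before comparing them.

```shell
# Hypothetical helper: succeeds (status 0) if $1 is numerically greater than $2,
# fails with status 1 if it is not, and with status 2 if either value is
# empty or contains anything other than digits.
is_newer() {
    for value in "$1" "$2"; do
        case "$value" in
            ''|*[!0-9]*) return 2 ;;
        esac
    done
    [ "$1" -gt "$2" ]
}

if is_newer 1750 999; then echo "1750 is newer than 999"; fi
if ! is_newer 1749 1749; then echo "1749 is not newer than 1749"; fi
```

The first call is also a reminder of why a numeric operator matters here: compared as text, "1750" would sort before "999".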
String Concatenation
In Bash, you can pretty much write variables one after another to concatenate them. This is valid syntax:
aAndB=$a$b
We can also add text directly before, after or in between variables, with the help of quotes:
concatStr=$a" and some text here "$b" a little more text here"
This is all we need to produce a simple HTML string for our email content. We have a title, an image and some alt text. So something like this could work:
htmlContent='<h2>'$title'</h2><br><img src="'$img'"><br><p>'$alt'</p>'
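The same string can also be written inside a single pair of double quotes, which some people find easier to read. This is only an alternative spelling of the line above; the sample values are taken from the JSON shown earlier, with the surrounding quotes already removed.

```shell
title='Mushrooms'
img='http://imgs.xkcd.com/comics/mushrooms.png'
alt='Evolutionarily speaking, mushrooms are technically a type of ghost.'

# Variables are expanded inside double quotes; literal double quotes
# in the HTML have to be escaped with a backslash.
htmlContent="<h2>${title}</h2><br><img src=\"${img}\"><br><p>${alt}</p>"

echo "$htmlContent"
```

Neither spelling escapes characters such as < or & in the title or alt text, so a comic whose text contains them could produce slightly broken HTML.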
You’ll notice that the three strings we extracted from the JSON are wrapped in quotes. To get rid of them we’ll use a special Bash syntax that works like the substring function you may know from languages like PHP or JavaScript.
subString=${string:startPos:length}
The third value is a length, not an end position. A negative length (supported since Bash 4.2) is instead counted back from the end of the string. Remember, we want to get rid of the first and last characters and keep the rest:
title=${title:1:-1}
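If the script has to run on an older Bash, such as the 3.2 that ships with macOS, the negative length is not available. A possible alternative, shown here as a sketch with a made-up variable name, is to remove one quote from each end with the prefix and suffix removal expansions.

```shell
title='"Mushrooms"'

# ${var#pattern} removes the pattern from the start of the value,
# ${var%pattern} removes it from the end. Nothing happens if it is absent.
stripped=${title#\"}
stripped=${stripped%\"}

echo "$stripped"
```

Either way, only the outer quotes are removed; escape sequences that json.dumps produces inside the string, such as \" for a quote in the alt text, are left as they are.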
Sending Email
If you’re going to run this on a real server with a domain attached, you probably already have postfix and mailutils installed and can send email.
But if you want to run this on your personal computer, you’ll probably need to install an MTA and do some configuration.
If you’re using OS X, you have postfix by default. You can configure it to use your personal email address. This gist shows step by step how to do it.
If you’re not on OS X and don’t have any MTA, I would suggest ssmtp. It’s quite easy to install and configure. On Ubuntu, you can install it with:
sudo apt-get install ssmtp
After installation, you need to edit two configuration files, /etc/ssmtp/ssmtp.conf and /etc/ssmtp/revaliases. I’ll share the content of my configuration files for Gmail:
`/etc/ssmtp/ssmtp.conf`
mailhub=smtp.gmail.com:587
rewriteDomain=gmail.com
hostname=localhost
FromLineOverride=YES
UseTLS=YES
UseSTARTTLS=YES
AuthUser=[email protected]
AuthPass=password
`/etc/ssmtp/revaliases`
local_user_name:[email protected]:smtp.gmail.com:587
To test email using ssmtp:
echo "Test message" | sudo ssmtp -vvv [email protected]
To test email using mail:
echo "Test message" | mail -s "Subject" [email protected]
Hopefully, our test message made it to our inbox, so we can continue.
We just sent ourselves a text-based email. To send HTML, we only need to set the MIME-Version and Content-Type headers. A minimal HTML email looks like this, with a blank line separating the headers from the body:
MIME-Version: 1.0
Content-Type: text/html; charset=utf-8

our html content
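One way to produce such a message from a script is to print the headers and the body with printf and pipe the result into a mailer. The sketch below only builds the message; the helper name build_html_mail and the example address are made up, and the commented last line assumes a sendmail-compatible command (both postfix and ssmtp provide one) that reads the recipient from the To header when called with -t.

```shell
# Hypothetical helper: print a complete message.
# Arguments: recipient, subject, HTML body.
build_html_mail() {
    printf 'To: %s\n' "$1"
    printf 'Subject: %s\n' "$2"
    printf 'MIME-Version: 1.0\n'
    printf 'Content-Type: text/html; charset=utf-8\n'
    printf '\n'               # the blank line ends the header section
    printf '%s\n' "$3"
}

message=$(build_html_mail 'me@example.com' 'XKCD test' '<h2>Hello</h2>')
printf '%s\n' "$message"

# To actually send it (assumption: sendmail is installed and on the PATH):
# build_html_mail 'me@example.com' 'XKCD test' '<h2>Hello</h2>' | sendmail -t
```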
Putting it All Together
The only step left is to update our log file with the new ID. We can easily do this by redirecting the ID to our log file:
echo $currentNum > /our/log/file
This will overwrite the content of the log file with the new ID.
This is what our complete script looks like:
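What follows is a sketch of how the snippets above might be assembled, not a verbatim listing. The log path, the recipient address, the python3 one-liner and the use of sendmail -t are assumptions, and the steps are wrapped in functions so that the download and the mailer can be swapped out when trying the logic without network or mail access.

```shell
#!/usr/bin/env bash
# Sketch only: adjust LOG_FILE and RECIPIENT (both hypothetical) before use.
LOG_FILE="${LOG_FILE:-./xkcd-last-sent.log}"
RECIPIENT="${RECIPIENT:-me@example.com}"

fetch_json() { wget -qO- https://xkcd.com/info.0.json; }

send_mail() { sendmail -t; }   # reads a complete message from stdin

get_json_val() {               # $1 = key, JSON on stdin, prints the raw value
    python3 -c 'import json, sys; sys.stdout.write(str(json.load(sys.stdin)[sys.argv[1]]))' "$1"
}

main() {
    lastSent=0
    if [ -f "$LOG_FILE" ]; then lastSent=$(cat "$LOG_FILE"); fi

    jsonStr=$(fetch_json)
    if [ -z "$jsonStr" ]; then
        echo 'Failed pulling JSON data.'
        return 1
    fi

    currentNum=$(printf '%s' "$jsonStr" | get_json_val num)
    if [ "$currentNum" -gt "$lastSent" ]; then
        title=$(printf '%s' "$jsonStr" | get_json_val title)
        img=$(printf '%s' "$jsonStr" | get_json_val img)
        alt=$(printf '%s' "$jsonStr" | get_json_val alt)
        htmlContent="<h2>${title}</h2><br><img src=\"${img}\"><br><p>${alt}</p>"

        # Update the log only if the mailer reports success.
        if printf 'To: %s\nSubject: XKCD: %s\nMIME-Version: 1.0\nContent-Type: text/html; charset=utf-8\n\n%s\n' \
               "$RECIPIENT" "$title" "$htmlContent" | send_mail
        then
            echo "$currentNum" > "$LOG_FILE"
            echo "Sent comic $currentNum."
        else
            echo "Sending comic $currentNum failed."
            return 1
        fi
    else
        echo "Nothing new (latest is $currentNum)."
    fi
}
```

To run it from cron, the file would end with one more line that calls main.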
A view from my inbox. Everything in its right place. Note: gazorbazorb [at] gmail is not my email address.
Setting Crontab
To make this work on a regular basis, we will set up crontab. The XKCD website says new comics are released every Monday, Wednesday and Friday. So we can check every Monday, Wednesday and Friday night at 23:00 with this:
0 23 * * 1,3,5 /path/to/our/script
I prefer to check every morning instead, at 8:45, as I like to read them on the commute to work:
45 8 * * * /path/to/our/script
One last useful trick: we’ll append the output of our script to another log file, preferably somewhere like /var/log/cron/. This will write everything our script echoes to this file (standard output only; add 2>&1 after the file name to capture error messages as well). So if we need to debug our program, this file will come in handy, since the script echoes every time it runs and reports its decision about sending an email.
45 8 * * * /path/to/our/script >> /var/log/cron/xkcd.log
Conclusion
With crontab set, we certainly made sure that we’ll never miss new XKCD comics.
And last but not least, my two favorite XKCD comics; I couldn’t pick one over the other, so I’m putting them both here:
Recommended Reading
https://linuxacademy.com/blog/linux/conditions-in-bash-scripting-if-statements/
http://mywiki.wooledge.org/BashFAQ/031