One fine day, I was looking for minimalist wallpapers for my Elementary OS. I came across https://wallpaperplay.com/, which has a huge collection of wallpapers. But the problem was the download process: it made you click through 4 stages just to save a single image.
I was like, who is gonna go through these 4 stages to download a wallpaper? At least not me.
That's where the idea came from: automate this time-consuming process.
My approach was like this:- download the page's HTML, pull the wallpaper links out of it, turn them into absolute URLs, and hand the list back to wget.
Now it was just a matter of converting this approach into actual working code.
Suppose we have to download all 84+ wallpapers from this page:-
https://wallpaperplay.com/board/minimalist-desktop-wallpapers
1) The first thing is to download the HTML page using the wget command.
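A quick sketch of this step, using the board URL from above (wget saves the page under the URL's basename, so here the file is named minimalist-desktop-wallpapers):
wget https://wallpaperplay.com/board/minimalist-desktop-wallpapers
# the page is saved as ./minimalist-desktop-wallpapers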
2) Extract the filename from the URL using a regular expression:-
echo "$1" | grep -Eoi 'board/.*' | cut -d'/' -f2
Let's decode this RegEx:-
echo "$1" means print the URL as output.
grep -Eoi 'board/.*' means match everything that includes and comes after "board/" and using cut command we'll further cut out board from output and rest will be our filename.
For example:-
https://wallpaperplay.com/board/blue-car-wallpapers --> "blue-car-wallpapers" will be our filename.
https://wallpaperplay.com/board/hipster-galaxy-wallpapers --> "hipster-galaxy-wallpapers" will be our filename.
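You can try the extraction on its own with the first example URL:
echo "https://wallpaperplay.com/board/blue-car-wallpapers" | grep -Eoi 'board/.*' | cut -d'/' -f2
# prints: blue-car-wallpapers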
3) Now we extract the anchor tags from the downloaded HTML page using regular expressions:-
cat "minimalist-desktop-wallpapers" | grep -Eoi '<a[^>]+>'
4) Extract the links from the href attributes:-
grep -Eoi '/.*jpg'
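Here's what steps 3 and 4 do together on a single tag. The markup below is made up for illustration; the real tags come from the downloaded page:
echo '<a href="/walls/full/1/2/3/example-wallpaper.jpg" target="_blank">' | grep -Eoi '<a[^>]+>' | grep -Eoi '/.*jpg'
# -> /walls/full/1/2/3/example-wallpaper.jpg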
5) Turn the relative URLs into absolute ones by prefixing the site's domain:-
sed 's/^/https:\/\/www.wallpaperplay.com/g'
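For instance (the path here is again hypothetical):
echo '/walls/full/1/2/3/example-wallpaper.jpg' | sed 's/^/https:\/\/www.wallpaperplay.com/g'
# -> https://www.wallpaperplay.com/walls/full/1/2/3/example-wallpaper.jpg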
6) Finally, feed the collected links to wget. The -i flag tells wget to read the URLs from links.txt, and -P saves the downloads into the wallpapers/ directory:-
wget -i links.txt -P wallpapers/
Putting it all together, the whole script looks like this:-
#!/bin/bash
# Usage: ./<script-name> <board-url> [extra wget options]
wget "$1"    # step 1: download the board's HTML page
filename=$(echo "$1" | grep -Eoi 'board/.*' | cut -d'/' -f2)
cat "$filename" | grep -Eoi '<a[^>]+>' | grep -Eoi '/.*jpg' | sed 's/^/https:\/\/www.wallpaperplay.com/g' > links.txt
wget $2 $3 -i links.txt -P wallpapers/    # $2 and $3 can pass extra wget options through
rm "$filename" links.txt    # clean up the downloaded page and the link list
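Running it is straightforward. I'm calling the script wallpaperplay.sh here; use whatever name you saved it under:
chmod +x wallpaperplay.sh
./wallpaperplay.sh https://wallpaperplay.com/board/minimalist-desktop-wallpapers
# all 84+ wallpapers land in ./wallpapers/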
Here's the GitHub link to the script:-
https://github.com/iamnihal/wallpaperplay
That's all for this blog. See you soon with a new one. ;-)