One fine day, I was looking for minimalist wallpapers for my Elementary OS. I came across https://wallpaperplay.com/, which has a huge collection of wallpapers. But the problem was the download process: it made you click through 4 stages just to save a single image.
I was like, who is gonna go through these 4 stages to download a wallpaper? At least not me.
That's where the idea came from: automate this time-consuming process.
My approach was like this:- download the page's HTML, pull the wallpaper links out of it, turn them into absolute URLs, and hand the list back to wget.
Now it was just a matter of converting this approach into actual working code.
Suppose we have to download all 84+ wallpapers from this page:-
https://wallpaperplay.com/board/minimalist-desktop-wallpapers
1) The first thing is to download the HTML page using the wget command.
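A quick sketch of this step, using the board URL from above (wget saves the page under the URL's basename, so here the file is named minimalist-desktop-wallpapers):
wget https://wallpaperplay.com/board/minimalist-desktop-wallpapers
# the page is saved as ./minimalist-desktop-wallpapers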
2) Extract the filename from the URL using a regular expression:-
echo "$1" | grep -Eoi 'board/.*' | cut -d'/' -f2
Let's decode this RegEx:-
echo "$1" means print the URL as output.
grep -Eoi 'board/.*' means match everything that includes and comes after "board/" and using cut command we'll further cut out board from output and rest will be our filename.
For example:-
https://wallpaperplay.com/board/blue-car-wallpapers --> "blue-car-wallpapers" will be our filename.
https://wallpaperplay.com/board/hipster-galaxy-wallpapers --> "hipster-galaxy-wallpapers" will be our filename.
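You can try the extraction on its own with the first example URL:
echo "https://wallpaperplay.com/board/blue-car-wallpapers" | grep -Eoi 'board/.*' | cut -d'/' -f2
# prints: blue-car-wallpapers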
3) Now we extract the anchor tags from the downloaded HTML page using regular expressions:-
cat "minimalist-desktop-wallpapers" | grep -Eoi '<a[^>]+>'
4) Extract the links from the href attributes:-
grep -Eoi '/.*jpg'
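Here's what steps 3 and 4 do together on a single tag. The markup below is made up for illustration; the real tags come from the downloaded page:
echo '<a href="/walls/full/1/2/3/example-wallpaper.jpg" target="_blank">' | grep -Eoi '<a[^>]+>' | grep -Eoi '/.*jpg'
# -> /walls/full/1/2/3/example-wallpaper.jpg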
5) Turn the relative URLs into absolute ones by prefixing the site's domain:-
sed 's/^/https:\/\/www.wallpaperplay.com/g'
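For instance (the path here is again hypothetical):
echo '/walls/full/1/2/3/example-wallpaper.jpg' | sed 's/^/https:\/\/www.wallpaperplay.com/g'
# -> https://www.wallpaperplay.com/walls/full/1/2/3/example-wallpaper.jpg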
6) Finally, feed the collected links to wget. The -i flag tells wget to read the URLs from links.txt, and -P saves the downloads into the wallpapers/ directory:-
wget -i links.txt -P wallpapers/
Putting it all together, the whole script looks like this:-
#!/bin/bash
# Usage: ./<script-name> <board-url> [extra wget options]
wget "$1"    # step 1: download the board's HTML page
filename=$(echo "$1" | grep -Eoi 'board/.*' | cut -d'/' -f2)
cat "$filename" | grep -Eoi '<a[^>]+>' | grep -Eoi '/.*jpg' | sed 's/^/https:\/\/www.wallpaperplay.com/g' > links.txt
wget $2 $3 -i links.txt -P wallpapers/    # $2 and $3 can pass extra wget options through
rm "$filename" links.txt    # clean up the downloaded page and the link list
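Running it is straightforward. I'm calling the script wallpaperplay.sh here; use whatever name you saved it under:
chmod +x wallpaperplay.sh
./wallpaperplay.sh https://wallpaperplay.com/board/minimalist-desktop-wallpapers
# all 84+ wallpapers land in ./wallpapers/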
Here's the GitHub link to the script:-
https://github.com/iamnihal/wallpaperplay
That's all for this blog. See you soon with a new one. ;-)