Read a Windows share folder from Linux / Python
SMBClient
A Windows share drive or folder cannot be read directly from Linux without an SMB client. The smbclient tool (part of the Samba suite) speaks the SMB protocol and can list and read files on the share.
Install smbclient
sudo apt-get update
sudo apt-get install -y smbclient
sudo apt-get clean
Connect to share drive
Note that the share URL is written \\server\sharename on Windows; on Linux, replace the backslashes with forward slashes: //server/sharename.
The -U option specifies the username (with its domain) and the password. The % separates the username from the password that follows. If the password or any other part of the command contains characters that are special to the shell, wrap it in single or double quotes.
The -c option runs smbclient commands such as ls, cd and get. Several commands can be chained with semicolons, e.g. -c 'ls; cd subfolder; get file /tmp/file'
smbclient '//server/sharename' -U domain/username%'password' -c 'ls'
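When the share location is copied from Windows, the slash conversion described above can be done in code. The following is a minimal sketch; the helper name unc_to_smb is hypothetical and not part of smbclient.

```python
def unc_to_smb(unc_path: str) -> str:
    # Turn a Windows UNC path into the form smbclient expects,
    # e.g. \\server\sharename becomes //server/sharename.
    return unc_path.replace("\\", "/")
```

The result can then be used as the first argument of the smbclient command.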
Run command through python
To run the smbclient command programmatically and retrieve files, call it through subprocess and capture the standard output.
import subprocess
command = "smbclient '//server/sharename' -U domain/username%'password' -c 'ls'"
output = subprocess.run(command, shell=True, capture_output=True, text=True)
lines = output.stdout.split('\n')
print(lines)
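As an alternative to building one shell string, the same call can be passed to subprocess as a list of arguments. No shell is involved in that case, so a password with quotes or spaces needs no extra quoting. This is only a sketch: build_smbclient_args and run_smbclient are hypothetical helper names, and raising on a non-zero exit code is an assumption about how failures should be handled.

```python
import subprocess


def build_smbclient_args(share, user, password, commands):
    # One list element per argument; the smbclient commands are chained with ';'.
    return ["smbclient", share, "-U", f"{user}%{password}", "-c", "; ".join(commands)]


def run_smbclient(share, user, password, commands):
    # Run smbclient without a shell and return the lines of its standard output.
    result = subprocess.run(
        build_smbclient_args(share, user, password, commands),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Assumption: a non-zero exit code is treated as a failed call.
        message = result.stderr.strip() or result.stdout.strip()
        raise RuntimeError(f"smbclient failed: {message}")
    return result.stdout.splitlines()
```

For example, run_smbclient('//server/sharename', 'domain/username', 'password', ['cd subfolder', 'ls']) would correspond to the chained command form shown earlier.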
Run more commands
The output of the first 'ls' command lists the subfolders. The next call navigates into a subfolder and lists the files in it.
command = "smbclient '//server/sharename' -U domain/username%'password' -c 'cd subfolder; ls'"
output = subprocess.run(command, shell=True, capture_output=True, text=True)
Download a file
Given a file path found through the navigation above, the file can be downloaded to a local folder.
Here the command simply changes (cd) to the subfolder and then uses 'get' to copy the file to the local '/tmp/' folder.
command = "smbclient '//server/sharename' -U domain/username%'password' -c 'cd subfolder; get file.csv /tmp/file.csv'"
output = subprocess.run(command, shell=True, capture_output=True, text=True)
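The value passed to -c for a download can also be assembled from its parts. The sketch below wraps each name in double quotes, which smbclient accepts for names that contain spaces. The helper name build_get_command is hypothetical, and the code assumes the names themselves contain no double quotes.

```python
from pathlib import PurePosixPath


def build_get_command(remote_dir, filename, local_dir):
    # Local target path, e.g. /tmp/file.csv
    local_path = PurePosixPath(local_dir) / filename
    # Double quotes keep names with spaces together inside smbclient.
    command = f'cd "{remote_dir}"; get "{filename}" "{local_path}"'
    return command, local_path
```

After the smbclient call, checking the return code and that the local path exists is a simple sanity check that the download worked.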
Extract the first field (name) from output
Simply pipe the standard output to a command such as awk.
Here awk runs the print action inside {} on every output line and displays its first field. Fields are delimited by whitespace, and $1 refers to the first field, so a name that contains spaces is cut off at the first space.
smbclient '//server/sharename' -U domain/username%'password' -c 'cd subfolder; ls' | awk '{print $1}'
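When the output has already been captured in Python, the names can be extracted there instead of with awk. The sketch below assumes each listing row has the layout "two leading spaces, name, attribute letters, size, timestamp"; that layout is an assumption and may differ between smbclient versions. The names LS_ROW and parse_ls are hypothetical. Unlike a split on the first space, this keeps names that contain spaces intact and marks folders through the D attribute.

```python
import re

# Assumed row layout: name, attribute letters (e.g. D for a directory),
# size in bytes, then a timestamp such as "Tue Jan  2 10:00:00 2024".
LS_ROW = re.compile(
    r"^\s{2}(?P<name>.+?)\s+(?P<attrs>[A-Z]*)\s+(?P<size>\d+)"
    r"\s+\w{3} \w{3}\s+\d+ \d{2}:\d{2}:\d{2} \d{4}$"
)


def parse_ls(stdout):
    # Return one dict per listed entry, skipping '.' and '..' and any
    # line that does not match the assumed layout (e.g. the summary line).
    entries = []
    for line in stdout.splitlines():
        match = LS_ROW.match(line.rstrip())
        if match and match.group("name") not in (".", ".."):
            entries.append({
                "name": match.group("name"),
                "is_dir": "D" in match.group("attrs"),
                "size": int(match.group("size")),
            })
    return entries
```

The entries with is_dir set to True are the subfolders that a later 'cd' can navigate into.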
Download multiple files or folders
The get command used previously fetches only one file at a time. The 'mget' command can fetch multiple files.
The files to download can be listed individually, 'mget abc1.csv abc2.csv', or matched with a mask, 'mget abc*.csv'.
Remember to set 'prompt OFF'; otherwise the command pauses and waits for confirmation.
The 'lcd /tmp/subfolder' command is a local cd that points to the local download folder.
smbclient '//server/sharename' -U domain/username%'password' -c 'prompt OFF; lcd /tmp/subfolder; cd subfolder; mget abc1.csv abc2.csv'
smbclient '//server/sharename' -U domain/username%'password' -c 'prompt OFF; lcd /tmp/subfolder; cd subfolder; mget abc*.csv'
To download folders recursively, set 'recurse ON'
smbclient '//server/sharename' -U domain/username%'password' -c 'prompt OFF; recurse ON; lcd /tmp/subfolder; cd subfolder; mget 2024*.csv'
Note that mget and get require navigating to the folder that contains the files first and then downloading by file name only. 'mget folder/2024*.csv' does not work; it has to be 'cd folder; mget 2024*.csv'.
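The split into 'cd folder' and 'mget mask' can be automated when the starting point is a single remote pattern such as folder/2024*.csv. This is a minimal sketch; build_mget_command is a hypothetical helper name, and it assumes the folder and mask contain no spaces.

```python
import posixpath


def build_mget_command(remote_pattern, local_dir, recurse=False):
    # Separate the folder part from the file mask, e.g.
    # 'folder/2024*.csv' gives ('folder', '2024*.csv').
    remote_dir, mask = posixpath.split(remote_pattern)
    steps = ["prompt OFF"]
    if recurse:
        steps.append("recurse ON")
    steps.append(f"lcd {local_dir}")
    if remote_dir:
        steps.append(f"cd {remote_dir}")
    steps.append(f"mget {mask}")
    return "; ".join(steps)
```

The returned string is meant to be passed as the value of the -c option.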