Working with Python and SFTP

A typical use case for a networked Python application might involve the need to copy a remote file down to the computer on which a script is running, or to create a new file and transfer it to a remote server. More often than not, this is accomplished through the use of SFTP (Secure File Transfer Protocol). In this second part of a three-part series on network programming in Python, we will look at ways to work with Python, SFTP, SSH, and sockets.

You can read the first part of this series by visiting our tutorial: Python and Basic Networking Operations.

Before getting into coding, it is important to emphasize how SFTP ensures that file transfers are secure. One important thing that the SFTP process does is validate what is called the SSH Fingerprint of the remote SFTP Server. The SSH Fingerprint is a hash of the server’s public host key, so it is unique to a given SFTP server. Note that an SFTP Server’s Public Key is NOT the same thing as the SSH Public Key that a user stores on the remote server in order to authenticate without a password.

Verifying the SSH Fingerprint is crucial for ensuring that the SFTP Server is indeed the SFTP Server that a remote user thinks it is. If an SFTP connection attempt reports that the SSH Fingerprint has changed, it could mean one of a few things:

  • The operator of the SFTP Server has upgraded it or made some configuration changes to the SFTP Server.
  • A malicious user is attempting to impersonate the remote server, most likely for the purposes of harvesting login credentials. This is an example of a “Man in the Middle” attack.

In either case, it is absolutely imperative that any code that automates an SFTP process verifies that the SSH Fingerprint is valid and, if it is not, immediately ceases any connection attempts until the identity of the server has been confirmed. SFTP, like all other protocols in the SSH suite, operates on TCP Port 22 by default. All of the examples in this article follow this convention. If another port is needed, then that port must be specified in place of TCP Port 22.
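
As a rough illustration of that rule, the sketch below compares a recorded host key with the key a server presents and refuses to go any further on a mismatch. The function names and the HostKeyMismatch exception are hypothetical and not part of any library. In practice the SSH library performs this comparison itself, so this only shows the shape of the check:

```python
import base64
import hmac


class HostKeyMismatch(Exception):
    """Hypothetical exception raised when the presented key is not the recorded one."""


def host_key_matches(expected_type, expected_b64, presented_type, presented_blob):
    """Return True only if both the key type and the raw key bytes match."""
    expected_blob = base64.b64decode(expected_b64)
    same_type = expected_type == presented_type
    # compare_digest does a constant-time comparison of the two byte strings.
    same_key = hmac.compare_digest(expected_blob, presented_blob)
    return same_type and same_key


def require_trusted_host(expected_type, expected_b64, presented_type, presented_blob):
    """Stop before any credentials are sent if the host key is not the recorded one."""
    if not host_key_matches(expected_type, expected_b64, presented_type, presented_blob):
        raise HostKeyMismatch("Host key mismatch; not sending credentials")
```

A caller would run require_trusted_host before any step that transmits a username or password, and treat the exception as a reason to stop rather than to retry.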

Getting the Initial SSH Fingerprint with SFTP

The easiest way to get a copy of the SFTP Server’s SSH Fingerprint is to connect to it with a freely-available SFTP client, or with the OpenSSH tools which are provided with both Linux and Windows 10 Professional.

Linux

From a terminal, simply invoke the sftp command directly. If the SSH Fingerprint is not known, or if it has been changed, the command will prompt the user to that effect. The following command will query the remote server to retrieve its SSH Fingerprint before attempting to connect. Once the connection is confirmed per below, the user will be prompted for the password for the specified account:

$ sftp sftp://my-username@my-sftp-host-or-ip

Getting a SSH Fingerprint in Linux, with the fingerprint and algorithm highlighted

Confirming the addition of the fingerprint will save it to the ~/.ssh/known_hosts file, which is specific to each individual user account on a Linux system. In the figure above, the hash of the SSH Fingerprint, the hashing algorithm used (SHA-256), and the signature scheme used (Ed25519) are all highlighted with red rectangles.
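
For reference, the SHA-256 value that OpenSSH displays can be reproduced from the base64 key text stored in a known_hosts entry: decode the base64, hash the result with SHA-256, and base64-encode the digest without its padding. A minimal sketch, in which the function name is illustrative and the sample bytes are a placeholder rather than a real key:

```python
import base64
import hashlib


def sha256_fingerprint(key_b64):
    """Compute an OpenSSH-style SHA256 fingerprint from a base64 public key blob."""
    blob = base64.b64decode(key_b64)
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest as base64 with the trailing '=' padding removed.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


# Placeholder input only; a real call would pass the base64 field of a known_hosts line.
sample_key_b64 = base64.b64encode(b"not-a-real-host-key").decode("ascii")
print(sha256_fingerprint(sample_key_b64))
```

Comparing this value with the one shown by the sftp prompt is a quick way to confirm that a stored entry belongs to the expected server.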

The Python code for Linux below will fail unless “yes” is answered when the sftp command asks whether to continue connecting, because that answer is what adds the host record to the ~/.ssh/known_hosts file. It is a good security practice to make sure that any program code which makes use of SFTP is limited to the hosts already added by the end user.

While this information can be important in other applications, the Python module that will be featured in these demonstrations is more interested in the stored host key from which the fingerprint is derived. The stored entries can be listed by dumping the contents of the ~/.ssh/known_hosts file:

A Sample known_hosts File

In the figure above, the encryption or signature algorithm that corresponds to the hash is highlighted. Note that in this specific Linux distribution and OpenSSH implementation, the host itself, which is the value on the leftmost side of each line, is hashed.
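
A hashed host field can still be checked against a known host name. In OpenSSH's hashed format the field has the form |1|salt|hash, where the hash is an HMAC-SHA1 of the host name keyed with the salt, and both parts are base64-encoded. The sketch below assumes that format; the function names are illustrative only:

```python
import base64
import hashlib
import hmac


def hash_host(hostname, salt):
    """Build a hashed known_hosts host field: |1|base64(salt)|base64(HMAC-SHA1)."""
    digest = hmac.new(salt, hostname.encode("utf-8"), hashlib.sha1).digest()
    return "|1|{}|{}".format(
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    )


def hashed_entry_matches(host_field, hostname):
    """Return True if a hashed host field was generated from `hostname`."""
    parts = host_field.split("|")
    # A hashed field splits into ['', '1', salt, hash]; anything else is not hashed.
    if len(parts) != 4 or parts[0] != "" or parts[1] != "1":
        return False
    salt = base64.b64decode(parts[2])
    expected = hash_host(hostname, salt)
    return hmac.compare_digest(expected.encode("utf-8"), host_field.encode("utf-8"))
```

Running hashed_entry_matches over the first field of each line is one way to find which entry belongs to a given server when the host names are not readable.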

Windows

Windows 10 provides an official implementation of OpenSSH which can be used to retrieve the SSH Fingerprint of a remote SFTP Server. Before continuing, follow the instructions at Get started with OpenSSH and verify that at a minimum, the “OpenSSH Client” Windows Add-On is installed in Windows. Once it is installed, a file similar to the ~/.ssh/known_hosts file can be generated using the command:

C…> ssh-keyscan my-sftp-host-or-ip > known_hosts.txt

The known_hosts.txt file will look similar to the following. Note how, in this situation, the IP Address (or host name) is not hashed:

The known_hosts file doppelganger.

As was the case with the Linux ~/.ssh/known_hosts file, the signature algorithm is listed with the SSH Fingerprint, but in the Windows implementation of OpenSSH, the host entries are not hashed.

Now that the SSH Fingerprints of the remote server in question are known, it is time to move on to the code. The server examples here made use of the ssh-ed25519 signature algorithm and SHA-256 hashing. Other servers may use different encryption schemes and the code will need to be tweaked accordingly.

Paramiko Module

Paramiko is a Python module which implements SSHv2. The demonstrations in this Python tutorial will focus strictly on SFTP connectivity and basic SFTP usage. The example below was run on Ubuntu 22.04 LTS with Python version 3.10.4. In this system, the command python3 must be explicitly used to invoke Python 3. Consequently, the pip command associated with this system is pip3. Other systems may alias the command python to invoke Python 3. In these situations, the command would be pip.

To install the Paramiko module, use the command:

$ pip3 install paramiko

Linux

The code below connects from Linux, and obtains the host key from the ~/.ssh/known_hosts file. The code verifies that the SSH fingerprint matches before allowing a connection:

# demo-sftp.py

import paramiko
import sys

def main(argv):
  hostkeys = paramiko.hostkeys.HostKeys (filename="/home/phil/.ssh/known_hosts")
  # The host fingerprint is stored using the ed25519 algorithm. This was revealed
  # when the host was initially connected to from the sftp program invoked earlier.
  hostFingerprint = hostkeys.lookup ("my-sftp-host-or-ip")['ssh-ed25519']    


  try:
    # Note that Transport takes a single argument describing the endpoint, here
    # a (host, port) tuple. A second positional argument would be read as a
    # window size, not as the port.
    tp = paramiko.Transport(("my-sftp-host-or-ip", 22))

    # Note that while you *can* connect without checking the hostkey, you really
    # shouldn't. Without checking the hostkey, a malicious actor can steal
    # your credentials by impersonating the server.
    tp.connect (username = "my-username", password="my-password", hostkey=hostFingerprint)
    try:
      sftpClient = paramiko.SFTPClient.from_transport(tp)
      fileCount = 0
      # Proof of concept - List First 10 Files
      for file in sftpClient.listdir():
        print (str(file))
        fileCount = 1 + fileCount
        if 10 == fileCount:
          break
      sftpClient.close()
    except Exception as err:
      print ("SFTP failed due to [" + str(err) + "]")

    tp.close()
  except paramiko.ssh_exception.AuthenticationException as err:
    print ("Can't connect due to authentication error [" + str(err) + "]")
  except Exception as err:
    print ("Can't connect due to other error [" + str(err) + "]")

if __name__ == "__main__":
  main(sys.argv[1:])

Below is a sample of the output:

Output of Listing 1

Windows

The same command can be used to install the Paramiko module in Windows:

C…> pip3 install paramiko

Once Paramiko is installed in Windows, make a note of the known_hosts.txt file created above. The Windows implementation below assumes that the known_hosts.txt file is in the same directory as the Python code.

The same code from before can be adapted for Windows:

# demo-sftp-windows.py

import paramiko
import sys

def main(argv):
  hostkeys = paramiko.hostkeys.HostKeys (filename="known_hosts.txt")
  # The host fingerprint is stored using the ed25519 algorithm. This was revealed
  # when the host was initially connected to from the sftp program invoked earlier.
  hostFingerprint = hostkeys.lookup ("my-sftp-host-or-ip")['ssh-ed25519']    


  try:
    # Note that Transport takes a single argument describing the endpoint, here
    # a (host, port) tuple. A second positional argument would be read as a
    # window size, not as the port.
    tp = paramiko.Transport(("my-sftp-host-or-ip", 22))

    # Note that while you *can* connect without checking the hostkey, you really
    # shouldn't. Without checking the hostkey, a malicious actor can steal
    # your credentials by impersonating the server.
    tp.connect (username = "my-username", password="my-password", hostkey=hostFingerprint)
    try:
      sftpClient = paramiko.SFTPClient.from_transport(tp)
      fileCount = 0
      # Proof of concept - List First 10 Files
      for file in sftpClient.listdir():
        print (str(file))
        fileCount = 1 + fileCount
        if 10 == fileCount:
          break
      sftpClient.close()
    except Exception as err:
      print ("SFTP failed due to [" + str(err) + "]")

    tp.close()
  except paramiko.ssh_exception.AuthenticationException as err:
    print ("Can't connect due to authentication error [" + str(err) + "]")
  except Exception as err:
    print ("Can't connect due to other error [" + str(err) + "]")

if __name__ == "__main__":
  main(sys.argv[1:])

The Importance of Closure

Note how in both code listings, the sftpClient and tp objects are both closed near the end of the blocks where they are used. This is crucial because certain underlying operations may block if these objects are not closed.
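
One way to make sure both objects are closed even when an operation raises an exception is to wrap them with contextlib.closing from the standard library, which calls close() on the way out of a with block. The sketch below is one possible arrangement and not the listings' own approach; list_first_names, open_transport and open_sftp are hypothetical names, with the two callables standing in for whatever code creates the connected transport and the SFTP client:

```python
from contextlib import closing


def list_first_names(open_transport, open_sftp, limit=10):
    """List up to `limit` remote names, closing both objects no matter what."""
    with closing(open_transport()) as tp:
        with closing(open_sftp(tp)) as sftp_client:
            # The inner block exits first, so the SFTP client is closed
            # before the transport it depends on.
            return sftp_client.listdir()[:limit]
```

With Paramiko, open_transport could be a small function that creates and connects the transport as in the listings above, and open_sftp could be paramiko.SFTPClient.from_transport.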

Simulating Security Problems

As this Python tutorial has made a “big deal” out of ensuring that the SSH Fingerprint matches the one that was discovered initially, it might be interesting to “simulate” what a host impersonation attack might look like. To do this, simply open the ~/.ssh/known_hosts file and make a change to the host key for the system being connected to in Listing 1:

Intentionally Corrupting the SSH Fingerprint in Linux

Since this Linux distribution stores hashes instead of host names for entries, the entry to modify has to be inferred: it is the only entry that uses the Ed25519 algorithm. It is incumbent upon the developer to ensure that, if this type of testing is necessary, the proper host record is modified. While the example above changes one letter, any character on that line to the right of “ssh-ed25519 ” can be changed.

Now running the code in Listing 1 again gives this error:

A Proper Failure due to a mismatched SSH Fingerprint

If the SSH Fingerprint had changed because a malicious user was attempting to impersonate the SSH server, this would be a very welcome failure, because the login credentials are never transmitted when the SSH Fingerprint does not match.

To fix the problem created above, simply invoke the original command used to discover the SSH Fingerprint and follow the instructions it provides:

Fixing the intentionally created SSH Fingerprint mismatch

Once the offending entries are removed, simply re-attempt SFTP into the original host as per the initial steps above to re-add the SSH Fingerprint to the ~/.ssh/known_hosts file.

The same security problem can be simulated in Windows by making a similar change to the known_hosts.txt file created above:

Intentionally Corrupting the SSH Fingerprint in Windows

And the same error appears in the Windows version of the code. To fix the above problem, simply recreate the known_hosts.txt file per the above steps.

Common SFTP Tasks with Python

The Paramiko module provides a very rich and robust toolkit for simple as well as very complex SFTP tasks. This section will highlight some of the more basic and common SFTP tasks.

Uploading Files with SFTP and Python

The put method uploads a file to the SFTP Server, within the context of an existing open SFTP connection:

# demo-sftp-upload.py

import paramiko
import sys

def main(argv):
	hostkeys = paramiko.hostkeys.HostKeys (filename="/home/phil/.ssh/known_hosts")
	# The host fingerprint is stored using the ed25519 algorithm. This was revealed
	# when the host was initially connected to from the sftp program invoked earlier.
	hostFingerprint = hostkeys.lookup ("my-sftp-host-or-ip")['ssh-ed25519']


	try:
		# Note that Transport takes a single argument describing the endpoint, here
		# a (host, port) tuple. A second positional argument would be read as a
		# window size, not as the port.
		tp = paramiko.Transport(("my-sftp-host-or-ip", 22))

		# Note that while you *can* connect without checking the hostkey, you really
		# shouldn't. Without checking the hostkey, a malicious actor can steal
		# your credentials by impersonating the server.
		tp.connect (username = "my-username", password="my-password", hostkey=hostFingerprint)

		# Use a dictionary object to create a list of files to upload, along with their remote paths.
		# Note that the first entry attempts to upload to a directory without write permissions.
		filesToUpload = {"./Wiring Up Close - Annotated.jpeg":"./no-upload-allowed/Wiring Up Close - Annotated.jpeg",
			"./lipsum.txt":"./lipsum.txt", 
			"./3 Separate LEDs - Full Diagram - Cropped.jpeg":"./3 Separate LEDs - Full Diagram - Cropped.jpeg"}


		sftpClient = paramiko.SFTPClient.from_transport(tp)
		for key, value in filesToUpload.items():
			try:
				sftpClient.put(key, value)
				print ("[" + key + "] successfully uploaded to [" + value + "]")
			except PermissionError as err:
				print ("SFTP Operation Failed on [" + key + 
					"] due to a permissions error on the remote server [" + str(err) + "]")
			except Exception as err:
				print ("SFTP failed due to other error [" + str(err) + "]")

		# Make sure to close all created objects.
		sftpClient.close()

		tp.close()
	except paramiko.ssh_exception.AuthenticationException as err:
		print ("Can't connect due to authentication error [" + str(err) + "]")
	except Exception as err:
		print ("Can't connect due to other error [" + str(err) + "]")

if __name__ == "__main__":
	main(sys.argv[1:])

Listing 3 - Uploading Files

Note how the full path, including the file name, is specified for each file to be uploaded, both locally and remotely. This is because the put method may raise an error if only the remote directory is specified.
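
When many files go to the same remote directory, the remote path can be built from the local file name rather than typed out for each entry. A small sketch using only the standard library; remote_path_for and build_upload_map are hypothetical helper names. posixpath is used for the remote side because SFTP paths use forward slashes regardless of the local operating system:

```python
import os
import posixpath


def remote_path_for(local_path, remote_dir):
    """Join a remote directory with the local file's name, using '/' separators."""
    return posixpath.join(remote_dir, os.path.basename(local_path))


def build_upload_map(local_paths, remote_dir):
    """Return a {local path: remote path} dictionary like the one in the listing."""
    return {path: remote_path_for(path, remote_dir) for path in local_paths}
```

The resulting dictionary can be iterated in the same way as filesToUpload, passing each key and value to the put method.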

The “no-upload-allowed” directory on the remote SFTP server is explicitly configured to be non-writable, for the purposes of illustrating what happens when an upload is attempted to such a directory.

As was expected, the one attempt to upload to a non-writable directory resulted in a permissions error. The other uploads succeeded.

Downloading Files in SFTP with Python

The get method downloads files from the SFTP Server, within the context of an existing open SFTP connection:

# demo-sftp-download.py

import os
import paramiko
import sys

def main(argv):
 hostkeys = paramiko.hostkeys.HostKeys (filename="known_hosts.txt")
 # The host fingerprint is stored using the ed25519 algorithm. This was revealed
 # when the host was initially connected to from the sftp program invoked earlier.
 hostFingerprint = hostkeys.lookup ("my-sftp-host-or-ip")['ssh-ed25519']


 try:
  # Note that Transport takes a single argument describing the endpoint, here
  # a (host, port) tuple. A second positional argument would be read as a
  # window size, not as the port.
  tp = paramiko.Transport(("my-sftp-host-or-ip", 22))

  # Note that while you *can* connect without checking the hostkey, you really
  # shouldn't. Without checking the hostkey, a malicious actor can steal
  # your credentials by impersonating the server.
  tp.connect (username = "my-username", password="my-password", hostkey=hostFingerprint)

  # Use a dictionary object to create a list of files to download, along with their remote paths.
  # Note that the first entry attempts to download from a directory with no files.
  
  # Note that while this dictionary shows the local path as the key and the remote path
  # as the value, the get method expects the remote path as its first parameter, so the 
  # call to that will look "backwards."
  filesToDownload = {"./Wiring Up Close - Annotated.jpeg":"./no-upload-allowed/Wiring Up Close - Annotated.jpeg",
   "./lipsum.txt":"./lipsum.txt", 
   "./3 Separate LEDs - Full Diagram - Cropped.jpeg":"./3 Separate LEDs - Full Diagram - Cropped.jpeg"}
  sftpClient = paramiko.SFTPClient.from_transport(tp)
  for key, value in filesToDownload.items():
   # Note how the remote file to download is specified first. The path to which it will be saved
   # locally is the second parameter.
   try:
    sftpClient.get (value, key)
    print ("[" + value + "] successfully downloaded to [" + key + "]")
   except FileNotFoundError as err:
    print ("File download failed because [" + value + "] did not exist on the remote server.")
    # Note that the get method may leave a zero-length file in the local path.
    # This should be deleted.
    if os.path.exists(key):
     os.remove(key)
   except Exception as err:
    print ("File download failed for [" + value + "] due to other error [" + str(err) + "]")
  
  # Make sure to close all created objects.
  sftpClient.close()
  tp.close()
 except paramiko.ssh_exception.AuthenticationException as err:
  print ("Can't connect due to authentication error [" + str(err) + "]")
 except Exception as err:
  print ("Can't connect due to other error [" + str(err) + "]")

if __name__ == "__main__":
 main(sys.argv[1:])

Listing 4 - Downloading Files

The major takeaway from this listing is that while the dictionary used here contains the same values as the one in the previous listing, it is the value, as opposed to the key, which drives the download operation, because the get method takes the remote path first. Another minor twist is that in some circumstances, a zero-length file is created locally when the remote file cannot be downloaded. It is a good practice to delete such files.
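
The cleanup step can be factored into a small helper so that it is applied to every failed download, not only to missing files. This is a sketch under the assumption that download failures surface as OSError subclasses such as FileNotFoundError, as the except clauses in the listing suggest. download_or_clean_up is a hypothetical name, and get stands for any callable with the same (remote path, local path) argument order as the get method:

```python
import os


def download_or_clean_up(get, remote_path, local_path):
    """Call get(remote_path, local_path); remove any leftover local file on failure."""
    try:
        get(remote_path, local_path)
        return True
    except OSError:
        # A failed transfer can leave an empty or truncated file behind.
        if os.path.exists(local_path):
            os.remove(local_path)
        return False
```

Inside the loop from the listing, the call would look like download_or_clean_up(sftpClient.get, value, key), with the return value deciding which message to print.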

The listing above gives the following output. Note the directory listing before and after:

The output of Listing 4, showing the downloaded files.

As was the case with uploading a file, an error message was displayed when attempting to download a file that did not exist.

Deleting Files with SFTP and Python

The remove method deletes files on the remote server assuming that the account used to log into the server has sufficient permissions to do so:

# demo-sftp-delete.py

import paramiko
import sys

def main(argv):
	hostkeys = paramiko.hostkeys.HostKeys (filename="/home/phil/.ssh/known_hosts")
	# The host fingerprint is stored using the ed25519 algorithm. This was revealed
	# when the host was initially connected to from the sftp program invoked earlier.
	hostFingerprint = hostkeys.lookup ("my-sftp-host-or-ip")['ssh-ed25519']


	try:
		# Note that Transport takes a single argument describing the endpoint, here
		# a (host, port) tuple. A second positional argument would be read as a
		# window size, not as the port.
		tp = paramiko.Transport(("my-sftp-host-or-ip", 22))

		# Note that while you *can* connect without checking the hostkey, you really
		# shouldn't. Without checking the hostkey, a malicious actor can steal
		# your credentials by impersonating the server.
		tp.connect (username = "my-username", password="my-password", hostkey=hostFingerprint)

		# Use a list to create a list of files to delete, including their remote paths.
		filesToDelete = [ "./no-upload-allowed/Wiring Up Close - Annotated.jpeg",
			"./lipsum.txt", "./3 Separate LEDs - Full Diagram - Cropped.jpeg",
			"./no-upload-allowed/Non-Blocking Input - Key Codes Kali.png"]

		sftpClient = paramiko.SFTPClient.from_transport(tp)

		for file in filesToDelete:
			try:
				sftpClient.remove(file)
				print ("[" + file + "] successfully deleted.")
			except PermissionError as err:
				print ("SFTP Delete Failed on [" + file + 
					"] due to a permissions error on the remote server [" + str(err) + "]")
			except FileNotFoundError as err:
				print ("SFTP Delete Failed on [" + file + "] because it was not found.")
			except Exception as err:
				print ("SFTP failed due to other error [" + str(err) + "]")

		# Make sure to close all created objects.
		sftpClient.close()

		tp.close()
	except paramiko.ssh_exception.AuthenticationException as err:
		print ("Can't connect due to authentication error [" + str(err) + "]")
	except Exception as err:
		print ("Can't connect due to other error [" + str(err) + "]")

if __name__ == "__main__":
	main(sys.argv[1:])

Listing 5 - Deleting Files

Note that additional except clauses are needed to cover the two common reasons why a delete may fail: insufficient permissions and a file that does not exist.
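
The specific handlers work because Python chooses the OSError subclass from the error number carried by the exception, so a permissions failure and a missing file arrive as different types. The sketch below demonstrates this with exceptions constructed by hand instead of a live server; describe_delete_failure is a hypothetical helper, and the assumption is that the SFTP library reports these failures with the usual errno values:

```python
import errno


def describe_delete_failure(err):
    """Map an exception from a delete attempt to a short reason string."""
    if isinstance(err, PermissionError):
        return "permission denied"
    if isinstance(err, FileNotFoundError):
        return "file not found"
    return "other error: " + str(err)


# Constructing OSError with an errno yields the matching subclass, which is why
# the specific except clauses must be listed before the generic one.
denied = OSError(errno.EACCES, "Permission denied")
missing = OSError(errno.ENOENT, "No such file")
print(describe_delete_failure(denied))
print(describe_delete_failure(missing))
```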

Other SFTP and Python Considerations

If the purpose of an SFTP-enabled Python application is to perform some sort of operation on a subset of files to be downloaded, then it is a good practice to download each file individually into the local computer’s temporary directory. It is almost never a good idea to “copy” the entire contents of a remote site before performing additional operations.
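
A minimal sketch of that pattern follows: each file is fetched into a temporary directory, handed to a processing function, and removed before the next one is fetched. The names process_one_at_a_time, get and process are illustrative; get stands for any callable with the (remote path, local path) argument order of the get method, and process is whatever work the application performs on the local copy:

```python
import os
import posixpath
import tempfile


def process_one_at_a_time(get, remote_paths, process):
    """Download each remote file to a scratch directory, process it, then delete it."""
    results = {}
    with tempfile.TemporaryDirectory() as scratch:
        for index, remote_path in enumerate(remote_paths):
            # Prefix with the index so equal file names in different
            # remote directories cannot collide locally.
            name = "{}-{}".format(index, posixpath.basename(remote_path))
            local_path = os.path.join(scratch, name)
            get(remote_path, local_path)
            try:
                results[remote_path] = process(local_path)
            finally:
                os.remove(local_path)
    return results
```

Because the scratch directory is removed when the with block ends, nothing is left behind even if a download or the processing step raises an exception.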

Malware sometimes uses SFTP to steal a local computer’s files, and certain virus scanning software is on the lookout for multiple sequential SFTP operations occurring outside of the purview of a traditional SFTP client such as FileZilla or WinSCP. In such situations, the virus scanning software may simply block the execution of an SFTP-enabled Python application. It may be necessary to create an exception to allow SFTP-enabled Python applications to operate.

In the next installment of this Python network programming tutorial, we will look at ways to work with Python and HTTPS on the client-side.
