Using the New Python 3.2 Concurrent Programming Features

  • April 13, 2011
  • By Edmon Begoli

The most recent release of Python (3.2) introduces a new module called concurrent.futures, which offers Python programmers a streamlined and flexible means of handling common concurrent programming tasks: submitting work to threads and processes, handling results, synchronizing execution, and pooling worker threads and processes.

Three kinds of objects -- Futures, Executors and executor pools (ThreadPoolExecutor and ProcessPoolExecutor) -- constitute the basis of this new module, and the common concurrent programming tasks mentioned above are accomplished through the interaction of these three.

Programmers coming from Java will find these features very familiar: the design of the concurrent.futures module is directly inspired by the namesake constructs in Java's java.util.concurrent package.

What Are Python Futures, Executors and Executor Pools?

Concurrent programming is, by the nature of the model, a more challenging task than single-threaded sequential programming. Programmers need to manage distribution of the tasks, non-deterministic execution flows, and synchronization of the completion of the concurrent tasks.

The purpose of the Future class, as a design concept, is to mitigate some of the cognitive burdens of concurrent programming. Futures, as a higher-level abstraction over the thread of execution, offer a means of initiating, executing and tracking the completion of concurrent tasks.

One can think of a Future as an object that models a running task which, unlike a synchronously executing function, will produce its result at some point in the future. Futures offer methods to query the status of the running task and, if necessary, to shut it down.

Futures are not meant to be instantiated and executed directly by the programmer. One can think of them as interfaces that can be queried but not created by hand. Futures come into existence when tasks (functions with optional parameters) are submitted to Executors, which are launchers that initialize and start the Futures.

Executors, in Python, are abstractions that are accessed through their concrete subclasses: ThreadPoolExecutor or ProcessPoolExecutor.

Using pools of threads and processes is another established best practice in concurrent programming. Instantiating threads and processes is a resource-demanding task, so it is better to pool these resources and reuse them as repeatable launchers or "executors" (hence the Executors concept) for parallel or concurrent tasks.
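Because both pool types expose the same Executor interface, switching between thread-based and process-based execution mostly comes down to which pool class is instantiated. The sketch below illustrates the idea; the helper name run_in_pool and its parameters are my own and do not appear in the article:

import concurrent.futures

def run_in_pool(task, *args, use_processes=False):
    # Minimal sketch: pick a pool class, submit one task, wait for its result.
    # ProcessPoolExecutor requires the task to be a picklable, module-level function.
    pool_class = (concurrent.futures.ProcessPoolExecutor if use_processes
                  else concurrent.futures.ThreadPoolExecutor)
    with pool_class(max_workers=5) as executor:
        future = executor.submit(task, *args)
        return future.result()

As a general rule, ThreadPoolExecutor suits I/O-bound work such as the file searches used later in this article, while ProcessPoolExecutor sidesteps the global interpreter lock for CPU-bound work.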

Python Futures, Executors and ExecutorPools in Action

Let's look at some examples.

In this article, I will use a search over several files as an example of how one can use Futures and Executors in Python to execute tasks suitable for concurrent execution.

Starting with the simplest tasks, I will introduce some of the most interesting functions of the concurrent.futures package, and we will move up gradually in complexity.

Let's start with the simple example of running individually instantiated search tasks for three different strings in three different files:

import concurrent.futures

def basic_search():
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        # Each call to submit returns a Future representing the running task
        future1 = executor.submit(search, "test1.txt", "Test")
        future2 = executor.submit(search, "test2.txt", "synergy")
        future3 = executor.submit(search, "test3.txt", "Something Else")
        # result() blocks until the corresponding task has finished
        print(future1.result())
        print(future2.result())
        print(future3.result())
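The snippets in this article call a search function whose implementation is not shown in this excerpt. Purely as a placeholder, a minimal sketch of such a helper might look like the following; this is my assumption for illustration, not the author's actual implementation:

def search(file_name, search_string):
    # Hypothetical helper: scan a file line by line and report where the
    # search string first occurs.
    with open(file_name) as f:
        for line_number, line in enumerate(f, start=1):
            if search_string in line:
                return "'{0}' found in {1} at line {2}".format(
                    search_string, file_name, line_number)
    return "'{0}' not found in {1}".format(search_string, file_name)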

In the example above, in the method basic_search(), I instantiate a ThreadPoolExecutor named executor using the with statement:

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:

Notice also that I instantiated the executor pool with a max_workers=5 parameter. This sets the maximum number of worker threads in the pool that will be handling the concurrent tasks at any one time.

Using the executor pool, I submit three search jobs by passing to the executor's submit function the search function itself (search) and two arguments for it (the file name and the string to search for):

future1 = executor.submit(search, "test1.txt", "Test")

The call to each submit returns a future. I get the result of each search task by calling the result() method on the future objects returned by the executor; note that result() blocks until the underlying task has completed.

In this case, to process the results, I just print them:

print(future1.result())

In the example above, the search function simply returns a string indicating whether the search string was found and, if so, where the match occurred, but the actual scenario for a search task and its return type can be far more complex -- futures may, as the result of their execution, even return other futures.

In addition, future objects returned from the executor can be queried for their execution status. This is accomplished with the methods cancelled(), running() and done(). Futures may also be "asked" to halt with the cancel() method. (I purposefully use the term "asked" because a program is, by the design contract of most threading libraries and their underlying implementations, never guaranteed immediate control over a thread's execution status.)
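As a quick illustration of these status methods, the following sketch (reusing the hypothetical search helper sketched earlier) submits a single task, polls its status, and reads the result once it is available:

import concurrent.futures

def query_status():
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(search, "test1.txt", "Test")
        # The status methods can be polled at any point in the future's lifetime
        print("running:", future.running())
        print("cancelled:", future.cancelled())
        # result() waits for completion, after which done() reports True
        print("result:", future.result())
        print("done:", future.done())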

The basic_search example, despite its intended triviality, demonstrates the typical pattern of use for the objects provided in the concurrent.futures package:

  1. Call ThreadPoolExecutor to get an instance of the Executor.
  2. Submit one or more routines into it.
  3. Get as a result one or more Future objects.
  4. Query Future object for the result of the execution.

Python Futures with Callbacks

In addition to the basic scenario of getting the result from a future, futures also support attaching a callback -- a function that is associated with the future and gets called when the future completes execution.

Building up from the original basic example, here is how the parallel file search would be processed with callbacks. In the example below, the function process_result is attached to each search future as a callback:

def with_callback():
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        executor.submit(search, "test1.txt", "Test").add_done_callback(process_result)
        executor.submit(search, "test2.txt", "Tset").add_done_callback(process_result)
        executor.submit(search, "test3.txt", "Something").add_done_callback(process_result)

Note the use of the add_done_callback method. It attaches the function process_result to the future that is returned from the call to submit on the executor.

The process_result callback is an ordinary function that I implement to print the result of the search returned by a future. The only requirement for a callback function is that it accept a single future object as its parameter.

def process_result(future):
    print("callback on result: " + future.result())

Tags: Python, Concurrency

Originally published on http://www.developer.com.
