Threadpool
Title: Easy to use object-oriented thread pool framework
Author: Christopher Arndt
Version: 1.3.2
Date: 2015-11-29
License: MIT License
Warning
This module is OBSOLETE and is only provided on PyPI to support old
projects that still use it. Please DO NOT USE IT FOR NEW PROJECTS!
Use modern alternatives like the multiprocessing
module in the standard library or even an asynchronous approach with
asyncio.
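As a rough illustration of the first alternative, the following sketch uses
multiprocessing.pool.ThreadPool from the standard library to run a function
in worker threads and hand each result to a callback. The function square,
the pool size of 4 and the results list are placeholders chosen for this
example only.

```python
from multiprocessing.pool import ThreadPool


def square(x):
    """Stand-in for a time-consuming operation (placeholder)."""
    return x * x


results = []

# The pool starts 4 worker threads; leaving the with-block terminates them.
with ThreadPool(processes=4) as pool:
    pending = [
        # The callback receives the return value once the task has finished.
        pool.apply_async(square, (n,), callback=results.append)
        for n in range(5)
    ]
    # Block until every task has completed.
    for job in pending:
        job.wait()

# Completion order is not guaranteed, so sort before displaying.
print(sorted(results))
```

Results arrive in completion order, not submission order, which is why the
example sorts them before printing.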
Description
A thread pool is an object that maintains a pool of worker threads to perform
time consuming operations in parallel. It assigns jobs to the threads
by putting them in a work request queue, where they are picked up by the
next available thread. That thread then performs the requested operation in the
background and puts the results in another queue.
The thread pool object can then collect the results from all threads from
this queue as soon as they become available or after all threads have
finished their work. It's then possible to define callbacks to handle
each result as it comes in.
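The mechanism described above can be sketched with nothing but the standard
library. This is a simplified illustration, not this module's actual
implementation: the names worker and run_pool are made up for the example,
and error handling is left out.

```python
import queue
import threading


def worker(requests, results):
    """Take jobs from the request queue, put outcomes on the results queue."""
    while True:
        job = requests.get()          # blocks until a job is available
        if job is None:               # sentinel: no more work for this thread
            break
        func, arg, callback = job
        # No error handling: an exception in func would end this worker.
        results.put((callback, func(arg)))


def run_pool(func, args, callback, poolsize=3):
    """Run func(arg) for each arg in worker threads; call callback per result."""
    args = list(args)
    requests, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(requests, results), daemon=True)
        for _ in range(poolsize)
    ]
    for t in threads:
        t.start()
    for arg in args:
        requests.put((func, arg, callback))
    # Handle each result in the calling thread as soon as it becomes available.
    for _ in args:
        cb, value = results.get()
        cb(value)
    for _ in threads:
        requests.put(None)            # tell every worker to exit


collected = []
run_pool(lambda x: x * 2, [1, 2, 3, 4], collected.append)
print(sorted(collected))
```

Note that the callbacks run in the thread that collects the results, not in
the worker threads, so they can safely touch data owned by the caller.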
Note
This module is regarded as an extended example, not as a finished product.
Feel free to adapt it to your needs.
Basic usage
>>> from threadpool import ThreadPool, makeRequests
>>> pool = ThreadPool(poolsize)
>>> requests = makeRequests(some_callable, list_of_args, callback)
>>> for req in requests:
...     pool.putRequest(req)
>>> pool.wait()
See the end of the module source code for a longer, annotated usage example.
Documentation
You can view the API documentation, generated by epydoc, here:
API documentation
The documentation is also packaged in the distribution.
Download
You can download the latest version of this module here:
Download directory
or see the colorized source code:
threadpool.py
You can also install it from the Python Package Index PyPI via pip:
[sudo] pip install threadpool
Or you can check out the latest development version from the Git
repository:
git clone https://github.com/SpotlightKid/threadpool.git
Discussion
The basic concept and some code were taken from the book "Python in a Nutshell"
by Alex Martelli, copyright O'Reilly 2003, ISBN 0-596-00188-6, from section
14.5 "Threaded Program Architecture". I wrapped the main program logic in the
ThreadPool class, added the WorkRequest class and the callback system
and tweaked the code here and there.
There are some other recipes in the Python Cookbook that serve a similar
purpose. This one distinguishes itself by the following characteristics:
- Object-oriented, reusable design
- Provides callback mechanism to process results as they are returned from the
worker threads.
- WorkRequest objects wrap the tasks assigned to the worker threads and
allow for easy passing of arbitrary data to the callbacks.
- The use of the Queue class solves most locking issues.
- All worker threads are daemonic, so they exit when the main program exits,
no need for joining.
- Threads start running as soon as you create them. No need to start or stop
them. You can increase or decrease the pool size at any time, superfluous
threads will just exit when they finish their current task.
- You don't need to keep a reference to a thread after you have assigned the
last task to it. You just tell it: "don't come back looking for work, when
you're done!"
- Threads don't eat up cycles while waiting to be assigned a task, they just
block when the task queue is empty (though they wake up every few seconds to
check whether they are dismissed).
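The worker behaviour listed above (daemonic threads that start on creation,
block on an empty queue and periodically check whether they were dismissed)
might look roughly like this. The Worker class and its poll_timeout
parameter are hypothetical and only illustrate the idea; they are not this
module's WorkerThread class, and the short timeout is chosen so that the
demo finishes quickly.

```python
import queue
import threading


class Worker(threading.Thread):
    """Hypothetical worker for illustration, not the module's WorkerThread."""

    def __init__(self, requests, results, poll_timeout=0.1):
        super().__init__(daemon=True)   # daemonic: never blocks interpreter exit
        self.requests = requests
        self.results = results
        self.poll_timeout = poll_timeout
        self.dismissed = threading.Event()
        self.start()                    # starts running as soon as it is created

    def run(self):
        while not self.dismissed.is_set():
            try:
                # Sleep inside get() instead of spinning; the timeout only
                # serves to re-check the dismissed flag now and then.
                func, arg = self.requests.get(timeout=self.poll_timeout)
            except queue.Empty:
                continue
            self.results.put(func(arg))

    def dismiss(self):
        """Ask the thread to exit once it has finished its current task."""
        self.dismissed.set()


requests, results = queue.Queue(), queue.Queue()
workers = [Worker(requests, results) for _ in range(2)]
for n in range(4):
    requests.put((abs, -n))
values = sorted(results.get() for _ in range(4))
for w in workers:
    w.dismiss()
    w.join()                            # optional; shown only to confirm exit
print(values, [w.is_alive() for w in workers])
```

Joining is not required because the threads are daemonic; it is done here
only to show that a dismissed worker really does exit.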
Notes
Due to the parallel nature of threads, you have to keep some things in mind:
- Do not use simultaneous threads for tasks where they compete for a single,
scarce resource (e.g. a hard disk or stdout). This will probably be slower
than taking a serialized approach.
- If you call ThreadPool.wait() the main thread will block until _all_
results have arrived. If you only want to check for results that are available
immediately, use ThreadPool.poll().
- The results of the work requests are not stored anywhere. You should provide
an appropriate callback if you want to do so.
References
There are several other recipes similar to this module in the Python Cookbook,
for example:
News
- 2015-11-29 (1.3.2)
- Added missing release.py to source distribution
- 2015-10-14 (1.3.1)
- Minor distribution procedure changes
- 2015-10-14 (1.3.0)
- Migrated repository from SVN to Git.
- Incorporated changes (with minor adjustments) from Lutz Prechelt from
https://github.com/prechelt/threadpool to make Threadpool Python 3
compatible. Thanks, Lutz!
- Fixed some build errors.
- 2009-10-07 (1.2.7)
- I made a stupid error and made threadpool.py import from release.py but
this module is not installed by setup.py. Removed import again.
- 2009-10-06 (1.2.6)
- Due to some mix-up I got the bugfix for the 'timeout' parameter
to ThreadPool.putRequest exactly the wrong way round (or I "fixed" it
twice). It now defaults to None as it should, so putRequest blocks by
default if the requests queue is full. Thanks to Guillaume Taglang for
reporting the issue.
- Rename NEWS.txt to CHANGELOG.txt (this file).
- Add SVN checkout instructions to README.
- 2008-11-19
- Update reference to "Python In A Nutshell" to second edition (suggested
by Alex Martelli).
- Fixed typo in WorkerThread.run() (thanks to Nicholas Bollweg, Aaron
Levinson, Rogério Schneider, Grégory Starck for reporting).
- Fixed missing first argument in call to Queue.get() in WorkerThread.run()
(thanks to Aaron Levinson for report).
- Added new argument 'do_join' to ThreadPool.dismissWorkers(). When True,
the method will perform Thread.join() on each thread after dismissing it.
- Added joinAllDismissedWorkers method to ThreadPool to join dismissed
threads at a later time (thanks to Aaron Levinson for patch for these two
changes).
- 2008-05-04
- 'timeout' parameter of ThreadPool.putRequest now correctly defaults to 0
instead of None (thanks to Mads Sülau Jørgensen for bug report).
- Added default exception handler callback (thanks to Moshe Cohen for the
patch).
- Fixed locking issue that prevented worker threads from being dismissed
when no work requests are in the requests queue (thanks to Guillaume
Pratte for the bug report).
- Add option for results queue size to ThreadPool (thanks to Krzysztof
Jakubczyk for the idea).
- Changed name of requestQueue and resultsQueue attributes in WorkerThread
and ThreadPool to _requests_queue and _results_queue to be more consistent
and compliant with PEP 8 and properly indicate private nature.
- Moved repository to Subversion.
- 2008-05-03
- Updated homepage and download URL
- Updated README
- Enable packaging as an egg with the use of setuptools
- License changed to MIT License (Python license is only for code licensed
by the PSF)
- 2006-06-23 1.2.3 (never announced)
- fixed typo in ThreadPool.putRequest() (reported by Jérôme Schneider)
- 2006-05-19 1.2.2 (first release as a package)
- fixed wrong usage of isinstance in makeRequests()
Thanks to anonymous for bug report in comment on ASPN
- added setup.py and created a proper distribution package
- added timeout parameter to putRequest()