How to implement Threads in Python
Threads in Python are used for concurrent execution of functions. They are best suited to I/O-bound jobs, where I/O means reads and writes to the network, a database, etc.
If your tasks are mostly CPU-bound, you should use multiprocessing instead, which I shall delve into in a future post.
Python threads do not give true parallelism because of something called the GIL. GIL stands for Global Interpreter Lock; it is part of Python's design (specifically, of the standard CPython interpreter), and it allows only one thread to execute Python bytecode at a time, so two threads never run bytecode simultaneously in parallel. This was done to keep the design simple for memory management, thread safety, and legacy compatibility with certain modules.
The GIL is an important topic, and discussing it properly will take another post. It is the reason we should use threads when we have more I/O-bound tasks and fewer CPU-bound tasks.
The performance benefit comes when there is a lot of I/O: while one thread is waiting on an I/O operation (the GIL is released during the wait), another thread can take over and keep processing.
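To make the point above concrete, here is a minimal sketch that compares running three blocking calls one after another with running them in three threads. It assumes time.sleep is a fair stand-in for a real I/O call; fake_io and DELAY are made-up names for illustration, and join() (covered later in this post) is used only to wait for the threads before stopping the timer.

```python
import threading
import time

DELAY = 0.2  # seconds each simulated I/O call blocks for (arbitrary value)

def fake_io(delay):
    # time.sleep stands in for a blocking I/O call (network, database, ...).
    time.sleep(delay)

# Run three simulated I/O calls one after another.
start = time.perf_counter()
for _ in range(3):
    fake_io(DELAY)
sequential_time = time.perf_counter() - start

# Run the same three calls in three threads so that the waits overlap.
threads = [threading.Thread(target=fake_io, args=(DELAY,)) for _ in range(3)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for each thread to finish
threaded_time = time.perf_counter() - start

print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

The sequential run has to wait for each call in turn, while the threaded run should take roughly as long as a single call, because the three waits happen at the same time.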
Threads are relatively lightweight compared to processes. Inter-thread communication is also easier than inter-process communication, because threads share the same memory.
The simplest Threading program:
import threading

def called_with_threads(name):
    # Each thread prints its own name next to a counter.
    for i in range(10):
        print(name, i)

thread1 = threading.Thread(target=called_with_threads, args=("thread1",))
thread2 = threading.Thread(target=called_with_threads, args=("thread2",))
thread1.start()
thread2.start()
Another way is to write a class that inherits from the threading.Thread class:
import threading

class MyThread(threading.Thread):
    def __init__(self, name):
        super().__init__()
        self.name = name

    def run(self):
        # run() holds the code that executes inside the thread.
        for i in range(10):
            print(self.name, i)

thread1 = MyThread("thread1")
thread2 = MyThread("thread2")
thread1.start()
thread2.start()
Things to remember here:
- You will need to import the threading module.
- If you are using the threading.Thread class directly, you should supply target=funcname and args=(argument1, argument2, ...).
- To run the thread you will need to call .start() on it.
- If you are using class-based threads, you will need to implement a method called run, which holds the code the thread executes.
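As a small illustration of why .start() matters, the sketch below records which thread actually executes the target. The names work, seen, and "worker-demo" are made up for this example. Calling run() directly does not create a new thread; it simply executes the function in the calling thread.

```python
import threading

seen = []  # names of the threads that actually executed work()

def work():
    seen.append(threading.current_thread().name)

# Calling run() directly: work() executes in the calling thread.
t1 = threading.Thread(target=work, name="worker-demo")
t1.run()

# Calling start(): work() executes in a new thread named "worker-demo".
t2 = threading.Thread(target=work, name="worker-demo")
t2.start()
t2.join()  # wait for the new thread to finish

print(seen)
```

When this is run as a plain script, the first entry is normally the main thread's name and the second is "worker-demo".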
Let's now summarise the advantages and disadvantages of using threads.
Advantages:
- They are lightweight, have lower overhead than multiprocessing, and are simple to code.
- They are well suited to I/O-bound tasks and achieve high performance in such scenarios.
- They can be used to keep the application responsive instead of blocking on I/O.
Disadvantages:
- They are not suitable for CPU-bound tasks, as they cannot take advantage of the multiple cores of the CPU because of the GIL.
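The disadvantage above can be observed with a rough timing sketch. It assumes a standard CPython build with the GIL; count_down and N are made-up names and values for illustration, and the exact timings will vary from machine to machine.

```python
import threading
import time

N = 2_000_000  # loop iterations per task (arbitrary value)

def count_down(n):
    # Pure-Python loop: CPU-bound, with no I/O to wait on.
    while n > 0:
        n -= 1

# Run the two CPU-bound tasks one after another.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential_time = time.perf_counter() - start

# Run the same two tasks in two threads.
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for both threads to finish
threaded_time = time.perf_counter() - start

print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

With the GIL in place the two timings are usually close to each other, since only one thread can execute bytecode at any moment.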
.join(): In the above implementations the main thread does not wait for the threads it spawns. It carries on as soon as start() returns, so any code that follows may run before the threads have finished (the program itself only exits once all non-daemon threads are done). If you want the main thread that started a thread to pause until that thread has fully executed before moving forward, use join() as illustrated below.
import threading

def called_with_threads(name):
    pass  # add some code for running this thread

th = threading.Thread(target=called_with_threads, args=("th",))
th.start()
th.join()  # block here until th has finished
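The same idea extends to several threads. Here is a minimal sketch, with made-up names (results, square), in which each thread writes only to its own key of a shared dict and the main thread reads the dict only after joining every thread.

```python
import threading

results = {}  # shared dict; each thread writes only to its own key

def square(name, value):
    results[name] = value * value

threads = [
    threading.Thread(target=square, args=(f"th{i}", i))
    for i in range(3)
]
for th in threads:
    th.start()
for th in threads:
    th.join()  # block until this thread has finished

# Every thread has finished, so the results are complete.
print(sorted(results.items()))  # [('th0', 0), ('th1', 1), ('th2', 4)]
```

Starting all the threads first and joining them afterwards lets them run concurrently; calling join() right after each start() would run them one at a time.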
This was just an introduction. In my upcoming posts I will discuss resource sharing, thread safety on shared resources, and data structures such as queue.Queue and collections.deque, which already offer thread-safe operations.