Inter-Process Communication
What is IPC and why do we need it?
IPC Techniques
- Unix Domain sockets
- Network Sockets
- Message Queues
- Shared Memory
- Pipes (not commonly used in industry)
- Signals
Sockets
Two types:
- Unix Domain Sockets: IPC between processes running on the same system
- Network Sockets: Communication between processes running on different physical machines over the network
Socket Steps and Related APIs
- Remove the socket file, if it already exists
- Create a Unix socket using socket()
- Specify the socket name
- Bind the socket using bind()
- Listen for incoming connections using listen()
- Accept a connection using accept()
- Read the data received on the socket using recvfrom()
- Send back the result using sendto()
- Close the data socket
- Close the connection socket
- Remove the socket
- Exit
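A minimal sketch of these steps for a Unix domain stream server, assuming a socket path of /tmp/demo_socket (the path, buffer size, and echo behavior are illustrative choices, not taken from the course code; read()/write() are used in place of the recvfrom()/sendto() calls named above, and behave the same way on a connected stream socket):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

#define SOCKET_PATH "/tmp/demo_socket"   /* assumed path, not from the course */

int main(void) {
    struct sockaddr_un addr;
    char buf[128];

    unlink(SOCKET_PATH);                        /* remove the socket file if it already exists */

    int connection_sock = socket(AF_UNIX, SOCK_STREAM, 0);   /* create the Unix socket */
    if (connection_sock == -1) { perror("socket"); exit(1); }

    memset(&addr, 0, sizeof(addr));             /* specify the socket name */
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, SOCKET_PATH, sizeof(addr.sun_path) - 1);

    if (bind(connection_sock, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        perror("bind"); exit(1);
    }
    if (listen(connection_sock, 5) == -1) { perror("listen"); exit(1); }

    int data_sock = accept(connection_sock, NULL, NULL);     /* client handle / data socket */
    if (data_sock == -1) { perror("accept"); exit(1); }

    ssize_t n = read(data_sock, buf, sizeof(buf) - 1);       /* read data from the client */
    if (n > 0) {
        buf[n] = '\0';
        write(data_sock, buf, n);                            /* send the result (echo) back */
    }

    close(data_sock);          /* close the data socket */
    close(connection_sock);    /* close the connection socket */
    unlink(SOCKET_PATH);       /* remove the socket file */
    return 0;
}
```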
Sockets Message Types
Messages exchanged between the client and the server processes can be categorized into two types:
- Connection Initiation Request Messages (CIR)
- Service Request Messages (SRM)
State Machine of Socket-based Client-Server Communication
- When the server boots up, it creates a connection socket (also called the “master socket file descriptor”) using socket(): M = socket()
- M is the mother of all client handles: every client handle is created from M. Client handles are also called “data sockets”
- Once a client handle has been created for a client, the server carries out communication (the actual data exchange) with that client using the client handle (and not M)
- The server has to maintain a database of connected client handles, i.e., data sockets
- M is only used to create new client handles; M is not used for data exchange with already connected clients
- accept() is the system call used on the server side to create client handles
- In Linux terminology, handles are called “file descriptors”, which are just non-negative integer numbers. Client handles are called “communication file descriptors” or “data sockets”, and M is called the “master socket file descriptor” or “connection socket”
Unix Domain Sockets
Using Unix domain sockets, you can set up STREAM or DATAGRAM based communication
- STREAM: when a continuous flow of bytes needs to be moved from one location to another, e.g., copying a large file such as a movie
- DATAGRAM: when small, discrete units of data need to be moved from one process to another within a system
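The choice between the two is made when the socket is created; a tiny illustrative sketch showing only the socket() calls:

```c
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    /* STREAM: connection-oriented, continuous flow of bytes */
    int stream_fd = socket(AF_UNIX, SOCK_STREAM, 0);

    /* DATAGRAM: connectionless, small discrete messages */
    int dgram_fd = socket(AF_UNIX, SOCK_DGRAM, 0);

    printf("stream fd = %d, datagram fd = %d\n", stream_fd, dgram_fd);
    close(stream_fd);
    close(dgram_fd);
    return 0;
}
```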
Multiplexing
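The notes do not spell the multiplexing code out here; the sketch below shows the usual pattern under stated assumptions (socket path /tmp/demo_mux_socket, a fixed-size client table, echo-only service): the server watches the master socket and all data sockets in a single select() call, accepts new clients on the master socket, and exchanges data on the client handles.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/un.h>

#define SOCKET_PATH "/tmp/demo_mux_socket"  /* assumed path */
#define MAX_CLIENTS 16                      /* assumed limit */

int main(void) {
    struct sockaddr_un addr;
    int data_fds[MAX_CLIENTS];
    int n_data = 0;
    char buf[128];

    unlink(SOCKET_PATH);
    int master_fd = socket(AF_UNIX, SOCK_STREAM, 0);
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, SOCKET_PATH, sizeof(addr.sun_path) - 1);
    bind(master_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(master_fd, 5);

    for (;;) {
        /* Watch the master socket and every data socket in one select() call. */
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(master_fd, &readfds);
        int maxfd = master_fd;
        for (int i = 0; i < n_data; i++) {
            FD_SET(data_fds[i], &readfds);
            if (data_fds[i] > maxfd) maxfd = data_fds[i];
        }
        if (select(maxfd + 1, &readfds, NULL, NULL, NULL) == -1) break;

        /* Activity on the master socket means a new connection request. */
        if (FD_ISSET(master_fd, &readfds) && n_data < MAX_CLIENTS)
            data_fds[n_data++] = accept(master_fd, NULL, NULL);

        /* Activity on a data socket means the client sent data (or disconnected). */
        for (int i = 0; i < n_data; i++) {
            if (!FD_ISSET(data_fds[i], &readfds)) continue;
            ssize_t n = read(data_fds[i], buf, sizeof(buf));
            if (n <= 0) {                          /* client closed: drop its handle */
                close(data_fds[i]);
                data_fds[i] = data_fds[n_data - 1];
                n_data--;
                i--;                               /* re-check the fd just moved into this slot */
            } else {
                write(data_fds[i], buf, n);        /* echo the data back */
            }
        }
    }
    close(master_fd);
    unlink(SOCKET_PATH);
    return 0;
}
```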
Example
Code example of a Unix domain socket and a network socket
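The Unix domain variant is sketched in the sections above; for the network-socket side, here is a hedged TCP client sketch (the 127.0.0.1:8080 address, message, and buffer size are assumptions):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

#define SERVER_IP   "127.0.0.1"   /* assumed address */
#define SERVER_PORT 8080          /* assumed port */

int main(void) {
    struct sockaddr_in server;
    char buf[128];

    int sock = socket(AF_INET, SOCK_STREAM, 0);   /* network socket: AF_INET instead of AF_UNIX */
    if (sock == -1) { perror("socket"); exit(1); }

    memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_port = htons(SERVER_PORT);
    inet_pton(AF_INET, SERVER_IP, &server.sin_addr);

    if (connect(sock, (struct sockaddr *)&server, sizeof(server)) == -1) {
        perror("connect"); exit(1);
    }

    const char *msg = "hello";
    write(sock, msg, strlen(msg));                /* send a request */

    ssize_t n = read(sock, buf, sizeof(buf) - 1); /* read the reply */
    if (n > 0) { buf[n] = '\0'; printf("reply: %s\n", buf); }

    close(sock);
    return 0;
}
```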
Message Queues
- The Linux/Unix OS provides another mechanism, called a message queue, for carrying out IPC
- Processes running on the same machine can exchange data using message queues
- A process can create a new message queue or use an existing msgQ which was created by another process
- A message queue is uniquely identified by its ID; no two msgQs can have the same ID
- The message queue resides in, and is managed by, the kernel/OS
- A sending process A can post data to the message queue; a receiving process B reads the data from the msgQ
- The process which creates a msgQ is termed the owner or creator of the msgQ
- A msgQ is a kernel resource
Message queue creation
A process can create a new msgQ or use an existing msgQ using the APIs below:
mqd_t mq_open(const char *name, int oflag);
mqd_t mq_open(const char *name, int oflag, mode_t mode, struct mq_attr *attr);
- name - Name of the msgQ, e.g. “/server-msg-q”
- oflag - Operational flags
- mode - Permissions set by the owning process on the queue, usually 0660
- attr - Specifies various attributes of the msgQ being created
- Like the maximum number of messages the msgQ can hold, which should be less than or equal to /proc/sys/fs/mqueue/msg_max
- And the maximum size of a message which the msgQ can hold, which should be less than or equal to /proc/sys/fs/mqueue/msgsize_max
If mq_open() succeeds, it returns a descriptor (a handle) to the msgQ. Using this handle, we perform all msgQ operations throughout the program
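A hedged sketch of creating a queue with explicit attributes (the attribute values are illustrative; the queue name follows the example above; link with -lrt on older glibc):

```c
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>        /* O_* constants */
#include <sys/stat.h>     /* mode constants */
#include <mqueue.h>

int main(void) {
    struct mq_attr attr;
    attr.mq_flags   = 0;      /* blocking queue (no O_NONBLOCK) */
    attr.mq_maxmsg  = 10;     /* max messages in the queue (<= /proc/sys/fs/mqueue/msg_max) */
    attr.mq_msgsize = 256;    /* max size of one message (<= /proc/sys/fs/mqueue/msgsize_max) */
    attr.mq_curmsgs = 0;

    /* "/server-msg-q" follows the example name in the notes */
    mqd_t mq = mq_open("/server-msg-q", O_CREAT | O_RDWR, 0660, &attr);
    if (mq == (mqd_t)-1) { perror("mq_open"); exit(1); }

    printf("message queue opened, descriptor = %d\n", (int)mq);
    mq_close(mq);
    return 0;
}
```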
Message queue closing
A process can close a msgQ using the API below:
int mq_close(mqd_t msgQ);
- After closing the msgQ, the process cannot use it unless it opens it again using mq_open()
- The operating system removes and destroys the msgQ once it has been unlinked and all processes using it have closed it
- The OS maintains information regarding how many processes are using the same msgQ (i.e., have invoked mq_open()). This concept is called reference counting
- A msgQ is a kernel resource which is used by application processes. For every kernel resource, the kernel keeps track of how many user-space processes are using that particular resource
- When a kernel resource (the msgQ in our example) is created for the first time by an application process, reference_count = 1
- If another process also invokes open() on an existing kernel resource (mq_open() in our case), the kernel increments reference_count by 1
- When a process invokes close() on an existing kernel resource (mq_close() in our case), the kernel decrements reference_count by 1
- When reference_count = 0, the kernel cleans up/destroys that kernel resource
- Remember, a kernel resource could be anything: a socket FD, a msgQ FD, etc.
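A minimal sketch of the open/close lifecycle from a single process (same illustrative queue name as above):

```c
#include <stdio.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <mqueue.h>

int main(void) {
    /* Opening bumps the kernel's reference count for this queue... */
    mqd_t mq = mq_open("/server-msg-q", O_CREAT | O_RDONLY, 0660, NULL);
    if (mq == (mqd_t)-1) { perror("mq_open"); return 1; }

    /* ...and closing drops it again; the queue itself survives until it has
     * been unlinked and every process using it has closed it. */
    if (mq_close(mq) == -1) { perror("mq_close"); return 1; }
    return 0;
}
```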
Enqueue a message
A sending process can place a message in a message queue using the API below:
int mq_send(mqd_t msgQ, const char *msg_ptr, size_t msg_len, unsigned int msg_prio);
- mq_send is for sending a message to the queue referenced by the descriptor msgQ
- msg_ptr points to the message buffer; msg_len is the size of the message, which must be less than or equal to the maximum message size of the queue
- msg_prio is the message priority, a non-negative number specifying the priority of the message
- Messages are placed in the queue in decreasing order of message priority, with older messages of the same priority placed before newer ones
- If the queue is full, mq_send blocks until there is space in the queue, unless the O_NONBLOCK flag is enabled for the message queue, in which case mq_send returns immediately with errno set to EAGAIN
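A hedged sender sketch (queue name and message are illustrative, and the queue is assumed to already exist):

```c
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <mqueue.h>

int main(void) {
    /* Open an existing queue for writing ("/server-msg-q" as in the notes). */
    mqd_t mq = mq_open("/server-msg-q", O_WRONLY);
    if (mq == (mqd_t)-1) { perror("mq_open"); return 1; }

    const char *msg = "hello from sender";
    /* Priority 1; blocks if the queue is full unless O_NONBLOCK was set. */
    if (mq_send(mq, msg, strlen(msg) + 1, 1) == -1)
        perror("mq_send");

    mq_close(mq);
    return 0;
}
```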
Dequeue a message
A receiving process can dequeue a message from a message queue using the API below:
ssize_t mq_receive(mqd_t msgQ, char *msg_ptr, size_t msg_len, unsigned int *msg_prio);
- mq_receive is for retrieving a message from the queue referenced by the descriptor msgQ
- msg_ptr points to an empty message buffer; msg_len is the size of the buffer in bytes, which must be at least the maximum message size of the queue
- The oldest msg of the highest priority is deleted from the queue and passed to the process in the buffer pointed to by msg_ptr
- If the pointer msg_prio is not NULL, the priority of the received message is stored in the integer pointed to by it
- The default behavior of mq_receive is to block if there is no message in the queue. However, if the O_NONBLOCK flag is enabled for the queue and the queue is empty, mq_receive returns immediately with errno set to EAGAIN
- On success, mq_receive returns the number of bytes received in the buffer pointed to by msg_ptr
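A hedged receiver sketch (same illustrative queue name; mq_getattr() is used to size the buffer to the queue's mq_msgsize, since the buffer must be at least that large):

```c
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <mqueue.h>

int main(void) {
    mqd_t mq = mq_open("/server-msg-q", O_RDONLY);
    if (mq == (mqd_t)-1) { perror("mq_open"); return 1; }

    /* The receive buffer must be at least mq_msgsize bytes long. */
    struct mq_attr attr;
    mq_getattr(mq, &attr);
    char *buf = malloc(attr.mq_msgsize);

    unsigned int prio;
    ssize_t n = mq_receive(mq, buf, attr.mq_msgsize, &prio);   /* blocks if the queue is empty */
    if (n >= 0)
        printf("received %zd bytes (priority %u): %.*s\n", n, prio, (int)n, buf);
    else
        perror("mq_receive");

    free(buf);
    mq_close(mq);
    return 0;
}
```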
Destroying a message queue
A creating process can destroy a message queue using the API below:
int mq_unlink(const char *msgQ_name);
- mq_unlink destroys the msgQ (releases the kernel resource)
- Should be called after the process has invoked mq_close() on the msgQ
- Returns -1 if it fails, 0 on success
- Destruction is postponed if other processes are still using the msgQ; the queue is removed only after they have all closed it
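A minimal sketch (same illustrative queue name):

```c
#include <stdio.h>
#include <mqueue.h>

int main(void) {
    /* Removes the queue name; the queue is actually destroyed once every
     * process that still has it open has called mq_close(). */
    if (mq_unlink("/server-msg-q") == -1)
        perror("mq_unlink");
    return 0;
}
```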
Using a message queue
- The message queue IPC mechanism usually supports an N:1 communication paradigm, meaning there can be N senders but only 1 receiver per message queue
- Multiple senders can open the same msgQ by name and enqueue their msgs in the same queue
- The receiving process can dequeue the messages placed in the message queue by the different sender processes
- In addition, a receiving process can dequeue messages from several different message queues at the same time (multiplexing using select(); see the sketch under Example below)
- A msgQ therefore has only one receiving (client) process
- A client can dequeue msgs from more than one msgQ
- There is no limitation on the number of sending (server) processes
Example
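Building on the multiplexing point above: on Linux a mqd_t is implemented as a file descriptor, so it can be handed to select() directly; this is Linux-specific behavior and not guaranteed by POSIX. A hedged sketch with two illustrative queue names:

```c
#include <stdio.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/select.h>
#include <mqueue.h>

int main(void) {
    /* Two hypothetical queues this process reads from. */
    mqd_t q1 = mq_open("/msg-q-one", O_CREAT | O_RDONLY | O_NONBLOCK, 0660, NULL);
    mqd_t q2 = mq_open("/msg-q-two", O_CREAT | O_RDONLY | O_NONBLOCK, 0660, NULL);
    if (q1 == (mqd_t)-1 || q2 == (mqd_t)-1) { perror("mq_open"); return 1; }

    char buf[8192];     /* 8192 is the Linux default mq_msgsize */
    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(q1, &readfds);
        FD_SET(q2, &readfds);
        int maxfd = (q1 > q2 ? q1 : q2);

        /* Block until either queue has a message ready. */
        if (select(maxfd + 1, &readfds, NULL, NULL, NULL) == -1) break;

        if (FD_ISSET(q1, &readfds)) {
            ssize_t n = mq_receive(q1, buf, sizeof(buf), NULL);
            if (n >= 0) printf("queue one: %zd bytes\n", n);
        }
        if (FD_ISSET(q2, &readfds)) {
            ssize_t n = mq_receive(q2, buf, sizeof(buf), NULL);
            if (n >= 0) printf("queue two: %zd bytes\n", n);
        }
    }
    mq_close(q1);
    mq_close(q2);
    return 0;
}
```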
Shared Memory
Memory mapping
Virtual memory, physical memory and secondary memory setup
- Not all programs need secondary storage, but most non-trivial applications do
- Memory mapping is used to change the secondary storage backing the program to some other source, say some hardware device memory or a particular file on disk
- Your application is not even aware of physical memory, let alone secondary storage
- So the same notepad application shall run seamlessly even if we change the secondary storage source to printer/camera device memory
Using an external data source as shared memory
- Virtual pages of both processes map to the same physical pages loaded in RAM
- The physical pages in turn are read from/written to the external memory
- The rule that a process can never access any address outside its virtual address space (VAS) is never violated
- Any modification made by process A in its shared virtual memory shall be seen by process B
Using RAM itself as data source
mmap()
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
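A hedged sketch of the POSIX shared-memory flavor of this: a named shared memory object is created with shm_open(), sized with ftruncate(), and mapped into the process's virtual address space with mmap() (the object name and size are assumptions; link with -lrt on older glibc):

```c
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define SHM_NAME "/demo_shm"   /* assumed name */
#define SHM_SIZE 4096          /* assumed size: one page */

int main(void) {
    /* Create (or open) the shared memory object and size it. */
    int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0660);
    if (fd == -1) { perror("shm_open"); return 1; }
    if (ftruncate(fd, SHM_SIZE) == -1) { perror("ftruncate"); return 1; }

    /* Map it into this process's virtual address space; another process
     * mapping the same name sees the same physical pages. */
    void *ptr = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ptr == MAP_FAILED) { perror("mmap"); return 1; }

    /* Write through the mapping; a reader process would see this update. */
    strcpy((char *)ptr, "hello via shared memory");
    printf("wrote: %s\n", (char *)ptr);

    munmap(ptr, SHM_SIZE);
    close(fd);
    /* shm_unlink(SHM_NAME); would remove the object once all users are done */
    return 0;
}
```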
Design constraints for using shared memory as IPC
The shared memory approach to IPC is used in scenarios where:
- Exactly one process is responsible for updating the shared memory (the publisher process)
- The rest of the processes only read the shared memory (subscriber processes)
- The frequency of updates to the shared memory by the publisher process is not very high
- If multiple processes attempt to update the shared memory at the same time, it leads to write-write conflicts
- We would then need to handle this situation using mutual-exclusion-based synchronization
- Synchronization comes at the cost of performance, because we put threads to sleep (in addition to their natural CPU preemption) in order to prevent concurrent access to the critical section
When the publisher process updates the shared memory:
- The subscribers would not know about this update
- Therefore, after updating the shared memory, the publisher needs to send a notification to all subscribers stating that “the shared memory has been updated”
- After receiving this notification, subscribers can read the updated shared memory and refresh their internal data structures, if any
- The notification is just a small message, and can be sent out using other IPC mechanisms, such as Unix domain sockets or msg queues (see the sketch under Example below)
Example
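A hedged publisher-side sketch under the constraints above: the publisher writes into the shared memory object from the previous sketch and then posts a small “updated” notification on a message queue that subscribers block on (all names are assumptions, not the course's code):

```c
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <mqueue.h>

#define SHM_NAME   "/demo_shm"        /* assumed, same as the mmap() sketch */
#define SHM_SIZE   4096
#define NOTIF_MQ   "/demo_shm_notif"  /* assumed notification queue name */

int main(void) {
    /* Map the shared memory (the publisher is the only writer). */
    int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0660);
    ftruncate(fd, SHM_SIZE);
    char *shm = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (shm == MAP_FAILED) { perror("mmap"); return 1; }

    /* Open the small notification queue the subscribers listen on. */
    mqd_t notif = mq_open(NOTIF_MQ, O_CREAT | O_WRONLY, 0660, NULL);
    if (notif == (mqd_t)-1) { perror("mq_open"); return 1; }

    /* 1. Update the shared memory. */
    snprintf(shm, SHM_SIZE, "config version %d", 42);

    /* 2. Tell the subscribers that the shared memory has been updated;
     *    they then re-read the shared region and refresh their own state. */
    const char *note = "shared memory updated";
    mq_send(notif, note, strlen(note) + 1, 0);

    mq_close(notif);
    munmap(shm, SHM_SIZE);
    close(fd);
    return 0;
}
```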
Signals
A signal is a system message sent from one process to another, not usually used to transfer data but instead used to remotely command the partner process. When a process receives a signal, one of three things can happen:
- Default: the default action defined by the OS for that signal is carried out (e.g., terminate the process)
- Customized: a user-registered handler function is executed instead
- Ignore: the signal is ignored
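A hedged sketch of the three dispositions using sigaction() for SIGINT (the choice of signal and the handler body are illustrative):

```c
#include <stdio.h>
#include <string.h>
#include <signal.h>
#include <unistd.h>

/* Customized: a user-defined handler runs when the signal is delivered. */
static void on_sigint(int signo) {
    (void)signo;
    const char msg[] = "caught SIGINT\n";
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);   /* write() is async-signal-safe */
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));

    sa.sa_handler = on_sigint;   /* Customized action                  */
    /* sa.sa_handler = SIG_IGN;     Ignore the signal instead          */
    /* sa.sa_handler = SIG_DFL;     Restore the Default action         */
    sigaction(SIGINT, &sa, NULL);

    printf("pid %d: press Ctrl-C within 10 seconds to deliver SIGINT\n", (int)getpid());
    sleep(10);
    return 0;
}
```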
Well-known signals in Linux
- SIGINT - interrupt (i.e., Ctrl-c)
- SIGUSR1 and SIGUSR2 - user defined signals
- SIGKILL - sent to process from kernel when kill -9 is invoked on pid, this signal cannot be caught by the process
- SIGABRT - raised by abort() by the process itself, cannot be blocked, the process is terminated
- SIGTERM - raised when kill is invoked, can be caught by process to execute user defined action
- SIGSEGV - segmentation fault, raised by the kernel to the process when illegal memory is referenced
- SIGCHLD - whenever a child terminates, this signal is sent to the parent. Upon receiving this signal, the parent should execute the wait() system call to read the child's exit status. You need to understand fork() to understand this signal
Three ways of generating signals in Linux
- The OS raising a signal to a process
- Sending a signal from process A to itself (using raise())
- Sending signal from process A to process B (using kill())
Example
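A hedged sketch covering the last two ways listed above: the process registers a handler for SIGUSR1 and then raises the signal to itself with raise(); the commented-out kill() line shows how it would be sent to another process instead (handler and messages are illustrative):

```c
#include <stdio.h>
#include <string.h>
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t got_usr1 = 0;

static void on_sigusr1(int signo) {
    (void)signo;
    got_usr1 = 1;                /* just record that the signal arrived */
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigaction(SIGUSR1, &sa, NULL);

    raise(SIGUSR1);              /* the process sends SIGUSR1 to itself */
    /* kill(some_other_pid, SIGUSR1); would send it to another process instead */

    if (got_usr1)
        printf("handled SIGUSR1 in pid %d\n", (int)getpid());
    return 0;
}
```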