GRPC Core
15.0.0
Author: Sree Kuchibhotla (@sreecha) - Sep 2018
Polling engine component was created for the following reasons:
- grpc_endpoint code calls recvmsg when the fd is readable and sendmsg when the fd is writable.
- tcp_client connect code issues an asynchronous connect and finishes creating the client once the fd is writable (i.e., when the connect has actually finished).

There are multiple polling engine implementations, depending on the OS and the OS version. Fortunately, all of them expose the same interface.
- Linux:
  - epollex (default, but requires kernel version >= 4.5)
  - epoll1 (if epollex is not available and glibc version >= 2.9)
  - poll (if the kernel does not have epoll support)
- Mac: poll (default)
- libuv polling engine implementation (requires different compile-time #defines)

The following are the opaque structures exposed by the polling engine interface (NOTE: different polling engine implementations have different definitions of these structures):
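- grpc_fd: a structure representing a file descriptor.
- grpc_pollset: a set of one or more grpc_fds that are polled for readable/writable/error events. One grpc_fd can be in multiple grpc_pollsets.
- grpc_pollset_worker: a structure representing a polling thread, that is, the thread that calls the grpc_pollset_work() API.
- grpc_pollset_set: a group of grpc_fds, grpc_pollsets and grpc_pollset_sets (yes, a grpc_pollset_set can contain other grpc_pollset_sets).

Polling engine API for grpc_fd:

- grpc_fd_notify_on_[read|write|error](grpc_fd* fd, grpc_closure* closure)
  - Registers a closure to be called when the fd becomes readable or writable, or has an error.
- grpc_fd_shutdown(grpc_fd* fd)
  - Any current (or future) closures registered for readable/writable/error events are scheduled immediately with an error.
- grpc_fd_orphan(grpc_fd* fd, grpc_closure* on_done, int* release_fd, char* reason)
  - Releases the grpc_fd structure and calls the on_done closure when the operation is complete.
  - If release_fd is set to nullptr, then close() is called on the underlying fd as well. If not, the underlying fd is put in release_fd (and close() is not called).
  - release_fd is set to non-null in cases where the underlying fd is NOT owned by grpc core (for example, the fds used by the C-Ares DNS resolver).

Polling engine API for grpc_pollset:

- grpc_pollset_add_fd(grpc_pollset* ps, grpc_fd *fd)
  - Adds the fd to the pollset.
  - NOTE: There is no grpc_pollset_remove_fd. This is because calling grpc_fd_orphan() will effectively remove the fd from all the pollsets it is a part of.
- grpc_pollset_work(grpc_pollset* ps, grpc_pollset_worker** worker, grpc_millis deadline)
  - NOTE: grpc_pollset_work() requires the pollset mutex to be locked before calling it. Shortly after it is called, the function populates the *worker pointer (among other things) and releases the mutex. Once grpc_pollset_work() returns, the *worker pointer is invalid and must not be used anymore. See the code in completion_queue.cc to see how this is used.
  - Polls the fds in the pollset for events and returns when ANY of the following is true:
    - The deadline expired.
    - Some fds in the pollset were found to be readable/writable/in error, and the associated closures were scheduled (but not necessarily executed).
    - The worker was kicked (see grpc_pollset_kick for more details).
- grpc_pollset_kick(grpc_pollset* ps, grpc_pollset_worker* worker)
  - Kicks the worker, i.e., forces the worker to return from grpc_pollset_work().
  - If worker == nullptr, kicks ANY worker active on that pollset.

Polling engine API for grpc_pollset_set:

- grpc_pollset_set_[add|del]_fd(grpc_pollset_set* pss, grpc_fd *fd)
  - Adds the fd to, or removes it from, the grpc_pollset_set.
- grpc_pollset_set_[add|del]_pollset(grpc_pollset_set* pss, grpc_pollset* ps)
  - Adding a pollset to a pollset_set means that calling grpc_pollset_work() on the pollset will also poll all the fds in the pollset_set; i.e., semantically, it is similar to adding all the fds inside the pollset_set to the pollset.
  - This guarantee no longer holds once the pollset is removed from the pollset_set.
- grpc_pollset_set_[add|del]_pollset_set(grpc_pollset_set* bag, grpc_pollset_set* item)
  - Semantically, this is similar to adding all the fds in the bag pollset_set to the item pollset_set.

[Figure: relation between grpc_pollset_worker, grpc_pollset and grpc_fd]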
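
As a rough illustration of how a worker relates to a pollset and its fds, the following sketch models the three ways a polling call can return: the deadline passes, an fd becomes ready, or the worker is kicked. It is only a toy built on POSIX poll() and a self-pipe. KickablePollset, WorkResult and their methods are hypothetical names invented for this example; they are not gRPC's grpc_pollset API, and the mutex handling and closure scheduling of the real engines are left out.

```cpp
#include <poll.h>
#include <unistd.h>

#include <cassert>
#include <vector>

// Hypothetical toy types for illustration only; NOT gRPC's real structures.
enum class WorkResult { kDeadlineExpired, kFdReady, kKicked };

struct KickablePollset {
  std::vector<int> fds;  // fds watched for readability
  int kick_pipe[2];      // self-pipe used to wake up a blocked poller

  KickablePollset() {
    int rc = pipe(kick_pipe);
    assert(rc == 0);
    (void)rc;
  }
  ~KickablePollset() {
    close(kick_pipe[0]);
    close(kick_pipe[1]);
  }

  void AddFd(int fd) { fds.push_back(fd); }

  // Similar in spirit to kicking ANY worker: make the next (or current)
  // Work() call return by making the read end of the pipe readable.
  void Kick() {
    char b = 'k';
    ssize_t n = write(kick_pipe[1], &b, 1);
    (void)n;
  }

  // Blocks until the timeout passes, a watched fd is readable, or a kick
  // arrives. A kick takes priority if several conditions hold at once.
  WorkResult Work(int timeout_ms) {
    std::vector<pollfd> pfds;
    pfds.push_back({kick_pipe[0], POLLIN, 0});
    for (int fd : fds) pfds.push_back({fd, POLLIN, 0});
    int n = poll(pfds.data(), pfds.size(), timeout_ms);
    if (n <= 0) return WorkResult::kDeadlineExpired;  // errors folded in for brevity
    if (pfds[0].revents & POLLIN) {
      char b;
      ssize_t r = read(kick_pipe[0], &b, 1);  // consume one kick
      (void)r;
      return WorkResult::kKicked;
    }
    return WorkResult::kFdReady;
  }
};
```

In this sketch a kick that arrives before Work() is called is not lost, because the byte stays in the pipe until a poller consumes it. Whether a real polling engine behaves the same way depends on its implementation.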

[Figure: grpc_pollset_set]
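
The containment semantics of a pollset_set can be sketched with a small in-memory model: an fd added to a pollset_set becomes visible to every pollset in it and to every nested pollset_set. This is a toy under stated assumptions. ModelPollset and ModelPollsetSet are hypothetical types, fds are plain integers, and removal (the del variants) is not modeled.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical toy model; not gRPC's real data structures.
struct ModelPollset {
  std::set<int> fds;
  void AddFd(int fd) { fds.insert(fd); }
};

struct ModelPollsetSet {
  std::set<int> fds;
  std::vector<ModelPollset*> pollsets;
  std::vector<ModelPollsetSet*> children;

  // Adding an fd propagates it to every contained pollset and pollset_set.
  void AddFd(int fd) {
    if (!fds.insert(fd).second) return;  // already present; also stops cycles
    for (ModelPollset* ps : pollsets) ps->AddFd(fd);
    for (ModelPollsetSet* child : children) child->AddFd(fd);
  }

  // Adding a pollset makes all fds of this pollset_set visible to it.
  void AddPollset(ModelPollset* ps) {
    assert(ps != nullptr);
    pollsets.push_back(ps);
    for (int fd : fds) ps->AddFd(fd);
  }

  // Adding `item` to this "bag" pushes the bag's fds down into `item`.
  void AddPollsetSet(ModelPollsetSet* item) {
    assert(item != nullptr);
    children.push_back(item);
    for (int fd : fds) item->AddFd(fd);
  }
};
```

For example, if a pollset is added to item, item is added to bag, and fd 7 is then added to bag, the pollset ends up containing fd 7 even though it was never added to that pollset directly.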


Code at src/core/lib/iomgr/ev_epoll1_posix.cc
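- The logic to choose a designated poller is quite complicated. Pollsets are internally sharded into what are called pollset_neighborhoods (a structure internal to the epoll1 polling engine implementation). The grpc_pollset_workers that call grpc_pollset_work on a given pollset are all queued in a linked list against the grpc_pollset. The head of the linked list is called the "root worker".
- When the next designated poller is picked, another worker on the current pollset is tried first. If there are no more workers in the current pollset, the pollset_neighborhood list is scanned to pick the next pollset and worker that could be the new designated poller.
  - NOTE: There is room to tune this implementation. All that is really needed is a good way to maintain a list of grpc_pollset_workers, with a way to group them per pollset (needed to implement grpc_pollset_kick semantics) and a way to randomly select a new designated poller.
- See the begin_worker() function to see how a designated poller is chosen. Similarly, the end_worker() function is called by the worker that has just come out of epoll_wait() and has to choose a new designated poller.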
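
The selection order described here can be sketched as follows. This is a simplified model, not the code in ev_epoll1_posix.cc. EpWorker, EpPollset, EpNeighborhood and PickNextPoller are hypothetical names, and the sketch assumes the order "another worker on the same pollset first, then scan the neighborhood". Kick states, locking and the real linked-list layout are left out.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical toy types; not the structures used by the epoll1 engine.
struct EpWorker {
  int id;
};

struct EpPollset {
  std::list<EpWorker*> workers;  // front() plays the role of the "root worker"
};

struct EpNeighborhood {
  std::vector<EpPollset*> pollsets;
};

// Picks a candidate for the next designated poller when `leaving` stops
// polling on `current`. Returns nullptr if no other worker is available.
EpWorker* PickNextPoller(const EpNeighborhood& neighborhood,
                         const EpPollset& current, const EpWorker* leaving) {
  assert(leaving != nullptr);
  // 1. Prefer another worker queued on the same pollset.
  for (EpWorker* w : current.workers) {
    if (w != leaving) return w;
  }
  // 2. Otherwise scan the neighborhood for a pollset that has a worker.
  for (const EpPollset* ps : neighborhood.pollsets) {
    if (ps == &current) continue;
    if (!ps->workers.empty()) return ps->workers.front();
  }
  return nullptr;
}
```

With two workers on one pollset, the second is picked when the first leaves. With only one worker there, the root worker of another pollset in the same neighborhood is picked instead.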
Code at src/core/lib/iomgr/ev_epollex_posix.cc
A few observations:

- If multiple pollsets are pointing to the same Pollable, then the Pollable MUST be either empty or of type PO_FD (i.e., single-fd).
- The same fd can be in multiple Pollables (even if one of the Pollables is of type PO_FD).
- There cannot be two Pollables of type PO_FD for the same fd.

Why do we need Pollables of type PO_FD and PO_EMPTY?

- A new completion queue is created per call. Without Pollables of type PO_EMPTY and PO_FD, every call on a given channel would effectively have to create a Pollable and hence an epollset. This is because every completion queue automatically creates a pollset, and the channel fd has to be put in that pollset, which requires an epollset to hold the fd. Creating an epollset per call (even if the epollset is deleted once the call is completed) would mean a lot of system calls to create and delete epoll fds. This is clearly not a good idea.
- With these types of Pollables, all pollsets (corresponding to the new per-call completion queues) initially point to the global PO_EMPTY epollset. Once the channel fd is added to a pollset, the pollset points to the Pollable of type PO_FD containing just that fd (i.e., it reuses the existing Pollable). This way, the epoll fd creation/deletion churn is avoided.

The poll polling engine:

- gRPC's poll polling engine is quite complicated. It uses the poll() function to do the polling (and hence it is meant for platforms like osx where epoll is not available).
- The implementation is further complicated by the fact that poll() is level-triggered. Keep this in mind in case you wonder why the code at src/core/lib/iomgr/ev_poll_posix.cc is written in a certain, seemingly complicated way :)
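
One property of poll() that a poll-based engine has to account for is that it is level-triggered: as long as unread data is pending on an fd, every poll() call reports that fd as readable again. The sketch below demonstrates this with a plain pipe. IsReadable and PollIsLevelTriggered are hypothetical helper names written for this example; they are not part of gRPC.

```cpp
#include <poll.h>
#include <unistd.h>

#include <cassert>

// Returns true if a non-blocking poll() reports `fd` as readable.
bool IsReadable(int fd) {
  assert(fd >= 0);
  pollfd p = {fd, POLLIN, 0};
  return poll(&p, 1, /*timeout_ms=*/0) == 1 && (p.revents & POLLIN) != 0;
}

// Returns true if poll() keeps reporting readiness until the data is read.
bool PollIsLevelTriggered() {
  int fds[2];
  if (pipe(fds) != 0) return false;
  char b = 'x';
  bool ok = write(fds[1], &b, 1) == 1;
  // The byte is pending and unread: readiness is reported on every call.
  ok = ok && IsReadable(fds[0]) && IsReadable(fds[0]);
  // After draining the pipe, the fd is no longer reported as readable.
  ok = ok && read(fds[0], &b, 1) == 1 && !IsReadable(fds[0]);
  close(fds[0]);
  close(fds[1]);
  return ok;
}
```

A polling loop built on poll() therefore has to either consume the event or stop watching the fd before polling again. Otherwise it wakes up repeatedly for the same condition.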