[PLUG] high cpu utilization

Abhijit Bhopatkar bain at devslashzero.com
Sat Nov 14 09:44:38 IST 2009

> > No, you don't want to do that,
> > Passing data around pipes is expensive, you might as well busy spin in
> > write thread. (yes there is always a splice syscall to do a zero copy op,
> > but its complicated and not needed, all we need here is a conditional
> > variable).
> How is a busy spin better than a blocking read on a pipe? Also, since
> he is streaming, he needs a 'stream' (perhaps eventually, the write
> may not be so fast?). Either he would have to implement buffer
> management to get the desired stream functionality or go for something
> like a pipe.

And who said streaming _has_ to involve copying the data multiple times?
Writing to a pipe makes three copies of the data: one in the read thread, one 
inside the pipe and one in the write thread. For each page that has to be 
written to the disk we are creating three copies, which is ridiculous. 
Splitting it into different processes is even more stupid (unless you use 
splice as below; with splice the world makes sense again).

And the reason busy waiting will be faster is the cache lines: pushing huge 
amounts of data through any IPC mechanism usually trashes your cache lines 
enough to cause a drastic regression in performance. Busy waiting only wastes 
CPU cycles, not memory bus bandwidth (which is a much more limited resource on 
Intel architectures), and wasted cycles are handled rather gracefully by 
multi-core CPUs and the Linux scheduler.
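
To make the in-process alternative concrete, here is a minimal sketch of a 
read thread handing buffers to a write thread by reference, blocking on a 
condition variable instead of copying through a pipe. Python is used only for 
brevity; BufferQueue, stream and max_buffers are hypothetical names, not 
anything from the original program.

```python
import threading
from collections import deque


class BufferQueue:
    """Bounded hand-off queue: passes buffer references, never copies payloads."""

    def __init__(self, max_buffers=4):
        self._items = deque()
        self._max = max_buffers
        self._cond = threading.Condition()
        self._closed = False

    def put(self, buf):
        with self._cond:
            # Block (no spinning) while the consumer is behind.
            while len(self._items) >= self._max:
                self._cond.wait()
            self._items.append(buf)
            self._cond.notify_all()

    def close(self):
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def get(self):
        with self._cond:
            # Block until a buffer arrives or the producer is done.
            while not self._items and not self._closed:
                self._cond.wait()
            if not self._items:
                return None  # closed and drained
            buf = self._items.popleft()
            self._cond.notify_all()
            return buf


def stream(chunks, sink, max_buffers=4):
    """Run a reader thread over `chunks`; call `sink` on each buffer in order."""
    q = BufferQueue(max_buffers)

    def reader():
        for chunk in chunks:
            q.put(chunk)
        q.close()

    t = threading.Thread(target=reader)
    t.start()
    while (buf := q.get()) is not None:
        sink(buf)
    t.join()
```

The write side receives the very same buffer objects the read side filled, so 
the only copy left is the one the final write to disk has to make anyway.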

But I understand the point you are trying to make. A pipe is a nice abstraction 
for streaming data, and that's where a simple splice works best. Around 2006 
(Linux 2.6.17) the kernel gained zero-copy pipes via the splice syscall. 
This essentially eliminates the read/write threads and even the first data copy.
You open a device driver that produces data, splice it to a pipe, and splice 
the pipe to a consumer, which then splices it on to wherever it has to go.
Basically a producer thread (like a video driver controller) just directs the 
stream to a consumer (like an X server window), which ultimately forwards it to 
the final destination (the video driver). All of this happens without a single 
data copy.

This is how you can now use Unix pipes to implement zero-copy data streaming.

BTW: if you want to process the data in between, use vmsplice instead.


PS: I only know the general idea behind splice; google for the exact 
implementation and usage. I might have gotten some specifics wrong in the 
above scenario.

A nice enough intro I found with a quick google is here
