The normal action of the sg driver for a read operation (from a device) is to request the lower level (adapter) driver to DMA [1] data into kernel buffers that the sg driver manages. The sg driver will then copy the contents of its buffers into the user space. [This sequence is reversed for a write operation (towards a device)]. While this double handling of data is obviously inefficient it does decouple some hardware issues from user applications. For these and historical reasons the "double-buffered" IO remains the default for the sg driver.
Both "direct" and "mmap-ed" IO are techniques that permit the data to be DMA-ed directly from the lower level (adapter) driver into the user application (vice versa for write operations). Both techniques result in faster speed, smaller latencies and lower CPU utilization but come at the expense of complexity (as always). For example the Linux kernel must not attempt to swap out pages in a user application that a SCSI adapter is busy DMA-ing data into.
[1] | Older SCSI adapters and some pseudo adapter drivers don't have DMA capability in which case the CPU is used to copy the data. |