The O_DIRECT Option on Linux
2015-09-03 19:03
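While reading man 2 open I came across the O_DIRECT flag: with it, writes bypass the page cache and go directly to the device. For bulk writes, skipping the cache seemed like it ought to be faster, so I tried writing a file with O_DIRECT. Getting it to work was not easy. Memory obtained from new or malloc is not guaranteed to be suitably aligned, so O_DIRECT writes from it generally fail with EINVAL; the buffer has to come from posix_memalign (or from valloc or memalign, both of which are marked obsolete).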
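My guess (alas, I have never studied the kernel, so it is only a guess): to use this fast path, the buffer allocated in user space has to line up with the memory pages the kernel works with, so that the kernel can hand the data to the device without copying it. posix_memalign returns memory whose start address is a multiple of the requested alignment, here the page size. posix_memalign itself accepts any size; it is O_DIRECT that requires the transfer length and the file offset to be multiples of the block size, which a multiple of the page size normally satisfies.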
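The alignment rules guessed at above can be made concrete with a small sketch. The helper names is_direct_io_aligned, round_up and alloc_aligned are hypothetical (they are not part of any library), and the sketch assumes the alignment passed in is a power of two that is a multiple of the device's block size, such as the page size:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>

// Hypothetical helper: true if the buffer address, the file offset and the
// transfer length are all multiples of `align` (assumed to be a power of two).
bool is_direct_io_aligned(const void* ptr, std::uint64_t offset,
                          std::size_t len, std::size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(ptr);
    return addr % align == 0 && offset % align == 0 && len % align == 0;
}

// Hypothetical helper: round `len` up to the next multiple of `align`.
std::size_t round_up(std::size_t len, std::size_t align)
{
    return (len + align - 1) / align * align;
}

// Hypothetical helper: allocate a buffer whose address is a multiple of
// `align` and whose size is padded to a multiple of `align`.
// Returns nullptr on failure; release the buffer with free().
void* alloc_aligned(std::size_t len, std::size_t align)
{
    void* p = nullptr;
    // posix_memalign returns an error number instead of setting errno.
    if (posix_memalign(&p, align, round_up(len, align)) != 0)
        return nullptr;
    return p;
}
```

A buffer from alloc_aligned(1000, 4096) is 4096 bytes long and starts on a 4096-byte boundary, so a write of the padded length at offset 0 passes the check, while a write of only 1000 bytes would not.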
The following is the file-writing code that uses the O_DIRECT option:
// Writing 200MB took 3100994 microseconds (3.1 seconds): slower, and by a wide margin!!!
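/*
Large-write performance test, test_io_4.cpp: POSIX functions with the O_DIRECT option
*/
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // O_DIRECT is a Linux extension in <fcntl.h>; g++ usually defines this already
#endif
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>

// Elapsed microseconds from rsh to lsh; long long avoids overflow on long intervals.
static long long operator-(const struct timeval& lsh, const struct timeval& rsh)
{
    return (lsh.tv_sec - rsh.tv_sec) * 1000000LL + (lsh.tv_usec - rsh.tv_usec);
}

// Print a timestamp as "label=YYYY-MM-DD HH:MM:SS.uuuuuu".
static void print_time(const char* label, const struct timeval& tv)
{
    struct tm stTime;
    char strTemp[40];
    localtime_r(&tv.tv_sec, &stTime);
    strftime(strTemp, sizeof(strTemp), "%Y-%m-%d %H:%M:%S", &stTime);
    printf("%s=%s.%06ld\n", label, strTemp, (long)tv.tv_usec);
}

void test()
{
    struct timeval start;
    struct timeval end;
    const int DATA_LEN = 1024*1024*200; // 200MB, a multiple of the page size
    char* pData = NULL;
    printf("page size=%d\n", getpagesize());
    // posix_memalign returns an error number and does not set errno, so perror is not usable here.
    int nTemp = posix_memalign((void**)&pData, getpagesize(), DATA_LEN);
    if (0 != nTemp)
    {
        fprintf(stderr, "posix_memalign error: %s\n", strerror(nTemp));
        return;
    }
    //pData[DATA_LEN-1] = '\0';
    gettimeofday(&start, NULL);
    // O_CREAT requires a mode argument; without it the new file gets arbitrary permissions.
    int fd = open("write_direct.dat", O_RDWR | O_CREAT | O_DIRECT, 0644);
    if (fd < 0)
    {
        perror("open error");
        free(pData);
        return;
    }
    ssize_t nLen = write(fd, pData, DATA_LEN);
    if (nLen < DATA_LEN)
    {
        // errno is only meaningful when write returned -1; otherwise it was a short write.
        if (nLen < 0)
            perror("write error");
        else
            fprintf(stderr, "short write: %ld of %d bytes\n", (long)nLen, DATA_LEN);
        close(fd);
        free(pData);
        return;
    }
    close(fd);
    fd = -1;
    gettimeofday(&end, NULL);
    free(pData);
    pData = NULL;
    // Show the elapsed time
    print_time("start", start);
    print_time("end  ", end);
    printf("spend=%lld microseconds\n", end - start);
}

int main()
{
    test();
    return 0;
}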
Using O_DIRECT actually made the file write slower, which baffled me until I finally found this article online:
"Linus Really Doesn't Care for O_DIRECT, Heh"
http://www.linux-ren.org/modules/newbb/viewtopic.php?topic_id=2722
-------------------------------------------
A thread on the lkml began with a query about using O_DIRECT when opening a file. An early white paper written by Andrea Arcangeli to describe the O_DIRECT patch before it was merged into the 2.4 kernel explains, "with O_DIRECT the kernel will do DMA directly from/to the physical memory pointed [to] by the userspace buffer passed as [a] parameter to the read/write syscalls. So there will be no CPU and memory bandwidth spent in the copies between userspace memory and kernel cache, and there will be no CPU time spent in kernel in the management of the cache (like cache lookups, per-page locks etc..)." Linux creator Linus Torvalds was quick to reply that despite all the claims there is no good reason for mounting files with O_DIRECT, suggesting that interfaces like madvise() and posix_fadvise() should be used instead.
From: Aubrey [email blocked]
To: "Hua Zhong" [email blocked]
Subject: O_DIRECT question
Date: Thu, 11 Jan 2007 10:57:06 +0800
Hi all,
Opening file with O_DIRECT flag can do the un-buffered read/write access.
So if I need un-buffered access, I have to change all of my
applications to add this flag. What's more, Some scripts like "cp
oldfile newfile" still use pagecache and buffer.
Now, my question is, is there a existing way to mount a filesystem
with O_DIRECT flag? so that I don't need to change anything in my
system. If there is no option so far, What is the right way to achieve
my purpose?
Thanks a lot.
-Aubrey
From: Linus Torvalds [email blocked]
Subject: Re: O_DIRECT question
Date: Wed, 10 Jan 2007 19:05:30 -0800 (PST)
On Thu, 11 Jan 2007, Aubrey wrote:
>
> Now, my question is, is there a existing way to mount a filesystem
> with O_DIRECT flag? so that I don't need to change anything in my
> system. If there is no option so far, What is the right way to achieve
> my purpose?
The right way to do it is to just not use O_DIRECT.
The whole notion of "direct IO" is totally braindamaged. Just say no.
This is your brain: O
This is your brain on O_DIRECT: .
Any questions?
I should have fought back harder. There really is no valid reason for EVER
using O_DIRECT. You need a buffer whatever IO you do, and it might as well
be the page cache. There are better ways to control the page cache than
play games and think that a page cache isn't necessary.
So don't use O_DIRECT. Use things like madvise() and posix_fadvise()
instead.
Linus
From: Linus Torvalds [email blocked]
Subject: Re: O_DIRECT question
Date: Wed, 10 Jan 2007 19:15:48 -0800 (PST)
On Wed, 10 Jan 2007, Linus Torvalds wrote:
>
> So don't use O_DIRECT. Use things like madvise() and posix_fadvise()
> instead.
Side note: the only reason O_DIRECT exists is because database people are
too used to it, because other OS's haven't had enough taste to tell them
to do it right, so they've historically hacked their OS to get out of the
way.
As a result, our madvise and/or posix_fadvise interfaces may not be all
that strong, because people sadly don't use them that much. It's a sad
example of a totally broken interface (O_DIRECT) resulting in better
interfaces not getting used, and then not getting as much development
effort put into them.
So O_DIRECT not only is a total disaster from a design standpoint (just
look at all the crap it results in), it also indirectly has hurt better
interfaces. For example, POSIX_FADV_NOREUSE (which _could_ be a useful and
clean interface to make sure we don't pollute memory unnecessarily with
cached pages after they are all done) ends up being a no-op ;/
Sad. And it's one of those self-fulfilling prophecies. Still, I hope some
day we can just rip the damn disaster out.
Linus
From: Andrew Morton [email blocked]
Subject: Re: O_DIRECT question
Date: Wed, 10 Jan 2007 20:51:57 -0800
On Thu, 11 Jan 2007 10:57:06 +0800
Aubrey [email blocked] wrote:
> Hi all,
>
> Opening file with O_DIRECT flag can do the un-buffered read/write access.
> So if I need un-buffered access, I have to change all of my
> applications to add this flag. What's more, Some scripts like "cp
> oldfile newfile" still use pagecache and buffer.
> Now, my question is, is there a existing way to mount a filesystem
> with O_DIRECT flag? so that I don't need to change anything in my
> system. If there is no option so far, What is the right way to achieve
> my purpose?
Not possible, basically.
O_DIRECT reads and writes must be aligned to the device's block size
(usually 512 bytes) in memory addresses, file offsets and read/write request
sizes. Very few applications will bother to do that and will hence fail if
their files are automagically opened with O_DIRECT.
----------------------------------------------------------------------------------
Heh, note this sentence: The whole notion of "direct IO" is totally braindamaged. Well, it looks like the O_DIRECT option has been discouraged for a long time.
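The alternative Linus recommends in the thread, keeping the page cache and steering it with posix_fadvise(), might look roughly like the sketch below. The function name write_and_drop_cache is hypothetical, and the sketch assumes the goal is simply to avoid leaving the written data in the cache; it is not a measured replacement for the benchmark above.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: write `len` bytes through the page cache, flush them,
// then hint that the cached pages are no longer needed.
// Returns 0 on success, -1 on failure.
int write_and_drop_cache(const char* path, const char* data, std::size_t len)
{
    assert(path != nullptr && data != nullptr);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    // write() may write fewer bytes than requested, so loop until done.
    std::size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, data + done, len - done);
        if (n < 0) {
            close(fd);
            return -1;
        }
        done += static_cast<std::size_t>(n);
    }

    // Dirty pages cannot be dropped, so flush the data to disk first.
    if (fdatasync(fd) != 0) {
        close(fd);
        return -1;
    }

    // Offset 0 with length 0 means "the whole file". The call is only a
    // hint, so a failure here is deliberately ignored.
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

    return close(fd) == 0 ? 0 : -1;
}
```

No aligned buffer is needed here: a call such as write_and_drop_cache("write_buffered.dat", pData, DATA_LEN) works with memory from malloc or new (the file name is only an example).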
NE:
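With O_DIRECT you have to do I/O in aligned, block-sized units (whole pages are a safe choice). There is no way around that, because the underlying device is a block device. You can add an intermediate layer that computes the aligned file offset itself, allocates an aligned buffer with posix_memalign, performs the I/O, and then copies the contents of that buffer into the caller's buffer.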
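The intermediate layer NE describes could be sketched as follows for the read side. The name aligned_pread is hypothetical, and the sketch assumes `align` is a power of two that is a multiple of both sizeof(void*) and the device's block size (512 or the page size, for example). It issues a single pread and does not retry short reads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: read `len` bytes at an arbitrary `offset`, while only
// ever issuing an aligned request (aligned buffer, offset and length).
// Returns the number of bytes copied into `out` (short at end of file),
// or -1 on error.
ssize_t aligned_pread(int fd, void* out, std::size_t len, off_t offset,
                      std::size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    assert(out != nullptr && offset >= 0);
    if (len == 0)
        return 0;

    const off_t a = static_cast<off_t>(align);
    const off_t start = offset / a * a;  // round the offset down
    const off_t end = (offset + static_cast<off_t>(len) + a - 1) / a * a;  // round the end up
    const std::size_t span = static_cast<std::size_t>(end - start);
    const std::size_t skip = static_cast<std::size_t>(offset - start);

    // Aligned bounce buffer covering the whole aligned range.
    void* bounce = nullptr;
    if (posix_memalign(&bounce, align, span) != 0)
        return -1;

    ssize_t copied = -1;
    const ssize_t got = pread(fd, bounce, span, start);
    if (got >= 0) {
        // Only the bytes past `skip` belong to the caller's range.
        const std::size_t have = static_cast<std::size_t>(got);
        const std::size_t avail = have > skip ? have - skip : 0;
        const std::size_t n = avail < len ? avail : len;
        std::memcpy(out, static_cast<char*>(bounce) + skip, n);
        copied = static_cast<ssize_t>(n);
    }
    free(bounce);
    return copied;
}
```

The extra memcpy is the price of this convenience: the copy that O_DIRECT avoids inside the kernel is reintroduced in user space, which is close to the point Linus makes about needing a buffer for whatever I/O you do.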