
NIFTI (.nii) data format, version 1: an analysis

2017-11-27 20:56


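Why NIFTI was created

NIFTI was created because the earlier ANALYZE 7.5 format lacked some information, such as image orientation and the patient's left/right side. Carrying extra information required an additional file: ANALYZE 7.5 needs a <.hdr, .img> file pair to store the complete image. To solve this, the Data Format Working Group (DFWG) defined a complete image format, NIFTI (Neuroimaging Informatics Technology Initiative).

Radiologists and neurologists differ somewhat in the software they use and in their conventions, so the DFWG proposed the NIFTI format to unify them.

Software that reads NIFTI should also be able to read <.hdr, .img> file pairs.

NIFTI-1 header layout

A NIFTI file can store data with up to seven dimensions. dim[0] holds the number of dimensions in use. Dimensions 1, 2 and 3 are reserved for the spatial axes x, y and z, and dimension 4 for time t. Dimensions 5, 6 and 7 are available for other uses, although the 5th also has some predefined uses, such as storing voxel-specific distribution parameters or vector data.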

// NIFTI version 1 header fields (stored sequentially)
/*
The NIFTI-1 header must be exactly 348 bytes, the same size as the older ANALYZE header, for compatibility. This is the value stored in sizeof_hdr.
*/
//                      //OFFSET    SIZE    Description
int     sizeof_hdr;     //0B        4B      Size of the header.Must be 348 (bytes).
char    data_type[10];  //4B        10B     Not used; compatibility with analyze.
char    db_name[18];    //14B       18B     Not used; compatibility with analyze.
int     extents;        //32B       4B      Not used; compatibility with analyze.
short   session_error;  //36B       2B      Not used; compatibility with analyze.
char    regular;        //38B       1B      Not used; compatibility with analyze.
char    dim_info;       //39B       1B      Encoding directions(phase, frequency, slice).
short   dim[8];         //40B       16B     Data array dimensions.
float   intent_p1;      //56B       4B      1st intent parameter.
float   intent_p2;      //60B       4B      2nd intent parameter.
float   intent_p3;      //64B       4B      3rd intent parameter.
short   intent_code;    //68B       2B      nifti intent.
short   datatype;       //70B       2B      Data type.
short   bitpix;         //72B       2B      Number of bits per voxel.
short   slice_start;    //74B       2B      First slice index.
float   pixdim[8];      //76B       32B     Grid spacings(unit per dimension).
float   vox_offset;     //108B      4B      Offset into a .nii file where the image data starts.
float   scl_slope;      //112B      4B      Data scaling, slope.
float   scl_inter;      //116B      4B      Data scaling, offset.
short   slice_end;      //120B      2B      Last slice index.
char    slice_code;     //122B      1B      Slice timing order.
char    xyzt_units;     //123B      1B      Units of pixdim[1..4].
float   cal_max;        //124B      4B      Maximum display intensity.
float   cal_min;        //128B      4B      Minimum display intensity.
float   slice_duration; //132B      4B      Time for one slice.
float   toffset;        //136B      4B      Time axis shift.
int     glmax;          //140B      4B      Not used; compatibility with analyze.
int     glmin;          //144B      4B      Not used; compatibility with analyze.
char    descrip[80];    //148B      80B     Any text.
char    aux_file[24];   //228B      24B     Auxiliary filename.
short   qform_code;     //252B      2B      Use the quaternion fields.
short   sform_code;     //254B      2B      Use of the affine fields.
float   quatern_b;      //256B      4B      Quaternion b parameter.
float   quatern_c;      //260B      4B      Quaternion c parameter.
float   quatern_d;      //264B      4B      Quaternion d parameter.
float   qoffset_x;      //268B      4B      Quaternion x shift.
float   qoffset_y;      //272B      4B      Quaternion y shift.
float   qoffset_z;      //276B      4B      Quaternion z shift.
float   srow_x[4];      //280B      16B     1st row affine transform
float   srow_y[4];      //296B      16B     2nd row affine transform.
float   srow_z[4];      //312B      16B     3rd row affine transform.
char    intent_name[16];//328B      16B     Name or meaning of the data.
char    magic[4];       //344B      4B      Magic string.


The most important fields for a file reader are:

sizeof_hdr, dim[8], datatype, bitpix, pixdim[8], vox_offset.
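
A minimal sketch of pulling those fields out of a raw header, using the byte offsets from the listing above. The struct and function names are hypothetical, and the code assumes that the file's byte order matches the host's and that float is a 32-bit type on the host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the fields a reader needs first. */
typedef struct {
    int32_t sizeof_hdr;
    int16_t dim[8];
    int16_t datatype;
    int16_t bitpix;
    float   pixdim[8];
    float   vox_offset;
} nifti1_essentials;

/* Copy the essential fields out of a raw 348-byte header buffer.
 * Returns 0 on success, -1 if sizeof_hdr is not 348.
 * Assumption: no byte swapping is needed. */
int nifti1_read_essentials(const unsigned char *hdr, nifti1_essentials *out)
{
    assert(hdr != NULL && out != NULL);
    memcpy(&out->sizeof_hdr, hdr + 0,   sizeof out->sizeof_hdr);
    memcpy(out->dim,         hdr + 40,  sizeof out->dim);
    memcpy(&out->datatype,   hdr + 70,  sizeof out->datatype);
    memcpy(&out->bitpix,     hdr + 72,  sizeof out->bitpix);
    memcpy(out->pixdim,      hdr + 76,  sizeof out->pixdim);
    memcpy(&out->vox_offset, hdr + 108, sizeof out->vox_offset);
    return out->sizeof_hdr == 348 ? 0 : -1;
}
```

Copying field by field with memcpy avoids relying on compiler struct packing, which a direct cast of the buffer to a struct would depend on.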


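Contents and meaning of each header field

int sizeof_hdr

sizeof_hdr stores the size of the header. For NIFTI-1 and ANALYZE files, sizeof_hdr = 348.

Note: the fields from data_type[10] through regular are not used in NIFTI files; they exist only for compatibility with ANALYZE.

char dim_info

dim_info stores the frequency-encoding direction (1, 2, 3), the phase-encoding direction (1, 2, 3) and the slice-selection direction used during acquisition (1, 2, 3). For radial acquisitions, frequency and phase encoding are both set to 0.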
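
The three directions share one byte, so reading them takes a little bit manipulation. The sketch below assumes the bit layout used by the reference header nifti1.h (bits 0-1 frequency, bits 2-3 phase, bits 4-5 slice), which the text above does not spell out. The type and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical struct holding the three directions packed in dim_info.
 * Each value is 0 (unknown/not applicable) or 1, 2, 3. */
typedef struct {
    int freq_dim;
    int phase_dim;
    int slice_dim;
} nifti1_dim_info;

/* Unpack dim_info: bits 0-1 frequency, bits 2-3 phase, bits 4-5 slice. */
nifti1_dim_info nifti1_unpack_dim_info(unsigned char dim_info)
{
    nifti1_dim_info d;
    d.freq_dim  =  dim_info       & 0x03;
    d.phase_dim = (dim_info >> 2) & 0x03;
    d.slice_dim = (dim_info >> 4) & 0x03;
    return d;
}

/* Pack the three directions back into a single byte. */
unsigned char nifti1_pack_dim_info(int freq_dim, int phase_dim, int slice_dim)
{
    assert(freq_dim  >= 0 && freq_dim  <= 3);
    assert(phase_dim >= 0 && phase_dim <= 3);
    assert(slice_dim >= 0 && slice_dim <= 3);
    return (unsigned char)(freq_dim | (phase_dim << 2) | (slice_dim << 4));
}
```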

short dim[8]

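dim[8] holds the image dimensions described earlier. If dim[0] is not a number between 1 and 7, the data has the opposite byte order and must be byte-swapped (the NIFTI standard provides no byte-order field and recommends using dim[0] for this purpose).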
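
The dim[0] check can be sketched as follows. The function names are hypothetical; the offset 40 comes from the header listing. A real reader would then swap every multi-byte field, which is not shown here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Swap the two bytes of a 16-bit value. */
uint16_t nifti1_swap16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Inspect dim[0] (byte offset 40) of a raw header.
 * Returns 0 if the header can be used as is,
 *         1 if it must be byte-swapped,
 *        -1 if dim[0] is invalid in both byte orders. */
int nifti1_needs_swap(const unsigned char *hdr)
{
    uint16_t dim0;
    assert(hdr != NULL);
    memcpy(&dim0, hdr + 40, sizeof dim0);
    if (dim0 >= 1 && dim0 <= 7)
        return 0;
    dim0 = nifti1_swap16(dim0);
    if (dim0 >= 1 && dim0 <= 7)
        return 1;
    return -1;
}
```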

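The intent fields (these affect how image data is read and stored)

short intent_code is an integer code. Some codes require additional parameters, such as the number of degrees of freedom. When needed, these are stored in the intent_p1, intent_p2 and intent_p3 fields or, if they vary voxelwise, in the 5th dimension. The intent codes are listed below: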

INTENT                      CODE        PARAMETERS
None                        0           no parameters
Correlation                 2           p1 = degrees of freedom (df)
t test                      3           p1 = df
F test                      4           p1 = numerator df, p2 = denominator df
z score                     5           no parameters
χ² statistic                6           p1 = df
Beta distribution           7           p1 = a, p2 = b
Binomial distribution       8           p1 = number of trials, p2 = probability per trial
Gamma distribution          9           p1 = shape, p2 = scale
Poisson distribution        10          p1 = mean
Normal distribution         11          p1 = mean, p2 = standard deviation
Noncentral F statistic      12          p1 = numerator df, p2 = denominator df, p3 = numerator noncentrality parameter
Noncentral χ² statistic     13          p1 = dof, p2 = noncentrality parameter
Logistic distribution       14          p1 = location, p2 = scale
Laplace distribution        15          p1 = location, p2 = scale
Uniform distribution        16          p1 = lower end, p2 = upper end
Noncentral t statistic      17          p1 = dof, p2 = noncentrality parameter
Weibull distribution        18          p1 = location, p2 = scale, p3 = power
χ distribution              19          p1 = df*
Inverse Gaussian            20          p1 = μ, p2 = λ
Extreme value type I        21          p1 = location, p2 = scale
p-value                     22          no parameters
-ln(p)                      23          no parameters
-log10(p)                   24          no parameters
// The codes below indicate that the file contains data that are not statistics
Estimate                    1001        Estimate of some parameter, possibly indicated in intent_name
Label                       1002        Indices of a set of labels, which may be indicated in aux_file.
NeuroName                   1003        Indices in the NeuroNames set of labels.
Generic matrix              1004        For a MxN matrix in the 5th dimension, row major. p1 = M, p2 = N (integers as float); dim[5] = M*N.
Symmetric matrix            1005        For a symmetric NxN matrix in the 5th dimension, row major, lower matrix. p1 = N (integer as float); dim[5] = N*(N+1)/2.
Displacement vector         1006        Vector per voxel, stored in the 5th dimension.
Vector                      1007        As above, vector per voxel, stored in the 5th dimension.
Point set                   1008        Points in the space, in the 5th dimension. dim[1] = number of points; dim[2]=dim[3]=1; intent_name may be used to indicate modality.
Triangle                    1009        Indices of points in space, in the 5th dimension. dim[1] = number of triangles.
Quaternion                  1010        Quaternion in the 5th dimension.
Dimless                     1011        Nothing. The intent may be in intent_name.
Time series                 2001        Each voxel contains a time series.
Node index                  2002        Each voxel is an index of a surface dataset.
rgb                         2003        rgb triplet in the 5th dimension. dim[0] = 5, dim[1] has the number of entries, dim[2:4] = 1, dim[5] = 3.
rgba                        2004        rgba quadruplet in the 5th dimension. dim[0] = 5, dim[1] has the number of entries, dim[2:4] = 1, dim[5] = 4.
Shape                       2005        Value at each location is a shape parameter, such as a curvature.


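When reading data, working out its size may require the intent code together with the sizes stored in the intent_p fields.

For example, if a generic matrix is stored, the matrix shape cannot be read directly from dim: take the code from intent_code, then the matrix dimensions from intent_p1 and intent_p2.

symmetric matrix: the matrix is N*N, with intent_p1 = N and dim[5] = N*(N+1)/2 (for a symmetric matrix only half of the values need to be stored, in row-major order).

displacement vector: each voxel stores one vector, held in the 5th dimension.

vector: same as the displacement vector above.

point set: dim[1] = number of points, dim[2] = dim[3] = 1; the points in space are stored in the 5th dimension.

triangle: indices of points in space, stored in the 5th dimension; dim[1] = number of triangles.

Quaternion: a quaternion in the 5th dimension.

RGB: an RGB triplet in the 5th dimension; dim[0] = 5, dim[1] = number of entries, dim[2:4] = 1, dim[5] = 3.

RGBA: an RGBA quadruplet in the 5th dimension; dim[0] = 5, dim[1] = number of entries, dim[2:4] = 1, dim[5] = 4.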
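
The size rules above can be turned into a consistency check on dim[5]. This is only a sketch: the enum and function names are hypothetical, and it covers just the intents for which the text gives an explicit dim[5].

```c
#include <assert.h>

/* Intent codes taken from the table above (names are hypothetical). */
enum {
    INTENT_GENMATRIX = 1004,
    INTENT_SYMMATRIX = 1005,
    INTENT_RGB       = 2003,
    INTENT_RGBA      = 2004
};

/* Return the dim[5] value implied by the intent code and its parameters,
 * or 0 when the intent does not fix dim[5] in the table above.
 * p1 and p2 hold integers stored as floats, as the format requires. */
int nifti1_expected_dim5(int intent_code, float p1, float p2)
{
    int a = (int)p1;
    int b = (int)p2;
    assert(p1 >= 0.0f && p2 >= 0.0f);
    switch (intent_code) {
    case INTENT_GENMATRIX: return a * b;            /* M x N matrix       */
    case INTENT_SYMMATRIX: return a * (a + 1) / 2;  /* lower triangle     */
    case INTENT_RGB:       return 3;
    case INTENT_RGBA:      return 4;
    default:               return 0;
    }
}
```

A reader could compare the result with the header's dim[5] and reject the file, or warn, when a nonzero expectation does not match.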

datatype and bitpix (bits per pixel/voxel)

datatype stores the type of the data. The accepted types are:

TYPE                        BITS/PIX    CODE
unknown                                 0
bool                        1 bit       1
unsigned char               8 bits      2
signed short                16 bits     4
signed int                  32 bits     8
float                       32 bits     16
complex                     64 bits     32
double                      64 bits     64
rgb                         24 bits     128
"all"                                   255
signed char                 8 bits      256
unsigned short              16 bits     512
unsigned int                32 bits     768
long long                   64 bits     1024
unsigned long long          64 bits     1280
long double                 128 bits    1536
double pair                 128 bits    1792
long double pair            256 bits    2048
rgba                        32 bits     2304

The bitpix field must equal the bits/pix value that corresponds to the code in datatype.
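
A small lookup makes the datatype/bitpix rule checkable, and the same numbers give the size of the image data. The function names below are hypothetical; the values are copied from the table above.

```c
#include <assert.h>
#include <stddef.h>

/* Bits per voxel for a datatype code, or 0 for unknown (0), "all" (255)
 * and any code not in the table. */
int nifti1_bitpix_for_datatype(int datatype)
{
    switch (datatype) {
    case 1:    return 1;    /* bool               */
    case 2:    return 8;    /* unsigned char      */
    case 4:    return 16;   /* signed short       */
    case 8:    return 32;   /* signed int         */
    case 16:   return 32;   /* float              */
    case 32:   return 64;   /* complex            */
    case 64:   return 64;   /* double             */
    case 128:  return 24;   /* rgb                */
    case 256:  return 8;    /* signed char        */
    case 512:  return 16;   /* unsigned short     */
    case 768:  return 32;   /* unsigned int       */
    case 1024: return 64;   /* long long          */
    case 1280: return 64;   /* unsigned long long */
    case 1536: return 128;  /* long double        */
    case 1792: return 128;  /* double pair        */
    case 2048: return 256;  /* long double pair   */
    case 2304: return 32;   /* rgba               */
    default:   return 0;
    }
}

/* Number of bytes of image data implied by dim[] and bitpix. */
long long nifti1_data_bytes(const short dim[8], int bitpix)
{
    long long voxels = 1;
    int i;
    assert(dim != NULL && dim[0] >= 1 && dim[0] <= 7 && bitpix > 0);
    for (i = 1; i <= dim[0]; i++)
        voxels *= dim[i];
    return (voxels * bitpix + 7) / 8;  /* round up, relevant for 1-bit data */
}
```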

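Slice information

Fields: slice_start, slice_end, slice_code, slice_duration

These fields store timing information for functional MRI acquisitions and must be used together with the dim_info field, which carries slice_dim. If and only if slice_dim is nonzero, slice_code is interpreted as follows: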

CODE|       INTERPRETATION
--- |       ---
0   |       Slice order unknown
1   |       Sequential, increasing
2   |       Sequential, decreasing
3   |       Interleaved, increasing, starting at the 1st mri slice
4   |       Interleaved, decreasing, starting at the last mri slice
5   |       Interleaved, increasing, starting at the 2nd mri slice
6   |       Interleaved, decreasing, starting at one before the last mri slice


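slice_duration is the time needed to acquire a single slice. Storing it in a separate field makes it possible to describe acquisitions in which slice_duration*dim[slice_dim] is smaller than the value in pixdim[4], which is usually the repetition time (TR).

(To be honest, I don't fully understand this part.)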

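float pixdim[8]: voxel dimensions

The size of each voxel dimension is stored in pixdim[8], with each entry corresponding to the same entry of dim[8]. pixdim[0] is special: its value can only be -1 or 1. The units of the first four dimensions are specified in the xyzt_units field.

float vox_offset: voxel offset

vox_offset is the byte offset of the image data within a single-file (.nii) image.

For compatibility with older software (so annoying!!!), valid values are multiples of 16, and the minimum is 352 (the header is 348 bytes, so 352 is the smallest value that can be used).

For a file pair (.hdr/.img), vox_offset = 0. It may also be greater than 0 when the user wants to store extra information at the start of the .img file, much like a DICOM header. In a file pair with vox_offset > 0, the value is not required to be a multiple of 16. vox_offset is a float (32 bits) rather than an int to keep the header's memory alignment and compatibility with the ANALYZE format.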
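
The vox_offset rules can be written as a small validity check. This is a sketch with a hypothetical function name; the caller is assumed to know already whether the header belongs to a single .nii file or to a .hdr/.img pair.

```c
#include <assert.h>

/* Returns 1 if vox_offset is acceptable under the rules above, else 0.
 * single_file: 1 for a .nii file, 0 for a .hdr/.img pair. */
int nifti1_vox_offset_ok(float vox_offset, int single_file)
{
    long off;
    assert(single_file == 0 || single_file == 1);
    /* Reject NaN, negatives and values too large to convert safely. */
    if (!(vox_offset >= 0.0f) || vox_offset >= 2147483648.0f)
        return 0;
    off = (long)vox_offset;
    if ((float)off != vox_offset)
        return 0;                        /* not a whole number of bytes */
    if (!single_file)
        return 1;                        /* pair: 0 or any positive offset */
    return off >= 352 && off % 16 == 0;  /* .nii: multiple of 16, >= 352 */
}
```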

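float scl_slope and scl_inter: slope and intercept for data scaling

The value stored in each voxel can be scaled linearly to different units. The fields float scl_slope and float scl_inter define the slope and intercept of a linear function. Scaling allows values to be stored over a wider range than the data type itself permits, but it can also be used within the same data type. For rgb data both scaling fields should be ignored. For complex types the scaling is applied to both the real and imaginary parts.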
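
As a sketch, the linear function is value = scl_slope * stored + scl_inter. The code below also treats scl_slope == 0 as "no scaling", which is the convention in the reference header nifti1.h rather than something stated in the text above. The function names are hypothetical, and only 16-bit input is shown.

```c
#include <assert.h>
#include <stddef.h>

/* Scale one stored value. A slope of 0 is taken to mean "no scaling"
 * (assumption based on nifti1.h). */
double nifti1_scale_value(double raw, float scl_slope, float scl_inter)
{
    if (scl_slope == 0.0f)
        return raw;
    return (double)scl_slope * raw + (double)scl_inter;
}

/* Scale an array of signed 16-bit voxels into doubles. */
void nifti1_scale_int16(const short *raw, double *out, size_t n,
                        float scl_slope, float scl_inter)
{
    size_t i;
    assert(raw != NULL && out != NULL);
    for (i = 0; i < n; i++)
        out[i] = nifti1_scale_value((double)raw[i], scl_slope, scl_inter);
}
```

Writing the result into a wider type (double here) is what lets scaled values exceed the range of the stored type.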

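float cal_min and cal_max: display range

For files that store scalar data, these two fields set the default display range used when the image is opened. Voxels with values less than or equal to cal_min are shown with the lowest value of the display range (usually black on a gray scale), and values greater than or equal to cal_max are shown with the highest value (usually white). Note that this does not change the data themselves, only how they are displayed.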
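
One way a viewer might apply this range is a clamp followed by a linear map to 8-bit gray. This is an illustrative sketch only: the function name, the 0-255 output range and the linear ramp between the two limits are assumptions, not part of the format.

```c
#include <assert.h>

/* Map a voxel value to an 8-bit gray level using cal_min/cal_max.
 * Values at or below cal_min become 0 (black), values at or above
 * cal_max become 255 (white); in between, a linear ramp is assumed. */
unsigned char nifti1_display_gray(double v, float cal_min, float cal_max)
{
    double range;
    assert(cal_max > cal_min);
    if (v <= cal_min)
        return 0;
    if (v >= cal_max)
        return 255;
    range = (double)cal_max - (double)cal_min;
    return (unsigned char)(255.0 * (v - cal_min) / range + 0.5);
}
```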

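xyzt_units: units of measurement

The spatial and temporal units used for dim[1] through dim[4] (and for the corresponding pixdim[1] through pixdim[4]) are encoded in the xyzt_units field. Counting from the least significant bit, bits 1-3 store the spatial units, bits 4-6 store the temporal units, and bits 7-8 are unused. The time offset is stored in the float toffset field. The decimal codes for xyzt_units are:

UNIT                        CODE
Unknown                     0
Meter (m)                   1
Millimeter (mm)             2
Micron (µm)                 3
Seconds (s)                 8
Milliseconds (ms)           16
Microseconds (µs)           24
Hertz (Hz)                  32
Parts-per-million (ppm)     40
Radians per second (rad/s)  48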
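
Splitting xyzt_units into its two parts is a matter of masking. The sketch below uses the masks 0x07 (spatial) and 0x38 (temporal), as in the reference header nifti1.h; the function names and the ASCII unit abbreviations are hypothetical.

```c
#include <assert.h>

/* Spatial unit code: the three lowest bits. */
int nifti1_space_units(unsigned char xyzt_units) { return xyzt_units & 0x07; }

/* Temporal unit code: the next three bits, kept in place (8, 16, 24, ...). */
int nifti1_time_units(unsigned char xyzt_units)  { return xyzt_units & 0x38; }

/* Combine a spatial and a temporal code into one byte. */
unsigned char nifti1_make_xyzt_units(int space, int time)
{
    assert((space & ~0x07) == 0 && (time & ~0x38) == 0);
    return (unsigned char)(space | time);
}

/* Short ASCII name for a unit code from the table above. */
const char *nifti1_unit_name(int code)
{
    switch (code) {
    case 1:  return "m";
    case 2:  return "mm";
    case 3:  return "um";
    case 8:  return "s";
    case 16: return "ms";
    case 24: return "us";
    case 32: return "Hz";
    case 40: return "ppm";
    case 48: return "rad/s";
    default: return "unknown";
    }
}
```

For example, a file measured in millimeters and seconds stores 2 | 8 = 10 in xyzt_units.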
Radians per second (rad/s)48

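char descrip[80]: description

The char descrip[80] field can hold up to 80 characters of text. The standard does not say whether the string must be null-terminated; presumably it is up to the application to handle it correctly.

aux_file: auxiliary file

The name of a supplementary file containing extra information can be placed in this field.

Orientation information

The most visible improvement of NIFTI over the earlier ANALYZE format is that it stores orientation information unambiguously. The standard assumes that voxel coordinates refer to the center of each voxel rather than to one of its corners. The coordinate system is assumed to be right-handed: +x is right, +y is anterior and +z is superior, whereas ANALYZE uses a left-handed system. NIFTI provides three methods for mapping to conventional coordinate systems: the first exists only for compatibility with ANALYZE, while the other two can coexist and map to different coordinate systems. These systems are specified in the short qform_code and short sform_code fields, using the codes in the table below: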

NAME                        CODE        DESCRIPTION
unknown                     0           Arbitrary coordinates. Use Method 1.
scanner_anat                1           Scanner-based anatomical coordinates.
aligned_anat                2           Coordinates aligned to another file, or to the "truth" (with an arbitrary coordinate center).
talairach                   3           Coordinates aligned to the Talairach space.
mni_152                     4           Coordinates aligned to the MNI space.

In principle, qform_code (Method 2) should contain 0, 1 or 2, whereas sform_code (Method 3) may contain any of the codes in the table.



char magic[4]

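This string declares that the file conforms to the NIFTI standard.

Ideally this field should be checked first. If it contains "ni1" (hex '6E 69 31 00'), the file is part of a .hdr/.img pair; if it contains "n+1" (hex '6E 2B 31 00'), it is a single .nii file; if the string is absent, the file should be treated as ANALYZE. Future versions of NIFTI may use strings such as "n+2" and "n+3".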
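
The magic check described above might look like the sketch below. The enum and function names are hypothetical; the byte values are the ones given in the text.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of a 348-byte header by its magic field. */
typedef enum {
    NIFTI1_KIND_ANALYZE = 0,  /* no NIFTI-1 magic: treat as ANALYZE */
    NIFTI1_KIND_PAIR    = 1,  /* "ni1": .hdr/.img pair              */
    NIFTI1_KIND_SINGLE  = 2   /* "n+1": single .nii file            */
} nifti1_kind;

/* Classify the 4 bytes found at offset 344 of the header. */
nifti1_kind nifti1_classify_magic(const char magic[4])
{
    assert(magic != NULL);
    if (memcmp(magic, "ni1\0", 4) == 0)
        return NIFTI1_KIND_PAIR;
    if (memcmp(magic, "n+1\0", 4) == 0)
        return NIFTI1_KIND_SINGLE;
    return NIFTI1_KIND_ANALYZE;
}
```

memcmp is used instead of strcmp so that exactly four bytes are compared, including the terminating zero.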

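Fields unused in this version

char data_type[10], char db_name[18], int extents, short session_error and char regular are not used in NIFTI-1 and exist only for compatibility with ANALYZE. The extents field should be the integer 16384 and regular should be the character 'r'. glmin and glmax correspond to the minimum and maximum values of the dataset in the ANALYZE format.

Storing extra information

The standard offers several ways of including extra information in a NIFTI file. After the end of the header, the next 4 bytes (bytes 349 to 352, inclusive) may or may not be present in a .hdr file, but they are always present in a .nii file. They are interpreted as a character array, char extension[4]. In principle all four bytes should be set to 0. If extension[0] is nonzero, extended information starts at byte 353. The size of the extended information must be a multiple of 16 bytes. The first 8 bytes of an extension are interpreted as two integers, int esize and int ecode. esize holds the size of the extension, including the 8 bytes occupied by esize and ecode themselves.

ecode identifies the format of the rest of the extension. The three codes defined so far are:

CODE        USE
0           Unknown. This code should be avoided.
2           dicom extensions
4           xml extensions used by the AFNI software package.

A single file can hold several extensions. Each must begin with an esize/ecode pair, and each starts immediately after the previous one ends. In a single .nii file, vox_offset must be set to the start of the image data, after the last extension.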
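
Walking the extension chain of a .nii file can be sketched as below. The function name and its return convention are hypothetical. The code assumes the whole file prefix up to vox_offset is in memory, that the byte order matches the host's, and that the extensions fill the space between byte offset 352 and vox_offset exactly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* List the extensions stored between the header and the image data.
 * file:       buffer holding at least vox_offset bytes of a .nii file
 * ecodes:     receives up to max_codes ecode values, in file order
 * Returns the number of extensions found, or -1 if the chain is malformed. */
int nifti1_list_extensions(const unsigned char *file, long vox_offset,
                           int32_t *ecodes, int max_codes)
{
    long pos = 352;   /* zero-based offset of "byte 353" */
    int count = 0;
    assert(file != NULL && ecodes != NULL && vox_offset >= 352);
    if (file[348] == 0)
        return 0;                         /* extension[0] == 0: none */
    while (pos < vox_offset) {
        int32_t esize, ecode;
        if (vox_offset - pos < 8)
            return -1;                    /* no room for esize/ecode */
        memcpy(&esize, file + pos, sizeof esize);
        memcpy(&ecode, file + pos + 4, sizeof ecode);
        if (esize < 16 || esize % 16 != 0 || esize > vox_offset - pos)
            return -1;                    /* size must be a multiple of 16 */
        if (count < max_codes)
            ecodes[count] = ecode;
        pos += esize;                     /* next extension follows directly */
        count++;
    }
    return count;
}
```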

Problems with the first version of the NIFTI format

The nifti format brought a number of great benefits if compared to the old analyze format. However, it also brought its own set of new problems. Fortunately, these problems are not severe. Here are some:

Even though a huge effort was done to keep compatibility with analyze, a crucial aspect was not preserved: the world coordinate system is assumed, in the nifti format, to be ras, which is weird and confusing. The las is a much more logical choice from a medical perspective. Fortunately, since orientation is stored unambiguously, it is possible to later flip the images in the screen at will in most software.

The file format still relies too much on the file extension being .nii or on a pair .hdr/.img, rather than much less ambiguous magic strings or numbers. On the other hand, the different magic strings for single file and for file pairs effectively prevent the possibility of file splitting/merging using common operating system tools (such as dd in Linux), as the magic string needs to be changed, even though the header structure remains absolutely identical.

The magic string that is present in the header is not placed at the beginning, but near its end, which makes the file virtually unrecognisable outside of the neuroimaging field.

The specification of three different coordinate systems, while bringing flexibility, also brought ambiguity, as nowhere in the standard there is information on which should be preferred when more than one is present. Certain software packages explicitly force the qform_code and sform_code to be identical to each other.

There is no field specifying a preferred interpolation method when using Methods 2 or 3, even though these methods do allow fractional voxels to be found with the specification of world coordinates.

Method 2 allows only rotation and translation, but sometimes, due to all sorts of scanner calibration issues and different kinds of geometric distortion present in different sequences, the coregistration between two images of the same subject may require scaling and shear, which are only provided in Method 3.

Method 3 is supposed to inform that the data is aligned to a standard space using an affine transformation. This works perfectly if the data has been previously warped to such a space. Otherwise, the simple alignment of any actual brain from native to standard space cannot be obtained with only linear transformations.

To squeeze information while keeping compatibility with the analyze format, some fields had to be mangled into just one byte, such as char dim_info and char xyzt_units, which is not practical and requires sub-byte manipulation.

The field float vox_offset, directly inherited from the analyze format, should in fact, be an integer. Having it as a float only adds confusion.

Not all software packages implement the format exactly in the same way. Vector-based data, for instance, which should be stored in the 5th dimension, is often stored in the 4th, which should be reserved for time. Although this is not a problem with the format itself, but with the use made of it, easy implementation malpractices lead to a dissemination of ambiguous and ill-formed files that eventually cannot be read in other applications as intended by the time of the file creation.

Despite these issues, the format has been very successful as a means to exchange data between different software packages. An updated format, the nifti 2.0, with a header with more than 500 bytes of information, may become official soon.

Original article

https://brainder.org/2012/09/23/the-nifti-file-format/