
array and matrix in numpy

2014-08-12 22:41

What are the differences between numpy arrays and matrices? Which one should I use?


What are the advantages and disadvantages of each?

From what I've seen, either one can work as a replacement for the other if need be, so should I bother using both or should I stick to just one of them?

Will the style of the program influence my choice? I am doing some machine learning using numpy, so there are indeed lots of matrices, but also lots of vectors (arrays).

Thanks.

Tags: python, arrays, matrix, numpy

asked Nov 11 '10 at 3:25 by levesque

I don't have enough information to justify an answer but from what I can tell the main difference is the implementation of multiplication. A matrix performs matrix/tensor multiplication, whereas an array will do element-wise multiplication. – Mike Axiak Nov 11 '10 at 3:55

3 Answers

Accepted answer (60 votes)

Numpy matrices are strictly 2-dimensional, while numpy arrays (ndarrays) are N-dimensional. Matrix objects are a subclass of ndarray, so they inherit all the attributes and methods of ndarrays.

The main advantage of numpy matrices is that they provide a convenient notation for matrix multiplication: if a and b are matrices, then a*b is their matrix product.

import numpy as np

# np.mat was removed in NumPy 2.0; np.matrix accepts the same string syntax
a = np.matrix('4 3; 2 1')
b = np.matrix('1 2; 3 4')
print(a)
# [[4 3]
#  [2 1]]
print(b)
# [[1 2]
#  [3 4]]
print(a*b)
# [[13 20]
#  [ 5  8]]

Both matrix objects and ndarrays have .T to return the transpose, but matrix objects also have .H for the conjugate transpose, and .I for the inverse.
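
For plain ndarrays the same two results are available through ordinary calls. The following is a minimal sketch, assuming a small complex array `z` whose values are made up for illustration: `z.conj().T` gives the conjugate transpose and `np.linalg.inv` gives the inverse.

```python
import numpy as np

# Illustrative 2x2 complex array (the values are arbitrary).
z = np.array([[1 + 1j, 2], [3, 4 - 1j]])

# Conjugate transpose: the ndarray counterpart of matrix.H
z_h = z.conj().T

# Inverse: the ndarray counterpart of matrix.I
z_inv = np.linalg.inv(z)

print(z_h)
# Multiplying by the inverse should give the identity (up to rounding).
print(np.allclose(np.dot(z, z_inv), np.eye(2)))
```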

In contrast, numpy arrays consistently abide by the rule that operations are applied element-wise. Thus, if a and b are numpy arrays, then a*b is the array formed by multiplying the components element-wise:

c=np.array([[4, 3], [2, 1]])
d=np.array([[1, 2], [3, 4]])
print(c*d)
# [[4 6]
#  [6 4]]

To obtain the result of matrix multiplication, you use np.dot:

print(np.dot(c,d))
# [[13 20]
#  [ 5  8]]
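
On newer setups there is also an infix alternative that postdates this answer. As a sketch, assuming Python 3.5 or later and NumPy 1.10 or later, the `@` operator (and its function form `np.matmul`) performs matrix multiplication on ordinary ndarrays, which removes much of the notational advantage of matrix objects:

```python
import numpy as np

c = np.array([[4, 3], [2, 1]])
d = np.array([[1, 2], [3, 4]])

# `@` is matrix multiplication for ndarrays; `*` stays element-wise.
print(c @ d)
# [[13 20]
#  [ 5  8]]

# np.matmul is the function form of `@`.
print(np.matmul(c, d))
# [[13 20]
#  [ 5  8]]
```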

The ** operator also behaves differently:

print(a**2)
# [[22 15]
#  [10  7]]
print(c**2)
# [[16  9]
#  [ 4  1]]

Since a is a matrix, a**2 returns the matrix product a*a. Since c is an ndarray, c**2 returns an ndarray with each component squared element-wise.
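
If the matrix power is what you want for an ndarray, `np.linalg.matrix_power` computes it directly. A short sketch, reusing the values of `c` from above:

```python
import numpy as np

c = np.array([[4, 3], [2, 1]])

# Matrix power on a plain ndarray: same result as a**2 for the matrix a.
c_squared = np.linalg.matrix_power(c, 2)
print(c_squared)
# [[22 15]
#  [10  7]]
```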

There are other technical differences between matrix objects and ndarrays (having to do with np.ravel, item selection and sequence behavior).
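
A small sketch of what those differences look like in practice, using a made-up 2x2 example: indexing, flattening and iteration all keep a matrix 2-dimensional, while an ndarray drops a dimension.

```python
import numpy as np

m = np.matrix([[1, 2], [3, 4]])
a = np.asarray(m)

# Item selection: a matrix row is still 2-D, an array row is 1-D.
row_m = m[0]          # shape (1, 2)
row_a = a[0]          # shape (2,)

# Flattening: the matrix method returns a 1 x N matrix, not a 1-D object.
flat_m = m.ravel()    # shape (1, 4)
flat_a = a.ravel()    # shape (4,)
flat_m_1d = m.A1      # .A1 gives a true 1-D ndarray from a matrix

# Sequence behavior: iterating yields 1 x N matrices vs. 1-D arrays.
shapes_m = [r.shape for r in m]
shapes_a = [r.shape for r in a]

print(row_m.shape, row_a.shape)
print(flat_m.shape, flat_a.shape, flat_m_1d.shape)
print(shapes_m, shapes_a)
```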

The main advantage of numpy arrays is that they are more general than 2-dimensional matrices. What happens when you want a 3-dimensional array? Then you have to use an ndarray, not a matrix object. Thus, learning to use matrix objects is more work: you have to learn matrix object operations as well as ndarray operations.

Writing a program that uses both matrices and arrays makes your life difficult because you have to keep track of what type of object your variables are, lest multiplication return something you don't expect.

In contrast, if you stick solely with ndarrays, then you can do everything matrix objects can do, and more, except with slightly different functions/notation.

If you are willing to give up the visual appeal of numpy matrix product notation, then I think numpy arrays are definitely the way to go.

PS. Of course, you really don't have to choose one at the expense of the other, since np.asmatrix and np.asarray allow you to convert one to the other (as long as the array is 2-dimensional).
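
A minimal sketch of that round trip, with illustrative variable names. Note that both functions return views rather than copies when given an ndarray or matrix, so a write through one object is visible through the other:

```python
import numpy as np

arr = np.array([[4, 3], [2, 1]])

# ndarray -> matrix: `*` now means matrix multiplication.
mat = np.asmatrix(arr)
product = mat * mat

# matrix -> ndarray: `*` is element-wise again.
back = np.asarray(mat)
elementwise = back * back

# No data was copied: writing through the matrix changes the array too.
mat[0, 0] = 10
print(arr[0, 0])
print(type(mat).__name__, type(back).__name__)
```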

answered Nov 11 '10 at 3:59 by unutbu (edited Nov 11 '10 at 13:29)

For the matrix ** operator, there is no alternative like np.dot to using matrix objects. – Sven Marnach Nov 11 '10 at 9:19

And numpy arrays also have the .T attribute -- that's not special to matrices. Nice answer, btw :) – Sven Marnach Nov 11 '10 at 9:20

@Sven: Thanks for the comments; I've tried to correct and clarify a bit. – unutbu Nov 11 '10 at 10:29

For those wondering, mat**n for a matrix can be inelegantly applied to an array with reduce(np.dot, [arr]*n) – askewchan Apr 12 '13 at 20:42

Answer (11 votes)

Just to add one case to unutbu's list.

For me, one of the biggest practical differences between numpy ndarrays and numpy matrices (or matrix languages like MATLAB) is that ndarrays do not preserve the number of dimensions in reduction operations. Matrices are always 2-D, while the mean of an array along an axis, for example, has one dimension less.

For example, to demean the rows of a matrix or array:

with matrix

>>> m = np.matrix([[1,2],[2,3]])
>>> m
matrix([[1, 2],
        [2, 3]])
>>> mm = m.mean(1)
>>> mm
matrix([[1.5],
        [2.5]])
>>> mm.shape
(2, 1)
>>> m - mm
matrix([[-0.5,  0.5],
        [-0.5,  0.5]])

with array

>>> a = np.array([[1,2],[2,3]])
>>> a
array([[1, 2],
       [2, 3]])
>>> am = a.mean(1)
>>> am.shape
(2,)
>>> am
array([1.5, 2.5])
>>> a - am  # wrong: am is broadcast as a row
array([[-0.5, -0.5],
       [ 0.5,  0.5]])
>>> a - am[:, np.newaxis]  # right
array([[-0.5,  0.5],
       [-0.5,  0.5]])
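
As an alternative to indexing with np.newaxis afterwards, reductions on ndarrays accept a `keepdims=True` argument that keeps the reduced axis with length 1. A short sketch with the same data as above:

```python
import numpy as np

a = np.array([[1, 2], [2, 3]])

# keepdims=True keeps the reduced axis, so the row means have shape (2, 1).
row_means = a.mean(axis=1, keepdims=True)
print(row_means.shape)
# (2, 1)

# A (2, 1) column broadcasts against each row, which is what demeaning needs.
demeaned = a - row_means
print(demeaned)
# [[-0.5  0.5]
#  [-0.5  0.5]]
```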

I also think that mixing arrays and matrices gives rise to many "happy" debugging hours. However, scipy.sparse matrices are always matrices in terms of operators like multiplication.

answered Nov 11 '10 at 20:49 by user333700

Answer (11 votes)

Scipy.org recommends that you use arrays:

'array' or 'matrix'? Which should I use? - Short answer

Use arrays.

They are the standard vector/matrix/tensor type of numpy. Many numpy functions return arrays, not matrices. There is a clear distinction between element-wise operations and linear algebra operations. You can have standard vectors or row/column vectors if you like. The only disadvantage of using the array type is that you will have to use dot instead of * to multiply (reduce) two tensors (scalar product, matrix vector multiplication etc.).
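
To make the cases named in that quote concrete, here is a small sketch with made-up values showing `np.dot` on 1-D and 2-D arrays, plus an explicit column vector for comparison:

```python
import numpy as np

v = np.array([1, 2])
w = np.array([3, 4])
M = np.array([[4, 3], [2, 1]])

# Scalar (inner) product of two 1-D vectors: 1*3 + 2*4
inner = np.dot(v, w)

# Matrix-vector product: a 1-D vector in gives a 1-D vector out.
Mv = np.dot(M, v)

# An explicit (2, 1) column vector keeps the result 2-D.
col = v[:, np.newaxis]
M_col = np.dot(M, col)

print(inner)         # 11
print(Mv)            # [10  4]
print(M_col.shape)   # (2, 1)
```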
