
Coursera | Andrew Ng (01-week2-2.1) - Binary Classification

2017-12-19 15:18
This series adds personal study notes and supplementary derivations to some of the topics in the original course; if there are any mistakes, corrections and feedback are welcome. Having studied Andrew Ng's course, I organized it into text to make it easier to review and look things up. Since I have been studying English, this series is primarily in English, and I also suggest readers rely mainly on the English, with Chinese as a supplement, to lay the groundwork for reading academic papers in related fields later on. - ZJ

Coursera course | deeplearning.ai | 网易云课堂 (NetEase Cloud Classroom)

Please credit the author and source when reposting: WeChat official account 「SelfImprovementLab」

Zhihu: https://zhuanlan.zhihu.com/c_147249273

CSDN: http://blog.csdn.net/JUNJUN_ZHAO/article/details/78813175

Basics of Neural Network Programming

Week 2: Neural Network Basics

Binary Classification


(Subtitle source: 网易云课堂)

Hello, and welcome back. In this week we're going to go over the basics of neural network programming. It turns out that when you implement a neural network, there are some techniques that are going to be really important. For example, if you have a training set of m training examples, you might be used to processing the training set by having a for loop step through your m training examples. But it turns out that when you're implementing a neural network, you usually want to process your entire training set without using an explicit for loop to loop over your entire training set. So, you'll see how to do that in this week's materials.
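To preview what avoiding an explicit for loop looks like in practice, here is a minimal NumPy sketch. The weight vector w, bias b, and the column-stacked matrix X are placeholder names for illustration; the course introduces them properly in later videos.

```python
import numpy as np

# Toy setup: m = 4 training examples, each with n_x = 3 features,
# stacked as columns of X (this column convention is defined later
# in this lecture). w and b are placeholder model parameters.
m, n_x = 4, 3
X = np.random.randn(n_x, m)
w = np.random.randn(n_x, 1)
b = 0.5

# Explicit for loop over the m examples -- the style to avoid:
z_loop = np.zeros((1, m))
for i in range(m):
    z_loop[0, i] = np.dot(w.ravel(), X[:, i]) + b

# Vectorized: one matrix product processes all m examples at once.
z_vec = np.dot(w.T, X) + b

assert np.allclose(z_loop, z_vec)
```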


Another idea: when you organize the computation of your neural network, you usually have what's called a forward pass or forward propagation step, followed by a backward pass or what's called a backward propagation step. And so in this week's materials, you'll also get an introduction to why the computations in learning a neural network can be organized into this forward propagation and a separate backward propagation. For this week's materials, I want to convey these ideas using logistic regression in order to make the ideas easier to understand. But even if you've seen logistic regression before, I think that there'll be some new and interesting ideas for you to pick up in this week's materials. So with that, let's get started.
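As a rough preview of what those two phases look like for logistic regression (the loss and gradient formulas here come from later videos in this course; this is only an illustrative sketch with made-up values):

```python
import numpy as np

# One training example with n_x = 3 features, plus placeholder
# parameters w and b (values made up for illustration).
x = np.array([[0.5], [-1.2], [3.0]])
y = 1.0
w = np.zeros((3, 1))
b = 0.0

# Forward pass: compute the prediction and the loss.
z = (w.T @ x).item() + b
a = 1.0 / (1.0 + np.exp(-z))                     # sigmoid activation
loss = -(y * np.log(a) + (1 - y) * np.log(1 - a))

# Backward pass: propagate derivatives back to the parameters.
dz = a - y          # dL/dz for the logistic loss
dw = dz * x         # dL/dw
db = dz             # dL/db
```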


Logistic regression is an algorithm for binary classification, so let's start by setting up the problem. Here's an example of a binary classification problem. You might have an input image, like that, and want to output a label to recognize this image as being either a cat, in which case you output 1, or not-cat, in which case you output 0. We're going to use y to denote the output label. Let's look at how an image is represented in a computer.


To store an image, your computer stores three separate matrices corresponding to the red, green, and blue color channels of this image. So if your input image is 64 pixels by 64 pixels, then you would have three 64 by 64 matrices corresponding to the red, green, and blue pixel intensity values for your images. Although to make this little slide I drew these as much smaller matrices, so these are actually 5 by 4 matrices rather than 64 by 64. So to turn these pixel intensity values into a feature vector, what we're going to do is unroll all of these pixel values into an input feature vector x.




So to unroll all these pixel intensity values into a feature vector, what we're going to do is define a feature vector x corresponding to this image as follows. We're just going to take all the pixel values, 255, 231, and so on, until we've listed all the red pixels, and then eventually 255, 134, 255, 134, and so on, until we get a long feature vector listing out all the red, green, and blue pixel intensity values of this image. If this image is a 64 by 64 image, the total dimension of this vector x will be 64 by 64 by 3, because that's the total number of values we have in all of these matrices.


In this case, that turns out to be 12,288; that's what you get if you multiply all those numbers. And so we're going to use n_x = 12288 to represent the dimension of the input features x. And sometimes for brevity, I will also just use lowercase n to represent the dimension of this input feature vector. So in binary classification, our goal is to learn a classifier that can input an image represented by this feature vector x and predict whether the corresponding label y is 1 or 0, that is, whether this is a cat image or a non-cat image.
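As a minimal NumPy sketch of this unrolling (using a randomly generated stand-in for the cat image, since the actual image is not part of these notes):

```python
import numpy as np

# Stand-in for a 64x64 RGB image: pixel intensities in [0, 255]
# for the red, green, and blue channels.
image = np.random.randint(0, 256, size=(64, 64, 3))

# List all red values, then all green, then all blue, and unroll
# them into a single column feature vector x, as in the lecture.
x = image.transpose(2, 0, 1).reshape(-1, 1)

n_x = x.shape[0]
print(x.shape)   # (12288, 1)
print(n_x)       # 12288 = 64 * 64 * 3
```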


Notation

Let's now lay out some of the notation that we'll use throughout the rest of this course. A single training example is represented by a pair (x, y), where x is an n_x-dimensional feature vector and y, the label, is either 0 or 1. Your training set will comprise lowercase m training examples. And so your training set will be written (x^(1), y^(1)), which is the input and output for your first training example, (x^(2), y^(2)) for the second training example, up to (x^(m), y^(m)), which is your last training example.




And then that altogether is your entire training set. So I'm going to use lowercase m to denote the number of training examples. And sometimes to emphasize that this is the number of training examples, I might write this as m = m_train. And when we talk about a test set, we might sometimes use m subscript test to denote the number of test examples, m_test. So that's the number of test examples. Finally, to put all of the training examples into a more compact notation, we're going to define a matrix, capital X, defined by taking your training set inputs x^(1), x^(2), and so on and stacking them in columns.


So we take x^(1) and put that as the first column of this matrix, x^(2), put that as the second column, and so on down to x^(m); then this is the matrix capital X. So this matrix X will have m columns, where m is the number of training examples, and the number of rows, or the height of this matrix, is n_x. Notice that in other courses, you might see the matrix capital X defined by stacking up the training examples in rows instead, x^(1) transpose down to x^(m) transpose. It turns out that when you're implementing neural networks, using the convention I have on the left will make the implementation much easier.


So just to recap, X is an n_x by m dimensional matrix, and when you implement this in Python, you'll see that X.shape, the Python command for finding the shape of the matrix, gives (n_x, m). That just means it is an n_x by m dimensional matrix. So that's how you group the training examples, input x, into a matrix. How about the output labels y? It turns out that to make your implementation of a neural network easier, it would be convenient to also stack y in columns. So we're going to define capital Y to be equal to y^(1), y^(2), up to y^(m), like so. So Y here will be a 1 by m dimensional matrix.
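A short NumPy sketch of this stacking convention, with tiny made-up numbers so the shapes are easy to check:

```python
import numpy as np

# Toy data: m = 3 training examples, each an n_x = 4 dimensional
# column feature vector (made-up values for illustration).
x1 = np.random.randn(4, 1)
x2 = np.random.randn(4, 1)
x3 = np.random.randn(4, 1)

# Stack the examples as columns, as the lecture recommends.
X = np.hstack([x1, x2, x3])
print(X.shape)   # (4, 3), i.e. (n_x, m)

# Stack the labels the same way, in a single row.
Y = np.array([[1, 0, 1]])
print(Y.shape)   # (1, 3), i.e. (1, m)
```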


And again, to use the Python notation, the shape of Y will be (1, m), which just means this is a 1 by m matrix. And as you implement your neural network later in this course, you'll find that a useful convention is to take the data associated with different training examples, and by data I mean either x or y, or other quantities you see later, and stack them in different columns, like we've done here for both x and y. So, that's the notation we'll use for logistic regression and for neural networks later in this course.


If you ever forget what a piece of notation means, like what is m or what is n or what is something else, we've also posted on the course website a notation guide that you can use to quickly look up what any particular piece of notation means. So with that, let's go on to the next video, where we'll start to flesh out logistic regression using this notation.


Key Takeaways:

Binary Classification

Notation

A single training example is a pair (x, y); the training set consists of m examples;

where x ∈ R^{n_x}, i.e. each sample x has n_x features;

and y ∈ {0, 1}, i.e. the label is one of two classes;

Training data: {(x^(1), y^(1)), (x^(2), y^(2)), ⋯, (x^(m), y^(m))}

Training inputs stacked as columns: X = [x^(1), x^(2), ⋯, x^(m)], with X.shape = (n_x, m)

Corresponding labels: Y = [y^(1), y^(2), ⋯, y^(m)], with Y.shape = (1, m)


Tags: Andrew Ng, Deep Learning