Why You Should Not Use AWS S3 or CloudFront to Deliver Static Assets

2020-08-21 04:34

AWS is THE cool kid in town. No comparison of cloud providers is complete unless it includes AWS at least once.

But S3, the most popular cloud storage solution and the one everyone loves, should not always be your choice. In this article, I'll explain why.

Note: Please don't immediately yell at me about how and why AWS is best. I know they are at the top of cloud computing, and in no way am I trying to target any of their business practices or services. I've used CloudFront + S3 myself, as well as DigitalOcean + Cloudflare, and these are my observations. Please take my thoughts constructively, and if you think I've made any mistakes, tweet me at mehulmpt.

CloudFront + S3

CloudFront is another service often used (and recommended) with S3 when you're trying to distribute files digitally all over the globe. CloudFront is a CDN from Amazon with edge servers all over the world. This is how it works:

Your user, say from India, tries to load your website, whose server is located in the USA. Let's say it's a SPA built with something like React or Angular. The first index.html page will load from your origin server (it is usually good practice to never cache HTML pages, especially for SSR applications, to prevent cache mishaps).

After that, if you've hosted your JS/CSS files on S3 behind CloudFront, those requests go to a CloudFront domain name that resolves to the IP address of a machine close to the user's location. In this case, it's probably an AWS server sitting in a data center in Mumbai, India.

From this point, that server has the responsibility of delivering that file. Two things can happen:

  • your file is already available on that Mumbai server (cached), and the server returns it to you immediately (cache hit),
  • or it does not have that file and has to make a trip to your origin server (the S3 bucket in this case) to get it (cache miss).
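
The two outcomes can be sketched as a toy model of a single edge location. This is only an illustration of the hit/miss idea, not how CloudFront is implemented; the `EdgeCache` class and the in-memory `origin` dictionary are made-up stand-ins for the Mumbai edge server and the S3 bucket.

```python
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy edge location: serve from the local cache, else fetch from the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, bytes] = {}

    def get(self, path: str) -> Tuple[bytes, str]:
        if path in self._store:
            # Cache hit: the file is returned without contacting the origin.
            return self._store[path], "hit"
        # Cache miss: one round trip to the origin, then keep a copy locally.
        body = self._fetch_from_origin(path)
        self._store[path] = body
        return body, "miss"


# Hypothetical origin standing in for the S3 bucket.
origin = {"/assets/javascript/file.js": b"console.log('hi');"}
mumbai_edge = EdgeCache(lambda path: origin[path])

print(mumbai_edge.get("/assets/javascript/file.js")[1])  # miss
print(mumbai_edge.get("/assets/javascript/file.js")[1])  # hit
```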

But even on a cache miss, chances are high that the request will still be faster for the user than it would be without CloudFront in front.

Why? Because on a cache miss, the edge server reaches the origin over network links operated by Amazon, a trillion-dollar US company. Those links likely offer much better connectivity and lower latency than what your ISP can offer.

Also, because they're on the same global Amazon network, they can do some neat optimizations to save more time.

Alright! Sounds great to me so far, so what's the problem? Hold your horses, we'll get to it.

Asset compression

CloudFront allows you to deliver compressed assets using gzip. But there's an even cooler kid on the market: Brotli compression. It is supported by almost every major browser.

Brotli compresses your transmission data even more. This means it's not only good on your wallet, but it's also good for the end-user (because they'll spend less time seeing that loading/white screen).

At the time of writing, Amazon CloudFront does not support delivering Brotli-compressed assets. I won't blame them for this either: Brotli is slow to compress on the fly (which is how CloudFront does gzip), and that is probably why they have not implemented it yet.

Sure, then let's compress the files ourselves, store them on S3, and deliver the compressed version, right? Unfortunately, it is not that simple, and it quickly turns into an architecture problem.

A typical asset URL would look like this: http://mysite/assets/javascript/file.js

When your browser makes a request, it sends an Accept-Encoding header. This header lists the compression algorithms the browser supports, such as gzip, deflate, and br (Brotli). The server now has to act smart to get maximum efficiency.

  1. If the client supports brotli, always deliver the brotli-compressed asset.
  2. Otherwise, if the client supports gzip, deliver the gzip-compressed asset.
  3. Otherwise, deliver the original file.
  4. Also, make sure the response carries the correct Content-Encoding header so that the browser can recognize the compression algorithm.
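
As a rough sketch, the selection rules above could look like this in Python. The function names `pick_encoding` and `response_headers` are made up for illustration, and the `Vary: Accept-Encoding` header is my own addition (it tells caches that the response depends on that request header); the parsing handles `q=0` refusals but otherwise ignores quality weights.

```python
from typing import Dict, Optional


def pick_encoding(accept_encoding: str) -> Optional[str]:
    """Return "br", "gzip", or None (meaning: serve the original file)."""
    accepted = set()
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        token = token.strip().lower()
        params = params.replace(" ", "").lower()
        if params.startswith("q="):
            try:
                # q=0 means the client explicitly refuses this encoding.
                if float(params[2:]) == 0:
                    continue
            except ValueError:
                continue  # malformed weight: ignore this entry
        if token:
            accepted.add(token)
    # Server-side preference order: brotli first, then gzip.
    for preferred in ("br", "gzip"):
        if preferred in accepted:
            return preferred
    return None


def response_headers(encoding: Optional[str]) -> Dict[str, str]:
    """Headers that tell the browser (and caches) how the body is encoded."""
    headers = {"Vary": "Accept-Encoding"}
    if encoding is not None:
        headers["Content-Encoding"] = encoding
    return headers


print(pick_encoding("gzip, deflate, br"))  # br
print(pick_encoding("gzip, deflate"))      # gzip
print(pick_encoding("identity"))           # None
```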

Now, firstly, you have to create 3 variants of every single asset file:

  1. file.js - original
  2. file.js.br - brotli
  3. file.js.gz - gzip
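
A build step along these lines could produce the variants. This is a minimal sketch, not the author's tooling: `write_variants` is a hypothetical helper, gzip comes from the Python standard library, and Brotli is assumed to come from the third-party `brotli` package (the sketch skips the .br file if that package is not installed).

```python
import gzip
from pathlib import Path
from typing import List

try:
    import brotli  # third-party package; assumed installed, e.g. via `pip install brotli`
except ImportError:
    brotli = None


def write_variants(asset: Path) -> List[Path]:
    """Write file.js.gz (and file.js.br when brotli is available) next to file.js."""
    data = asset.read_bytes()
    written = []

    gz_path = asset.with_name(asset.name + ".gz")
    gz_path.write_bytes(gzip.compress(data, compresslevel=9))
    written.append(gz_path)

    if brotli is not None:
        br_path = asset.with_name(asset.name + ".br")
        br_path.write_bytes(brotli.compress(data))
        written.append(br_path)

    return written
```

When the files are uploaded, each variant also needs its Content-Encoding metadata set (br or gzip) so that the browser can decode it.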

And you have to deliver them conditionally, depending on whether the browser supports each one. CloudFront is a "dumb" CDN: it will just map your request URL to the file on your origin. It cannot perform any transformations unless... you opt in to another AWS service: Lambda@Edge functions.

Most of us know what Lambda is on AWS: you run functions in the cloud without worrying about scaling the underlying infrastructure up or down. Per-request pricing, time-bounded, sweet. Lambda@Edge is a similar service, but made for edge servers (CloudFront CDN data centers).

You can technically configure a Lambda function to act as a "middle man" between the request made by your client and the CloudFront CDN. Lambda can open the request, look at the Accept-Encoding header, modify the URL accordingly, and forward it to the "dumb" CloudFront, which then retrieves the file at the modified URL.

For example, if Lambda sees that the browser sent Accept-Encoding: br, it can rewrite the request URL from /javascript/file.js to /javascript/file.js.br without the user ever noticing. CloudFront now retrieves a smaller payload and returns a Brotli-encoded response. WIN!
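
A Lambda@Edge function doing this rewrite might look roughly like the sketch below. It relies on the CloudFront event shape (the request under `Records[0].cf.request`, with header names lowercased), and it assumes that every requested object has .br and .gz siblings in the bucket and that the function runs at a point where the browser's Accept-Encoding header is still visible. Which trigger to attach it to, and how CloudFront forwards that header, are deployment details not covered here. Quality weights such as `q=0` are ignored for brevity.

```python
def handler(event, context):
    """Rewrite the request URI to a precompressed variant the client supports."""
    request = event["Records"][0]["cf"]["request"]

    # CloudFront passes headers as {lowercase-name: [{"key": ..., "value": ...}]}.
    values = request["headers"].get("accept-encoding", [])
    accepted = {
        token.split(";")[0].strip().lower()
        for item in values
        for token in item["value"].split(",")
    }

    if "br" in accepted:
        request["uri"] += ".br"   # /javascript/file.js -> /javascript/file.js.br
    elif "gzip" in accepted:
        request["uri"] += ".gz"
    # Otherwise the URI is left alone and the original file is served.

    return request
```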

But that is good, isn't it? WHERE is the problem? The problem is... pricing.

AWS is ridiculously expensive (for this task)

Whatever you've done so far sounds and looks very good. But once your traffic starts to hit significant numbers, you'll realize that AWS isn't great when it comes to data transfer pricing. Zoom just bounced AWS for the same reason.

Plus, with asset compression, you now also have to pay for Lambda@Edge calls. That said, I figured out that implementing Lambda@Edge actually reduces your overall cost, because without the smaller compressed payloads you'd pay AWS much more for traffic!

CloudFront works on data transfer pricing. It does not charge you when it retrieves data from the S3 bucket; it charges you when a user retrieves data from the edge servers.

Upper cost bound

In the most expensive country - India - CloudFront charges you $0.170 per GB of data transferred. This is huge!

Let's say you have a popular (mainly) Indian website with about 50,000 users visiting your site daily. Also, let's say you make some design changes every week on your site (pretty common for fast iterating products) so you have to invalidate the browser and CloudFront cache.

Also, let's assume that on average a single user downloads about 10 MB of static assets from your site (CSS/JS/images/fonts included), hosted on S3 and proxied through CloudFront.

Let's calculate the cost:

  1. 50K Indian users
  2. 0.17 USD per GB
  3. 10 MB per user
  4. Every user retrieves this 4 times a month (you flush your cache 4 times, once every week)

Cost = 50000 * 0.17 * (10/1024) * 4 ≈ 332 USD. That is your cost for data transfer alone! I did not include the S3 storage cost or the cost of hosting the site. (I also left out Lambda pricing because it's negligible: $0.20 per 1 million requests * (50,000 * 4) requests = 4 cents.)

Lower cost bound

In this case, let's assume a site whose traffic is entirely US-based. The parameters now would be:

  1. 50K USA users
  2. 0.085 USD per GB
  3. 3 MB per user
  4. Every user retrieves this 4 times a month (you flush your cache 4 times, once every week)

The cost = 50000 * 0.085 * (3/1024) * 4 ≈ 50 USD. That is the lowest you'll pay when using CloudFront with the mentioned traffic (given that all of your 50K users are from the USA only). And remember, that is the cost of the data transfer alone! (Not including server costs for hosting your website.)
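
Both bounds come from the same formula, which is easy to check in a few lines of Python. The function name `cloudfront_transfer_cost` is made up for this sketch; the rates and traffic figures are the ones quoted in the text, and 1 GB is taken as 1024 MB as in the calculations above.

```python
def cloudfront_transfer_cost(
    users: int,
    usd_per_gb: float,
    mb_per_user: float,
    downloads_per_month: int,
) -> float:
    """Monthly data transfer cost in USD, using 1 GB = 1024 MB as in the text."""
    gb_per_download = mb_per_user / 1024
    return users * downloads_per_month * gb_per_download * usd_per_gb


upper = cloudfront_transfer_cost(50_000, 0.17, 10, 4)   # India, 10 MB per user
lower = cloudfront_transfer_cost(50_000, 0.085, 3, 4)   # USA, 3 MB per user

print(f"upper bound: ${upper:.2f}")  # upper bound: $332.03
print(f"lower bound: ${lower:.2f}")  # lower bound: $49.80
```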

Alternative

Now let's say you host all these static assets on your main server only, reverse proxied by NGINX and running on, say, a $60 DigitalOcean instance.

Your data transfer per month = 50000 * (10/1024) * 4 ≈ 1953 GB, or approximately 2 TB. DigitalOcean covers 1 TB of transfer per droplet for free, and it is $10 per additional 1 TB after that, so running the server comes to $70 net.
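
The same arithmetic as a small sketch. `droplet_monthly_cost` is a hypothetical helper; the $60 droplet price, 1 TB free allowance, and $10 per extra TB are the figures from the text, and billing the overage in whole terabytes (rounded up) is an assumption made here to mirror the "approximately 2 TB" rounding.

```python
import math


def droplet_monthly_cost(
    transfer_gb: float,
    droplet_usd: float = 60.0,
    free_tb: float = 1.0,
    overage_usd_per_tb: float = 10.0,
) -> float:
    """Droplet price plus overage, with overage rounded up to whole TB (assumption)."""
    overage_tb = max(0.0, transfer_gb / 1024 - free_tb)
    return droplet_usd + math.ceil(overage_tb) * overage_usd_per_tb


transfer_gb = 50_000 * (10 / 1024) * 4
print(f"{transfer_gb:.0f} GB")                     # 1953 GB
print(f"${droplet_monthly_cost(transfer_gb):.0f}")  # $70
```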

Sure, you'll get some latency now, because you're hosting the assets yourself (we'll fix this in a moment). NGINX is a high-performing web server, and you can rely on it not to be a bottleneck in your static asset delivery.

So you just went from $332 for asset transfer alone to $70 for running the whole server! Bonus tip? We were focusing on serving users in India only, so use a DigitalOcean server located in India. This means less latency.

Not only that, you can also put the Cloudflare CDN in front, which is FREE. Cloudflare may not keep your files cached in the CDN if they're too big or too infrequently accessed. But we're assuming a hell of a popular site here, so we should be fine. If not, opt for any other CDN service, and I guarantee you it will cost less than $332 a month.

TL;DR - If you're hosting a website with medium-large amounts of traffic with regularly scheduled updates, it is much more cost efficient to host assets yourself and use external CDNs (or even things like DigitalOcean CDN) instead of using S3 and CloudFront (where data traffic rates are through the roof).

Conclusion

I used this setup (CloudFront + AWS S3) on codedamn.com, a platform for developers to learn and grow. I soon realised that although it looks fancy and puts codedamn in the big leagues with Amazon, it's just not cost-efficient enough.

Do you agree with me? What do you think? Let me know by tweeting at me on my Twitter or reaching out to me on Instagram.

Peace!

Translated from: https://www.freecodecamp.org/news/do-not-use-s3-for-static-assets/
