Three things you should never put in your database
2012-05-15 19:25
As I've said in a few talks, the best way to improve your systems is by first not doing "dumb things". I don't mean that you or your development staff are "dumb"; it's easy to overlook the implications of these types of decisions and not realize how bad they are for maintainability, let alone scaling. As a consultant I see this stuff all the time, and I have yet to see it work out well for anyone.
Images, files, and binary data
Your database supports BLOBs, so it must be a good idea to shove your files in there, right? No, it isn't! Hell, it isn't even very convenient to use with many DB language bindings.
There are a few problems with storing files in your database:
read/write to a DB is always slower than a filesystem
your DB backups grow to be huge and more time consuming
access to the files now requires going through your app and DB layers
The last two are the real killers. Storing your thumbnail images in your database? Great, now you can't use nginx or another lightweight web server to serve them up.
Do yourself a favor: store the files on disk and keep only a simple relative path to each one in the database, or use something like S3 or a CDN instead.
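Here is a minimal sketch of the "path in the database, bytes on disk" idea, using Python's built-in sqlite3 module as a stand-in for whatever database you run. The `files` table, the `save_file` and `load_file` helpers, and the temporary media directory are all made up for illustration; your schema and storage root will differ.

```python
import sqlite3
import tempfile
from pathlib import Path


def save_file(conn, media_root, rel_path, data):
    """Write the bytes to disk and record only the relative path in the DB."""
    target = media_root / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    cur = conn.execute("INSERT INTO files (rel_path) VALUES (?)", (rel_path,))
    conn.commit()
    return cur.lastrowid


def load_file(conn, media_root, file_id):
    """Look up the relative path in the DB, then read the bytes from disk."""
    row = conn.execute(
        "SELECT rel_path FROM files WHERE id = ?", (file_id,)
    ).fetchone()
    if row is None:
        raise KeyError(file_id)
    return (media_root / row[0]).read_bytes()


# Hypothetical setup: an in-memory database and a throwaway media directory.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (id INTEGER PRIMARY KEY, rel_path TEXT NOT NULL)"
)
media_root = Path(tempfile.mkdtemp())

file_id = save_file(conn, media_root, "thumbs/cat.png", b"\x89PNG fake bytes")
```

Because the row holds only `thumbs/cat.png`, a web server pointed at the media directory can serve the file without touching the app or the database, and the database backup stays small.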
Ephemeral data
Usage statistics, metrics, GPS locations, session data: anything that is only useful to you for a short period of time or that changes frequently. If you find yourself DELETEing an hour's, day's, or week's worth of some table with a cron job, you're using the wrong tool for the job.
Use Redis, statsd/Graphite, Riak, or anything else that is better suited to that type of workload. The same advice goes for aggregations of ephemeral data that don't live for very long.
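What makes a store like Redis a better fit is that each key can carry its own expiry, so stale data disappears without a cleanup job. The toy class below is only an in-memory illustration of that time-to-live idea, not Redis itself and not something to run in production; `ExpiringStore`, the fake clock, and the session key are all invented for the example.

```python
import time


class ExpiringStore:
    """Toy in-memory key/value store where every entry has a time-to-live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl_seconds):
        self._data[key] = (self._clock() + ttl_seconds, value)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped on read; no cron job needed.
            del self._data[key]
            return default
        return value


# A fake clock so the example does not have to sleep for half an hour.
now = [0.0]
sessions = ExpiringStore(clock=lambda: now[0])
sessions.set("session:abc", {"user_id": 42}, ttl_seconds=1800)

now[0] = 1799.0
fresh = sessions.get("session:abc")  # still within the 30-minute TTL

now[0] = 1800.0
stale = sessions.get("session:abc")  # TTL has elapsed, entry is gone
```

Compare that with a sessions table in your main database, where the same behavior needs a timestamp column, an index on it, and a scheduled DELETE.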
Sure, it's possible to use a backhoe to plant some tomatoes in the garden, but it's far faster to grab the shovel from the garage than to schedule time with a backhoe and wait for it to arrive at your place and dig. Use the right tool(s) for the job at hand.
Logs
This one seems OK on the surface, and the "I might need to run a complex query on them at some point in the future" argument seems to win people over. Storing your logs in a database isn't a HORRIBLE idea, but storing them in the same database as your other production data is.
Maybe you're conservative with your logging and normally emit only one log line per web request. That still generates a log INSERT for every action on your site, and each one competes for resources that your users could be using. Turn up your logging to a verbose or debug level and watch your production database catch on fire!
Instead, use something like Splunk, Loggly, or plain old rotating flat files for your logs. The inconvenience of the few times you need to inspect them in odd ways, even to the point of having to write a bit of code to find your answers, is easily outweighed by the constant load that logging to the database puts on your system.
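For the "plain old rotating flat files" option, Python's standard library already has what you need. This is a small sketch using `logging.handlers.RotatingFileHandler`; the logger name, the temporary directory, the size limit, and the sample log line are placeholders picked for the example.

```python
import logging
import tempfile
from logging.handlers import RotatingFileHandler
from pathlib import Path

# Placeholder location; in practice this would be your real log directory.
log_dir = Path(tempfile.mkdtemp())
log_path = log_dir / "app.log"

# Roll over at roughly 1 MB and keep five old files (app.log.1 ... app.log.5).
handler = RotatingFileHandler(log_path, maxBytes=1_000_000, backupCount=5)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
)

logger = logging.getLogger("webapp")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep these lines out of the root logger

# One line per request goes to a file, not to an INSERT on the production DB.
logger.info("GET /thumbs/cat.png 200 12ms")
handler.flush()
```

When you do need to dig through the files later, a few lines of code that scan them is usually enough, and it costs your production database nothing.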
But wait, you're a unique snowflake and your problem is SO different that it's OK for you to do one of these three. No, you aren't, and no, it really isn't. Trust me.