mirror of
https://github.com/cheetahlou/CategoryResourceRepost.git
synced 2025-10-18 16:03:47 +08:00
429 lines
25 KiB
Markdown
429 lines
25 KiB
Markdown
<audio id="audio" title="80 | 程序员练级攻略:数据库" controls="" preload="none"><source id="mp3" src="https://static001.geekbang.org/resource/audio/b4/bc/b481f2396ab646fb2d4786ab3d3019bc.mp3"></audio>
|
||
|
||
对于数据库方向,重点就是两种数据库,一种是以SQL为代表的关系型数据库,另一种是以非SQL为代表的NoSQL数据库。关系型数据库主要有三个:Oracle、MySQL 和 Postgres。
|
||
|
||
在这里,我们只讨论越来越主流的MySQL数据库。首先,我们要了解数据库的一些实现原理和内存的一些细节,然后我们要知道数据的高可用和数据复制这些比较重要的话题,了解一下关系型数据库的一些实践和难点。然后,我们会进入到NoSQL数据库的学习。
|
||
|
||
NoSQL数据库千奇百怪,其主要是解决了关系型数据库中的各种问题。第一个大问题就是数据的Schema非常多,用关系型数据库来表示不同的Data Schema是非常笨拙的,所以要有不同的数据库(如时序型、键值对型、搜索型、文档型、图结构型等)。另一个大问题是,关系型数据库的ACID是一件很讨厌的事,这极大地影响了数据库的性能和扩展性,所以NoSQL在这上面做了相应的妥协以解决大规模伸缩的问题。
|
||
|
||
对于一个程序员,你可能觉得数据库的事都是DBA的事,然而我想告诉你你错了,这些事才真正是程序员的事。因为程序是需要和数据打交道的,所以程序员或架构师不仅需要设计数据模型,还要保证整体系统的稳定性和可用性,数据是整个系统中关键中的关键。所以,作为一个架构师或程序员,你必须了解最重要的数据存储——数据库。
|
||
|
||
# 关系型数据库
|
||
|
||
今天,关系型数据库最主要的两个代表是闭源的Oracle和开源的MySQL。当然,还有很多了,比如微软的SQL Server,IBM的DB2等,还有开源的PostgreSQL。关系型数据库的世界中有好多好多产品。当然,还是Oracle和MySQL是比较主流的。所以,这里主要介绍更为开放和主流的MySQL。
|
||
|
||
如果你要玩Oracle,我这里只推荐一本书《[Oracle Database 9i/10g/11g编程艺术](https://book.douban.com/subject/5402711/)》,无论是开发人员还是DBA,它都是必读的书。这本书的作者是Oracle公司的技术副总裁托马斯·凯特(Thomas Kyte),他也是世界顶级的Oracle专家。
|
||
|
||
这本书中深入分析了Oracle数据库体系结构,包括文件、内存结构以及构成Oracle数据库和实例的底层进程,利用具体示例讨论了一些重要的数据库主题,如锁定、并发控制、事务等。同时分析了数据库中的物理结构,如表、索引和数据类型,并介绍采用哪些技术能最优地使用这些物理结构。
|
||
|
||
<li>
|
||
学习MySQL,首先一定是要看[MySQL 官方手册](https://dev.mysql.com/doc/)。
|
||
</li>
|
||
<li>
|
||
然后,官方还有几个PPT也要学习一下。
|
||
<ul>
|
||
<li>
|
||
[How to Analyze and Tune MySQL Queries for Better Performance](https://www.mysql.com/cn/why-mysql/presentations/tune-mysql-queries-performance/)
|
||
</li>
|
||
<li>
|
||
[MySQL Performance Tuning 101](https://www.mysql.com/cn/why-mysql/presentations/mysql-performance-tuning101/)
|
||
</li>
|
||
<li>
|
||
[MySQL Performance Schema & Sys Schema](https://www.mysql.com/cn/why-mysql/presentations/mysql-performance-sys-schema/)
|
||
</li>
|
||
<li>
|
||
[MySQL Performance: Demystified Tuning & Best Practices](https://www.mysql.com/cn/why-mysql/presentations/mysql-performance-tuning-best-practices/)
|
||
</li>
|
||
<li>
|
||
[MySQL Security Best Practices](https://www.mysql.com/cn/why-mysql/presentations/mysql-security-best-practices/)
|
||
</li>
|
||
<li>
|
||
[MySQL Cluster Deployment Best Practices](https://www.mysql.com/cn/why-mysql/presentations/mysql-cluster-deployment-best-practices/)
|
||
</li>
|
||
<li>
|
||
[MySQL High Availability with InnoDB Cluster](https://www.mysql.com/cn/why-mysql/presentations/mysql-high-availability-innodb-cluster/)
|
||
</li>
|
||
|
||
然后推荐《[高性能MySQL](https://book.douban.com/subject/23008813/)》,这本书是MySQL领域的经典之作,拥有广泛的影响力。不但适合数据库管理员(DBA)阅读,也适合开发人员参考学习。不管是数据库新手还是专家,都能从本书中有所收获。
|
||
|
||
如果你对MySQL的内部原理有兴趣的话,可以看一下这本书《[MySQL技术内幕:InnoDB存储引擎](https://book.douban.com/subject/24708143/)》。当然,还有官网的[MySQL Internals Manual](https://dev.mysql.com/doc/internals/en/) 。
|
||
|
||
数据库的索引设计和优化也是非常关键的,这里还有一本书《[数据库的索引设计与优化](https://book.douban.com/subject/26419771/)》也是很不错的。虽然不是讲MySQL的,但是原理都是相通的。这也是上面推荐过的《高性能MySQL》在其索引部分推荐的一本好书。
|
||
|
||
你千万不要觉得只有做数据库你才需要学习这种索引技术。不是的!在系统架构上,在分布式架构中,索引技术也是非常重要的。这本书对于索引性能进行了非常清楚的估算,不像其它书中只是模糊的描述,你一定会收获很多。
|
||
|
||
下面还有一些不错的和MySQL相关的文章。
|
||
|
||
<li>
|
||
[MySQL索引背后的数据结构及算法原理](http://blog.codinglabs.org/articles/theory-of-mysql-index.html)
|
||
</li>
|
||
<li>
|
||
[Some study on database storage internals](https://medium.com/@kousiknath/data-structures-database-storage-internals-1f5ed3619d43)
|
||
</li>
|
||
<li>
|
||
[Sharding Pinterest: How we scaled our MySQL fleet](https://medium.com/@Pinterest_Engineering/sharding-pinterest-how-we-scaled-our-mysql-fleet-3f341e96ca6f)
|
||
</li>
|
||
<li>
|
||
[Guide to MySQL High Availability](https://www.mysql.com/cn/why-mysql/white-papers/mysql-guide-to-high-availability-solutions/)
|
||
</li>
|
||
<li>
|
||
[Choosing MySQL High Availability Solutions](https://dzone.com/articles/choosing-mysql-high-availability-solutions)
|
||
</li>
|
||
<li>
|
||
[High availability with MariaDB TX: The definitive guide](https://mariadb.com/sites/default/files/content/Whitepaper_High_availability_with_MariaDB-TX.pdf)
|
||
</li>
|
||
|
||
最后,还有一个MySQL的资源列表 [Awesome MySQL](https://shlomi-noach.github.io/awesome-mysql/),这个列表中有很多的工具和开发资源,可以帮助你做很多事。
|
||
|
||
MySQL有两个比较有名的分支,一个是Percona,另一个是MariaDB,其官网上的Resources页面中有很多不错的资源和文档,可以经常看看。 [Percona Resources](https://www.percona.com/resources)、[MariaDB Resources](https://mariadb.com/resources) ,以及它们的开发博客中也有很多不错的文章,分别为 [Percona Blog](https://www.percona.com/blog/) 和 [MariaDB Blog](https://mariadb.com/resources/blog)。
|
||
|
||
然后是关于MySQL的一些相关经验型的文章。
|
||
|
||
<li>
|
||
[Booking.com: Evolution of MySQL System Design](https://www.percona.com/live/mysql-conference-2015/sessions/bookingcom-evolution-mysql-system-design) ,Booking.com的MySQL数据库使用的演化,其中有很多不错的经验分享,我相信也是很多公司会遇到的的问题。
|
||
</li>
|
||
<li>
|
||
[Tracking the Money - Scaling Financial Reporting at Airbnb](https://medium.com/airbnb-engineering/tracking-the-money-scaling-financial-reporting-at-airbnb-6d742b80f040) ,Airbnb的数据库扩展的经验分享。
|
||
</li>
|
||
<li>
|
||
[Why Uber Engineering Switched from Postgres to MySQL](https://eng.uber.com/mysql-migration/) ,无意比较两个数据库谁好谁不好,推荐这篇Uber的长文,主要是想让你从中学习到一些经验和技术细节,这是一篇很不错的文章。
|
||
</li>
|
||
|
||
关于MySQL的集群复制,下面有这些文章供你学习一下,都是很不错的实践性比较强的文章。
|
||
|
||
<li>
|
||
[Monitoring Delayed Replication, With A Focus On MySQL](https://engineering.imvu.com/2013/01/09/monitoring-delayed-replication-with-a-focus-on-mysql/)
|
||
</li>
|
||
<li>
|
||
[Mitigating replication lag and reducing read load with freno](https://githubengineering.com/mitigating-replication-lag-and-reducing-read-load-with-freno/)
|
||
</li>
|
||
<li>
|
||
另外,Booking.com给了一系列的文章,你可以看看:
|
||
<ul>
|
||
<li>
|
||
[Better Parallel Replication for MySQL](https://medium.com/booking-com-infrastructure/better-parallel-replication-for-mysql-14e2d7857813)
|
||
</li>
|
||
<li>
|
||
[Evaluating MySQL Parallel Replication Part 2: Slave Group Commit](https://medium.com/booking-com-infrastructure/evaluating-mysql-parallel-replication-part-2-slave-group-commit-459026a141d2)
|
||
</li>
|
||
<li>
|
||
[Evaluating MySQL Parallel Replication Part 3: Benchmarks in Production](https://medium.com/booking-com-infrastructure/evaluating-mysql-parallel-replication-part-3-benchmarks-in-production-db5811058d74)
|
||
</li>
|
||
<li>
|
||
<p><a href="https://medium.com/booking-com-infrastructure/evaluating-mysql-parallel-replication-part-4-more-benchmarks-in-production-49ee255043ab">Evaluating MySQL Parallel Replication Part 4: More Benchmarks in Production<br>
|
||
</a></p>
|
||
</li>
|
||
<li>
|
||
[Evaluating MySQL Parallel Replication Part 4, Annex: Under the Hood](https://medium.com/booking-com-infrastructure/evaluating-mysql-parallel-replication-part-4-annex-under-the-hood-eb456cf8b2fb)
|
||
</li>
|
||
|
||
对于MySQL的数据分区来说,还有下面几篇文章你可以看看。
|
||
|
||
<li>
|
||
[StackOverflow: MySQL sharding approaches?](https://stackoverflow.com/questions/5541421/mysql-sharding-approaches)
|
||
</li>
|
||
<li>
|
||
[Why you don’t want to shard](https://www.percona.com/blog/2009/08/06/why-you-dont-want-to-shard/)
|
||
</li>
|
||
<li>
|
||
[How to Scale Big Data Applications](https://www.percona.com/sites/default/files/presentations/How%20to%20Scale%20Big%20Data%20Applications.pdf)
|
||
</li>
|
||
<li>
|
||
[MySQL Sharding with ProxySQL](https://www.percona.com/blog/2016/08/30/mysql-sharding-with-proxysql/)
|
||
</li>
|
||
|
||
然后,再看看各个公司做MySQL Sharding的一些经验分享。
|
||
|
||
<li>
|
||
<p><a href="https://devs.mailchimp.com/blog/using-shards-to-accommodate-millions-of-users/">MailChimp: Using Shards to Accommodate Millions of Users<br>
|
||
</a></p>
|
||
</li>
|
||
<li>
|
||
[Uber: Code Migration in Production: Rewriting the Sharding Layer of Uber’s Schemaless Datastore](https://eng.uber.com/schemaless-rewrite/)
|
||
</li>
|
||
<li>
|
||
[Sharding & IDs at Instagram](https://instagram-engineering.com/sharding-ids-at-instagram-1cf5a71e5a5c)
|
||
</li>
|
||
<li>
|
||
[Airbnb: How We Partitioned Airbnb’s Main Database in Two Weeks](https://medium.com/airbnb-engineering/how-we-partitioned-airbnb-s-main-database-in-two-weeks-55f7e006ff21)
|
||
</li>
|
||
|
||
# NoSQL数据库
|
||
|
||
关于NoSQL数据库,其最初目的就是解决大数据的问题。然而,也有人把其直接用来替换掉关系型数据库。所以在学习这个技术之前,我们需要对这个技术的一些概念和初衷有一定的了解。下面是一些推荐资料。
|
||
|
||
<li>
|
||
Martin Fowler在YouTube上分享的NoSQL介绍 [Introduction To NoSQL](https://youtu.be/qI_g07C_Q5I), 以及他参与编写的 [NoSQL Distilled - NoSQL 精粹](https://book.douban.com/subject/25662138/),这本书才100多页,是本难得的关于NoSQL的书,很不错,非常易读。
|
||
</li>
|
||
<li>
|
||
[NoSQL Databases: a Survey and Decision Guidance](https://medium.com/baqend-blog/nosql-databases-a-survey-and-decision-guidance-ea7823a822d#.nhzop4d23),这篇文章可以带你自上而下地从CAP原理到开始了解NoSQL的种种技术,是一篇非常不错的文章。
|
||
</li>
|
||
<li>
|
||
[Distribution, Data, Deployment: Software Architecture Convergence in Big Data Systems](https://resources.sei.cmu.edu/asset_files/WhitePaper/2014_019_001_90915.pdf),这是卡内基·梅隆大学的一篇讲分布式大数据系统的论文。其中主要讨论了在大数据时代下的软件工程中的一些关键点,也说到了NoSQL数据库。
|
||
</li>
|
||
<li>
|
||
[No Relation: The Mixed Blessings of Non-Relational Databases](http://ianvarley.com/UT/MR/Varley_MastersReport_Full_2009-08-07.pdf),这篇论文虽然有点年代久远。但这篇论文是HBase的基础,你花上一点时间来读读,就可以了解到,对各种非关系型数据存储优缺点的一个很好的比较。
|
||
</li>
|
||
<li>
|
||
[NoSQL Data Modeling Techniques](https://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/) ,NoSQL建模技术。这篇文章我曾经翻译在了 CoolShell 上,标题为 [NoSQL 数据建模技术](https://coolshell.cn/articles/7270.htm),供你参考。
|
||
<ul>
|
||
<li>
|
||
[MongoDB - Data Modeling Introduction](https://docs.mongodb.com/manual/core/data-modeling-introduction/) ,虽然这是MongoDB的数据建模介绍,但是其很多观点可以用于其它的NoSQL数据库。
|
||
</li>
|
||
<li>
|
||
[Firebase - Structure Your Database](https://firebase.google.com/docs/database/android/structure-data) ,Google的Firebase数据库使用JSON建模的一些最佳实践。
|
||
</li>
|
||
|
||
因为CAP原理,所以当你需要选择一个NoSQL数据库的时候,你应该看看这篇文档 [Visual Guide to NoSQL Systems](http://blog.nahurst.com/visual-guide-to-nosql-systems)。
|
||
|
||
选SQL还是NoSQL,这里有两篇文章,值得你看看。
|
||
|
||
<li>
|
||
[SQL vs. NoSQL Databases: What’s the Difference?](https://www.upwork.com/hiring/data/sql-vs-nosql-databases-whats-the-difference/)
|
||
</li>
|
||
<li>
|
||
[Salesforce: SQL or NoSQL ](https://engineering.salesforce.com/sql-or-nosql-9eaf1d92545b)
|
||
</li>
|
||
|
||
# 各种NoSQL数据库
|
||
|
||
学习使用NoSQL数据库其实并不是一件很难的事,只要你把官方的文档仔细地读一下,是很容易上手的,而且大多数NoSQL数据库都是开源的,所以,也可以通过代码自己解决问题。下面我主要给出一些典型的NoSQL数据库的一些经验型的文章,供你参考。
|
||
|
||
**列数据库Column Database**
|
||
|
||
<li>
|
||
Cassandra相关
|
||
<ul>
|
||
<li>
|
||
沃尔玛实验室有两篇文章值得一读。
|
||
<ul>
|
||
- [Avoid Pitfalls in Scaling Cassandra Cluster at Walmart](https://medium.com/walmartlabs/avoid-pitfalls-in-scaling-your-cassandra-cluster-lessons-and-remedies-a71ca01f8c04)
|
||
- [Storing Images in Cassandra at Walmart](https://medium.com/walmartlabs/building-object-store-storing-images-in-cassandra-walmart-scale-a6b9c02af593)
|
||
|
||
[Yelp: How We Scaled Our Ad Analytics with Apache Cassandra](https://engineeringblog.yelp.com/2016/08/how-we-scaled-our-ad-analytics-with-cassandra.html) ,Yelp的这篇博客也有一些相关的经验和教训。
|
||
|
||
[Discord: How Discord Stores Billions of Messages](https://blog.discordapp.com/how-discord-stores-billions-of-messages-7fa6ec7ee4c7) ,Discord公司分享的一个如何存储十亿级消息的技术文章。
|
||
|
||
[Cassandra at Instagram](https://www.slideshare.net/DataStax/cassandra-at-instagram-2016) ,Instagram的一个PPT,其中介绍了Instagram中是怎么使用Cassandra的。
|
||
|
||
[Netflix: Benchmarking Cassandra Scalability on AWS - Over a million writes per second](https://medium.com/netflix-techblog/benchmarking-cassandra-scalability-on-aws-over-a-million-writes-per-second-39f45f066c9e) ,Netflix公司在AWS上给Cassandra做的一个Benchmark。
|
||
|
||
HBase相关
|
||
|
||
<li>
|
||
[Imgur Notification: From MySQL to HBASE](https://medium.com/imgur-engineering/imgur-notifications-from-mysql-to-hbase-9dba6fc44183)
|
||
</li>
|
||
<li>
|
||
[Pinterest: Improving HBase Backup Efficiency](https://medium.com/@Pinterest_Engineering/improving-hbase-backup-efficiency-at-pinterest-86159da4b954)
|
||
</li>
|
||
<li>
|
||
[IBM : Tuning HBase performance](https://www.ibm.com/support/knowledgecenter/en/SSPT3X_2.1.2/com.ibm.swg.im.infosphere.biginsights.analyze.doc/doc/bigsql_TuneHbase.html)
|
||
</li>
|
||
<li>
|
||
[HBase File Locality in HDFS](http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html)
|
||
</li>
|
||
<li>
|
||
[Apache Hadoop Goes Realtime at Facebook](http://borthakur.com/ftp/RealtimeHadoopSigmod2011.pdf)
|
||
</li>
|
||
<li>
|
||
[Storage Infrastructure Behind Facebook Messages: Using HBase at Scale](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.294.8459&rep=rep1&type=pdf)
|
||
</li>
|
||
<li>
|
||
[GitHub: Awesome HBase](https://github.com/rayokota/awesome-hbase)
|
||
</li>
|
||
|
||
针对于HBase有两本书你可以考虑一下。
|
||
|
||
<li>
|
||
首先,先推荐两本书,一本是偏实践的《[HBase实战](https://book.douban.com/subject/25706541/)》,另一本是偏大而全的手册型的《[HBase权威指南](https://book.douban.com/subject/10748460/)》。
|
||
</li>
|
||
<li>
|
||
当然,你也可以看看官方的 [The Apache HBase™ Reference Guide](http://hbase.apache.org/0.94/book/book.html)
|
||
</li>
|
||
<li>
|
||
另外两个列数据库:
|
||
<ul>
|
||
<li>
|
||
[ClickHouse - Open Source Distributed Column Database at Yandex](https://clickhouse.yandex/)
|
||
</li>
|
||
<li>
|
||
[Scaling Redshift without Scaling Costs at GIPHY](https://engineering.giphy.com/scaling-redshift-without-scaling-costs/)
|
||
</li>
|
||
|
||
**文档数据库 Document Database - MongoDB, SimpleDB, CouchDB**
|
||
|
||
<li>
|
||
[Data Points - What the Heck Are Document Databases?](https://msdn.microsoft.com/en-us/magazine/hh547103.aspx)
|
||
</li>
|
||
<li>
|
||
[eBay: Building Mission-Critical Multi-Data Center Applications with MongoDB](https://www.mongodb.com/blog/post/ebay-building-mission-critical-multi-data-center-applications-with-mongodb)
|
||
</li>
|
||
<li>
|
||
[The AWS and MongoDB Infrastructure of Parse: Lessons Learned](https://medium.baqend.com/parse-is-gone-a-few-secrets-about-their-infrastructure-91b3ab2fcf71)
|
||
</li>
|
||
<li>
|
||
[Migrating Mountains of Mongo Data](https://medium.com/build-addepar/migrating-mountains-of-mongo-data-63e530539952)
|
||
</li>
|
||
<li>
|
||
[Couchbase Ecosystem at LinkedIn](https://engineering.linkedin.com/blog/2017/12/couchbase-ecosystem-at-linkedin)
|
||
</li>
|
||
<li>
|
||
[SimpleDB at Zendesk](https://medium.com/zendesk-engineering/resurrecting-amazon-simpledb-9404034ec506)
|
||
</li>
|
||
<li>
|
||
[Github: Awesome MongoDB](https://github.com/ramnes/awesome-mongodb)
|
||
</li>
|
||
|
||
**数据结构数据库 Data structure Database - Redis**
|
||
|
||
<li>
|
||
[Learn Redis the hard way (in production) at Trivago](http://tech.trivago.com/2017/01/25/learn-redis-the-hard-way-in-production/)
|
||
</li>
|
||
<li>
|
||
[Twitter: How Twitter Uses Redis To Scale - 105TB RAM, 39MM QPS, 10,000+ Instances ](http://highscalability.com/blog/2014/9/8/how-twitter-uses-redis-to-scale-105tb-ram-39mm-qps-10000-ins.html)
|
||
</li>
|
||
<li>
|
||
[Slack: Scaling Slack’s Job Queue - Robustly Handling Billions of Tasks in Milliseconds Using Kafka and Redis](https://slack.engineering/scaling-slacks-job-queue-687222e9d100)
|
||
</li>
|
||
<li>
|
||
[GitHub: Moving persistent data out of Redis at GitHub](https://githubengineering.com/moving-persistent-data-out-of-redis/)
|
||
</li>
|
||
<li>
|
||
[Instagram: Storing Hundreds of Millions of Simple Key-Value Pairs in Redis](https://engineering.instagram.com/storing-hundreds-of-millions-of-simple-key-value-pairs-in-redis-1091ae80f74c)
|
||
</li>
|
||
<li>
|
||
[Redis in Chat Architecture of Twitch (from 27:22)](https://www.infoq.com/presentations/twitch-pokemon)
|
||
</li>
|
||
<li>
|
||
[Deliveroo: Optimizing Session Key Storage in Redis](https://deliveroo.engineering/2016/10/07/optimising-session-key-storage.html)
|
||
</li>
|
||
<li>
|
||
[Deliveroo: Optimizing Redis Storage](https://deliveroo.engineering/2017/01/19/optimising-membership-queries.html)
|
||
</li>
|
||
<li>
|
||
[GitHub: Awesome Redis](https://github.com/JamzyWang/awesome-redis)
|
||
</li>
|
||
|
||
**时序数据库 Time-Series Database**
|
||
|
||
<li>
|
||
[What is Time-Series Data & Why We Need a Time-Series Database](https://blog.timescale.com/what-the-heck-is-time-series-data-and-why-do-i-need-a-time-series-database-dcf3b1b18563)
|
||
</li>
|
||
<li>
|
||
[Time Series Data: Why and How to Use a Relational Database instead of NoSQL](https://blog.timescale.com/time-series-data-why-and-how-to-use-a-relational-database-instead-of-nosql-d0cd6975e87c)
|
||
</li>
|
||
<li>
|
||
[Beringei: High-performance Time Series Storage Engine @Facebook](https://code.facebook.com/posts/952820474848503/beringei-a-high-performance-time-series-storage-engine/)
|
||
</li>
|
||
<li>
|
||
[Introducing Atlas: Netflix’s Primary Telemetry Platform @Netflix](https://medium.com/netflix-techblog/introducing-atlas-netflixs-primary-telemetry-platform-bd31f4d8ed9a)
|
||
</li>
|
||
<li>
|
||
[Building a Scalable Time Series Database on PostgreSQL](https://blog.timescale.com/when-boring-is-awesome-building-a-scalable-time-series-database-on-postgresql-2900ea453ee2)
|
||
</li>
|
||
<li>
|
||
[Scaling Time Series Data Storage - Part I @Netflix](https://medium.com/netflix-techblog/scaling-time-series-data-storage-part-i-ec2b6d44ba39)
|
||
</li>
|
||
<li>
|
||
[Design of a Cost Efficient Time Series Store for Big Data](https://medium.com/@leventov/design-of-a-cost-efficient-time-series-store-for-big-data-88c5dc41af8e)
|
||
</li>
|
||
<li>
|
||
[GitHub: Awesome Time-Series Database](https://github.com/xephonhq/awesome-time-series-database)
|
||
</li>
|
||
|
||
**图数据库 - Graph Platform**
|
||
|
||
<li>
|
||
首先是IBM Devloperworks 上的两个简介性的PPT。
|
||
<ul>
|
||
<li>
|
||
[Intro to graph databases, Part 1, Graph databases and the CRUD operations](https://www.ibm.com/developerworks/library/cl-graph-database-1/cl-graph-database-1-pdf.pdf)
|
||
</li>
|
||
<li>
|
||
[Intro to graph databases, Part 2, Building a recommendation engine with a graph database](https://www.ibm.com/developerworks/library/cl-graph-database-2/cl-graph-database-2-pdf.pdf)
|
||
</li>
|
||
|
||
然后是一本免费的电子书《[Graph Database](http://graphdatabases.com)》。
|
||
|
||
接下来是一些图数据库的介绍文章。
|
||
|
||
<li>
|
||
[Handling Billions of Edges in a Graph Database](https://www.infoq.com/presentations/graph-database-scalability)
|
||
</li>
|
||
<li>
|
||
[Neo4j case studies with Walmart, eBay, AirBnB, NASA, etc](https://neo4j.com/customers/)
|
||
</li>
|
||
<li>
|
||
[FlockDB: Distributed Graph Database for Storing Adjacency Lists at Twitter](https://blog.twitter.com/engineering/en_us/a/2010/introducing-flockdb.html)
|
||
</li>
|
||
<li>
|
||
[JanusGraph: Scalable Graph Database backed by Google, IBM and Hortonworks](https://architecht.io/google-ibm-back-new-open-source-graph-database-project-janusgraph-1d74fb78db6b)
|
||
</li>
|
||
<li>
|
||
[Amazon Neptune](https://aws.amazon.com/neptune/)
|
||
</li>
|
||
|
||
**搜索数据库 - ElasticSearch**
|
||
|
||
<li>
|
||
[Elasticsearch: The Definitive Guide](https://www.elastic.co/guide/en/elasticsearch/guide/master/index.html) 这是官网方的ElasticSearch的学习资料,基本上来说,看这个就够了。
|
||
</li>
|
||
<li>
|
||
接下来是4篇和性能调优相关的工程实践。
|
||
<ul>
|
||
<li>
|
||
[Elasticsearch Performance Tuning Practice at eBay](https://www.ebayinc.com/stories/blogs/tech/elasticsearch-performance-tuning-practice-at-ebay/)
|
||
</li>
|
||
<li>
|
||
[Elasticsearch at Kickstarter](https://kickstarter.engineering/elasticsearch-at-kickstarter-db3c487887fc)
|
||
</li>
|
||
<li>
|
||
[9 tips on ElasticSearch configuration for high performance](https://www.loggly.com/blog/nine-tips-configuring-elasticsearch-for-high-performance/)
|
||
</li>
|
||
<li>
|
||
[Elasticsearch In Production - Deployment Best Practices](https://medium.com/@abhidrona/elasticsearch-deployment-best-practices-d6c1323b25d7)
|
||
</li>
|
||
|
||
最后是GitHub上的资源列表 [GitHub: Awesome ElasticSearch](https://github.com/dzharii/awesome-elasticsearch) 。
|
||
|
||
# 小结
|
||
|
||
好了,总结一下今天分享的内容。虽然有人会认为数据库与程序员无关,是DBA的事儿。但我坚信,数据库才真正是程序员的事儿。因为程序是需要和数据打交道的,所以程序员或架构师不仅需要设计数据模型,还要保证整体系统的稳定性和可用性,数据是整个系统中关键中的关键。
|
||
|
||
对于数据库方向,重点就是两种数据库,一种是以SQL为代表的关系型数据库,另一种是以非SQL为代表的NoSQL数据库。因而,在这篇文章中,我给出了MySQL和各种开源NoSQL的一些相关的有价值的文章和导读,主要是让你对这些数据库的内在有一定的了解,但又不会太深。同时给出了一些知名企业使用数据库的工程实践,这对于了解各种数据库的优劣非常有帮助,值得认真读读。
|
||
|
||
从下篇文章开始,我们将进入分布式系统架构方面的内容,里面不仅涵盖了大量的理论知识,更有丰富的入门指导和大量的工程实践。敬请期待。
|
||
|
||
下面是《程序员练级攻略》系列文章的目录。
|
||
|
||
- [开篇词](https://time.geekbang.org/column/article/8136)
|
||
<li>入门篇
|
||
<ul>
|
||
- [零基础启蒙](https://time.geekbang.org/column/article/8216)
|
||
- [正式入门](https://time.geekbang.org/column/article/8217)
|
||
|
||
- [程序员修养](https://time.geekbang.org/column/article/8700)
|
||
|
||
- [编程语言](https://time.geekbang.org/column/article/8701)
|
||
- [理论学科](https://time.geekbang.org/column/article/8887)
|
||
- [系统知识](https://time.geekbang.org/column/article/8888)
|
||
|
||
- [软件设计](https://time.geekbang.org/column/article/9369)
|
||
|
||
- [Linux系统、内存和网络(系统底层知识)](https://time.geekbang.org/column/article/9759)
|
||
- [异步I/O模型和Lock-Free编程(系统底层知识)](https://time.geekbang.org/column/article/9851)
|
||
- [Java底层知识](https://time.geekbang.org/column/article/10216)
|
||
- [数据库](https://time.geekbang.org/column/article/10301)
|
||
- [分布式架构入门(分布式架构)](https://time.geekbang.org/column/article/10603)
|
||
- [分布式架构经典图书和论文(分布式架构)](https://time.geekbang.org/column/article/10604)
|
||
- [分布式架构工程设计(分布式架构)](https://time.geekbang.org/column/article/11232)
|
||
- [微服务](https://time.geekbang.org/column/article/11116)
|
||
- [容器化和自动化运维](https://time.geekbang.org/column/article/11665)
|
||
- [机器学习和人工智能](https://time.geekbang.org/column/article/11669)
|
||
- [前端基础和底层原理(前端方向)](https://time.geekbang.org/column/article/12271)
|
||
- [前端性能优化和框架(前端方向)](https://time.geekbang.org/column/article/12389)
|
||
- [UI/UX设计(前端方向)](https://time.geekbang.org/column/article/12486)
|
||
- [技术资源集散地](https://time.geekbang.org/column/article/12561)
|
||
|
||
|