
I am a Data Architect for Shutterfly Inc. in Redwood Shores, CA.
Category Archives: Database Engineering
MongoDB: Lagged Replica with Replica Sets
In an enterprise database architecture, it’s very common to create a standby or replica database with a ‘lag’ in its state relative to the primary. Operations applied to the primary are not seen on the replica for some amount of … Continue reading
Posted in Data Architecture, Database Engineering, Mongodb
Tagged Mongodb, replica sets
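The lag idea in the teaser above can be illustrated with a toy model. This is a minimal sketch in plain Python, not MongoDB's replication code or configuration: the `LaggedReplica` class, its `delay_seconds` parameter, and the integer timestamps are all hypothetical names chosen for illustration. It assumes operations arrive in timestamp order and that an operation becomes visible on the replica only once it is at least `delay_seconds` old.

```python
from dataclasses import dataclass, field


@dataclass
class LaggedReplica:
    """Toy replica that applies operations only after a fixed delay (hypothetical model)."""

    delay_seconds: int
    data: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)  # (timestamp, key, value) tuples

    def receive(self, timestamp, key, value):
        # Queue an operation that was applied on the primary at `timestamp`.
        self.pending.append((timestamp, key, value))

    def apply_up_to(self, now):
        # Apply every queued operation that is at least `delay_seconds` old;
        # keep the rest queued so the replica stays behind the primary.
        cutoff = now - self.delay_seconds
        still_pending = []
        for timestamp, key, value in self.pending:
            if timestamp <= cutoff:
                self.data[key] = value
            else:
                still_pending.append((timestamp, key, value))
        self.pending = still_pending


replica = LaggedReplica(delay_seconds=3600)
replica.receive(0, "status", "v1")
replica.receive(1800, "status", "v2")  # e.g. a bad write made 30 minutes later

replica.apply_up_to(3600)
print(replica.data)  # only the first write is visible at this point
```

In this model the second write stays queued until the clock reaches 5400, which is the window in which an operator could notice a mistake on the primary and still read the older state from the replica.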
Why Not Auto Increment in MongoDB
I came across this blog post with a nice pattern for auto-increment in MongoDB. It’s a great post, but there is something to think about beyond how to logically perform the operation: performance. The idea presented in the blog is … Continue reading
Posted in Data Architecture, Database Engineering, Mongodb, Python
Tagged auto-increment, Mongodb, postgresql, Python, sequence
Data clustering in MongoDB using embedded docs
I wrote a while ago about how to cluster data to save cash. That post was geared towards relational stores, but in reality the technique is applicable to any block store on disk. To recap, the premise is simple. When … Continue reading
Posted in Data Architecture, Database Engineering, Mongodb
Tagged architecture, clustering, Mongodb, performance
Wayback Machine: snapshots still valid technique
I came across this old article I wrote for the NOCOUG newsletter in 2002 about using OS snapshots for backups. This is still very much a valid and widely used technique for performing backups. The idea is simple: – … Continue reading
Posted in Database Engineering, Mongodb, Oracle, PostgreSQL
Tagged backups, Oracle, postgresql, snapshots
MongoSF Slides
I had a great time at the MongoSF Conference on Friday. There were a ton of great presentations, and lots and lots of excitement. A big thanks to 10gen for inviting me to speak. I had a great time and … Continue reading
Dropping ACID
The de facto durability story in MongoDB is essentially… there is none. Or at least, no single-server durability. OMFG! No ACID WTF! &^%#^#?! For the next generation of internet-scale, downtime-intolerant systems, ACID may not be a desirable property. Traditional … Continue reading
Posted in Data Architecture, Database Engineering, Mongodb, MySQL, Oracle
Tagged acid, architecture, cap theorem, durability, MySQL, Oracle, postgresql, replication, scalability
Hello Shutterfly
I am very excited to start a new position at Shutterfly.com. Shutterfly is a well-known internet property with a massively growing customer base and thus new and interesting challenges in how to store, share, and organize data. What fantastic … Continue reading
pg_reorg 1.0.4
At Hi5, we currently use pg_reorg 1.0.3 in order to organize data in a clustered fashion. I posted previously about the strategy. Our version is slightly modified; the modifications I made to the C code essentially allow pg_reorg to spin/wait for … Continue reading
Fusion-io SSD
I got the opportunity to test out some of the new Fusion-io solid-state ioDrives, and I thought I would post some results. Fusion-io has created an SSD product called ioDrive that is based on PCIe cards vs. replacing SAS … Continue reading
Posted in Database Engineering, PostgreSQL
Tagged fusion-io, performance, postgresql, SSD, vxfs
Cluster data, save cash
Since the economy is not exactly rocking these days, I suspect there are a lot of companies out there trying to save a buck or two on infrastructure. Databases are not exactly cheap, so anything that an engineer or DBA … Continue reading
Posted in Database Engineering, PostgreSQL, Python