
I am a Data Architect for Shutterfly Inc. in Redwood Shores, CA.
Tag Archives: postgresql
Why Not Auto Increment in MongoDB
I came across this blog post with a nice pattern for auto-increment in MongoDB. It’s a great post, but there is something to think about beyond how to logically perform the operation: performance. The idea presented in the blog is … Continue reading
Posted in Data Architecture, Database Engineering, Mongodb, Python
Tagged auto-increment, Mongodb, postgresql, Python, sequence
Wayback Machine: snapshots still valid technique
I came across this old article I wrote for the NOCOUG newsletter in 2002 about using OS snapshots for backups. This is still very much a valid and widely used technique for performing backups. The idea is simple: – … Continue reading
Posted in Database Engineering, Mongodb, Oracle, PostgreSQL
Tagged backups, Oracle, postgresql, snapshots
Dropping ACID
The de facto durability story in MongoDB is essentially… there is none. Or at least no single-server durability. OMFG! No ACID WTF! &^%#^#?! For the next generation of internet-scale, downtime-intolerant systems, ACID may not be a desirable property. Traditional … Continue reading
Posted in Data Architecture, Database Engineering, Mongodb, MySQL, Oracle
Tagged acid, architecture, cap theorem, durability, MySQL, Oracle, postgresql, replication, scalability
pgstat 1.0 released
I added a couple of fixes to the code and released it as 1.0. We have been using it here at hi5 for some time without problems. Thanks to everyone who has helped with feedback. Also thanks to Devrim GUNDUZ for … Continue reading
pg_reorg 1.0.4
At Hi5, we currently use pg_reorg 1.0.3 to organize data in a clustered fashion. I posted previously about the strategy. Our version is slightly modified: the changes I made to the C code essentially allow pg_reorg to spin/wait for … Continue reading
Fusion-io SSD
I got the opportunity to test out some of the new Fusion-io solid-state ioDrives, and I thought I would post some results. Fusion-io has created an SSD product called ioDrive that is based on PCIe cards vs replacing SAS … Continue reading
Posted in Database Engineering, PostgreSQL
Tagged fusion-io, performance, postgresql, SSD, vxfs
Tuning for very large shared_buffers, using Veritas VxFS
The debate about the optimum shared_buffers size in PostgreSQL is clearly far from over. However, I have not seen any results where the buffer cache and the FS cache were tuned in unison. Because PostgreSQL knows what buffers are in … Continue reading
Graphing pgstat output
If you’re using the pgstat utility I posted about previously, you can graph the output with very little effort using gnuplot. In my case I use pgstat for capturing output for various PostgreSQL performance tests, and of course … Continue reading
pgstat: a database utility like iostat
I needed a utility for capturing various data points about a PostgreSQL database as I performed load tests. I copied a utility I have used previously on Oracle that worked quite well. The new utility is called pgstat. This utility … Continue reading
Unique identifier for a database without connecting to the database?
In PostgreSQL, the working directory is a unique identifier for a database, and sometimes you want to use that working dir in your script(s). But what if you don’t want to actually connect and query the database? Is there a … Continue reading