Facebook Rolls Out New Storage System – Haystack
82 Responses
4.8.2009
I would rather they work on higher resolution photo albums.
4.8.2009
Yeah, they offer high-quality video, but no high-res photos?
4.8.2009
I’d rather Facebook didn’t exist.
4.8.2009
If you’re not on it, what harm does it cause you?
4.8.2009
No, I feel the same way and I am on it
4.8.2009
It’s probably a wise move to keep the photos at a medium resolution. Let’s face it, most people do not own a 30" monitor and do not really care about the photo size or resolution. I have some friends on there that have over a thousand photos each.
4.8.2009
Let’s get a group on Facebook going to see if they would. Group called: Facebook allow full size photo download at facebook.com/group.php?sid=b46bad7e62d1 …
4.8.2009
There are no details, so it’s impossible to tell if this is interesting or not. It would be pretty efficient to store users’ photos in a folder like /home/<userid>/<album>/photo1.jpg
4.8.2009
clever. i’ll bet they hadn’t thought of that one.
4.8.2009
Now twitter needs to do the same.
4.8.2009
Twitter uses S3 to store images.
4.8.2009
Whether you like or hate Facebook, the tech needed to handle such a huge amount of data and volume of requests is amazing.
4.8.2009
My first thought, before reading the article: nginx, reiserfs, MD5/similar hash of the entire file, indexed something like /photos/3/40/340c2df22a22533f75b6820566d02b97. That eliminates duplication in the process, and if you like you could do a byte-for-byte check for hash collisions when uploading, since that’s done infrequently.
My second thought, after reading the article: Why does anyone give a *****? Was it so noticeably slow before (dunno, never used it) that a 50% speed-up requires a press release?
And Haystack: isn’t that like their terms of service? The needle being: your rights to your data and what they can do with it?
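The layout suggested in the comment above can be sketched roughly as follows. This is only an illustration of the commenter's idea, not anything Facebook is known to use. The function names `hashed_path` and `store` are hypothetical, and the split (first hex digit, then the next two, then the full hash) is inferred from the example path in the comment.

```python
import hashlib
from pathlib import Path


def hashed_path(data: bytes, root: str = "/photos") -> Path:
    """Build a path like <root>/3/40/340c2d... from the MD5 of the file contents."""
    digest = hashlib.md5(data).hexdigest()
    # First hex digit, then the next two, then the full digest as the file name
    # (assumed split, based on the example path in the comment).
    return Path(root) / digest[0] / digest[1:3] / digest


def store(data: bytes, root: str) -> Path:
    """Write data under its hash; identical uploads map to the same file."""
    path = hashed_path(data, root)
    if path.exists():
        # Byte-for-byte check on upload, as the comment suggests,
        # to tell a true duplicate from a hash collision.
        if path.read_bytes() != data:
            raise ValueError(f"hash collision at {path}")
        return path
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path
```

Calling `store` twice with the same bytes returns the same path and leaves a single file on disk, which is the deduplication the comment refers to. Spreading files over hash-prefix directories also keeps any one directory from growing very large.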
4.8.2009
Thanks for the great idea on file structure. I’ve been using file hashes in my current project, but never thought about using the first 3 digits in the path as well. I was going to use dates, but didn’t think that would work too well.
4.8.2009
Their main problem was the time it took to retrieve a photo due to the metadata associated with each file (mtime, atime, etc.).
Check out this presentation: flowgram.com/p/2qi3k8eicrfgkv/ I believe it’s about a year old now so the numbers have increased somewhat. I think you’ll agree that what they are doing is pretty bad-ass.
4.8.2009
And your string before applying the md5 hash was … "digg".
L33t H4x indeed!
4.8.2009
actually, it was horribly slow. Sometimes photos would come up instantly, sometimes they’d take 5-60 seconds. Sometimes photos simply never came up.
4.8.2009
I’m sorry, but I just don’t believe that they designed this "in house." Our IT guys at work don’t do this and they manage >4PB of data. They upgrade to new *****, like thumpers, but Facebook designing "in house" is *****. So they got some new file servers, big deal.
4.8.2009
Facebook uses NetApp filers on the backend, but that is not the point. Their problem wasn’t how to store all this data, it was how to retrieve it EXTREMELY quickly. They were hitting a ceiling when their NetApps were doing about 15 IOs per photo request. They rearranged the file and directory structure and got it down to 3 IOs per photo request. Haystack lets them write a ton of photos into a series of 10GB files. They then store metadata about the photos in memory. When a request comes in, they retrieve the inode and offset (the metadata) from memory and can retrieve the photo using only 1 disk IO. In my opinion, that’s pretty awesome stuff.
Have you ever followed their developer blog or looked at their projects? They develop a lot of stuff "in house" and some of it is open sourced, like their big memcache contributions. They’ve also done some awesome work modifying MySQL to automatically update memcache keys. I suggest you check it out.
I’m curious, what do your IT guys manage that is 4PB?
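To make the idea in the comment above concrete, here is a toy sketch of the general technique: append many photos to one large file and keep each photo's offset and length in memory, so a read is a single seek plus one read. This is not Facebook's actual Haystack format or code. The class name `TinyHaystack`, its methods, and the plain dict used as the index are all assumptions made for illustration.

```python
import os


class TinyHaystack:
    """Toy append-only photo store: one big file plus an in-memory index."""

    def __init__(self, path: str):
        self.path = path
        self.index = {}  # photo_id -> (offset, length), held in memory only
        open(path, "ab").close()  # create the store file if it is missing

    def put(self, photo_id: str, data: bytes) -> None:
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)   # make sure we measure the true end of file
            offset = f.tell()
            f.write(data)
        self.index[photo_id] = (offset, len(data))

    def get(self, photo_id: str) -> bytes:
        # The lookup needs no disk access; only the read below touches the file.
        offset, length = self.index[photo_id]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)
```

A real system would also need to persist or rebuild the index after a restart, handle deletes, and roll over to a new file once one fills up (the comment mentions a series of 10GB files); the sketch leaves all of that out.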
4.8.2009
Thanks for the info. Very interesting.
I’m a scientist at a joint Harvard/MIT institute for bio research. Much of what we do is considered "data rich." High content, throughput, sequencing, etc. If you did a quick Google search I’m sure you could find it.
4.8.2009
Thanks for that info, I knew they had to have a huge SAN backend … people here commenting about fileservers lol.
Thanks for the details, you got a link to the dev blog?
4.8.2009
@bilbus
Developer blog: developers.facebook.com/news.php?tab=blog
Presentation on Haystack from about a year ago: flowgram.com/p/2qi3k8eicrfgkv/
4.8.2009
I’m still pissed off about the new look, it’s fking annoying and I hate it. Every day I realize more and more how it is exactly like twitter. I could care less about a new photo storage system, facebook can ***** off and die for all I care.
***** you Mark Zuckerberg.
4.8.2009
man, that was rude!
4.8.2009
OMG this system just lost my photo of a needle. How will I ever find it?!
4.8.2009
Wait for the kid in the ball pit to get poked by it and have his parents sue you?
4.8.2009
the name is such because finding a good picture of someone on facebook, is like, well, y’know
4.8.2009
***** you facebook!
4.8.2009
I knew this had to be written April 1st, facebook doesn’t make improvements.
4.8.2009
Finally a great way to store my [number greater than a thousand] pictures!
4.8.2009
sweet now we can stalk 50% faster!
4.8.2009
Nobody needs it.