Friday, May 8, 2015

Oops.

I made an accident yesterday.


What happened? When creating a matrix of zeros I accidentally typed matrix() instead of Matrix().

What's the difference? 4.8 terabytes versus less than one GB. I was creating a document-term-matrix of about 100,000 documents with a vocabulary of about 6,000,000 tokens. This is the thing with linguistic data: one little mistake is the difference between working on a Macbook Air with no fuss and something that would make a super computer choke. (Anyone want to get me a quote on hardware with 5 TB of RAM?)

1 comment:

  1. re: (Anyone want to get me a quote on hardware with 5 TB of RAM?)

    AWS EC2 r3.8xlarge (244GB RAM @ $2.80 per hour) x 21 = 5,124GB RAM @ $58.80 per hour

    ReplyDelete