Originally Posted on July 1, 2013:
Several weeks ago, Edward Snowden “introduced” the world to the domestic spying techniques of the NSA and the now household term PRISM. I hope that you read the word “introduced” with the appropriate sarcastic air quotes attached because I think all of us at some level or another understood and continue to understand that the US government, among other national entities, takes several liberties with our personal data. At the end of the day, without some level of encryption or tactical diversion, everything we send out on a wire to the rest of the world via a computer is exposed and potentially archived for posterity. That being said, the media crush surrounding Snowden and the NSA has started several interesting conversations worth exploring.
Let’s start with everyone’s new favorite word, “PRISM”. Over the past several weeks, Steve Gibson and other blogging, tweeting, and podcasting security experts have tackled this word and its potential meaning and have come up with a highly plausible explanation. The general consensus is that PRISM is not an acronym of any kind, but instead a real-world description of the activity taking place. In the real world, prisms split light and display that light in its many forms. Also in the real world, the vast majority of all Internet traffic travels through fiber optic cable across the backbones of the world’s Tier 1, 2, and 3 Internet service providers, or ISPs. Fiber optic networks literally carry light signals from point A to point B, where they can be converted at the router level back into the Internet traffic we all love and upon which we so heavily rely. The NSA’s PRISM program is most probably splitting that fiber optic light at the ISP level, allowing the agency to collect, analyze, and store in near real time all of the Internet traffic flowing through that particular ISP at that particular time.
Many people, after reading this, would ask the question “But I thought Snowden said the NSA was getting information directly from Google or Facebook or Yahoo?” Fair question, but think about the impact of a program like PRISM and how it would enable attempts to gather sensitive information from those companies. As I stated, the NSA most probably uses PRISM to split and monitor Internet traffic at the ISP level. In the world today there are only 12 to 15 major Tier 1 ISPs, and the majority of those are located in the United States. As far back as 2004, and again in 2006 and 2007, stories came out on the Internet and in limited national media outlets about secret rooms controlled by the NSA at Tier 1 ISPs like AT&T WorldNet. These are not just conspiracy theory blog entries. In at least one case, we have legal depositions from technical personnel at AT&T describing one of the rooms in question as it existed at their major Internet POP (point of presence) in San Francisco. By targeting the ISP facilities immediately upstream of companies like Facebook or Google, the NSA can build a fairly accurate image of the type of traffic hitting the servers at those companies. Even the encrypted traffic generated by these companies and the users of their services creates patterns that can be interpreted and data mined. One well-executed FISA warrant at the ISP level seems ever so much more effective than thousands of challenged warrants submitted to a Google or Facebook.
The next question you are asking, or at least should be asking, is “But these are FISA warrants…how can they be used against US citizens?” That is a good and important question with a very simple, yet complicated answer. FISA stands for the Foreign Intelligence Surveillance Act, a law intended to empower US law enforcement against terrorists and others who seek to do harm to our country. No one really likes terrorists and no one likes people harming our country, so that was not a difficult law to pass. As I mentioned before, there are only a small number of Tier 1 ISPs in the world today and the majority are in the United States, so if I am the NSA and I want to target terrorists using the Internet, then those US-based ISPs are a great place to start. See, I told you it was a simple answer. Now, here’s the complicated part: all or most of the Internet traffic of US citizens is also carried by those same ISPs.
One interesting tidbit of information that arose in the Snowden story arc is the supposed fact that the NSA has a facility somewhere in Utah that houses 5 zettabytes of data on Internet traffic and behavior. Now, I have been an IT professional for nearly 16 years and I had to go look up what a zettabyte of data actually is. Most of you are reading this article from a computer or smart device that stores data on media measured in gigabytes. A decent smartphone stores between 32 and 64 gigabytes of data. Most laptops have hard drives storing 500 to 750 gigabytes of data. The next step up in storage terms is the terabyte. A terabyte, in simplified terms, is approximately 1000 gigabytes. Some of the newest computers on the market now come with hard drives measuring 1 or 2 terabytes. Large corporate SANs (storage area networks) are measured in terabytes. The next step up is a petabyte. A petabyte is approximately 1000 terabytes. Petabytes are ridiculously large storage units. Large Internet caching engines and centralized backup facilities measure their storage in petabytes. This is where my general knowledge and usage of storage terms ends. I regularly use the terms gigabyte and terabyte and occasionally have a reason to throw out petabyte, but the next two terms are not part of my day-to-day vocabulary. After petabyte comes the exabyte. An exabyte is approximately 1000 petabytes. After exabyte we finally arrive at zettabyte. A zettabyte is approximately 1000 exabytes. As an aside, I find it quite funny that as I type this in MS Word on my MacBook Air, Word knows how to properly spell gigabyte and petabyte and exabyte, but it has apparently never seen zettabyte. Strangely, I feel a little better about myself.
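For anyone who would rather see that ladder of units written out than take my word for it, here is a minimal sketch in Python. It assumes the simplified decimal convention used above, where each step is exactly 1000 times the one before it. The names `UNITS` and `convert` are made up for illustration and do not come from any library.

```python
# Storage units in ascending order. Assumption: decimal steps of 1000,
# matching the simplified "approximately 1000" used in the text.
UNITS = ["gigabyte", "terabyte", "petabyte", "exabyte", "zettabyte"]


def convert(amount, from_unit, to_unit):
    """Convert an amount between two of the units listed above."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    if steps >= 0:
        # Going down the ladder: the number gets bigger.
        return amount * 1000 ** steps
    # Going up the ladder: the number gets smaller.
    return amount / 1000 ** (-steps)


print(convert(1, "zettabyte", "exabyte"))   # one step down
print(convert(1, "zettabyte", "gigabyte"))  # four steps down
```

Going from zettabytes all the way down to gigabytes is four steps, so the second line prints a 1 followed by twelve zeros.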
Let’s get back to the NSA and that supposed facility in Utah housing 5 zettabytes of data. I need to help put zettabytes in perspective. As I mentioned in the last paragraph, some of the latest computers to hit the market tout hard drives of 1 to 2 terabytes in size. That’s a big hard drive. It would hold roughly 1,000,000 to 2,000,000 of your nicest pictures, or hundreds of full-length movies. A zettabyte is approximately 1 billion terabytes. That’s a 1 followed by nine zeros. If the reports are right, the NSA has 5,000,000,000 terabytes of storage in Utah, housing Internet history from many major ISPs. And the NSA not only houses this data but has also devised a way to successfully analyze and data mine all of this information. That is rather impressive. Dwayne Melancon, CTO for Tripwire, posted on his blog that, if nothing else, this activity by the NSA can be seen as one of the first major successes of Big Data analytics, and I would have to agree. What they have accomplished is the Holy Grail of data warehousing and forensics. Unfortunately, because of the nature of the program, this technology will most likely never make its way into the private sector, at least not intentionally.
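To check that arithmetic, here is a quick back-of-the-envelope sketch in Python. It takes the reported 5 zettabyte figure at face value and assumes decimal units (1000 per step); the 2 terabyte drive is simply my stand-in for a large consumer hard drive, and the variable names are illustrative only.

```python
# Three steps of 1000 separate a terabyte from a zettabyte:
# terabyte -> petabyte -> exabyte -> zettabyte.
TB_PER_ZB = 1000 ** 3

claimed_zb = 5  # the figure reported for the Utah facility (unverified)
drive_tb = 2    # assumption: a large consumer hard drive

total_tb = claimed_zb * TB_PER_ZB
drives_needed = total_tb // drive_tb

print(f"{total_tb:,} terabytes")                  # 5,000,000,000 terabytes
print(f"{drives_needed:,} two-terabyte drives")   # 2,500,000,000 two-terabyte drives
```

In other words, you would need two and a half billion of those top-of-the-line hard drives to match the claimed capacity.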
Now that we understand what a zettabyte is and how much data is involved, let’s go back to the complicated half of the question surrounding the use of a FISA warrant by the NSA. The warrant is legitimate in that the NSA is targeting foreign intelligence and, from what we can tell, is succeeding in collecting and analyzing it. Unfortunately, they are also collecting a tremendous amount of domestic intelligence, and we are forced to take their word for it when they say that data is not being used against US citizens.
Trust is at the heart of the Snowden story and is the reason so many people are so upset to learn our government has all of this data. Do we trust our government to not use this data against its citizens? Do we trust our government to protect this data? Who do we trust? I tend to not get too upset about all of this because I learned a long time ago to not trust the Internet or anything I placed on the wire. I tend to use the Internet with the mindset that I have nothing to lose because I try not to expose anything worth losing.
I have spent far too long in this post explaining and trying to understand the nature of PRISM, so I am going to pause here and come back in my next entry with a little more insight, including my take on how you can better protect yourself online and why it shouldn’t really matter.