When doing research on a subject that has some measure of obscurity by design, such as the fusion center in Philadelphia, the Delaware Valley Intelligence Center (DVIC), I often find the only way to fill in the gaps is to “data-mine” for documents. I use quotes, because data-mining strictly involves aggregating and analyzing more fragmented bits of *data, I deal more in *information, and data-mining usually applies to a much more intensive level of computation applied to a much larger corpus to be processed than I will discuss here.
A more appropriate term would be “forensic indexing,” in that I am applying basic methods of digital forensics like metadata extraction to a general knowledge management system for large collection of documents, too large realistically to open one by one. And I’ve just made it sound more organized than it usually is.
In the case of the DVIC what this meant was using an application which automates queries to metasearch engines as well as enumerating a specified domain to find relationships and other information. I used FOCA. I saved the documents that were the result of this search in separate folders according to which domain I had chosen for the search. I collected around 1800 documents.
I then run a simple command line program called pdfgrep, I used the command pdfgrep -n -i “dvic” *.pdf to bring me a list displaying every line in every pdf file in the same directory containing the phrase “dvic,” tagged with file name, page of line, and ignoring case. One such query returned:
As you might imagine if you have followed the Declaration’s coverage, I was a bit confused. I went to the corresponding folder on my desktop and opened the file in my reader:
This document is titled “Nebraska Information Analysis Center,” another fusion center which it just so happens is missing a document from the fusion center association website. Where metadata plays in, and why I had missed this by manually “googling” until now, is in how FOCA searches for documents – by file name which is in the metadata of the document which gives its file path on the machine that stores it, its uri– something you can sometimes do by typing inurl:[term] into Google, but then you would have to know the exact name of the file to get relevant results. The name of this file is “Delaware-Valley-Intelligence-Center-Privacy-PolicyMar-2013.” It would have been very difficult to come up with this by educated accident.
So while there are still serious questions about the date gap between beginning a “cell” and submitting a policy, and concerns about a lack of full time privacy officer among others, it seems that everyone that was sure that a policy was completed and was approved by the DHS was quite correct, and I’d like to thank them for adding accurate memory to their graciously-given time to discuss the subject. It seems that a March draft was labeled somewhere in its life as the Nebraska Information Analysis Center’s policy perhaps at the National Fusion Center Associate website, where the “comprehensive” list is found, by whomever didn’t link it to the analysis center website.
This is only one elucidation among many from recent developments, the fruits of fresh approaches, and as mentioned, more documents to parse. Read the Declaration