Research Hacking – Searching for Sensitive Documents on FTP; Captchas and the Google Governor

If you want to find sensitive documents using Google search (documents containing information that someone, more or less, does not want revealed), I’ve found that in addition to targeting queries at specific domains and file types, an alternative and potent approach is to restrict your results to files residing on an FTP server.

The rationale: many FTP servers allow anonymous log-in and even more are indexed by Google, yet they are used for uploading, downloading, and storing files rather than serving pages, so they typically house more office-type documents (as well as software). Because limiting your searches to FTP servers also sharply reduces the overall number of results, choice keywords combined with a query that tells Google to return only files with “ftp://” but NOT “http://” or “https://” in the URL yield a high density of relevant results. This search type is easily executed:

[Screenshot: an example Google query restricted to FTP results]
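As a rough sketch of the operator combination described above (the keyword and file type here are placeholders, not the query from the screenshot):

"confidential" filetype:xls inurl:ftp -inurl:http -inurl:https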

A caveat you will run into before long with this method is that eventually Google will present you with a “captcha.” Many, many websites use captchas, and pretty much everyone who uses the internet has encountered one. The basic idea behind a captcha is to prevent people from using programs to send automated requests to a webserver. Captchas are a main tool in fighting spam, thwarting the bots that mine the internet for email addresses and other data, and that register for online accounts and other services en masse. The captcha presents the user with a natural-language problem to which they must provide an answer.

Google is also continuously updating its code to make it difficult to exploit Google “dorks,” queries using advanced operators similar to the one used above (but usually more technical and specific). Dorks are mostly geared toward penetration testers looking for web-application and other vulnerabilities, but the cracker’s tools can easily be adapted for open-source research.
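For a flavor of the genre (this is a generic, well-known pattern, not a query used elsewhere in this post), a simple dork hunting for database dumps sitting in open directory listings might look like:

intitle:"index of" "backup.sql"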

[Screenshot: the Google captcha prompt]

Unless you are in fact a machine (sometimes you’re a machine, in which case there are solutions), this should be easy to solve. Lately, however, instead of returning me to my search after I answer the captcha, Google has been sending me back to the first results page of my query, forcing me to start the browsing process over and, before long, to face another captcha. I’m calling it the Google Governor, as it seems to throttle searchers’ ability to employ high-powered queries.

The good news is that the workaround is really just smart searching. One thing you’ll notice while browsing your results is that dozens of files from the same, irrelevant site keep turning up. Eliminate these by adding -inurl:”websitenameistupid.com” (which tells Google NOT to return results with “websitenameistupid.com” in the URL). Further restrict your results by omitting sites in foreign domains (especially useful with acronym-based keyword searches): -site:cz -site:nk.
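Putting those refinements together with the earlier sketch, a trimmed-down query (keywords and domains are still placeholders) might look like:

"confidential" filetype:xls inurl:ftp -inurl:http -inurl:https -inurl:"websitenameistupid.com" -site:cz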

When you find an FTP site that looks interesting, copy and paste the URL into a client like FileZilla for easier browsing.
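If you would rather take a quick look before firing up a client, curl will list an FTP directory straight from the command line (the host and path here are hypothetical, and the server has to allow anonymous log-in):

$ curl ftp://ftp.example.com/pub/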

To give you an idea of the sensitivity of documents that can be found: one folder was titled “[Name] PW and Signature” and contained dozens of files with passwords as well as .crt, .pem, and .key files; another, titled “admin10,” contained the file “passwords.xls.” This was the site of a Department of Defense and Department of Homeland Security contractor, and the document contained the log-in credentials for bank accounts, utilities, and government portals. That particular document is of more interest to the penetration tester; for our purposes it serves as a gauge of the sensitivity of the gigabytes of files that accompanied it on the server. The recklessness of the uploader exposed internal details of dozens of corporations and their business with government agencies.

The hopefully sufficiently blurred “passwords.xls”

*As of this writing, the FTP server mentioned above is no longer accessible.

Perl Crawler Script “fb-crawl” Lets You Automate and Organize Your Facebook Stalking

While browsing for scripts that might make my often very high-volume web-mining for research less time-consuming and more automated, I came upon the following on Google Code:

fb-crawl.pl is a script that crawls/scrapes Facebook friends and adds their information to a database.
It can be used for social graph analysis and refined Facebook searching.

FEATURES

– Multithreaded
– Aggregates information from multiple accounts


This is very useful for social engineering and market research, and could also very easily find fans among the more unsavory Wall creepers. They don’t even have to be programming-competent, so most neck-bearded shiftless layabouts and of course Anons can do it. All you have to do is plug in your Facebook email address and a MySQL password (you can download and click-to-install MySQL with simple prompts if you don’t have it).
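For what it’s worth, here is a rough sketch of the prerequisites on a Debian-like system; the package and module names are my guesses rather than anything from the project’s documentation, so defer to the script’s own README:

$ sudo apt-get install mysql-server
$ sudo cpan DBI DBD::mysql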

EXAMPLES

Crawl your friends’ Facebook information, wall, and friends:
$ ./fb-crawl.pl -u email@address.com -i -w -f

Crawl John Smith’s Facebook information, wall, and friends:
$ ./fb-crawl.pl -u email@address.com -i -w -f -name 'John Smith'

Crawl Facebook information for friends of friends:
$ ./fb-crawl.pl -u email@address.com -depth 1 -i

Crawl Facebook information of John Smith’s friends of friends:
$ ./fb-crawl.pl -u email@address.com -depth 1 -i -name 'John Smith'

Extreme: Crawl friends of friends of friends of friends with 200 threads:
$ ./fb-crawl.pl -u email@address -depth 4 -t 200 -i -w -f

Users of the script can also aggregate information about relationship status by location or by school, essentially allowing stalkers to create automated queries for lists of potential victims.

MYSQL EXAMPLES

Find local singles:
SELECT `user_name`, `profile` FROM `info` WHERE `current_city` = 'My Current City, State' AND `sex` = 'Female' AND `relationship` = 'Single'

Find some Harvard singles:
SELECT `user_name`, `profile` FROM `info` WHERE `college` = 'Harvard University' AND `sex` = 'Female' AND `relationship` = 'Single'

And if a stalker wants to make an even handier database of GPS-located targets, there are plug-ins:

To load a plug-in, use the -plugins option:
$ ./fb-crawl.pl -u email@address -i -plugins location2latlon.pl
location2latlon.pl:
This plug-in adds the user’s coordinates to the database using the Google Geocoding API.
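Once the plug-in has populated coordinates, they can be queried like any other column. Here is a sketch of a bounding-box query; the column names `lat` and `lon` are hypothetical, so check the actual schema the plug-in creates:

SELECT `user_name`, `profile` FROM `info` WHERE `lat` BETWEEN 40.70 AND 40.80 AND `lon` BETWEEN -74.02 AND -73.93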

And as no stalker wants to terrorize someone age-inappropriate, they can sort by date of birth as well:

birthday2date.pl:
This plug-in converts the user’s birthday to MySQL date (YYYY-MM-DD) format.
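With birthdays normalized to a date column, an age filter becomes a one-liner. A sketch, assuming the converted column is simply named `birthday` (again, verify against the actual schema):

SELECT `user_name`, `profile`, `birthday` FROM `info` WHERE `birthday` BETWEEN '1988-01-01' AND '1995-12-31'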