Businesspersons information gathering

The need for the intelligence cycle, a basic workflow, and some tradecraft on how to investigate people and organisations between OSINT and HUMINT.

14 min read · Apr 6, 2021

Foreword: I am still looking for a job but — hey! It’s beautiful to learn along the way

During my job search for an intelligence role, I have been tasked with the most varied requests and assignments, sometimes as a 3-day test rush, other times for just a few hours. That was enough to finally understand that a clear methodology matters much more than in-depth knowledge of a particular geographic area or industry (not that such knowledge is irrelevant, but a good thinking system comes first: there is a limit to the information your brain can store, so it seems more efficient to invest in methodology).
All this really applies when investigating people. Humans are wonderful: huge servers of experiences, passions, interests, and contacts. If you start digging, there are so many pivot points along the way that you absolutely need to keep your focus and respect some kind of workflow.

So, this is OSINT From Scratch, and this blog post is about planning for, collecting, analysing, and exploiting publicly available information (PAI) about business persons (wow, this sounds professional!).

The intelligence cycle

Good planning saves time, allows for control and checks, favours prioritisation, and helps to balance assumptions and biases. Again, the intelligence cycle comes into play.

Joint Chiefs of Staff — http://www.dtic.mil/doctrine/new_pubs/jp2_0.pdf

If I have to investigate an individual, I first need to understand the rules and the setting: what kind of information is required? To what end? What are the rules of engagement? To what extent do I have to dig into that human being’s story?
These questions tend to be taken for granted, especially when, like me, you are at the very beginning and just want to get into the action: you start searching, and wow, look at this, and this, and in half an hour you have dozens of open browser tabs, files, screenshots, and not the faintest idea of the process you have followed.

Meaning: it is impossible to reconstruct the methodology, track the process, or support and document your reasoning. It is therefore imperative to establish a workflow that keeps track of the process and follows a certain logic, in as tidy a fashion as possible.

So, the intelligence cycle:

1- PLANNING AND DIRECTION

a) I have to investigate the business profile, activities, and networks of Jane Doe, executive director of GreatFirm London Limited. What is a good strategy? What kind of information is required? What are the rules of engagement? Can I use sock puppets? On which platforms? Do I have to stick to passive reconnaissance and gather purely PAI, or am I allowed to connect and send friend requests to open up a private profile and gather further information? Can I even get in touch with the individual? Am I allowed to establish some kind of rapport, online or even offline? (Okay, this opens a totally different scenario. I understand there are rules, but I am really trying to unpack all potential avenues and, if you have direct experience of these cases, I would love to chat: drop me a message!)

b) What is going to be my process? If I have a full name, I can try some searches for public records; if I only have an email, maybe there are different options; a university degree? Maybe an Alumni page, and so on.
How am I going to proceed? Will I start from media exposure, searching for news, or from social media profiles and content? Do I need to study, map, and approach the target’s network of people?


2- COLLECTION

a) What are my operational security (OPSEC) settings? How am I protecting my identity, my customer, and my organisation, and how do I avoid leaving traces? How am I going to store evidence and document the process? Well, you can use a text editor to take notes, mind maps to connect the dots, and a secure folder for files.
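To make the "store evidence and document the process" part concrete, here is a minimal sketch in Python. It assumes each captured page or file gets logged with a UTC timestamp and a SHA-256 hash, so that later I can show the stored copy has not changed. The function names, record fields, and the idea of a one-record-per-line log file are my own illustration, not an established standard.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_evidence_record(source_url, content, note):
    """Describe one captured item (content is raw bytes)."""
    return {
        # When the item was collected, in UTC
        "collected_at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "source_url": source_url,
        # Fingerprint of the exact bytes that were saved
        "sha256": hashlib.sha256(content).hexdigest(),
        # Why the item was collected and how I got there
        "note": note,
    }


def append_record(log_path, record):
    """Append the record as one JSON line, so earlier entries are never rewritten."""
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")


# Hypothetical capture: the URL and the bytes are placeholders
record = make_evidence_record(
    "https://example.com/news/greatfirm",
    b"<html>saved copy of the page</html>",
    "Found via a news search on the company name",
)
```

Calling `append_record("evidence_log.jsonl", record)` after each capture would build a chronological trail that can be reread when writing the report.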

b) What are my methodological priorities? Do I want to exhaust all the personal information I can find on Jane Doe, or do I prefer to start from her current and past company appointments, develop a timeline, and map her network?

3- PROCESSING AND EXPLOITATION

How do I pivot from one piece of information to another?

If I have Jane Doe’s date of birth and address, do I search for official documentation or try to attribute a social media profile to her? If I have a social handle, like @DoeJLondon, do I search for other accounts, or try to find her email? If I know of a related company officer who has been involved in a certain interesting event, do I shift my focus to that?
Of course, this phase is not fully separated from the collection of information: any piece of information tends to become a PIVOT POINT to find further details, change perspective, or understand another piece of information. Exploiting information to get other information is a dynamic process: I can use an address to find the neighbourhood and identify the residence of one of Doe’s contacts who represents a threat, or, if the rules allow it, I can even forge a credible persona, craft an approach, and actively engage the target (again, this scenario goes beyond open-source intelligence and becomes something else: consider the beauty of OSINT in support of HUMINT, or human intelligence operations!).
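Since every finding comes from an earlier one, the chain of pivots can be recorded as it happens. The sketch below is one possible way to do that in Python; the `PivotTrail` class and the labels are hypothetical names I made up for illustration. Each finding remembers the finding it came from, so the path back to the starting point can always be rebuilt.

```python
from dataclasses import dataclass, field


@dataclass
class PivotTrail:
    # Maps each finding to the finding it was derived from (None for a starting point)
    parent: dict = field(default_factory=dict)

    def add(self, finding, came_from=None):
        """Record a finding and, optionally, the earlier finding that led to it."""
        if came_from is not None and came_from not in self.parent:
            raise ValueError(f"unknown origin: {came_from!r}")
        # Keep the first recorded origin if the same finding shows up again
        self.parent.setdefault(finding, came_from)

    def path_to(self, finding):
        """Return the chain of pivots from the starting point to this finding."""
        chain = []
        while finding is not None:
            chain.append(finding)
            finding = self.parent[finding]
        return list(reversed(chain))


trail = PivotTrail()
trail.add("name: Jane Doe")
trail.add("handle: @DoeJLondon", came_from="name: Jane Doe")
trail.add("account on another platform", came_from="handle: @DoeJLondon")

print(trail.path_to("account on another platform"))
```

Requiring the origin to exist before a finding is added keeps the trail free of loops, which is what makes the backward walk in `path_to` safe.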

4- ANALYSIS AND PRODUCTION

Again, pure open-source information is just that — information (Information is not the new oil! What you need is intelligence: check out this on-the-spot podcast by Janes).
In isolation, it is pure noise, a buzzing noise that does not convey any meaning. It is like having a single pixel — that does not tell much about what you are really looking at, does it? Yet, to develop that picture, information must be analysed to assess validity and relevance, infer causal relations and produce an analytic judgement.

https://www.merriam-webster.com/dictionary/inference

Verification is a key step of this process. Quite often, PAI is hard to verify, because sources have questionable levels of confidence; almost always, your conclusions are based on partial information. Intelligence without some predictive leverage is not intelligence, is it? Therefore, assessing the validity (is the information legit?) and relevance (to what degree do I need this information?) of sources and information is key to controlling your results and establishing margins of error in your intelligence product.
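One way to keep validity and relevance explicit is to score every finding on both and sort the pile before writing anything. This is only a sketch: the 1 to 5 scales, the thresholds, and the three buckets are assumptions of mine, not a recognised grading system.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    claim: str
    validity: int   # assumed scale: 1 = unverified, 5 = confirmed by independent sources
    relevance: int  # assumed scale: 1 = peripheral, 5 = directly answers the requirement


def triage(findings, min_validity=3, min_relevance=3):
    """Split findings into: ready to report, worth verifying, and parked as noise."""
    report, verify, parked = [], [], []
    for finding in findings:
        if finding.relevance < min_relevance:
            parked.append(finding)      # not needed, whatever its validity
        elif finding.validity < min_validity:
            verify.append(finding)      # needed, but not yet trustworthy
        else:
            report.append(finding)      # needed and sufficiently supported
    return report, verify, parked


# Hypothetical findings with made-up scores
findings = [
    Finding("Listed as director in the company registry", validity=5, relevance=5),
    Finding("Anonymous forum post alleging fraud", validity=1, relevance=5),
    Finding("Favourite football team", validity=4, relevance=1),
]
report, verify, parked = triage(findings)
```

The middle bucket is the interesting one: relevant claims that still need corroboration are where verification effort should go first.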

5- DISSEMINATION

Reporting findings is something I am currently struggling with. The power of communication should not be underestimated: the most striking conclusion, if conveyed poorly, could be overlooked or dismissed. But hey, I am a beginner, this is advanced stuff, right?

Jane Doe of GreatFirm London Limited (GFL Ltd): A Business Person Investigation

Please note that this is a hypothetical process only: there are so many alternative paths one could follow! Here the main takeaway should be that any piece of information is a pivot point to another.

Let’s say I am provided with a full name, Jane Doe, and I know she ran into some trouble in her previous appointment at GreatFirm London Limited (GFL Ltd). I could build my search strategy on the three pieces of initial information I have:

Full name
Company
Location

This is a good starting point to break the ice and gather a wider picture of the person of interest.

From this very simple starting point, Boolean search strings can be used to gather initial PAI on Jane Doe:

We can evaluate media exposure by checking local news media outlets:

Of course, we would need to filter out some noise; let’s say that our focus on “London” returns mixed results, covering both London, UK, and London, Ontario, Canada:

Of course, there is more than that: if we know that the target manages businesses in China, we would use a more targeted search engine like Baidu; we can go deeper with a tailored search strategy by capitalising on great directories of search engines such as Search Engine Colossus.

We can then start searching for official documents and public records by submitting queries like this one:

“Jane Doe” +London ext:xlsx

Sometimes, government bodies and agencies use spreadsheets to store data about individuals and organisations, and it can happen that they are publicly available on the deep side of the web. This is more common with NGOs and non-profit organisations, where the person of interest could be a contributor, donor, or volunteer for a certain project at a local level.

With the same approach, if the target has a personal blog or website, or has posted a CV in public spaces such as professional networking platforms, you can try to find the document:

“Jane Doe” +(resume OR “curriculum vitae” OR cv) ext:pdf

If we have access to the target’s public LinkedIn profile, information about education is very useful. In the image above, for instance, a CV has been found on a .edu domain. This means that we could tailor our search to any information (not just a resume, but publications, awards, interviews, testimonials, or anything else) that relates to Jane Doe’s institution. For instance:

site:lse.ac.uk “Jane Doe” +essay

site:lse.ac.uk “Jane Doe” +award
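Typing these strings by hand gets error-prone quickly, so here is a small Python sketch that assembles them. The `build_query` helper and its parameter names are hypothetical; it simply reproduces the syntax used in the queries above, and support for operators such as `+` and `ext:` varies between search engines, so the output should be checked against the engine you actually use.

```python
def _quote(term):
    """Wrap multi-word terms in double quotes so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term


def build_query(phrase, required=(), any_of=(), exclude=(), site=None, ext=None):
    """Assemble a search string in the same style as the examples in this post."""
    parts = []
    if site:
        parts.append(f"site:{site}")                 # restrict to one domain
    parts.append(f'"{phrase}"')                      # exact phrase, e.g. a full name
    parts.extend(f"+{_quote(term)}" for term in required)
    if any_of:
        alternatives = " OR ".join(_quote(term) for term in any_of)
        parts.append(f"+({alternatives})")           # at least one of these terms
    parts.extend(f"-{_quote(term)}" for term in exclude)
    if ext:
        parts.append(f"ext:{ext}")                   # restrict to a file extension
    return " ".join(parts)


spreadsheets = build_query("Jane Doe", required=["London"], ext="xlsx")
resumes = build_query("Jane Doe", any_of=["resume", "curriculum vitae", "cv"], ext="pdf")
awards = build_query("Jane Doe", required=["award"], site="lse.ac.uk")
uk_only = build_query("Jane Doe", required=["London"], exclude=["Ontario", "Canada"])

for query in (spreadsheets, resumes, awards, uk_only):
    print(query)
```

The last call shows how the London, UK versus London, Ontario noise could be reduced by excluding terms rather than adding more required ones.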

Additionally, we could pivot that information on social media and have a look at alumni groups.

If Jane Doe is a graduate from the London School of Economics (LSE), we can search something like this on Facebook:

In this specific case, the group is private, meaning that we cannot passively gather information. We would need to actively send a request (crossing the OSINT line!). It really depends on the type of investigation being conducted.

Publicly available information (PAI) can be found through different techniques. If we have a full name, we can try people directories to retrieve phone numbers and addresses, such as the Phone Book directory:

Or even on specialised platforms such as RocketReach:

If, instead, we have a phone number (or an email address), we can run a reverse lookup on that piece of information to find a full name (we could get lucky and find the complete one, with a middle name, or a physical address; this, of course, works beautifully with individuals based in the US, where public data are abundant and rich).

After a screening of media exposure and educational background, pivoting from information about a university to an alumni group or recent contacts, or from a news article to the name of a related person of interest, we need to exploit the power of company public records.

Of course, the extent of data we can find really depends on the company, the industry, and the country (different country, different disclosure).

For our Jane Doe of GFL Ltd, we can use Companies House UK:

The British public registry of companies is among the richest and most valuable; here we can search for companies and officers and find information on beneficial owners, as well as repositories of official documentation.

This means that we can start gathering information on the business, on officers, and on shareholders. Where are they from? Does anyone have ties to an opaque jurisdiction? Are official documents displaying unusual appointments or changes of ownership?

We can now pivot from the registration number to official documentation and records, from the address to other information, and check the history of the company along with its changes of name.
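Once officer data has been copied out of the registry by hand, questions such as the ones about opaque jurisdictions and unusual appointments can be turned into a simple screening pass. This Python sketch assumes a record layout that I invented for illustration (`name`, `country`, `appointed`, `resigned`); the watchlist is whatever list your own requirements define, and the 90-day threshold is an arbitrary placeholder.

```python
from datetime import date


def screen_officers(officers, watchlist, min_tenure_days=90):
    """Return (name, reason) pairs for officers that deserve a closer look."""
    flags = []
    for officer in officers:
        # Flag ties to a jurisdiction on the analyst-supplied watchlist
        if officer["country"] in watchlist:
            flags.append((officer["name"], f"jurisdiction on watchlist: {officer['country']}"))
        # Flag appointments that ended unusually quickly
        resigned = officer.get("resigned")
        if resigned is not None:
            tenure = (resigned - officer["appointed"]).days
            if tenure < min_tenure_days:
                flags.append((officer["name"], f"short tenure: {tenure} days"))
    return flags


# Placeholder data: names, countries, and dates are invented
officers = [
    {"name": "Officer A", "country": "United Kingdom",
     "appointed": date(2018, 1, 10), "resigned": None},
    {"name": "Officer B", "country": "Examplestan",
     "appointed": date(2020, 3, 1), "resigned": date(2020, 4, 15)},
]
flags = screen_officers(officers, watchlist={"Examplestan"})
```

A flag is only a prompt to read the underlying filings, not a conclusion about the person.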

All this information can then be run through other databases to collect additional data points, e.g. OpenCorporates, which is useful for exploring the company’s events and looking out for unusual patterns:

On the OCCRP Aleph database, then, we can search for matches in leaked records and watch out for red flags in connections between Jane Doe’s company and other actors:

The International Consortium of Investigative Journalists has also made available the Offshore Leaks Database, which it describes as containing “information on more than 785,000 offshore entities that are part of the Paradise Papers, the Panama Papers, the Offshore Leaks and the Bahamas Leaks investigations. The data links to people and companies in more than 200 countries and territories. The real value of the database is that it strips away the secrecy that cloaks companies and trusts incorporated in tax havens and exposes the people behind them. This includes, when available, the names of the real owners of those opaque structures.”

Let’s say we find a Czech investor linked to GFL Ltd with prior legal issues for fraud or involvement in a certain local scandal. Using country-specific tools, especially business registries, is always a good approach to narrow down searches that would be unfeasible by using generic searches.
For instance, we can dig into the Czech business registry:

and uncover a network of companies showing business patterns that we could explore in more depth:

At any point during our process, according to the needs of the phase we are in (collecting, processing, or exploiting information about Jane Doe), we can pivot from an associate to an address, or from a business partner to a company located abroad. As mentioned before, collection and analysis are not really compartmentalised; rather, they interact in a feedback mechanism.

Mapping the organogram of a business also helps to find connections. LinkedIn is the main resource to resort to, as it allows for quite targeted searches by selecting a company, then looking for employees, and tailoring the search based on keywords, industry, and other filters. Each individual, linked to a certain degree to our Jane Doe, is a gateway to further information.

Even if the company does not have a proper LinkedIn page, we can still open the list of individuals claiming to work for the firm; in this example, there are 22 employees (actually 24, when we launch the list).

We can then tailor our search by location or connections; each one of the employees could be relevant to the investigation. Mapping out the organogram of a target business can help us understand what needs to be researched and what is noise to be ruled out of the process.

Going a step further, we can start reasoning in terms of prior appointments and companies.
Let’s explain that: if we are looking for information on GreatFirm London Ltd, connected to our target, it is better to look for previous employees. There is a chance they could be willing to disclose some information if approached. Not all work relations end perfectly well, and there is potential for gathering important information from individuals who have left an organisation. Again, this is just a way of thinking about pivoting, and of course, again, it exceeds the boundaries of OSINT.

We can even check if a person of interest from a company of interest is related to Jane Doe:
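If the relationships collected so far are written down as pairs, that check becomes a path search. Below is a minimal Python sketch using a breadth-first search over an undirected graph; the entity names and link labels are invented placeholders, and the function names are my own.

```python
from collections import defaultdict, deque


def build_graph(links):
    """Turn (entity, entity, label) triples into an undirected adjacency map."""
    graph = defaultdict(set)
    for left, right, _label in links:
        graph[left].add(right)
        graph[right].add(left)
    return graph


def shortest_link(graph, start, goal):
    """Breadth-first search: the shortest chain of entities joining start and goal."""
    if start == goal:
        return [start]
    previous = {start: None}          # also serves as the set of visited entities
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in graph.get(current, ()):
            if neighbour in previous:
                continue
            previous[neighbour] = current
            if neighbour == goal:
                # Walk back from the goal to the start to rebuild the chain
                path = []
                node = goal
                while node is not None:
                    path.append(node)
                    node = previous[node]
                return path[::-1]
            queue.append(neighbour)
    return None                       # no known connection


# Placeholder relationships, as they might be noted during collection
links = [
    ("Jane Doe", "GreatFirm London Ltd", "director"),
    ("GreatFirm London Ltd", "Investor A", "shareholder"),
    ("Investor A", "Company B", "director"),
    ("Person C", "Company B", "employee"),
]
graph = build_graph(links)
print(shortest_link(graph, "Jane Doe", "Person C"))
```

A returned chain only says that a documented route exists between two names; whether the relationship matters is still an analytic judgement.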

We can exploit this information to search social media accounts, where usually individuals tend to lower their defences and let work-related details leak outside.
We can check shared pictures, engagements, and comments to find out relationships between Jane and other persons. We can spot intertwined affiliations, business relationships, even discrepancies about lifestyle, locations, habits.
In (not so) rare cases, individuals post their passport, flight tickets, or other sensitive documents from which PAI can be retrieved and become pivot points again.

When you browse Instagram and find former Australian Prime Minister Tony Abbott's passport number (mango.pdf.zone)

For instance, if you have got a mobile number, you could add it to your contacts and check their WhatsApp profile; or you can check LinkedIn and Facebook events to further understand habits, movements, and interests in case you are allowed to establish rapport and engage with Jane to gather additional information, or to pivot from her to another objective according to your intelligence requirements.

From information to intelligence

After collecting, processing, and pivoting information on Jane Doe and GreatFirm London Ltd, we need to turn that information into something more: answers. Insights. Intelligence. Information alone would be just a compilation of data that cannot speak and certainly cannot tell any useful story.
So, if we find out that Jane Doe has been in touch with a certain person from a company that has experienced issues in another country, we can flag that. Then, we can use our knowledge of her business history to infer some conclusions about her trustworthiness. If our rules of engagement allow it, we can even leverage our OSINT to inform a HUMINT operation and engage the target directly. For instance, we could impersonate an employer and capitalise on personal details to trigger interest and establish rapport. And of course, by remaining within the boundaries of OSINT and passive collection, we can still provide a significant and meaningful picture of GreatFirm London Limited and Jane Doe, and form a judgement to help a decision-maker act.

In any case, a process, a workflow, is essential to implement an established set of rules, figure out planning and direction, operate collection and processing, pivot from one point to another, keep track of the steps, select the available options, filter the noise out, and then provide a judgement on something or someone to inform further decisions.