June 24, 2021

Metadata forensics, when files can speak and reveal the truth

Ironhack - Changing The Future of Tech Education


Metadata is one of those words that is tiptoeing into our daily vocabulary, mainly because of the exponentially growing amount of data being generated and stored every minute of every day. Did you know that the amount of data stored on the Internet doubles every two years? (Yikes! That's a lot of data!) How are we going to sort through all that without losing time, getting a few headaches along the way and, at worst, losing important data completely? This is where metadata comes in.

But first, what is metadata anyway?

To be clear, metadata is not the content itself; instead, it describes the content of an object or piece of information.

One of the simplest ways to put it is "data about data", "a set of data used to describe and represent an information object" or even "documentation that describes the stored data". For example, an email has written content, or "information", within it, while its metadata would be the time it was sent, the sender, the subject, etc.
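To make the email example concrete, here is a minimal sketch that splits a message into its metadata and its content using Python's standard `email` module. The message text, the addresses and the `split_email` helper are all made up for illustration.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical raw message, invented for this example.
RAW_EMAIL = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Quarterly report\n"
    "Date: Thu, 24 Jun 2021 09:30:00 +0000\n"
    "\n"
    "Hi Bob, the report is attached.\n"
)


def split_email(raw):
    """Return (metadata, content) for a raw email string."""
    msg = message_from_string(raw)
    # The headers are the metadata: they describe the message.
    metadata = {
        "sender": msg["From"],
        "recipient": msg["To"],
        "subject": msg["Subject"],
        "sent_at": parsedate_to_datetime(msg["Date"]),
    }
    # The body is the content itself.
    content = msg.get_payload()
    return metadata, content


metadata, content = split_email(RAW_EMAIL)
for key, value in metadata.items():
    print(f"{key}: {value}")
print("content:", content.strip())
```

Notice that you could learn who wrote to whom, about what and when without ever reading the body of the message.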

But there are different kinds of metadata that make the system complete and operable. Let us run you through three different types of metadata.

#1 Descriptive

Descriptive metadata is the basic information: who, what, when and where. Think of it as the plaque next to a piece of art, or the description attached to a file. It helps people know what they are looking at, so the description changes depending on the contents of the object or piece of information.

Types of descriptive metadata:

  • Time and date of creation

  • Program or processes used for the creation of the data

  • Purpose of the data

  • Creator or author of the data

  • Location on a device where the data was created

  • Technical standards used

  • File size

  • Data quality

  • Source of the data

  • Modifications or programs used to modify the file
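Several items in the list above can be read straight from the file system. The sketch below uses Python's `os.stat` to collect a few of them for one file. The `describe_file` helper and the sample file name are hypothetical, and creation time is left out on purpose because it is not reported consistently across operating systems.

```python
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def describe_file(path):
    """Collect a few descriptive metadata fields for one file."""
    info = os.stat(path)
    return {
        "name": Path(path).name,
        "extension": Path(path).suffix,
        "size_bytes": info.st_size,
        # Timestamps are converted to timezone-aware UTC datetimes.
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "accessed": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
    }


# Demo on a throwaway file so the example does not depend on your disk.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "holiday_notes.txt")
    with open(sample, "w", encoding="utf-8") as handle:
        handle.write("hello world")
    description = describe_file(sample)

for key, value in description.items():
    print(f"{key}: {value}")
```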

#2 Structural 

Structural metadata defines how data is organised so that it fits into a larger system of other objects or information sets. It describes what the fields mean, so that relationships can be established between many files and they can be organised and used accordingly.

#3 Administrative

Finally, there is administrative metadata: information about the history of the data or object, such as owners, rights, licences and permissions. This is particularly helpful for information management.
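On most operating systems, part of this administrative metadata is stored alongside every file. As a rough sketch, the hypothetical `administrative_metadata` helper below reads the owner and permissions of a file with Python's `os.stat` and `stat.filemode`. It assumes a POSIX-style system such as Linux or macOS; on Windows the owner IDs and permission string are far less informative.

```python
import os
import stat
import tempfile


def administrative_metadata(path):
    """Read ownership and permission details for one file."""
    info = os.stat(path)
    return {
        "owner_id": info.st_uid,   # numeric user ID of the owner
        "group_id": info.st_gid,   # numeric group ID
        # Human-readable permission string in the style of `ls -l`.
        "permissions": stat.filemode(info.st_mode),
    }


# Demo on a throwaway file with explicitly set permissions.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "contract.txt")
    with open(sample, "w", encoding="utf-8") as handle:
        handle.write("confidential")
    # Owner may read and write, group may read, everyone else has no access.
    os.chmod(sample, 0o640)
    admin = administrative_metadata(sample)

print(admin)
```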

So Word files, songs, videos and images, for example, all carry information about their origins, creation and use.

So, why all the fuss about metadata?

One of the main problems with the exponential growth of data is how it is treated and stored. If data isn't described properly, it becomes significantly harder for its users to retrieve or recover it. Descriptions therefore need to represent the data accurately so that current tools can find it efficiently and effectively. Think about it: we have all been there… quickly saving a file without labelling it properly, then spending hours trying to find it or maybe never seeing it again... forever lost in the data abyss (oh, the heartbreak!).

Experts who study information description, search and retrieval point out that the best way to avoid this problem may be to create well-planned, well-designed metadata systems and tools for users. This would allow information stored on computers to be processed and exchanged over networks more effectively, particularly data available on the Internet. Electronically stored data could then be accessed and retrieved regardless of format, whether text, image, sound, video or web page, helping people find the exact information they are searching for and avoiding headache and heartbreak!

So, what's metadata's role in forensics?

Now that we have briefed you on what metadata is and its various forms, are you ready to put your detective hat on and get your magnifying glass out? Because there's an even more specific field of use: forensic metadata. Think of it as electronic evidence, the breadcrumbs that lead to the main culprit or suspect! Forensic metadata has been the key to cracking cases in investigations around the world, as vital information that reveals something major can be hidden in a tiny file!


Forensic metadata in use

Put simply, metadata allows digital or computer forensic investigators to understand the "traces" and the history of an electronic file. These digital traces are fragile and need to be properly preserved. Think of physical evidence at a crime scene and the level of care required to avoid cross-contamination, missed clues or tampering. Metadata must be treated in the same way.
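One common way to show that a file has not been touched since it was collected is to record a cryptographic hash of it and compare again later. The article does not prescribe this technique, so treat the sketch below as one illustrative approach; `sha256_of_file` and `is_unchanged` are hypothetical helper names.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files do not fill up memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_digest):
    """Compare the file's current hash with the one recorded earlier."""
    return sha256_of_file(path) == recorded_digest


# Demo: record a hash, then check it before and after a small edit.
with tempfile.TemporaryDirectory() as folder:
    evidence = os.path.join(folder, "evidence.txt")
    with open(evidence, "wb") as handle:
        handle.write(b"original contents")
    recorded = sha256_of_file(evidence)
    untouched = is_unchanged(evidence, recorded)

    with open(evidence, "ab") as handle:
        handle.write(b" plus a sneaky edit")
    after_edit = is_unchanged(evidence, recorded)

print("before edit:", untouched)
print("after edit:", after_edit)
```

Even a one-character change produces a completely different hash, which is why a recorded digest is a handy seal on a piece of digital evidence.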

Here are examples of some metadata that may be of interest to a criminal investigation:

  • File names and extensions, along with their creation, modification and access dates

  • History of executions, failures and the number of reads and writes of records

  • All the information stored within a document, including hidden information

  • Evidence of collaboration, such as who else worked on a file
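Hidden document information is often easier to reach than you might expect. A .docx file, for example, is a ZIP archive that normally contains a `docProps/core.xml` part holding fields such as the author, the last person who modified the file and a revision counter. The sketch below builds a tiny stand-in archive in memory (not a complete Word document, and with invented names and dates) and reads those fields back with Python's standard library. `build_sample_archive` and `read_core_properties` are hypothetical helpers.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core properties part of Office documents.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Invented sample values, for illustration only.
SAMPLE_CORE_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    f'<cp:coreProperties xmlns:cp="{NS["cp"]}"'
    f' xmlns:dc="{NS["dc"]}" xmlns:dcterms="{NS["dcterms"]}">'
    "<dc:creator>Alice</dc:creator>"
    "<cp:lastModifiedBy>Bob</cp:lastModifiedBy>"
    "<cp:revision>7</cp:revision>"
    "<dcterms:created>2021-06-01T08:00:00Z</dcterms:created>"
    "<dcterms:modified>2021-06-24T09:30:00Z</dcterms:modified>"
    "</cp:coreProperties>"
)


def build_sample_archive():
    """Create an in-memory ZIP holding only the core properties part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", SAMPLE_CORE_XML)
    return buffer.getvalue()


def read_core_properties(source):
    """Read author and revision details from a path or file-like object."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))

    def text_of(tag):
        element = root.find(tag, NS)
        return element.text if element is not None else None

    return {
        "creator": text_of("dc:creator"),
        "last_modified_by": text_of("cp:lastModifiedBy"),
        "revision": text_of("cp:revision"),
        "created": text_of("dcterms:created"),
        "modified": text_of("dcterms:modified"),
    }


properties = read_core_properties(io.BytesIO(build_sample_archive()))
for key, value in properties.items():
    print(f"{key}: {value}")
```

A creator and a last editor with different names is exactly the kind of collaboration evidence the list above mentions. To try this on a real document, pass its path to `read_core_properties` instead of the in-memory sample.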

Metadata can even help authenticate electronic evidence or reveal when evidence has been falsified or "doctored". Investigators, whether they work for national security forces, as computer experts, at investigation firms or in the security departments of large corporations, need versatile tools that are fast and safe to use. Such tools help them produce tests and reports they can stand behind, so that when they request restricted access to specific files, they already know those files are relevant to their case.

We like to take inspiration from Hany Farid, a computer scientist and digital forensics expert known as the "Sherlock Holmes of the Instagram era". Journalists, courts, intelligence agencies and the FBI come to him to sort real images from fake ones, as it is becoming harder and harder to tell the difference. He states that "the ability to manipulate digital content has accelerated". This acceleration could present a real public threat, as even public figures can fall victim to "deep fake" videos or photos. He approaches this work in many ways, using a variety of new tools, but one early clue that an image may have been falsified is the number of times it has been saved or compressed. Metadata therefore helps reveal whether an image has been manipulated, and that is just the beginning!

If you want to be as badass as Hany Farid and many others or would like to learn more about this super interesting topic, check out our Workshop Webinar below!

We love keeping you informed about the gigantic possibilities in the "professions of the future" and will keep doing so, as we do not want you to miss any of the golden career opportunities that are opening up in front of us right now.

As you can probably tell, cybersecurity is a huge topic with lots of ground to cover. If you wanna keep on learning, check out our cybersecurity bootcamp, which you can of course take, learn from and enjoy remotely!

