Manual File Recovery Approach for NTFS


 

What is a file system?

A file system provides a mechanism for users to store data in a hierarchy of files and directories. It consists of structural and user data, organized so that the computer knows where to find them [1].

How do files get recovered despite being permanently deleted?

An operating system keeps track of where files are on a hard drive through references that tell it where each file begins and ends. When a file is deleted, only this reference is removed and the file’s space is marked as available; the contents themselves are not erased.

Why doesn’t the computer just erase files when you delete them?

Deleting a file’s reference and marking its space as available is an extremely fast operation. In contrast, actually erasing a file by overwriting its data takes significantly longer. For example, deleting a 10 GB file this way is near-instantaneous, whereas actually erasing its contents may take several minutes or more, just as long as writing 10 gigabytes of data to your hard drive.

Notes:

1. None of this applies to solid-state drives (SSDs). On a TRIM-enabled SSD (all modern SSDs support TRIM) [2], the blocks of a deleted file are typically erased soon after deletion and generally can’t be recovered. Flash cells can’t be overwritten in place: to write new data, the contents of the flash memory must first be erased.

2. File recovery differs from file carving: recovery relies on the metadata of the file system that created the file, whereas carving does not. [3]

Let's dig into how commercial file recovery software for Windows works at its core.

You’ll need the following:

  • Drive with the deleted file on it.
  • A host PC to perform the file recovery operation
  • A hex editor [4] (such as WinHex or HxD)
  • A second drive to copy the recovered data onto
  • The name of the deleted file

Connect the drive to your PC, fire up your hex editor, and you’re ready to begin.

Initial Steps

Open the disk in the hex editor (WinHex is used here).

Fig. 1. Opening the disk to recover data from in WinHex

There are three steps to the data recovery process using hex editors:

  1. Manually scanning the disk to identify deleted files (or entries).
  2. Identifying the cluster chain for the deleted file of interest.
  3. Recovering the clusters that contain the deleted file.

1. Scanning the NTFS volume

Search the drive in the hex editor for the name of the file that is no longer there. In this example, we’re looking for the PDF called "UEFI Spec 2.8B May 2020.pdf".

 

Fig. 2. File Record Header for the file in WinHex

The hex editor returns the MFT record that contains the name, as shown above. In the right-hand column you can just make out the file name as  U E F I S p e c 2 . 8 B M a y 2 0 2 0 . p d f  (Fig. 2). The characters appear spaced out because NTFS stores file names as UTF-16, two bytes per character.
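
The same search can be scripted. The sketch below is only an illustration, not what WinHex does internally: it encodes the file name as UTF-16LE, scans a byte buffer for it, and then looks backwards for the nearest "FILE" signature. The helper names and the synthetic buffer are made up for this example. A real disk would be read in chunks rather than held in memory, and the backward search could match unrelated text that happens to contain "FILE".

```python
from __future__ import annotations


def find_utf16_name(data: bytes, name: str) -> list[int]:
    """Return every byte offset where `name` occurs as UTF-16LE text."""
    needle = name.encode("utf-16-le")
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


def preceding_file_signature(data: bytes, hit: int) -> int:
    """Offset of the closest b"FILE" signature before `hit`, or -1 if none."""
    return data.rfind(b"FILE", 0, hit)


# Synthetic stand-in for a chunk of disk: a fake record with the name inside.
name = "UEFI Spec 2.8B May 2020.pdf"
sample = bytes(16) + b"FILE" + bytes(100) + name.encode("utf-16-le") + bytes(8)

for hit in find_utf16_name(sample, name):
    print(hex(hit), hex(preceding_file_signature(sample, hit)))
```

Searching for the raw ASCII form of the name would miss the record, which is why the name is encoded first.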

An MFT record has a predefined structure: a header followed by a set of attributes that describe the file or directory.

Each MFT record begins with a standard File Record Header (more details can be found at [5]). To summarize, the header is laid out as follows:

  • "FILE" identifier (4 bytes)
  • Offset to update sequence (2 bytes)
  • Size of update sequence (2 bytes)
  • $LogFile Sequence Number (LSN) (8 bytes)
  • Sequence Number (2 bytes)
  • Reference (hard link) count (2 bytes)
  • Offset to the first attribute (2 bytes)
  • Flags (2 bytes)
  • Real size of the FILE record (4 bytes)
  • Allocated size of the FILE record (4 bytes)
  • File reference to the base FILE record (8 bytes)
  • Next Attribute Id (2 bytes)

Among these header fields is one called Flags, located at bytes 22 and 23 of the File Record Header (offset 0x16). If its lowest bit is set (value 0x01), the record is “in use”, that is, not deleted. In our example the field is 0, which means that UEFI Spec 2.8B May 2020.pdf has been deleted.

At this point we also note the values of the cluster size, the compression unit size, the allocated size of the attribute, the real size of the attribute, and the data runs. We’ll need them for stage 2 of the recovery process.
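
As a minimal sketch of the header layout listed above, the fields can be unpacked with Python's struct module and the Flags field tested for the "in use" bit. The class and function names are chosen for this example, and the sample record is synthetic, not taken from the disk in the figures. The two-byte field at offset 0x14 is treated here as the offset to the first attribute, and the value 0x02 for the directory flag is included only for completeness.

```python
import struct
from typing import NamedTuple

HEADER_FORMAT = "<4sHHQHHHHIIQH"  # little-endian, no padding, 42 bytes
FLAG_IN_USE = 0x0001              # cleared when the file has been deleted
FLAG_DIRECTORY = 0x0002           # set when the record describes a directory


class FileRecordHeader(NamedTuple):
    signature: bytes              # b"FILE"
    usa_offset: int               # offset to update sequence
    usa_size: int                 # size of update sequence
    lsn: int                      # $LogFile sequence number
    sequence_number: int
    link_count: int
    first_attribute_offset: int
    flags: int                    # bytes 22-23 of the record
    real_size: int
    allocated_size: int
    base_record: int
    next_attribute_id: int


def parse_file_record_header(record: bytes) -> FileRecordHeader:
    """Unpack the fixed part of an MFT FILE record header."""
    header = FileRecordHeader(*struct.unpack_from(HEADER_FORMAT, record, 0))
    if header.signature != b"FILE":
        raise ValueError("not an MFT FILE record")
    return header


def is_deleted(header: FileRecordHeader) -> bool:
    """A record whose in-use bit is clear belongs to a deleted file."""
    return not (header.flags & FLAG_IN_USE)


# Synthetic header with Flags = 0, as in the deleted-file case described above.
sample = struct.pack(HEADER_FORMAT, b"FILE", 0x30, 3, 0, 2, 1, 0x38, 0,
                     0x1A0, 0x400, 0, 5)
print(is_deleted(parse_file_record_header(sample)))
```

Testing the bit rather than comparing the whole field to 1 keeps the check valid for directories, whose Flags value also carries the directory bit.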

2. Defining disk clusters

With the cluster size obtained in step 1, we rescan the drive, going through the file’s clusters until the selected clusters add up to the file size. In WinHex this can be done as shown below:





The NTFS file system gives each file a $DATA attribute that defines “data runs”, which in turn point to the location of the file clusters that need to be recovered. [5]

Before proceeding, you need to decode the data runs.

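In the example below, the $DATA attribute starts at offset 0x12588 (type code 80 00 00 00) and is 0x48 bytes long. The data run within it is the byte sequence 31 6E EB C4 04, beginning at offset 0x125C8.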

Offset      0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F

00012580   2E 00 70 00 70 00 74 00  80 00 00 00 48 00 00 00   ..p.d.f.Ђ...H...
00012590   01 00 00 00 00 00 04 00  00 00 00 00 00 00 00 00   ................
000125A0   6D 00 00 00 00 00 00 00  40 00 00 00 00 00 00 00   m.......@.......
000125B0   00 DC 00 00 00 00 00 00  00 DC 00 00 00 00 00 00   .Ü.......Ü......
000125C0   00 DC 00 00 00 00 00 00  31 6E EB C4 04 00 00 00   .Ü......1nëÄ....
000125D0   FF FF FF FF 82 79 47 11  00 00 00 00 00 00 00 00   ÿÿÿÿ‚yG.........
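
The values noted in step 1 can be read straight out of this dump. The sketch below assumes the usual layout of a non-resident attribute header (type, length, non-resident flag, VCN range, data run offset, compression unit, then the three 64-bit sizes). The function name and dictionary keys are made up for this example. The bytes are the 0x48 bytes of the attribute copied from offsets 0x12588 to 0x125CF above.

```python
import struct

# The $DATA attribute from the dump, offsets 0x12588 through 0x125CF.
ATTRIBUTE = bytes.fromhex(
    "80000000 48000000 "
    "01000000 00000400 00000000 00000000 "
    "6D000000 00000000 40000000 00000000 "
    "00DC0000 00000000 00DC0000 00000000 "
    "00DC0000 00000000 316EEBC4 04000000"
)

NONRESIDENT_FORMAT = "<IIBBHHHQQHH4xQQQ"  # 64 bytes, little-endian


def parse_nonresident_attribute(attr: bytes) -> dict:
    """Unpack a non-resident attribute header and slice out its data runs."""
    (attr_type, length, non_resident, _name_len, _name_off, _flags, _attr_id,
     first_vcn, last_vcn, run_offset, compression_unit,
     allocated, real, _initialized) = struct.unpack_from(NONRESIDENT_FORMAT, attr, 0)
    if not non_resident:
        raise ValueError("attribute is resident, so it has no data runs")
    return {
        "type": attr_type,
        "length": length,
        "cluster_count": last_vcn - first_vcn + 1,
        "compression_unit": compression_unit,
        "allocated_size": allocated,
        "real_size": real,
        "data_runs": attr[run_offset:length],
    }


info = parse_nonresident_attribute(ATTRIBUTE)
print(info["cluster_count"], info["allocated_size"], info["real_size"])
# Consistency check only; the authoritative cluster size is in the boot sector.
print(info["allocated_size"] // info["cluster_count"])
print(info["data_runs"].hex(" "))
```

For this record the allocated size divided by the cluster count gives 512 bytes per cluster, which agrees with the cluster size used in step 3.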

This is where things become more complicated:

  • The first byte (0x31) is the run header. Its low nibble (1) gives the number of bytes used for the length of the data run, and its high nibble (3) gives the number of bytes used for the starting cluster offset.
  • The next byte, 0x6E, is the length of the data run in clusters.
  • The following three bytes, EB C4 04, hold the starting cluster, stored in little-endian order.
  • By reversing the byte order [6], we find that the first cluster is 312555 (0x04C4EB).
  • Applying the run length identified above, we know that the 110 clusters (0x6E) starting at that cluster contain our PDF file.

To check that this is correct, observe that the next byte is 0x00, which marks the end of the run list.
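
The decoding steps above can be sketched as a small function. This is an illustration, not the code of any particular recovery tool, and the function name is made up. It splits each header byte into its two nibbles, reads the length and offset as little-endian integers, and stops at the 0x00 terminator. It also covers two cases that do not occur in this example and are stated here as assumptions about the general format: in a fragmented file each later offset is signed and relative to the previous run, and a run with no offset bytes is sparse.

```python
from __future__ import annotations


def decode_data_runs(runs: bytes) -> list[tuple[int | None, int]]:
    """Decode an NTFS run list into (start_cluster, cluster_count) pairs.

    start_cluster is None for a sparse run (no clusters on disk).
    """
    result = []
    pos = 0
    current_lcn = 0
    while pos < len(runs) and runs[pos] != 0x00:
        header = runs[pos]
        length_size = header & 0x0F   # low nibble: bytes in the length field
        offset_size = header >> 4     # high nibble: bytes in the offset field
        pos += 1
        if pos + length_size + offset_size > len(runs):
            raise ValueError("truncated run list")
        length = int.from_bytes(runs[pos:pos + length_size], "little")
        pos += length_size
        if offset_size == 0:
            result.append((None, length))
            continue
        delta = int.from_bytes(runs[pos:pos + offset_size], "little", signed=True)
        pos += offset_size
        current_lcn += delta          # offsets are relative to the previous run
        result.append((current_lcn, length))
    return result


# The run list from the example record: 31 6E EB C4 04 00
print(decode_data_runs(bytes.fromhex("316EEBC40400")))  # [(312555, 110)]
```

A file stored in several fragments simply yields several pairs, and each pair is then copied in turn.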

3. Recovering the cluster chain

With the cluster chain identified, the last task is to copy the “deleted” data to your other hard drive. Starting at the first cluster address identified in step 2 (312555), copy 110 clusters. First, though, you need to calculate the byte offset of the first cluster.

You do this by multiplying the cluster size (512 bytes) by the first cluster address:

512 * 312555 = 160028160

Converted to hexadecimal, this gives the offset that marks the start of your missing data, relative to the start of the NTFS volume: 0x0989D600

By copying 110 clusters (512 * 110 = 56320 bytes) from that offset to your second drive, you will have successfully recovered the “deleted” file from your NTFS partition.
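
The offset arithmetic and the copy itself can be sketched as follows. This assumes a raw image file of the NTFS volume rather than the live disk opened in WinHex. The function names, parameters, and paths are hypothetical. The `volume_start` parameter exists because cluster numbers count from the start of the volume, so an image of a whole disk needs the partition's starting byte added.

```python
from __future__ import annotations


def cluster_offset(lcn: int, cluster_size: int, volume_start: int = 0) -> int:
    """Byte offset of cluster `lcn`, counted from the start of the image."""
    return volume_start + lcn * cluster_size


def extract_run(image_path: str, out_path: str, lcn: int, cluster_count: int,
                cluster_size: int, real_size: int | None = None,
                volume_start: int = 0) -> int:
    """Copy one run of clusters from an image to `out_path`.

    If `real_size` is given, the copy stops there so that slack space in the
    last cluster is not included. Returns the number of bytes written.
    """
    remaining = cluster_count * cluster_size
    if real_size is not None:
        remaining = min(remaining, real_size)
    written = 0
    with open(image_path, "rb") as src, open(out_path, "wb") as dst:
        src.seek(cluster_offset(lcn, cluster_size, volume_start))
        while remaining > 0:
            chunk = src.read(min(remaining, 1 << 20))  # at most 1 MiB at a time
            if not chunk:
                break                                  # image ended early
            dst.write(chunk)
            remaining -= len(chunk)
            written += len(chunk)
    return written


offset = cluster_offset(312555, 512)
print(offset, hex(offset))  # 160028160 0x989d600
print(110 * 512)            # 56320
```

With the values from this example, a call would look like `extract_run("volume.img", "E:/recovered.pdf", 312555, 110, 512, real_size=56320)`, where both paths are placeholders and the output is written to a different drive from the one being recovered.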

To save the recovered data (WinHex):

a) Press the 'Control + N' keys (or click on the "File" menu and choose "New") to start a new temporary file.

b) Use either the 'Control + B' or 'Control + V' keys (it makes no difference whether you use "Paste write" or "Paste insert") to paste the block copied from the disk into the new ('Untitled') empty tab. (Or select either one from the "Edit" menu.)

c) Click the "OK" button in the 'Confirm' pop-up window to proceed (to insert/write the data).

d) Click on the "File" menu and choose "Save as...". In the pop-up window, navigate to the destination folder.

e) Click the "Save" button.


f) Open the file to check that the recovery succeeded.
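
Besides opening the file in a viewer, a quick sanity check is possible because a PDF normally begins with the marker "%PDF-" and ends with "%%EOF". The sketch below is a rough heuristic under that assumption, not a validation of the PDF structure. The function name is made up, and the path in the usage note is a placeholder.

```python
import os


def looks_like_pdf(path: str, tail_bytes: int = 1024) -> bool:
    """Heuristic check: '%PDF-' at the start and '%%EOF' near the end."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(5)
        f.seek(max(0, size - tail_bytes))  # jump to the last `tail_bytes` bytes
        tail = f.read()
    return head == b"%PDF-" and b"%%EOF" in tail
```

Calling `looks_like_pdf("E:/recovered.pdf")` on the saved file returns True when both markers are present. A False result suggests a wrong offset or that the clusters were overwritten after the deletion.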
 


 

References:

[1] Carrier, Brian. File System Forensic Analysis. Addison-Wesley Professional, 2005.

[2] https://en.wikipedia.org/wiki/Trim_(computing)

[3] https://resources.infosecinstitute.com/file-carving/

[4] https://en.wikipedia.org/wiki/Hex_editor

[5] https://docs.microsoft.com/en-us/windows/win32/devnotes/file-record-segment-header

[6] https://devopedia.org/byte-ordering


By:

Mr. Ayush Kumar Sharma (B.Tech. 4th Yr., IT, IIIT Bhopal)

Dr. Nitesh K Bharadwaj (Faculty, Deptt. of CSE, IIIT Bhopal)
