File Scanning
File scanning is the process of analyzing a file, which may be large, to find information about it. It is useful for uncovering hidden data, or simply for determining the type and structure of a file.
Tools
file
Deduce the file type from the headers.
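As a rough illustration of the idea behind `file`, the sketch below compares the first bytes of some data against a few known signatures. The `SIGNATURES` table and the function names are hypothetical and only cover the three signatures listed later in this document; the real tool relies on a far larger database and additional heuristics.

```python
# Signatures taken from the "File signatures" table in this document.
SIGNATURES = {
    b"\xff\xd8\xff": "JPEG image",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"PK": "ZIP archive",
}


def guess_type(data: bytes) -> str:
    """Return the description of the first signature that prefixes data."""
    for magic, description in SIGNATURES.items():
        if data.startswith(magic):
            return description
    return "unknown"


def guess_file_type(path: str) -> str:
    """Guess the type of a file on disk from its first bytes only."""
    # A few leading bytes are enough, so a large file is never read whole.
    with open(path, "rb") as handle:
        return guess_type(handle.read(16))


print(guess_type(b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"))  # PNG image
```

Short signatures such as `PK` can match unrelated data, so a result from a check like this is a hint rather than proof.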
binwalk
Look for embedded files in other files.
binwalk <file>            # List embedded files
binwalk -e <file>         # Extract embedded files
binwalk --dd=".*" <file>  # Extract all embedded files
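The core idea of listing embedded files can be sketched as a search for known signatures at every offset, not only at the start of the data. This is a simplified illustration and not how binwalk is actually implemented; `SIGNATURES` and `find_signatures` are hypothetical names.

```python
# Signatures taken from the "File signatures" table in this document.
SIGNATURES = {
    b"\xff\xd8\xff": "JPEG image",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"PK": "ZIP archive",
}


def find_signatures(data: bytes):
    """Return a sorted list of (offset, description) for every signature hit."""
    hits = []
    for magic, description in SIGNATURES.items():
        start = data.find(magic)
        while start != -1:
            hits.append((start, description))
            # Continue searching just past the current hit.
            start = data.find(magic, start + 1)
    return sorted(hits)


# Made-up sample: a PNG signature and a ZIP header hidden after other bytes.
blob = b"junk" + b"\x89PNG\r\n\x1a\n" + b"more" + b"PK\x03\x04"
for offset, description in find_signatures(blob):
    print(f"{offset:#x}  {description}")
```

A two-byte signature like `PK` will produce false positives in real data, which is one reason dedicated tools validate the structure that follows each hit.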
Alternatives: foremost, hachoir-subfile, …
strings
Extract strings from a file.
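A minimal sketch of the technique, assuming that a "string" means a run of at least four printable ASCII bytes (four is the default minimum length of `strings`). The function name `extract_strings` and the sample data are made up for illustration.

```python
import re


def extract_strings(data: bytes, min_length: int = 4):
    """Return runs of printable ASCII that are at least min_length bytes long."""
    # 0x20-0x7e covers the space character through '~'.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


sample = b"\x00\x01flag{hidden}\xff\xfeabc\x00readme.txt\x00"
print(extract_strings(sample))  # ['flag{hidden}', 'readme.txt']
```

The run `abc` is dropped because it is shorter than the minimum length.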
grep
Search for a string, or regex, in a file.
grep <string> <file>          # Search in a file
grep -r <string> <directory>  # Search recursively in a directory
hexdump
Display the hexadecimal representation of a file.
hexdump -C <file>  # Dump bytes with address and ASCII representation
hexdump <file>     # Dump bytes with address only
xxd -p <file>      # Dump only bytes
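To make the layout of such a dump concrete, here is a simplified sketch that prints an offset, the bytes in hexadecimal, and an ASCII column where non-printable bytes appear as dots. It approximates the `hexdump -C` layout but omits details such as the extra gap after the eighth byte; the `hexdump` function is a hypothetical helper.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format data as lines of offset, hex bytes and an ASCII column."""
    pad = width * 3 - 1  # width of a full hex column, used to align short rows
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        # Non-printable bytes are shown as '.' in the ASCII column.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  |{ascii_part}|")
    return "\n".join(lines)


header = b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"
print(hexdump(header))
# 00000000  89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44 52  |.PNG........IHDR|

# Plain hex digits only, comparable to `xxd -p`.
print(header.hex())
```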
yara
Scan a file with Yara rules to find (malicious) patterns. Rules can be found in the Yara-Rules repository.
Here is an example rule that detects a PNG signature anywhere in a file:
png.yar
rule is_png {
    strings:
        $png = { 89 50 4E 47 0D 0A 1A 0A }
    condition:
        $png
}
yara png.yar <file>     # Scan a file, outputs rule name if match
yara -s png.yar <file>  # Print the offset and the matched strings
File signatures
File signatures are bytes at the beginning of a file that identify the file type. These bytes are also called magic numbers (see "file signatures" on Wikipedia).
Most signatures are listed there, but the most common ones are:
| Hex signature | File type | Description |
| --- | --- | --- |
| FF D8 FF (???) | JPEG | JPEG image |
| 89 50 4E 47 0D 0A 1A 0A (?PNG) | PNG | PNG image |
| 50 4B (PK) | ZIP | ZIP archive |

For example, the first 16 bytes of a PNG file are usually b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR'
This data can be written to a file with:
echo -n -e "\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR" > png.sig
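The 16 bytes break down into the 8-byte PNG signature, a 4-byte big-endian chunk length, and the 4-byte chunk type `IHDR`. The sketch below splits them apart. `split_png_header` is a hypothetical helper and performs no validation beyond the signature check.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def split_png_header(header: bytes):
    """Split the first 16 bytes of a PNG into (signature, length, chunk type)."""
    signature = header[:8]
    if signature != PNG_SIGNATURE:
        raise ValueError("not a PNG signature")
    # The first chunk starts right after the signature:
    # a 4-byte big-endian length followed by a 4-byte type.
    length, chunk_type = struct.unpack(">I4s", header[8:16])
    return signature, length, chunk_type


header = b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"
signature, length, chunk_type = split_png_header(header)
print(length, chunk_type)  # 13 b'IHDR'
```

The `\r` in the byte string is the value 0x0D, which is why the IHDR chunk length reads as 13.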