How to open very large text files on Windows

Some graphical tools and two command line tips

I recently had to search for occurrences of a string in some very large text files produced by a “file carving” run with Autopsy.

On Windows I usually use Notepad++, which provides a convenient ‘Find in Files’ feature, but this otherwise great tool struggles to open files larger than 2 GB.

However, there are other options on Windows:

  • gVim: you need to be familiar with vi/Vim commands to use it, and it loads the entire file into memory.
  • 010 Editor: opens giant (think 5 GB) files in binary mode and allows you to edit and search the text.
  • Liquid XML Community Edition: opens and edits TB+ files instantly; supports UTF-8, Unicode, etc.
  • SlickEdit: useful IDE that can open very large files.
  • Emacs: must be compiled in 64-bit mode; a 32-bit build has a low maximum buffer size limit.
  • glogg: read only; allows searching with regular expressions.
  • PilotEdit: loads the entire file into memory first.
  • HxD: hex editor, good for large files; portable version available.
  • LogExpert: smoothly opens log files larger than 6 GB.
  • FileSeek: It can find text strings, or match regular expressions.

Furthermore, if you are comfortable with the command line, there are some console solutions built into Windows:

  • The more command might be good enough:
Displays output one screen at a time.

MORE [/E [/C] [/P] [/S] [/Tn] [+n]] < [drive:][path]filename
command-name | MORE [/E [/C] [/P] [/S] [/Tn] [+n]]
MORE /E [/C] [/P] [/S] [/Tn] [+n] [files]

    [drive:][path]filename  Specifies a file to display one
                            screen at a time.

    command-name            Specifies a command whose output
                            will be displayed.

    /E      Enable extended features
    /C      Clear screen before displaying page
    /P      Expand FormFeed characters
    /S      Squeeze multiple blank lines into a single line
    /Tn     Expand tabs to n spaces (default 8)

            Switches can be present in the MORE environment
            variable.

    +n      Start displaying the first file at line n

    files   List of files to be displayed. Files in the list
            are separated by blanks.

    If extended features are enabled, the following commands
    are accepted at the -- More -- prompt:

    P n     Display next n lines
    S n     Skip next n lines
    F       Display next file
    Q       Quit
    =       Show line number
    ?       Show help line
    <space> Display next page
    <ret>   Display next line
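
If a scripting language is available, the idea behind the +n switch (jumping to a given line without loading the whole file) can be sketched in a few lines of Python. This is only an illustration, not how MORE is implemented: the function name read_page, its parameters, and the default page size of 25 lines are assumptions made for this example.

```python
from itertools import islice


def read_page(path, start_line=1, page_size=25, encoding="utf-8"):
    """Return up to page_size lines, starting at the 1-based start_line.

    The file is streamed line by line, so memory use does not depend
    on the total file size (only on the length of the lines returned).
    """
    first = max(start_line, 1) - 1  # convert to a 0-based index
    with open(path, "r", encoding=encoding, errors="replace") as handle:
        # islice skips the leading lines lazily instead of storing them.
        page = islice(handle, first, first + page_size)
        return [line.rstrip("\r\n") for line in page]
```

Calling read_page with a hypothetical file name and start_line=1000000 would still read through the first million lines to find the starting point, but it never holds more than one page in memory.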

There is also a Windows built-in program called findstr.exe with which you can search within files:

Searches for strings in files.
FINDSTR [/B] [/E] [/L] [/R] [/S] [/I] [/X] [/V] [/N] [/M] [/O] [/P] [/F:file]
        [/C:string] [/G:file] [/D:dir list] [/A:color attributes] [/OFF[LINE]]
        strings [[drive:][path]filename[ ...]]
  /B         Matches pattern if at the beginning of a line.
  /E         Matches pattern if at the end of a line.
  /L         Uses search strings literally.
  /R         Uses search strings as regular expressions.
  /S         Searches for matching files in the current directory and all
             subdirectories.
  /I         Specifies that the search is not to be case-sensitive.
  /X         Prints lines that match exactly.
  /V         Prints only lines that do not contain a match.
  /N         Prints the line number before each line that matches.
  /M         Prints only the filename if a file contains a match.
  /O         Prints character offset before each matching line.
  /P         Skip files with non-printable characters.
  /OFF[LINE] Do not skip files with offline attribute set.
  /A:attr    Specifies color attribute with two hex digits. See "color /?"
  /F:file    Reads file list from the specified file(/ stands for console).
  /C:string  Uses specified string as a literal search string.
  /G:file    Gets search strings from the specified file(/ stands for console).
  /D:dir     Search a semicolon delimited list of directories
  strings    Text to be searched for.
  [drive:][path]filename
             Specifies a file or files to search.
Use spaces to separate multiple search strings unless the argument is prefixed
with /C.  For example, 'FINDSTR "hello there" x.y' searches for "hello" or
"there" in file x.y.  'FINDSTR /C:"hello there" x.y' searches for
"hello there" in file x.y.

Regular expression quick reference:
  .        Wildcard: any character
  *        Repeat: zero or more occurrences of previous character or class
  ^        Line position: beginning of line
  $        Line position: end of line
  [class]  Character class: any one character in set
  [^class] Inverse class: any one character not in set
  [x-y]    Range: any characters within the specified range
  \x       Escape: literal use of metacharacter x
  \<xyz    Word position: beginning of word
  xyz\>    Word position: end of word
For full information on FINDSTR regular expressions refer to the online Command
Reference.

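For example, to search every .txt file in the current directory and its subdirectories for the phrase "Login failed" (the /C: prefix matters: without it, findstr would match lines containing either "Login" or "failed"):

findstr /s /c:"Login failed" *.txt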
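
For comparison, a recursive literal phrase search over .txt files can be sketched in Python. This is a minimal sketch, not a replacement for findstr: the function name find_phrase and its parameters are hypothetical, files are assumed to be UTF-8 (undecodable bytes are replaced), and each file is read one line at a time so it is never loaded whole into memory.

```python
from pathlib import Path


def find_phrase(root, phrase, pattern="*.txt", ignore_case=False):
    """Yield (path, line_number, line) for every line containing phrase.

    root is searched recursively for files matching pattern. The phrase
    is treated as a literal string, not as a regular expression.
    """
    needle = phrase.lower() if ignore_case else phrase
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        with open(path, "r", encoding="utf-8", errors="replace") as handle:
            # Iterating over the handle streams the file line by line.
            for number, line in enumerate(handle, start=1):
                haystack = line.lower() if ignore_case else line
                if needle in haystack:
                    yield path, number, line.rstrip("\r\n")
```

Iterating over find_phrase(".", "Login failed") yields one tuple per matching line. One caveat: a carved file with no line breaks at all is a single huge "line", so in that case reading fixed-size chunks would be the safer design.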

Do you know of other tools? Tips are welcome!

The ‘HoeflerText’ font wasn’t found? Beware, it’s a trap!

A new malware campaign targets Chrome users

NeoSmart Technologies recently identified a malicious campaign that spreads through legitimate, but compromised, websites:

Today while browsing a (compromised) WordPress site that shall remain unnamed, I came across a very interesting “hack” that was pulled off with a bit more finesse than most of the drive-by-infection attempts.


Malware analysis, my own list of tools and resources

A constantly updated list — Last update: August 2, 2018

During my daily analysis and research work, I often discover useful new tools.
I have collected them in this list, which I update periodically.

Enjoy!


Detection

  • AnalyzePE — Wrapper for a variety of tools for reporting on Windows PE files.
  • chkrootkit — Linux rootkit detector.
  • Rootkit Hunter — Detect Linux rootkits.
  • Detect-It-Easy — A program for determining types of files.
  • hashdeep — Compute digest hashes with a variety of algorithms.
  • Loki — Host based scanner for IOCs.
  • MASTIFF — Static analysis framework.
  • MultiScanner — Modular file scanning/analysis framework
  • nsrllookup — A tool for looking up hashes in NIST’s National Software Reference Library database.
  • PEV — A multiplatform toolkit to work with PE files, providing feature-rich tools for proper analysis of suspicious binaries.
  • totalhash.py — Python script for searching in TotalHash.cymru.com database.
  • TrID — File identifier.
  • YARA — Pattern matching tool for analysts.
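
As a small illustration of what a hashing tool such as hashdeep automates, the sketch below computes several digests of one file in a single pass using Python's standard hashlib module. It is not hashdeep's implementation: the function name file_digests, the default algorithm list, and the 1 MiB chunk size are assumptions chosen for this example.

```python
import hashlib


def file_digests(path, algorithms=("sha1", "sha256"), chunk_size=1024 * 1024):
    """Return {algorithm: hex digest} for the file at path.

    The file is read in fixed-size binary chunks, so even very large
    samples are hashed without loading them into memory.
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

The resulting hex digests can then be looked up in services that index samples by hash, such as several of the online scanners listed in the next section.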

Online scanners and sandboxes

  • NVISO ApkScan — Dynamic analysis of APKs
  • APK Analyzer — Dynamic analysis of APKs
  • AndroTotal — Online analysis of APKs against multiple mobile antivirus apps
  • AVCaesar — Online scanner and malware repository
  • Cryptam — Analyze suspicious office documents
  • Cuckoo Sandbox — Open source sandbox and automated analysis system
  • Malwr — Free analysis with an online Cuckoo Sandbox instance
  • DeepViz — Multi-format file analyzer with machine-learning classification
  • detux — A sandbox developed to do traffic analysis of Linux malware and capture IOCs
  • Document Analyzer — Analysis of DOC and PDF files
  • DRAKVUF — Dynamic malware analysis system.
  • File Analyzer — Free dynamic analysis of PE files
  • firmware.re — Unpacks, scans and analyzes firmware packages
  • Hybrid Analysis — Online malware analysis tool
  • IRMA — An asynchronous and customizable analysis platform for suspicious files
  • Joe Sandbox — Deep malware analysis.
  • Jotti — Online AV scanner
  • Limon — Sandbox for analyzing Linux malware
  • Malheur — Automatic sandboxed analysis of malware behavior
  • MASTIFF Online — Online static malware analysis
  • Metadefender.com — Scan a file, hash or IP address for malware
  • PDF Examiner — Analyse suspicious PDF files
  • SEE — “Sandboxed Execution Environment”, a framework for building test automation in secured environments
  • URL Analyzer — Dynamic analysis of URL files
  • VirusTotal — Online analysis of malware samples and URLs
  • NoDistribute — Scan files with over 35 anti-viruses.
    The results of the scans are never distributed.

Deobfuscation

  • Balbuzard — Analysis tool for reversing obfuscation
  • de4dot — .NET deobfuscator and unpacker
  • FLOSS — Tool to automatically deobfuscate strings from malware binaries
  • NoMoreXOR — Guess a 256-byte XOR key using frequency analysis
  • PackerAttacker — Hidden code extractor for Windows malware
  • unpacker — Automated malware unpacker for Windows malware
  • unxor — Guess XOR keys using known-plaintext attacks
  • VirtualDeobfuscator — Reverse engineering tool for virtualization wrappers
  • JS Beautifier — JavaScript unpacking and deobfuscation
  • JS Deobfuscator — Deobfuscation tool for JavaScript
  • XORBruteForcer — A Python script for brute forcing single-byte XOR keys
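
To make the idea of brute forcing a single-byte XOR key concrete, here is a minimal Python sketch. It is not the code of XORBruteForcer or unxor: the function names are hypothetical, and it assumes you already know a short piece of the expected plaintext, for example a fragment of the DOS stub message found in PE files.

```python
def xor_bytes(data, key):
    """XOR every byte of data with a single-byte key (0-255)."""
    return bytes(byte ^ key for byte in data)


def brute_force_single_byte_xor(data, known_plaintext):
    """Try all 256 single-byte keys against data.

    Returns a list of (key, decoded) pairs whose decoded output
    contains known_plaintext. More than one key can match in
    principle, so every candidate is returned.
    """
    hits = []
    for key in range(256):
        decoded = xor_bytes(data, key)
        if known_plaintext in decoded:
            hits.append((key, decoded))
    return hits
```

Because XOR is its own inverse, the same xor_bytes function both obfuscates and recovers the data. The search space is only 256 keys, which is why single-byte XOR offers no real protection.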

Reverse Engineering and Debugging

  • angr — Platform-agnostic binary analysis framework
  • bamfdetect — Identifies and extracts information from bots and malware
  • BARF — Open source multiplatform Binary Analysis and Reverse engineering Framework.
  • binnavi — Binary analysis IDE for reverse engineering
  • Capstone — Disassembly framework for binary analysis and reversing
  • codebro — Web based code browser with basic code analysis.
  • dnSpy — .NET assembly editor, decompiler and debugger
  • Evan’s Debugger (EDB) — Modular debugger with a Qt GUI
  • Fibratus — Windows kernel exploration and tracing tool
  • GDB — The GNU debugger
  • GEF — GDB Enhanced Features, for exploiters and reverse engineers
  • hackers-grep — Utility to search for strings in PE executables
  • IDA Pro — Windows disassembler and debugger
  • Immunity Debugger — Debugger for malware analysis
  • ltrace — Dynamic analysis tool for Linux executables
  • strace — Dynamic analysis tool for Linux executables
  • objdump — Static analysis tool for Linux binaries
  • OllyDbg — Debugger for Windows executables
  • PANDA — Platform for Architecture-Neutral Dynamic Analysis
  • PEDA — Python Exploit Development Assistance for GDB
  • pestudio — Static analysis tool for Windows executables
  • plasma — Interactive disassembler for x86/ARM/MIPS
  • PPEE (puppy) — PE file inspector.
  • Process Monitor — Advanced monitoring tool for Windows programs
  • Pyew — Python tool for malware analysis
  • Radare2 — Reverse engineering framework
  • ROPMEMU — Framework to analyze, dissect and decompile complex code-reuse attacks
  • SMRT — Sublime Malware Research Tool, a plugin for Sublime Text 3 focused on malware analysis.
  • Triton — A dynamic binary analysis (DBA) framework
  • Udis86 — Disassembler library and tools
  • Vivisect — Python tool for malware analysis
  • x64dbg — Debugger for Windows

Memory Forensics

  • Volatility — Advanced memory forensics framework.
  • DAMM — Differential Analysis of Malware in Memory, built on Volatility
  • evolve — Web interface for the Volatility Memory Forensics Framework
  • FindAES — Find AES encryption keys in memory
  • Muninn — A script to automate portions of analysis using Volatility, and create a readable report
  • Rekall — Memory analysis framework (from a Volatility fork).
  • TotalRecall — Script based on Volatility for automating various malware analysis tasks
  • WinDbg — Kernel debugger for Windows systems

Packet Analysis

  • PacketTotal — Online engine for analyzing .pcap files and visualizing the network traffic within, useful for malware analysis and incident response. My review
  • NetworkTotal — Online analysis of pcap files to detect viruses, worms, trojans and malware.
  • Network Miner — A Network Forensic Analysis Tool (NFAT) for Windows
  • Wireshark — Widely-used network protocol analyzer.

Website Analysis

  • Desenmascara.me — Tool to retrieve metadata from websites
  • Dig — Online dig and other network tools
  • dnstwist — Domain name permutation engine for detecting typo squatting, phishing and corporate espionage
  • IPinfo — Gather information about an IP or domain by searching online resources
  • TekDefense Automator — OSINT tool for gathering information about URLs, IPs, or hashes
  • Machinae — OSINT tool for gathering information about URLs, IPs, or hashes
  • mailchecker — Cross-language temporary email detection library
  • SenderBase — Search for IP, domain or network owner
  • SpamCop — IP based spam block list
  • SpamHaus — Block list based on domains and IPs
  • Sucuri SiteCheck — Website Malware and Security Scanner
  • URLQuery — URL Scanner
  • Malzilla — Analyze malicious web pages.
  • Whois — DomainTools free online whois search
  • Zscaler Zulu — Zulu URL Risk Analyzer
  • Firebug — Firefox extension for web development.
  • Java Decompiler — Decompile and inspect Java apps
  • Java IDX Parser — Parses Java IDX cache files
  • JSDetox — JavaScript malware analysis tool
  • jsunpack-n — JavaScript unpacker that emulates browser functionality
  • Krakatau — Java decompiler, assembler, and disassembler
  • RABCDAsm — ActionScript Bytecode Disassembler
  • swftools — Adobe Flash decompiler.
  • xxxswf — Analysis tool for Flash files
  • Spidermonkey — Mozilla’s JavaScript engine, for debugging malicious JS
  • PunkSpider — Web application vulnerability search engine. My review

Resources