Understanding Assembly


Assembly language is a low-level programming language that closely corresponds to machine code, using mnemonics to represent CPU instructions. It provides direct control over hardware and memory, making it essential for tasks requiring granular analysis and manipulation. Assembly language is foundational for cybersecurity because it enables deep introspection and manipulation of software behavior. Mastery of assembly equips professionals to reverse engineer binaries, dissect malware, and develop exploits, bridging the gap between high-level abstractions and hardware-level execution. This low-level expertise is crucial for both defending systems and understanding adversary tactics.

x86

CISC (Complex Instruction Set Computer)

Here are some really good explanations of registers, instructions, the stack and calling conventions:

https://www.cs.virginia.edu/~evans/cs216/guides/x86.html

https://www.eecg.utoronto.ca/~amza/www.mindsec.com/files/x86regs.html

https://sonictk.github.io/asm_tutorial/

https://github.com/sonictk/asm_tutorial/tree/master

images/1022-1.png

CPU instructions to memory: MOV, PUSH, POP, LEA (LEA only computes an address; it does not read or write memory)

CPU instructions to I/O devices: IN, INS, INB, OUT, OUTS, OUTB

https://redirect.cs.umbc.edu/~cpatel2/links/310/slides/chap11_lect08_IO1.pdf

Registers:

General registers EAX EBX ECX EDX

Segment registers CS DS ES FS GS SS

Index and pointers ESI EDI EBP EIP ESP

Indicator EFLAGS

images/1022-2.png

General registers

As the title says, general registers are the ones we use most of the time. Most instructions operate on these registers. They can all be broken down into 16-bit and 8-bit registers.

32 bits : EAX EBX ECX EDX

16 bits : AX BX CX DX

8 bits : AH AL BH BL CH CL DH DL

The “H” and “L” suffixes on the 8-bit registers stand for high byte and low byte. With this out of the way, let’s see the main use of each one.

EAX,AX,AH,AL : Called the Accumulator register
          It is used for I/O port access, arithmetic, interrupt calls, etc.

EBX,BX,BH,BL : Called the Base register
          It is used as a base pointer for memory access
          Gets some interrupt return values

ECX,CX,CH,CL : Called the Counter register
          It is used as a loop counter and for shifts
          Gets some interrupt values

EDX,DX,DH,DL : Called the Data register
          It is used for I/O port access, arithmetic, some interrupt calls
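
As a rough illustration of how the shorter names alias parts of the same 32-bit register, the following Python sketch splits a value the way the CPU exposes it. The helper name split_register is made up for this note; it only models the bit layout, not real register access.

```python
def split_register(eax: int) -> dict:
    """Return the sub-register views of a 32-bit register value."""
    eax &= 0xFFFFFFFF            # keep only 32 bits
    ax = eax & 0xFFFF            # AX is the low 16 bits of EAX
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,  # high byte of AX
        "AL": ax & 0xFF,         # low byte of AX
    }

views = split_register(0x12345678)
print({name: hex(value) for name, value in views.items()})
```

The same layout applies to EBX, ECX and EDX with their B, C and D names.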

Indexes and pointers

Index and pointer registers hold the offset part of an address. They have various uses, but each register has a specific function. They are sometimes used with a segment register to point to a far address (in a 1 MB range). The registers with an “E” prefix are the 32-bit forms, normally used in protected mode.

ES:EDI EDI DI : Destination index register
           Used for string and memory array copying and setting, and
           for far pointer addressing with ES

DS:ESI ESI SI : Source index register
           Used for string and memory array copying

SS:EBP EBP BP : Stack Base pointer register
           Holds the base address of the current stack frame

SS:ESP ESP SP : Stack pointer register
           Holds the address of the top of the stack

CS:EIP EIP IP : Instruction Pointer
           Holds the offset of the next instruction
           It cannot be written directly; only control-flow instructions such as JMP, CALL and RET change it
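
The "far address in a 1 MB range" mentioned above comes from real-mode addressing, where the segment is multiplied by 16 and added to the offset. A minimal Python sketch, assuming real mode only (protected mode resolves segments through descriptor tables instead); the function name is hypothetical:

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    linear = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)
    return linear & 0xFFFFF      # wrap at 1 MB, as with only 20 address lines

# 0xB800:0x0010 -> 0xB8000 + 0x10
print(hex(real_mode_address(0xB800, 0x0010)))  # 0xb8010
```

Several segment:offset pairs can name the same byte, which is why the offset registers are always read together with their segment.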

Instructions

Arithmetic and logic instructions: ADD, SUB, INC, DEC, IMUL, IDIV, AND, OR, XOR, NOT, NEG, SHL and SHR

They are easy to explain from a logic gate and electronic circuit perspective.
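
To make the fixed-width behaviour of these instructions concrete, here is a small Python sketch that emulates three of them on 32-bit values. The function names are invented for this note, and only the carry and zero flags are modelled.

```python
MASK32 = 0xFFFFFFFF

def add32(a: int, b: int) -> tuple:
    """Emulate ADD on 32-bit operands: return (result, carry_flag, zero_flag)."""
    total = (a & MASK32) + (b & MASK32)
    result = total & MASK32          # the register keeps only the low 32 bits
    return result, total > MASK32, result == 0

def neg32(a: int) -> int:
    """Emulate NEG: two's-complement negation, the same as NOT followed by INC."""
    return (-a) & MASK32

def shl32(a: int, count: int) -> int:
    """Emulate SHL: the CPU masks the shift count to 5 bits for 32-bit operands."""
    return (a << (count & 31)) & MASK32

print(add32(0xFFFFFFFF, 1))   # wraps around to zero and sets the carry flag
print(hex(neg32(1)), hex(shl32(1, 31)))
```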

Branching:

Additional Mnemonics:

Memory Layout

images/1022-3.png

Other program segments

You may have noticed in the diagram above that in addition to the stack and heap, there are separate, distinct regions of memory in the program’s address space that I drew, namely the bss, data and text segments. If you recall the Hello, world section example, you might also have realized that there are labels in that example that line up with the names of these program segments.

Well, that’s no coincidence; it turns out that is what is being defined in those regions, and these regions exist as a result of the PE executable file format, which is a standardized format that Microsoft dictates for how an executable’s memory segments should be arranged so that the operating system’s loader can load it into memory.

Crossing the platforms

On Linux, the executable file format is known as the Executable and Linkable Format, or ELF (macOS uses its own format, Mach-O). While there are differences between these and the PE file format used by Windows, the segments we’ll be dealing with here exist on all of them, and function the same way. On a fun note, Fabian Giesen has an amusing tweet regarding the name.

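Let’s go over what each of the major segments is for: text holds the program’s executable code, data holds initialized global data, and bss reserves space for uninitialized data.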

Virtual Memory Address System

If you’ve never heard of the term virtual memory before, this section is for you. Basically, every time you’ve stepped into a debugger and seen a hexadecimal address for your pointers, your stack variables, or even your program itself, this has all been a lie; the addresses you see there are not the actual addresses that the memory resides at.

What actually happens here is that every time a user-mode application asks the operating system for memory, whether by calling HeapAlloc, reserving it on the stack through int a[5] or even doing it in assembly, the address it gets back is a virtual address. Each process is mapped into its own virtual address space, and memory is reserved by the operating system for that space, but only in virtual memory; the physical hardware may have nothing mapped behind an address until the user-mode process actually commits or touches it.

When the program uses a virtual address whose page is already backed by physical memory, the CPU translates it to the physical address on the fly. If the page is not resident yet (for example because it has never been touched, or because it was written out to the page file), the access raises a page fault and the OS pages the memory in, taking a free frame of RAM and, if needed, reading the contents back from disk.

The translation from virtual memory addresses to the physical memory addresses of the hardware is done by a special piece of hardware known as the Memory Management Unit, or MMU, which walks page tables that the operating system maintains. As you might expect, if every single memory access required a full page-table walk, this would prove costly. Hence, there is another specialized piece of hardware known as the Translation Look-aside Buffer, or TLB. It also sits on the CPU and caches the results of previous address translations, so repeated accesses to the same page skip the walk.
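
The translation path can be modelled with a toy example. This is only a sketch under simplifying assumptions: a single-level page table held in a dictionary, 4 KiB pages, and a TLB that never evicts. The ToyMMU class and its fields are invented for illustration; real x86 hardware uses multi-level page tables.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages

class ToyMMU:
    def __init__(self, page_table: dict):
        self.page_table = page_table   # virtual page number -> physical frame number
        self.tlb = {}                  # cache of recent translations
        self.tlb_hits = 0

    def translate(self, virtual_address: int) -> int:
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page in self.tlb:
            self.tlb_hits += 1         # fast path: no page-table walk needed
            frame = self.tlb[page]
        else:
            if page not in self.page_table:
                raise LookupError("page fault: page %#x is not mapped" % page)
            frame = self.page_table[page]   # stands in for the slow walk
            self.tlb[page] = frame
        return frame * PAGE_SIZE + offset   # the offset within the page is unchanged

mmu = ToyMMU({0x400: 0x1A2B})
print(hex(mmu.translate(0x400123)))   # first access walks the table
print(hex(mmu.translate(0x400456)), mmu.tlb_hits)   # second access hits the TLB
```

Only the page number is translated; the low 12 bits pass through untouched, which is why addresses in a debugger keep their low digits across mappings.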

images/1022-4.png

Calling Conventions

images/1022-5.png

images/1022-6.png

EFLAGS register

images/1022-7.png

x64

Basically the same, but the registers start with R instead of E (RAX, RBX, RSP, RIP, …) and are 64 bits wide. x64 also adds eight general-purpose registers, R8 to R15.

RFLAGS is used basically the same way as EFLAGS because bits 32 to 63 are reserved.
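
Since the meaningful flags all live in the low bits, a flags value copied from a debugger can be decoded with a short lookup. The sketch below covers only the most commonly inspected flags; the function name is hypothetical.

```python
# Bit positions of common status and control flags in EFLAGS/RFLAGS
FLAG_BITS = {"CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7,
             "TF": 8, "IF": 9, "DF": 10, "OF": 11}

def decode_flags(value: int) -> list:
    """Return the names of the flags that are set in a flags register value."""
    return [name for name, bit in FLAG_BITS.items() if (value >> bit) & 1]

# 0x246 is a value often seen in debuggers right after a comparison of equal operands
print(decode_flags(0x246))  # ['PF', 'ZF', 'IF']
```

Bit 1 is reserved and always reads as 1, which is why it is set in 0x246 without appearing in the result.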

Calling Convention

Recall that some calling conventions require parameters to be passed on the stack on x86. On x64, most calling conventions pass parameters through registers. For example, on Windows x64, there is only one calling convention and the first four parameters are passed in RCX, RDX, R8, and R9; the remaining ones are pushed on the stack from right to left. On Linux, the first six parameters are passed in RDI, RSI, RDX, RCX, R8, and R9.
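
The register order described here can be written down as a lookup. This sketch assumes integer or pointer arguments only (floating-point arguments travel in XMM registers instead), and the names place_arguments and INTEGER_ARG_REGISTERS are made up for this note.

```python
INTEGER_ARG_REGISTERS = {
    "windows": ["RCX", "RDX", "R8", "R9"],
    "linux": ["RDI", "RSI", "RDX", "RCX", "R8", "R9"],
}

def place_arguments(platform: str, args: list) -> tuple:
    """Return (register assignments, stack arguments in push order)."""
    registers = INTEGER_ARG_REGISTERS[platform]
    in_registers = dict(zip(registers, args))      # leftmost arguments go to registers
    on_stack = args[len(registers):]               # whatever does not fit
    return in_registers, list(reversed(on_stack))  # pushed right to left

print(place_arguments("windows", [1, 2, 3, 4, 5, 6]))
print(place_arguments("linux", [1, 2, 3, 4, 5, 6]))
```

Note that RDX and RCX appear in both conventions but carry different argument positions, a common source of bugs when porting assembly between the two platforms.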

ntdll.dll:

*************************************************************
* FUNCTION
*************************************************************
int __fastcall NtAllocateVirtualMemory (int pHandle , voi
assume GS_OFFSET = 0xff00000000
int        EAX:4          <RETURN>
int        ECX:4          pHandle
void *     RDX:8          allocAddr
int        R8D:4          some
void *     R9:8           allocSize
int        Stack[0x28]:4  mem
int        Stack[0x30]:4  page
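
The Stack[0x28] and Stack[0x30] offsets in this listing follow from the Windows x64 layout at function entry: the return address sits at RSP, the next 32 bytes are the shadow space for the four register arguments, and stack-passed arguments come after that. A small sketch of that arithmetic, with a hypothetical helper name:

```python
def stack_offset_at_entry(arg_index: int) -> int:
    """Offset from RSP at function entry of a stack-passed argument (1-based, >= 5)."""
    if arg_index < 5:
        raise ValueError("arguments 1-4 travel in RCX, RDX, R8 and R9")
    return_address = 8                            # pushed by the call instruction
    return return_address + 8 * (arg_index - 1)   # slots 1-4 are the shadow space

print(hex(stack_offset_at_entry(5)), hex(stack_offset_at_entry(6)))  # 0x28 0x30
```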

Bits, bytes and sizes

At this point, it’s also probably a good idea to give a quick refresher for the non-Win32 API programmers out there who might not be as familiar with the nomenclature of the Windows data types.

Bit: 0 or 1. The smallest unit of information; memory itself is addressed in bytes.

Nibble: 4 bits.

Byte: 8 bits. In C/C++, you might be familiar with the term char.

WORD: On the x64 architecture, the word size is 16 bits. In C/C++, you might be more familiar with the term short.

DWORD: Short for “double word”, this means 2 × 16 bit words, which means 32 bits. In C/C++, you might be more familiar with the term int.

QWORD: Short for “quad word”, this means 4 × 16 bit words, which means 64 bits. This term is used in NASM syntax as well.

oword: Short for “octa-word” this means 8 × 16 bit words, which totals 128 bits. This term is used in NASM syntax.

yword: Also used only in NASM syntax, this refers to 256 bits in terms of size (i.e. the size of ymm register.)

float: This means 32 bits for a single-precision floating point under the IEEE 754 standards.

double: This means 64 bits for a double-precision floating point under the IEEE 754 standard. It occupies the same size as a quad word.

Pointers: On the x64 ISA, pointers are all 64-bit addresses.
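
These sizes can be double-checked from Python with the standard struct module, whose "<" prefix selects fixed, platform-independent sizes and little-endian byte order (the order x86 and x64 use in memory). The SIZES dictionary is just a convenience for this note.

```python
import struct

# Sizes in bytes, using struct's standard (platform-independent) format codes
SIZES = {
    "BYTE": struct.calcsize("<B"),
    "WORD": struct.calcsize("<H"),
    "DWORD": struct.calcsize("<I"),
    "QWORD": struct.calcsize("<Q"),
    "float": struct.calcsize("<f"),
    "double": struct.calcsize("<d"),
}

# A DWORD stored in memory is little-endian: the low byte comes first
packed = struct.pack("<I", 0x11223344)
print(SIZES)
print(packed.hex())  # 44332211
```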

Ring 0

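Certain instructions only work at Intel privilege level 0 (Ring 0), or are restricted outside it:

  1. HLT (Halt)
  2. MOV to/from Control Registers (CR0, CR4)
  3. IN/OUT (Input/Output); permitted outside Ring 0 only when IOPL or the I/O permission bitmap allows it
  4. CLI/STI (clear/set the interrupt flag); also governed by IOPL
  5. LGDT/LLDT (Load Global/Local Descriptor Table register)
  6. SMI (System Management Interrupt); strictly an interrupt rather than an instruction, it switches the CPU into System Management Mode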
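
The current privilege level is visible in the low two bits of the CS selector, so it can be read off a register dump. A minimal sketch; the selector values 0x33 and 0x10 are the ones commonly seen for 64-bit user and kernel code on Windows and Linux, and are used here as assumptions rather than guarantees.

```python
def privilege_level(cs_selector: int) -> int:
    """Return the ring number encoded in the low two bits of a CS selector."""
    return cs_selector & 0b11

print(privilege_level(0x33))  # typical user-mode CS   -> 3
print(privilege_level(0x10))  # typical kernel-mode CS -> 0
```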

Hello world in ASM

https://neetx.github.io/posts/windows-asm-hello-world/

test.asm

; wsl nasm -f win64 test.asm
; link test.obj /defaultlib:Kernel32.lib /subsystem:console /entry:main /LARGEADDRESSAWARE:NO /out:test.exe
global main
; kernel32 functions to use in printcustom
extern GetStdHandle
extern WriteFile

section .text
printcustom:
    push    rbp
    mov     rbp, rsp                
    sub     rsp, 56                 ; 32 bytes shadow space + 8 for WriteFile's 5th argument + 16 for two locals
    ; the Windows x64 calling convention requires the caller to reserve 32 bytes
    ; of shadow space before every call; arguments past the 4th go just above it

    mov     [rsp + 40], rcx         ; save 1st argument (buffer address); rcx is volatile across calls
    mov     rax, [rdx]              ; load the byte count that the 2nd argument points to
    mov     [rsp + 48], rax         ; save it as well; rdx and r8 are volatile too
    
    mov     rcx, 0fffffff5h         ; STD_OUTPUT_HANDLE (-11)
    call    GetStdHandle

    mov     rcx, rax                ; 1st argument StdOut Handle
    mov     rdx, [rsp + 40]         ; 2nd argument buffer address
    mov     r8, [rsp + 48]          ; 3rd argument number of bytes to write
    mov     r9, BytesWritten        ; 4th argument bytes written
    mov     qword [rsp + 32], 00h   ; 5th argument (lpOverlapped = NULL), just above the shadow space
    call    WriteFile
    
    ;mov     rsp, rbp                ; releases the frame: restores stack pointer from base pointer
    ;pop     rbp                     ; restores the caller's base pointer
    leave                           ; with leave we can achieve both previous instructions
    ret                             ; pop the stack (to find previous rip in main) and puts it in rip
main:
    mov     rcx, Buffer
    mov     rdx, BytesToWrite
    call    printcustom             ; call pushes the return address (next rip) onto the stack and jumps to printcustom
    mov     rcx, newline
    mov     rdx, newlineBytes
    call    printcustom
    mov     rcx, Buffer2
    mov     rdx, BytesToWrite2
    call    printcustom
ExitProgram:
    xor     eax, eax
    ret

section .data
Buffer:        db    'Hello, World!', 00h
BytesToWrite:  dq    0eh
Buffer2:       db    0x40,0x41,0x42,0x00         ; msfvenom -f powershell
BytesToWrite2: dq    4h
newline:       db    0ah
newlineBytes:  dq    1h

section .bss
BytesWritten: resd  01h
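
The amount reserved by sub rsp in printcustom is not arbitrary: the Windows x64 convention expects RSP to be a multiple of 16 at every call instruction. The sketch below checks that arithmetic. It assumes main is entered with RSP % 16 == 8 (the usual state right after a call) and calls printcustom without adjusting its own stack, so printcustom is entered with RSP % 16 == 0. The function name is hypothetical.

```python
def rsp_mod16_before_call(entry_mod16: int, pushes: int, reserved_bytes: int) -> int:
    """RSP modulo 16 after `pushes` 8-byte pushes and `sub rsp, reserved_bytes`."""
    return (entry_mod16 - 8 * pushes - reserved_bytes) % 16

# printcustom: one push (rbp), then the reserved bytes
print(rsp_mod16_before_call(0, pushes=1, reserved_bytes=40))  # 0 -> aligned
print(rsp_mod16_before_call(0, pushes=1, reserved_bytes=32))  # 8 -> misaligned
```

Under these assumptions, reserving 40 bytes (or 40 plus any multiple of 16) keeps the calls to GetStdHandle and WriteFile aligned, while a bare 32 bytes of shadow space would not.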

Assembly tool

When assembling with NASM, the -f win64 option plays a crucial role in how the output file is structured. Here’s a breakdown of what happens:

Binary Format (-f bin)

Win64 Format (-f win64)

Key Points:

-f bin produces a flat binary: just the raw instruction bytes, with no headers or sections.

-f win64 creates a COFF object file (the object-file side of Microsoft’s PE/COFF format) suitable for linking and generating Windows executables or DLLs.

While -f win64 generates sections, it still doesn’t include the .idata section at this stage because import information is determined during linking.
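
One way to see the difference between these outputs is to look at the first bytes of each file. The following sketch is a heuristic only: a flat binary has no signature at all, so its first bytes are simply its first instruction and could coincidentally match. The function name is invented for this note.

```python
def identify_format(header: bytes) -> str:
    """Guess a file's format from its first bytes."""
    if header.startswith(b"\x7fELF"):
        return "ELF"
    if header.startswith(b"MZ"):
        return "PE executable"          # linked .exe/.dll start with the DOS stub
    if header[:2] == b"\x64\x86":
        return "COFF object (x64)"      # machine type 0x8664, stored little-endian
    return "unknown or flat binary"

print(identify_format(b"\x64\x86\x03\x00"))
print(identify_format(b"\x31\xc0\xc3"))
```

In practice the header would come from something like open("test.obj", "rb").read(4).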

Import Information and .idata in ELF:

Examples:

Assembling into shellcode:

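nasm test.asm (with no -f option NASM defaults to -f bin; this does not work with external names, and the file has to specify BITS 32 or BITS 64, otherwise NASM assumes 16-bit code)

This is useful for studying the code, getting the opcodes and disassembling the bytes: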

ndisasm -b 64 test (-b flag depends on the BITS 64 instruction of the assembly code)
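
A flat binary is nothing more than the instruction bytes, which is what makes it usable as shellcode. As a minimal illustration, the three bytes below are the usual encoding of xor eax, eax followed by ret; the helper as_c_array is a made-up convenience for pasting the bytes elsewhere.

```python
# Machine code for: xor eax, eax ; ret
shellcode = bytes.fromhex("31c0c3")

def as_c_array(code: bytes) -> str:
    """Format raw bytes as a comma-separated list of hex literals."""
    return ", ".join("0x%02x" % b for b in code)

print(len(shellcode), "bytes:", as_c_array(shellcode))
```

Writing these bytes to a file and running the ndisasm command above on it should list the same two instructions.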

Assembling into obj (with COFF headers):

C:\Users\nobody\Desktop\assembly>nasm -f win64 test.asm

Linking (setting up IAT in the .idata section):

C:\Users\nobody\Desktop\assembly>link test.obj "C:\Program Files (x86)\Windows Kits\10\lib\10.0.18362.0\um\x64\kernel32.lib" /subsystem:console /entry:main /LARGEADDRESSAWARE:NO /out:test.exe

also: link test.obj /defaultlib:Kernel32.lib /subsystem:console /entry:main /LARGEADDRESSAWARE:NO /out:test.exe

you can also link and provide permissions to specific sections (in addition to other flags, as well)

link test.obj /entry:main /section:.custom,RWE /out:main.exe

https://academy.hackthebox.com/module/227/section/2493

Tools:

link.exe is included in the Visual Studio C/C++ toolchain

nasm (https://nasm.us), run natively on Windows or through WSL, Linux or macOS

Linux

On Linux you can link with the ld command from the binutils package

# Assemble the .asm file to an object file

nasm -f elf64 test.asm -o test.o

# Link the object file to create executable

ld test.o -o main

# Assemble 32bit

nasm -f elf32 test.asm -o test.o

# Link

ld -m elf_i386 test.o -o main
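
The two command pairs above differ only in the output format and the linker emulation, so they are easy to wrap in a small helper. This is a sketch that only builds the argument lists using the flags already shown; the function name and the fixed output names test.o and main are assumptions carried over from the commands above.

```python
def build_commands(source: str, bits: int = 64) -> list:
    """Return the nasm and ld command lines for a 32- or 64-bit ELF build."""
    obj = "test.o"
    if bits == 64:
        return [["nasm", "-f", "elf64", source, "-o", obj],
                ["ld", obj, "-o", "main"]]
    if bits == 32:
        return [["nasm", "-f", "elf32", source, "-o", obj],
                ["ld", "-m", "elf_i386", obj, "-o", "main"]]
    raise ValueError("bits must be 32 or 64")

for command in build_commands("test.asm", bits=32):
    print(" ".join(command))
```

Each list can be passed to subprocess.run(command, check=True) on a machine where nasm and ld are installed.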

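full link (unverified: --section-flags does not appear among GNU ld's documented options; section permissions are normally declared in the source, e.g. NASM's "section .custom write exec", or changed afterwards with objcopy --set-section-flags):

ld test.o -o main --entry=main --section-flags .custom=alloc,write,execute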