This article gives an overview of how common C constructs appear in assembly, covering variables, “if” statements, “for” and “while” loops, switch statements, arrays, structs, linked lists, stacks and heaps.

Variables

Variables are used in code to hold values. Based on where a variable is declared, it is either a local variable or a global variable. A local variable is accessible only within the function that declares it, whereas a global variable can be accessed from anywhere in the program.

The following C program shows how local and global variables are used.

    #include <stdio.h>

    int a = 10; /* global variable; its declaration is missing from this listing, so the value 10 is a placeholder */

    int main(void)
    {
        int b = 20; /* local variable */
        a = a + b;
        printf("The new value of a is %d\n", a);
        return 0;
    }

The assembly equivalent of this code can be viewed by loading the compiled executable in OllyDbg.

“If” statements

“If” statements are commonly used in C programming to change the control flow based on a condition. The following code example shows an if condition being used in a C program.

    #include <stdio.h>

    int main(void)
    {
        int a = 30;
        int b = 20;
        if (a > b) {
            printf("a is greater than b\n");
        } else {
            printf("b is greater than a\n");
        }
        return 0;
    }

The assembly equivalent of this code can be viewed by loading the compiled executable in OllyDbg.

Loops

Loops are used in programming to execute a block of statements repeatedly until a given condition becomes false. “For” loops have the following syntax:

    for (initialization; condition; increment) {
        //code to be executed until the condition fails.
    }

The following program is an example of a “for” loop in C.

    #include <stdio.h>

    int main(void)
    {
        int i;
        for (i = 0; i < 7; i++) {
            printf("value of a is %d\n", i);
        }
        return 0;
    }

The assembly equivalent of this code can be viewed by loading the compiled executable in OllyDbg.

“While” loops are another commonly used concept in programming and have a purpose similar to for loops. They are used for executing a block of statements repeatedly until a given condition fails. “While” loops have the following syntax.

    while (condition) {
        //code to be executed until the condition fails.
    }

The following is an example of a while loop in C.

    #include <stdio.h>

    int main(void)
    {
        int i = 0;
        while (i < 7) {
            printf("value of a is %d\n", i);
            i++;
        }
        return 0;
    }

The assembly equivalent of this code can be viewed by loading the compiled executable in OllyDbg. When going through the disassembly of malware samples, one should be able to identify these for and while loops, as they are common in malware.

Switch statements

Another commonly used concept in C is the switch statement. A switch statement groups several code blocks under case labels and executes one of them. It does this by evaluating an expression and comparing the result with the value of each case label. The following is the syntax of switch statements in C.

    switch (expression) {
        case value1:
            //statements to be executed
            break;
        case value2:
            //statements to be executed
            break;
        case value3:
            //statements to be executed
            break;
        default:
            //statements to be executed
            break;
    }

The following code is an example of how switch statements can be used in C programming.

    #include <stdio.h>

    int main(void)
    {
        int i = 3;
        switch (i) {
            case 1:
                printf("Value entered is 1\n");
                break;
            case 2:
                printf("Value entered is 2\n");
                break;
            case 3:
                printf("Value entered is 3\n");
                break;
            default:
                printf("Value out of range\n");
        }
        return 0;
    }

The assembly equivalent of this code can be viewed by loading the compiled executable in OllyDbg.

Arrays and structs

Arrays and structs are used by programmers to store multiple items. They operate in a similar fashion, but arrays store elements of the same type while structs can have elements of different types. Malware authors can use arrays and structs in their code, so it is important to be able to identify these constructs when examining the disassembly of an executable. The following snippet is an example of how arrays can be used in C programming.

    #include <stdio.h>

    int arr[4] = {1, 2, 3, 4}; /* declaration missing from this listing; the size and values are placeholders */

    int main(void)
    {
        int i;
        for (i = 0; i < 4; i++) {
            printf("value from array is %d\n", arr[i]);
        }
        return 0;
    }

The following snippet is an example of how structs can be used in C programming.

    #include <stdio.h>
    #include <string.h>

    struct car {
        int id;
        char brand[10];
    };

    int main(void)
    {
        struct car entry = {0};
        entry.id = 1;
        strcpy(entry.brand, "Audi");
        printf("The brand is %s\n", entry.brand);
        return 0;
    }

Stack and heap

It is important to be aware of how the stack and the heap are used. The stack is used for static memory allocation, where sizes are fixed at compile time (local variables, for example), and the heap is used for dynamic memory allocation at run time. The stack is also used when function calls are made. In typical 32-bit x86 code, function arguments are pushed onto the stack before the function is called, and space for the function's local variables is reserved on the stack before the function body executes.

Conclusion

Reverse engineering malware requires analysts to understand how C code maps to assembly instructions. They need a foundation of knowledge that includes the purpose and capabilities of C constructs like variables, “if” statements, switch statements, arrays, structs, linked lists, stacks and heaps, and in particular the “for” and “while” loops that are common in malware. This article has provided an overview of these C programming concepts alongside example code snippets. Regardless of skill level, understanding how C code appears in assembly will prove useful to students and professionals alike in their ongoing code analysis.
