Build a Flag Workshop: A Beginner’s Guide to Solving a Reversing challenge from Sunshine CTF
Introduction
Reverse engineering involves analyzing software to understand its components and functionalities without access to its source code. In CTF challenges, reverse engineering tasks often present participants with compiled binaries or decompiled code snippets that they must dissect to reveal hidden flags — secret strings that prove the challenge has been solved.
In the “Build a Flag Workshop” challenge from Sunshine CTF, we were given a Linux ELF executable. Our mission was to reverse engineer it and construct a valid flag: Chompy’s favorite flag.
Approaching the Challenge: Recon First
As usual, let’s first check what type of file this is so we know what we are dealing with.
As we can see, it is an x86-64 Linux executable that is position-independent (PIE) and dynamically linked, meaning it uses shared libraries and can be loaded at a randomized base address under ASLR (Address Space Layout Randomization). It is also stripped, so it lacks symbol information, which makes analysis more challenging. For reverse engineering, we’ll need a tool like Ghidra or IDA Pro that can handle PIE and dynamically linked binaries. ASLR only matters when debugging the running process; for static analysis we work with the addresses the disassembler assigns, and we rely on disassembly and pattern recognition to understand the program’s functionality without symbolic clues.
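The properties reported above can also be read straight from the ELF header. Here is a minimal sketch, assuming a little-endian ELF file; the helper name describe_elf is illustrative and not part of the challenge.

```python
import struct

ELF_MAGIC = b"\x7fELF"
ET_DYN = 3       # e_type for shared objects; PIE executables also use it
EM_X86_64 = 62   # e_machine value for x86-64


def describe_elf(header: bytes) -> dict:
    """Decode the first 20 bytes of an ELF header (little-endian assumed)."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class = header[4]  # 1 = 32-bit, 2 = 64-bit
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident
    e_type, e_machine = struct.unpack_from("<HH", header, 16)
    return {
        "bits": 64 if ei_class == 2 else 32,
        "pie_or_shared_object": e_type == ET_DYN,
        "x86_64": e_machine == EM_X86_64,
    }
```

To try it on the challenge binary, read the first 20 bytes of the file in binary mode and pass them to describe_elf.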
strings test
As usual, I like to check whether the flag is embedded in the program’s strings. Unfortunately, that’s not the case here.
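The same check can be scripted. This is a rough sketch of what a strings-style scan does, assuming we only care about runs of printable ASCII; the function names are hypothetical helpers, not an existing tool's API.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII characters."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def find_flag_candidates(data: bytes, prefix: str = "sun{") -> list[str]:
    """Keep only the strings that contain the flag prefix."""
    return [s for s in printable_strings(data) if prefix in s]
```

On this binary, a scan like this would be expected to surface the format string used for printing rather than a complete flag, which is why we have to dig into the logic.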
Dynamic Analysis
As we can see, running the program prints many strings, which we can use to trace the function that handles this process. Understanding how it works from the pseudo-C code produced by Ghidra will help us crack the program and get the flag.
Reverse Engineering with Ghidra
Now let’s jump to the interesting part: we will decompile the program with Ghidra and read the generated pseudo-C code to analyze the program statically and figure out how to get the flag from it.
My first step is to search all the strings in Ghidra for interesting text such as “flag check”.
As we can see here, Ghidra’s string search locates the string “is Chompy’s favorite flag!….”, which lets us pinpoint the address of that string.
As we can see here, it’s 00102060, and we can look up which function references this address to find the code that handles Chompy’s favorite flag.
That is how I located the interesting function, FUN_001019c0.
/* WARNING: Globals starting with '_' overlap smaller symbols at the same address */
void FUN_001019c0(void)
{
  int iVar1;
  char *__s;
  char *pcVar2;
  uchar *d;
  char *__s_00;
  size_t n;
  long in_FS_OFFSET;
  long local_48;
  long local_40;
  long local_30;

  local_30 = *(long *)(in_FS_OFFSET + 0x28);
  __s = (char *)FUN_00101370();
  if (__s == (char *)0x0) {
    if (local_30 == *(long *)(in_FS_OFFSET + 0x28)) {
      puts("Failed to generate flag.");
      return;
    }
    goto LAB_00101bb6;
  }
  pcVar2 = strtok(__s,"-");
  d = (uchar *)strtok((char *)0x0,"-");
  __s_00 = strtok((char *)0x0,"-");
  if (pcVar2 == (char *)0x0) {
    if (d == (uchar *)0x0) goto LAB_00101b80;
    puts((char *)d);
    if (__s_00 == (char *)0x0) goto LAB_00101b3a;
LAB_00101b32:
    puts(__s_00);
LAB_00101b3a:
    pcVar2 = "isn\'t Chompy\'s favorite, but it\'s yours and that\'s what matters.";
  }
  else {
    puts(pcVar2);
    if (d == (uchar *)0x0) {
LAB_00101b80:
      if (__s_00 != (char *)0x0) goto LAB_00101b32;
      goto LAB_00101b3a;
    }
    puts((char *)d);
    if (__s_00 != (char *)0x0) {
      puts(__s_00);
    }
    pcVar2 = strstr(pcVar2,"decide");
    if (pcVar2 == (char *)0x0) goto LAB_00101b3a;
    n = strlen((char *)d);
    MD5(d,n,(uchar *)&local_48);
    if ((local_48 != _DAT_00104010 || local_40 != _DAT_00104018) || (__s_00 == (char *)0x0))
      goto LAB_00101b3a;
    iVar1 = strcmp(__s_00,"chompy");
    pcVar2 = "is Chompy\'s favorite flag! Great work.";
    if (iVar1 != 0) goto LAB_00101b3a;
  }
  __printf_chk(2,"sun{%s} %s\n",__s,pcVar2);
  if (local_30 == *(long *)(in_FS_OFFSET + 0x28)) {
    free(__s);
    return;
  }
LAB_00101bb6:
  /* WARNING: Subroutine does not return */
  __stack_chk_fail();
}
FUN_001019c0 Function
This function is responsible for generating, validating, and displaying the flag. If you can’t follow the C code generated by Ghidra, you can always ask ChatGPT to explain it, though you may still have to look up parts of it yourself. Here’s a step-by-step breakdown:
Stack Protection:
long local_30;
local_30 = *(long *)(in_FS_OFFSET + 0x28);
Purpose: Reads the stack canary from in_FS_OFFSET + 0x28 and saves a copy in local_30. Before returning, the function compares the saved copy with the original value; if they differ, a buffer overflow has corrupted the stack, and the program terminates to prevent exploitation.
Flag Generation:
__s = (char *)FUN_00101370();
if (__s == (char *)0x0) {
puts("Failed to generate flag.");
return;
}
Purpose: Calls FUN_00101370() to generate the flag string. If flag generation fails (i.e., returns NULL), it notifies the user and exits.
Flag Tokenization:
pcVar2 = strtok(__s, "-");
d = (uchar *)strtok(NULL, "-");
__s_00 = strtok(NULL, "-");
Purpose: Splits the flag string __s into three parts using the hyphen (-) as a delimiter:
- pcVar2: First segment (part1).
- d: Second segment (part2), cast as an unsigned character pointer.
- __s_00: Third segment (part3).
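To make the tokenization concrete, here is a small Python model of the three strtok calls. It is only a sketch: strtok_like is a made-up helper that imitates how strtok skips empty fields and returns NULL (here None) once the input runs out.

```python
def strtok_like(s: str, delim: str = "-", max_tokens: int = 3) -> list:
    """Imitate repeated strtok() calls on s.

    Empty fields are skipped (strtok treats runs of delimiters as one),
    and missing tokens are reported as None, like a NULL return.
    """
    tokens = [t for t in s.split(delim) if t][:max_tokens]
    return tokens + [None] * (max_tokens - len(tokens))


# The three results correspond to pcVar2, d and __s_00 in the decompilation.
part1, part2, part3 = strtok_like("first-second-third")
```

One consequence worth noticing is that a flag with fewer than three hyphen-separated pieces leaves part3 as None, which the validation code treats as a failure.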
Conditional Validation:
- Scenario 1: If pcVar2 is NULL, it prints d and __s_00 if they exist and sets the default failure message.
- Scenario 2: If pcVar2 exists:
  - Checks if it contains the substring "decide".
  - Computes the MD5 hash of d and compares it against predefined constants.
  - Verifies that __s_00 exactly matches "chompy", so we know the third segment of the flag is “chompy”.
  - If all conditions are met, sets a success message indicating the flag is valid. If any check fails, or if d or __s_00 is missing, it falls back to the failure message.
Final Output:
__printf_chk(2, "sun{%s} %s\n", __s, pcVar2);
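Purpose: Prints the flag in the format sun{<full_flag>} <message>, where <full_flag> is taken from __s and <message> conveys the validation result. Note that strtok overwrites each delimiter it finds with a null byte, so by the time this line runs, __s read as a C string ends after the first segment; the full flag has to be reassembled from the three parts.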
Cleanup and Exit:
Frees the allocated memory and checks stack integrity before exiting. If the stack canary has been altered, it calls __stack_chk_fail(), which aborts the program to prevent exploitation.
FUNCTION FUN_00101370 Analysis
char * FUN_00101370(void)
{
  // Variable declarations
  size_t sVar1;
  size_t sVar2;
  void *pvVar3;
  char *pcVar4;
  size_t sVar5;
  long lVar6;
  int param_7;
  int iStack000000000000000c;
  int param_8;
  char cStack0000000000000014;

  // Accessing a string from a table based on computed index
  pcVar4 = (&PTR_s_live_long_prosper_00104020)
           [(long)param_7 * 9 + (long)iStack000000000000000c * 3 + (long)param_8];
  // Calculating lengths
  sVar1 = strlen(pcVar4);
  sVar2 = strlen((char *)((long)&stack0x00000010 + 4));
  // Total size: length of pcVar4 + 2 + length of the second string
  sVar1 = sVar1 + 2 + sVar2;
  // Allocating memory
  pvVar3 = calloc(sVar1,1);
  // Concatenating pcVar4 into the allocated buffer with safety check
  pcVar4 = (char *)__strcat_chk(pvVar3, pcVar4, sVar1);
  // Conditional logic based on a stack variable
  if (cStack0000000000000014 == '\0') {
    return pcVar4;
  }
  // Appending '-' to the string
  sVar5 = strlen(pcVar4);
  *(undefined2 *)(pcVar4 + sVar5) = 0x2d; // 0x2d is ASCII for '-'; the 2-byte store also writes a '\0' after it
  lVar6 = (long)(pcVar4 + sVar5) + 1;
  // Appending the second string from the stack
  __memcpy_chk(lVar6, (char *)((long)&stack0x00000010 + 4), sVar2 + 1, pcVar4 + (sVar1 - lVar6));
  return pcVar4;
}
If you recall earlier, this function is used to generate the flag by selecting and concatenating specific strings. Here’s how it operates:
String Selection:
pcVar4 = (&PTR_s_live_long_prosper_00104020)
[(long)param_7 * 9 + (long)iStack000000000000000c * 3 + (long)param_8];
- Purpose: Accesses a predefined string table (PTR_s_live_long_prosper_00104020) containing various famous quotes.
- Index Calculation: The index into the string table is calculated based on three variables: param_7, iStack000000000000000c, and param_8. This determines which base string to select.
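The multipliers 9 and 3 suggest a three-level choice flattened into a single index. The sketch below illustrates that arithmetic under the assumption that each of the three values ranges over 0 to 2, which would give 27 table entries; the decompilation does not confirm that range, and the function names are hypothetical.

```python
def table_index(a: int, b: int, c: int) -> int:
    """Flatten three choices into one index, as in a*9 + b*3 + c."""
    return a * 9 + b * 3 + c


def table_choices(index: int) -> tuple:
    """Inverse mapping, valid only under the 0..2 assumption."""
    a, rest = divmod(index, 9)
    b, c = divmod(rest, 3)
    return a, b, c
```

If the assumption holds, finding the position of the wanted quote in the table and running it through table_choices would tell us which three selections produce it.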
Double-clicking on PTR_s_live_long_prosper_00104020 will bring up the table:
Memory Allocation and String Concatenation:
sVar1 = strlen(pcVar4) + 2 + strlen((char *)((long)&stack0x00000010 + 4));
pvVar3 = calloc(sVar1, 1);
pcVar4 = (char *)__strcat_chk(pvVar3, pcVar4, sVar1);
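- Purpose: Allocates memory for the concatenated string, accounting for the base string, a hyphen (-), a potential suffix, and the null terminator.
- Concatenation: Safely copies the base string into the allocated buffer. Because calloc zero-fills the buffer, it starts out as an empty string, so the concatenation behaves like a plain copy.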
Conditional Appending:
if (cStack0000000000000014 == '\0') {
return pcVar4;
}
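- Condition: Checks if cStack0000000000000014 is null ('\0'). This byte sits at stack offset 0x14, the same location as &stack0x00000010 + 4, so it is the first character of the second string, and the test is really asking whether that string is empty.
- Outcome:
  - If '\0': Returns the string as is (only pcVar4).
  - Else: Proceeds to append additional data.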
Appending the Hyphen and Second String
sVar5 = strlen(pcVar4);
*(undefined2 *)(pcVar4 + sVar5) = 0x2d; // Appends '-'
lVar6 = (long)(pcVar4 + sVar5) + 1;
__memcpy_chk(lVar6, (char *)((long)&stack0x00000010 + 4), sVar2 + 1, pcVar4 + (sVar1 - lVar6));
- Appending '-': Inserts a hyphen at the end of the first string. 0x2d is the ASCII code for '-'. The store is two bytes wide (undefined2), so it writes the hyphen followed by a null byte.
- Appending the Second String:
  - Source: The string located at (long)&stack0x00000010 + 4.
  - Destination: Right after the hyphen in the allocated buffer.
  - Length: sVar2 + 1 bytes to include the null terminator.
Final Return
return pcVar4;
Returns: The constructed string, which is either:
- Just the first string (pcVar4), or
- The first string concatenated with a hyphen and the second string.
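Put together, the builder behaves roughly like the Python sketch below. It is a model, not the original code: build_flag_body is a hypothetical name, and treating the second string as an already hyphen-joined suffix is an assumption based on how the result is later split into three parts.

```python
def build_flag_body(base: str, suffix: str) -> str:
    """Model of FUN_00101370: base alone, or base + '-' + suffix."""
    # Same size the C code passes to calloc: base + '-' + suffix + terminator.
    capacity = len(base) + 2 + len(suffix)
    if suffix == "":
        result = base  # the early return when the first suffix byte is '\0'
    else:
        result = base + "-" + suffix
    # The result plus its terminator always fits in the allocated buffer.
    assert len(result) + 1 <= capacity
    return result
```

The capacity check mirrors why the allocation adds 2: one byte for the hyphen and one for the terminator.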
Exploring the String Table
The string table (PTR_s_live_long_prosper_00104020) contains various famous quotes used as base strings for flag generation. Here's a glimpse of some entries:
Key Takeaway: The function FUN_00101370 selects one of these strings by index, and the check in FUN_001019c0 tells us that the base part of the flag must be the one containing the word “decide”.
From FUN_001019c0 we can deduce that the first part of the flag will be “all_we_have_to_decide_is_what_to_do_with_the_time_given_to_us”, as it is the only string that contains the word “decide”, and that the third part will be “chompy”.
Now, to get the second part of the flag, let’s look at this code from FUN_001019c0 again:
MD5(d, n, (uchar *)&local_48);
if (((local_48 ^ _DAT_00104010 | local_40 ^ _DAT_00104018) == 0) && (__s_00 != (char *)0x0))
if (strcmp(__s_00, "chompy") == 0) {
pcVar2 = "is Chompy's favorite flag! Great work.";
}
Purpose:
- Computes the MD5 hash of part2 (d).
- Compares the computed hash segments (local_48 and local_40) against the stored hash segments (_DAT_00104010 and _DAT_00104018).
- Ensures that both segments match exactly.
Now these are the values of _DAT_00104010 and _DAT_00104018:
By looking at the memory addresses we can deduce that:
DAT_00104010:
00104010 ab 17 85 09 78 e3 6a af
DAT_00104018:
00104018 6a 2b 88 08 f1 de d9 71
These two segments together form the expected 16-byte MD5 hash:
ab 17 85 09 78 e3 6a af 6a 2b 88 08 f1 de d9 71
In hexadecimal notation, the complete MD5 hash is:
ab17850978e36aaf6a2b8808f1ded971
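The step from the memory dump to the hex digest can be double-checked with a few lines of Python. This sketch copies the two 8-byte rows shown above; the variable names are my own. It also shows the little-endian values the program sees when it loads each row as a 64-bit long.

```python
# The two 8-byte rows copied from the listing at 00104010 and 00104018.
row_4010 = bytes.fromhex("ab 17 85 09 78 e3 6a af")
row_4018 = bytes.fromhex("6a 2b 88 08 f1 de d9 71")

# MD5 writes its 16 digest bytes in order, so the digest is simply the
# bytes as they appear in memory, concatenated.
expected_digest = (row_4010 + row_4018).hex()

# The comparison in the decompiled code uses two 64-bit loads; on x86-64
# these are little-endian, so the byte order looks reversed as integers.
qword_lo = int.from_bytes(row_4010, "little")
qword_hi = int.from_bytes(row_4018, "little")

print(expected_digest)
print(hex(qword_lo), hex(qword_hi))
```

The takeaway is to read the bytes in memory order when building the hash string, not the integer values the decompiler might display.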
MD5 Hash Confirmation:
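Using hashes.com, I was able to look up the hash and recover the original input, which is gandalf. MD5 is a one-way hash, so this is a lookup of a previously computed value rather than a decryption.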
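An online lookup is essentially a large precomputed dictionary. If you prefer to stay offline, a small dictionary attack does the same job. The sketch below assumes you supply your own candidate words; crack_md5 is an illustrative helper, and the wordlist in the usage note is made up.

```python
import hashlib


def crack_md5(target_hex: str, candidates):
    """Return the first candidate whose MD5 digest matches, else None."""
    target = target_hex.lower()
    for word in candidates:
        if hashlib.md5(word.encode("utf-8")).hexdigest() == target:
            return word
    return None
```

For this challenge, one could call crack_md5("ab17850978e36aaf6a2b8808f1ded971", words) with a list of Tolkien character names, since the chosen quote hints at the theme.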
Constructing the Valid Flag
With a clear understanding of the code and the validation process, let’s piece together the correct flag.
Flag Structure:
sun{part1-part2-chompy}
Components:
part1: "all_we_have_to_decide_is_what_to_do_with_the_time_given_to_us"
- Reasoning: This string contains the substring "decide", satisfying the first validation condition.
part2: "gandalf"
- Reasoning: Its MD5 hash (ab17850978e36aaf6a2b8808f1ded971) matches the stored hash segments, satisfying the second validation condition.
part3: "chompy"
- Reasoning: Exactly matches the required string, satisfying the third validation condition.
Final Flag:
sun{all_we_have_to_decide_is_what_to_do_with_the_time_given_to_us-gandalf-chompy}
Conclusion
Solving the “Build a Flag Workshop” reverse engineering CTF challenge involved a systematic analysis of decompiled code, understanding string manipulation, and leveraging cryptographic hash functions for validation. By breaking down the functions responsible for flag generation and validation, identifying the correct string components, and verifying their integrity through hash matching, we successfully uncovered the hidden flag.
Key Takeaways:
- Understanding Code Flow: Grasping how different functions interact is crucial in reverse engineering.
- String Manipulation: Recognizing how strings are selected and concatenated helps in reconstructing the expected output.
- Hash Functions: Familiarity with cryptographic hashes like MD5 is essential for validating data integrity.
- Attention to Detail: Ensuring exact matches in string components and formats is vital for passing validation checks.
Reverse engineering challenges not only test your technical skills but also enhance your problem-solving abilities. By methodically dissecting the code and understanding its logic, you can uncover hidden secrets and master the art of reverse engineering. Happy hacking!