Applications of AI in InfoSec — Writeup

Module ID	Difficulty	Estimated Duration	Number of Sections	Reward
292	Easy · Tier II	8 hours	25 (including 4 interactive assessments + 1 skill assessment)	20 Cubes

Module Link: academy.hackthebox.com/module/details/292

#	Section	Type	Topic
1	Introduction	Theory	—
2	Environment Setup	Interactive	Q1 ✅
3	JupyterLab	Interactive	—
4	Python Libraries for AI	Theory	—
5	Datasets	Theory	—
6	Data Preprocessing	Theory	—
7	Data Transformation	Theory	—
8	Metrics for Evaluating a Model	Theory	—
9	Spam Classification	Theory	—
10	The Spam Dataset	Interactive	—
11	Preprocessing the Spam Dataset	Interactive	—
12	Feature Extraction	Interactive	—
13	Training and Evaluation (Spam Detection)	Interactive	—
14	Model Evaluation (Spam Detection)	Interactive	Q1 ✅
15	Network Anomaly Detection	Interactive	—
16	Preprocessing and Splitting the Dataset	Interactive	—
17	Training and Evaluation (Network Anomaly Detection)	Interactive	—
18	Model Evaluation (Network Anomaly Detection)	Interactive	Q1 ✅
19	Malware Classification	Theory	—
20	The Malware Dataset	Interactive	—
21	Preprocessing the Malware Dataset	Interactive	—
22	The Model	Interactive	—
23	Training and Evaluation (Malware Image Classification)	Interactive	—
24	Model Evaluation (Malware Image Classification)	Interactive	Q1 ✅
25	Skills Assessment	Interactive	Q1 ✅

1. Introduction

Key Learning Points

This module builds three complete AI projects: Spam SMS Classifier (NLP + Naive Bayes), Network Anomaly Detection (Tabular Data + Random Forest), and Malware Image Classification (CNN + ResNet50 Transfer Learning).
All code is provided in Python code blocks, to be executed sequentially in Jupyter Notebook.
Runtime Environment: Playground VM (http://<TARGET_IP>:8888) or Local Environment (recommended 4GB+ RAM, 4-core CPU)
Module Evaluation Method: Trained models are uploaded to the Playground VM's evaluation port, and a flag is returned after passing the performance threshold.

Understanding and Insights

The value of this module is not in individual algorithms, but in completing the full closed loop of an ML project—from raw data to a deliverable model. The three projects cover both scikit-learn and PyTorch toolchains, as well as three data types: text, tabular, and image.

A key realization: This module does not test theoretical derivations; it tests whether you can run through the end-to-end process. Code can be directly copied and pasted, but if the preprocessing pipeline is inconsistent with training, the model will fail after upload—this is precisely the most common source of bugs in real ML engineering.

Practical Takeaways

Established a complete mental model of "Data → Preprocessing → Feature Engineering → Model Training → Evaluation → Deployment", so that when encountering new ML tasks later, one knows what to do at each step.

2. Environment Setup

Key Concepts

Miniconda: A lightweight version of Anaconda, provides the conda package manager, and can create isolated Python virtual environments
Installation Method: Windows: Use Scoop, macOS: Use Homebrew, Linux: Download installation script
Creating a virtual environment: conda create -n ai python=3.11 → conda activate ai
Core dependency installation: conda install numpy scipy pandas scikit-learn matplotlib seaborn nltk + conda install pytorch torchvision
Channel configuration: conda config --add channels conda-forge etc., channel_priority strict ensures package version consistency
conda config --set auto_activate_base false can prevent the base environment from automatically activating every time a terminal is opened

Understanding and Insights

The main reason for using Miniconda instead of system Python or plain pip: ML projects have complex dependency trees (PyTorch needs a matching CUDA version, and scikit-learn is compiled against NumPy's ABI), and pip installs of deep learning libraries often run into version conflicts. Conda ships prebuilt binaries and resolves non-Python dependencies such as CUDA libraries alongside Python packages, which is why it is widely used in ML work.

Common pitfalls

After conda init, you must restart the terminal for it to take effect, otherwise conda activate will report an error.
When installing PyTorch, be sure to specify the CUDA version (pytorch-cuda=12.4). If the wrong build is installed, no error is raised; training silently falls back to the CPU and can be more than 10 times slower.
Playground VM can be used but has limited performance; running training locally is much more efficient.

Exercise Solutions

Answer: DONE

3. JupyterLab

Key Takeaways

JupyterLab: A web-based interactive development environment, a standard tool for data science and ML.
Three cell types: Code cells, Markdown cells, Raw cells
Stateful Environment: Variables, functions, and imports defined in one cell remain available to every cell executed afterwards, until the kernel is restarted.
Execute code: Shift + Enter (execute and jump to the next), Ctrl + Enter (execute without jumping)
Restart Kernel: Kernel → Restart Kernel or Restart & Clear All Outputs
Installation: conda install -y jupyter jupyterlab notebook ipykernel

Understanding and Insights

Jupyter's stateful nature is a double-edged sword. The advantage is that you can see the results as you write and gradually build data pipelines; the risk is that if cells are executed out of order, variable states will not match the code order. The first reaction when debugging should be to Restart & Run All, running everything from scratch to confirm consistent states.

Practical Takeaways

Mastered the working mode of Jupyter as an ML experimental environment—using notebooks for rapid iteration and visual exploration, and after confirming the feasibility of the solution, exporting it as a .py script for automation and version control.

4. Python Libraries for AI

Key Knowledge Points

Scikit-learn: Traditional ML library based on NumPy/SciPy/Matplotlib
- Data Preprocessing: StandardScaler (Standardization), MinMaxScaler (Normalization), OneHotEncoder (Categorical Encoding), SimpleImputer (Missing Value Imputation)
- Unified API: model.fit(X_train, y_train) Training → model.predict(X_test) Prediction
- Model Evaluation: train_test_split (Data Splitting), cross_val_score (Cross-validation), accuracy_score / f1_score (Metric Calculation)
PyTorch: Deep learning framework developed by Facebook
- Tensor: Similar to NumPy ndarray, but supports gradient tracking and GPU acceleration
- Dynamic Computation Graph: Computation graph built on-the-fly during forward pass, debugging is more intuitive than static graphs (TensorFlow 1.x)
- Model Building: nn.Sequential (simple stacking) or inheriting nn.Module (custom forward pass)
- Five Steps of Training Loop: zero_grad() → Forward Pass → Calculate Loss → backward() → step()
- Data Loading: Dataset + DataLoader for batch iteration, shuffling, and multi-process parallelism
- Model Persistence: torch.save(model.state_dict()) to save parameters / torch.jit.script() to save the complete model
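
The five training-loop steps listed above can be sketched as follows. This is a minimal illustration, not the module's code: the synthetic data, the tiny network, the learning rate, and the helper name full_loss are all assumptions made for the example.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic data (assumption): the label is 1 when the two features sum to > 0.
X = torch.randn(256, 2)
y = (X.sum(dim=1) > 0).float().unsqueeze(1)
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

model = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)


def full_loss():
    """Loss over the whole dataset, computed without tracking gradients."""
    model.eval()
    with torch.no_grad():
        return loss_fn(model(X), y).item()


initial_loss = full_loss()
for epoch in range(20):
    model.train()
    for batch_X, batch_y in loader:
        optimizer.zero_grad()            # 1. clear gradients from the last step
        logits = model(batch_X)          # 2. forward pass
        loss = loss_fn(logits, batch_y)  # 3. calculate loss
        loss.backward()                  # 4. backpropagate
        optimizer.step()                 # 5. update parameters
final_loss = full_loss()

print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

Forgetting zero_grad() is the classic mistake here: gradients accumulate across batches and the updates drift.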

Understanding and Insights

The fundamental difference between scikit-learn and PyTorch is not "one is simple and one is complex", but rather different levels of abstraction. scikit-learn encapsulates the training loop (fit() in one line), suitable for structured data and classical algorithms; PyTorch exposes every step of the training loop, suitable for deep learning tasks that require custom network structures, loss functions, or training strategies. The criterion for choosing which one: If scikit-learn has a ready-made algorithm that can solve your problem, use scikit-learn; only use PyTorch when CNN/RNN/Transformer is needed.

Commonly Confused Concepts

PyTorch Tensors and NumPy ndarrays look very similar, but Tensors come with gradient tracking (requires_grad=True) and can operate on GPUs. Converting between the two requires .numpy() / torch.from_numpy(). A Tensor on the GPU must first be moved with .cpu(), and a Tensor that tracks gradients must first be detached with .detach(), before converting to NumPy.

5. Datasets

Key Takeaways

Four main types of datasets: Tabular Data (CSV/databases), Image Data (pixel arrays), Text Data (natural language), Time Series (sequences with timestamps)
Seven attributes of high-quality datasets: Relevance, Completeness, Consistency, Accuracy, Representativeness, Balance, Scale
Example dataset demo_dataset.csv contains network logs: source_ip, destination_port, protocol, bytes_transferred, threat_level
Three essential tools for data exploration: df.head() (viewing samples), df.info() (checking types and missing values), df.isnull().sum() (counting missing values)

Understanding and Insights

The upper limit of a model is determined by the data, not the algorithm. A complex model trained on dirty data is inferior to a simple model trained on clean data. The first thing to do after getting data is always to examine the data, not write the model: if a numeric column shows an object dtype in df.info(), non-numeric strings are almost certainly mixed in and need cleaning.

Practical Takeaways

Developed data quality awareness: A 'good' dataset isn't just about being larger; it needs to be balanced (positive and negative sample ratios are close) and representative (covering real-world scenarios). If 99% of the traffic is normal, a model predicting 'normal' for everything would achieve 99% accuracy without having learned anything. This is the data-level reason the Accuracy metric in Section 8 can be misleading.

6. Data Preprocessing

Key Takeaways

Four main tasks of data preprocessing: Cleaning (handling missing values/outliers), Transformation (encoding/scaling), Integration (merging multi-source data), Formatting (type conversion/reshaping)
Invalid value detection methods: Regex validation for IP format, Range validation for ports (0-65535) / byte count (≥0) / threat level (0-2)
Two strategies for handling invalid data:
- Discarding: data.drop(invalid.index, errors='ignore'), suitable for large datasets with few bad data points
- Imputation: SimpleImputer(strategy='median') fills numeric columns with the median, strategy='most_frequent' fills categorical columns; KNNImputer infers based on neighbor relationships
Unify invalid value representation: First, use df.replace() to replace various placeholders (MISSING_IP, STRING_PORT, ?) with NaN, then convert with pd.to_numeric(errors='coerce').
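
The validation rules listed above (IP format, port range, byte count, threat level) might be sketched like this. The function names and the sample rows are hypothetical; only the column names and value ranges come from the notes.

```python
import re

IPV4_PATTERN = re.compile(r"(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})")


def is_valid_ip(value):
    """True if value is a dotted-quad IPv4 address with every octet in 0-255."""
    match = IPV4_PATTERN.fullmatch(str(value))
    return bool(match) and all(int(octet) <= 255 for octet in match.groups())


def is_int_in_range(value, low, high=None):
    """True if value parses as an integer in [low, high] (no upper bound if high is None)."""
    try:
        number = int(value)
    except (TypeError, ValueError):
        return False
    return number >= low and (high is None or number <= high)


def invalid_fields(row):
    """Return the names of the columns in this row that fail validation."""
    problems = []
    if not is_valid_ip(row["source_ip"]):
        problems.append("source_ip")
    if not is_int_in_range(row["destination_port"], 0, 65535):
        problems.append("destination_port")
    if not is_int_in_range(row["bytes_transferred"], 0):
        problems.append("bytes_transferred")
    if not is_int_in_range(row["threat_level"], 0, 2):
        problems.append("threat_level")
    return problems


# Hypothetical rows mixing valid values with the placeholders mentioned above.
rows = [
    {"source_ip": "192.168.1.10", "destination_port": "443", "bytes_transferred": "1500", "threat_level": "1"},
    {"source_ip": "MISSING_IP", "destination_port": "80", "bytes_transferred": "300", "threat_level": "0"},
    {"source_ip": "10.0.0.300", "destination_port": "STRING_PORT", "bytes_transferred": "1200", "threat_level": "2"},
    {"source_ip": "10.0.0.5", "destination_port": "70000", "bytes_transferred": "-5", "threat_level": "?"},
]
report = [invalid_fields(row) for row in rows]
for index, problems in enumerate(report):
    print(index, problems)
```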

Understanding and Insights

This section teaches not just API calls, but a troubleshooting approach—first using a validation function to locate where the bad data is, then deciding how to handle it based on the data volume. In the demo dataset of 100 entries, 23 are bad, leaving only 77 after dropping them—in this situation, it must be imputed, not dropped.

Key technique: First, unify all kinds of invalid values by replacing them with NaN, then process them all at once with SimpleImputer, which is much more efficient than fixing them one by one.
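
A minimal sketch of the "unify, then impute" technique for numeric columns. The five-row frame is made up for illustration and is not the module's demo_dataset.csv; only the placeholder strings and column names come from the notes.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical sample with placeholders mixed into numeric columns.
df = pd.DataFrame({
    "destination_port": ["443", "STRING_PORT", "80", "22", "?"],
    "bytes_transferred": ["1500", "200", "?", "4000", "300"],
})

# Step 1: replace every known placeholder with NaN.
df = df.replace(["MISSING_IP", "STRING_PORT", "?"], np.nan)

# Step 2: convert to numbers; anything still unparseable also becomes NaN.
numeric_cols = ["destination_port", "bytes_transferred"]
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors="coerce")
missing_before = int(df[numeric_cols].isnull().sum().sum())

# Step 3: fill all gaps in one pass with the column median.
imputer = SimpleImputer(strategy="median")
df[numeric_cols] = imputer.fit_transform(df[numeric_cols])

print(f"imputed {missing_before} values")
print(df)
```

In a real project the imputer would be fitted on the training split only and then reused on validation and test data, so that the test set does not influence the fill values.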

Practical Takeaways

Mastered the data cleaning process of "Locate bad data → Determine strategy → Unify representation → Batch process", which is the first step in any ML project.

7. Data Transformation

Key Knowledge Points

Encoding Categorical Features:
- OneHotEncoder: generates a binary column for each category (protocol_TCP=1/0), does not introduce ordinal relationships
- LabelEncoder: integer encoding (e.g., TCP=0, UDP=1); imposes a spurious ordering on nominal features, so integer codes should be reserved for ordinal categories or target labels
Handling Skewed Data: Apply the np.log1p() logarithmic transformation to heavily skewed numerical features to compress extreme values and make the distribution less skewed; log1p is used instead of log because log(0) is undefined
Data Splitting (Train/Validation/Test = 60%/20%/20%):
- First train_test_split(test_size=0.2) → 80% Train + 20% Test
- Second train_test_split(test_size=0.25) on 80% → 0.8×0.25=0.2 → 60% Train + 20% Validation
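
To make the log1p point concrete, here is a small sketch using the standard library's math.log1p (np.log1p is the vectorised equivalent the notes refer to). The byte counts are invented for illustration.

```python
import math

# Hypothetical bytes_transferred values: mostly small, one extreme outlier.
bytes_transferred = [0, 120, 950, 4_800, 2_500_000]

# log1p(x) = log(1 + x), so a value of 0 maps cleanly to 0 instead of failing.
log_bytes = [math.log1p(value) for value in bytes_transferred]

raw_spread = max(bytes_transferred) - min(bytes_transferred)
log_spread = max(log_bytes) - min(log_bytes)

for raw, transformed in zip(bytes_transferred, log_bytes):
    print(f"{raw:>9} -> {transformed:.2f}")
print(f"spread: {raw_spread} -> {log_spread:.2f}")
```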

Understanding and Insights

The One-Hot vs. LabelEncoder pitfall is the most important concept in this section. LabelEncoder assigns integer codes in sorted order (HTTP=0, TCP=1, UDP=2), and a model may wrongly infer that UDP > TCP > HTTP. The rule is simple: always use One-Hot for unordered categorical variables.
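
A short sketch of the contrast, using a made-up protocol column. LabelEncoder assigns codes in sorted order of the class names, so the integers carry an ordering that has no meaning for protocols; pd.get_dummies produces one independent column per category instead.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical categorical feature.
protocols = pd.Series(["TCP", "UDP", "HTTP", "TCP"], name="protocol")

# Integer encoding: one number per category, assigned alphabetically.
encoder = LabelEncoder()
codes = encoder.fit_transform(protocols)
print(dict(zip(encoder.classes_, range(len(encoder.classes_)))))
print(codes.tolist())

# One-hot encoding: one binary column per category, no implied order.
one_hot = pd.get_dummies(protocols, prefix="protocol")
print(one_hot)
```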

The second test_size=0.25 in the three-segment split is often confusing—it's a proportion of the remaining 80%, not of the whole. Remembering 0.8 × 0.25 = 0.2 will prevent confusion.
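
The two-stage split can be checked on a toy dataset of 100 items. The data and random_state value here are arbitrary choices for the sketch.

```python
from sklearn.model_selection import train_test_split

# 100 dummy samples with alternating labels.
X = list(range(100))
y = [i % 2 for i in X]

# Stage 1: hold out 20% of everything as the test set.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Stage 2: take 25% of the remaining 80% as validation (0.8 * 0.25 = 0.2 overall).
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=42
)

print(len(X_train), len(X_val), len(X_test))
```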

Practical Takeaways

Mastered the intermediate steps of the ML data pipeline: Cleaned data → Encoding categorical variables → Handling skewed distributions → Three-segment split. Each step has a clear 'why': encoding is because algorithms only recognize numbers, log transformation is because extreme values can dominate the model, and splitting is because independent validation/test sets are needed to prevent overfitting.

8. Metrics for Evaluating a Model

Key Concepts

Accuracy = (TP + TN) / All, overall correctness rate; misleading when classes are imbalanced
Precision = TP / (TP + FP), proportion of true positives among predicted positives; High Precision = fewer false positives
Recall = TP / (TP + FN), proportion of actual positives correctly identified; High Recall = fewer false negatives
F1-Score = 2 × Precision × Recall / (Precision + Recall), harmonic mean of the two
Other metrics: Specificity, AUC-ROC, Confusion Matrix

Understanding and Insights

"Accuracy can be deceptive" is the most important lesson in this section. When 99% of the dataset is normal traffic, a model predicting everything as 'normal' can achieve 99% Accuracy, but fail to detect any attacks. This is why the security domain rarely relies solely on Accuracy.

Precision and Recall are essentially a seesaw—raising the threshold reduces false positives (Precision increases), but more true attacks will also be missed (Recall decreases). F1-Score provides a balance point between the two.
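
The 99% example can be reproduced directly from the formulas above. The sketch below uses plain Python and a hypothetical helper, classification_metrics; the convention of returning 0.0 when a denominator is zero is an assumption of this sketch.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = attack)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# 990 normal connections, 10 attacks; the "model" predicts normal for everything.
y_true = [0] * 990 + [1] * 10
y_pred = [0] * 1000
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Accuracy looks excellent while Recall and F1 are zero, which is exactly the failure mode the paragraph describes.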

Practical Takeaways

Developed the ability to select metrics based on business scenarios: Intrusion detection favors Recall; spam filtering favors Precision. There is no one-size-fits-all 'best metric'; the right choice depends on the cost of each kind of error.

9. Spam Classification

Key Concepts

Bayes' Theorem: P(A|B) = P(B|A) × P(A) / P(B)—after observing evidence B, update the belief in event A
Applied to spam detection: P(Spam|Features) = P(Features|Spam) × P(Spam) / P(Features)
Naive Bayes' "Naive" Assumption: features are conditionally independent, i.e., P(F1,F2|Spam) = P(F1|Spam) × P(F2|Spam)
Classification Decision: Calculate P(Spam|Features) and P(Not Spam|Features) separately, and choose the class with the larger posterior probability.

Understanding and Insights

The "naive" assumption is almost always false in reality—"free" and "prize" are highly correlated in spam SMS, not independent. But Naive Bayes still performs well because classification only requires comparing the relative magnitudes of two probabilities, not precise estimation of their absolute values. Although the conditional independence assumption makes probability estimates inaccurate, it does not affect the ranking.

This section's calculation example nicely demonstrates the power of Bayesian updating: prior P(Spam)=0.3 (30%), posterior P(Spam|F1,F2)≈0.588 (59%) after observing features—observational evidence pulled the belief from 30% to nearly 60%.
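
The update can be traced numerically. In the sketch below only the prior of 0.3 comes from the notes; the four likelihood values are assumptions chosen for illustration (they happen to give a posterior near the quoted 0.588) and are not taken from the module.

```python
from math import prod


def posterior_spam(prior_spam, likelihoods_spam, likelihoods_ham):
    """P(Spam | features) under the naive conditional-independence assumption."""
    joint_spam = prior_spam * prod(likelihoods_spam)       # P(F1|S) * P(F2|S) * P(S)
    joint_ham = (1.0 - prior_spam) * prod(likelihoods_ham)  # same for Not Spam
    return joint_spam / (joint_spam + joint_ham)            # normalise by P(features)


prior = 0.3
# Assumed likelihoods: P(F1|Spam), P(F2|Spam) and P(F1|Ham), P(F2|Ham).
posterior = posterior_spam(prior, [0.4, 0.5], [0.2, 0.3])
print(f"prior={prior:.3f} posterior={posterior:.3f}")
```

Because both joint terms are divided by the same P(features), the classification decision only needs the comparison joint_spam > joint_ham.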

Practical Takeaways

Understood the complete inference chain of Naive Bayes in text classification—from prior probability to likelihood calculation to posterior comparison, and why a "wrong" assumption can still produce an effective classifier.

10. The Spam Dataset

Key Points

SMS Spam Collection: 5574 SMS messages, labeled as ham or spam, from the UCI Machine Learning Repository
Data Format: TSV, must specify sep="\t", header=None, names=["label", "message"] when loading
Three steps for data checking: df.head() (check if parsing is correct), df.isnull().sum() (check for missing values), df.duplicated().sum() (check for duplicates)
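Deduplication: df.drop_duplicates() removed 403 duplicates, leaving 5169 entries (these figures imply 5572 rows were loaded, two fewer than the 5574 cited in the dataset description)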

Understanding and Insights

This dataset is TSV, not CSV. If sep="\t" is omitted, a whole line of text may be parsed as a single field, and every later step will then be wrong. This does not necessarily raise an error; the results are just a mess.

Duplicate entries must be removed: If the same SMS message appears multiple times, it might appear in both the training and test sets, leading to inflated test scores (the model has "seen" this data rather than truly learned to classify it).

Practical Takeaways

Developed the habit of immediately head() + info() + duplicated() after loading data, confirming data completeness and correctness before entering any modeling steps.

11. Preprocessing the Spam Dataset

Key Knowledge Points

Standard NLP Preprocessing Pipeline (executed sequentially):

Lowercase Conversion: str.lower() — Merges "Free" and "free" into the same feature.
Punctuation and Number Removal: Regex [^a-z\s$!] — Retains $ (implies amount) and ! (implies urgency), as these two symbols have strong discriminative power for spam messages.
Tokenization: word_tokenize() — More accurate than split(" "), correctly handles contractions (e.g., don't → do, n't).
Stop Word Removal: Removes high-frequency, low-information words like the, is, and, using nltk.corpus.stopwords.
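Stemming: PorterStemmer reduces inflected forms to a common stem (running/runs → run), significantly reducing vocabulary size. Irregular forms such as ran are left unchanged because stemming only strips suffixes.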
Re-joining: " ".join(tokens) restores to a string for consumption by CountVectorizer.

Understanding and Insights

The order of preprocessing cannot be arbitrary—tokenization must precede stop word removal, which must precede stemming. If stemming is done before stop word removal, the stemmed forms of certain stop words might not be in the stop word list, leading to missed deletions.

Retaining $ and ! is a noteworthy design decision: most NLP tutorials blindly remove all punctuation, but in the specific context of spam message classification, these two symbols carry crucial discriminative information. Preprocessing is not a mechanical process; it requires judgment combined with domain knowledge.
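
The first two pipeline steps (lowercasing, then the character filter that keeps $ and !) need only the standard library. The function name clean_text and the sample message are made up for this sketch.

```python
import re


def clean_text(message):
    """Lowercase the message, then drop everything except a-z, whitespace, $ and !."""
    lowered = message.lower()
    return re.sub(r"[^a-z\s$!]", "", lowered)


sample = "WINNER!! Claim your $1000 prize now, call 0800-123."
cleaned = clean_text(sample)
print(repr(cleaned))
```

Digits and the comma disappear, but the currency symbol and the exclamation marks survive as signals for the classifier.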

Common Pitfalls

The exact same preprocessing function must be used during training and prediction. The Pipeline only encapsulates steps after CountVectorizer; the preceding text cleaning (lowercasing/regex/tokenization/stop word removal/stemming) needs to be manually ensured for consistency. Any discrepancy will lead to a mismatch in vocabulary and meaningless prediction results.

12. Feature Extraction

Key Knowledge Points

Bag of Words: Build a vocabulary, each message becomes a vector, element value = number of times the word appears in that message
CountVectorizer implements the Bag of Words model, key parameters:
- min_df=1: words must appear in at least 1 document to be kept (in practice, can be increased to 5 to remove extremely rare words)
- max_df=0.9: words appearing in 90%+ of documents are excluded (too common, equivalent to another layer of stop word filtering)
- ngram_range=(1, 2): simultaneously extracts unigrams and bigrams, capturing local word order
Output: Sparse matrix X, most elements are 0
Label transformation: y = df["label"].apply(lambda x: 1 if x == "spam" else 0) converts ham/spam to 0/1

Understanding and Insights

ngram_range=(1, 2) is a key parameter for improving performance. When using only unigrams, "free" appearing alone is not necessarily spam ("feel free to ask"), but the bigram "free prize" almost certainly is. Bigrams recover the local word order information lost by the Bag of Words model.

The output matrix is very sparse—5000+ messages × tens of thousands of words, but each message contains only a dozen words on average, and 99%+ of elements are 0. scikit-learn stores it in scipy.sparse format, which is far more memory-efficient than dense matrices.

Practical Takeaways

Mastered the complete text → numerical feature conversion chain: Raw text → Preprocessing (Section 11) → CountVectorizer → Feature matrix consumable by ML models. This pattern applies to all Bag of Words-based text classification tasks.

13. Training and Evaluation (Spam Detection)

Key Knowledge Points

Pipeline: chains CountVectorizer + MultinomialNB into a unified estimator
- pipeline.fit(X, y) automatically vectorizes first, then trains
- pipeline.predict(new_text) automatically vectorizes first, then predicts
- joblib.dump(pipeline) saves the complete pipeline, directly usable after loading.
GridSearchCV: Iterates through hyperparameter combinations and selects the optimal configuration using cross-validation.
- Search for alpha (Laplace smoothing factor) for the optimal value in [0.01, 0.1, 0.15, 0.2, 0.25, 0.5, 0.75, 1.0].
- Evaluation metric: scoring="f1", 5-fold cross-validation.
Model Evaluation: For new messages, you must first manually perform the exact same preprocessing as during training, then call pipeline.predict().
Model Persistence: joblib.dump() save / joblib.load() restore, what is saved is the complete Pipeline.

Understanding and Insights

Intuition for the alpha parameter: MultinomialNB's alpha is the Laplace smoothing factor. With alpha=0, a vocabulary word that never appeared in one class during training gets a likelihood of zero for that class, which zeroes out the whole product for any message containing it (zero multiplied by any number is zero). Words outside the fitted vocabulary are simply dropped by CountVectorizer, so they do not trigger this problem. The larger alpha is, the more "conservative" the model (more uniform probability distribution); the smaller alpha is, the more "aggressive" (more trust in training data).
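
The effect of alpha can be seen with the standard additive-smoothing formula, P(word | class) = (count + alpha) / (total + alpha × vocabulary size). The word counts below are invented for the sketch.

```python
from math import prod


def word_likelihood(count, total_words, vocab_size, alpha):
    """Additively smoothed estimate of P(word | class)."""
    return (count + alpha) / (total_words + alpha * vocab_size)


# Hypothetical word counts observed in the spam class during training.
spam_counts = {"free": 30, "prize": 20, "meeting": 0}
total = sum(spam_counts.values())
vocab_size = len(spam_counts)

message = ["free", "meeting"]  # contains a word never seen in spam

unsmoothed = prod(word_likelihood(spam_counts[w], total, vocab_size, alpha=0.0)
                  for w in message)
smoothed = prod(word_likelihood(spam_counts[w], total, vocab_size, alpha=1.0)
                for w in message)

print(f"alpha=0: {unsmoothed}")
print(f"alpha=1: {smoothed:.5f}")
```

With alpha=0 the single unseen word wipes out all other evidence; with alpha=1 it only lowers the score.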

Limitations of Pipeline: Pipeline only encapsulates the steps after CountVectorizer. The preceding text cleaning (lowercase/regex/tokenization/stopwords removal/stemming) is not within the Pipeline. When predicting new messages, the same preprocess_message() function must be called manually.

Practical Takeaways

Mastered two core engineering patterns of scikit-learn—Pipeline and GridSearchCV. These two tools will be repeatedly used in any scikit-learn project.

14. Model Evaluation (Spam Detection)

Exercise Solutions

Q1: What is the flag you get from submitting a good model for evaluation?

Solution Approach:

Overall process: Download SMS Spam Collection dataset → Text preprocessing (lowercase/punctuation removal/tokenization/stopwords removal/stemming) → CountVectorizer feature extraction → GridSearchCV training MultinomialNB → Save and upload.

Complete Training Code:

import requests, zipfile, io, os, re, json
import pandas as pd
import numpy as np
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
import joblib

nltk.download("punkt", quiet=True)
nltk.download("punkt_tab", quiet=True)
nltk.download("stopwords", quiet=True)

# 1. Download dataset
url = "https://archive.ics.uci.edu/static/public/228/sms+spam+collection.zip"
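response = requests.get(url, timeout=60)  # TLS certificate verification stays enabled
response.raise_for_status()  # stop early if the download failed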
with zipfile.ZipFile(io.BytesIO(response.content)) as z:
    z.extractall("sms_spam_collection")

# 2. Load Data
df = pd.read_csv(
    "sms_spam_collection/SMSSpamCollection",
    sep="\t", header=None, names=["label", "message"],
)
df = df.drop_duplicates()

# 3. Text Preprocessing
df["message"] = df["message"].str.lower()
df["message"] = df["message"].apply(lambda x: re.sub(r"[^a-z\s$!]", "", x))
df["message"] = df["message"].apply(word_tokenize)

stop_words = set(stopwords.words("english"))
df["message"] = df["message"].apply(lambda x: [w for w in x if w not in stop_words])

stemmer = PorterStemmer()
df["message"] = df["message"].apply(lambda x: [stemmer.stem(w) for w in x])
df["message"] = df["message"].apply(lambda x: " ".join(x))

# 4. Feature Extraction + Training
vectorizer = CountVectorizer(min_df=1, max_df=0.9, ngram_range=(1, 2))
y = df["label"].apply(lambda x: 1 if x == "spam" else 0)

pipeline = Pipeline([
    ("vectorizer", vectorizer),
    ("classifier", MultinomialNB())
])

param_grid = {"classifier__alpha": [0.01, 0.1, 0.15, 0.2, 0.25, 0.5, 0.75, 1.0]}
grid_search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1")
grid_search.fit(df["message"], y)
best_model = grid_search.best_estimator_

# 5. Save Model
joblib.dump(best_model, "spam_detection_model.joblib")

Upload Model:

curl -F "model=@spam_detection_model.joblib" http://<TARGET_IP>:8000/api/upload

Answer: HTB{sp4m_cla55if13r_3v4lu4t0r}

15. Network Anomaly Detection

Key Knowledge Points

Random Forest: An ensemble learning algorithm that builds multiple decision trees and aggregates their results (classification = majority voting, regression = taking the average)
Three core mechanisms:
- Bootstrap Sampling: Sampling with replacement to create multiple training subsets, so each tree sees different data
- Random Feature Selection: Only considers a subset of features for each split, reducing correlation between trees
- Voting Aggregation: A single tree might not be accurate, but accuracy significantly improves after multiple trees vote
NSL-KDD Dataset: A standard benchmark for network intrusion detection, an improvement over KDD Cup 1999 (removing redundant records, which softens but does not eliminate class imbalance)
The data contains 41 features (statistical properties of network connections) and attack type labels

Understanding and Insights

Random Forest is an ideal choice for this task: network traffic data has 40+ features and is high-dimensional. Random Forest is naturally good at handling high-dimensional data, does not require feature scaling (based on split thresholds rather than distance calculations), is robust to outliers, trains much faster than deep learning, and can yield good results with default parameters.

NSL-KDD is to network intrusion detection what MNIST is to image classification: a standard benchmark in academia. It addressed two serious problems of the original KDD dataset: redundant records (which bias models towards frequent patterns) and the most extreme class imbalance that those records caused. Rare classes such as Privilege Escalation are still small, as Section 17 shows.

Practical Takeaways

Understood why "ensemble" is stronger than "individual": Each tree only sees a subset of data and features, and might be inaccurate individually, but after 100 trees vote, the noise is averaged out, and accuracy significantly improves. This idea is not limited to Random Forest; it's the foundation of the entire field of ensemble learning.
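
The "noise averages out" intuition can be quantified under a strong simplifying assumption: binary classification with voters that err independently. Real trees in a forest are correlated, so this is an idealised upper bound rather than a description of Random Forest itself. The function name and the 0.6 accuracy are illustrative, and 101 voters are used instead of 100 to avoid ties.

```python
from math import comb


def majority_vote_accuracy(n_voters, p_correct):
    """Probability that more than half of n independent voters are correct."""
    needed = n_voters // 2 + 1
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )


single = majority_vote_accuracy(1, 0.6)
ensemble = majority_vote_accuracy(101, 0.6)
print(f"one voter: {single:.3f}, 101 voters: {ensemble:.3f}")
```

Bootstrap sampling and random feature selection exist precisely to push real trees closer to this independence assumption.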

16. Preprocessing and Splitting the Dataset

Key Points

Binary Classification Target: attack_flag — normal → 0, any attack → 1
Multi-class Classification Target: attack_map — mapping dozens of specific attack names to 5 classes:
- 0 = Normal, 1 = DoS, 2 = Probe
- 3 = Privilege Escalation, 4 = Access
Categorical Variable Encoding: pd.get_dummies(df[['protocol_type', 'service']]) One-Hot Encoding
Numerical Features: 34 statistical metrics (duration, src_bytes, dst_bytes, serror_rate, etc.) used directly
Data Splitting: 80/20 split for test set → then 70/30 split from training set for validation set
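
The two targets can be sketched with a dictionary lookup. Only a handful of attack names are included here for brevity; the full lists appear in the training code in Section 18. The constant and function names are hypothetical.

```python
# Subset of attack names per class (illustrative, not the complete lists).
ATTACK_CLASSES = {
    1: ["neptune", "smurf", "teardrop"],   # DoS
    2: ["ipsweep", "nmap", "portsweep"],   # Probe
    3: ["buffer_overflow", "rootkit"],     # Privilege Escalation
    4: ["guess_passwd", "warezclient"],    # Access
}

# Invert to a flat name -> class lookup.
ATTACK_TO_CLASS = {
    name: label for label, names in ATTACK_CLASSES.items() for name in names
}


def attack_flag(attack):
    """Binary target: 0 for normal traffic, 1 for any attack."""
    return 0 if attack == "normal" else 1


def attack_map(attack):
    """Multi-class target; names not in the lookup fall back to 0 (Normal)."""
    return ATTACK_TO_CLASS.get(attack, 0)


samples = ["normal", "neptune", "nmap", "rootkit", "guess_passwd"]
print([(name, attack_flag(name), attack_map(name)) for name in samples])
```

The fallback to 0 means a misspelled or unlisted attack name is silently labeled Normal, so the lists need to match the dataset's labels exactly.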

Understanding and Insights

Trade-offs between Binary vs. Multi-class Classification: Binary classification (normal/attack) is simple but provides less information—only knowing "there's an attack" but not its type. Multi-class classification (5 classes) can distinguish attack types, which is more valuable for actual security responses (DoS requires rate limiting, Probe requires monitoring, Privilege Escalation requires immediate isolation). The evaluation port for this module requires a multi-class model.

Random Forest does not require feature scaling (as it's based on splitting thresholds rather than distance calculations), so the 34 numerical features can be used directly, without needing StandardScaler like SVM/KNN.

Common Pitfalls

random_state=1337 must be consistent with the tutorial, otherwise, if the data split is different, the final model performance may differ from expectations.
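Some attack names in the tutorial's lists (dos_attacks, probe_attacks, and so on) are spelled unintuitively, for example loadmdoule instead of loadmodule. Copy them directly from the tutorial rather than typing them by hand. Note that any name matching none of the lists falls through to class 0 (Normal) in the mapping function, so a spelling that differs from the dataset's label silently changes the target.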

17. Training and Evaluation (Network Anomaly Detection)

Key Takeaways

Training: RandomForestClassifier(random_state=1337) default parameters are sufficient.
Evaluation Metrics: accuracy_score, precision_score, recall_score, f1_score (use average='weighted' for multi-class classification)
Visualization: confusion_matrix + seaborn.heatmap to plot the confusion matrix; classification_report to output details for each class.
Two-round evaluation: First, tune parameters/confirm direction on the validation set, then report performance on the test set.
Model Saving: joblib.dump(rf_model, 'network_anomaly_detection_model.joblib')

Understanding and Insights

Random Forest with default parameters and no feature engineering achieved 99.5% F1, while Naive Bayes for Spam Detection reached only 93% even with careful parameter tuning. The two scores come from different datasets, so the comparison is only suggestive, but it is consistent with the idea that matching the algorithm to the data matters more than parameter tuning: Random Forest is well suited to tabular data with many features, whereas text classification has more noise to contend with.

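Meaning of average='weighted': For multi-class classification, per-class F1 scores can be averaged in several ways, including macro (equal weight for each class) and weighted (weighted by sample count). The Privilege class has only dozens of samples; under macro averaging its low F1 drags the overall score down sharply, while weighted averaging reflects performance on the traffic as a whole. The trade-off is that a weighted score can hide weak performance on rare classes, so the per-class rows of classification_report are still worth checking.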
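
The difference between the two averages is easy to see with numbers. The per-class F1 scores and sample counts below are invented for illustration and are not results from the module.

```python
# Hypothetical per-class F1 scores and supports (number of true samples).
per_class_f1 = {"Normal": 0.99, "DoS": 0.99, "Probe": 0.98, "Privilege": 0.50, "Access": 0.90}
support = {"Normal": 13000, "DoS": 9000, "Probe": 2300, "Privilege": 10, "Access": 200}

# Macro: every class counts equally, however rare it is.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# Weighted: each class contributes in proportion to its sample count.
total_support = sum(support.values())
weighted_f1 = sum(per_class_f1[c] * support[c] for c in per_class_f1) / total_support

print(f"macro F1:    {macro_f1:.3f}")
print(f"weighted F1: {weighted_f1:.3f}")
```

The rare Privilege class pulls the macro score well below the weighted one, while the weighted score barely registers that half of those attacks are being mishandled.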

How to read a confusion matrix: Diagonal = number of correctly classified instances, off-diagonal = misclassified instances. If there's a number in the Access column of the Probe row, it means some Probe attacks were misclassified as Access, and targeted improvements can be made.
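
A confusion matrix is just a table of counts, which the following plain-Python sketch builds by hand using the same orientation as scikit-learn's confusion_matrix (rows are true classes, columns are predicted classes). The ten labels are made up.

```python
labels = ["Normal", "DoS", "Probe", "Privilege", "Access"]

# Hypothetical true and predicted class indices for ten connections.
y_true = [0, 0, 1, 1, 2, 2, 2, 3, 4, 4]
y_pred = [0, 0, 1, 1, 2, 4, 2, 0, 4, 4]

# matrix[i][j] = number of samples of true class i predicted as class j.
matrix = [[0] * len(labels) for _ in labels]
for true_label, predicted_label in zip(y_true, y_pred):
    matrix[true_label][predicted_label] += 1

correct = sum(matrix[i][i] for i in range(len(labels)))
probe_as_access = matrix[labels.index("Probe")][labels.index("Access")]

for name, row in zip(labels, matrix):
    print(f"{name:<10} {row}")
print(f"correct: {correct}, Probe misread as Access: {probe_as_access}")
```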

Practical Takeaways

Experienced the complete closed loop from training to evaluation to visualization, and mastered the method of using confusion matrices and classification reports to pinpoint model weaknesses—this is much more valuable than just looking at a single F1 score.

18. Model Evaluation (Network Anomaly Detection)

Exercise Solutions

Q1: What is the flag you get from submitting a good model for evaluation?

Approach:

Using the NSL-KDD dataset, traffic is divided into 5 categories (Normal/DoS/Probe/Privilege/Access), and Random Forest is employed.

Complete Training Code:

import requests, zipfile, io
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score
import joblib

# 1. Download Dataset
url = "https://academy.hackthebox.com/storage/modules/292/KDD_dataset.zip"
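response = requests.get(url, timeout=60)  # TLS certificate verification stays enabled
response.raise_for_status()  # stop early if the download failed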
z = zipfile.ZipFile(io.BytesIO(response.content))
z.extractall('.')

# 2. Load Data
columns = [
    'duration', 'protocol_type', 'service', 'flag', 'src_bytes', 'dst_bytes',
    'land', 'wrong_fragment', 'urgent', 'hot', 'num_failed_logins', 'logged_in',
    'num_compromised', 'root_shell', 'su_attempted', 'num_root', 'num_file_creations',
    'num_shells', 'num_access_files', 'num_outbound_cmds', 'is_host_login', 'is_guest_login',
    'count', 'srv_count', 'serror_rate', 'srv_serror_rate', 'rerror_rate', 'srv_rerror_rate',
    'same_srv_rate', 'diff_srv_rate', 'srv_diff_host_rate', 'dst_host_count', 'dst_host_srv_count',
    'dst_host_same_srv_rate', 'dst_host_diff_srv_rate', 'dst_host_same_src_port_rate',
    'dst_host_srv_diff_host_rate', 'dst_host_serror_rate', 'dst_host_srv_serror_rate',
    'dst_host_rerror_rate', 'dst_host_srv_rerror_rate', 'attack', 'level'
]
df = pd.read_csv('KDD+.txt', names=columns)

# 3. Create Multi-class Target
dos = ['apache2','back','land','neptune','mailbomb','pod','processtable','smurf','teardrop','udpstorm','worm']
probe = ['ipsweep','mscan','nmap','portsweep','saint','satan']
priv = ['buffer_overflow','loadmdoule','perl','ps','rootkit','sqlattack','xterm']
access = ['ftp_write','guess_passwd','http_tunnel','imap','multihop','named','phf',
          'sendmail','snmpgetattack','snmpguess','spy','warezclient','warezmaster','xclock','xsnoop']

def map_attack(a):
    if a in dos: return 1
    elif a in probe: return 2
    elif a in priv: return 3
    elif a in access: return 4
    else: return 0

df['attack_map'] = df['attack'].apply(map_attack)

# 4. Encode Categorical Variables + Select Numerical Features
encoded = pd.get_dummies(df[['protocol_type', 'service']])
numeric_features = [
    'duration','src_bytes','dst_bytes','wrong_fragment','urgent','hot',
    'num_failed_logins','num_compromised','root_shell','su_attempted',
    'num_root','num_file_creations','num_shells','num_access_files',
    'num_outbound_cmds','count','srv_count','serror_rate','srv_serror_rate',
    'rerror_rate','srv_rerror_rate','same_srv_rate','diff_srv_rate',
    'srv_diff_host_rate','dst_host_count','dst_host_srv_count',
    'dst_host_same_srv_rate','dst_host_diff_srv_rate',
    'dst_host_same_src_port_rate','dst_host_srv_diff_host_rate',
    'dst_host_serror_rate','dst_host_srv_serror_rate',
    'dst_host_rerror_rate','dst_host_srv_rerror_rate'
]
train_set = encoded.join(df[numeric_features])
multi_y = df['attack_map']

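# 5. Data Split
# First hold out 20% as a test set, then split the remaining 80% again (70/30).
# Only the 70% portion is used for training; the validation part is discarded here.
train_X, test_X, train_y, test_y = train_test_split(train_set, multi_y, test_size=0.2, random_state=1337)
multi_train_X, _, multi_train_y, _ = train_test_split(train_X, train_y, test_size=0.3, random_state=1337)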

# 6. Train + Save
rf_model = RandomForestClassifier(random_state=1337)
rf_model.fit(multi_train_X, multi_train_y)
joblib.dump(rf_model, 'network_anomaly_detection_model.joblib')

Upload Model:

curl -F "model=@network_anomaly_detection_model.joblib" http://<TARGET_IP>:8001/api/upload

Answer: HTB{n3tw0rk_tr4ff1c_4n0m4ly_d3t3ct0r}

19. Malware Classification

Key Points

Malware Families: Categories of malware classified by behavior, propagation methods, and technical characteristics (e.g., Emotet, WannaCry). Detailed information can be found on Malpedia.
Traditional classification methods: Static analysis (disassembly/decompilation) + Dynamic analysis (sandbox execution to observe behavior) + Reverse engineering, which are time-consuming and require specialized skills.
Malware Image Classification: Maps each byte (0-255) of a binary file to a grayscale pixel value to generate a visual image; malware from the same family exhibits similar image textures due to shared code structures.
Using CNNs to classify these images transforms the malware family identification problem into an image classification problem.

Understanding and Insights

"Drawing binaries as images" might seem counter-intuitive at first glance, but it becomes quite natural once you think it through—a binary file is essentially a sequence of bytes from 0-255, and mapping each byte to a grayscale pixel forms an image. The key insight is: malware from the same family, due to shared code segments, packing methods, and data structures, visually exhibits similar texture patterns in the generated images—this is the basis for CNNs to classify them.
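
To make the byte-to-pixel mapping concrete, here is a small sketch using only the standard library. The row width and the zero-padding of the last row are assumptions made for illustration, and the sample bytes are a hypothetical stand-in, not real malware.

```python
def bytes_to_byteplot(data: bytes, width: int = 8):
    """Arrange raw bytes into rows of `width` grayscale pixels (0 = black, 255 = white)."""
    rows = [list(data[i:i + width]) for i in range(0, len(data), width)]
    # Pad the last row with zeros so every row has the same length
    if rows and len(rows[-1]) < width:
        rows[-1] += [0] * (width - len(rows[-1]))
    return rows

# Stand-in for the contents of a binary file (hypothetical bytes)
sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0xFF, 0xFF, 0x00, 0x00, 0xB8, 0x00])
image = bytes_to_byteplot(sample, width=4)

for row in image:
    print(row)
```

Each inner list is one image row; files of different lengths yield a different number of rows, which is why the images later need resizing to a common shape.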

Two practical advantages of image classification: (1) CNNs are very mature in image classification (ResNet/VGG/EfficientNet) and can be reused directly via transfer learning; (2) images are inert data, so processing them cannot execute the malware, which is much safer than handling the malicious binaries directly.

Practical Takeaways

Understood how to transform a difficult problem (malware classification) into a problem with existing mature solutions (image classification) through data representation transformation (binary → image). This "mapping problems to known domains" approach is very common in ML engineering.

20. The Malware Dataset

Key Points

malimg dataset: 9,339 grayscale PNG images of malware, covering 25 malware families
Directory structure: one folder per family, folder name = family name (e.g., Adialer.C, Allaple.A, Rbot!gen, etc.)
Image source: each byte value (0-255) of a PE file (Windows executable) is directly mapped to grayscale pixel brightness (0=black, 255=white)
Image sizes are inconsistent (because different binary files have different lengths), requiring uniform Resize later.
Download method: wget kaggle.com/.../malimg-original -O malimg.zip && unzip malimg.zip

Understanding and Insights

The dataset's directory organization (one folder per family) perfectly matches the expected format of PyTorch ImageFolder — no need to manually write annotation files or CSV mapping tables, ImageFolder automatically uses the folder name as the label. This is the most common data organization method in PyTorch image classification projects.
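
The labeling rule can be sketched without PyTorch: sub-folder names are sorted and numbered from 0. The helper below is a hypothetical imitation of that rule, run against a temporary directory with three of the family names mentioned above; it is not the library's own code.

```python
import os
import tempfile

def find_classes(root):
    """Mimic the labeling rule: sorted sub-folder names become class indices 0..n-1."""
    classes = sorted(e.name for e in os.scandir(root) if e.is_dir())
    return classes, {name: idx for idx, name in enumerate(classes)}

with tempfile.TemporaryDirectory() as root:
    # Create empty family folders in arbitrary order
    for family in ["Rbot!gen", "Adialer.C", "Allaple.A"]:
        os.makedirs(os.path.join(root, family))
    classes, class_to_idx = find_classes(root)

print(classes)
print(class_to_idx)
```

Because the index depends on the sorted folder names, adding or renaming a family folder can shift the numeric labels of the other classes.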

Imbalanced data distribution: an average of 374 images per class, but some classes have fewer than 100 images, while others have over 1000. This imbalance may lead to poor recognition of minority classes by the model, which can be mitigated in practice using oversampling / class weights.
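
One common way to derive class weights is the inverse-frequency heuristic, weight = n_samples / (n_classes * class_count). The sketch below applies it to hypothetical per-family counts chosen only to show the effect; they are not the real malimg counts.

```python
def balanced_class_weights(counts):
    """weight_c = n_samples / (n_classes * count_c), so rare classes get larger weights."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {name: n_samples / (n_classes * c) for name, c in counts.items()}

# Hypothetical per-family image counts (illustration only)
counts = {"family_a": 1000, "family_b": 400, "family_c": 100}
weights = balanced_class_weights(counts)

for name, w in weights.items():
    print(f"{name}: {w:.2f}")
```

In a PyTorch setup, weights like these could be converted to a tensor in class-index order and passed as the weight argument of nn.CrossEntropyLoss, so that mistakes on rare families cost more.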

Practical Takeaways

Mastered the concept of malware byteplot — where each pixel is the value of a byte in the binary file. Malware from different families, due to varying code structures and packing methods, exhibits distinct byteplot textures, which provides a visual basis for CNN classification.

21. Preprocessing the Malware Dataset

Key Points

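Data Splitting: use the split-folders library to split the data into training and test sets at an 80/20 ratio (splitfolders.ratio(ratio=(0.8, 0, 0.2)); the three values are the train, validation, and test shares, so validation is set to 0 here)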
Image Preprocessing (transforms.Compose):
- Resize((75, 75)): Unify all image sizes (original sizes vary due to different binary lengths)
- ToTensor(): Convert PIL Image to PyTorch Tensor, pixel values scaled from [0,255] to [0,1]
- Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]): Normalization parameters required by ImageNet pre-trained models
ImageFolder: Automatically load dataset from directory structure (folder name = class name), automatically generate labels
DataLoader: Encapsulates batch iteration, key parameters batch_size (number of samples per batch), shuffle (whether to shuffle), num_workers (number of parallel loading processes)

Understanding and Insights

Why are the mean/std values for Normalize not the statistics of our own dataset? Because the ResNet50 we use is pre-trained on ImageNet, and its convolutional kernel weights are learned based on ImageNet's distribution. Input data must use the same normalization parameters, otherwise the response of the pre-trained weights will shift, and the feature extraction effect will be greatly reduced.
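
The arithmetic behind ToTensor followed by Normalize can be traced for a single pixel. This sketch assumes the grayscale value is replicated across the three RGB channels when the image is loaded (the default ImageFolder loader converts images to RGB); the function name is hypothetical.

```python
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def to_tensor_then_normalize(pixel):
    """Apply the two steps to one grayscale pixel replicated across R, G, B."""
    scaled = pixel / 255.0  # ToTensor: [0, 255] -> [0, 1]
    # Normalize: subtract the channel mean, divide by the channel std
    return [(scaled - m) / s for m, s in zip(IMAGENET_MEAN, IMAGENET_STD)]

black = to_tensor_then_normalize(0)
white = to_tensor_then_normalize(255)

print([round(v, 3) for v in black])
print([round(v, 3) for v in white])
```

The same byte value ends up slightly different in each channel, because each channel has its own mean and std; this is the input range the pre-trained kernels were fitted to.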

Trade-offs of Resizing to 75×75: Larger sizes (e.g., 224×224, ResNet's original input size) preserve more texture details, but training is slower and memory consumption is higher; 75×75 is a compromise made by the tutorial for training speed, sacrificing some accuracy.

Logic for choosing batch_size: a small batch (e.g., 32) gives noisier gradient estimates and many more iterations per epoch, so training is slow; a very large batch (e.g., 2048) may exceed GPU memory. 512 is a good choice that runs efficiently on most GPUs.
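
The effect of batch_size on the number of optimizer steps per epoch is simple arithmetic. The training-set size below is an assumption (roughly 80% of 9339 images; the exact count depends on how the split rounds per class), and the last partial batch is counted, which matches DataLoader's default of keeping it.

```python
import math

def batches_per_epoch(n_samples, batch_size):
    """Number of batches when the final, smaller batch is kept."""
    return math.ceil(n_samples / batch_size)

n_train = 7470  # assumed: roughly 80% of 9339 images
steps = {bs: batches_per_epoch(n_train, bs) for bs in (32, 512, 2048)}

for bs, n in steps.items():
    print(f"batch_size={bs}: {n} steps per epoch")
```

Fewer, larger steps keep the GPU busier per step, at the cost of more memory per batch.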

Practical Takeaways

Mastered the standard trio for PyTorch image data pipelines: transforms (preprocessing) → ImageFolder (dataset) → DataLoader (iterator). This pattern applies to all PyTorch image classification projects.

22. The Model

Key Knowledge Points

Transfer learning based on ResNet50: load ImageNet pre-trained weights (weights='DEFAULT'); the network is 50 layers deep, with approximately 23 million parameters in the convolutional backbone (about 25.6 million including the original 1000-class output layer)
Freezing Strategy: setting requires_grad = False freezes all pre-trained layers, so only the replaced last layer is trained
Custom fully connected layer: Linear(2048, 1000) → ReLU → Linear(1000, n_classes)
- 2048 = number of features entering ResNet50's final fc layer (resnet.fc.in_features)
- 1000 = adjustable hidden layer size
- n_classes = 25 (dynamically obtained from len(train_dataset.classes))
Model definition inherits nn.Module, requiring implementation of __init__() and forward() methods

Understanding and Insights

Why transfer learning works: The low-level features (edges, textures, shapes) learned by ResNet50 on ImageNet are general and equally applicable to malware byte maps. We only need to replace the last layer to adapt to 25 classes, without training 23 million parameters from scratch.

Trade-off: Frozen vs. Unfrozen: With the backbone frozen, only the new fully connected head is trained (approx. 2 million parameters versus roughly 23 million in the backbone), which can make training around 10x faster or more. The trade-off is that the feature extractor cannot be fine-tuned for the specific textures of malware images. In practice, freezing still achieves ~89% test accuracy, which is sufficient for a PoC; if higher accuracy is desired, more layers can be gradually unfrozen.
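
The "approx. 2 million" figure for the trainable head can be checked by hand, assuming n_classes = 25 and the Linear(2048, 1000) → ReLU → Linear(1000, n_classes) layout described above. ReLU has no parameters; the helper name is hypothetical.

```python
def linear_params(in_features, out_features):
    """A fully connected layer has one weight per input-output pair plus one bias per output."""
    return in_features * out_features + out_features

first = linear_params(2048, 1000)   # hidden layer
second = linear_params(1000, 25)    # output layer for 25 families
head = first + second

print(f"{first:,} + {second:,} = {head:,} trainable parameters")
```

Almost all of the trainable parameters sit in the first linear layer, so shrinking the hidden size is the main lever if the head needs to be smaller.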

Benefits of dynamically obtaining n_classes: Using len(train_dataset.classes) instead of hardcoding 25 means the code does not need to be modified after adding or removing malware families.

Practical Takeaways

Mastered the standard pattern for transfer learning in PyTorch: Load pre-trained model → Freeze layers → Replace last layer → Train. This pattern applies to most image classification tasks, only requiring modification of the last layer's output dimension.

23. Training and Evaluation (Malware Image Classification)

Key Knowledge Points

Five-step training loop template (repeated for each batch):
1. optimizer.zero_grad() — Clear previous gradients (PyTorch accumulates gradients by default)
2. outputs = model(inputs) — Forward pass
3. loss = criterion(outputs, labels) — Calculate CrossEntropyLoss
4. loss.backward() — Backward pass, calculate gradients for each parameter
5. optimizer.step() — Adam optimizer updates parameters using gradients
Evaluation Mode: model.eval() switches BatchNorm and Dropout to inference behavior; torch.no_grad() disables gradient tracking, saving memory and time
Model Saving: torch.jit.script(model) compiles the model to TorchScript, and .save() writes it to a file (.pth here) containing model structure + parameters, which can be loaded independently
Training Parameters: 10 epochs, batch_size=512, Adam optimizer (default learning rate, 1e-3), CrossEntropyLoss
Actual Performance: Training accuracy ~96%, Test accuracy ~89%

Understanding and Insights

model.eval() vs model.train() is not just a semantic tag: in eval mode, BatchNorm uses its stored running statistics instead of per-batch statistics, and Dropout stops randomly zeroing activations. Forgetting to switch to eval mode makes inference results depend on the batch and vary between runs.
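
The difference between the two modes is easy to observe on a standalone Dropout layer. This is a minimal sketch on a toy tensor, not part of the module's training code; the seed only makes the run repeatable.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Dropout(p=0.5)
x = torch.ones(1000)

layer.train()
train_out = layer(x)   # about half the entries zeroed, survivors scaled by 1/(1-p) = 2

layer.eval()
eval_out = layer(x)    # identity: the input passes through unchanged

print("zeroed in train mode:", int((train_out == 0).sum()))
print("eval output equals input:", torch.equal(eval_out, x))
```

The same call on the same input gives different results depending on the mode, which is why evaluation code should always call model.eval() first.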

Why use jit.script instead of state_dict: torch.save(model.state_dict()) only saves parameter weights, so loading requires instantiating a model class with the same structure first. jit.script serializes structure + parameters together, allowing the evaluation server to load the model without your MalwareClassifier class definition. This is crucial for model delivery.
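
A small round trip shows the point: a scripted model can be restored from the saved archive alone. The tiny nn.Sequential network and the in-memory buffer are stand-ins chosen for illustration (the real model is saved to a file); nothing here is the evaluation server's actual loading code.

```python
import io
import torch
import torch.nn as nn

# Hypothetical stand-in for the real classifier
model = nn.Sequential(nn.Linear(4, 2), nn.ReLU()).eval()

buffer = io.BytesIO()
torch.jit.save(torch.jit.script(model), buffer)  # structure + weights in one archive

buffer.seek(0)
restored = torch.jit.load(buffer)  # rebuilt from the archive, no model-building code needed

x = torch.randn(3, 4)
same = torch.allclose(model(x), restored(x))
print("restored model matches original:", same)
```

With a state_dict, the loading side would have to rebuild the identical architecture before calling load_state_dict, which is exactly what a remote evaluator cannot do.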

Move the model to CPU with .to("cpu") before saving: a model saved from a GPU keeps its tensors tagged with that device, and loading it in a CPU-only environment can fail unless the loader remaps devices.

Practical Takeaways

Mastered the complete PyTorch training → evaluation → saving closed loop
Experienced the actual effect of GPU acceleration: CPU ~210 seconds per epoch vs MPS ~19 seconds (about 11x faster); a CUDA GPU might be even faster
Understood the differences between scikit-learn and PyTorch paradigms and their respective applicable scenarios

24. Model Evaluation (Malware Image Classification)

Exercise Solutions

Q1: What is the flag you get from submitting a good model for evaluation?

Approach:

The malimg dataset (byteplot images of 25 malware families) is used to train a classifier via transfer learning on a pre-trained ResNet50.

Prerequisites:

pip3 install torch torchvision split-folders
wget https://www.kaggle.com/api/v1/datasets/download/ikrambenabd/malimg-original -O malimg.zip
unzip malimg.zip

Complete Training Code:

import os, time
import torch
import torch.nn as nn
import torchvision.models as models
from torchvision import transforms
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder
import splitfolders

# Automatically detect GPU: CUDA > MPS (Apple Silicon) > CPU
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")
print(f"Using device: {device}")

# 1. Split dataset (80% train / 20% test)
splitfolders.ratio(
    input="./malimg_paper_dataset_imgs/",
    output="./newdata/",
    ratio=(0.8, 0, 0.2)
)

# 2. Data loading and preprocessing
transform = transforms.Compose([
    transforms.Resize((75, 75)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
train_dataset = ImageFolder(root="./newdata/train", transform=transform)
test_dataset = ImageFolder(root="./newdata/test", transform=transform)
train_loader = DataLoader(train_dataset, batch_size=512, shuffle=True, num_workers=0)
test_loader = DataLoader(test_dataset, batch_size=1024, shuffle=False, num_workers=0)
n_classes = len(train_dataset.classes)

# 3. Model definition (ResNet50 transfer learning, freeze all weights except the last layer)
class MalwareClassifier(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        self.resnet = models.resnet50(weights='DEFAULT')
        for param in self.resnet.parameters():
            param.requires_grad = False
        num_features = self.resnet.fc.in_features
        self.resnet.fc = nn.Sequential(
            nn.Linear(num_features, 1000),
            nn.ReLU(),
            nn.Linear(1000, n_classes)
        )

    def forward(self, x):
        return self.resnet(x)

model = MalwareClassifier(n_classes).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())

# 4. Training
for epoch in range(10):
    model.train()
    running_loss, n_total, n_correct = 0, 0, 0
    t0 = time.time()
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        _, predicted = outputs.max(1)
        n_total += labels.size(0)
        n_correct += predicted.eq(labels).sum().item()
        running_loss += loss.item()
    acc = 100 * n_correct / n_total
    print(f"Epoch {epoch+1}/10: Acc={acc:.2f}% Loss={running_loss/len(train_loader):.4f} ({time.time()-t0:.1f}s)")

# 5. Evaluation
model.eval()
n_correct, n_total = 0, 0
with torch.no_grad():
    for data, target in test_loader:
        data, target = data.to(device), target.to(device)
        output = model(data)
        _, predicted = torch.max(output, 1)
        n_total += target.size(0)
        n_correct += (predicted == target).sum().item()
print(f"Test accuracy: {100*n_correct/n_total:.2f}%")

# 6. Save (requires moving back to CPU then jit.script)
model_cpu = model.to("cpu")
model_scripted = torch.jit.script(model_cpu)
model_scripted.save("malware_classifier.pth")

Upload Model:

curl -F "model=@malware_classifier.pth" http://<TARGET_IP>:8002/api/upload

Answer: HTB{9569648083a8106ba057bbbe2d00d8ec}

25. Skills Assessment

Exercise Solutions

Q1: What is the flag you get from submitting a good model for evaluation?

Problem-solving Approach:

IMDB movie review sentiment analysis: determine whether a movie review is positive (1) or negative (0). The dataset is in JSON format and contains 25,000 movie reviews. TF-IDF + LinearSVC performed better than Naive Bayes on this task.

Complete Training Code:

import os, re, json
import pandas as pd
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import Pipeline
import joblib
import requests, zipfile, io

nltk.download("punkt", quiet=True)
nltk.download("punkt_tab", quiet=True)
nltk.download("stopwords", quiet=True)

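# 1. Download dataset
url = "https://academy.hackthebox.com/storage/modules/292/skills_assessment_data.zip"
# verify=False skips TLS certificate checks; drop it if certificate validation works in your environment
response = requests.get(url, verify=False)
response.raise_for_status()  # stop early if the download failed
with zipfile.ZipFile(io.BytesIO(response.content)) as z:
    z.extractall(".")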

# 2. Load Data
with open("train.json") as f:
    train_data = json.load(f)
df = pd.DataFrame(train_data)
df = df.drop_duplicates(subset=['text'])

# 3. Preprocessing
stop_words = set(stopwords.words("english"))
stemmer = PorterStemmer()

def preprocess(text):
    text = str(text).lower()
    text = re.sub(r"<[^>]+>", " ", text)       # Remove HTML Tags
    text = re.sub(r"[^a-z\s$!]", "", text)     # Keep letters, spaces, $ and !
    tokens = word_tokenize(text)
    tokens = [w for w in tokens if w not in stop_words]
    tokens = [stemmer.stem(w) for w in tokens]
    return " ".join(tokens)

df['processed'] = df['text'].apply(preprocess)
y = df['label'].astype(int)

# 4. Train TF-IDF + LinearSVC
pipeline = Pipeline([
    ("vectorizer", TfidfVectorizer(
        min_df=2, max_df=0.9,
        ngram_range=(1, 2),
        max_features=80000,
        sublinear_tf=True
    )),
    ("classifier", LinearSVC(C=1.0, max_iter=10000))
])

pipeline.fit(df['processed'], y)

# 5. Save
joblib.dump(pipeline, "skills_assessment.joblib")

Upload Model:

curl -F "model=@skills_assessment.joblib" http://<TARGET_IP>:5000/api/upload

Answer: HTB{s3nt1m3nt_4n4lys1s_d4t4}

Answer Quick Check

Chapter	Question Number	Answer
2 - Environment Setup	Q1	`DONE`
14 - Model Evaluation (Spam Detection)	Q1	`HTB{sp4m_cla55if13r_3v4lu4t0r}`
18 - Model Evaluation (Network Anomaly Detection)	Q1	`HTB{n3tw0rk_tr4ff1c_4n0m4ly_d3t3ct0r}`
24 - Model Evaluation (Malware Image Classification)	Q1	`HTB{9569648083a8106ba057bbbe2d00d8ec}`
25 - Skills Assessment	Q1	`HTB{s3nt1m3nt_4n4lys1s_d4t4}`
