
I Won $10,000 in a Machine Learning Competition — Here’s My Complete Strategy

I won $10,000 in my first ML competition, and honestly, I’m still a bit shocked.

I’ve worked as a data scientist in FinTech for six years. When I saw that Spectral Finance was running a credit scoring challenge for Web3 wallets, I decided to give it a try despite having zero blockchain experience.

Here were my limitations:

  • I used my computer, which has no GPUs
  • I only had a weekend (~10 hours) to work on it
  • I had never touched web3 or blockchain data before
  • I had never built a neural network for credit scoring

The competition goal was straightforward: predict which Web3 wallets were likely to default on loans using their transaction history. Essentially, traditional credit scoring but with DeFi data instead of bank statements.

To my surprise, I came second and won $10k in USD Coin! Unfortunately, Spectral Finance has since taken the competition site and leaderboard down, but here’s a screenshot from when I won:

My username was Ds-clau, second place with a score of 83.66 (image by author)

This experience taught me that understanding the business problem really matters. In this post, I’ll show you exactly how I did it with detailed explanations and Python code snippets, so you can replicate this approach for your next machine learning project or competition.

Getting Started: You Don’t Need Expensive Hardware

Let me be clear: you don’t necessarily need an expensive cloud computing setup to win ML competitions (unless the dataset is too big to fit locally).

The dataset for this competition contained 77 features and 443k rows, which is not small by any means. The data came as a .parquet file that I downloaded using duckdb.
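
For reference, pulling a parquet file into a pandas DataFrame with duckdb takes only a couple of lines. Here’s a minimal sketch (the file name below is a placeholder, not the actual competition file):

```python
import duckdb

# Query the parquet file directly and materialize it as a pandas DataFrame.
# "wallet_data.parquet" is a placeholder path for illustration.
df = duckdb.sql("SELECT * FROM read_parquet('wallet_data.parquet')").df()

print(df.shape)  # roughly 443k rows and 77 feature columns plus the target
```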

I used my personal laptop, a MacBook Pro with 16GB RAM and no GPU. The entire dataset fit locally on my laptop, though I must admit the training process was a bit slow.

Insight: Clever sampling techniques get you 90% of the insights without the high computational costs. Many people get intimidated by large datasets and think they need big cloud instances. You can start a project locally by sampling a portion of the dataset and examining the sample first.
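
For example, DuckDB can sample rows straight from the parquet file, so you never have to hold the full dataset in memory while exploring. A quick sketch, again with a placeholder file name:

```python
import duckdb

# Read only a 10% random sample of the rows for quick local exploration.
sample_df = duckdb.sql(
    "SELECT * FROM read_parquet('wallet_data.parquet') USING SAMPLE 10%"
).df()
```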

EDA: Know Your Data

Here’s where my fintech background became my superpower: I approached this like any other credit risk problem.

First question in credit scoring: What’s the class distribution?
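
Continuing from the df loaded earlier, this check is essentially a one-liner (I’m assuming the label column is called target, with 1 = default):

```python
# Share of defaulted vs. repaid wallets in the training data.
# "target" is an assumed name for the label column.
print(df["target"].value_counts(normalize=True))
```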

Seeing the 62/38 split made me shiver… 38% is a very high default rate from a business perspective, but luckily, the competition wasn’t about pricing this product.

Next, I wanted to see which features actually mattered:
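
A sketch of that check, again assuming the label column is called target:

```python
# Pearson correlation of every numeric feature with the default label,
# sorted by absolute strength.
correlations = (
    df.corr(numeric_only=True)["target"]
    .drop("target")
    .sort_values(key=abs, ascending=False)
)
print(correlations.head(10))
```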

This is where I got excited. The patterns were exactly what I’d expect from credit data:

  • risk_factor was the strongest predictor, showing a correlation above 0.4 with the target variable (higher risk factor = more likely to default)
  • time_since_last_liquidated showed a strong negative correlation: the more recently a wallet had last been liquidated, the riskier it was. This lines up with expectations, since high velocity is usually a high-risk signal (recent liquidation = risky)
  • liquidation_count_sum_eth suggested that borrowers with higher liquidation counts in ETH were riskier (more liquidations = riskier behaviour)

Insight: Looking at Pearson correlation is a simple yet intuitive way to understand linear relationships between features and the target variable. It’s a great way to gain intuition on which features should and should not be included in your final model.

Feature Selection: Less is More

Here’s something that always puzzles executives when I explain this to them:

More features doesn’t always mean better performance.

In fact, too many features usually mean worse performance and slower training, because extra features add noise. Every irrelevant feature makes your model a little bit worse at finding the real patterns.

So, feature selection is a crucial step that I never skip. I used recursive feature elimination to find the optimal number of features. Let me walk you through my exact process:
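
In outline, the process maps onto scikit-learn’s RFECV; the logistic-regression estimator and cross-validation settings below are illustrative stand-ins rather than my exact configuration:

```python
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X = df.drop(columns=["target"])  # "target" is the assumed label column
y = df["target"]

# Recursively drop the weakest features, scoring each subset by cross-validated AUC.
selector = RFECV(
    estimator=LogisticRegression(max_iter=1000),
    step=1,
    cv=5,
    scoring="roc_auc",
    n_jobs=-1,
)
selector.fit(StandardScaler().fit_transform(X), y)

selected_features = X.columns[selector.support_]
print(f"Optimal number of features: {selector.n_features_}")
```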

The sweet spot was 34 features. Beyond this point, model performance as measured by the AUC score didn’t improve with additional features. So, I ended up using fewer than half of the given features to train my model, going from 77 down to 34.

Insight: This reduction in features eliminated noise while preserving signal from the important features, leading to a model that was both faster to train and more predictive.

Building the Neural Network: Simple Yet Powerful Architecture

Before defining the model architecture, I had to prepare the dataset properly:

  1. Split into training and validation sets (for verifying results after model training)
  2. Scale features because neural networks are very sensitive to outliers
  3. Convert datasets to PyTorch tensors for efficient computation

Here’s my exact data preprocessing pipeline:
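
Condensed into a sketch (continuing from the variables in the earlier snippets; the split ratio and scaler choice are illustrative):

```python
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 1. Train/validation split, stratified so both sets keep the ~62/38 class mix.
X_train, X_val, y_train, y_val = train_test_split(
    X[selected_features], y, test_size=0.2, stratify=y, random_state=42
)

# 2. Scale features, fitting the scaler on the training set only to avoid leakage.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)

# 3. Convert everything to PyTorch tensors for training.
X_train_t = torch.tensor(X_train_scaled, dtype=torch.float32)
y_train_t = torch.tensor(y_train.values, dtype=torch.float32).unsqueeze(1)
X_val_t = torch.tensor(X_val_scaled, dtype=torch.float32)
y_val_t = torch.tensor(y_val.values, dtype=torch.float32).unsqueeze(1)
```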

Now comes the fun part: building the actual neural network model.

Important context: Spectral Finance (the competition organizer) limited model deployments to only neural networks and logistic regression because of their zero-knowledge proof system.

ZK proofs require mathematical circuits that can cryptographically verify computations without revealing underlying data, and neural networks and logistic regression can be efficiently converted into ZK circuits.

Since it was my first time building a neural network for credit scoring, I wanted to keep things simple but effective. Here’s my model architecture:
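
Sketched out in PyTorch, a module matching the description below looks roughly like this (the class name CreditScoringNet is just a label I’m using for illustration):

```python
import torch.nn as nn

class CreditScoringNet(nn.Module):
    """Feed-forward net: 5 hidden layers of 64 units, ReLU, dropout 0.2, sigmoid output."""

    def __init__(self, n_features: int, hidden_size: int = 64, dropout: float = 0.2):
        super().__init__()
        layers = []
        in_size = n_features
        for _ in range(5):  # 5 hidden layers
            layers += [nn.Linear(in_size, hidden_size), nn.ReLU(), nn.Dropout(dropout)]
            in_size = hidden_size
        layers += [nn.Linear(hidden_size, 1), nn.Sigmoid()]  # output = probability of default
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

model = CreditScoringNet(n_features=34)  # the 34 selected features
```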

Let’s walk through my architecture choice in detail:

  • 5 hidden layers: Deep enough to capture complex patterns, shallow enough to avoid overfitting
  • 64 neurons per layer: Good balance between capacity and computational efficiency
  • ReLU activation: Standard choice for hidden layers, prevents vanishing gradients
  • Dropout (0.2): Prevents overfitting by randomly zeroing 20% of neurons during training
  • Sigmoid output: Ideal for binary classification, outputs probabilities between 0 and 1

Training the Model: Where the Magic Happens

Now for the training loop that kicks off the model learning process:
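
A simplified version of that loop is sketched below: binary cross-entropy pairs naturally with the sigmoid output, the learning rate, momentum, and patience values are illustrative, and I’m using full-batch updates for brevity:

```python
import copy
import torch

criterion = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

best_val_loss = float("inf")
best_state = copy.deepcopy(model.state_dict())
patience, epochs_without_improvement = 10, 0

for epoch in range(200):
    # Training step on the training tensors.
    model.train()
    optimizer.zero_grad()
    train_loss = criterion(model(X_train_t), y_train_t)
    train_loss.backward()
    optimizer.step()

    # Validation tracking: measure loss on held-out data every epoch.
    model.eval()
    with torch.no_grad():
        val_loss = criterion(model(X_val_t), y_val_t).item()

    # Early stopping: keep the best weights and quit once validation stops improving.
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        best_state = copy.deepcopy(model.state_dict())
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            print(f"Early stopping at epoch {epoch}")
            break

model.load_state_dict(best_state)
```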

Here are some details on the model training process:

  • Early stopping: Prevents overfitting by stopping when validation performance stops improving
  • SGD with momentum: Simple but effective optimizer choice
  • Validation tracking: Essential for monitoring real performance, not just training loss

The training curves showed steady improvement without overfitting. This is exactly what I wanted to see.

Model training loss curves (image by author)

The Secret Weapon: Threshold Optimization

Here’s where I probably outperformed others with more complicated models in the competition: I bet most people submitted predictions with the default 0.5 threshold.

But due to the class imbalance (~38% of loans defaulted), I knew that the default threshold would be suboptimal. So, I used precision-recall analysis to pick a better cutoff.
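
A sketch of that analysis with scikit-learn’s precision_recall_curve, continuing from the validation tensors above:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Predicted default probabilities on the validation set.
model.eval()
with torch.no_grad():
    val_probs = model(X_val_t).squeeze().numpy()

precision, recall, thresholds = precision_recall_curve(
    y_val_t.squeeze().numpy(), val_probs
)

# F1 at each candidate threshold (precision/recall have one extra trailing element).
f1_scores = 2 * precision[:-1] * recall[:-1] / (precision[:-1] + recall[:-1] + 1e-12)
best_threshold = thresholds[np.argmax(f1_scores)]
print(f"F1-optimal threshold: {best_threshold:.2f}")
```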

I ended up maximizing the F1 score, which is the harmonic mean between precision and recall. The optimal threshold based on the highest F1 score was 0.35 instead of 0.5. This single change improved my competition score by several percentage points, likely the difference between placing and winning.

Insight: In the real world, different types of errors have different costs. Missing a default loses you money, while rejecting a good customer just loses you potential profit. The threshold should reflect this reality and shouldn’t be set arbitrarily at 0.5.

Conclusion

This competition reinforced something I’ve known for a while:

Success in machine learning isn’t about having the fanciest tools or the most complex algorithms.

It’s about understanding your problem, applying solid fundamentals, and focusing on what actually moves the needle.

You don’t need a PhD to be a data scientist or win an ML competition.

You don’t need to implement the latest research papers.

You also don’t need expensive cloud resources.

What you do need is domain knowledge, solid fundamentals, and attention to the details that others might overlook (like threshold optimization).


Want to build your AI skills?

👉🏻 I run the AI Weekender, which features fun weekend AI projects and quick, practical tips to help you build with AI.
