
Lessons Learned After 6.5 Years Of Machine Learning

I started learning machine learning more than six years ago, when the field was in the midst of really gaining traction. In 2018-ish, when I took my first university courses on classic machine learning, key methods were already being developed behind the scenes that would lead to AI's boom in the early 2020s. The first GPT models were being published, and other companies followed suit, pushing the limits of both performance and parameter counts with their own models. For me, it was a great time to start learning machine learning, because the field was moving so fast that there was always something new.

From time to time, usually every 6 to 12 months, I look back on those years, mentally fast-forwarding from university lectures to doing commercial AI research. In these reviews, I often find new principles that have accompanied me while learning ML. This time, I found that working deeply on one narrow topic has been a key principle behind my progress over the last years. Beyond deep work, I've identified three other principles. They are not necessarily technical insights, but rather patterns of mindset and method.

The Importance of Deep Work

Winston Churchill is famous not only for his oratory but also for his incredible quickness of mind. There’s a popular story about a verbal dispute between him and Lady Astor, the first woman in British Parliament. Trying to end an argument with him, she quipped:

If I were your wife, I’d put poison in your tea.

Churchill, with his trademark sharpness, replied:

And if I were your husband, I’d drink it.

Witty repartee like that is admired because it's a rare skill; not everyone is born with such reflexive brilliance. Luckily, in our domain of ML research and engineering, quick wit is not the superpower that gets you far. The ability to focus deeply is.

Machine learning work, especially the research side, is not fast-paced in the traditional sense. It requires long stretches of uninterrupted, intense thought. Coding ML algorithms, debugging obscure data issues, crafting a hypothesis — it all demands deep work.

By “deep work,” I mean both:

  • The skill to concentrate deeply for extended periods
  • The environment that allows and encourages such focus

Over the past two to three years, I've come to see deep work as essential to making meaningful progress. The hours I've spent in focused immersion, several times a week, have been far more productive than much longer stretches of fragmented, distracted work could ever be. And, thankfully, working deeply can be learned, and your environment can be set up to support it.

For me, the most fulfilling periods are always those leading up to paper submission deadlines. These are times when you can laser-focus: the world narrows down to your project, and you're in flow. Richard Feynman said it well:

To do real good physics, you need absolute solid lengths of time… It needs a lot of concentration.

Replace “physics” with “machine learning,” and the point still holds.

You Should (Mostly) Ignore Trends

Have you heard of large language models? Of course you have: names like LLaMA, Gemini, Claude, or Bard fill the tech news cycle. They're the cool kids of generative AI, or "GenAI," as it's now stylishly called.

But here’s the catch: when you’re just starting out, chasing trends can make gaining momentum hard.

I once worked with a researcher when we were both just starting out in ML; let's call my former colleague John. For his research, he dove head-first into the then-hot new field of retrieval-augmented generation (RAG), hoping to improve language model outputs by integrating external document search. He also wanted to analyze emergent capabilities of LLMs (things these models can do even though they weren't explicitly trained for them) and distill those into smaller models.

The problem for John? The models he based his work on evolved too fast. Just getting a new state-of-the-art model running took weeks. By the time he did, a newer, better model had already been published. That pace of change, combined with unclear evaluation criteria for his niche, made it nearly impossible for him to keep his research going, especially as someone still new to research, as John and I both were back then.

This isn’t a criticism of John (I likely would have failed too). Instead, I am telling this story to make you consider: does your progress rely on continually surfing the foremost wave of the latest trend?

Doing Boring Data Analysis (Over and Over)

Every time I get to train a model, I mentally breathe a sigh of relief.

Why? Because it means I’m done with the hidden hard part: data analysis.

Here’s the usual sequence:

  1. You have a project.
  2. You acquire some (real-world) dataset.
  3. You want to train ML models.
  4. But first…you need to prepare the data.

A lot can go wrong in that last step.

Let me illustrate this with a mistake I made while working with ERA5 weather data — a massive, gridded dataset from the European Centre for Medium-Range Weather Forecasts. I wanted to predict NDVI (Normalized Difference Vegetation Index), which indicates vegetation density, using historical weather patterns from the ERA5 data.
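For reference, NDVI is computed from the near-infrared and red surface reflectances as (NIR − Red) / (NIR + Red), so values close to 1 indicate dense, healthy vegetation.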

For my project, I had to merge the ERA5 weather data with NDVI satellite data I got from NOAA, the US weather agency. I translated the NDVI data to ERA5's resolution, added it as another layer, and, seeing no shape mismatch, happily proceeded to train a Vision Transformer.

A few days later, I visualized the model predictions and… surprise! The model thought Earth was upside down. Literally: my input data showed a normally oriented world, but my vegetation data was mirrored across the Equator.

What went wrong? I had overlooked that the resolution translation flipped the orientation of the NDVI data.
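A simple orientation check would have caught this before training. Here is a minimal sketch, assuming the data lives in xarray DataArrays with a latitude coordinate; the names and the exact workflow are illustrative, not my original pipeline:

```python
import numpy as np
import xarray as xr

def align_latitudes(da: xr.DataArray, reference: xr.DataArray) -> xr.DataArray:
    """Flip `da` along latitude if it runs opposite to `reference`.

    ERA5 conventionally stores latitude descending (90 .. -90), while many
    satellite products store it ascending; after regridding, trust the
    coordinates, not the array shapes.
    """
    da_descending = float(da.latitude[0]) > float(da.latitude[-1])
    ref_descending = float(reference.latitude[0]) > float(reference.latitude[-1])
    if da_descending != ref_descending:
        # Reverse the latitude axis so both grids run in the same direction.
        da = da.isel(latitude=slice(None, None, -1))
    # Make the agreement explicit instead of relying on matching shapes.
    np.testing.assert_allclose(da.latitude, reference.latitude)
    return da
```

A shape match tells you nothing about orientation; asserting on the coordinate values themselves does.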

Why did I miss that? Simple: I did not want to do the data engineering; I wanted to skip straight ahead to the machine learning. But the reality is this: in real-world ML work, getting the data right is the work.

Yes, academic research often lets you work with curated datasets like ImageNet, CIFAR, or SQuAD. But for real projects? You'll need to:

  1. Clean, align, normalize, and validate
  2. Debug weird edge cases
  3. Visually inspect intermediate data (see the sketch below)

And then repeat all of this until the data is truly ready.
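Step 3 is the one I skipped. Below is a minimal sketch of the habit I now follow, assuming the intermediate layers are plain 2-D arrays; the function and names are my own illustration, not a library API:

```python
import matplotlib.pyplot as plt

def inspect_layers(layers: dict) -> None:
    """Plot each intermediate 2-D grid side by side before training.

    A ten-second look catches orientation flips, missing regions, and
    broken value ranges long before a model ever sees the data.
    """
    fig, axes = plt.subplots(
        1, len(layers), figsize=(5 * len(layers), 4), squeeze=False
    )
    for ax, (name, grid) in zip(axes[0], layers.items()):
        image = ax.imshow(grid)
        ax.set_title(name)
        fig.colorbar(image, ax=ax, shrink=0.8)
    plt.tight_layout()
    plt.show()

# Hypothetical usage: inspect_layers({"ERA5 t2m": temperature, "NDVI": ndvi})
```

Had I looked at the merged layers like this even once, the mirrored Equator would have jumped out immediately.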

I learned this the hard way by skipping steps I thought were not necessary for my data. Don’t do the same.

(Machine Learning) Research Is a Specific Kind of Trial and Error

From the outside, scientific progress always seems to be elegantly smooth:

Problem → Hypothesis → Experiment → Solution

But in practice, it's much messier. You'll make mistakes, some small, some facepalm-worthy (my upside-down Earth, for example). That's okay. What matters is how you treat those mistakes.

Careless mistakes just happen; insightful mistakes teach you something.

To help myself learn faster from these perceived failures, I now maintain a simple lab notebook. Before running an experiment, I write down:

  1. My hypothesis
  2. What I expect to happen
  3. Why I expect it

Then, when the experimental results come back (often as a “nope, did not work”), I can reflect on why it might have failed and what that says about my assumptions.
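In code, an entry can be as simple as the following sketch; the fields and the file name are just my own convention, not a standard:

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime

@dataclass
class NotebookEntry:
    """One lab-notebook entry, written down before the run starts."""
    hypothesis: str    # what I believe is true
    expectation: str   # what the results should show if I'm right
    reasoning: str     # why I expect that
    started: str = field(default_factory=lambda: datetime.now().isoformat())
    outcome: str = ""  # filled in once the results are back

entry = NotebookEntry(
    hypothesis="Adding soil-moisture channels improves NDVI prediction",
    expectation="Validation error drops versus the weather-only baseline",
    reasoning="Vegetation growth depends strongly on available soil water",
)

# One JSON line per experiment; reread the entry when the results arrive.
with open("lab_notebook.jsonl", "a") as f:
    f.write(json.dumps(asdict(entry)) + "\n")
```

The exact format matters far less than writing the expectation down before you see the results.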

This transforms errors into feedback, and feedback into learning. As the saying, often attributed to Niels Bohr, goes:

An expert is someone who has made all the mistakes that can be made in a very narrow field.

That’s research.

Final Thoughts

After 6.5 years, I’ve come to realize that doing machine learning well has little to do with flashy trends or just tuning (large language) models. In hindsight, I think it’s more about:

  • Creating time and space for deep work
  • Choosing depth over hype
  • Taking data analysis seriously
  • Embracing the messiness of trial and error

If you’re just starting out — or even are a few years in — these lessons are worth internalizing. They won’t show up in conference keynotes, but they’ll show up through your actual progress.


Notes

  • The Feynman quote is from the book Deep Work by Cal Newport
  • Several variations of the Churchill exchange exist; in some versions it is coffee rather than tea that is poisoned

