Good afternoon Señor Horace Greeley. Many people have asked, so I’d like to recount how and why I independently wrote and published the book entitled Trustworthy Machine Learning, which is available for free in html and pdf formats at http://www.trustworthymachinelearning.com and as an at-cost paperback at various Amazon marketplaces around the world (USA, Canada, UK, Germany, Netherlands, Japan, …).
Why I Wrote A Book
Writing a book is a big effort and a big commitment, so why do it? Just like you shouldn’t do a startup company just to be able to say you did a startup, it can’t be just because you want to have written a book. It has to be because you have something unique to say that the world needs to hear, and it is just bursting out of you.
I’d had a vague desire to write a book for a long time. But three years ago, I felt that there was something I needed to say: the approach and worldview for doing data science and machine learning that I had honed over a decade in an environment that few others had experienced. It also felt like the deep learning revolution was missing some important things. I was ready to speak.
How It Started

In May 2019, I flew to Madrid to represent Darío at Fundación Innovación Bankinter’s Future Trends Forum. That trip was the only time in my life I’ve sat in business class and it was fortuitous because it happened mere days after I had a painful back spasm. After the meeting concluded, I had a few hours to kill before proceeding onwards to Geneva for the AI for Good Global Summit. Instead of risking my back with any tourism, I sat in a park (the thin green area on the map) and wrote down an entire outline for the book I was imagining. That outline ended up being close to that of the eventual finished product. Look below for exactly what I typed into the notes app of my phone that afternoon.

Introduction
Age of Artificial intelligence
General purpose technology
Trustworthiness
Overview and Limitations
Overview
Limitations of book
Biases of author
Diverse voices
Preliminaries
Uncertainty
Aleatoric
Epistemic
Detection theory
Confusion matrix
Costs
Bayesian detection
ROC
Calibration
Robust (minimax) detection
Neyman-Pearson detection
Chernoff-Stein, mutual information theory, kl divergence
Causality
Directed graphical models
Data
Finite samples
Modalities
Sources
Administrative data
Crowdsourcing
Biases
Temporal biases
Cognitive biases/prejudice (quantization)
Quantization only by words so don't have to introduce quantization and clustering
Sampling biases
Poisoning
Privacy
Causal basis included
Machine learning
Risk minimization
Decision stumps
Trees, Forests
Perceptron
Margin-based methods
Neural networks
Adversarial methods
Data augmentation
Causal inference
Causal discovery
Safety
Epistemic uncertainty in machine learning
Distribution shift
Fairness
Adversarial robustness
(Causal foundations included in each pillar)
Testing
Communication
Explainability and interpretability
Direct global
Distillation / simple models
Post hoc local
Value alignment
Unified theory
Preference elicitation
Specification gaming
Factsheets
Blockchain
Purpose
Professional codes
Lived experience
Social good
Types of problems with examples
Open platforms
Summer and Fall of 2019
Once I was back from Europe, the summer was upon us and that meant having our social good student fellows with us and their projects in full steam. That, along with my other work, also meant days full of meetings: a manager’s schedule rather than a maker’s schedule, so I didn’t do anything further on the book all summer. Here is my calendar on one of those summer days (and this wasn’t atypical).

In the fall of 2019, I had the honor of spending three months at IBM Research – Africa, in Nairobi, Kenya. Because of the time difference, I made myself available for meetings only from 8 am to 11 am Eastern, which often meant entire mornings (East Africa Time) with no meetings (except for the nice conversations with the Africa lab researchers). Even though I thought I could use that time to start writing the book, I didn’t. Instead, the sabbatical turned out to be a great time to recover and recharge (while also doing some stuff on maternal, newborn and child health). Recovery is underappreciated.

Starting to Write
Back home, and with my calendar still mostly bare, I blocked off 90 minutes for writing every day starting on January 2, 2020. I started getting into a flow and put some words and equations down on paper (really this Overleaf). I made good progress on an introduction chapter and a detection theory chapter.

Then in mid-February, Bob Sutor stopped by my office and said that an acquisitions editor for Packt, the publisher he worked with on Dancing with Qubits, was looking to publish a book on responsible and ethical AI, and connected me with Tushar. Coincidentally, the same week, an acquisitions editor for Manning Publications emailed me cold about my possible interest in writing a book. I had good conversations with both editors and I was naïvely happy at the perfect confluence of events.
I filled out book proposals for both companies. Here is the one I did for Packt:
and here is the one I did for Manning:
I was completely honest in explaining what I wanted to do (mix of math and narrative), who it was for, and so on. I even sent over the couple of chapters I had already written. Both publishers were happy and accepted my proposal. Both made very similar offers in the contractual terms, which wasn’t particularly important for me because I wasn’t doing this for the money. Manning had an early access program through which readers could access chapters as they were being written (which is what I wanted and also why I had made the Overleaf open when I was writing the first two chapters), so I decided to go with them. I signed on the dotted line on March 17, 2020.
Turbulence
Things did not go as I thought they might. Everything had shut down a week earlier because of the Covid-19 pandemic, and the shutdown did not abate in any way. I was sitting on a dilapidated sofa in my basement trying to complete other work, taking the kids outside to kick a soccer ball around once in a while, and plotting out how to get scarce groceries, which was not exactly conducive to writing. There were certainly no more 90-minute blocks of time daily.
More turbulent than that, however, was the publisher trying to shoehorn me into what they wanted. My proposal was very clear that the book would have a decent amount of math and no software code examples, would be a tour of different topics, and would be centered on concepts. But that didn’t seem to matter once things were underway. As I soon learned, Manning religiously follows Bloom’s taxonomy, and understanding concepts is very low on the totem pole. As instructed, I doggedly kept trying to push my text higher in the taxonomy, but it was mostly a farce to me, where I would just use the word “sketch” or “appraise” while still saying what I was going to say. I was also ruthlessly trying to reduce the math at their insistence. For example, the chapter on uncertainty as a concept morphed into evaluating safety.

There was a lot of back and forth, and a lot of frustration. Eventually, on February 16, 2021, the book was available for sale in the $40-$60 range through the early access program with the first four chapters available. We celebrated. I got a lot of positive feedback from people I know.

But the turbulence didn’t calm down. More Bloom, less math, and less of myself. I am not someone who uses the word “grok”. I didn’t want this to be a prescriptive recipe book because I don’t believe that that is what trustworthy machine learning is all about.
The book reached 320 sales by the time the first 12 chapters had been posted, which in my opinion is pretty darn good for something that is not even complete and with an underwhelming marketing effort.
Then came an ending and a rebirth. On September 10, 2021, the acquisitions editor reached out and said that the publisher would be ending the contract and the rights to the content would revert to me. I guess the sales weren’t what they needed and the content continued to be mismatched with the desires of their typical buyers. This turn of events ended up being more of an emotional relief than anything else.
Did the book improve because of all that back and forth? On balance, I’d say yes. So no hard feelings.
Finishing
I am not one to leave things unfinished, and I wasn’t going to let the ending of the contract hold me back from finishing the manuscript that I had toiled on for so very long at that point. I vowed to complete the whole thing by the end of the calendar year. In less than 4 months, I wrote the remaining 6 chapters: an unbridled pace much faster than what I had been doing before.
In September and October, I didn’t think much about what the route to get it out would be. Tushar reached out and offered to bring it to market through Packt, but I just wanted to focus on finishing it. And I did, on December 30!
By that time, I had made up my mind to post it online with a Creative Commons license to begin with. I created the website http://www.trustworthymachinelearning.com and posted a pdf of version 0.9. I quietly spread the word and kept getting a lot of positive response from acquaintances.
Independently Published
While a diverse panel I had assembled was giving version 0.9 a look over and providing feedback, I did a bunch of soul-searching on what this book was for and why I was doing it. I also pored over what people had written about self-publishing in today’s age. I clearly wasn’t in it for the money — I was more than happy for anyone in the world to learn from it without paying. In fact, empowering people, no matter their station in life, is one of the messages of the book. I wanted its message to ring far and wide.
While everyone has a little vanity in them, like I said at the beginning of this post, I hadn’t written the book just to have written a book. This was also not a book aiming for some kind of book award. I wasn’t going to be using it for an academic tenure or promotion case, or any other stamp of approval. I didn’t want IBM to be involved in any explicit way (Manning had actually sought that out through a sponsorship deal). I enjoy doing a little formatting and aesthetic stuff here and there, and copy-editing. The previous experience hadn’t shown me that a publisher would necessarily do the right kind of marketing. Kindle Direct Publishing is really easy, doesn’t require any capital investment, and has very wide reach.

Putting all of that thinking together, despite not having heard of others in my orbit doing it before, I decided to independently publish the book. It has been up on Amazon since February 16, 2022 at the lowest possible price that Amazon allows for covering their costs. I’ve been very happy with my decision. It suits me and my worldview.
Afterwards
That very day, February 16, I made a social media push about the book, and that very night, I received this very kind email from Michael Hassan Tarawalie:
Dear sir,
It is an honor to come in contact with you, sir. Am a student at the electrical and electronic department, faculty of engineering Fourah Bay College, University of Sierra Leone.
Sir your book has helped me.
One of the very first citations to the book was in the influential report by NIST entitled “Towards a Standard for Identifying and Managing Bias in Artificial Intelligence”.
There have been several great reviews of the book on Amazon from people I don’t know. It has become almost a cottage industry for people to hold up their copy of the paperback in large meetings I attend on Zoom and for others to post photos holding their copy on social media.
As of today, 481 copies of the book have been printed and shipped across the world in less than 3 months. Even though I’m not tracking it, I’m sure lots of people have accessed the free pdf and used it to uplift themselves.
This is what I wished for.
It always seems impossible until it’s done.
Nelson Mandela
