Sometimes Words Mean Something: August 2024

Friday, August 30, 2024

Is Doing the Right Thing - Right?

For most of this year (the last nine months), I have been pondering whether doing the right thing is right. Let me clarify. For me, doing the right thing in the context of honesty or integrity is ALWAYS the right thing. The question is whether doing what I feel is the right thing to do is actually the right thing. I have been noodling on specific circumstances that have me questioning whether I did the right thing based on how they turned out.

There were two circumstances where I put others' needs ahead of my own, and it didn't work out. It is not nearly as noble as when Spock sacrificed himself in Wrath of Khan, but it is my little drama. The binary part of me looks at these situations and sees the only alternative as never putting others first and looking out only for myself. Even writing that down makes me grimace, as it goes against my core beliefs. So, the answer must be in the vast gray area between what I did and doing nothing. In the classic answer for all questions, it depends.

Regarding the two circumstances I have been pondering, the others were not in their integrity, making that the easier thing to focus on. I put others (people, projects, etc.) first and myself second, and it didn't work out as I had hoped. As I healed from that, I was able to begin considering where my learning was. Even if it's a one-off, I won't trust that person in the future.

I have been thinking about reciprocity in this context: doing something for others (entities or people) without expecting anything in return. Being a manager/leader is being in service. The thing I struggle with is whether that also requires selflessness. My recent experience is that selflessness in the workplace is a minefield. First, it's not very likely to happen - it's just the nature of companies, especially large ones. Second, I must acknowledge that I sometimes have expectations hiding behind my actions. They may be as simple as looking for a smile, but they are expectations nonetheless. That doesn't make my decision/activity wrong; it just exposes that I had some expectations. It's the missed expectations that led to my disappointment.

Ultimately, I feel good about what I did and my reasons for doing it. I would make the same choice again, looking out for any expectations I have. It also means that being selfless does not mean sacrificing myself. So, maintain my integrity, do what I believe is right, and watch out for personal risks.

Monday, August 19, 2024

Who I Work With is a Choice

Since leaving Amazon, I have been recharging and considering what I want to take on next and with whom. I followed a similar process in 2013 when I last looked for my next opportunity. Then, I gathered a list of the top 10 companies I respected and focused my search on those. I needed to find companies that aligned with my principles. This led me to also form a list of companies that didn't align with my principles and exclude them from my search. As a result, I passed up some well-paying opportunities.

For instance, Amazon was on the list in 2013 because of my experience with no-hassle returns and Amazon Prime. Secondarily, it was the technology, tools, and innovation.

A lot has changed since 2013, including my prioritization of principles. For instance, after the 2016 election, I am paying more attention to politics. It's not a party affiliation but rather a candidate and their policies. It is a contradiction to say you support me as a transgender person and then support policies that limit my freedoms. It's also about the larger company culture concerning diversity, equity, inclusion, social impact, and corporate responsibility.

From what I hear and see, the job market is in a cycle where it's challenging to land a job. I can feel the tension between landing a job and standing by principles. It's a choice.

Ultimately, it's not a choice for me. I have not been happy when I have worked with less principled teams. I am the type of person who wants to wake up in the morning and be glad to be working with the company/team. That does not mean it's easy or there are no challenges. But it does allow me to be myself, be comfortable asking for help, know that I will get help, and pay it forward by helping others.

Friday, August 16, 2024

Do or do not. There is no try.

I recently dreamed I was in a large audience where an unidentified leader asked for a volunteer to lead a big project. They looked at me, and I said something like, "I will try, but I can't do it alone." Someone else stood up and said they would take on the project, and I could feel the audience staring at me with disdain. So much so that it woke me up.

I am not the type of person who looks for a lot of meaning in my dreams, mostly because I don't remember them. But this one woke me up, and I spent the rest of the early morning lying in bed, considering why it bothered me.

This was similar to asking others how we get something done and getting an answer like, "We don't have the resources." That wasn't the question; the question was, what would it take? My dream was similar because the question was not, "Would you deliver this project by yourself?" It's not a question of ensuring a successful outcome; it's a question of pulling together enough for a reasonable discussion on feasibility.

The response I would guide my dream self to give is: Sure, I will lead that big project for you. Then, I would document what it's going to take to get it done and present that to the stakeholders.

This dream bothered me because I was settling for trying when I knew better.

Wednesday, August 14, 2024

My Experience of APIs

I watched Nate Jones's TikTok last week. He talked about how APIs are becoming the interface to data. As I noodled on his words, I agreed with him, and it got me thinking about my history with APIs - more generally than his point.

I have been developing or calling APIs for a long time, and this got me thinking: weren't the MS-DOS interrupts (related post) a form of API? We all know the right answer, so we will not say it out loud and move on.

But seriously. I remember creating an API for the DOS terminate-and-stay-resident (TSR) module I wrote as the proxy for the server-based DBMS we wrote. This API used INT2F as the interface to the TSR. We didn't expect CBASIC programmers to code to INT2F, so I had a callable C library that abstracted the internals. What we built was closer to something like ODBC: a generic API for retrieving data that still required you to write the query.
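To make the layering concrete, here is a minimal sketch in Python of a friendly library sitting on top of a single low-level entry point. It is only an illustration of the pattern, not the original code: the function codes, the class names, and the made-up rows are all hypothetical, the "query" is a simple name filter rather than real query text, and nothing here reflects the actual INT2F calling convention.

```python
# Hypothetical function codes understood by the resident module.
FN_OPEN = 0x01
FN_QUERY = 0x02
FN_CLOSE = 0x03


class ResidentModule:
    """Stand-in for the TSR: one entry point that dispatches on a function code."""

    def __init__(self):
        self._open = False
        self._rows = [("widget", 3), ("gadget", 7)]  # made-up data

    def entry_point(self, function_code, argument=""):
        if function_code == FN_OPEN:
            self._open = True
            return 0
        if function_code == FN_QUERY:
            if not self._open:
                return -1  # error: session not opened
            return [row for row in self._rows if argument in row[0]]
        if function_code == FN_CLOSE:
            self._open = False
            return 0
        return -1  # error: unknown function code


class DataLibrary:
    """Callable library that hides the function codes from application code."""

    def __init__(self, module):
        self._call = module.entry_point

    def query(self, name_contains):
        if self._call(FN_OPEN) != 0:
            raise RuntimeError("could not open a session")
        try:
            result = self._call(FN_QUERY, name_contains)
            if result == -1:
                raise RuntimeError("query failed")
            return result
        finally:
            self._call(FN_CLOSE)  # always release the session


library = DataLibrary(ResidentModule())
print(library.query("gad"))
```

The application programmer only ever sees `query`; the open/query/close sequence and the error codes stay inside the library.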

There was a period when our systems had a business layer and a data layer; the business layer represented business semantics, and the data layer represented the sproc/view/query/table structure. Weren't these all APIs? During the object-oriented period (which sounds like the Jurassic period), I didn't refer to them as APIs; they were methods on an object.

We went from RPC to SOAP to REST, each an evolution to build distributed systems. All these are forms of APIs as well. Why? APIs are about creating a contract that callers can depend on to create an outcome, not about the specific implementation technology. The "interesting" part of the API design process was how to design for reusability. And by interesting, I mean the part we spent a lot of time discussing. In the Enterprise IT space, we had enterprise architects who created taxonomies, libraries, and APIs that modeled the data and/or the business processes. Building models/services for the entire company didn't work, and a "single responsibility model" and "separation of concerns" were needed. Even now, we have engineers who have API design as a skill we depend on and use at some level of governance.
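The idea that an API is a contract rather than a technology can be sketched in a few lines of Python. This is just an illustration under my own assumptions: `OrderLookup`, `InMemoryOrders`, `CachedOrders`, and `describe` are hypothetical names, and the "orders" are made-up data. The caller depends only on the contract, so the implementation behind it can change without the caller noticing.

```python
from typing import Protocol


class OrderLookup(Protocol):
    """The contract: given an order id, return its status as a string."""

    def status(self, order_id: str) -> str: ...


class InMemoryOrders:
    """One implementation: a plain dictionary."""

    def __init__(self, orders: dict[str, str]) -> None:
        self._orders = orders

    def status(self, order_id: str) -> str:
        return self._orders.get(order_id, "unknown")


class CachedOrders:
    """Another implementation: wraps a backend and remembers its answers."""

    def __init__(self, backend: OrderLookup) -> None:
        self._backend = backend
        self._cache: dict[str, str] = {}

    def status(self, order_id: str) -> str:
        if order_id not in self._cache:
            self._cache[order_id] = self._backend.status(order_id)
        return self._cache[order_id]


def describe(lookup: OrderLookup, order_id: str) -> str:
    """A caller that knows only the contract, not the implementation."""
    return f"order {order_id}: {lookup.status(order_id)}"


direct = InMemoryOrders({"A1": "shipped"})
print(describe(direct, "A1"))
print(describe(CachedOrders(direct), "A1"))
```

Both calls produce the same result for the caller, which is the whole point: the contract is what is dependable, whether it is served by RPC, SOAP, REST, or a dictionary.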

From these evolutions (and more) of APIs, I have learned they are hard. Why? Because API design is hard. Why? Because reuse is hard. This is why when we talk about APIs today, we don't think of them as static. Instead, we build them to be extensible as the needs evolve. Even designing for evolution, I have had more than my fair share of painful API deprecations.
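As a small illustration of building for evolution, here is one common pattern sketched in Python: add new behavior behind an optional parameter whose default keeps old callers working, and keep the old entry point alive with a deprecation warning while callers migrate. The function names and the customer record are hypothetical; only the `warnings` module calls are standard library.

```python
import warnings


def get_customer(customer_id: str, *, include_orders: bool = False) -> dict:
    """Newer entry point. The keyword-only option was added later, with a
    default that preserves the original behavior."""
    customer = {"id": customer_id, "name": "Example Customer"}  # made-up record
    if include_orders:
        customer["orders"] = []
    return customer


def fetch_customer(customer_id: str) -> dict:
    """Older entry point, kept working while callers migrate."""
    warnings.warn(
        "fetch_customer is deprecated; use get_customer instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not this function
    )
    return get_customer(customer_id)
```

None of this makes a deprecation painless; it only gives callers a warning and a window before the old entry point goes away.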

I still love a good API design session.

Wednesday, August 7, 2024

Surprisingly good project estimation

At Amazon, we went through several cycles/iterations of planning. Regardless of how Agile an organization operated, we needed to map capacity to the roadmap to understand how much we could deliver in a given year. This does not mean we actually delivered exactly what we initially planned since requirements and priorities changed over the course of the year. But it did allow us to have a starting place that we used to have a shared vision across engineering and stakeholders.

For each iteration, I would strive for better accuracy by asking technical leaders to revise their estimates. Effort was the input we refined the most, asking teams to go from swags to more educated estimates. How much were they relying on existing services? How much had never been done before? I had a list of prompts to help them assess the complexity/unknowns and, therefore, the risk. I would take all that and put it into my model (i.e., Excel), which would auto-magically create a calendar plan.

Invariably, our leaders asked when we would be "done," typically in the context of something like Prime Day or re:Invent. To answer this question, I would use a scaled-down version of the organization planning spreadsheet. Typically, the project was underway, and varying levels of design were complete. Most of the time this meant that the effort estimates were better, and the number of resources available was known. For each estimate, I would ask managers to provide me with the factor they used to compensate for PTO, meetings, etc. - I called it non-keyboard time.

Anecdotally, the static model usually yielded a surprisingly accurate date. I would use it to drive discussions around resource and/or feature tradeoffs depending on which knob we were using to tune the outputs (typically the launch date).
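For readers who want to see the shape of such a model, here is a minimal sketch in Python. It is my reconstruction under stated assumptions, not the actual spreadsheet: I assume effort is in engineer-days, that the factor is expressed as the fraction of each day spent at the keyboard (`keyboard_factor`), and that only weekends are skipped (no holidays). All function names are hypothetical.

```python
import math
from datetime import date, timedelta


def working_days_needed(effort_days: float, engineers: int, keyboard_factor: float) -> int:
    """Working days to burn down the effort with the given team.

    keyboard_factor is the assumed fraction of each day spent on project
    work after PTO, meetings, and other non-keyboard time.
    """
    if engineers <= 0 or not 0 < keyboard_factor <= 1:
        raise ValueError("need at least one engineer and a factor in (0, 1]")
    return math.ceil(effort_days / (engineers * keyboard_factor))


def add_working_days(start: date, days: int) -> date:
    """Step forward from start, counting Monday-Friday only (no holidays)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            remaining -= 1
    return current


def projected_done(start: date, effort_days: float, engineers: int, keyboard_factor: float) -> date:
    """The 'when will we be done' knob: fix the team, solve for the date."""
    return add_working_days(start, working_days_needed(effort_days, engineers, keyboard_factor))


def engineers_needed(effort_days: float, working_days: int, keyboard_factor: float) -> int:
    """The resource knob: fix the number of working days, solve for team size."""
    if working_days <= 0 or not 0 < keyboard_factor <= 1:
        raise ValueError("need at least one working day and a factor in (0, 1]")
    return math.ceil(effort_days / (working_days * keyboard_factor))


print(projected_done(date(2024, 1, 1), effort_days=120, engineers=4, keyboard_factor=0.5))
print(engineers_needed(effort_days=120, working_days=40, keyboard_factor=0.5))
```

With 120 engineer-days of effort, four engineers, and half of each day at the keyboard, the sketch needs 60 working days; pulling the date in to 40 working days means six engineers, or less scope. That is the kind of tradeoff discussion the model is meant to drive.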

Such a simple model gave many people heartburn—"How could something so simple model something so complicated?" I tried "enhancing" the model to account for other factors, for instance, parallelization. However, it was too complex to model and required too much from engineering managers or technical leads. Parallelization is needed when building a plan but doesn't work when doing estimation—it simply didn't improve the model.

The problem was that many leaders would simply not believe the model. I found this very frustrating since they would usually choose a somewhat arbitrary date as to when they wanted to be done and tell me to use that. I would challenge that approach and eventually agree since it was obvious that they were going to get their way (an anti-pattern of Amazon's Have Backbone; Disagree and Commit). Side note: one time I escalated a case of this, and while the VP I escalated to was appreciative, the Director threw a tantrum in the meeting. So much so that the VP later contacted me to apologize for the Director's behavior.

In project retrospectives, we discussed how we course-corrected throughout the project. It turns out that the early model outputs were right, scarily so.

I have been around long enough to know that when bringing in the date, leaders are often pressuring the teams to keep their foot on the gas. Something to watch out for is that this practice can lead to the outcomes I describe in my "Secure Apps Take a Village" post. Some of this behavior comes down to a couple of challenges leaders (including me) have to grapple with - trust and giving up control. I need to trust (and verify) what my teams are telling me, and accept that I don't have the same level of control I used to. Both are probably topics in and of themselves.

Things that I ran into...

Inflated efforts. Managers and leads pad their effort estimates to ensure they meet commitments. First, how do you know if it's inflated? Who better to know the complexity of a system than the people closest to it? If managers are really doing this, then do they understand what is being asked of them? How will it be used? If managers are padding, they are responding to a fear typically rooted in the organization's culture. This is something to go deeper into to figure out how to make it work for you.
Low efficacy factors. Given that keyboard time is an input into the model, this will directly impact the date/resource calculation. Is a low number accounting for a less experienced team? Or is the manager using the same number every time? Why? To pad dates? I used a generously low number, which was painful to write down but proved to be more accurate. This was especially true for teams that had high operational overhead.
What is DONE? Of course, done is something that not everyone defines the same way. As the owner of a launch, done to me meant the system was launched and open to customers. Invariably, some folks would define done as some level of completeness for their code. Ensure that everyone is aligned on what done means. For estimation, I would typically use code complete, including all automated tests, with the binaries in a shared test environment. I used this as done since the cost of integration testing and other launch activities was typically shared across teams, and I could estimate those differently.

Monday, August 5, 2024

Secure Apps Take a Village

I reflect on recent experiences and the tension between rolling out features and ensuring the product is secure. Features are flashy. They make good keynotes and/or demos. Features get customers excited and make money.

Security is about the backend. It's not flashy. It's hard to demo security; you probably wouldn't want to even if you could. But when something goes sideways, a security issue will lose you customers. It's like giving your competition bullets—"Hey, did you hear about that security breach at Acme?"

Security is getting more complex and more challenging to do well. Sure, we are smarter and better understand vulnerabilities. However, applications are also becoming more complicated (e.g., log4j and versioning). Complexity makes security harder—it just does. The bad actors are as determined as ever, and their methods continue to evolve.

I attribute much of this tension to the organization's culture. A typical culture pattern is when leadership compresses engineering estimates, and as time runs out, security verification takes place. This leads to pressure to complete verification and remedy errors without impacting the date. Regardless of the number of times that leaders tell people how important security is - actions speak louder. We must ship at this point in the project, leaving uncomfortable conversations and hard choices.

The tension between shipping and being secure starts early, when most of the time is spent discussing/planning customer-facing features. Sure, security comes up, but in my experience, it doesn't get the same in-depth consideration. If we are lucky, the challenging security conversations/decisions happen before we launch - and not after. Talking about security feels more like a bad thing when it's this late. Further contributing to this tension is when leaders push to make a date, a security issue is identified, and in response, they make statements akin to - "Security is always a priority, so when I push on the date, my expectation is that the system is still secure."

Imagine a product where this culture goes on for a long time; the product is very successful and lucky that lurking security issues remain private. With some certainty, this culture will catch up with you, and a critical mass of security issues will arise, requiring engineering resources. Sometimes, the number and/or severity of the problems is so great that feature delivery is paused, and most of the team is focused on remediation. I bet that for every case like this in the press, many more are not public.

Too often, I have seen teams rely on the hero model: people who, early on, take the risk of spending time on security—which, in my opinion, is under-acknowledged/appreciated. Or the heroes who scramble to fix issues after they are found—typically in a severe time crunch. The hero model is not sustainable and is unreliable for ensuring secure systems. Further, the hero model is often a symptom of something "wrong."

So, let's stop relying on heroes, best intentions, hollow statements, and punishing the bearers of bad news. In a fiercely competitive world, features are prioritized, and security often gets less attention, sometimes even becoming an afterthought. This is a culture war that needs to be waged.

We must start by considering security not just as a necessity but as a feature as important as any other. We need to talk about it early. We need to do security reviews early and often. For instance, start a threat model on the first day, keep it up to date, and assess design decisions in the context of how they would change the threat model. Activities like this involve staffing security-oriented teams and embedding them into the project teams. Then, empower them to ensure positive security-oriented outcomes. A culture of security does not depend on volunteers who take on additional responsibility to ensure a secure system. We need mechanisms to help people do the "right" thing. Then, acknowledge (early) security wins, such as actions to avoid an event.

A culture of security is hard to change, and as leaders, we need to be honest with ourselves about it — are we backing up our words in ways that will lead to the outcome we want? Look in the mirror. Yes, the words mean something, but it takes more than words.

PS: If you're unfamiliar with the African phrase "it takes a village" - see here.
