
Be likeable or get fired!

posted Apr 19, 2018, 3:32 PM by Renato Athaydes   [ updated Apr 20, 2018, 1:26 AM ]
I have just been let go by my employer after only 5 months on the job. As usual in my country, the contract I signed included a probation period of 6 months within which any party (employee or employer) could terminate the contract, basically without advance notice.

I was shocked... not because I thought everything was perfect... I had been quite unhappy myself for some time, and I knew some of my co-workers also didn't like me too much... the shock was because just a few weeks earlier, I had had a "performance review" in which half a dozen of my peers mostly congratulated me for the great job I was doing. Just one guy was a bit disgruntled with me: after giving me quite nice compliments, as all the others had done, he went on a lengthy rampage listing items which he believed I needed to improve on.

Before I get to that list, let me tell the story from the beginning.

I was hired with pretty good salary and benefits (no great title though - I was simply Backend Developer - which I asked to be changed to Software Developer to reflect how I see myself, not tied to only one part of the stack) after impressing the CTO with my take-home assignment and an incredible feat of luck: just the week he interviewed me, knowing I had Kotlin experience, he asked my opinion on Kotlin and as we discussed that (my opinion is that Kotlin is an excellent language, by the way), he mentioned he had just read an article showing how Kotlin performance was really competitive... and as it turned out, I was the author of the article!

That seemed like a great start, so I decided to leave my comfortable job at Curity (the up-and-coming OAuth/Identity server), and join a payments company.

My intention was to get exposed to new technologies used in high-traffic, scalable systems... things like Cassandra, Amazon ECS and Kubernetes. I had been working in product companies for several years, which on one hand is great as you get to develop a great understanding of the code-base and, in the case of Curity, lots of greenfield development as it was a relatively new product. But on the other hand, you're stuck with one technology stack, in my case, JVM stuff: Java, the language, Kotlin, Groovy, Jetty and so on.

In my first week, I was assigned to a temporary team which had been assembled to fix a security issue with a high level of urgency. Right from the first contact with my new colleagues, I noticed that there was something strange: even though the fix we had to make was extremely simple compared to the kind of thing I had been doing in my previous jobs, the other developers seemed quite stressed about the situation and declared that we would need months to sort things out. In my head, the fix was a matter of days!!

Being new in the company, chances were I was just ignorant of some details that they were not making very clear to me, but in any case, the matter was urgent and management requested that we try to fix it within 2 weeks (not a hard deadline, as I understood it).

However, once we started the work, it was clear there were no complications I had failed to see: it was pretty trivial, actually, and within one week we were mostly done - even after spending lots of time discussing possible alternative solutions. Having expertise in the area, I took the lead in implementing the solution and wrote most of the code (unfortunately, the project is not open-source so I can't verify it, but I probably wrote more than 70% of the code). The second week was spent mostly testing everything and communicating our solution to the other teams involved.

There was some more concern about deploying the code to production... clearly, the guys didn't trust the testing systems to pick up all issues before they hit production. That was a second red flag for me - I've always worked extremely hard to establish a staging environment which reliably reproduces the production environment, so that if the released code passed the unit tests, system tests and some final manual tests in the staging environment, we could be confident that serious/obvious issues were very unlikely to show up only in production. And when they did, it was always a failure of the tests to cover some obscure edge case, not something much more basic like differences in setup between production and staging.

In any case, once we deployed the fix to production, it was a big triumph for us - especially for me, as this was only my first couple of weeks on the job and I had already been able to take a large role in fixing a serious issue.

But from this initial experience, I had learned that things were far from perfect and decided to start pointing out things we needed to improve.

I noticed that the tech-lead was quite upset every time I criticized something, as clearly he had been involved personally in most of the things I was criticizing: code of poor quality with no validation of inputs, lots of partially finished modifications (see the lava flow anti-pattern), lack of unit tests, obvious differences between production and staging environments, developers releasing from their own machines rather than the CI server... things that, in my opinion, would be irresponsible to ignore. The answer was always the same: they had had to compromise due to time constraints, or lack of resources... the familiar, valid (even if unsatisfactory), excuses.

But I don't like to criticize without showing a better way, so I decided to immediately start fixing the most glaring issues that I had observed during my initial contact with the micro-service I had been working on. That involved quite extensive changes and writing lots of unit tests. I broke up the work into separate pull requests and hoped for some help from the guys to make sure the assumptions the tests were making were valid.

That's when my relationship with the team started to degrade. My pull requests were left mostly unattended for several days. I kept nagging people to please review my changes, but apparently I had made too many changes, even considering I had broken them up into separate PRs, so no one wanted to review. This situation would remain for the whole time I was there and probably was the root cause of our problems.

I started noticing that it was not only me... having PRs left open for several days or even months seemed to be common: I found PRs in other projects that were almost celebrating their first anniversaries and still had no reviews.

When I have a lingering PR, what I normally do is fork a new branch from the PR branch, and continue work from there - until I am ready to submit another PR, which may result in a stack of PRs if the PRs are not reviewed fast enough. Notice that we're talking days here, not hours!

I brought that up with the team lead, who responded that they could not spend their whole time just reviewing my code! I felt like I had just been reprimanded for working too fast!

Another guy in the team started criticizing me for being messy with my PRs. Someone even complained when I offered to help another team perform some task which for me would be easy (and would definitely not affect our sprint negatively) - as if they had control over what I was allowed to do. So, everyone was clearly unhappy about me actually doing what I was paid to do. I actually heard from more than one person that I should slow down as there was no rush.

I tried to set an example by quickly reviewing PRs the other team members submitted, at which point someone complained I was reviewing his PR before it was ready for review (it had WIP in the title)... but I'd always thought that you just don't make a PR before you want someone to have a look at it!! Why would you do that?!

At this point, I think the atmosphere was already too heavy to recover from... everything I did was criticized by someone (except the code, which is what I actually wanted to get criticism on) so I started becoming more and more defensive, which further deteriorated the situation! I thought of just going back to my previous job (the guys made it clear I was welcome to go back) but I was afraid it would make my resume look like I am too much of a job-hopper. Also, it's in my veins to never give up when something is not going too well until I've put every effort into fixing it.

A month or two passed... and we were now working on a Cassandra database migration which required changes both in the code and in the data, including the schema itself. This kind of thing might be simple with SQL databases like Postgres, but with Cassandra, this is basically uncharted territory from what we could gather.

One person who was held as an expert in Cassandra had already started writing some code to start the migration... but apparently hit a wall when an error occurred every time he tried to run a migration with a larger volume of data. He was claiming the DB driver was "blowing up" because there was too much data in memory (or something along those lines)... looking into it, I immediately noticed the error had nothing to do with that, it was just a conflict in the Java classpath which caused a certain path in the code to throw a NoSuchMethodError. I tried to explain that to the guy, but he seemed lost at what I was saying - his knowledge of the Java runtime was clearly very limited. This lowered my confidence in his general expertise significantly. Later technical arguments with him were very difficult as I knew I couldn't trust his judgement (many other similarly bogus technical claims were made by the same person several times later) and when I started responding like "This is just not true", "I don't believe you" (which are the only responses I could muster when someone claimed, for example, that "you can't iterate over all rows in Cassandra"), others may have felt like I was the one being irrational.
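For readers who have not hit this kind of error: a NoSuchMethodError at runtime usually means that the class which got loaded comes from a different library version than the one the code was compiled against. A common first step is to check which jar a class was actually loaded from. The sketch below is only an illustration using the plain JDK, not the code from the project; the names `ClasspathCheck` and `locationOf` are made up for this example.

```java
import java.security.CodeSource;

class ClasspathCheck {
    // Returns the jar or directory a class was loaded from, or
    // "<bootstrap/JDK>" when no code source is available (e.g. core JDK classes).
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null || source.getLocation() == null
                ? "<bootstrap/JDK>"
                : source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Core JDK classes have no code source.
        System.out.println(locationOf(String.class));
        // For a library class, this prints the jar that "won" on the classpath.
        System.out.println(locationOf(ClasspathCheck.class));
    }
}
```

Calling `locationOf` on the class named in the stack trace shows whether an unexpected (older or newer) jar is shadowing the intended one.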

Anyway, after I sorted out the correct library versions we needed, the code obviously started working... but he started arguing that with Cassandra, you can't do things the way I was doing. He suggested that I should go read a book about Cassandra as I didn't know what I was doing - even as I made clear progress and got things working that he couldn't. I might not have known Cassandra well, but I had experience with CouchDB, MongoDB, MySQL, Postgres... and I could not imagine a database that does not provide such basic functionality as paginating over the whole data in a table!

Needless to say, it did work. Tests showed all data was being correctly iterated over as expected. But the damage was done: out of this situation, I became the ignorant guy who knows nothing about Cassandra, and he the guy who tried giving expert advice and was not being listened to (if he was such an expert, maybe he should have been writing the code - but he was too busy playing with Terraform to try to automatically deploy new database clusters, something we had clearly said earlier would not be necessary). Even more so when another guy in the team pointed out how stupid I was for inserting the data in the new tables via BATCH statements (that from someone who did not write a single line of code to help us in the effort)!

The thing is: the Cassandra book actually warns against using BATCH statements in "bulk loading" scenarios as that affects the stability of the nodes. So that other guy was right... but I was not using BATCH statements directly, but via the Java equivalent, whose documentation I had read, and which said nothing about this problem. In any case, now my credibility was lower than ever as I had made the huge mistake of using BATCH statements for batch processing :).

I didn't like that situation at all and decided to verify the claims with data... so I measured how much load the nodes experienced using BATCH vs individual statements. I just couldn't get my head around the fact that you're supposed to send millions of individual statements to be executed by the DB, rather than BATCHing them together like any civilized database client would. All measurements I did, which were quite comprehensive, showed that the load both within the JVM and on the OS metrics was significantly lower using BATCH, contradicting the advice in the book. As it turns out, BATCH probably becomes too expensive only when you have a large number of nodes... but we had just a couple! So this problem didn't really affect us. I made a slide presentation to the whole team showing my findings.

Did anyone recognize my efforts and agree to use BATCH?? Nope! Instead, I was accused of being stubborn and never changing my mind.

Well, I AM actually stubborn!! But if the metrics had shown that BATCH was worse, I would absolutely have been the first to change my mind. This accusation led to another confrontation where I became aggressive (the guy attacked me personally), which probably was the last straw for them.

We ended up not using BATCH. Instead, they asked me to throttle the migration, which to me was pretty absurd and showed a lack of technical knowledge: I had already applied back-pressure to avoid flooding the database with INSERTs... the "producer" only sent a group of requests (no longer a batch) then waited for all of them to be completed before sending more. Throttling via sleeps between "batches" in this scenario seems rather unnecessary - especially when the "consumer" was Cassandra, a database specifically written to handle high throughput without just dying - and the metrics I had collected showed 0 impact on end users (which I verified by running realistic load tests on the server as the job was running). But anyway, arguing my case by now was completely hopeless.
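The back-pressure scheme described above can be sketched roughly as follows. This is an illustration using only the JDK, not the actual migration code: `sendInGroups`, `groupSize` and the `sender` function (which stands in for an asynchronous database INSERT) are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class BackPressureSketch {
    // Sends items in groups of groupSize. The next group is only sent once
    // every request of the current group has completed.
    // Returns the number of groups that were sent.
    static <T> int sendInGroups(List<T> items, int groupSize,
                                Function<T, CompletableFuture<Void>> sender) {
        if (groupSize <= 0) {
            throw new IllegalArgumentException("groupSize must be positive");
        }
        int groups = 0;
        for (int start = 0; start < items.size(); start += groupSize) {
            int end = Math.min(start + groupSize, items.size());
            List<CompletableFuture<Void>> pending = new ArrayList<>();
            for (T item : items.subList(start, end)) {
                pending.add(sender.apply(item)); // fire the async request
            }
            // Block until the whole group is done: this wait is the back-pressure.
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
            groups++;
        }
        return groups;
    }

    public static void main(String[] args) {
        AtomicInteger written = new AtomicInteger();
        List<Integer> rows = List.of(1, 2, 3, 4, 5, 6, 7);
        // The "sender" here just counts rows; a real one would issue an async INSERT.
        int groups = sendInGroups(rows, 3,
                row -> CompletableFuture.runAsync(written::incrementAndGet));
        System.out.println(groups + " groups, " + written.get() + " rows written");
    }
}
```

With this shape, the producer can never have more than `groupSize` requests in flight, so its pace is set by how fast the database answers rather than by a fixed sleep.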

In the end, the migration was a success. It took much longer than it should have, but it did finish with only minor, fixable hiccups that did not affect production in any way.

The CTO suggested we celebrate with cake, which was common in such circumstances. But we got no cake. After a few days, I got escorted out of the building.

I was successful in every task during my 5 months at this company, but I was not successful in what is, in many ways, the most important: likeability.

In my previous jobs, I am pretty certain I'd been liked by most people (I got asked to stay by several people), and even in this job, I had some great relationships. But small clashes due to unclear process and technical arguments, along with differences in expectations, led to a situation where I became disliked enough within my team for them to ask for my removal.

This has taught me a lot, and I hope that this post helps me and others understand how small things in our day-to-day can quickly escalate, causing unnecessary harm.

The list of things I was told I should improve on included listening more to other people and accepting their opinions, being less defensive, less aggressive... basically, be more agreeable.

While I know I made many mistakes, I believe I was not the only one who did... but I was the one on probation... so... so long and thanks for all the fish.

To the people involved: if you believe I have misrepresented anything you did or said, even though I took utmost care to only make accurate claims, feel free to contact me and I will revise the post to reflect both sides of the story.