Is Pedigree and Old Boy network really relevant in the 21st Century?

Not sure how to categorise this post; it’s an amalgamation of a critique of the social stratification that we witness today, with a fair amount of my own experiences interspersed, viewed through the lens of a book, Pedigree. I have always held that pedigree and education from elite institutions are a bit overrated and are not necessarily an indication of capability or skill, but rather of discipline and/or perseverance.

Prior to my arrival in the United Kingdom eleven years ago, I had a very different picture of effort and success in the world. I assumed that professional success was not just a possibility, but a certainty if you were skilled and worked hard, especially after reading things like the 10,000-hour rule (Outliers) or the 67 principles (The Success Principles). Nothing seemed able to stop the onslaught of hard-working, smart people’s success. My personal yardstick was, of course, me. After all, here I was in the heart of London, building the very first meta-search engine for betting odds (in the UK) and leading a team across 3 continents.

Imagine a tortoise roll, aka the flashback scenes from Masala movies:

I did not have the “Blue Blood” in me; I graduated with an Electronics degree from a Tier 2 University in South India. To make it abundantly clear, the first time I came to the state capital, Madras (now Chennai), was to enrol in the IEEE student chapter, and the second was to apply for a passport! So, you can guess my “exposure”; the fact that I even got to know that something called the IITs exists is a testament to the wonderful teachers I had during my school days. Anyway, from there, it took me 3 different jobs (one of which was for PayPal), my own entrepreneurship journey and a successful exit to land this stint with EasyOdds/MarCol group.

Going through this and seeing my peers on a similar trajectory, I could be forgiven for thinking this is how success looks: a slow march with perseverance, dedication and “years under the belt”. It was not just me; most people I knew thought that the recipe for success is the long game. This couldn’t be further from the truth. We had been oblivious to the fact that two classes exist in any workforce, the elites and the others! I started noticing that people with degrees from “fancy institutions” climb the ladder much faster, or even start from a higher step. Now, Rivera starts her book by saying,

“Most Americans believe that hard work- not blue blood- is the key to success.”

Pedigree: How Elite Students Get Elite Jobs.

Rivera is a Professor at Northwestern University’s Kellogg School of Management and has received her Ph.D. in sociology from Harvard University. She has spent around a year researching her book by working in the HR department of a major New York City consulting firm.

Rivera starts her book by discussing and pinpointing the macroeconomic environment that sets “elite” students apart from “other” students. She uses a lot of data analysis throughout her book (some of which went right over my head). The book is categorized as “vocational guidance” on Amazon! However, it doesn’t directly guide you to secure an “elite” job, so do not bother reading it for that. Instead, the book is a critique of the hiring process and reveals the actions of those responsible for hiring. The book’s thesis is “that the way in which elite employers define and evaluate “merit” in hiring strongly tilts the playing field for America’s highest paying jobs toward children from socioeconomically privileged backgrounds.”

However, in her last chapter, she makes sure to show that the hiring process isn’t completely rigged and that some candidates from less affluent backgrounds were able to break the code and get hired while other candidates with affluent backgrounds failed to get hired. But, as stated earlier this is rare and not the norm. The author does a really good job by chronologically taking the reader through the steps of the hiring process, “from the initial decision (of firms) of where to post job advertisements to the final step, when the hiring committee meets to make final offer and rejection decisions.”

Rivera does a good job of explaining earlier in the book how the reproduction of elites starts from a very young age during college and that “Today, the transition of economic privilege from one generation to the next tends to be indirect. It operates largely through the education system” (3). She follows up with the sociological research of Alexandra Radford, which shows that many top-achieving high school valedictorians “from lower-income families do not apply to prestigious, private, four year universities because the high price tags associated with these schools. Illustrating how money and cultural know-how work together, some who would have qualified for generous financial aid packages from these institutions did not apply because they were unaware of such opportunities. Other had difficulty obtaining the extensive documentation required for financial aid applications.” (5)

Since these students are unfortunately unable to apply to “elite” schools, they are also shut out of the “elite” EPS (Elite Professional Service) firms that only hire from a select number of Ivy League schools. As one of the attorneys said in the opening of the second chapter, “There are many smart people out there. We just refuse to look at them”. Why? Because these firms primarily hire from a select number of schools, which are accessible to a select few. The universities are therefore the “engines of inequality”, as she says in her book.

Rivera points out that there are two methods of allocating high-status career opportunities. One is the contest system, in which competition is open to all and success is based on competence. The other is the sponsored system, in which existing elites select the winners, either directly or through third parties. According to Rivera, the system in the U.S. is a combination of both models. However, which is better for a society’s institutions: a combination of both models, or strictly a contest system?

It seems quite clear that a contest system would drive the most deserving and competent students to the jobs that suit them. Unfortunately, that is not the case, as the attorney’s remark shows. Is this not ironic in an age where maximizing shareholder value seems to be the first commandment of firms? It certainly is. Where is the efficiency? It’s traded off for the sake of job “fit” and “polish”. According to Rivera’s study, more than half of the evaluators regarded fit “as the most important criterion at the job interview stage, rating it above analytical skills and polish.” But what is fit?

Rivera’s sample of evaluators defined and measured fit as similarity in “play styles- how applicants preferred to conduct themselves outside the office- rather than in their work styles or job skills. In particular, they looked for matches in leisure pursuits, backgrounds, and self-presentation styles between candidates and firm employees (including themselves)” (137). This definition of fit unfortunately tilts the hiring process in favor of the already dominant elite while rejecting the competent and hardworking yet “unfit” candidates. Furthermore, it results in a monoculture where homophily is the norm. Yes, fit might generate stronger cohesion among employees, but competent employees from diverse backgrounds can certainly be healthier for society, motivating employees to work harder while increasing efficiency.

Another metric that Rivera mentions is polish. Now, what is polish? According to Rivera, “interviewers in my study initially had difficulty explaining to me how they recognized and assessed polish during job interviews.” (171) One banker went so far as to liken polish to pornography, laughingly saying, “You kind of know it when you see it”. The general idea of polish is that firms want to recruit employees who can maintain a reputable, luxurious and elite picture of the firm they represent. One consultant, Natalie, said, “In an ideal world, you have people who are folks that you want to throw in front of a client, that you feel are professional and mature. People that you know can walk into a room full of people who are twice their age and be able to command it with self-confidence, but not too much self-confidence.” (170) Although this might be a good thing for a hiring firm, it also biases hiring toward candidates whose families include executives and who grew up being taught how to deal with executives and clients.

Therefore, besides being very arbitrary and leading to a monoculture, polish can certainly lead to inequality as well.

In conclusion, I found this book to be great at illustrating the shortcomings of the interviewing process at Elite Professional Service firms, which unfortunately lead to more inequality in society. Furthermore, as Rivera suggests, firms need to widen their interviewing scope, not just for the sake of candidates but also for the sake of hiring smarter and more competent students. They must give grades more weight than institutional prestige and cultural extracurriculars. In addition, they should hand the interviewing process to more professional interviewers who can structure the interviews and detach themselves from the arbitrary metrics currently used.

References, Further Reading:

Old Boys Network

https://en.wikipedia.org/wiki/Old_boy_network

Pedigree in Tech

https://news.ycombinator.com/item?id=25486065

Insight: In the Silicon Valley start-up world, pedigree counts – https://www.reuters.com/article/us-usa-startup-connections-insight-idUSBRE98B15U20130912

Does the startup world have a Pedigree problem – https://qz.com/work/1695042/does-the-startup-world-have-a-pedigree-problem

The 10000 hours rule

https://www.theguardian.com/science/2019/aug/21/practice-does-not-always-make-perfect-violinists-10000-hour-rule

https://www.vox.com/science-and-health/2019/8/23/20828597/the-10000-hour-rule-debunked

How to measure Engineering Productivity?

The fact that you clicked on this article tells me that you are leading/heading a team, a group or an entire Engineering function, most likely at a fast-paced startup. Assume the following:

It was a regular weekday, and your CEO/CTO asked the most intriguing question.

Do we measure Engineering Productivity? How do we fare? What can we do to improve it?

Well, if your boss’s name is not Elon Musk or if you do not work for Twitter, you can still be saved. Go on and read through. I know it is a long read.

What is Engineering Productivity?

As with anything you’re trying to improve, it starts with measuring the right data, so that you can track the right metrics. This data will form the basis of your analysis and your baseline. I strongly recommend you don’t change anything about your current engineering process until you have collected six weeks’ worth of data about it. If you start changing processes before then, you could end up with Survivorship Bias.

You should have sufficient historical data to make comparisons. On top of that, most teams work in sprints of two weeks, so six weeks of data covers at least three different sprints. This gives you allowance for spikes and evens out any unusual stress or slack in execution.

Next, you should make gradual changes to the engineering process to see what improves or impedes value delivery. Ideally, implement only one change at a time, so you can see the effect of each change with all other things being equal. (They never are :D)

For example, if your engineering squads suffer from significant technical debt, you may want to add an extra step to feature completion. Every time an engineer completes a new feature, they must document it. This could mean describing the feature, how it is built, what the outcomes are, how it interacts with other functions and the reasoning behind the design decisions.

By continuously measuring engineering productivity metrics, you can determine if this change has positively impacted the developers’ productivity.
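As a rough illustration of comparing the baseline against the period after a change, here is a minimal sketch. It uses cycle time in hours (one of the metrics described later in this post) as the example; the function name and the sample numbers are made up for illustration, not real data.

```python
from statistics import median


def median_change_percent(baseline, after):
    """Percentage change of the median from baseline to after.

    A negative value means the metric went down after the change.
    """
    base = median(baseline)
    new = median(after)
    return (new - base) * 100 / base


# Hypothetical cycle times (hours) before and after one process change.
baseline_cycle_hours = [90, 100, 110]
after_change_cycle_hours = [70, 80, 95]

print(median_change_percent(baseline_cycle_hours, after_change_cycle_hours))
```

The median is used here rather than the mean so that a single unusually slow feature does not dominate the comparison; that is a design choice, not a rule.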

How Is Engineering Productivity Measured?

There are potentially hundreds of metrics you can measure for an Engineering Org. Here are four key metrics that will help you get started with measuring engineering productivity. I have consciously excluded Sprint Velocity.

4 Prime Directives of Engineering Metrics

1. The One Metric to rule them all – Cycle Time

Software development cycle time measures the amount of time from work started to work delivered. It is a metric “borrowed” from lean manufacturing, and it is one of the most important metrics for software development teams. In plain speak, cycle time measures the amount of time from the first commit to production release.
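A minimal sketch of how cycle time could be computed, assuming you can export the first-commit and production-release timestamps for each change. The record layout, field names and IDs below are hypothetical.

```python
from datetime import datetime
from statistics import median

# Hypothetical export: one record per shipped change.
changes = [
    {"id": "FEAT-1", "first_commit": "2023-01-02T09:00", "released": "2023-01-05T17:00"},
    {"id": "FEAT-2", "first_commit": "2023-01-03T10:00", "released": "2023-01-10T10:00"},
    {"id": "FIX-7", "first_commit": "2023-01-09T08:00", "released": "2023-01-09T20:00"},
]


def cycle_time_hours(change):
    """Hours between the first commit and the production release."""
    start = datetime.fromisoformat(change["first_commit"])
    end = datetime.fromisoformat(change["released"])
    return (end - start).total_seconds() / 3600


def median_cycle_time_hours(all_changes):
    """Median cycle time across all changes, in hours."""
    return median(cycle_time_hours(c) for c in all_changes)


print(median_cycle_time_hours(changes))
```

Tracking the median per sprint, rather than a single overall number, makes it easier to see a trend.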

2. The Oracle of an Engineering Leader – Release Frequency 

You should measure how often you deploy new changes to your customers (production). In addition, you can track deployments to various branches/instances, such as feature branches, hotfix branches, or QA branches. This data shows you how long it takes for a feature/fix to move through the different development stages. Release Frequency also reflects the throughput of your team. It’s a good stand-in for Agile Velocity, so you don’t spook your engineers and you are not flying blind either.
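One simple way to look at release frequency is to count production deployments per calendar week. The sketch below assumes you have a list of deployment dates; the dates are invented, and grouping by ISO week is just one reasonable choice.

```python
from collections import Counter
from datetime import date

# Hypothetical production deployment dates.
deploys = [
    date(2023, 1, 2), date(2023, 1, 4), date(2023, 1, 5),
    date(2023, 1, 11),
    date(2023, 1, 18), date(2023, 1, 19),
]


def releases_per_week(deploy_dates):
    """Count deployments per (ISO year, ISO week)."""
    counts = Counter()
    for d in deploy_dates:
        year, week, _weekday = d.isocalendar()
        counts[(year, week)] += 1
    return dict(counts)


print(releases_per_week(deploys))
```

The same function can be run separately on deployments to QA or hotfix environments if you track those too.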

3. The Guardrail – Number of Bugs

You should definitely track the number of bugs that your team has to resolve within 2 sprints of releasing a feature. This metric helps you to understand the quality of your code better. Higher-quality code should display fewer bugs after feature deployment.
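A minimal sketch of counting the bugs reported within two sprints of a release. It assumes two-week sprints and that each bug can be linked to the feature it came from; the dates and the function name are hypothetical.

```python
from datetime import date, timedelta

SPRINT_LENGTH = timedelta(days=14)  # assumption: two-week sprints
WINDOW = 2 * SPRINT_LENGTH          # "within 2 sprints" of the release


def bugs_within_window(release_date, bug_dates, window=WINDOW):
    """Count bugs reported from the release date up to release date + window."""
    cutoff = release_date + window
    return sum(1 for d in bug_dates if release_date <= d <= cutoff)


# Hypothetical feature released on 1 March, with bugs linked to it.
release = date(2023, 3, 1)
bug_reports = [
    date(2023, 2, 27),  # before the release, not counted
    date(2023, 3, 3),
    date(2023, 3, 20),
    date(2023, 3, 28),
    date(2023, 4, 15),  # outside the two-sprint window, not counted
]

print(bugs_within_window(release, bug_reports))
```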

While there are derivative and more evolved metrics like Defect Density, Mean Time to Detect (MTTD), Mean Time to Resolve (MTTR) and Code Coverage, those only make sense after you’ve taken stock of and addressed the prime metric, “No. of Bugs”.

If you want a more detailed list and methodology of QA metrics, refer to the links given below.

4. What is your “Blocker” – Review to Merge Time (RTMT)

This may look like a zoom-in on the “Cycle Time” metric we discussed earlier, but it is in fact very different. It is an interesting metric suggested by GitLab’s development handbook.

You should measure the time between asking for a pull request (PR) review and merging the PR. Ideally, you want to reduce the time a feature spends in the review state (or pending review state). A high RTMT prevents developers from progressing while they wait for feedback and encourages context-switching between different issues/features.
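A minimal sketch of computing Review to Merge Time from pull request data. The field names are assumptions about what your Git hosting tool can export, and this is not necessarily the exact formula GitLab uses; open PRs are simply skipped here.

```python
from datetime import datetime
from statistics import mean

# Hypothetical pull request export.
pull_requests = [
    {"id": 101, "review_requested": "2023-02-01T10:00", "merged": "2023-02-01T16:00"},
    {"id": 102, "review_requested": "2023-02-02T09:00", "merged": "2023-02-03T09:00"},
    {"id": 103, "review_requested": "2023-02-03T12:00", "merged": None},  # still open
]


def rtmt_hours(pr):
    """Hours between the review request and the merge, or None if not merged."""
    if pr["merged"] is None:
        return None
    requested = datetime.fromisoformat(pr["review_requested"])
    merged = datetime.fromisoformat(pr["merged"])
    return (merged - requested).total_seconds() / 3600


def mean_rtmt_hours(prs):
    """Average RTMT over merged PRs; None if nothing has been merged yet."""
    values = [h for h in map(rtmt_hours, prs) if h is not None]
    return mean(values) if values else None


print(mean_rtmt_hours(pull_requests))
```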

Arguably, context-switching is the biggest productivity killer and should be avoided as much as possible.

So, why would you measure all these engineering productivity metrics?

Why Is Measuring Engineering Productivity Important?

When you’re a “fast-growing startup”, it’s important to keep an eye on engineering productivity. These startups often favour growth through feature delivery at the cost of effectively scaling the engineering team and ensuring the team’s efficiency.

I hear your question.

But, why does my CEO/VP/MD not understand?

The answer is simple.

Assume you have to manage multiple VPs’ expectations and outcomes (Sales, Marketing, Support etc.), the company’s OKRs, and the investors or the board. Would you have much time left to dedicate to Engineering Productivity?

In these cases, technical debt can quickly grow, which will slowly kill your team’s productivity. Technical debt can have many negative consequences:

  • More bugs for your team to fix
  • Lower code quality—not only bugs but also worse code design
  • Harder to debug code
  • Scalability issues
  • A decline in overall happiness and job satisfaction

To avoid all of these scenarios, you should measure the engineering team’s efficiency and avoid technical debt buildup. Avoiding these problems before they occur is an excellent Occam’s razor.  But addressing them head-on will have a significant impact on your organisation, both materially and culturally. 

In addition to preventing your team’s productivity from going down, the engineering productivity approach allows you to experiment with various approaches to try and improve throughput & efficiency. 

So, the goal is to improve the engineering process itself. For example, introducing new tools or applying new techniques. Next, you can measure the impact of these changes on your team’s productivity.

In the next part, I will write about how measurement can improve engineering productivity. Stay tuned!

References:

  1. Survivorship Bias. 
    1. https://www.masterclass.com/articles/survivorship-bias
    2. https://en.wikipedia.org/wiki/Survivorship_bias 
  2. Cycle Time
    1. https://tulip.co/blog/cycle-vs-lead-vs-takt
  3. Release Frequency
    1. https://community.atlassian.com/t5/DevOps-articles/Why-should-we-start-measuring-the-Release-Frequency/ba-p/1786430 
  4. Detailed QA Metrics to ponder (in addition to No: of bugs)
    1. https://reqtest.com/agile-blog/agile-testing-metrics/ 
  5. Review to Merge Time
    1. https://about.gitlab.com/handbook/engineering/development/performance-indicators/#review-to-merge-time-rtmt 
  6. Context Switching 
    1. https://pacohq.com/blog/guide/the-high-price-of-context-switching-for-developers/ 
Do you really need a Product Manager for a successful Product?

This post is a summary of a series of “Mentoring” and “Advisory” calls I did with some early-stage startups over the past 6 months. Most of the time, one of the founders ideates and another builds or leads the build. But they want to go fast and think they need a Product Manager. Unfortunately, most of them don’t need one. If you are at a similar juncture, read on to find out more.

The title is a controversial question, I know! 

The State of Product Management:

Of late, Product Managers have had to wear too many hats, leaving the role vague and blurring the boundaries of their area of responsibility. This ultimately diminishes the value of the product manager’s core functions. Product Management is a strategic, cross-functional, front-line role that brings great value to the product and business.

But, it commonly gets abused by many fast-paced organisations expecting product managers to fill in the gaps in various disciplines. This may be process, pricing, unit-economics, partnerships, product-marketing to name a few. They can definitely do that due to their broad professional background.

Admittedly, product managers do have a broad background; otherwise they would have a hard time collaborating effectively with stakeholders, leading the product and making informed decisions. But this definitely should not end with product managers becoming the de facto “deciders” or “doers” of work originally intended for other roles in other functions.

How do you decide if you need a Product Manager or Not?

Like any problem, there are two approaches. If an intellectual debate is more to your taste, continue reading. If you prefer a rational “doer” approach, head straight down to the Rational Approach.

Intellectual Approach

Ask yourselves some questions:

If you are a founder or a  leader or a decision maker,  before hiring a Product Manager, question yourself as to your expectations from the product manager. 

Think hard on what you want them to do:

  1. What do you want your new product manager to change/fix in your organization? What is it that you are unable to do?
  2. Do you not already have the in-house expertise that would help you address the current issues?

If you are still unsure about whether or not you need a product manager “in the house”, 

I recommend that you go through this checklist and answer Yes/No to each of its questions:

  1. Do you have a vision for your product? Do you believe it is aligned with the market needs?
  2. Are you sure you are building the right product — the one that delivers value to your target audience?
  3. Do you have a direction for your product? A long-term and a short-term roadmap?
  4. Till now, have you been able to execute your roadmap without major distractions?
  5. Are you capable of maintaining the strategic focus across all levels of the organization?
  6. Do you know your competitors and what they bring to the game? Proposition, not features.
  7. Do you have an established feedback loop with your clients? (Not the feature request types)
  8. Do you mostly base your decisions on evidence/data?
  9. Do you find it easy to say “No” to various stakeholders from various functions while hearing their “suggestions” and “inputs”, and to explain to them why what they think is not the “most” right thing?

If you answered “No” to more than 4 questions, you probably need a Product Manager, no doubt about that.

But the reality is that hiring a highly capable Product Manager won’t magically change the DNA of your organisation. I have seen multiple orgs regress into a worse situation than before, because the person responsible delegated the product decisions to that Product Manager with a shiny belt, without enabling/empowering him/her.

The result 

Rational Approach

If you’re a CEO, founder, or senior leader considering hiring a PM, check this list and see if you need one. Let’s play a guess-and-eliminate game.

If you can see your organisation reflected in this article, don’t bother hiring a PM: save some money and hire a cheaper role. You would also spare a PM some misery.

Don’t bother hiring a PM…

If you have a fixed idea of what to build

You already know what you want to build, you just need somebody to build it. You’ve hired some engineers. You need somebody to gather the requirements from you and the team, and maybe manage the back-and-forth of different requirements from many stakeholders. This person then passes the requirements along to the engineers and makes sure they deliver on time.

You need a Project Manager, not a Product Manager.

If your Sales team or clients are dictating what to build

You have a handful of big clients and you’re ready to bend over backwards to deliver what they need, including building custom features. Your Sales team knows best what to build, surely, as they’re the ones talking to the customers all the time. Now it’s just a matter of writing the stories and prioritising them.

You need a Delivery Manager, not a Product Manager

If they won’t have access to your customers

You have some very-important-people as customers and their time is precious. You don’t want the new person you just hired to talk to them directly; maybe they will say something untoward?

I don’t know what you need, but you certainly don’t need a Product Manager. 

If you’re not ready to delegate authority

You know that product managers should be given a problem to solve, not a feature to build. Heck, you were probably a Product person yourself, who has now set up your own startup. You have the vision and the strategy and you know exactly how to get there…

What’s left for the Product Managers to do, then? Maybe hire an Engineering Manager or a Tech Lead?

If you see technology as a support function

An easy way to assess this: How much of your company budget is dedicated for product/technology/innovation? If you’re not willing to invest significant resources to staff the product/technology team properly, they’ll be left firefighting all year long. 

Don’t hire a Product Manager — yet. Assess how you see technology plays a role in your company’s vision. Set aside a proper budget, hire a strong CTO or CPO, and let them build their team. Only do that if you’re willing to listen to them though — or don’t bother doing it at all.

In sum, hire a Product Manager only if you believe you can delegate authority and can come to rational decisions based on data. If not, hire a Project Manager, Engineering Manager or any of the other roles.

Engineering Leadership in Start-Ups: Engineering Manager, Director, VP of Engineering.

This post is partly the result of my discussions with our People practice leader and talent acquisition executive. ITILITE is at a phase of growth where we are looking for more engineering & product management bandwidth, and I had to think hard to write the various job descriptions. So, I have tried to generalise them using my experiences from the last 2-3 stints. In case you’re interested in exploring an Engineering Management role with ITILITE, please get in touch with me or write to careers{at}itilite.com

Engineering Leadership

Apps are becoming increasingly omnipresent and, in most cases, there is a startup behind them. Since engineers can make up as much as 70% of a tech startup’s workforce, there is an increasing need for managers who look after those developers. As a result, the number of engineering managers has risen in recent years. Engineering managers are responsible for the delivery teams that develop these “Apps”. The following is a very generalised version of what you could do in these roles and a possible career progression.

Engineer to Tech Lead/Lead Developer

This is the first step in your journey from an Individual Contributor (IC) to a management role. It could be a mix of people management, delivery management, process management etc., depending on the context of your organisation. In most organisations, it is a “technical mentorship” role with some aspects of people management, quality and delivery ownership.

Most Tech Leads are natural technical leaders. They are great engineers in their own right, are well respected by the engineers around them, work reasonably well with the team, understand how the product/module is designed, built and shipped, have a decent sense for making the right kinds of product tradeoffs, and are willing to do just enough project management and people development to keep the team/project humming along.

In this role,

  • Most TLs would retain some independent deliverables in addition to anchoring and owning the deliveries of their team.
  • Most of the team still works on the same module/feature or sub-system
  • They do code & design reviews, suggest changes and have the final say for their modules.
  • Together with the Product Managers, they “own” the feature/module.

We at ITILITE call them Engineering Owners, much like Product Owners.

Tech Lead to Engineering Managers

The next step is the Engineering Manager. In this role, you will be “Managing” a collection of inter-related modules/projects, and the focus on timely delivery, people management and quality is higher than on technical design & architecture. But you are still very much an Engineer, and may occasionally be required to write quick hacks or frameworks for your developers to build atop.

The main difference is that you will be responsible for the delivery of multiple projects in a related area. You will be expected to optimise the resources (Devs, Testers, etc.) available to you to maximise the output of your group across multiple projects/modules.

In this role, you’d be

  • Actively engaging with the Product Management teams to define what needs to be built
  • Defining how you will measure the outcomes of what your team is building, and quantifying those outcomes with metrics
  • Ensuring quality, getting stakeholder alignment and signoffs
  • Macromanaging the overall deliverables of your group

The Pivot – Tech, Product, Solution Architect

The next step in your career gives you two options: one with people management and P&L accountability, and the other a purely technical role. If you’re planning for a pure-play technical role, some organisations have Staff Engineer, Principal Engineer etc. In essence, these are mostly a combination of Tech Lead + Architect type roles. Depending on your seniority/tenure and organisational context, you may report to an Engineering/Delivery Manager, a Director/VP or the CTO. In this role,

  • You will work closely with Engineering Managers, Quality Assurance leads/managers and Product Owners to design the system architecture, define the performance baselines
  • You will work with Tech Leads and Sr. Devs to drive performance, redundancy and scalability, among other things.
  • You will be called into discussions, and asked to decide, when the team can’t reach consensus on engineering choices

Engineering Manager to Director of Engineering

A Director of Engineering role is completely different. You now have multiple leads+managers, likely multiple projects within a general focus area of the organisation. This will mean there will be way more individual deliverables and project milestones than you can track in detail on a regular basis. Now you have to manage both people and projects “from the outside” rather than “from the inside”. You’ll likely start appreciating the metrics and dashboards, as they will help you in tracking those multiple projects and deadlines, schedules, overruns etc.

You have to make sure that your managers and leads are managing their resources appropriately and support them in their effort rather than managing individual contributors and projects directly.

Lots of great technical leaders have difficulty making this transition.

While being an engineering lead/manager is certainly managing, that type of managing, from “within the project”, is much easier than managing “from outside the project”, and as a director you almost always have to manage multiple people and projects “from the outside”.

Also, as a director, you will be responsible for a number of aspects of the culture, such as

  • What kind of people are you hiring, setting responsibilities and workload expectations,
  • What is the team(s) doing for fun, how do they interact with other functions
  • What kinds of performance are rewarded/encouraged vs punished/discouraged.

Now, moving to some serious responsibilities, you may be the first major line of responsibility for what to do when things do not work:

  • an employee not working out,
  • a project falling behind,
  • a project not meeting its objectives,
  • hiring not happening in time, etc…

While most of these things are the direct responsibility of the engineering manager, the engineering manager is usually not left to face these issues alone. They work on them with the director, and the director is expected to guide the process to the right decision/outcome.

I’ve seen people who were great technical leaders and good engineering managers who did not enjoy being a director at all (or weren’t as good at it), because it was a whole different type of managing, bordering on administration.

Director to Vice-President

The VP of Engineering is the executive responsible for all of engineering: Development, Quality, DevOps and, partly, Security and Product Management as well. While both the engineering manager and the director of engineering report to managers who have likely been engineering managers and directors themselves, the VP may work for a CEO (in an early-stage startup or a smaller company) who has never been a VP of Engineering.

A large company may have multiple levels of VPs, but in most cases, you work for someone who hasn’t been a VP of Engineering or doesn’t actually know how to do your job. This means there is simply no first-hand experience from your manager that you can rely on to solve your problem. The first time you step into that role and realize that, it’s a sobering thought. You’re pretty much on your own to figure things out. Not only are you completely responsible for everything that happens in the engineering organization, but when things aren’t going right, there’s pretty much no help from anywhere else. You and your team have to figure it out by yourselves. Many successful VPs eventually come to like this autonomy, but it can be a big adjustment when moving from director to VP.

At the director level, you can always go to your VP for help and consulting on difficult issues and they can and should help you a lot. At the VP level, you may consult with the executive team or the CEO on some big decisions, but you’re more likely talking to them about larger tradeoffs that affect other parts of the company, not how you solve issues within your team.

As a VP, you are primarily responsible for setting up processes and procedures for your organization to make it productive:

  • Team/Project tools such as bug system, project tracking, source code management, versioning, build system, etc.
  • Defining/improving processes to track, monitor and report on projects.
  • Defining processes to deal with projects that run into trouble.
  • Hiring: How do you hire? What kind of people do you hire? How do you maintain the quality of new hires?
  • Firing: When someone isn’t working out, how do you fix it: reassignment, training, performance plan, transfer, firing?
  • Training: How does your team get the training they might need, whether hard skills, soft skills or managerial?
  • Rewards: How do you reward your top individual contributors and your top managers?

You may be part of the Leadership “Council” or participate regularly in business discussions that may or may not concern your department directly. In a startup, you are often “the” technical representative on exec staff. You help craft the strategy of the business. You are relied upon for technical direction of the company (sometimes with the help of a CTO).

As a VP, you are expected to understand many important aspects of other departments: what is important to them and how your department serves, interacts with or depends upon them. Two classic examples might be,

  • Sales depending upon certain product features/capabilities being delivered in a given timeframe to be able to convert a prospect.
  • Customer success depending upon certain product fixes being delivered in a given timeframe.

As a VP, you will participate in the setting of these timeframes and balancing these against all the other things your department is being tasked to do.

As you can see, Engineering Management/Leadership is a very interesting career option. We have multiple openings across Product and Engineering functions at ITILITE. Please see if any of these roles interest you.

Building a Log-Management & Analytics Solution for Your StartUp


Background:

As described in an earlier post, I run Engineering at an early-stage #traveltech #startup called Itilite. One of my responsibilities is to architect, build and manage the cloud infrastructure for the company. Even though I have designed, built and maintained cloud infrastructure in my previous roles, this one was really challenging and interesting, due in part to the fact that the organisation is a high-growth #traveltech startup and hence,

  1. The architecture landscape is still evolving,
  2. Performance criteria for the previous month look like the minimum acceptable criteria the next
  3. The sheer volume of user-growth, growth of traffic-per-user
  4. Addition of partner inventories which increases the capacity by an order of magnitude

And several others. Somewhere down the line, after the infrastructure, code pipeline and CI are set up, you reach a point where managing (read: trigger intervention, analysis, storage, archival, retention) logs across several sets of infrastructure clusters like development/testing, staging and production becomes a bit overwhelming.

Enter Log Management & Analytics

I had worked my way up from a simple tail/multitail to a Graylog aggregation of 18 server logs, including app servers, database servers, API endpoints and everything in between. But, as my honoured former colleague Mr. Naveen Venkat (CPO of Zarget) used to say in my days with Zarget: there are no “Go-To” persons in a start-up. You “Go-Figure” yourself!

There is definitely no “One size fits all” solution and, especially in a start-up environment, you are always running behind Features, Timelines or Customers (scope, timeline, or cost in the conventional PMI model).

So, after some due research to account for the recent advances in Logstash and Beats, I narrowed the field down to the possible contenders that could power our little log management system. They are,

  1. ELK Stack  — Build it from scratch, but have flexibility.
  2. Graylog  — Out of the box functionality, but you may have to tune up individual components to suit your needs.
  3. Fluentd — Entirely new log-management paradigm, interesting and we explored it a bit.

(I did not consider anything exotic, or anything that would involve us paying more in future than what we pay in the first year. So, some great tools like Splunk, Nagios, LogPacker and LogRhythm were not considered.)

Evaluation Process:

I wrote an Ansible script to create a replica environment and pull in the necessary configurations, and used a previously written load-test job to simulate a typical work hour. This configuration was used for each of the frameworks/tools considered.

I started experimenting with Graylog, due to familiarity with the tool, and configured it in the way I felt most appropriate at that point in time.

Slight setback:

However, the collector I had used (Sidecar with Filebeat) had a major problem sending files over 255KB when the interval was less than 5 seconds. The packets destined for Elasticsearch never made it, and the pile-up caused a major issue for application stability.

One of the main use-cases for us is to ingest XML/JSON data from multiple sources. (We run a polynomial regression across multiple sources, and use the nth derivatives to do further business operations.)

When the daily logs you need to export run upwards of 5GB for an app (JSON logs), add multiple APIs and some micro-services application logs, web-server, load-balancers, CI (Jenkins), database-query-log, bin-log, redis and … yes, you get the point?

Upon further investigation, the sidecar collector was actually not the culprit. Our architecture had accounted for several things, but by design, we used to hit momentary peaks in CPU utilisation for the “Merges”. And all of these were “NICE” loads! (in our defence)

So, once the CPU hit the 100% mark, sidecar started behaving very differently. Ultimately, we fixed it with a patched version of sidecar and by actually shifting to NXLog.

Experimenting with ELK was a different beast in itself, as provisioning and configuring took a lot more time than I was comfortable with. So, I switched to the AWS “Packaged Service”. We deployed the ES domain in AWS, fired up a couple of Kibana and Logstash instances and connected them; after what appeared to be forever, it worked like a charm. I was able to get all the information required in Kibana. One downside is that you need to plan the Elasticsearch indices according to how your log sources will grow. For us, it was impractical.

Fluentd was an excellent platform for normalising your logs, but then it also depended on Kibana/ES for the ultimate analysis frontend.

So, finally we settled on good old Graylog.

Advantages of Graylog

 The tool perfectly fit into our workflow and evolving environment:

  1. Graylog is free and open-source software, so we won’t have to pay now or in future.
  2. Its trigger actions and notifications are a good complement to Graylog monitoring, just a bit deeper!
  3. With error stack traces received from Graylog, engineers understand the context of any issue in the source code. This saves time and effort in debugging/troubleshooting and bug fixing.
  4. The tool has a powerful search syntax, so it is easy to find exactly what you are looking for, even if you have terabytes of log data. The search queries could be saved. For really complex scenarios, you could write an ElasticSearch query and save it in the dashboard as a function.
  5. Graylog offers an archiving functionality, so everything older than 30 days could be stored on slow storage and re-imported into Graylog when such a need appears (for example, when the dev team need to investigate a certain event from the past).
  6. Java, Python & Ruby applications could be easily connected with Graylog as there is an out-of-box library for this.
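
To give a feel for how little it takes to get something talking to Graylog, here is a minimal sketch that builds a GELF 1.1 message and pushes it over UDP. It assumes a GELF UDP input is listening on the Graylog server on port 12201 (the usual default), that nc (netcat) is installed, and that the message text needs no JSON escaping. The function names and host names are made up for illustration.

```shell
#!/usr/bin/env bash
# gelf_payload SOURCE_HOST SHORT_MESSAGE LEVEL
# Prints a minimal GELF 1.1 JSON document (no escaping of special characters).
gelf_payload() {
  printf '{"version":"1.1","host":"%s","short_message":"%s","level":%d}\n' "$1" "$2" "$3"
}

# send_gelf GRAYLOG_HOST SOURCE_HOST MESSAGE [LEVEL] [PORT]
# Sends one uncompressed GELF datagram; level defaults to 6 (informational).
send_gelf() {
  local graylog="$1" source_host="$2" message="$3" level="${4:-6}" port="${5:-12201}"
  # Command substitution drops the trailing newline before the datagram is sent.
  printf '%s' "$(gelf_payload "$source_host" "$message" "$level")" | nc -u -w1 "$graylog" "$port"
}
```

Something like `send_gelf graylog.example.internal app01 "deployment finished"` would then show up as a message from source app01, provided the input exists and the port is reachable.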

#logmanagement #analytics #startup #hustle #opensource #graylog #elk

What is SA-Core-2018-002 and How Acquia Mitigated 500000 attacks on Drupal


Disclaimer: I have been working on WCMS and specifically with Acquia/Drupal for more than seven years. And in that period, I have developed a love/hate relationship with Drupal. Love for Drupal 6 and hate for 7. Or something like that. So my views may be slightly less than neutral.
 
On March 28th, the Drupal Security Team released a bug fix for a critical security vulnerability, named SA-CORE-2018-002. Over the following weeks, various exploits were identified as attackers attempted to compromise unpatched Drupal sites. Hackers continue to try to exploit this vulnerability, and Acquia’s own security team has observed more than 100,000 attacks a day.

Timeline of SA-CORE-2018-002

The remote code execution exploit, the so-called SA-CORE-2018-002, was a vulnerability that had been present in various layers of Drupal 7 and 8. Drupal, being Drupal, has some of the most efficient governance among open-source projects around. This I can say with confidence and pride, as I have had more than a few interactions with the community: reporting issues, committing documentation, joining feature roadmap discussions (agreed, some of them are heated!) and submitting patches/fixes. The Drupal community has very high standards, and even though your patch or fix has functionally addressed the underlying issue, it may be declined. That said, it’s also one of the most democratic community software projects you can get. Still, they insist on following the stringent and high community standards for modules and themes.
So, it is no surprise that Drupal today has one of the most responsible disclosure policies.
The Drupal community had previously notified all developers through official channels and asked them to prepare for a high-impact patch. Meanwhile, Acquia did the same for its SME and Enterprise clients. Those in the thick of it knew a bit early about the nature of the exploit and the mitigation strategy.
And in the community forums, there were detailed descriptions of how to plan this infrastructure patch-up: uptime, isolation post disclosure, patching, updating and redeployment.
Multiple methods to suit the multiple needs of different environments, architectures etc. had also begun to appear. It was one giant machinery, albeit a self-governing one. I have known large organisations to do hodge-podge patchwork to contain an underlying vulnerability, leaving a vendetta-driven ex-employee or a determined hacker to expose the inner workings of the exploit. This has resulted in many multi-million dollar losses. Only after the #Apache project had reached a state of maturity did these larger organisations learn the art of disclosure. But how many of them practise it is a big question.
Till 28th March 2018, there was no publicly known exploit for the RCE in Drupal 7/8.
This all changed after Check Point Research released a detailed step-by-step explanation of the security bug SA-CORE-2018-002 and how it can be exploited. Less than 6 hours after Check Point Research’s blog post, Vitalii Rudnykh, a Russian security researcher, shared a proof-of-concept exploit on GitHub.
The article by Check Point Research and Rudnykh’s proof-of-concept code spawned numerous exploits, written in different programming languages such as Ruby, Bash, Python and more. As a result, the number of attacks grew significantly.
The scale and the severity of this attack suggest that if you failed to upgrade your Drupal sites, or your site is not supported by Acquia Cloud or another trusted vendor that provides platform level fixes, the chances of your site being hacked are very high. If you haven’t upgraded your site yet and you are not on a protected platform then assume your site is compromised. Rebuild your host, reinstall Drupal from a backup taken before the vulnerability was announced and upgrade before putting the site back online.
Geographic distribution of SA-CORE Attack Vectors

Solution:

Upgrade to the most recent version of Drupal 7 or 8 core.

  • If you are running 7.x, upgrade to Drupal 7.58. (If you are unable to update immediately, you can attempt to apply this patch to fix the vulnerability until such time as you are able to completely update.)
  • If you are running 8.5.x, upgrade to Drupal 8.5.1. (If you are unable to update immediately, you can attempt to apply this patch to fix the vulnerability until such time as you are able to completely update.)

Drupal 8.3.x and 8.4.x are no longer supported and the community doesn’t normally provide security releases for unsupported minor releases. However, given the potential severity of this issue, the Drupal community chose to provide 8.3.x and 8.4.x releases that include the fix for sites which have not yet had a chance to update to 8.5.0.
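
If you look after more than a handful of sites, a quick check like the one below can help triage them. This is only a sketch: it encodes just the two minimum versions named above (7.58 and 8.5.1), reports everything else as "unknown", relies on GNU `sort -V` for the comparison, and the function names are my own. It says nothing about whether a site was already compromised before patching.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when version A >= version B (uses GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# drupal_patch_status VERSION: prints patched, vulnerable or unknown.
drupal_patch_status() {
  local min
  case "$1" in
    7.*)   min="7.58" ;;   # fixed release for the 7.x branch
    8.5.*) min="8.5.1" ;;  # fixed release for the 8.5.x branch
    *)     echo "unknown"; return 0 ;;
  esac
  if version_ge "$1" "$min"; then
    echo "patched"
  else
    echo "vulnerable"
  fi
}
```

You could feed it the version reported by your own tooling, for example `drupal_patch_status 7.57`, and treat anything other than "patched" as needing a closer look.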

DevOps Post Series : 2, How to install and configure SSL/TLS Certificate on AWS EC2


Assumption:

It is assumed that you have launched an EC2 instance with a valid key, configured the security groups, installed Apache/Nginx and deployed your app.

Background:

Now, it’s time to configure your TLS/SSL certificate. Why would you want to configure your own certificate, when you can get Amazon to issue a free TLS/SSL certificate? Well, there are more than a few use-cases that we have come across.

  1. First and foremost, AWS Certificate Manager certificates can be installed only on Elastic Load Balancers, Amazon CloudFront distributions, or APIs for Amazon API Gateway. (At the time of writing)
  2. You are building a staging/testing server, will test integrations on it and require SSL/TLS.
  3. You are just starting off, and have only one EC2 instance to start with. (You cannot install an AWS-provisioned certificate on an EC2 instance directly.)
  4. You are provisioning a new service, say for data exchange between your customers and their customers/vendors, which will be a very under-utilised service.
  5. You are planning an endpoint for SSO/OpenID etc. and prefer to keep this part logically separate from your app.abc.com or abc.com.

And at least a dozen other use-cases come to mind, but I am leaving them out for brevity.

Getting Started

Self-Signed Certificates:

Firstly, enable Apache on your EC2 instance and install/enable SSL.
(As usual, I’ll try to give the instruction for both RPM and DEB package based distributions)
[shell]sudo systemctl is-enabled httpd[/shell]
This should return “enabled”. If not, enable it by typing the following,
[shell]sudo systemctl start httpd && sudo systemctl enable httpd[/shell]
[shell]sudo yum update -y[/shell]
[shell]sudo yum install -y mod_ssl[/shell]
Follow the on-screen instructions; you will have answered some basic questions like domain name, country, email ID etc. If you accepted the default locations from the prompt, you will have generated 2 files in the following locations.
/etc/pki/tls/private/localhost.key – This is an auto-generated 2048-bit RSA private key for your Amazon EC2 host. You can also use this key to generate a certificate signing request (CSR) to submit to a certificate authority (CA).
/etc/pki/tls/certs/localhost.crt – This is a self-signed X.509 certificate for your server host. This certificate is useful only where you can control the “client” environment, like a testing or staging server.
Now, restart Apache:
[shell]sudo systemctl restart httpd[/shell]
And try https://your-aws.public.dns or https://[yourpublicip].
Since you’re accessing your site with a self-signed, untrusted host certificate, your browser may display a series of security warnings. But once you have added it to the exception list, you should be good to go. This would be the end of it if you’re only looking for a certificate to be used for staging/other controlled environments. If you want public-facing SSL, so your users/customers can log in and access this new service, you will need a CA-signed certificate.

CA-Signed Certificate

– Go to /etc/pki/tls/private/  and generate a new private key
[shell]sudo openssl genrsa -out virtualserver1.key 2048[/shell]
This generates an RSA key of the same type and size as the default key. You can also generate a 4096-bit key, or skip RSA altogether in favour of other algorithms (elliptic-curve keys, for example). But those are beyond the scope of this post.
[bash]sudo chown root:root virtualserver1.key
sudo chmod 600 virtualserver1.key
ls -al virtualserver1.key[/bash]
Now, you can use this key to generate a Certificate Signing Request
[shell]sudo openssl req -new -key virtualserver1.key -out csr.pem[/shell]
When you do this, OpenSSL will open a series of prompts for all sorts of data. The “Common Name” is the one field that is mandatory for you to get a certificate; all other data requested is optional. Once you’re done with that, you should have generated a csr.pem.
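
If you would rather script this step than answer prompts, the sketch below creates the CSR non-interactively and then reads it back to confirm what it contains. It assumes openssl is on the PATH and sets only the Common Name; the function names and the domain in the usage note are placeholders of mine, not anything your CA requires.

```shell
#!/usr/bin/env bash
# make_csr KEY_FILE CSR_FILE COMMON_NAME
# Creates a CSR without prompts; only the Common Name is filled in.
make_csr() {
  local key="$1" csr="$2" cn="$3"
  openssl req -new -key "$key" -out "$csr" -subj "/CN=${cn}"
}

# csr_subject CSR_FILE
# Checks the CSR's self-signature and prints its subject line.
csr_subject() {
  openssl req -in "$1" -noout -verify -subject 2>/dev/null
}
```

For example, `sudo bash -c 'source ./csr.sh; make_csr virtualserver1.key csr.pem www.example.com'` followed by `csr_subject csr.pem` lets you eyeball the Common Name before pasting the CSR anywhere (csr.sh being wherever you saved the functions).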
Submit the CSR to a CA. This usually consists of opening your CSR file in a text editor and copying the contents into a web form. At this time, you may be asked to supply one or more subject alternate names (SANs) to be placed on the certificate.
Once you’ve pasted the contents of the CSR (csr.pem, never the private .key file) into the form and submitted it to your CA, you will receive an email confirming the issue of the certificate.
Remove or rename the old self-signed host certificate localhost.crt from the /etc/pki/tls/certs directory and place the new CA-signed certificate there (along with any intermediate certificates). Then restart Apache and check your application over HTTPS; it should now show a “green padlock”, meaning the browser trusts the certificate.
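
A common stumbling block at this point is installing a certificate that does not belong to the private key on the server. A minimal sketch of a check, assuming openssl is installed: extract the public key from both files and compare them. The function name and the file names in the usage note are illustrative only.

```shell
#!/usr/bin/env bash
# cert_matches_key CERT_FILE KEY_FILE
# Succeeds only when the certificate's public key equals the one derived
# from the private key; fails if either file cannot be read.
cert_matches_key() {
  local cert="$1" key="$2" cert_pub key_pub
  cert_pub="$(openssl x509 -in "$cert" -noout -pubkey 2>/dev/null)" || return 1
  key_pub="$(openssl pkey -in "$key" -pubout 2>/dev/null)" || return 1
  [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}
```

Running `cert_matches_key /etc/pki/tls/certs/yourdomain.crt /etc/pki/tls/private/virtualserver1.key && echo "pair OK"` before restarting Apache can save a confusing round of startup errors.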
You can also run a security test on your SSL: just go to SSLLabs and start a test by giving your URL. After about 2-5 mins, you will receive a rating and details, something similar to the following image.

That’s It! You’re done.

DevOps Post Series : 1, How to install and configure LAMP on AWS EC2


In this #DevOps centric series of blog posts, I will write about some of the interesting yet common problems and their solutions, along with quick guides and how-tos. This is the result of setting up a new #Datacenter for the #Startup I am working for.
 
In this post, I will assume that you have already launched an EC2 instance type with the operating system of your choice. Generally, Amazon Linux (based on RedHat/CentOS) or Ubuntu is the preferred OS of choice. In case you prefer an exotic flavour of Linux, which does not support either the rpm/yum(RHEL/CentOS/Fedora/AMI) or apt (Debian/Ubuntu and derivatives)  this article may not be of much use to you.

  1. Connect to your instance – Use the private key you downloaded during the ec2 launch.
    1. If you’re in Linux or Mac – use the following by replacing it with your private key name and instance’s public dns –  ssh -i "loginserver."[email protected]
    2. If you’ve launched an Amazon Linux, use “ec2-user” instead of “root”
    3. If you’ve launched an Ubuntu Linux, use “ubuntu” instead of “root”
    4. Another important thing is to ensure that the private key has 0400 permissions and is “owned” by the user as whom you’ll run the ssh connection.
  2. Update your package manager
    1. Amazon Linux : sudo yum update
    2. Ubuntu Linux: sudo apt-get update
  3. Tools & Utils (Optional/Personal Preference) I normally prefer to have a couple of tools installed in the server for quick-hacks/edits, monitoring etc.
    1. Amazon Linux : sudo yum install -y mc nano tree multitail git lynx
    2. Ubuntu Linux: sudo apt-get install -y mc nano tree multitail git lynx
      1. For details on the above-mentioned tools, refer to the bottom of the article.
  4. LAMP Server
    1. Amazon Linux: sudo yum install -y httpd24 php70 mysql56-server php70-mysqlnd mysql56-client
    2. Ubuntu Linux: sudo apt-get install mysql-client-core-5.6 mysql-server-core-5.6 apache2 php libapache2-mod-php php-mcrypt php-mysql
      1. Your operating system will start to download and install the specified software, as for MySQL, you will be prompted for a root password. After installation, I strongly recommend you to run mysql_secure_installation and proceed with the onscreen instructions.
      2. Some of the critical things to do are remove the “test” db, remove access to "root"@"%", others are optional.
      3. The optional steps are,
        1. remove the anonymous user accounts.
        2. disable the remote root login.
        3. reload the privilege tables and save your changes.
  5. Configuration and other dependencies
    1. Amazon Linux :
      sudo yum install php70-mbstring.x86_64 php70-zip.x86_64 composer node -y
    2. Ubuntu: replace yum install with apt-get install (note that package names differ on Ubuntu, e.g. php-mbstring and php-zip)
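
Step 1 above asks for the private key to be mode 0400, and ssh will refuse a key that is readable by others. A small sketch of a helper that sets and confirms this; the function name is mine, and the key file name in the usage note is only a placeholder.

```shell
#!/usr/bin/env bash
# lock_down_key KEY_FILE
# Makes the key readable by its owner only (0400) and confirms the result.
lock_down_key() {
  local key="$1"
  [ -f "$key" ] || { echo "no such key file: $key" >&2; return 1; }
  chmod 400 "$key" || return 1
  # The first ten characters of `ls -l` are the file type and permission bits.
  [ "$(ls -l "$key" | cut -c1-10)" = "-r--------" ]
}
```

Run it as the user who will open the ssh session, e.g. `lock_down_key ~/keys/mykey.pem`, so that ownership and permissions line up.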

Finally, restart the services and off you go. You have successfully installed a LAMP server on EC2. Now, go to your browser and enter the public DNS of the EC2 instance and you should see the default Apache page. If you get either a timeout or a not-found error, you may have to configure the security group accordingly: you should “ALLOW” ports 80/443 (HTTP/HTTPS) in the security group.
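
To tell a security-group problem apart from an Apache problem, it helps to test the ports from your own machine. The sketch below uses bash’s built-in /dev/tcp redirection together with the coreutils `timeout` command, so it assumes bash and a Linux-like environment; the function name and the host in the usage note are placeholders.

```shell
#!/usr/bin/env bash
# port_open HOST PORT
# Succeeds when a TCP connection can be opened within 3 seconds.
port_open() {
  local host="$1" port="$2"
  # Host and port are passed as arguments ($0 and $1) to the inner shell.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

A loop such as `for p in 80 443; do port_open your-instance-public-dns "$p" && echo "$p open" || echo "$p closed or filtered"; done` gives a quick answer; a port that hangs until the timeout usually points at the security group rather than at Apache.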



How to Disable an Adblocker-blocker or Create an Anti-Adblock Killer!


History & Theory:

Digital Advertisement:

I get it. Ads are a necessary evil in the content delivery game. Hell, I have been on the engineering side of content delivery for 10 years myself. So, back in the days of the #dotcom #bubble, we endured banner ADs. When #BigBrother, oops #Google, came up, they swept the market clean with their (initially, at least) non-intrusive text ADs. And people even appreciated the contextual advertisement: just when you were searching for a suspension for your car, you would see 4 different ADs for OEM-grade replacement suspension, grease monkeys to install them and so on.
Fast forward 10 years and Google is the global powerhouse of advertisement. Google knows what your mother’s cousin once removed likes and runs ads tailored to it on no fewer than 50 websites run by Google and countless other affiliates. The convenience transformed itself into a mild hindrance and then a major nuisance in no time. At their core, Google, Microsoft and Yahoo ADs were all based on a relevance engine, i.e. based on the content currently served by the publisher (the website you’re visiting), they search their database for relevant ADs, and the one that matches the content and whose target profile matches yours (this is where privacy advocates go crazy) is the AD they serve. In its simplest form, the process looks something like the diagram below.

ADsense process diagram
Contextual Advertisement – Process Flow (ADSense/ADWords)

For the inquisitive lot, who want to know the technicality, it looks a lot more complex than this and it is presented below.
Tentative Process flow of How ADWords and ADSense content advertisement happens.

Enter ADBlockers:

And soon, people found a way to block the ADs. As seen above, all of these adverts are programmed to run using great stores of data in the backend. So, when a user visits a site, a lot happens in the backend and a script is used to fetch the resulting AD piece. Technically inclined people started writing custom scripts that would stop the script which renders the ADs. In no time, all the bells and whistles like #blacklist, #whitelist and #regular-expression support came in. Once modern browsers came with built-in support for content filtering, it was easy to supplement them with custom lists and scripts to block these ads. ADBlockers became available for every device, OS and browser, and public knowledge of them made their use explode around the 2013-2015 period (see graph below). So, all seems rosy from here.
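
To make the blacklist/whitelist/regular-expression idea concrete, here is a toy sketch of the decision a content filter makes for each requested URL. The two patterns are invented for illustration; real ad blockers ship large curated lists and their own rule syntax, not plain regular expressions like these.

```shell
#!/usr/bin/env bash
# Hypothetical filter lists written as extended regular expressions.
BLACKLIST='(^|[/.])ads\.example\.com/|/banner[0-9]*\.(js|gif)$'
WHITELIST='^https://news\.example\.org/'

# should_block URL
# Succeeds when the URL matches the blacklist and is not whitelisted.
should_block() {
  local url="$1"
  # The whitelist wins: a whitelisted URL is never blocked.
  if printf '%s\n' "$url" | grep -Eq "$WHITELIST"; then
    return 1
  fi
  printf '%s\n' "$url" | grep -Eq "$BLACKLIST"
}
```

With these lists, a request to ads.example.com is dropped, while the same banner file served from the whitelisted news.example.org goes through. That precedence is exactly what "whitelisting a site" means in practice.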

ADBlocker-Blocker

Publishers and their representative trade bodies, on the other hand, argue that Internet ads provide revenue to website owners, which enables them to create or otherwise purchase content for the website. Publishers claim that the prevalent use of ad blocking software and devices could adversely affect website owner revenue and thus in turn lower the availability of free content on websites. So, it is no wonder that publishers have begun to block or evict users found to be using #ADBlockers. (A page from my personal experience: I do not remember a time when I did not use ADblock. Before Mozilla, I used MyIE (Maxthon), which had these configurable filters.) But of late the publishers have become more aggressive and have rolled out a slew of their own warriors, AKA ADBlocker-Blockers. These are nifty little utilities you can embed in your site, and traffic from ADB-enabled users will be blocked until they disable the blocker or whitelist you. Some majors like the Economist, Wired and others have announced a novel approach: either you disable ADB on their site or you pay a small fee to see their site without the clutter of advertisements. For the sites that do not offer this feature, or if you wish to simply override them, read on.

Practice & Implementation

So, enter Anti-AdBlocker Killer — https://github.com/reek/anti-adblock-killer
It’s simple, really: it tricks sites that use #anti-adblocker technology into thinking you aren’t using an adblocker. Using a JavaScript file and a filter list, the Anti-Adblock Killer lets you keep your adblocker on when you visit a page that would usually force you to disable it. This means you can work around bans on adblockers from common news companies, like Forbes, which lock you out when you’re detected.
It works against a number of different technologies used to detect #adblock users, and is likely to be a part of the next #armsrace as publishers work out how to block the #adblockers using #adblocker-blockers. If you’re still reading, I will conclude my narration and give step-by-step instructions on how to install and activate it.

Step-by-step Instruction to Activate Anti-Adblock Killer

  1. Step 1 – Get a Script Manager:
    1.  Greasemonkey or Scriptish
    2.  Tampermonkey or Native
    3.  Tampermonkey or Violentmonkey
    4.  Tampermonkey or NinjaKit
    5.  Tampermonkey
        • (* After installation, depending on your browser, a restart may be required for it to take effect)
  2. Step 2 – Subscribe to a FilterList
    1. Subscribe from github.com (I prefer this)
    2. Subscribe from reeksite.com 
      • At this point, if you chose the GitHub list, you’ll be prompted with a list of extensions and you can choose to manually install AAKiller. (representative screenshot is shown below)
  3. Step 3 – Get User Scripts
    1. Install from greasyfork.org
    2. Install from openuserjs.org
    3. Install from github.com
    4. Install from reeksite.com

Once this is done, you’re on your way to enjoy AD-Blocker pop-up free browsing.

More data to support Planet Nine Hypothesis



Last year, the existence of an unknown planet in our Solar system was announced. However, this hypothesis was subsequently called into question as biases in the observational data were detected. Now, Spanish astronomers have used a novel technique to analyse the orbits of the so-called extreme trans-Neptunian objects and, once again, they report that there is something perturbing them: a planet located at a distance of between 300 and 400 times the Earth-sun distance.
Like the comets that interact with Jupiter.
At the beginning of 2016, researchers from the California Institute of Technology (Caltech, USA) announced that they had evidence of the existence of this object, located at an average distance of 700 AU and with a mass 10 times that of the Earth. Their calculations were motivated by the peculiar distribution of the orbits found for the trans-Neptunian objects (TNO) in the Kuiper belt, which suggested the presence of a Planet Nine within the solar system.
Using calculations and data mining, the Spanish astronomers have found that the nodes of the 28 ETNOs analysed (and the 24 extreme Centaurs with average distances from the Sun of more than 150 AU) are clustered in certain ranges of distances from the Sun; furthermore, they have found a correlation, where none should exist, between the positions of the nodes and the inclination, one of the parameters which defines the orientation of the orbits of these icy objects in space.
“Assuming that the ETNOs are dynamically similar to the comets that interact with Jupiter, we interpret these results as signs of the presence of a planet that is actively interacting with them in a range of distances from 300 to 400 AU,” says De la Fuente Marcos, who emphasizes: “We believe that what we are seeing here cannot be attributed to the presence of observational bias”.
Is there also a Planet Ten?
De la Fuente Marcos explains that the hypothetical Planet Nine suggested in this study has nothing to do with another possible planet or planetoid situated much closer to us, and hinted at by other recent findings.
Also applying data mining to the orbits of the TNOs of the Kuiper Belt, astronomers Kathryn Volk and Renu Malhotra from the University of Arizona (USA) have found that the plane on which these objects orbit the Sun is slightly warped, a fact that could be explained if there is a perturber of the size of Mars at 60 AU from the Sun.
“Given the current definition of planet, this other mysterious object may not be a true planet, even if it has a size similar to that of the Earth, as it could be surrounded by huge asteroids or dwarf planets,” explains the Spanish astronomer.
“In any case, we are convinced that Volk and Malhotra’s work has found solid evidence of the presence of a massive body beyond the so-called Kuiper Cliff, the furthest point of the trans-Neptunian belt, at some 50 AU from the Sun, and we hope to be able to present soon a new work which also supports its existence”.
