Author: Ramkumar Sundarakalatharan

Business Value Delivery by Engineering Teams in StartUps – Part 2


In this multi-part post, I will try to articulate my view on the importance of business value and its delivery by engineering teams. This is the second part, where I will share my perspective on the “How of it”.

Part 2: The How of it – Define, Visualise, Prioritise, Develop, Deliver & Measure.

The PMI Model of Delivering Business Value.

1) Define Systems Development Strategy

The first thing a “Tech” Founder needs to do is define the Systems Development Strategy. At a very high level, the systems development strategy should detail the state of the current/planned systems and the high-level business strategy for the next 2-5 years, and map out a plan to get there. An engineering leader will drive the creation and implementation of the development strategy to ensure the business can meet its current and future needs. Working closely with architects and technical leads, the engineering leader can formulate a solid development strategy.

The development strategy should detail the core architecture direction and technologies for the systems, including high-level plans for delivery. The development strategy is the crux of all efforts to deliver business value. Without a firm foundation of proper system architecture and technology, the business will have a difficult time delivering the value they need to survive. 

If you’re an Engineering Leader who joined the startup after the MVP was created, it is your responsibility to understand the business strategy and formulate the Development Strategy as early as possible.

If your startup doesn’t have a solid development strategy or similar document, the following is a great place to start:

  • Gather business needs: Gather high-level business needs/strategy to cater for the now and the future (2-5 year) horizon. Not a deep dive, but deep enough to judge existing systems and weigh other options. (Questions like: how many new users will be added month-on-month? What is the order of magnitude of transactions we plan to rake in: thousands, millions or tens of millions? Each will point you in a different direction on the system design.)
  • Review of existing systems: Analyse the current systems for fitness for purpose and whether they can be maintained and extended to meet the future needs uncovered above. (The MVP may seem to work fine, and it will be tempting to build “on top” of it with a plethora of “features”; resist the urge and the pressure, if applicable.)
  • Technologies / Architecture: Based on the review in the first two bullet points, you may recommend a strategic direction. The decision here could range from rebuilding the entire system with a new solution to replacing components of the system with off-the-shelf/open-source components. Alternatively, you may find the existing system is a strong foundation which needs modernising or scaling, in which case the development strategy document would detail a range of architecture and technology choices for future development.

The above is a good starting point and will allow the business to get started on implementing the development strategy. You may even do it before starting with the startup, as a pre-joining exercise with the founders and senior folks. At the end of this exercise, you will have performed an extensive analysis of the current systems and have a strategic direction for the systems.

2) Help the Business Define Requirements

It is essential to understand what needs to be delivered before you can go ahead and deliver the next Amazon or Airbnb. It has been my experience that on occasion, the business will need some “External Inputs” to finalise what is required. 

When the business has a lot of ideas for improvements, they can sometimes get muddled together and lost. To counter this, we at ITILITE do a “Quarter Theme”. Before ITILITE, I worked with Zarget, where we had similar “Themed Quarterly Roadmaps” as well. This “theming” helps in prioritising the focus areas. More on that in the next section.

We can then visualise the entire scope of these ideas using user story maps. User story maps are visual representations of functionality requirements, where all the requirements are documented using a system of cards. It becomes a more straightforward (not easy) task to slice and dice these requirements using a story map to cull anything that is not critical to the business.

For the remaining requirements, we need to gather a little more information to progress to the next step. For each requirement, we need to capture:

  • Description: High-level description of the change. Not a HLD/LLD but enough to provide a high-level order of magnitude estimate.
  • Business benefits: Here we are looking to understand what benefits we can expect from the business change. 
  • High-Level estimate: An order-of-magnitude estimate; lots of refinement is still to take place, but it gives us a good idea around sizing.
  • Business SME and Sponsor: Details of the people we can go to for more information.

The detail we capture for each of these changes is small; these items are a wish list only and not confirmed, so we do not want to waste more time on them than we need (lean thinking). While it is the domain of product managers and business analysts to flesh out business requirements and benefit statements, the engineering leader also plays an essential part in this process. Engineering leaders can use their experience to provide the high-level estimates for development, or indeed recommend ways to implement the requirement without the need to write additional code.

Another area where engineering leaders should exert influence is ensuring non-functional work (technical strategy items or technical debt) is included for development prioritisation. This technical plumbing is not attractive to the business but could be critical for the business to achieve its long-term goals. Engineering leaders are the people that need to fight to ensure these items are on the table.

Also, while you analyse requirements, try where possible to group requirements that affect the same code or system module. Grouping requirements will assist us in prioritisation and sequencing, and hence the go-to-market, which is a key parameter for the business. The last thing we need to do is store these requirements in our product backlog, to be reviewed and prioritised by the business in our next step.

3) Visualise the work and prioritise

In our third step, we are getting closer to the business deciding on their valuable items. Taking our list of requirements from our product backlog, we now present these to the business to discuss and rank in order of importance.

As discussed above, there will always be x+n “Projects” in the asks, where “x” is the number of features you can effectively deliver in the timeline. And all of these “Projects” will look like they are P0 to solve.

If Everything Is a Priority, Then Nothing Is!

Well, the quote wasn’t from Morpheus; I just liked that meme (its attribution is debated, between Yuri Van Der Sluis and Garr Reynolds).

Having an extensive list of items to visualise enables the business to understand that we cannot have everything, and need to select the items that will make the most significant difference to their business (i.e., highest business value). 

This is, again, not because of intent, but because of trying to do too many things too soon. Independently, all of the asks may sound truly important. Every leader/function within your start-up will come with several competing “Projects”. The finance team may want that flashy invoicing module or an ERP integration with your suppliers/customers, Customer Success would want that advanced analytics platform integrated, and the support teams may want those long-standing “quirks” in the product ironed out. Left to Engineering, this is a sure recipe for disaster. This is where strong product leadership helps!

A business analyst or product manager typically runs these planning and prioritisation meetings. However, the engineering leader also has a place at the table to provide insight and assistance to the business’s decision-making process. Who from the business should attend these meetings? It is essential to gather a broad cross-section of business stakeholders from every department or function that uses the product in question. We don’t want one department having so much influence that the outcome may not benefit the business as a whole.

The meeting could have the following Agenda:

  • Review items: The group will discuss each item in the (curated) product backlog in an open and honest discussion. 
  • Accept or reject: The item will be approved for development or rejected. Rejected items will have their requestor notified, to ensure they are in the loop. 
  • Ranking: Approved items get added to the backlog in priority order. 

At ITILITE, we keep the backlog/thematic items in a Google Sheet, which is distributed at least a week before the meeting so the leaders have enough time to review and ask questions beforehand, making for a smooth meeting. During the meeting, we go through the sheet top to bottom, taking notes where required.

At the end of this meeting, we have something special: a prioritised list of business value and key outcomes.

The prioritisation meeting can be held quarterly or monthly, depending on the speed of change in your start-up. We do it on a quarterly cycle with the leadership, so as not to overload these folks and keep them from actually getting their work done.

If the business has urgent changes which require attention, an emergency prioritisation session can be called, where the group meets to review and approve changes to the delivery schedule. Alignment should happen outside, in one-on-ones; this meeting is a platform for other leaders to either assent or dissent on the re-prioritisation.

4) Schedule and communicate delivery

In the fourth step, we now have a list that ranks all the business requirements in priority order, and confidence that the business indeed wants these work items completed, in the order they prefer. The engineering team can now spend time working out how to deliver these items. Remember, in step two we gathered very high-level requirements (so as not to waste time before they were endorsed); we now need to flesh these items out enough to commence delivery.

There are a few mechanisms we can use to gather the information we need to get going; the main one I like is the feature or project kickoff & inception. The kickoff is a process where we get the delivery team together to discuss the work that needs to be delivered. Inceptions can run anywhere up to a few weeks for big projects; it depends on how much time you allow here. During our inception, the delivery team all get on the same page with the requirements in question and can ask questions of each other or the sponsor to get all the information they need.

Technical delivery decisions can also be made, including creating simple prototypes to test out delivery options. Once the inception is complete, the delivery team have all their information, more confident delivery estimates are possible, sprint planning can take place, and the overall delivery schedule is known.

From here, the final step is communicating the delivery schedule to all relevant stakeholders, ensuring people can ask questions or point out any problems they see with the schedule.

5) Deliver value often and get feedback

The final step here is to get the job done. The best way to deliver software is in small chunks completed during our sprints (typically two-week blocks). Sprints are the quickest way to deliver business value, allowing the business to start using this value much sooner than waiting for a monolithic release.

At the end of each sprint, the team should be running product demo events. A product showcase allows the product and engineering teams to show off their excellent work to the business, who have an opportunity to provide feedback on the product. This can start even before the first “releasable” product is out: with mockups, designs and prototypes, then progressing to V0, V1 and so forth. This feedback loop is another mechanism to ensure we are hitting the mark in terms of delivering business value.

Conclusion

I hope I was able to do justice to the process in this article. The key to delivering business value is having close relationships with the stakeholders, ensuring that they are involved in each step of the process. The business stakeholders are the only folks that can define business value. However, it is the role of engineering leaders to ensure proper technical oversight takes place to ensure the timely delivery of business value.

Business Value delivery by Engineering Teams in StartUps – Part 1


In this multi-part post, I will try to articulate my view on the importance of business value and its delivery by engineering teams. While most of this is written from the view of a StartUp, some elements of an established organisation are also used.

Part 1: Defining Business Value & Role of Leadership in it.

Business value is a concept that can mean multiple things to multiple people, and the tricky part is that all of them could potentially be valid. A product manager may value a long list of features that his/her customers have demanded for months. Another product manager, working with internal teams to improve efficiency (revenue), will value the enhancements the accounting or support team was after. Meanwhile, the support manager may value a more stable product that keeps the customers s/he deals with happy.

Business value and impact are difficult to define and deliver, and even more difficult to measure. A collaborative effort is required to define and deliver business value, with consideration needed to ensure all voices are heard.

While most of what I will be covering in this article is typically the purview of product management, I have learned that engineering leaders have a critical role to play in this space. (Will write more on that in the next part.)

Engineering leaders bring product development experience and technical expertise to the table to provide a crucial element to the delivery of business value which I will try and explain in this article.

What is Business Value?

I would define “business value” as any improvement to systems, processes or people that augments the products, or the ability to deliver products or services to the customers, thereby increasing revenue or experience or both. No two companies will have the same definition of business value. Forget two different companies: a company in its 5th year will have a very different perception of value from its first year. This is due to their products and customers being different, requiring different elements to add value. One company may find value in the ability to build out its new product offering quickly, while another may find value in responding to customer support requests in a timely manner.

Due to the rapid changes around us, the things that businesses value change often. Companies often face new challenges that require a quick response.

“Be agile, be nimble” is the key phrase.

These challenges can come in the form of new product features released by competitors, a specific feature request from a key customer, or changes in the market that render the current product/feature obsolete. Business needs or desires therefore change just as quickly as any of these external changes.

You have probably worked for a company that comes to the engineering team with new requirements, seemingly daily.

It is not because they cannot make up their mind; it is in response to changing business needs. This moving goalpost is one of the main reasons that Agile development practices have taken precedence over more traditional waterfall methodologies for software development.

Velocity is everything; a report by McKinsey on how Developer Velocity fuels business performance gives more insight into this. A snapshot from the report is below.

Why is business value important? 

Reacting to change and delivering business value with haste is a crucial area of importance for modern businesses. All companies exist for a purpose. The majority of companies exist to return a profit for their owners (individuals or shareholders), while some companies exist to provide a social service. The critical thing to note is that they all exist to fulfil a specific purpose which guides their definition of business value. 

No matter the company, large or small, if it stops innovating and its products or services stop being relevant to society at large and the market in particular, that company will wither and eventually die.

Kodak is a prime example of this occurring in recent history. In today’s world, IT, whether it be hardware or software, is the largest driver of business value. It is therefore critical that software engineering teams keep delivering the things that the business needs to fuel its innovation.

We, as engineers, are not employed just to build that shiny app in the latest technologies, but to deliver our contribution in support of the business purpose (if not drive it!).

The importance of Engineering Leadership in Delivering Business Value

An engineering leader is, of course, a People Leader, and s/he is also responsible for Execution: both the technology and the delivery of the engineering team. However, there is a third dimension which often goes unrecognised: great engineering executives must also be great Business Leaders; they help drive alignment with other leadership/executives and shape the strategy and direction of the business itself.

It is this underutilised/forgotten element which I will try to detail here.

A People Leader & an Execution Champion:

Engineering leadership is often naively thought of as simply being a great Architect, Engineer or Manager. But most of you already know it’s more than that. Team leadership will involve some combination of team building, culture, leadership development and performance management.

For detailed coverage of Engineering Leadership, please check out my previous post.

Most of these responsibilities will be bang in the middle of the comfort zone of a rising Engineering Leader. But one of the hardest things for most engineering leaders as we scale is to continue having an accurate forecast of when products and features will be delivered, which is what the business always asks for.

That is partially because this bleeds into the third, and the least recognised dimension of engineering leadership.

The Missing Sauce: A Business Strategist

Engineering leadership isn’t just about delivering products faster, or making engineers more productive. It’s about guiding the team in the same direction as the business, about continuously improving, and it’s about being the voice of engineering as a part of the decision-making process of the executive team. Of course, these are all dependent on our ability to understand the work our engineering teams are doing and how it aligns to business goals.

The third dimension – Business Alignment – is often overlooked or made difficult by other executives, but is absolutely necessary for the management of a successful engineering org. This is the strategic practice of engineering management, and all operational decisions depend on it. Business alignment means ensuring your organization is focused on the right projects that align with the business’s goals. 

The Product org can detail/design, and the Engineering org can build, as many features as they can agree on, but what does it matter if they do not align with the business objectives or goals? Business alignment involves the right allocation of resources to support business objectives, and helping to drive the business decisions about which projects are strategically important. (At itilite, this is always the First Principle.)

How do we deliver business value?

So how do we actually deliver business value? Business value isn’t created by a soloist delivering a virtuoso performance, but by the business, product, engineering and customer success teams collaborating to realise a shared vision.

Below are the five ways this can take place; together, these provide a roadmap for delivering business value:

  • Define systems development strategy
  • Help business define requirements 
  • Visualise the work and prioritise
  • Schedule and communicate delivery
  • Deliver value often and get feedback

I will walk through each of these one at a time and dig into a little more detail in Part 2 of this article.

Engineering Leadership in Start-Ups: Engineering Manager, Director, VP of Engineering.


This post is partly the result of my discussions with our People practice leader and talent acquisition executive. ITILITE is at a phase of growth where we are looking for more engineering & product management bandwidth, and I had to think hard while writing the various job descriptions. So, I have tried to generalise them using my experiences from the last 2-3 stints. In case you’re interested in exploring an Engineering Management role with ITILITE, please get in touch with me or write to careers{at}itilite.com.

Engineering Leadership

Apps are becoming increasingly omnipresent, and in most cases there is a startup behind them. Engineers make up to 70% of a tech startup’s workforce, so there is an increasing need for managers who look after those developers. As a result, the number of engineering managers has risen in recent years. Engineering managers are responsible for the delivery teams that develop these “Apps”. The following is a very generalised version of what you could do in these roles, and a possible career progression.

Engineer to Tech Lead/Lead Developer

The first step in your journey from an Individual Contributor (IC) to a management role. This could be a mix of people management, delivery management, process management etc., depending on the context of your organisation. In most organisations, it is a “technical mentorship” role with some aspects of people management, quality and delivery ownership.

Most Tech Leads are natural technical leaders: great engineers in their own right, well respected by the engineers around them. They work reasonably well with the team, understand how the product/module is designed, built and shipped, have a decent sense for making the right kinds of product tradeoffs, and are willing to do just enough project management and people development to keep the team/project humming along.

In this role,

  • Most TLs would retain some independent deliverables in addition to anchoring and owning the deliveries of their team.
  • Most of the team still works on the same module/feature or sub-system
  • They do code & design reviews, suggest changes and have the final say for their modules.
  • Together with the Product Managers, they “own” the feature/module.

We at itilite call them Engineering Owners, much like Product Owners.

Tech Lead to Engineering Manager

The next step is the Engineering Manager. In this role, you will be “managing” a collection of inter-related modules/projects, and the focus on timely delivery, people management and quality is higher than on technical design & architecture. But you are very much an engineer, and may occasionally be required to write quick hacks and frameworks for your developers to build atop.

The main difference is that you will be responsible for the delivery of multiple projects in a related area. You will be expected to optimise the resources (devs, testers, etc.) available to you to maximise the output of your group across multiple projects/modules.

In this role, you’d be

  • Expected to actively engage with the Product Management teams to define what needs to be built
  • Defining how you will measure the outcomes of what your team is building, and quantifying the outcomes with metrics
  • Ensuring quality, getting stakeholder alignment and signoffs
  • Macromanaging the overall deliverables of your group

The Pivot – Tech, Product, Solution Architect

The next step in your career gives you two options: one with people management and P&L accountability, the other a purely technical role. If you’re planning for a pureplay technical role, some organisations have Staff Engineer, Principal Engineer etc. In essence, these are mostly a combination of Tech Lead + Architect type roles. Depending on your seniority/tenure and organisational context, you may be reporting to an Engineering/Delivery Manager, a Director/VP or the CTO. In this role,

  • You will work closely with Engineering Managers, Quality Assurance leads/managers and Product Owners to design the system architecture and define the performance baselines
  • You will work with Tech Leads and Sr. Devs to drive performance, redundancy and scalability, among other things
  • You will be called in to discussions/decisions when the team can’t reach consensus on engineering choices

Engineering Manager to Director of Engineering

A Director of Engineering role is completely different. You now have multiple leads and managers, and likely multiple projects, within a general focus area of the organisation. This means there will be far more individual deliverables and project milestones than you can track in detail on a regular basis. Now you have to manage both people and projects “from the outside” rather than “from the inside”. You’ll likely start appreciating metrics and dashboards, as they will help you track those multiple projects: deadlines, schedules, overruns etc.

You have to make sure that your managers and leads are managing their resources appropriately and support them in their effort rather than managing individual contributors and projects directly.

Lots of great technical leaders have difficulty making this transition.

While being an engineering lead/manager is certainly managing, that type of managing from “within the project” is much easier than “managing from outside the project”; as a director, you almost always have to manage multiple people and projects “from the outside”.

Also, as a director, you will be responsible for a number of aspects of the culture, such as

  • What kind of people you are hiring, and setting responsibilities and workload expectations
  • What the team(s) are doing for fun, and how they interact with other functions
  • What kinds of performance are rewarded/encouraged vs punished/discouraged.

Now, moving to some serious responsibilities: you may be the first major line of responsibility for what to do when things do not work,

  • an employee not working out,
  • a project falling behind,
  • a project not meeting its objectives,
  • hiring not happening in time, etc…

While most of these things are the direct responsibility of the engineering manager, the engineering manager is usually not left to face these issues alone; they work on them with the director, and the director is expected to guide the process to the right decision/outcome.

I’ve seen people who were great technical leaders and good engineering managers who did not enjoy being a director at all (or weren’t as good at it), because it is a whole different type of managing, bordering on administration.

Director to Vice-President

The VP of Engineering is the executive responsible for all of engineering: development, quality, DevOps, and partly security and product management as well. While both the engineering manager and the director of engineering have managers who themselves have likely been engineering managers and directors before, the VP may work for the CEO (in an early-stage startup or a smaller company), who has never been a VP of Engineering before.

A large company may have multiple levels of VPs, but in most cases you work for someone who hasn’t been a VP of Engineering or doesn’t actually know how to do your job. This means there simply is no first-hand experience from your manager that you can rely on to solve your problem. The first time you step into that role and realise that, it’s a sobering thought. You’re pretty much on your own to figure things out. Not only are you completely responsible for everything that happens in the engineering organisation, but when things aren’t going right, there’s pretty much no help from anywhere else. You and your team have to figure it out by yourselves. Many successful VPs eventually come to like this autonomy, but it can be a big adjustment when moving from director to VP.

At the director level, you can always go to your VP for help and consulting on difficult issues and they can and should help you a lot. At the VP level, you may consult with the executive team or the CEO on some big decisions, but you’re more likely talking to them about larger tradeoffs that affect other parts of the company, not how you solve issues within your team.

As a VP, you are primarily responsible for setting up processes and procedures for your organization to make it productive:

  • Team/Project tools such as bug system, project tracking, source code management, versioning, build system, etc.
  • Defining/improving processes to track, monitor and report on projects.
  • Defining processes to deal with projects that run into trouble.
  • Hiring: How do you hire? What kind of people do you hire? How do you maintain the quality of new hires?
  • Firing: When someone isn’t working out, how do you fix it: reassignment, training, performance plan, transfer, firing?
  • Training: How does your team get the training they might need? It could be hard skills, soft skills or managerial.
  • Rewards: How do you reward your top individual contributors and your top managers?

You may be part of the Leadership “Council” or participate regularly in business discussions that may or may not concern your department directly. In a startup, you are often “the” technical representative on exec staff. You help craft the strategy of the business. You are relied upon for technical direction of the company (sometimes with the help of a CTO).

As a VP, you are expected to understand many important aspects of other departments: what is important to them and how your department serves, interacts with or depends upon them. Two classic examples might be:

  • Sales depending upon certain product features/capabilities being delivered in a given timeframe to be able to convert a prospect.
  • Customer success depending upon certain product fixes being delivered in a given timeframe.

As a VP, you will participate in the setting of these timeframes and balancing these against all the other things your department is being tasked to do.

As you can see, Engineering Management/Leadership is a very interesting career option. We have multiple openings across Product and Engineering functions at ITILITE. Please see if any of these roles interest you.

Building a Log-Management & Analytics Solution for Your StartUp


Background:

As described in an earlier post, I run Engineering at an early-stage #traveltech #startup called Itilite. One of my responsibilities is to architect, build and manage the cloud infrastructure for the company. Even though I had designed, built and maintained cloud infrastructure in my previous roles, this one was really challenging and interesting, due in part to the fact that the organisation is a high-growth #traveltech startup and hence,

  1. The architecture landscape is still evolving,
  2. Performance criteria for the previous month look like the minimum acceptable criteria the next
  3. The sheer volume of user-growth, growth of traffic-per-user
  4. Addition of partner inventories which increases the capacity by an order of magnitude

And several others. Somewhere down the line, after the infrastructure, code-pipeline and CI are set up, you reach a point where managing (read: trigger intervention, analysis, storage, archival, retention) logs across several infrastructure clusters like development/testing, staging and production becomes a bit of an overkill.

Enter Log Management & Analytics

We worked our way up from simple tail/multitail to Graylog aggregation of 18 server logs: app servers, database servers, API endpoints and everything in between. But, as my honoured former colleague Mr. Naveen Venkat (CPO of Zarget) used to mention in my days with Zarget, there are no “go-to” persons in a start-up. You “go figure” yourself!

There is definitely no “one size fits all” solution, and especially in a start-up environment, you are always running behind features, timelines or customers (scope, timeline or cost in the conventional PMI model).

So, after some due research to account for the recent advances in Logstash and Beats, I narrowed down the possible contenders that could power our little log management system:

  1. ELK Stack  — Build it from scratch, but have flexibility.
  2. Graylog  — Out of the box functionality, but you may have to tune up individual components to suit your needs.
  3. Fluentd — Entirely new log-management paradigm, interesting and we explored it a bit.

(I did not consider anything exotic, or anything that involves us paying (in future) more than what we pay in the first year. So, some great tools like Splunk, Nagios, LogPacker and LogRhythm were not considered.)

Evaluation Process:

I wrote an Ansible script to create a replica environment and pull in the necessary configurations, and used a previously written load-test job to simulate a typical work hour. This configuration was used for each of the frameworks/tools considered.
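The exact playbooks are internal, but the shape of each evaluation run looked roughly like this; a minimal sketch, with made-up playbook and script names (provision_replica.yml, install_graylog.yml and run_load_test.sh are illustrative, not the real files):

[bash]
# Spin up a replica of the production environment and install one candidate stack
ansible-playbook -i inventories/replica provision_replica.yml
ansible-playbook -i inventories/replica install_graylog.yml   # swap for elk.yml / fluentd.yml

# Replay a typical work hour of traffic against the replica
./run_load_test.sh --profile typical-work-hour
[/bash]

Each candidate (Graylog, ELK, Fluentd) got the same replica and the same load profile, so the comparison stayed apples-to-apples.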

I started experimenting with Graylog, due to familiarity with the tool, and configured it the best way I felt appropriate at that point in time.

Slight setback:

However, the collector I had used (Sidecar with Filebeat) had a major problem sending files over 255KB when the interval was less than 5 secs: the packets that were to be sent to Elasticsearch never made it, and the pile-up caused a major issue for application stability.

One of the main use-cases for us is to ingest XML/JSON data from multiple sources. (We run a polynomial regression across the sources and use the nth derivatives for further business operations.) Our architecture had accounted for several things, but by design we used to hit momentary peaks in CPU utilisation during the “merges”, and all of these were “NICE” loads.

When the daily logs you need to export run upwards of 5GB for one app (JSON logs), then add multiple APIs and some microservices’ application logs, web-server, load-balancer, CI (Jenkins), database query log, bin-log, redis and … yes, you get the point?

Upon further investigation, the sidecar collector was actually not the culprit: it was those momentary CPU peaks from the “merges” mentioned above (in our defence, they were “NICE” loads).

So, once the CPU hit the 100% mark, sidecar started behaving very differently. But we ultimately fixed it with a patched version of sidecar and by actually shifting to NXLog.

The experiment with ELK was a different beast in itself, as provisioning and configuring took a lot more time than I was comfortable with. So I switched to the AWS “packaged service”: we deployed the ES domain in AWS, fired up a couple of Kibana and Logstash instances and connected them (after what appeared to be forever), and it was a charm. I was able to get all the information required in Kibana. One downside is that you need to plan the Elasticsearch indices according to how your log sources will grow; for us, that was impractical.

Fluentd was an excellent platform for normalising your logs, but it also depended on Kibana/ES for the ultimate analysis frontend.

So, finally, we settled on good old Graylog.

Advantages of Graylog

 The tool perfectly fit into our workflow and evolving environment:

  1. Graylog is free & open-source software, so we won’t have to pay now or in the future.
  2. Its trigger actions and notifications are a good complement to Graylog monitoring, just a bit deeper!
  3. With error stack traces received from Graylog, engineers understand the context of any issue in the source code. This saves time and effort in debugging/troubleshooting and bug fixing.
  4. The tool has a powerful search syntax, so it is easy to find exactly what you are looking for, even if you have terabytes of log data. The search queries can be saved. For really complex scenarios, you can write an Elasticsearch query and save it in the dashboard as a function.
  5. Graylog offers archiving functionality, so everything older than 30 days can be stored on slow storage and re-imported into Graylog when the need appears (for example, when the dev team needs to investigate a certain event from the past).
  6. Java, Python & Ruby applications can easily be connected with Graylog, as there are out-of-the-box libraries for this.
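One habit that served us well with Graylog: before trusting a new input or collector, push a synthetic message straight into it. A minimal sketch, assuming a GELF HTTP input is listening on the default port 12201 (the host name graylog.internal is an assumption; adjust to your setup):

[bash]
# Send a test message to a Graylog GELF HTTP input and verify it appears in search
curl -X POST http://graylog.internal:12201/gelf \
  -H 'Content-Type: application/json' \
  -d '{"version": "1.1", "host": "app-server-1", "short_message": "ingestion smoke test", "level": 6}'
[/bash]

If the message shows up in a stream search within a few seconds, the collector-to-Graylog path is healthy.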

#logmanagement #analytics #startup #hustle #opensource #graylog #elk

What is SA-Core-2018-002 and How Acquia Mitigated 500000 attacks on Drupal


Disclaimer: I have been working on WCMS, and specifically with Acquia/Drupal, for more than seven years. In that period, I have developed a love/hate relationship with Drupal: love for Drupal 6 and hate for 7, or something like that. So my views may be slightly non-neutral.
 
On March 28th, the Drupal Security Team released a bug fix for a critical security vulnerability named SA-CORE-2018-002. Over the following week, various exploits were identified as attackers attempted to compromise unpatched Drupal sites. Hackers continue to try to exploit this vulnerability, and Acquia’s own security team has observed more than 100,000 attacks a day.

Timeline of SA-CORE-2018-002

The remote code execution exploit, the so-called SA-CORE-2018-002, was a vulnerability that had been present in various layers of Drupal 7 and 8. And Drupal being Drupal, it has one of the most efficient governance models among the open-source projects around. This I can say with confidence and pride, as I have had more than a few interactions with the community: notifying issues, committing documentation, joining feature roadmap discussions (agreed, some of them are heated!) and submitting patches/fixes. The Drupal community has very high standards, and even if your patch or fix has functionally addressed the underlying issue, it may still be declined. That said, it is also one of the most democratic community software projects you can find. Still, they insist on following stringent community standards for modules and themes.
So, it is no surprise that Drupal today has one of the best responsible disclosure policies.
The Drupal community had previously notified all the developers in official channels and asked them to prepare for a high-impact patch. Meanwhile, Acquia did the same for its SME and enterprise clients. Those in the deep of it knew a bit early on about the nature of the exploit and the mitigation strategy.
And in the community forums, there were detailed descriptions of how to plan this infrastructure patch-up: uptime, isolation post-disclosure, patching, updating and redeployment.
Multiple methods to suit different environments, architectures etc. also began to appear. It was one giant machinery, albeit a self-governing one. I have known large organisations to do a hodge-podge patchwork to contain an underlying vulnerability, leaving a vendetta-driven ex-employee or a determined hacker to expose the inner workings of the exploit; that has resulted in many multi-million dollar losses. Only after the #Apache project reached a state of maturity did these larger organisations learn the art of disclosure. But how many of them actually practise it is a big question.
Till 28th March 2018, there was no (publicly) known exploit for the RCE in Drupal 7/8.
This all changed after Checkpoint Research released a detailed step-by-step explanation of the security bug SA-CORE-2018-002 and how it can be exploited. In less than 6 hours after Checkpoint Research’s blog post, Vitalii Rudnykh, a Russian security researcher, shared a proof-of-concept exploit on GitHub.
The article by Checkpoint Research and Rudnykh’s proof-of-concept code have spawned numerous exploits, written in different programming languages such as Ruby, Bash, Python and more. As a result, the number of attacks has grown significantly since then.
The scale and the severity of this attack suggest that if you failed to upgrade your Drupal sites, or your site is not supported by Acquia Cloud or another trusted vendor that provides platform-level fixes, the chances of your site being hacked are very high. If you haven’t upgraded your site yet and you are not on a protected platform, then assume your site is compromised. Rebuild your host, reinstall Drupal from a backup taken before the vulnerability was announced, and upgrade before putting the site back online.
Geographic distribution of SA-CORE Attack Vectors

Solution:

Upgrade to the most recent version of Drupal 7 or 8 core.

  • If you are running 7.x, upgrade to Drupal 7.58. (If you are unable to update immediately, you can attempt to apply this patch to fix the vulnerability until such time as you are able to completely update.)
  • If you are running 8.5.x, upgrade to Drupal 8.5.1. (If you are unable to update immediately, you can attempt to apply this patch to fix the vulnerability until such time as you are able to completely update.)

Drupal 8.3.x and 8.4.x are no longer supported, and the community doesn’t normally provide security releases for unsupported minor releases. However, given the potential severity of this issue, the Drupal community chose to provide 8.3.x and 8.4.x releases that include the fix for sites which have not yet had a chance to update to 8.5.0.
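For sites managed with Drush, the Drupal 7 upgrade itself is a short exercise. A hedged sketch using standard Drush commands of that era (take a backup first and rehearse on staging; the dump file name is illustrative):

[shell]
# Put the site into maintenance mode and back up the database
drush vset maintenance_mode 1
drush sql-dump > backup-before-sa-core-2018-002.sql

# Update core to the patched release, run any pending DB updates, then reopen the site
drush pm-update drupal
drush updb -y
drush vset maintenance_mode 0
[/shell]

On Drupal 8, the equivalent maintenance-mode toggle is drush sset system.maintenance_mode 1, with core updated to 8.5.1.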

DevOps Post Series : 2, How to install and configure SSL/TLS Certificate on AWS EC2


Assumption:

It is assumed that you have launched an EC2 instance with a valid key, configured the security groups, installed Apache/Nginx and deployed your app.

Background:

Now, it’s time to configure your TLS/SSL certificate. Why would you want to configure your own certificate when you can get Amazon to issue a free TLS/SSL certificate? Well, there are more than a few use-cases that we have come across.

  1. First and foremost, AWS Certificate Manager certificates can be installed only on Elastic Load Balancers, Amazon CloudFront distributions, or APIs for Amazon API Gateway. (At the time of writing.)
  2. You are building a staging/testing server and would test integrations in it and require SSL/TLS.
  3. You are just starting off, and have only one EC2 instance to start with. (You cannot install an AWS-provisioned certificate on an EC2 instance directly.)
  4. You are provisioning a new service, say for data exchange between your customers and their customers/vendors etc., which will be a very under-utilised service.
  5. You are planning an endpoint for SSO/OpenID etc. and prefer to have this part logically separate from your app.abc.com or abc.com.

And there are at least a dozen other use-cases that come to mind, but I am leaving them out for brevity.

Getting Started

Self-Signed Certificates:

Firstly, enable Apache on your EC2 instance and install/enable SSL.
(As usual, I’ll try to give the instructions for both RPM and DEB package-based distributions.)
[shell]sudo systemctl is-enabled httpd[/shell]
This should return “enabled”; if not, enable and start it, update your packages and install mod_ssl:
[shell]
sudo systemctl start httpd && sudo systemctl enable httpd
sudo yum update -y
sudo yum install -y mod_ssl
[/shell]
Answer any on-screen prompts (basic questions like domain name, country, email ID etc.). If you accepted the default locations from the prompts, you will have generated 2 files in the following locations.
/etc/pki/tls/private/localhost.key – This is an auto-generated 2048-bit RSA private key for your Amazon EC2 host. You can also use this key to generate a certificate signing request (CSR) to submit to a certificate authority (CA).
/etc/pki/tls/certs/localhost.crt – This is a self-signed X.509 certificate for your server host. This certificate is useful only where you can control the “client” environment, like a testing or staging server.
Now, restart Apache:
[shell]sudo systemctl restart httpd[/shell]
And try https://your-aws.public.dns or https://[yourpublicip].
Since you’re accessing your site with a self-signed, untrusted host certificate, your browser may display a series of security warnings. But once you add it to the exception list, you should be good to go. This would be the end of it if you’re only looking for a certificate to be used for staging or other controlled environments. If you want a public-facing SSL certificate, so your users/customers can log in and access this new service, read on.

CA-Signed Certificate

– Go to /etc/pki/tls/private/  and generate a new private key
[shell]sudo openssl genrsa -out virtualserver1.key 2048[/shell]
This generates an RSA key that is identical to the default key. You can generate a 4096-bit key, or not use RSA at all and depend on some other mathematical model, but those are beyond the scope of this post.
[bash]
sudo chown root.root virtualserver1.key
sudo chmod 600 virtualserver1.key
ls -al virtualserver1.key
[/bash]
Now, you can use this key to generate a Certificate Signing Request
[shell]sudo openssl req -new -key virtualserver1.key -out csr.pem[/shell]
When you do this, OpenSSL will open a series of prompts for all sorts of data; the “Common Name” is the one field which is mandatory for you to get a certificate. All other data requested is optional. Once you’re done with that, you will have generated csr.pem.
Submit the CSR to a CA. This usually consists of opening your CSR file in a text editor and copying the contents into a web form. At this time, you may be asked to supply one or more subject alternate names (SANs) to be placed on the certificate.
Remove or rename the old self-signed host certificate localhost.crt from the /etc/pki/tls/certs directory and place the new CA-signed certificate there (along with any intermediate certificates).
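Apache also needs to be pointed at the new files; a minimal sketch of the relevant mod_ssl directives (virtualserver1.crt and intermediate.crt are assumed names, use whatever your CA issued):
[shell]
# Excerpt from /etc/httpd/conf.d/ssl.conf -- point mod_ssl at the CA-signed files
SSLCertificateFile /etc/pki/tls/certs/virtualserver1.crt
SSLCertificateKeyFile /etc/pki/tls/private/virtualserver1.key
SSLCertificateChainFile /etc/pki/tls/certs/intermediate.crt
[/shell]
Restart Apache after saving (sudo systemctl restart httpd) so the new certificate is served.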
Once you’ve copied the contents of the CSR file into the form and submitted it to your CA, you will receive an email confirming the “issue” of the certificate. Once it’s done, you can check your application over HTTPS; it should now show a “green padlock”, meaning fully secure.
However, you can also run a security test on your SSL setup: just go to SSL Labs and start a test by giving your URL. After about 2-5 mins, you will receive a rating and details, something similar to the following image.
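If you prefer a quick check from the terminal before (or alongside) the SSL Labs scan, a small sketch using stock OpenSSL (replace example.com with your domain):
[shell]
# Fetch the served certificate and print its subject, issuer and validity window
openssl s_client -connect example.com:443 -servername example.com </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates
[/shell]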

That’s It! You’re done.

DevOps Post Series : 1, How to install and configure LAMP on AWS EC2


In this #DevOps-centric series of blog posts, I will write about some interesting yet common problems and their solutions, along with quick guides and how-tos. This is the result of setting up a new #Datacenter for the #Startup I am working with.
 
In this post, I will assume that you have already launched an EC2 instance with the operating system of your choice. Generally, Amazon Linux (based on RedHat/CentOS) or Ubuntu is the preferred OS. In case you prefer an exotic flavour of Linux which supports neither rpm/yum (RHEL/CentOS/Fedora/AMI) nor apt (Debian/Ubuntu and derivatives), this article may not be of much use to you.

  1. Connect to your instance – Use the private key you downloaded during the ec2 launch.
    1. If you’re on Linux or Mac, use the following, replacing the key name and public DNS with your own: ssh -i "loginserver.pem" root@<your-instance-public-dns>
    2. If you’ve launched an Amazon Linux instance, use “ec2-user” instead of “root”
    3. If you’ve launched an Ubuntu Linux instance, use “ubuntu” instead of “root”
    4. Another important thing is to ensure that the private key has 0400 permissions and is “owned” by the user who will execute the ssh connection.
  2. Update your package manager
    1. Amazon Linux : sudo yum update
    2. Ubuntu Linux: sudo apt-get update
  3. Tools & Utils (Optional/Personal Preference) I normally prefer to have a couple of tools installed in the server for quick-hacks/edits, monitoring etc.
    1. Amazon Linux : sudo yum install -y mc nano tree multitail git lynx
    2. Ubuntu Linux: sudo apt-get install -y mc nano tree multitail git lynx
      1. For details on the above-mentioned tools, refer the bottom of the article.
  4. LAMP Server
    1. Amazon Linux :sudo yum install -y httpd24 php70 mysql56-server php70-mysqlnd mysql56-client
    2. Ubuntu Linux: sudo apt-get install mysql-client-core-5.6 mysql-server-core-5.6 apache2 php libapache2-mod-php php-mcrypt php-mysql
      1. Your operating system will start to download and install the specified software. For MySQL, you will be prompted for a root password. After installation, I strongly recommend you run mysql_secure_installation and proceed with the on-screen instructions.
      2. Some of the critical things to do are: remove the “test” db and remove access for "root"@"%"; the others are optional.
      3. The optional steps are,
        1. remove the anonymous user accounts.
        2. disable the remote root login.
        3. reload the privilege tables and save your changes.
  5. Configuration and other dependencies
    1. Amazon Linux :
      sudo yum install php70-mbstring.x86_64 php70-zip.x86_64 composer node -y
    2. Ubuntu Linux: replace yum install with apt-get install

Finally, restart the services and off you go: you have successfully installed a LAMP server on EC2. Now, go to your browser and enter the public DNS of the EC2 instance, and you should see the default Apache page. If you get either a timeout or a not-found error, it may mean you have to configure the security group accordingly: you should “ALLOW” ports 80/443 (HTTP/HTTPS) in the security group.
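For the restart-and-verify step, a minimal sketch on Amazon Linux (on Ubuntu, substitute apache2 and mysql for httpd and mysqld; the phpinfo.php page is only a throwaway test, remove it afterwards as it leaks configuration details):

[shell]
# Restart the services and enable them at boot
sudo service httpd restart && sudo chkconfig httpd on
sudo service mysqld restart && sudo chkconfig mysqld on

# Drop a test page and fetch it; a dump of PHP settings means the stack is live
echo '<?php phpinfo(); ?>' | sudo tee /var/www/html/phpinfo.php
curl -s http://localhost/phpinfo.php | head -n 5
sudo rm /var/www/html/phpinfo.php   # clean up once verified
[/shell]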


How to Disable an Adblocker-blocker or Create an Anti-Adblock Killer!


History & Theory:

Digital Advertisement:

I get it. Ads are a necessary evil in the content-delivery game. Hell, I have been on the engineering side of content delivery for 10 yrs myself. So, back in the days of the #dotcom #bubble, we endured banner ADs. When #BigBrother, oops, #Google came up, they swept the market clean with their (initially, at least) non-intrusive text ADs. And people even appreciated the contextual advertisement: just when you were searching for a suspension for your car, you would see 4 different ADs for OEM-grade replacement suspensions, grease monkeys to install them, and so on.
Fast forward 10 yrs and Google is the global powerhouse of advertisement. Google knows what your mother’s cousin once removed likes, and runs ads tailored to it on no fewer than 50 websites run by Google and countless other affiliates. The convenience transformed itself into a mild hindrance and then a major nuisance in no time. At its core, Google, Microsoft and Yahoo ADs were all based on a relevance engine. I.e., based on the content currently served by the publisher (the website you’re visiting), they search for relevant ADs from their database, and the one that matches and has a target profile matching yours (this is where privacy advocates go crazy) is served. In its simplest form, the process looks something like the diagram below.

ADsense process diagram
Contextual Advertisement – Process Flow (ADSense/ADWords)

For the inquisitive lot who want to know the technicality: it is a lot more complex than this, and it is presented below.
Tentative Process flow of How ADWords and ADSense content advertisement happens.

Enter ADBlockers:

And soon, people found a way to block the ADs. As seen above, all of these adverts are programmed to run using great stores of data in the backend. So, when a user visits a site, a lot happens in the backend, and a script is used to fetch the resultant AD piece. Technically inclined people started writing custom scripts to stop the script which renders the ADs. In no time, all the bells and whistles like #blacklist #whitelist #regular-expression support came in. Once modern browsers came with support for content filtering built in, it was easy to supplement them with custom lists and scripts to block these ads. ADBlockers became available for every device, OS and browser, and public knowledge of them exploded their use around the 2013-2015 period (see graph below). So, all seems rosy from here.

ADBlocker-Blocker

Publishers and their representative trade bodies, on the other hand, argue that Internet ads provide revenue to website owners, which enables the website owners to create or otherwise purchase content for the website. Publishers claim that the prevalent use of ad-blocking software and devices could adversely affect website owner revenue and thus, in turn, lower the availability of free content on websites. So, it is no wonder that publishers have begun to block or evict users found to be using #ADBlockers. (A page from my personal experience: I do not remember a time when I did not use an ADblocker; before Mozilla, I used MyIE (Maxthon), which had these configurable filters.) But of late, the publishers have become more aggressive and have rolled out a slew of their own warriors, AKA the ADBlocker-Blocker: nifty little utilities you can embed in your site, so traffic from ADB-enabled users will be blocked until they disable it or whitelist you. Some majors like the Economist and Wired have announced a novel approach: either you can disable your ADB on their site, or pay a small fee to see their site without the clutter of advertisements. For the sites that do not offer this feature, or if you wish to simply override them, read on.

Practice & Implementation

So, enter Anti-AdBlocker Killer — https://github.com/reek/anti-adblock-killer
It’s simple, really: it tricks sites that use #anti-adblocker technology into thinking you aren’t using an adblocker. Using a JavaScript file and a filter list, it lets you keep your adblocker on when you visit a page that would usually disable it. This means you can work around bans on adblockers from common news companies, like Forbes, which lock you out when you’re detected.
It works against a number of different technologies used to detect #adblock users, and is likely to be part of the next #armsrace as publishers work out how to block the #adblockers using #adblocker-blockers. If you’re still reading, I will conclude my narration and give step-by-step instructions on how to install and activate it.

Step-by-step Instruction to Activate Anti-Adblock Killer

  1. Step 1 – Get a Script Manager:
    1.  Firefox: Greasemonkey or Scriptish
    2.  Chrome: Tampermonkey or native support
    3.  Opera: Tampermonkey or Violentmonkey
    4.  Safari: Tampermonkey or NinjaKit
    5.  Microsoft Edge: Tampermonkey
        • (* After installation, depending on your browser, a restart may be required for it to take effect)
  2. Step 2 – Subscribe to a FilterList
    1. Subscribe from github.com (I prefer this)
    2. Subscribe from reeksite.com 
      • At this point, if you chose the GitHub list, you’ll be prompted with a list of extensions and you can choose to manually install AAKiller. (A representative screenshot is shown below.)
  3. Step 3 – Get User Scripts
    1. Install from greasyfork.org
    2. Install from openuserjs.org
    3. Install from github.com
    4. Install from reeksite.com

Once this is done, you’re on your way to enjoying AD-Blocker pop-up free browsing.

More data to support Planet Nine Hypothesis



Last year, the existence of an unknown planet in our Solar System was announced. However, this hypothesis was subsequently called into question, as biases in the observational data were detected. Now, Spanish astronomers have used a novel technique to analyse the orbits of the so-called extreme trans-Neptunian objects and, once again, they report that there is something perturbing them: a planet located at a distance between 300 and 400 times the Earth-Sun distance.
Like the comets that interact with Jupiter.
At the beginning of 2016, researchers from the California Institute of Technology (Caltech, USA) announced that they had evidence of the existence of this object, located at an average distance of 700 AU and with a mass 10 times that of the Earth. Their calculations were motivated by the peculiar distribution of the orbits found for the trans-Neptunian objects (TNOs) in the Kuiper belt, which suggested the presence of a Planet Nine within the solar system.
Using calculations and data mining, the Spanish astronomers have found that the nodes of the 28 ETNOs analysed (and the 24 extreme Centaurs with average distances from the Sun of more than 150 AU) are clustered in certain ranges of distances from the Sun; furthermore, they have found a correlation, where none should exist, between the positions of the nodes and the inclination, one of the parameters which defines the orientation of the orbits of these icy objects in space.
“Assuming that the ETNOs are dynamically similar to the comets that interact with Jupiter, we interpret these results as signs of the presence of a planet that is actively interacting with them in a range of distances from 300 to 400 AU,” says De la Fuente Marcos, who emphasizes: “We believe that what we are seeing here cannot be attributed to the presence of observational bias”.
Is there also a Planet Ten?
De la Fuente Marcos explains that the hypothetical Planet Nine suggested in this study has nothing to do with another possible planet or planetoid situated much closer to us, and hinted at by other recent findings.
Also applying data mining to the orbits of the TNOs of the Kuiper Belt, astronomers Kathryn Volk and Renu Malhotra from the University of Arizona (USA) have found that the plane on which these objects orbit the Sun is slightly warped, a fact that could be explained if there is a perturber of the size of Mars at 60 AU from the Sun.
“Given the current definition of planet, this other mysterious object may not be a true planet, even if it has a size similar to that of the Earth, as it could be surrounded by huge asteroids or dwarf planets,” explains the Spanish astronomer.
“In any case, we are convinced that Volk and Malhotra’s work has found solid evidence of the presence of a massive body beyond the so-called Kuiper Cliff, the furthest point of the trans-Neptunian belt, at some 50 AU from the Sun, and we hope to be able to present soon a new work which also supports its existence”.

India Launches PSLV C38 with 31 Satellites


The Indian Space Research Organisation (ISRO) successfully launched its PSLV-C38 rocket on Friday, on a mission to send 31 satellites, including India’s Cartosat-2 and NIUSAT satellites along with 29 foreign nano satellites, into orbit, ISRO said in a press release.
“India’s Polar Satellite Launch Vehicle, in its 40th flight (PSLV-C38), launched the 712 kg [0.7 tonnes] Cartosat-2 series satellite for earth observation and 30 co-passenger satellites together weighing about 243 kg [0.2 tonnes] at lift-off into a 505 km [313 mile] polar Sun Synchronous Orbit (SSO),” ISRO said.
According to ISRO, the co-passenger satellites comprise 29 nano satellites from 14 countries namely, Austria, Belgium, Chile, the Czech Republic, Finland, France, Germany, Italy, Japan, Latvia, Lithuania, Slovakia, the United Kingdom and the United States as well as one nano satellite from India.
