Tag: traveltech

Business Value Delivery by Engineering Teams in StartUps – Part 2

Business Value Delivery by Engineering Teams in StartUps – Part 2

In this multi-part post, I will try to articulate my view on the importance of business value and its delivery by engineering teams. This is the second part, where I will share my perspective on the “How of it”.

Part 2: The How of it – Define, Visualise, Prioritise, Develop, Deliver & Measure.

The PMI Model of Delivering Business Value.

1). Define Systems Development Strategy 

The first thing a “Tech” Founder need to do is define the Systems Development Strategy. At a very high level, the systems development strategy should detail the state of the current/planned systems, the high-level business strategy for the next 2-5 years and maps out a plan to get there. An engineering leader will drive the creation and implementation of the development strategy to ensure the business can meet their current and future needs. Working closely with architects and technical leads, the engineering leader can formulate a solid development strategy.

The development strategy should detail the core architecture direction and technologies for the systems, including high-level plans for delivery. The development strategy is the crux of all efforts to deliver business value. Without a firm foundation of proper system architecture and technology, the business will have a difficult time delivering the value they need to survive. 

If you’re an Engineering Leader who joined the startup after the MVP is created, it is your responsibility to understand the business strategy and formulate the Development Strategy as early as possible.

If your startup doesn’t have a solid development strategy or similar document, the following is a great place to start:

  • Gather business needs: Gather high-level business needs/strategy to cater for the now and future (2-5 years) horizon. Not a deep dive, but deep enough to judge existing systems and measure other options. (Question like How many new new users will be added month-on-month, what is the order of magnitude of transactions we plan to rake, is it thousands or millions or tens of millions – Each will point you in a different direction on the system design)
  • Review of existing systems: Analysis of current systems around fit for purpose and whether it can be maintained and extended to meet the future needs uncovered in the above. (The MVP may seem to work fine and it will be tempting to build “On-Top” of it with a plethora of “Features”, resist the urge and pressure, if applicable)
  • Technologies / Architecture: Based on the review of the first two bullet points, you may recommend a strategic direction. The decision here could range from rebuilding the entire system with a new solution, to replacing components of the system with off the shelf/Open-source components. Alternatively, you may find the existing system is a strong foundation which needs modernising or scaling. In which case, the development strategy document would detail a range of architectural and technologies for future development. 

The above is a good starting point and will allow the business to get started on implementing the development strategy. You may do it even before starting with the Startup and make it a Pre-Joining exercise with the Founders and Senior folks. At the end of this exercise, you will have performed an extensive analysis of the current systems and have a strategic direction for the systems.

2). Help the Business Define Requirements

It is essential to understand what needs to be delivered before you can go ahead and deliver the next Amazon or Airbnb. It has been my experience that on occasion, the business will need some “External Inputs” to finalise what is required. 

When the business has a lot of ideas for improvements, they can sometimes get muddled together and lost. To counter this, we at ITILITE do a “Quarter Theme“. Before ITILITE I worked with Zarget, where we had a similar “Themed Quarterly Roadmaps” as well. This “Theming” helps in prioritising the focus areas. More on that in the next section

After which, we can visualise the entire scope of these ideas using User Story Maps. User story maps are visual representations of functionality requirements where all the requirements documented using a system of cards. It becomes a more straightforward (not easy) task to slice and dice these requirements using a story map to cull anything that is not critical to the business. 

For the remaining requirements, we need to gather a little more information to progress to the next step, for each requirement we need to capture:

  • Description: High-level description of the change. Not a HLD/LLD but enough to provide a high-level order of magnitude estimate.
  • Business benefits: Here we are looking to understand what benefits we can expect from the business change. 
  • High-Level estimate: Order of magnitude level estimate, lots of refinement to still take place, however, gives us a good idea around sizing.
  • Business SME and Sponsor: Details of people we can go to get more information.

The detail we capture for each of these changes is small, the reason being these items are a wish list only and not confirmed, so we do not want to waste more time on these then we need (lean thinking). While it is the domain of product managers and business analysts to flesh out business requirements and benefit statements, the engineering leader also plays an essential part in this process. Engineering leaders can use their experience to provide the high-level estimates for development, or indeed recommend ways to implement the requirement without the need to write additional code. 

Another area where engineering leaders should influence is ensuring and non-functional (technical strategy items or technical debt) is included for development prioritisation. These technical plumbing is not attractive to the business but could be critical for the business to achieve their long term goals. Engineering leaders are the people that need to fight to ensure they are on the table.

Also, while you analyse requirements, where possible, try to group requirements where they affect the same code or system module. Grouping requirements will assist us in prioritisation, sequencing and hence the Go-To-Market, which is a key parameter for the business. The last thing we need to do is storing these requirements in our product backlog, to be reviewed and prioritised by the business in our next step.

3) Visualise the work and prioritise

In our third step, we are getting closer to the business deciding on their valuable items. Taking our list of requirements from our product backlog, we now present these to the business to discuss and rank in order of importance.

As discussed above, there will always be x+n “Projects” in the asks. Where “x” is the number of features you can effectively deliver in the timeline. And all “Projects” will look like they are P0 to solve.

If Everything Is a Priority, Then Nothing Is!

Well, the quote wasn’t from Morpheus, I just liked that Meme (it is debated to be in between Yuri Van Der Sluis and Garr Reynolds)

Having an extensive list of items to visualise enables the business to understand that we cannot have everything, and need to select the items that will make the most significant difference to their business (i.e., highest business value). 

This is again, not because of intent, but because of trying to do “Too-Many” things and “Too-Soon”. Independently, all of the asks may sound truly important. Every Leader/Function within your Start-Up will come with several competing “Projects”. The Finance Team may want that flashy invoicing module or an ERP integration with your suppliers/customers, the Customer Success would want that Advanced Analytics platform integrated, The Support Teams may want that long-standing “Quirks” on the product ironed out. Left to Engineering, this is a sure recipe for disaster. This is where a Strong Product Leadership helps!

A business analyst or product manager typically runs these planning and prioritisation meetings. However, the engineering leader also has a place at the table to provide insight and assistance to the businesses decision-making process. Who from the business should attend these meetings? It is essential to gather a broad cross-section of business stakeholders for every Department or Function that uses the Product in question. We don’t want one department having too much influence that may not be of benefit to the business. 

The meeting could have the following Agenda:

  • Review items: The group will discuss each item in the (curated) product backlog in an open and honest discussion. 
  • Accept or reject: The item will be approved for development or rejected. Rejected items will have their requestor notified, to ensure they are in the loop. 
  • Ranking: Approved items get added to the backlog in priority order. 

At ITILITE, We have the backlog/Thematic items in a Google Sheet, which is distributed at least a week before the meeting to ensure the Leaders have enough time to review, ask any questions before the meeting to ensure a smooth meeting. During the meeting, we view the sheet, top to bottom taking notes where required.

At the end of this meeting, we have something special. We have a prioritised list of business value and Key Outcomes.  

The prioritisation meeting can be held quarterly or monthly, depending on the speed of change in your Start Up. We do it on a quarterly cycle to meet with the Leadership, so as not to overload these folks from actually getting their work done. 

If the business has urgent changes which require attention, an emergency prioritisation session can be called-in where a meeting can occur to review and approve changes to the delivery schedule. Alignment should happen outside on one-on-ones and this meeting is a platform for other leaders to either ascent or dissent on the re-prioritisation.

4) Schedule and communicate delivery

In the fourth step, we now have a list that ranks all the business requirements in priority order. We now have confidence that the business indeed wants these work items completed and the order they prefer. The engineering team can now spend time working out how to deliver these items. Remembering from step two, we gathered very high-level requirements, (so not to waste time before they were endorsed), we now need to finish fleshing these items out enough to commence delivery. 

There are a few mechanisms we can use to gather the information we need to get going, and the main one I like is the feature or project kickoff & inceptions. The kickoff is a process where we get the delivery team together to discuss the work that needs to be delivered. Inceptions can run anywhere up to a few weeks for big projects; it depends on how much time you allow here. During our inception, the delivery team all get on the same page with the requirements in question and can ask questions of each other or the sponsor to get all the information they need. 

Technical delivery decisions can also be made, including creating simple prototypes to test out delivery options. Once the inception is complete, the delivery team have all their information, more confident delivery estimates are possible, sprint planning can take place, and the overall delivery schedule is known.

From here, the final step is the communication of the delivery schedule to all relevant stakeholders. Ensuring people can ask questions or point out any problems they see with this schedule. 

5) Deliver value often get feedback

The final step here is to get the job done. The best way to deliver software is in small chunks completed during our sprints (typically two-week blocks). Sprints are the quickest way to deliver business value, allowing the business to gradually use this value much quicker than waiting for a monolithic release to occur. 

At the end of each sprint, the team should be running product demo events. A product showcase allows the product and engineering teams to show off their excellent work to the business, who have an opportunity to provide their feedback on the product. This can start even before the first “releasable” product is out. It can start with mockups, design and prototypes. Then it will progress to V0, V1 and so forth. This feedback loop is another mechanism to ensure we are hitting the mark in terms of the delivery of business value.

Conclusion

I hope i was able to do justice to the process in this article. The key to delivering business value is having close relationships with the stakeholders, ensuring that they are involved in each step of the process. The business stakeholders are the only folks that can define business value. However, it is the role of engineering leaders to ensure proper technical oversight takes place to ensure the timely delivery of business value. 

Building a Log-Management & Analytics Solution for Your StartUp

Building a Log-Management & Analytics Solution for Your StartUp

Building a Log-Management & Analytics Solution for Your StartUp

Background:

As described in an earlier post, I run the Engineering at an early stage #traveltech #startup called Itilite. So, one of my responsibility is to architect, build and manage the cloud infrastructure for the company. Even though I have had designed/built and maintained the cloud infrastructure in my previous roles, this one was really challenging and interesting. Due in part to the fact, that the organisation is a high growth #traveltech startup and hence,

  1. The architecture landscape is still evolving,
  2. Performance criteria for the previous month look like the minimum acceptable criteria the next
  3. The sheer volume of user-growth, growth of traffic-per-user
  4. Addition of partner inventories which increases the capacity by an order of magnitude

And several others. Somewhere down the lane, after the infrastructure, code-pipeline and CI is set-up, you reach a point where managing (read: trigger intervention, analysis, storage, archival, retention) logs across several set of infrastructure clusters like development/testing, staging and production becomes a bit of an overkill.

Enter Log Management & Analytics

Having worked up from a simple tail/multitail to Graylog-aggregation of 18 server logs, including App-servers, Database servers, API-endpoints and everything in between. But, as my honoured colleague (former) Mr.Naveen Venkat (CPO of Zarget) used to mention in my days with Zarget, There are no “Go-To” persons in a start-up. You “Go-Figure” yourself!

There is definitely no “One size fits all” solution and especially, in a Start-up environment, you are always running behind Features, Timelines or Customers (scope, timeline, or cost in conventional PMI model).

So, After some due research to account for the recent advances in Logstash and Beats. I narrowed down on the possible contenders that can power our little log management system. They are,

  1. ELK Stack  — Build it from scratch, but have flexibility.
  2. Graylog  — Out of the box functionality, but you may have to tune up individual components to suit your needs.
  3. Fluentd — Entirely new log-management paradigm, interesting and we explored it a bit.

(I did not consider anything exotic or involves us paying (in future) anything more than what we pay for it in first year. So, some great tools like splunk, nagios, logpacker, logrythm were not considered)

Evaluation Process:

I wrote an Ansible script to create a replica environment and pull in the necessary configurations. And used previously written load-test job to simulate a typical work hour. This configuration was used for each of the frameworks/tools considered.

I started experimenting with Graylog, due to familiarity with the tool. Configured it the best way, I felt appropriate at that point in time.

Slight setback:

However, the collector I had used (Sidecar with Filebeat) had a major problem in sending files over 255KB and the interval was less than 5 secs. And the packets that are to be sent to the Elasticsearch never made it. And the pile-up caused a major issue for application stability.

One of the main use-case for us is to ingest XML/JSON data from multiple sources. (We run a polynomial regression across multiple sources, and use the nth derivatives to do further business operations). Our architecture had accounted for several things, but by design, we used to hit momentary peaks in CPU utilisation for the “Merges”. And all of these were “NICE” loads.

When the daily logs you need to export is in upwards of 5GB for an app (JSON logs), add multiple APIs and some micro-services application logs, web-server, load-balancers, CI (Jenkins), database-query-log, bin-log, redis and … yes, you get the point?

(())Upon further investigation, The sidecar collector was actually not the culprit. Our architecture had accounted for several things, but by design, we used to hit momentary peaks in CPU utilisation for the “Merges”. And all of these were “NICE” loads! (in our defence) 

So, once the CPU hit 100% mark, sidecar started behaving very differently. But, ultimately fixed it with a patched version of sidecar and actually shifting to NXLog.

Experiment with the ELK is a different beast in itself, as provisioning and configuring took a lot more time than I was comfortable with. So, switched to AWS “Packaged Service” . We deployed the ES domain in AWS, fired up a couple of Kibana and Logstash instances and connected them (after what appeard to be forever), it was a charm. Was able to get all information required in Kibana. One down-side is that you need to plan the Elastic Search indices according to how your log sources will grow. For us, it was impractical.

Fluentd was an excellent platform for normalising your logs, but then it also depended on Kibana/ES for the ultimate analysis frontend.

So, finally we settled down to good old Graylog.

Advantages of Graylog

 The tool perfectly fit into our workflow and evolving environment:

  1. Graylog is a free & open-source software. — So we wont have pay now or in future.
  2. Its trigger actions and notifications are a good compliment to Graylog monitoring, just a bit deeper!
  3. With error stack traces received from Graylog, engineers understand the context of any issue in the source code. This saves time and efforts for debugging/troubleshooting and bug fixing.
  4. The tool has a powerful search syntax, so it is easy to find exactly what you are looking for, even if you have terabytes of log data. The search queries could be saved. For really complex scenarios, you could write an ElasticSearch query and save it in the dashboard as a function.
  5. Graylog offers an archiving functionality, so everything older than 30 days could be stored on slow storage and re-imported into Graylog when such a need appears (for example, when the dev team need to investigate a certain event from the past).
  6. Java, Python & Ruby applications could be easily connected with Graylog as there is an out-of-box library for this.

#logmanagement #analytics #startup #hustle #opensource #graylog #elk

Building a Log-Management & Analytics Solution for Your StartUp

Building a Log-Management & Analytics Solution for Your StartUp

Background:

As described in an earlier post, I am working with an early stage startup. So, one of my responsibility is to architect, build and manage the cloud infrastructure for the company. Even though I have had designed/built and maintained the cloud infrastructure in my previous roles, this one was really challenging and interesting. Due in part to the fact, that the organisation is a high growth #traveltech startup and hence,

  1. The architecture landscape is still evolving,
  2. Performance criteria for the previous month look like the minimum acceptable criteria the next (in terms of itineraries automated, rating, mappings, etc.)
  3. The sheer volume of user-growth
  4. Addition of partner inventories which increases the capacity by an order of magnitude

And several others.  Somewhere down the lane, after the infrastructure, code-pipeline and CI is set-up, you reach a point where managing (read: trigger intervention,  analysis, storage, archival, retention) logs across several set of infrastructure clusters like development/testing, staging and production becomes a bit of an overkill.

Enter Log Management & Analytics

Having worked up from a simple tail/multitail to Graylog-aggregation of 18 server logs, including App-servers, Database servers, API-endpoints and everything in between. But, as my honoured colleague (former) Mr.Naveen Venkat  (CPO of Zarget) used to mention in my days with Zarget, There are no “Go-To” persons in a start-up. You “Go-Figure” yourself!
There is definitely no “One size fits all” solution and especially, in a Start-up environment, you are always running behind Features, Timelines or Customers (scope, timeline, or cost in conventional PMI model).
So, After some due research to account for the recent advances in Logstash and Beats. I narrowed down on the possible contenders that can power our little log management system. They are,

  1. ELK Stack
  2. Graylog
  3. Logstash 

(I did not consider anything exotic or involves us paying (in future) anything more than what we pay for it in first year. So, some great tools like splunk, nagios, logpacker, logrythm were not considered)

Evaluation Process:

I started experimenting with Graylog, due to familiarity with the tool. Configured it the best way, I felt it appropriate at that point in time. However, the collector I had used  (Sidecar ) had a major problem in sending files over 255KB and the interval was less than 5 secs.
One of the main use-case for us is to ingest the actual JSON data from multiple sources. (We run a polynomial regression  across multiple sources, and use the n th derivatives to do further business operations). When the daily logs you need to export is in upwards of  500MB for an app (JSON logs), add other application log(s), web-server, load-balancers, CI (Jenkins), database, redis and … yes,  you get the point?
(())Upon further investigation, The sidecar collector was actually not the culprit. Our architecture had accounted for several things, but by design, we used to hit momentary peaks in CPU utilisation for the “Merges”.  
So, once the CPU hit 100% mark, sidecar started behaving very differently. But, ultimately fixed it with a patched version of sidecar.
 
 

Bitnami