Talks

Performance-Ops for enabling business growth

Agile delivery, DevOps and Cloud have been great enablers for digital businesses to achieve quicker time-to-market and innovate faster, in turn helping them stay competitive in the market. Business growth aspirations depend heavily, if implicitly, on the performance, scalability and reliability of the underlying enterprise systems under growing user load. If a business plans to grow 10x while its as-is systems struggle to support even 2x load due to high latencies and poor scalability, then business growth is limited. In this talk, we present how Performance-Ops can help architects and DevOps engineers continuously deliver the expected system performance across the enterprise without compromising on the reliability and stability of their systems, thus enabling business growth.

Self service infrastructure provisioning with AWS Service Catalog

Speakers : Saptak Takalkar, Dharani Sowndharya Gopalakrishnan

As the world moves faster towards digital transformation, infrastructure has become an important enabler for organisations in creating better value. Users rely on the organisation's DevOps teams for several essential infrastructure services, which can include everything from virtual machines, servers, containers and databases to complete multi-tier applications.

Usually the organization's DevOps teams are flooded with infrastructure provisioning requests that cannot be serviced, or with requests that demand more information on service offerings, which turns the DevOps teams into a bottleneck. This process of getting infrastructure services provisioned is not sustainable over a long period. Such infrastructure services need to be presented to users in an organized, access-controlled manner with an easy-to-use self-service interface and good documentation.

That is where the Delivery Infrastructure (DI) Hub comes in, and AWS Service Catalog (SC) can be leveraged to build it. AWS SC acts as the single point of contact for users to view the list of eligible and approved infrastructure services, which are compliant with the organization's standards and security requirements, and to get them provisioned. At its heart, SC is very much like a menu at a restaurant. Without one, customers would find it difficult to place an order.

In this talk, you’ll learn:

Why do we need a self service DI Hub?
What is the DI Hub and what are its components?
- What is an Infra Product (higher order infrastructure)?
- What is a Product Catalog?
- Publisher Workflow
- Consumer Workflow
What is AWS Service Catalog?
How does AWS Service Catalog help to build the DI Hub?
Benefits of using AWS SC
AWS SC Demo
SC offerings by other cloud service providers

Cloud Security

Speaker : Vaibhav Mani Tripathi

Cloud had been on the rise well before the pandemic, but cloud adoption achieved unprecedented growth during the past three years. This growth brought along an urgent need to improve security controls for secure access to cloud resources and cloud-native deployments. As a result, cloud security has become one of the fastest-growing segments in the IT security market.

A host of ideas come to mind whenever we hear the term "cloud security", but we struggle to organise them. More importantly, we find it difficult to structure them to gauge our current security posture. There is a lot of content available on cloud security, but most of it is from vendors talking about their services and tools. While these are helpful, they limit our perspective on the tool's effectiveness. In this talk, we will try to uncover those aspects and address what exactly cloud security means from the consumer's point of view and why it is crucial.

There is no such thing as perfect security protection. Accepting some risks is necessary to leverage public cloud services, but ignoring these risks can be detrimental. We will discuss certain lenses that can help us design cloud security to achieve defence-in-depth and comply with a zero trust architecture. We'll primarily use AWS as a reference, but the concepts can be extended to other clouds.

Building for a Billion
Lessons learned building distributed cloud platforms & products for a billion users

Speaker : Mayank Kapoor

In this talk, we will explore how to build distributed platforms at scale and for diverse organisations. We'll look at the challenges of building distributed systems for different kinds of companies & the lessons learned—from the basics of ensuring stability and reliability, to ease of use, to building complex systems that work in a real-world environment.

Software Development Practices for Infrastructure Code

Speakers : Gopal Singhal, Monish Jain

With the evolution of DevOps, most organizations are moving towards Infrastructure as Code (IaC). Whether it is cloud infrastructure, on-prem, configuration management, monitoring or containers, everything is provisioned and managed using code. Things become complicated and unmanageable when the scale of this code increases, because in most organizations infrastructure code is still managed the old way. As infrastructure code grows, we also need to adopt software development practices for it. Infrastructure code should be written with software coding principles like DRY and SOLID in mind, and may need to follow some design patterns to address issues related to scale. It should also follow software delivery principles such as versioning strategies, SAST, unit testing, security testing, integration testing, CI/CD and release processes.

In this talk we will discuss some of these principles and how they can be applied to IaC, taking Terraform as an example.

Evolutionary Journey To Cloud using GCP Cloud Migration

Speakers : Ankit Adlakha, Shubham Deshmukh

The Cloud Migration Service market was valued at $88.46 billion in 2019 and is projected to reach $515.83 billion by 2027, growing at a CAGR of 24.8% (source: Allied Market Research). This shows that many organizations are migrating to the cloud for its capabilities. Adoption of the cloud has helped businesses devise cost-cutting strategies while ensuring that their data and systems are accessible to their customers at all times and from any location. Cloud computing is clearly demonstrating its potential for a variety of industries, and its adoption continues to expand.

AWS, Microsoft Azure, and Google Cloud are just a few of the many cloud service providers available worldwide. There are various options for migrating from on-premises servers to cloud servers. Furthermore, you now have options for migrating from one cloud provider to another. In this talk, we will walk you through cloud migration, migration strategies and a demo of how a VM can be migrated from AWS to GCP or from GCP to Azure. We will also talk about the challenges we faced during the migration.

Building Scalable MQTT Infrastructure for Multi Tenant IoT Platform

Speakers : Nagarajan Selvaraj, Goushikaa Thirumoorthi, Sreekesh S

IoT has become essential for many businesses as well as for automated home setups in everyday life. There are numerous real-world applications of the Internet of Things, ranging from consumer IoT and enterprise IoT to manufacturing and industrial IoT. MQTT is one of the standard protocols used by IoT devices. It is designed as an extremely lightweight publish/subscribe messaging transport that is ideal for connecting remote devices with a small code footprint and minimal network bandwidth.

We will walk you through our journey of scaling MQTT connections, which helped our customer's business move from handling thousands of devices to handling millions of devices. This talk will also cover the challenges we faced with the broker configurations and infrastructure, the tools we used for running load tests against the setup, and the additional capabilities required to build a stable MQTT cluster for a high-scale IoT platform.

Error Handling in Distributed System

Speaker : Deepti Mittal

While designing any system, most of the focus is on how the system would behave in normal scenarios, which account for 90 to 95% of the time. Some teams make the mistake of thinking about error handling only close to production releases, which can force major design changes and put the entire release at risk. Exception scenarios have the power to bring the whole system down if not handled well. In this talk, I want to cover how to keep those exceptions from using that power.

Based on my experience in past projects, we will cover the following aspects of error handling:

1. Start thinking about error scenarios along with normal scenarios

2. How to bring about a mindset shift in clients and product teams so that exception/error handling is not treated as a tech feature, tech work or tech debt

3. Importance of deciding rules and laying down guidance for error handling in systems for easy maintenance and production support

4. When too much infrastructure can cause errors in the system

I will use Kafka as the example distributed system to explain error and exception handling concepts, so the talk will also cover the basic related concepts of Kafka.

Terrafile for SRE

Speaker : Sriharsha Kalluru

Over the past decade we have seen the evolution of DevOps, and it has proven to be a game changer: the practice has strengthened overall software delivery beyond what Agile alone offered. However, when adopting the same practices and approaches at an enterprise level, we clearly see challenges in dealing with companies that have a big platform. Doing IaC is one thing, but implementing and following the same practice across the whole platform is far more challenging.

Most people use IaC at the level of a single, very small project and forget the core coding concept of DRY. In platform engineering, DRY is essential for re-using code across projects. This raises a major challenge: how do we manage code that has to work with multiple projects when every project has a different type of configuration? Terraform is definitely great, but restructuring Terraform code is painful. However, if we follow certain practices, we can avoid these hurdles and problems in the future.

Terrafile is a very simple wrapper tool that helps project teams consume the platform's common code and use specific versions of Terraform modules in a simple manner.

We will cover the following during the demo:

Problems with the project code structure.
Are we keeping the IaC code DRY?
Why I don't see Terragrunt, which many people use, as the best solution.
Demonstration of Terrafile.

Chaos Engineering

Speaker : Prashanth R

A predictable system is a myth. In today’s internet-connected world, no system lives in a silo. Every system talks to one or more other systems in order to work, and in reality failure is inevitable. In the real world, one or more systems can fail for multiple reasons that we don’t anticipate. If this is not managed, it can result in a catastrophe.

Here we try to simulate randomness and chaos in systems to identify potential bottlenecks and weaknesses, so that we are better prepared when real chaos happens.

Failures cannot be avoided, but they can definitely be managed when we prepare in advance. That is what we will explore in this topic.

In this workshop, we will cover the why, what and how of implementing chaos engineering practices, along with a demo using the LitmusChaos toolkit. The key takeaways are:

1. Set up one kind of chaos test in a Kubernetes cluster
2. Analyze the results and improve your infrastructure for resilience
3. Template for setting up more kinds of tests
4. A mental framework for building resilient applications

Prerequisites:

Docker with Kubernetes using minikube, Rancher, k3d or any other Kubernetes setup
