AKS Monitoring At Scale – Part 1 (AMA/Container Insights)

Container monitoring for Azure Kubernetes Service (AKS) is vital, and the setup varies from one deployment to another. Although enabling AKS Diagnostic Settings and sending logs to a Log Analytics Workspace might seem like enough, there are still a few moving parts to consider. In this blog post, we will discuss the importance of Container Insights and how to automatically and efficiently onboard it for any cluster.

First and foremost, here are the ways to monitor an AKS cluster and its components:

  1. Container Insights (Workload Metrics/Logs)
  2. AKS Control Plane Logs (Diagnostics Logs)
  3. Prometheus Metrics (New Offering)

In this post, I am going to cover `Container Insights` using Azure Monitor Agent (AMA). As a best practice, every AKS cluster should have it deployed. To deploy it, you need:

  1. Log Analytics Workspace to send logs/metrics to
  2. User-Assigned Managed Identity (best practice for remediation tasks); otherwise, a System-Assigned Managed Identity
  3. Monitoring add-on enabled (Azure Monitor Agent container)
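To make the third prerequisite concrete, here is a minimal sketch that builds the Azure CLI call for enabling the monitoring add-on on one cluster. It only assembles and prints the command so it can be reviewed first. The `ClusterTarget` type, the function name, and every resource name and ID in it are placeholders I made up for illustration, and it does not cover the managed identity setup.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterTarget:
    """Identifies one AKS cluster and the workspace it should report to."""
    resource_group: str
    cluster_name: str
    workspace_resource_id: str  # full ARM ID of the Log Analytics Workspace


def build_enable_monitoring_command(target: ClusterTarget) -> list[str]:
    """Return the Azure CLI argument list that enables the monitoring add-on."""
    return [
        "az", "aks", "enable-addons",
        "--addons", "monitoring",
        "--resource-group", target.resource_group,
        "--name", target.cluster_name,
        "--workspace-resource-id", target.workspace_resource_id,
    ]


if __name__ == "__main__":
    # Placeholder values only (assumptions); substitute your own names and IDs.
    demo = ClusterTarget(
        resource_group="rg-aks-demo",
        cluster_name="aks-demo",
        workspace_resource_id=(
            "/subscriptions/00000000-0000-0000-0000-000000000000"
            "/resourceGroups/rg-monitoring/providers"
            "/Microsoft.OperationalInsights/workspaces/law-demo"
        ),
    )
    # Print instead of executing, so the command can be checked before use.
    print(shlex.join(build_enable_monitoring_command(demo)))
```

Keeping the command as a list of arguments means it can later be handed to `subprocess.run` without any shell quoting issues, once you are happy with what it prints.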

You can onboard any cluster with Container Insights at deployment time by using a provided configuration. Alternatively, you could use Azure Policy to do it for you. In this blog post, we will show you how to automatically and efficiently onboard Container Insights for any cluster, using standardized configurations.
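To illustrate the kind of check a policy would perform at scale, here is a rough sketch that takes cluster descriptions and lists the ones that still need onboarding. It is not the policy itself. It assumes each cluster is a dictionary shaped like the JSON from `az aks show`, with the monitoring add-on reported under `addonProfiles` as `omsagent`; verify that shape against your own clusters. The function names and the sample fleet are made up for illustration.

```python
from typing import Any, Iterable, Mapping


def monitoring_addon_enabled(cluster: Mapping[str, Any]) -> bool:
    """Return True if the cluster reports the monitoring add-on as enabled.

    Assumption: the add-on is listed under "addonProfiles" with the key
    "omsagent". The key is matched case-insensitively to be tolerant.
    """
    profiles = cluster.get("addonProfiles") or {}
    for name, profile in profiles.items():
        if name.lower() == "omsagent":
            return bool((profile or {}).get("enabled"))
    return False


def clusters_needing_onboarding(clusters: Iterable[Mapping[str, Any]]) -> list[str]:
    """Return the names of clusters that do not have the add-on enabled."""
    return [
        cluster.get("name", "<unnamed>")
        for cluster in clusters
        if not monitoring_addon_enabled(cluster)
    ]


if __name__ == "__main__":
    # Hypothetical fleet, hand-written for this example.
    fleet = [
        {"name": "aks-prod", "addonProfiles": {"omsagent": {"enabled": True}}},
        {"name": "aks-dev", "addonProfiles": {"omsagent": {"enabled": False}}},
        {"name": "aks-lab"},
    ]
    print(clusters_needing_onboarding(fleet))
```

A policy with a remediation task goes one step further than this check: it also deploys the missing configuration, which is where the managed identity from the prerequisites comes in.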

Have a look at the demo, which covers this scenario:

Key points and policy code:

This is a personal experience-sharing post that gives you practical insights on how to get started with Container Insights for AKS cluster monitoring. Don’t hesitate to take advantage of this best practice to ensure the performance and health of your Kubernetes cluster and container workloads. Also, please feel free to add your comments. Please note that technology keeps changing at a rapid pace, so don’t be surprised if better alternatives come along.

In upcoming blog posts, I may cover the following topics:

  • AKS Monitoring using Managed Prometheus and Managed Grafana at Scale
  • AKS Diagnostic Monitoring at Scale
  • All AKS Monitoring using ALZ Terraform Module Custom Policy Extension
  • How to have AKS Enterprise Wide Dashboards for Central Monitoring

2019: personal learning index

I am a firm believer that learning should never stop. No one is perfect and no one can learn everything, but one must keep trying hard to learn new things, trends, and technologies.

On my personal learning index, 2018 was focused on building a base for AWS; similarly, 2019 went to building a base for GCP. I have been an Azure practitioner since 2012, so for the last few years on Azure I have moved my focus from traditional IaaS to PaaS/CICD as well as Data/AI.

Learning is of no use if we don’t apply it in day-to-day life. Thanks to my job, almost every requirement I get is a new one. Thus, every solution I build not only allows me to use my learning but also pushes me to explore and learn more.

I have tried to collate the learning KPIs I achieved in the last year, 2019. Using this as a base, I shall move forward in 2020. Just as sales targets keep increasing YoY or QoQ, self-targets, be it learning or knowledge, should keep increasing too.

| | Azure | AWS | GCP | TOGAF |
| --- | --- | --- | --- | --- |
| Platforms used for learning (and, of course, a ton of native documentation; nothing beats that) | Linux Academy, Microsoft Learn, EDX | Linux Academy, EDX, AWS Quickstart | Linux Academy, Qwiklabs, Coursera | TOGAF Online Guide, Udemy course on TOGAF |
| Unique Course Focuses | Designing and Building IoT Solutions, Security, DevOps | Developing Solutions on AWS (PaaS side), DevOps (CICD using AWS native) | GCP PCA, Hybrid Networking, Network Specialization | |
| Time Spent in Online Courses (hours) | 20+ | 6+ | 60+ | 75+ |
| Labs Attempted | 15+ | 25+ | 50+ | More day-to-day practice learning, especially ADM Guidelines & Techniques |
| Boot-camps Attended | SAP HANA on Azure (onsite); Designing & Building AI Solution using Azure Cognitive Services | | First focus was building base | |
| Certifications Achieved | AZ-300, AZ-900, AZ-103 | | GCP-PCA | TOGAF 9 |
| Next Steps for 2020 | AZ-301 | 2 AWS Specialty certifications (prefer Security and Networking) | GCP Network Professional (already failed this one once, and clueless why despite being 100% sure on 84% of answers) and Security | Enhance practicality of TOGAF in daily operation |

My personal favorite has been and will always be ‘labs’. Unless you get your hands dirty implementing the solution, you cannot learn from online courses or documentation alone. Thus, I chose learning platforms that give me a lot of labs to perform.

Such a matrix helps keep me laser-focused on the goal. In fact, I have built a mind-map chart which also helps me build my learning path without focusing on too many things at once.

Lastly, I am not at all a firm believer in certifications. Unfortunately, though, this is the basis on which the industry recognizes you as an expert: only after you are certified on xyz with abc specialty. Therefore, it is wise to have these badges, but do not compromise on true learning/knowledge.

I tried to collate things and share them with you all as an experience so that you may try to replicate something similar for your learning journey. Most of you may already be experts in the domain and thus find it very basic, but it might be helpful for someone who has not started such a journey yet.

Thank you and Happy New Year 2020!!!