Over the last few months, the Cloud Architecture team in the Office of the CTO (OCTO) has been hard at work shipping code for the upcoming release of Tanzu Service Mesh (TSM) and preparing for VMworld 2020. I want to take this opportunity to introduce our Cloud Architecture team and share what we will be presenting at VMworld this year.

The Driving Forces behind the Cloud Architecture Team

When VMware CTO Greg Lavender joined VMware in 2018, he formed a new Cloud Architecture function within the Office of the CTO, focused solely on a top-down approach to cloud architecture and on thinking about cloud from a holistic, multi-cloud application platform perspective. Our charter covers special projects at the intersection of application platforms and cloud architectures, co-innovating across VMware to design and engineer new product-enabling features.

Shortly after Greg joined VMware, he hired me to build a framework around his ideas. We started by exploring Application Enabled Infrastructure – Practical Cloud Native and the Rise of Application Platforms. We considered multiple ideas, but quickly focused on common multi-cloud application platform runtimes and the various Layer 7 (L7) services they could offer for traffic routing, resiliency, and security patterns. As we formulated a high-level design, we teamed up with the Tanzu Service Mesh team, led by Pere Monclus, the Networking Security BU’s CTO, to develop some of these services. The first service we developed was an internal project known as Predictable Response Time Controller (PRTC) (patent pending), which makes it possible to declaratively describe performance SLO thresholds and ensures that the underlying cloud runtime system adheres to them. Most importantly, we set out to ensure such functionality could be added to the application platform across any cloud without intruding on the application code logic (see previous blogs on SLOs: The Emergence of Universal Language in the Enterprise and Emergence of Cloud Runtimes).

As we began to grow the team, we hired Michael Gasch to focus on all things Kubernetes and distributed systems software development. We then added Daniel Linsley to focus on SRE use cases, Rafael Brito to focus on the enterprise Kubernetes lifecycle across private and public clouds, Frankie Gold to work on various cloud native projects, and Diwan Chandrabose as the development lead for our offshore team (Ajit Roy, Dilip Tadepalli, Ramesh Puli). We also added Chris Slater to work on advanced cloud solutions for the financials vertical, Harmen Van der Linde on observability frameworks, and Michael Hein on VCF architecture.

The VMware Cloud Architecture team focuses on the following areas:

Cloud Architecture Sessions at VMworld 2020

As you can see from the above list of projects, the team has been busy co-innovating across VMware. Below is a list of sessions we are excited to present to you in the Vision & Innovation track this year at VMworld 2020. 

Kubernetes Operators for VMware Tanzu Kubernetes Grid [KUB1248]

Tom Schwaller, Technical Product Line Manager, VMware
Michael Gasch, Staff Engineer, VMware Office of the CTO

Kubernetes operators are methods of packaging, deploying and managing Kubernetes applications. They extend the Kubernetes functionality with application-specific logic using custom resources and controllers. With the operator pattern, you can encode domain knowledge of specific applications into a Kubernetes API extension. In this session, you will get an overview of Kubernetes operators (e.g., PostgreSQL/Harbor Operator) and how to write them in Python. After this session, you should feel comfortable using and developing operators. No prior Python or software development background is required to attend this session.
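
To make the operator pattern described above more concrete, here is a minimal sketch of the reconcile loop at the heart of a controller. It is not taken from the session and does not talk to a real cluster: `PostgresCluster`, `FakeCluster`, and `reconcile` are hypothetical names, and an in-memory dictionary stands in for the Kubernetes API.

```python
from dataclasses import dataclass, field


@dataclass
class PostgresCluster:
    """Hypothetical custom resource: the user declares only the desired state."""
    name: str
    replicas: int


@dataclass
class FakeCluster:
    """In-memory stand-in for the Kubernetes API (resource name -> pod names)."""
    pods: dict = field(default_factory=dict)


def reconcile(desired: PostgresCluster, cluster: FakeCluster) -> list:
    """Compare desired and observed state, act on the difference,
    and return the list of actions that were taken."""
    running = cluster.pods.setdefault(desired.name, [])
    actions = []
    # Scale up: create pods until the observed count matches the declared one.
    while len(running) < desired.replicas:
        pod = f"{desired.name}-{len(running)}"
        running.append(pod)
        actions.append(f"create {pod}")
    # Scale down: remove the most recently created pods first.
    while len(running) > desired.replicas:
        actions.append(f"delete {running.pop()}")
    return actions


if __name__ == "__main__":
    cluster = FakeCluster()
    print(reconcile(PostgresCluster("db", 3), cluster))  # three create actions
    print(reconcile(PostgresCluster("db", 3), cluster))  # converged: no actions
    print(reconcile(PostgresCluster("db", 1), cluster))  # two delete actions
```

A real operator would run this comparison every time the custom resource or its pods change, which is where the application-specific domain knowledge ends up encoded.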

VEBA and the Power of Event-Driven Automation – Reloaded [HCP1358]

William Lam, Senior Staff Solution Architect, VMware
Michael Gasch, Staff Engineer, VMware Office of the CTO

The VMware Event Broker Appliance (VEBA) was released at VMworld 2019 to bring the power of event-driven automation to our VMware communities, and the immediate feedback and adoption have been overwhelmingly positive. Since then, we have made a number of enhancements based on that feedback. In this session, we will briefly recap what VEBA is and the improvements made since the first release. We will walk through popular uses (such as chat notifications, auditing/reporting, and scanning a VM for vulnerabilities), each requiring only minimal code in any scripting or programming language. We will also share potential ideas and enhancements based on community feedback, as VEBA would not have been possible without you.

Arm Yourself with Event-Driven Functions and Reimagine SDDC Capabilities [HCP1404]

Partheeban Kandasamy, EUC Staff Customer Success Architect, VMware
Frankie Gold, Member of Technical Staff, VMware Office of the CTO

Add programming to your sysadmin arsenal. Join this session to see how we use two popular languages to build your very first event-driven functions. AWS Lambda and other function-as-a-service (FaaS) offerings have revolutionized microservices by providing a serverless, scalable way to write reusable functions in any language. VMware Event Broker Appliance (VEBA) brings the convenience of FaaS to VI administrators and enables seamless, event-driven VMware vCenter extensions. This session will demonstrate how to build a function using Python, and then build that same function with Go. Go is a language of choice for writing cloud computing apps: it is pragmatic and straightforward, aims to be beginner-friendly, and has a syntax that is as readable as Python while boasting the performance of Java or C++.
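
As a rough illustration of what an event-driven function looks like, the sketch below handles a single vCenter-style event in Python. It is not the code from the session. The payload shape (an event type under `subject` and the VM name under `data.Vm.Name`) is an assumption made for this example, and `handle` is a hypothetical entry point rather than a documented VEBA interface.

```python
import json


def handle(payload: str) -> str:
    """React to one event delivered as a JSON string.

    Assumed payload shape (for illustration only):
    {"subject": "<event type>", "data": {"Vm": {"Name": "<vm name>"}}}
    """
    event = json.loads(payload)
    event_type = event.get("subject", "unknown")
    vm_name = event.get("data", {}).get("Vm", {}).get("Name", "unknown")

    # Only act on the event type this function cares about.
    if event_type == "VmPoweredOnEvent":
        return f"VM {vm_name} was powered on"
    return f"ignored event {event_type}"


if __name__ == "__main__":
    sample = json.dumps(
        {"subject": "VmPoweredOnEvent", "data": {"Vm": {"Name": "web-01"}}}
    )
    print(handle(sample))
```

The function only decides what to do with an event it is handed; receiving events from vCenter and invoking the function is left to the platform.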

Golang for vCenter Admins – 10 Quick Steps [HCP1263]

Daniel Linsley, Staff Engineer, VMware Office of the CTO
Frankie Gold, Member of Technical Staff, VMware Office of the CTO

Since its initial release in 2009, VMware vCenter has gained features and complexity, putting every aspect of managing large collections of infrastructure (e.g., CPUs, storage and networking) under the care and control of VMware vSphere administrators. There are many ways to monitor and manage vCenter, including the graphical UI and console scripts run from the command line. A less commonly used option is govmomi, a Go library for the vSphere API. This session will bring the power of programming out of the darkness and into the light with 10 straightforward steps that aim to fire up the imagination about what coding can do for a vSphere administrator, starting with generating tags and automatically changing configuration settings based on the tags applied to a VM.

Cloud Observability Frameworks for Modern Application Platforms [OCTO3016]

Harmen Van der Linde, Senior Director Multi-Cloud Architecture, VMware Office of the CTO
Conor Beverland, Product Line Manager, VMware

This session will cover observability, a new technology discipline that is quickly gaining industry traction as part of the enterprise migration towards the cloud. Observability, together with data analytics and automation, enables implementation of actionable feedback loops for effectively managing and optimizing cloud-native infrastructure and applications. IT agility is about deploying infrastructure and applications faster in a consistent, secure, reliable, and repeatable way. To achieve this goal, ongoing feedback is required to gain insight into the state and health of applications and the underlying IT infrastructure. Observability is focused on providing this type of feedback by collecting telemetry data from the various IT technology layers.

Disaster Recovery, Business Continuity and Kubernetes Migrations in Tanzu With Rafael Brito [OCTO3023]

Rafael Brito, Staff Engineer, VMware Office of the CTO

Join engineers from the Office of the CTO to learn how Velero, a product in the Tanzu portfolio, is enabling VMware customers to solve business continuity and disaster recovery for their enterprise apps running on Kubernetes. In this roundtable, we will discuss your specific use cases and how we can help with disaster recovery, data migration, and data protection using Velero and its integration with Kubernetes.

Best Practices for Enterprise Kubernetes on the VMware SDDC with Rafael Brito [OCTO3017]

Rafael Brito, Staff Engineer, VMware Office of the CTO

Join the author of the paper Best Practices for Red Hat OpenShift on the VMware SDDC to discuss your deployment of Enterprise Kubernetes on the VMware SDDC in terms of availability, interoperability, scalability, and performance. Also, learn how to size Kubernetes nodes and how to manage resources using both VMware vSphere and the Kubernetes scheduler.

Delivering on Application SLAs with a Multi-Cloud Runtime [OCTO3014]

Emad Benjamin, Senior Director, Chief Technologist for Cloud Application Platforms, VMware Office of the CTO
Mark Schweighardt, Director, Tanzu Service Mesh, VMware

Over the past several years, developers have leveraged public clouds for native application services to accelerate feature releases as a primary benefit. However, this has resulted in multiple operational silos—with each cloud having its own tools, processes, policies and site reliability engineering (SRE) teams—making it difficult to achieve overall application service-level agreements (SLAs). In this session, we will demonstrate how these multi-cloud operational challenges can be mitigated with a common distributed runtime that gathers appropriate telemetry and inter-service call graphs. With this information, the multi-cloud runtime can calculate when to issue specific control actions for applications running across any cloud to meet critical application SLAs.
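
The idea of a runtime that turns telemetry into control actions can be sketched in a few lines. The following is an illustrative toy, not the Tanzu Service Mesh implementation: the function names, the choice of the 90th percentile, the `headroom` parameter, and the scale-out/scale-in actions are all assumptions made for this example.

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list (pct is an integer 1..100)."""
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, which avoids floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def control_action(latencies_ms, slo_ms, instances, headroom=0.5):
    """Pick a control action from recent latency telemetry.

    Hypothetical policy: scale out when p90 latency breaches the SLO,
    scale in when it sits well below the SLO, otherwise hold.
    Returns a tuple of (action name, new instance count).
    """
    p90 = percentile(latencies_ms, 90)
    if p90 > slo_ms:
        return ("scale_out", instances + 1)
    if p90 < slo_ms * headroom and instances > 1:
        return ("scale_in", instances - 1)
    return ("hold", instances)


if __name__ == "__main__":
    recent = [100, 120, 130, 400, 110, 105, 115, 125, 135, 450]
    print(control_action(recent, slo_ms=250, instances=3))
```

In this sketch a single service is evaluated in isolation; a runtime that also has inter-service call graphs could apply the same comparison along a whole request path.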

Delivering on Application SLAs with a Multi-Cloud Runtime [OCTO1290]

Emad Benjamin, Senior Director, Chief Technologist for Cloud Application Platforms, VMware Office of the CTO
Mark Schweighardt, Director, Tanzu Service Mesh, VMware

Over the past several years, developers have leveraged public clouds for native application services to accelerate feature releases as a primary benefit. However, this has resulted in multiple operational silos—with each cloud having its own tools, processes, policies and site reliability engineering (SRE) teams—making it difficult to achieve overall application service-level agreements (SLAs). In this session, we will demonstrate how these multi-cloud operational challenges can be mitigated with a common distributed runtime that gathers appropriate telemetry and inter-service call graphs. With this information, the multi-cloud runtime can calculate when to issue specific control actions for applications running across any cloud to meet critical application SLAs.

See you all at VMworld!