The AI race is heating up, and access to cutting-edge infrastructure is a major hurdle for businesses of all sizes. Large companies often monopolize resources, leaving startups and SMEs struggling to compete. GreenNode, however, is changing the game in Southeast Asia.

In June 2024, GreenNode became the first provider in Southeast Asia (SEA) to successfully launch a GPU cluster specializing in HPC applications, AI training, and integration. In this blog post, we'll take you behind the scenes of our journey to deploy one of the region's first NVIDIA-powered AI Cloud clusters. It wasn't easy, but through sheer determination and teamwork, we achieved something remarkable.

The successful deployment of Cloud GPU provides ready-made resources at a preferential cost, suited to the limited budgets of small and medium enterprises in Asia. It also paves the way for more GPU clusters to serve the growing needs of the Asian AI industry.

Stay tuned!

The AI Bottleneck in The APAC Region: Cost, Complexity, and Uncertainty

The AI revolution is transforming industries at an unprecedented pace, and businesses of all sizes are eager to harness its power. According to Market Data Forecast, the Asia-Pacific AI market is predicted to grow at a CAGR of 39.93% from 2024 to 2029, with the regional market size expected to rise from USD 66.38 billion in 2024 to USD 356.13 billion by 2029. In this race, pioneering technology and infrastructure are the core fuels needed to participate.
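As a quick sanity check on those figures, the sketch below compounds the 2024 market size at the quoted CAGR and also derives the growth rate implied by the two endpoints. The function names are our own for illustration, and we assume five compounding years (2024 to 2029).

```python
def project_market_size(start_value: float, cagr: float, years: int) -> float:
    """Compound start_value forward by `years` at the annual rate `cagr`."""
    return start_value * (1.0 + cagr) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Annual growth rate that takes start_value to end_value in `years`."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted from Market Data Forecast (USD billions).
SIZE_2024 = 66.38
SIZE_2029 = 356.13
QUOTED_CAGR = 0.3993
YEARS = 5  # assumption: five compounding periods, 2024 -> 2029

projected_2029 = project_market_size(SIZE_2024, QUOTED_CAGR, YEARS)
rate_from_endpoints = implied_cagr(SIZE_2024, SIZE_2029, YEARS)

print(f"Projected 2029 size: USD {projected_2029:.1f} billion")
print(f"CAGR implied by endpoints: {rate_from_endpoints:.2%}")
```

Both directions agree with the forecast to within rounding, which works out to roughly a fivefold increase in market size over the period.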

Blog 10.jpg — The AI bottleneck in the APAC region: cost, complexity, and uncertainty

However, joining the AI race can cost a fortune. NVIDIA CEO Jensen Huang recently predicted that organizations will spend $1 trillion over the next four years to upgrade data centers for AI. With resources and technology concentrated in the hands of the largest technology companies, AI startups, especially those in regions that lag behind in this technology, find it very difficult to achieve a breakthrough.

In addition, the current AI landscape presents a significant challenge for newcomers. Access to AI infrastructure is often limited, expensive, and shrouded in uncertainty. Setting up a complex AI Cloud cluster can be a manual and frustrating process, filled with troubleshooting and delays.

How To Set Up A GPU Cluster from Scratch?

According to Forbes, the first-pass yield (the percentage of usable, high-quality units assembled on a production line on the first attempt) for manually integrating a GPU into a server can fall below 50%. 
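To make the metric concrete, here is a minimal sketch of how first-pass yield is computed. The function name and the sample counts are hypothetical and are not taken from the Forbes figure.

```python
def first_pass_yield(units_started: int, units_passed_first_time: int) -> float:
    """Fraction of units that pass on the first attempt, with no rework."""
    if units_started <= 0:
        raise ValueError("units_started must be positive")
    if not 0 <= units_passed_first_time <= units_started:
        raise ValueError("units_passed_first_time must be between 0 and units_started")
    return units_passed_first_time / units_started


# Hypothetical batch: 100 servers integrated by hand, 48 usable on the first try.
fpy = first_pass_yield(units_started=100, units_passed_first_time=48)
print(f"First-pass yield: {fpy:.0%}")
```

In a batch like this, more than half of the servers would need rework before they could join the cluster, which is where much of the hidden time and cost comes from.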

Successfully setting up and deploying a GPU cluster is far from simple. The process requires countless working hours from engineers with varied technical backgrounds.

Beyond the GPUs themselves, building a GPU-accelerated cluster in your on-premises data center requires careful consideration of the surrounding hardware, networking, and overall architecture.

Blog 9.jpg — Set up a GPU cluster from scratch is challenging

Building an “AI Factory” might require a dedicated facility, or it can be integrated into an existing high-power data center. As most general-purpose data centers were built in 2 years and are not designed for this kind of workload, finding a place to host a high-performance compute cluster is not an easy task.

Building a GPU cluster is a complex task that demands specialized expertise and meticulous planning. In this case, partnering with a reliable AI cloud provider like GreenNode allows you to bypass these challenges and focus on harnessing AI's power to quickly unlock your business's potential and opportunity.

GreenNode AI Cloud GPU Cluster Specifications

GreenNode's unwavering commitment to innovation has driven us to embark on an ambitious project: constructing the first large-scale AI Cloud cluster in Southeast Asia. Our GPU Cluster solution, built in partnership with NVIDIA as an NVIDIA Cloud Partner (NCP) on the latest GPU Cloud architecture, provides the ideal infrastructure for SMEs to run AI training and inference on an hourly basis.

With the latest models optimized for performance from leading chip supplier NVIDIA, we are dedicated to supporting businesses eager to participate in the AI race. While the challenge is significant, our goal is to level the playing field for Southeast Asian enterprises.

STT Blog-1.jpg — GreenNode supports businesses in the AI race

GreenNode's GPU cluster includes 128 bare-metal servers equipped with 1024 NVIDIA H100 Tensor Core GPUs. These next-generation GPU clusters are designed to meet customer demands for high performance and scalability in AI/ML and HPC workloads. They offer up to a 9x reduction in training time compared to previous-generation GPU-based instances, reducing training from days to hours.

Each GPU server provides:

8 x NVIDIA H100 Tensor Core GPUs with 900GB/s NVLink
640 GB of high-bandwidth GPU memory
3.35 TB/s of memory bandwidth per GPU
GreenNode AI cloud GPU cluster specifications
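The headline numbers above fit together with simple arithmetic. The sketch below derives the cluster-wide totals from the per-server figures. The class and field names are illustrative only, and the terabyte total assumes decimal units (1 TB = 1,000 GB).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Per-server figures quoted in this post; names are illustrative."""
    servers: int = 128
    gpus_per_server: int = 8
    gpu_memory_per_server_gb: int = 640

    @property
    def total_gpus(self) -> int:
        return self.servers * self.gpus_per_server

    @property
    def gpu_memory_per_gpu_gb(self) -> int:
        return self.gpu_memory_per_server_gb // self.gpus_per_server

    @property
    def total_gpu_memory_gb(self) -> int:
        return self.servers * self.gpu_memory_per_server_gb


spec = ClusterSpec()
print(f"Total GPUs: {spec.total_gpus}")
print(f"GPU memory per GPU: {spec.gpu_memory_per_gpu_gb} GB")
# Assumption: decimal terabytes (1 TB = 1,000 GB).
print(f"Total GPU memory: {spec.total_gpu_memory_gb / 1000:.2f} TB")
```

This confirms that 128 servers with 8 GPUs each account for the 1024 GPUs quoted, and that 640 GB per server corresponds to 80 GB of GPU memory per H100.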

Our GPU cluster is also built to the most advanced global data center infrastructure standards and specifications, such as:

LEED Gold certification, TIA 942 Rating-3 DCDV, and Uptime Tier III standards.
Dedicated 20 MW capacity from our data center in Bangkok, Thailand.
The latest InfiniBand network technology from NVIDIA offers up to 3.2 Tbps of bandwidth.
Unique multi-tenant hyper-scale storage platform with options for 1, 2, 4, or 8 NVIDIA H100 Tensor Core GPUs per instance.
High-speed storage from VAST Data, certified by NVIDIA for a range of storage requirements, including checkpoint storage for LLM training workloads.

Once deployed, the GPU cluster, built on NVIDIA's latest architecture, will empower users to set up robust AI training workloads. GreenNode's strength lies in optimizing AI infrastructure specifically for LLM training. To learn more about GreenNode's LLM applications, see: LLM Diary: GreenNode Makes Striking Debut with Exceptional Results at VLSP 2023.

GreenNode's Journey To AI Excellence With The First Large-Scale GPU Cluster In SEA

Even with a team of over 200 expert AI/ML engineers, completing the hyper-scale GPU Cluster was a formidable challenge. We mobilized a specialized team, reaching out to colleagues from Israel and China for additional support. By collaborating across time zones and overcoming language barriers, we tackled technical hurdles head-on. Many nights were spent in data centers in Thailand, facing tight deadlines and navigating the complexities of this new technological frontier.

STT Blog-3.jpg — GreenNode's journey to AI excellence with the first large-scale GPU cluster in SEAs

Launching the cluster to meet customer demands involved intensive setup, rigorous testing, and meticulous troubleshooting. Despite the backing of our in-house engineers at VNG Digital Business and partners from NVIDIA, VAST Data, and STT Data Center, establishing a stable and high-availability infrastructure was incredibly challenging due to its novel nature.

However, GreenNode's spirit of collaboration and technical prowess shone through. Our unwavering commitment and technical expertise were the cornerstones of this success. Leveraging our experience in deploying cloud infrastructure for over 1,000 enterprises, GreenNode has tackled every challenge head-on with a resolute focus on operational excellence. This dedication to meticulous planning and flawless execution ensured a smooth launch and ongoing self-operation of this groundbreaking solution.

After six months of intensive work and dedication, we are thrilled to announce the general availability of our high-performance GPU clusters. Powered by 1,024 NVIDIA H100 Tensor Core GPUs, GreenNode's GPU Cluster is engineered to provide highly scalable, on-demand AI infrastructure. It is optimized for training complex large language models (LLMs) and developing generative AI applications, pushing the boundaries of what's possible in AI innovation.

STT Blog-5.jpg — GreenNode's AI Cloud cluster is now a reality

GreenNode's AI Cloud cluster is now a reality, positioning us as frontrunners in Southeast Asia's AI revolution. This achievement extends beyond technology; it embodies teamwork, dedication, and relentless pursuit of excellence. The successful deployment of the cluster marks a new era of AI training for Asian businesses and AI startups alike.

And Many More Innovations to Come...

The launch of Thailand's first hyper-scale AI GPU infrastructure will pave the way for AI innovation platforms in other regions. GreenNode, leveraging its pioneering position, will provide comprehensive support for businesses and partners in deploying AI infrastructure tailored to their needs. GreenNode is dedicated to delivering exceptional service reliability and technological excellence while meeting the increasing global demand for AI.

As an NVIDIA Cloud Partner and a partner in the NVIDIA Inception program, GreenNode is committed to supporting AI applications in the APAC region. With a team of experienced AI engineers, GreenNode is equipped to assist startups with infrastructure, architecture, and practical experience in developing and applying cutting-edge technologies.

Contact us today or schedule an appointment for a free consultation on GreenNode solutions.

GreenNode & NVIDIA Unveil Thailand's First Hyper-scale AI GPU Cluster at STT Data Center